I will create a JIRA for this one. I agree that Namenode should stop creating new files when it has already used up a certain percentage of main memory.
There are two reasons that cause memory pressure on the Namenode. One is the creation of a large number of files. This reduces the free memory pool, and the GC has to work even harder to recycle memory. The other is a burst of RPCs arriving at the Namenode (especially block reports). This spurt causes free memory to drop dramatically within a couple of seconds and makes the GC work harder. And we know that when the GC runs hard, the server threads in the JVM starve for CPU, causing timeouts on clients.

One line of reasoning is that if we never time out client RPC requests (HADOOP-2188), then the above situation will not occur. A GC run on the Namenode will cause clients to block and slow down, rather than fail.

My feeling is that we should observe the system post-2188 and then decide whether (and with what policy) we need to monitor Namenode resources.

Thanks,
dhruba

-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED]
Sent: Monday, December 03, 2007 3:04 PM
To: hadoop-dev@lucene.apache.org
Subject: Re: limiting memory use on namenode?

dhruba Borthakur wrote:
> Sanjay and I had a discussion on this one earlier. We thought that this
> would help Namenode robustness.

Is there an issue in Jira for this? If not, should we add one?

> We also thought that this was part of
> Java 6, and we could make this feature optionally configurable.

That was added in Java 1.5. But we're planning to move to 1.6 in 0.16 anyway. So it does not need to be optional.

Whenever the heap is greater than, e.g., 90% of max, I think it would be best to not permit the creation of new files, don't you?

Doug
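
For reference, the 90%-of-max check discussed above could look roughly like the sketch below. It uses the standard java.lang.management API (available since Java 1.5). The class name HeapGuard, the method names, and the idea of calling it from the file-creation path are assumptions made for illustration only; this is not existing Namenode code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper: reports whether heap use is above a fraction of max heap.
final class HeapGuard {
    private final double threshold; // e.g. 0.90 for "90% of max"
    private final MemoryMXBean memory = ManagementFactory.getMemoryMXBean();

    HeapGuard(double threshold) {
        if (threshold <= 0.0 || threshold > 1.0) {
            throw new IllegalArgumentException("threshold must be in (0, 1]");
        }
        this.threshold = threshold;
    }

    // Pure comparison, kept separate so it can be checked without a real heap.
    static boolean overThreshold(long used, long max, double threshold) {
        if (max <= 0) {
            return false; // MemoryUsage.getMax() returns -1 when max is undefined
        }
        return (double) used / (double) max > threshold;
    }

    // True when a caller (for example, a create-file handler) should refuse new work.
    boolean shouldRejectCreate() {
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return overThreshold(heap.getUsed(), heap.getMax(), threshold);
    }

    public static void main(String[] args) {
        HeapGuard guard = new HeapGuard(0.90);
        System.out.println("reject new files: " + guard.shouldRejectCreate());
    }
}
```

One caveat with this sketch: getHeapMemoryUsage() counts garbage that has not been collected yet, so a burst of short-lived RPC allocations could trip the check even when a GC would free most of the heap. The same package also offers usage thresholds on MemoryPoolMXBean (setUsageThreshold, setCollectionUsageThreshold), which may be a better fit if the policy should be based on what remains after a collection.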