Gert, what version of Hadoop are you using? One of the people at my work, who is on 0.17.1, is reporting a similar problem: the namenode's heap space fills up too fast.

This is the status of his cluster (17 nodes, version 0.17.1):

  174541 files and directories, 121000 blocks = 295541 total.
  Heap Size is 898.38 MB / 1.74 GB (50%)

Here is the status of one of my clusters (70 nodes, version 0.16.3):

  265241 files and directories, 1155060 blocks = 1420301 total.
  Heap Size is 797.94 MB / 1.39 GB (56%)

Notice that the second cluster has about 9.5 times as many blocks as the first (and more files and directories, too), yet its heap usage is comparable - actually smaller. Has anyone else noticed problems or inefficiencies in the namenode's memory utilization in the 0.17.x releases?
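In case it helps with comparing numbers, here is how I sample the namenode's heap directly instead of relying on the web UI figure. This is just a sketch and assumes a Sun JDK on the namenode machine; substitute the pid that jps reports for your NameNode process:

  jps                                # note the NameNode pid
  jmap -heap <namenode-pid>          # one-off heap summary
  jstat -gcutil <namenode-pid> 5000  # GC / heap occupancy, sampled every 5 s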
On Mon, Jul 28, 2008 at 2:13 AM, Gert Pfeifer <[EMAIL PROTECTED]> wrote:

> There I have:
>
>   export HADOOP_HEAPSIZE=8000
>
> which should be enough (actually, in this case I don't know).
>
> Running fsck on the directory, it turned out that there are 1785959
> files in this dir... I have no clue how I can get the data out of
> there. Can I somehow calculate how much heap a namenode would need to
> do an ls on this dir?
>
> Gert
>
> Taeho Kang wrote:
>
>> Check how much memory is allocated for the JVM running the namenode.
>>
>> In the file HADOOP_INSTALL/conf/hadoop-env.sh
>> you should change the line that starts with "export HADOOP_HEAPSIZE=1000".
>>
>> It's set to 1 GB by default.
>>
>> On Fri, Jul 25, 2008 at 2:51 AM, Gert Pfeifer <[EMAIL PROTECTED]> wrote:
>>
>>> Update on this one...
>>>
>>> I put some more memory in the machine running the name node. Now fsck
>>> is running. Unfortunately, ls fails with a time-out.
>>>
>>> I identified one directory that causes the trouble. I can run fsck on
>>> it, but not ls.
>>>
>>> What could be the problem?
>>>
>>> Gert
>>>
>>> Gert Pfeifer wrote:
>>>
>>>> Hi,
>>>>
>>>> I am running a Hadoop DFS on a cluster of 5 data nodes with a name
>>>> node and one secondary name node.
>>>>
>>>> I have 1788874 files and directories, 1465394 blocks = 3254268 total.
>>>> Heap Size max is 3.47 GB.
>>>>
>>>> My problem is that I produce many small files. Therefore I have a
>>>> cron job which runs daily across the new files, copies them into
>>>> bigger files, and deletes the small files.
>>>>
>>>> Apart from this program, even an fsck kills the cluster.
>>>>
>>>> The problem is that, as soon as I start this program, the heap space
>>>> of the name node reaches 100%.
>>>>
>>>> What could be the problem? There are not many small files right now,
>>>> and still it doesn't work. I guess we have had this problem since
>>>> the upgrade to 0.17.
>>>>
>>>> Here is some additional data about the DFS:
>>>>   Capacity      : 2 TB
>>>>   DFS Remaining : 1.19 TB
>>>>   DFS Used      : 719.35 GB
>>>>   DFS Used%     : 35.16 %
>>>>
>>>> Thanks for hints,
>>>> Gert
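Gert, on your question about estimating the heap needed for an ls: a rule of thumb that gets cited on this list is on the order of 150 bytes of namenode heap per namespace object (file, directory, or block). Treat the constant as a rough approximation, not something I have verified against the 0.17 code. Back-of-envelope for your numbers:

  3254268 objects * ~150 bytes ≈ 490 MB   (steady-state namespace)
  1785959 entries * ~150 bytes ≈ 270 MB   (transient objects for one listing of the big dir)

So the namespace itself should fit comfortably in your 3.47 GB. My guess (unverified) is the listing itself: as far as I know, in this version the namenode materializes the entire directory listing in memory to answer a single getListing call, and an allocation of that size, plus serialization overhead, could explain both the heap spike and the client time-out.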

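Since getting the data out of that directory is the immediate problem: if the files are plain text or otherwise safe to concatenate (an assumption on my part), something along these lines can drain a subdirectory into one big file; the paths here are made up for illustration:

  # Merge one day's small files into a single file, reupload it, then drop the originals.
  hadoop fs -getmerge /user/gert/small/2008-07-27 /tmp/merged-2008-07-27
  hadoop fs -put /tmp/merged-2008-07-27 /user/gert/big/2008-07-27
  hadoop fs -rmr /user/gert/small/2008-07-27
  rm /tmp/merged-2008-07-27

One caveat: -getmerge still makes the namenode list the source directory, so run it against subdirectories small enough to list, not against the 1785959-entry directory itself.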