Each "object" (file, directory, and block) uses about 150 bytes of memory. If you lower the number of files by having larger ones, you save a modest amount of memory, depending on how many blocks your existing files use. The real savings comes from having larger files and a larger block size. Lets say you start by having some number of files where each file fits in a single block (64 MB). If you double the average block usage to two (<=128MB files) by lowering your number of files in half, you'll save up to 1/3 of your NN memory. If you also double your block size, you'll cut your usage by another 1/3. This becomes very significant very quickly.
-Joey

On Fri, Jun 10, 2011 at 12:36 PM, Anh Nguyen <angu...@redhat.com> wrote:
> On 06/10/2011 04:57 AM, Joey Echeverria wrote:
>>
>> Hi On,
>>
>> The namenode stores the full filesystem image in memory. Looking at
>> your stats, you have ~30 million files/directories and ~47 million
>> blocks. That means that on average, each of your files is only ~1.4
>> blocks in size. One way to lower the pressure on the namenode would
>> be to store fewer, larger files. If you're able to concatenate files
>> and still parse them, great. Otherwise, Hadoop provides a couple of
>> container file formats that might help.
>>
>> SequenceFiles are Hadoop-specific binary files that store key/value
>> pairs. If your data fits that model, you can convert the data into
>> SequenceFiles when you write it to HDFS, including data from multiple
>> input files in a single SequenceFile. Here is a simple example of
>> using the SequenceFile API:
>>
>> http://programmer-land.blogspot.com/2009/04/hadoop-sequence-files.html
>>
>> Another option is Hadoop Archive files (HARs). A HAR file lets you
>> combine multiple smaller files into a virtual filesystem. Here are
>> some links with details on HARs:
>>
>> http://developer.yahoo.com/blogs/hadoop/posts/2010/07/hadoop_archive_file_compaction/
>> http://hadoop.apache.org/mapreduce/docs/current/hadoop_archives.html
>>
>> If you're able to use any of these techniques to grow your average
>> file size, then you can also save memory by increasing the block size.
>> The default block size is 64 MB; most clusters I've been exposed to
>> run at 128 MB.
>>
>> -Joey
>>
> Hi Joey,
> The explanation is really helpful, as I've been looking for a way to
> size the NN heap.
> What I am still not clear on, though, is why changing the average file
> size would save NN heap usage.
> Since it contains the entire FS image, the sum of space would be the
> same regardless of file size.
> Or is that not the case, because with a larger file size there would be
> less metadata to maintain?
> Thanks in advance for your clarification, particularly for advice on
> sizing the heap.
>
> Anh-
>
>> On Fri, Jun 10, 2011 at 7:45 AM, si...@ugcv.com <si...@ugcv.com> wrote:
>>>
>>> Dear all,
>>>
>>> I'm looking for ways to improve the namenode heap size usage of an
>>> 800-node, 10PB testing Hadoop cluster that stores around 30 million
>>> files.
>>>
>>> Here's some info:
>>>
>>> 1 x namenode: 32GB RAM, 24GB heap size
>>> 800 x datanode: 8GB RAM, 13TB hdd
>>>
>>> 33050825 files and directories, 47708724 blocks = 80759549 total.
>>> Heap Size is 22.93 GB / 22.93 GB (100%)
>>>
>>> From the cluster summary report, it seems the heap size usage is
>>> always full but never drops. Do you guys know of any ways to reduce
>>> it? So far I don't see any namenode OOM errors, so it looks like the
>>> memory assigned to the namenode process is (just) enough. But I'm
>>> curious which factors would account for the full use of the heap.
>>>
>>> Regards,
>>> On
>>>

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434
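
For reference, here's a minimal sketch of the SequenceFile packing approach mentioned up-thread, using the classic SequenceFile.createWriter(fs, conf, path, keyClass, valueClass) API. The class name and paths are made up for illustration, and it assumes each input file is small enough to buffer in memory on its own:

  import java.io.IOException;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataInputStream;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.BytesWritable;
  import org.apache.hadoop.io.SequenceFile;
  import org.apache.hadoop.io.Text;

  // Packs every file directly under an input directory into one SequenceFile,
  // keyed by the original path so the records can still be told apart later.
  public class PackSmallFiles {
    public static void main(String[] args) throws IOException {
      Configuration conf = new Configuration();
      FileSystem fs = FileSystem.get(conf);
      Path inputDir = new Path(args[0]);  // directory full of small files
      Path outFile = new Path(args[1]);   // e.g. /data/packed.seq

      SequenceFile.Writer writer = SequenceFile.createWriter(
          fs, conf, outFile, Text.class, BytesWritable.class);
      try {
        for (FileStatus stat : fs.listStatus(inputDir)) {
          if (stat.isDir()) {
            continue;  // skip subdirectories
          }
          byte[] buf = new byte[(int) stat.getLen()];
          FSDataInputStream in = fs.open(stat.getPath());
          try {
            in.readFully(0, buf);
          } finally {
            in.close();
          }
          writer.append(new Text(stat.getPath().toString()),
                        new BytesWritable(buf));
        }
      } finally {
        writer.close();
      }
    }
  }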