Are these really tiny files, or are you really storing 2M x 100MB = 200TB of
data? Or do you have more like 2M x 10KB = 20GB of data?

Map-reduce and HDFS will generally work much better if you can arrange to
have relatively larger files.
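
To make that concrete, here is a very rough back-of-envelope sketch (not an official figure): the namenode keeps an in-memory object for every file, directory and block, and a commonly quoted ballpark is somewhere around 150 bytes per object, so it is the object count rather than the total data size that drives namenode heap. The class name, the 150-byte constant and the one-block-per-file assumption below are all illustrative assumptions, not numbers taken from the Hadoop source:

    // Very rough back-of-envelope sketch -- the ~150 bytes/object figure is
    // only a commonly quoted ballpark, not a value from the Hadoop code.
    public class NameNodeHeapGuess {
        public static void main(String[] args) {
            long blocks = 700000L + 1200000L;  // blocks reported by the two datanodes
            long files  = blocks;              // pessimistic assumption: one block per file
            long bytesPerObject = 150L;        // assumed namenode memory per file/block object
            long heapBytes = (blocks + files) * bytesPerObject;
            System.out.println("Rough metadata heap: "
                    + (heapBytes / (1024 * 1024)) + " MB");
        }
    }

In practice the real footprint can be several times higher once you account for replica lists, JVM object overhead and transient state during block reports, but the takeaway is the same either way: fewer, larger files keep the block count (and therefore the namenode heap) down.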


On 7/15/07 8:04 AM, "erolagnab" <[EMAIL PROTECTED]> wrote:

> 
> I have an HDFS with 2 datanodes and 1 namenode on 3 different machines, 2GB RAM
> each.
> Datanode A contains around 700,000 blocks and Datanode B contains 1,200,000+
> blocks. The namenode fails to start due to running out of memory when trying to add
> Datanode B into its rack. I have adjusted the Java heap size to 1600MB,
> which is the maximum, but it still runs out of memory.
> 
> AFAIK, the namenode loads all block information into memory. If so, is there
> any way to estimate how much RAM is needed for an HDFS with a given number of
> blocks in each datanode?
