Thanks Ted,
Unfortunately, they really are tiny files. Would it be good practice to have
HDFS combine those tiny files into a single block that fills the standard
64MB block size?
Ted Dunning wrote:
Are these really tiny files, or are you really storing 2M x 100MB = 200TB of
data? Or do you have more like 2M x 10KB = 20GB of data?
Map-reduce and HDFS will generally work much better if you can arrange to
have relatively larger files.
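For example, here is a rough sketch (untested, and the paths and class name are
just illustrative, not from your setup) of packing a directory of small local
files into a single SequenceFile on HDFS, keyed by filename, so the namenode
only has to track one large file:

// Illustrative only: pack many small local files into one SequenceFile on HDFS,
// keyed by filename, so the namenode tracks one big file instead of millions
// of tiny ones. The input directory and output path below are made-up examples.
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class PackSmallFiles {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Hypothetical locations -- substitute your own.
    File localDir = new File("/data/small-files");
    Path packed = new Path("/user/erolagnab/packed.seq");

    SequenceFile.Writer writer =
        SequenceFile.createWriter(fs, conf, packed, Text.class, BytesWritable.class);
    try {
      File[] files = localDir.listFiles();
      for (int i = 0; i < files.length; i++) {
        File f = files[i];
        if (!f.isFile()) continue;
        // Read the whole small file into memory.
        byte[] contents = new byte[(int) f.length()];
        DataInputStream in = new DataInputStream(new FileInputStream(f));
        in.readFully(contents);
        in.close();
        // Key = original filename, value = raw bytes of the small file.
        writer.append(new Text(f.getName()), new BytesWritable(contents));
      }
    } finally {
      writer.close();
    }
  }
}

A map-reduce job can then read the packed data with SequenceFileInputFormat
instead of opening millions of tiny files.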
On 7/15/07 8:04 AM, "erolagnab" <[EMAIL PROTECTED]> wrote:
I have an HDFS cluster with 2 datanodes and 1 namenode on 3 different machines,
with 2GB of RAM each.
Datanode A holds around 700,000 blocks and Datanode B holds 1,200,000+ blocks.
The namenode fails to start with an out-of-memory error when trying to add
Datanode B into its rack. I have increased the Java heap to 1600MB, which is
the maximum, but it still runs out of memory.
AFAIK, the namenode loads all block information into memory. If so, is there
any way to estimate how much RAM is needed for an HDFS cluster with a given
number of blocks on each datanode?
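My own back-of-envelope guess, assuming a ballpark of roughly 150-200 bytes of
heap per file/block object (that per-object figure is an assumption on my part,
not something I've measured), looks like this:

  700,000 + 1,200,000 ~= 1.9 million block replicas across the two datanodes,
  so on the order of 1-2 million blocks plus a similar number of tiny files.
  At ~200 bytes per object, that is very roughly
  3,000,000 x 200 bytes ~= 600 MB of heap for the namespace alone,
  before GC overhead and the per-datanode block maps.

Is that the right way to think about it, or is there a better rule of thumb?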