Cagdas Gerede wrote:
We will have 5 million files, each with 20 blocks of 2MB. With the minimum replication factor of 3, that comes to 300 million block replicas, storing 600TB. At ~10TB/node, this means a 60-node system. Do you think these numbers are suitable for Hadoop DFS?
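For reference, a quick sketch of the arithmetic above (the class name is hypothetical; the constants are just the numbers from the message):

    public class CapacityEstimate {
        public static void main(String[] args) {
            long files = 5000000L;       // 5 million files
            long blocksPerFile = 20L;    // 20 blocks per file
            long blockSizeMB = 2L;       // 2MB blocks
            long replication = 3L;       // minimum replication factor

            long uniqueBlocks = files * blocksPerFile;        // 100 million
            long blockReplicas = uniqueBlocks * replication;  // 300 million
            long totalTB = blockReplicas * blockSizeMB / 1000000L; // ~600TB (decimal TB)
            long nodes = totalTB / 10L;  // at ~10TB per node -> 60 nodes

            System.out.println(uniqueBlocks + " blocks, " + blockReplicas
                + " replicas, " + totalTB + "TB, ~" + nodes + " nodes");
        }
    }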
Why are you using such small blocks? A larger block size would decrease the strain on Hadoop (fewer blocks for the namenode to track), but perhaps you have reasons?
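If a larger block size is an option, it can be set client-wide via the dfs.block.size property (in bytes) or per file at creation time. A minimal sketch, assuming a standard Hadoop client; the path, class name, and 64MB value are placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockSizeExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Raise the default block size for files created by this client.
            conf.setLong("dfs.block.size", 64L * 1024 * 1024);
            FileSystem fs = FileSystem.get(conf);

            // The block size can also be given per file at creation time:
            // create(path, overwrite, bufferSize, replication, blockSize).
            FSDataOutputStream out = fs.create(new Path("/tmp/example"),
                true, 4096, (short) 3, 64L * 1024 * 1024);
            out.close();
        }
    }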
Doug