Thanks Doug for your answers. Our interest is more in the distributed file
system part than in MapReduce.
I must confess that our block size is not as large as what most people
configure. I would appreciate your and others' input.


We will have 5 million files, each with 20 blocks of 2MB, i.e. 100 million
unique blocks. With the minimum replication of 3, that is 300 million block
replicas, which would store about 600TB. At ~10TB/node, this means a 60-node
system.
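
To make the arithmetic explicit, here is a quick back-of-the-envelope
calculation in Python (a sketch only, using the figures above; the variable
names are just for illustration):

    # Back-of-the-envelope sizing with the figures above (sketch only).
    files = 5_000_000
    blocks_per_file = 20
    block_size_mb = 2
    replication = 3
    node_capacity_tb = 10

    unique_blocks = files * blocks_per_file                 # 100 million blocks
    block_replicas = unique_blocks * replication            # 300 million replicas
    raw_storage_tb = block_replicas * block_size_mb / 1e6   # ~600 TB (1 TB = 10^6 MB)
    nodes = raw_storage_tb / node_capacity_tb               # ~60 nodes

    print(unique_blocks, block_replicas, raw_storage_tb, nodes)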

Do you think these numbers are suitable for Hadoop DFS?

Cagdas



> At ~100M per block, 100M blocks would store 10PB.  At ~1TB/node, this means
> a ~10,000 node system, larger than Hadoop currently supports well (for this
> and other reasons).
>
> Doug
>
>


-- 
------------
Best Regards, Cagdas Evren Gerede
Home Page: http://cagdasgerede.info
