There is nothing really preventing you from filling your HDFS with a
lot of very small files*, so it depends on your use case; however,
typical Hadoop usage calls for as large a block size as is practical,
so that very large files can be streamed off the disk efficiently.

* Except namenode heap space
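
If it helps, here's a rough sketch of how the HDFS block size can be
bumped up, either as a default via dfs.block.size or per-file at create
time. The path, sizes, and replication factor below are just
placeholders, not recommendations:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockSizeSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Default block size (in bytes) for files created with this config.
        conf.setLong("dfs.block.size", 128L * 1024 * 1024); // 128 MB

        FileSystem fs = FileSystem.get(conf);

        // Or override the block size for a single file at create time:
        // create(path, overwrite, bufferSize, replication, blockSize)
        FSDataOutputStream out = fs.create(new Path("/tmp/big-output"),
                true, 4096, (short) 3, 256L * 1024 * 1024);
        out.writeBytes("...\n");
        out.close();
    }
}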

Cheers,
Anthony

On Tue, Sep 8, 2009 at 10:31 AM, CubicDesign<cubicdes...@gmail.com> wrote:
> What is the best disk cluster size for a Linux partition (let's say ext3)
> when using Hadoop on top of it?
> The default size is 4KB. Will Hadoop get an advantage if I format the
> disk with an 8 or 16KB cluster size, or even larger?
>
