Hi Vincent,

Maybe you'd be better off using something like MogileFS for this application?

-Todd

On Tue, Oct 11, 2011 at 1:39 AM, Vincent Boucher <vin.bouc...@gmail.com> wrote:
> Hello again,
>
> * Our case:
>
> Most of the files we are dealing with are about 10GB in size. Our HDFS
> configuration would be the following: data is stored on 10 mass storage
> servers (50TB each), each using RAID6; no HDFS replication for the data.
>
> With a 64MB HDFS block size, a 10GB file is about 160 blocks, so it is
> extremely likely that every one of our 10GB files will be spread across all
> of the mass storage servers. Consequently, having one of these servers
> down/dead would effectively corrupt the whole filesystem (every 10GB file
> would be missing blocks). Not great.
>
> Opting for bigger blocks (12.5GB [= 200 x 64MB]) would reduce the spread:
> each 10GB file would fit in a single block and therefore be stored on a
> single server. Having one server down/dead would then corrupt only about
> 10% of the files in the filesystem (since there are 10 servers). That is
> much easier to regenerate/re-download from other Tiers than redoing the
> full filesystem, as would be the case with 64MB blocks.
>
>
> * Questions:
>
>  Is HDFS suitable for such a huge block size (12.5GB)?
>
>  Do you have experience running HDFS with block sizes like this?
>
>
> Cheers,
>
> Vincent
>
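
For anyone who does try this on HDFS: the block size can be set per file at
create time, independently of the cluster-wide dfs.block.size setting
(dfs.blocksize in newer releases). Below is a rough sketch using the Java
FileSystem API; the namenode URI, file path, and buffer size are
placeholders, and the 12.5GB figure is the one from the quoted mail.

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BigBlockWrite {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // 12.5GB blocks (200 x 64MB), so a 10GB file fits in a single block.
    // The value must remain a multiple of the checksum chunk size
    // (512 bytes by default), which this is.
    long blockSize = 200L * 64 * 1024 * 1024;  // 13,421,772,800 bytes

    // Placeholder namenode URI and file path.
    FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020/"), conf);
    Path file = new Path("/data/sample-10GB.bin");

    // create(path, overwrite, bufferSize, replication, blockSize);
    // replication = 1 matches the "no replica" setup described above.
    FSDataOutputStream out = fs.create(file, true, 4096, (short) 1, blockSize);
    try {
      // ... write the ~10GB payload here ...
    } finally {
      out.close();
    }
    fs.close();
  }
}

From the shell, something similar should work via the generic -D option,
e.g. hadoop fs -D dfs.block.size=13421772800 -put bigfile /path/bigfile,
without touching the cluster-wide configuration.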



-- 
Todd Lipcon
Software Engineer, Cloudera
