I think HARs maintain an index of every file's boundaries within the part files they create, so reading a particular archived file "seeks" to that file's starting offset within the block and reads only its bytes. It does not read the entire block just to retrieve that one file.
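A toy sketch of the idea (not Hadoop's actual HAR code, and the file names and helper below are made up for illustration): the archive keeps an index mapping each file name to an (offset, length) pair, so a lookup becomes one seek plus a bounded read rather than a scan of the whole blob.

```python
import io

# Hypothetical example data standing in for the small files.
files = {"a.txt": b"hello", "b.txt": b"world!"}

# Build the "part" blob and the index in a single pass.
part = io.BytesIO()
index = {}  # filename -> (offset, length)
for name, data in files.items():
    index[name] = (part.tell(), len(data))
    part.write(data)

def read_file(name):
    """Retrieve one file by seeking to its recorded offset."""
    offset, length = index[name]
    part.seek(offset)          # jump straight to the file's start
    return part.read(length)   # read only that file's bytes

print(read_file("b.txt"))
```

The point is that the cost of retrieving one file is proportional to that file's size, not the archive's; Hadoop's real HAR layout uses separate _index/_masterindex files alongside the part files to the same effect.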
On Wed, Nov 24, 2010 at 11:22 PM, Jason Ji <jason_j...@yahoo.com> wrote:
> Hi guys,
>
> We plan to use Hadoop HDFS as the storage for lots of little files.
> According to the documentation, it is recommended to use Hadoop Archives
> to pack those little files for better performance.
>
> Our question is: since HDFS reads an entire (say, 64 MB) block every
> time, does that mean that every time we retrieve a single file inside
> the archive, HDFS will still read the whole block as well? If not, what
> is the actual behavior, and is there any way we can verify it?
>
> Thanks in advance.
> Jason

--
Harsh J
www.harshj.com