Hadoop works fine on the local file system. The example apps don't
even bother copying things into hdfs first. But the problem with
handling huge numbers of small files on the local filesystem, as Ted
mentioned, is IO speed. Hard drives just aren't that fast no matter
how much you spend.

I would bet that hdfs is going to chunk those files up and parcel them
out to the processes much more efficiently than reading them
directly from the local filesystem would. I don't have any
numbers to back this up, but since the local filesystem ISN'T tuned for
this usage and hdfs IS, it seems reasonable to expect better
performance from it.
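
For anyone who wants rough numbers on the small-files side of this, here is
a minimal sketch that times reading many small files against reading the same
bytes packed into one file on the local disk. It says nothing about hdfs
itself, the function names (make_small_files, pack, read_all) are made up for
illustration, and because the files were just written the OS cache will
flatter both timings, so treat the output as a starting point only.

```python
import os
import tempfile
import time


def make_small_files(directory, count, size):
    """Write `count` files of `size` bytes each and return their paths."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"part-{i:05d}")
        with open(path, "wb") as f:
            f.write(b"x" * size)
        paths.append(path)
    return paths


def pack(paths, target):
    """Concatenate the small files into a single large file."""
    with open(target, "wb") as out:
        for path in paths:
            with open(path, "rb") as f:
                out.write(f.read())


def read_all(paths):
    """Read every file fully; return (total bytes read, elapsed seconds)."""
    start = time.perf_counter()
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            total += len(f.read())
    return total, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical sizes; pick values that resemble your own data set.
    with tempfile.TemporaryDirectory() as d:
        small = make_small_files(d, count=2000, size=4096)
        big = os.path.join(d, "packed.bin")
        pack(small, big)
        n_small, t_small = read_all(small)
        n_big, t_big = read_all([big])
        print(f"{len(small)} small files: {n_small} bytes in {t_small:.4f}s")
        print(f"1 packed file: {n_big} bytes in {t_big:.4f}s")
```

Running it against a directory on the actual drive in question, with a cold
cache, would give a fairer picture than the temporary directory used here.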



-- 
- kate = masukomi
http://weblog.masukomi.org/