Re: why does not hdfs read ahead ?

Steve Loughran Wed, 25 Nov 2009 02:36:17 -0800

Michael Thomas wrote:

Hey guys,
During the SC09 exercise, our data transfer tool was using the FUSE interface to HDFS. As Brian said, we were also reading 16 files in parallel. This seemed to be the optimal number, beyond which the aggregate read rate did not improve.
We have work scheduled to modify our data transfer tool to use the native Hadoop Java APIs, as well as to run some additional tests offline to see if the HDFS-FUSE interface is the bottleneck, as we suspect.
Regards,

--Mike


Was this all local data?

In Russ Perry's little paper "High Speed Raster Image Streaming For Digital Presses Using the Hadoop File System", he got 4Gb/s over the LAN by having a client app decide which datanode to pull each block from, rather than having the NN tell it which node to ask for which block.

"Measured stream rates approaching 4Gb/s were achieved which is close to the required rate for streaming pages containing rich designs to a digital press. This required only a minor extension to the Hadoop client to allow file blocks to be read in parallel from the Hadoop data nodes."


http://www.hpl.hp.com/techreports/2009/HPL-2009-345.html
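
To make the parallel-block idea concrete, here is a minimal Java sketch. It is not the extension from the paper and it does not use the Hadoop client API: it only splits a local file into fixed-size blocks and fetches them concurrently with positional reads, so the local file stands in for blocks held on different datanodes. The class name ParallelBlockRead, the method readInParallel, and the blockSize and threads parameters are all made up for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: a local file plays the role of an HDFS file,
// and each fixed-size slice plays the role of one block on some datanode.
final class ParallelBlockRead {

    // Reads the whole file by fetching blockSize-byte slices concurrently
    // on a pool of `threads` workers, then returns the assembled bytes.
    static byte[] readInParallel(Path file, int blockSize, int threads)
            throws IOException, InterruptedException, ExecutionException {
        if (blockSize <= 0 || threads <= 0) {
            throw new IllegalArgumentException("blockSize and threads must be positive");
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            if (size > Integer.MAX_VALUE - 8) {
                throw new IOException("file too large for a single byte array");
            }
            byte[] out = new byte[(int) size];
            List<Future<?>> pending = new ArrayList<>();

            // One task per block; each task writes into its own region of `out`.
            for (long offset = 0; offset < size; offset += blockSize) {
                final long start = offset;
                final int length = (int) Math.min(blockSize, size - offset);
                pending.add(pool.submit(() -> {
                    ByteBuffer buffer = ByteBuffer.wrap(out, (int) start, length);
                    long position = start;
                    // Positional reads do not move the channel's own position,
                    // so several threads can share one channel safely.
                    while (buffer.hasRemaining()) {
                        int n = channel.read(buffer, position);
                        if (n < 0) {
                            throw new IOException("unexpected end of file");
                        }
                        position += n;
                    }
                    return null;
                }));
            }

            // Wait for every block; a failed read surfaces as ExecutionException.
            for (Future<?> task : pending) {
                task.get();
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }
}
```

A call such as ParallelBlockRead.readInParallel(path, 1 << 20, 16) would keep up to 16 block reads in flight. In a real HDFS client the per-block task would instead talk to a datanode chosen for that block, which is the part the paper's extension addresses and this sketch leaves out.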
