On Nov 24, 2009, at 12:06 PM, Todd Lipcon wrote:

> Also, keep in mind that, when you open a block for reading, the DN
> immediately starts writing the entire block (assuming it's requested via the
> xceiver protocol) - it's TCP backpressure on the send window that does flow
> control there.

Ok, that's a pretty freakin' cool idea. Is it well-documented how this technique works? How does this affect folks (me) who use the pread interface?

> So, although it's not explicitly reading ahead, most of the
> reads on DFSInputStream should be coming from the TCP receive buffer, not
> making round trips.
>
> At one point a few weeks ago I did hack explicit readahead around
> DFSInputStream and didn't see an appreciable difference. I didn't spend much
> time on it, though, so I may have screwed something up - wasn't a scientific
> test.
>

Speaking as someone who's worked with storage systems that do an explicit readahead, this can turn out to be a big giant disaster if it's combined with random reads. Big disaster as far as application-level throughput goes - but does make for impressive ganglia graphs!

Brian

> -Todd
>
> On Tue, Nov 24, 2009 at 10:02 AM, Eli Collins <[email protected]> wrote:
>
>> Hey Martin,
>>
>> It would be an interesting experiment but I'm not sure it would
>> improve things as the host (and hardware to some extent) are already
>> reading ahead. A useful exercise would be to evaluate whether the new
>> default host parameters for on-demand readahead are suitable for
>> hadoop.
>>
>> http://lwn.net/Articles/235164
>> http://lwn.net/Articles/235181
>>
>> Thanks,
>> Eli
>>
>> On Mon, Nov 23, 2009 at 11:23 PM, Martin Mituzas <[email protected]> wrote:
>>>
>>> I read the code and find the call
>>> DFSInputStream.read(buf, off, len)
>>> will cause the DataNode read len bytes (or less if encountering the end of
>>> block) , why does not hdfs read ahead to improve performance for sequential
>>> read?
>>> --
>>> View this message in context:
>>> http://old.nabble.com/why-does-not-hdfs-read-ahead---tp26491449p26491449.html
>>> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>>>
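
For anyone wondering what "TCP backpressure does the flow control" looks like in practice, here is a rough sketch. It is not HDFS code. It assumes Python and a pair of loopback TCP sockets standing in for the DN and the client. The 64 MB block size and the names push_until_stalled and demo are made up for illustration. The writer tries to push a whole "block" at once, the kernel stops accepting data once the socket buffers fill, and the writer can only continue after the reader has consumed what was sent.

```python
import select
import socket

BLOCK_SIZE = 64 * 1024 * 1024  # assumed block size, for illustration only
CHUNK = 64 * 1024              # bytes handed to the kernel per send() call


def push_until_stalled(writer, limit):
    """Send zero bytes on a non-blocking socket until the kernel refuses
    more data (buffers full) or `limit` bytes have been accepted."""
    payload = bytes(CHUNK)
    sent = 0
    while sent < limit:
        try:
            sent += writer.send(payload[:min(CHUNK, limit - sent)])
        except BlockingIOError:
            break  # send window and socket buffers are full: backpressure
    return sent


def demo():
    """Return (bytes accepted before the stall, bytes accepted after the
    reader drained the connection)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    reader = socket.create_connection(listener.getsockname())
    writer, _ = listener.accept()
    writer.setblocking(False)
    try:
        # The "DN" side eagerly writes; the "client" has not read anything.
        first = push_until_stalled(writer, BLOCK_SIZE)

        # The "client" now consumes everything that was buffered.
        remaining = first
        while remaining:
            remaining -= len(reader.recv(min(CHUNK, remaining)))

        # Once the window reopens, the writer can make progress again.
        _, writable, _ = select.select([], [writer], [], 5.0)
        second = push_until_stalled(writer, BLOCK_SIZE - first) if writable else 0
        return first, second
    finally:
        for s in (reader, writer, listener):
            s.close()
```

Calling demo() returns the two byte counts. The exact figures depend on the kernel's socket buffer settings, so treat them as indicative only.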
