Erik Paulson wrote:
When reading from HDFS, how big are the network read requests, and what controls that? Or, more concretely: if I store files using 64 MB blocks in HDFS and run the simple word count example, and I get the default of one FileSplit/Map task per 64 MB block, how many bytes into the second 64 MB block will a mapper read before it first passes a buffer up to the record reader to see if it has found an end-of-line?
This is controlled by io.file.buffer.size, which is 4 KB (4096 bytes) by default.

Doug
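
To make the effect of the buffer size concrete, here is a minimal sketch in plain Java with no Hadoop dependency. It is not Hadoop's record reader code; the class and method names (BufferedLineScan, bytesFetchedPastBoundary) are hypothetical, and it models only a reader that pulls fixed-size chunks and scans each one for a newline. Any read-ahead done underneath by the DFS client or the operating system is not modeled.

```java
import java.io.ByteArrayInputStream;
import java.util.Arrays;

// Hypothetical illustration, not Hadoop code: counts how many bytes a
// chunked reader pulls past a block boundary before it sees a newline.
class BufferedLineScan {

    // Reads bufferSize-byte chunks starting at `boundary` and stops after the
    // first chunk that contains '\n' (or at end of data). Returns the total
    // number of bytes pulled from the stream.
    static long bytesFetchedPastBoundary(byte[] data, int boundary, int bufferSize) {
        ByteArrayInputStream in =
                new ByteArrayInputStream(data, boundary, data.length - boundary);
        byte[] buf = new byte[bufferSize];
        long fetched = 0;
        while (true) {
            int n = in.read(buf, 0, buf.length);
            if (n < 0) {
                return fetched; // end of data, no newline found
            }
            fetched += n;
            for (int i = 0; i < n; i++) {
                if (buf[i] == '\n') {
                    return fetched; // the whole chunk was already fetched
                }
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for two adjacent blocks: the boundary is at offset 10000
        // and the line that straddles it ends 5000 bytes into the second block.
        byte[] data = new byte[20000];
        Arrays.fill(data, (byte) 'a');
        data[15000] = '\n';

        // 4096 mirrors the default io.file.buffer.size quoted above.
        System.out.println(bytesFetchedPastBoundary(data, 10000, 4096)); // prints 8192
    }
}
```

In this model the reader always fetches a whole number of buffers (except at end of data), so a line ending 5000 bytes past the boundary costs two 4096-byte reads. A larger buffer means fewer reads but more bytes fetched beyond the newline.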
