Hi Chiranjeevi,

By default, FileSplitter sets its block size to the HDFS default block size,
which is optimal in most cases.

The BlockReaders in the Malhar library handle the case where a record is split
across blocks.
AbstractFSReadAheadLineReader does this by ignoring the bytes up to the
first end-of-line character and always reading one line ahead, i.e., even if
a line boundary coincides with the block boundary, it reads the next line from
the next block.
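
To make the idea concrete, here is a minimal sketch of the skip-and-read-ahead
logic over an in-memory byte array. It is not the Malhar code: the class
ReadAheadLineSketch and the method readBlock are hypothetical names made up
for this illustration, and it assumes '\n' line endings and that the first
block (offset 0) does not skip anything.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the AbstractFSReadAheadLineReader source.
class ReadAheadLineSketch {

    /** Returns the lines emitted for the block [start, start + length) of data. */
    static List<String> readBlock(byte[] data, int start, int length) {
        List<String> lines = new ArrayList<>();
        int end = Math.min(start + length, data.length);
        int pos = start;

        // Assumption: every block except the first ignores the bytes up to and
        // including the first '\n', because the reader of the previous block
        // has already consumed them by reading ahead.
        if (start > 0) {
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step past the end-of-line character
        }

        // Emit every line that starts at or before the block end. A line that
        // starts exactly at the block end is the "read ahead" line; it
        // physically lives in the next block.
        while (pos <= end && pos < data.length) {
            int lineStart = pos;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            lines.add(new String(data, lineStart, pos - lineStart, StandardCharsets.UTF_8));
            pos++; // step past the end-of-line character
        }
        return lines;
    }
}
```

With this scheme each line is emitted exactly once whatever the block size is,
since the block that reads ahead and the block that skips always agree on who
owns the line at the boundary.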

Chandni
