Hi, I am trying to parse line-based text input in a mapper and have run into a problem. When DFS cuts a file into blocks, are lines preserved, or can a line be broken across two blocks? Also, although TextInputFormat breaks a file into line records, it looks as though the InputSplit handed to the InputFormat may not fall on line boundaries. If that is the case, is it possible to restore the broken lines for the mapper input, or do I have to drop them? Thank you.
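To make the question concrete, here is a small Python sketch (a plain simulation, not Hadoop code) of what I mean by "restoring" lines. It assumes one possible convention: a line belongs to the split in which its first byte falls, so a reader skips a partial first line and reads past the end of its split to finish its last line. The names read_split and read_all_splits are made up for illustration. Whether the record reader behind TextInputFormat does something like this is exactly what I would like to know.

```python
from typing import List


def read_split(data: bytes, start: int, end: int) -> List[bytes]:
    """Return the lines whose first byte lies in [start, end).

    Hypothetical convention for illustration only: a line is owned by
    the split in which it starts, even if it ends in a later split.
    """
    pos = start
    if start != 0:
        # Back up one byte and scan to the next newline. If the split
        # begins exactly at a line start, this lands back on `start`;
        # otherwise it skips the partial line owned by the previous split.
        newline = data.find(b"\n", start - 1)
        if newline == -1:
            return []
        pos = newline + 1

    lines = []
    while pos < end and pos < len(data):
        newline = data.find(b"\n", pos)
        # The last line may lack a trailing newline.
        stop = len(data) if newline == -1 else newline
        # The line may extend past `end`; it is read in full anyway.
        lines.append(data[pos:stop])
        pos = stop + 1
    return lines


def read_all_splits(data: bytes, split_size: int) -> List[List[bytes]]:
    """Cut data into fixed-size byte ranges and read each one separately."""
    return [
        read_split(data, start, min(start + split_size, len(data)))
        for start in range(0, len(data), split_size)
    ]
```

With this convention, cutting b"alpha\nbeta\ngamma\ndelta\n" into 8-byte splits still yields every line exactly once, with none broken or dropped, even though the split boundaries fall in the middle of "beta" and "gamma".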
Best, -Kevin