Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/164#issuecomment-38025512
If I understand the purpose correctly, this PR is for reading small text
files, yet most of the code handles the corner case where a file's size
exceeds 2GB. You mentioned Mahout hit this problem. What was the use case
there? If someone needs to concatenate several 2GB byte buffers to create a
single Text record, they are very likely not doing the right thing, IMHO.
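For context on why the 2GB boundary is a hard limit and not just a performance concern: JVM arrays are indexed by `int`, so a single `byte[]` (and hence a single in-memory `Text` record backed by one) cannot hold more than `Integer.MAX_VALUE` bytes. A minimal sketch of the arithmetic, with the 3GB file size chosen purely as a hypothetical example:

```java
// Illustration of the JVM array-size ceiling that makes files over
// ~2GB a corner case: array lengths are ints, so one byte[] tops out
// at Integer.MAX_VALUE (2^31 - 1) elements.
public class ArrayLimit {
    public static void main(String[] args) {
        long fileSize = 3L * 1024 * 1024 * 1024;       // hypothetical 3GB file
        long maxArray = Integer.MAX_VALUE;             // 2147483647 bytes
        // true: this file cannot be materialized as a single byte[]
        System.out.println(fileSize > maxArray);
    }
}
```

So any code path that tries to build one contiguous record from such a file must stitch together multiple buffers, which is the complexity the comment is questioning.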