Serge P. Nekoval created CSV-229:
------------------------------------
Summary: Allow byte position tracking in CSVParser
Key: CSV-229
URL: https://issues.apache.org/jira/browse/CSV-229
Project: Commons CSV
Issue Type: New Feature
Components: Parser
Reporter: Serge P. Nekoval
Attachments: csv_bytes.patch
The attached patch makes significant modifications to
ExtendedBufferedReader.
The problem is that efficient CSV parsing requires *byte positioning*, not
character positioning as currently provided.
The cases where byte positioning is necessary:
* Suspend/resume parsing
* Pagination/split where a large CSV file is read in chunks using file
positioning.
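To illustrate why both cases need a byte offset, here is a minimal JDK-only
sketch. It is not taken from the attached patch and uses no Commons CSV API.
The class ResumeAtByteOffset and its methods are hypothetical names chosen for
this example, and it assumes the saved offset falls on a line boundary.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: resumes reading a file at a previously saved byte offset.
class ResumeAtByteOffset {

    // Seeks to byteOffset and returns the next line decoded with the given charset.
    // Assumes byteOffset points at the first byte of a line.
    static String readLineAt(Path file, long byteOffset, Charset charset) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            channel.position(byteOffset); // files are positioned in bytes, not characters
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(Channels.newInputStream(channel), charset));
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a small CSV file, then reads its third line via a byte offset.
    static String demo() {
        try {
            Path file = Files.createTempFile("resume", ".csv");
            String firstTwoLines = "id,name\n1,Zo\u00eb\n";
            Files.write(file, (firstTwoLines + "2,Bob\n").getBytes(StandardCharsets.UTF_8));
            // The byte offset differs from firstTwoLines.length() because
            // '\u00eb' takes two bytes in UTF-8.
            long offset = firstTwoLines.getBytes(StandardCharsets.UTF_8).length;
            String line = readLineAt(file, offset, StandardCharsets.UTF_8);
            Files.delete(file);
            return line;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A character count cannot be passed to FileChannel.position() as soon as the
data contains a multi-byte character, which is why the parser itself would
need to report the byte position.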
In its current state, ExtendedBufferedReader cannot track bytes because it
relies on BufferedReader and operates on characters, so I had to redesign it
and merge the two classes.
This modification is what we use in our system, so I'm hoping to get it
released (otherwise we have to maintain a custom build of Commons CSV).
Architecturally, the solution might be incomplete, but it provides what I
need: getBytePosition() from a CSVParser. The entire chain only works if you
provide a Reader AND a charset!
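As a rough illustration of why both a Reader and a charset are needed, the
sketch below derives a byte position from decoded characters by re-encoding
each one. This is not the patch's implementation. ByteCountingReader is a
hypothetical class written for this example, and it assumes a charset that
encodes each character independently and emits no byte order mark (for
example UTF-8 or ISO-8859-1).

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper: counts how many bytes the characters read so far
// would occupy in the given charset.
class ByteCountingReader extends Reader {
    private final Reader in;
    private final Charset charset;
    private long bytePosition;
    private char pendingHigh;       // high surrogate waiting for its low half
    private boolean hasPendingHigh;

    ByteCountingReader(Reader in, Charset charset) {
        this.in = in;
        this.charset = charset;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = in.read(buf, off, len);
        for (int i = 0; i < n; i++) {
            count(buf[off + i]);
        }
        return n;
    }

    // Adds the encoded size of one character; surrogate pairs are encoded together.
    private void count(char c) {
        if (Character.isHighSurrogate(c)) {
            pendingHigh = c;
            hasPendingHigh = true;
            return;
        }
        String s = (hasPendingHigh && Character.isLowSurrogate(c))
                ? new String(new char[] {pendingHigh, c})
                : String.valueOf(c);
        hasPendingHigh = false;
        bytePosition += s.getBytes(charset).length;
    }

    long getBytePosition() {
        return bytePosition;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    // Convenience for experiments: reads the whole text and returns the byte count.
    static long byteLengthOf(String text, Charset charset) {
        try (ByteCountingReader r = new ByteCountingReader(new StringReader(text), charset)) {
            while (r.read() != -1) {
                // drain the reader one character at a time
            }
            return r.getBytePosition();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Without the charset, the same characters could map to different byte counts,
so a Reader alone does not carry enough information to report a byte position.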