Nate Clark created ARROW-13252:
----------------------------------
Summary: [C++] CSV Add byte offset for error messages
Key: ARROW-13252
URL: https://issues.apache.org/jira/browse/ARROW-13252
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Nate Clark
Assignee: Nate Clark
CSV parsing error messages contain the row number when parallel parsing is
not enabled, but when parallel parsing is enabled there is no indication of
where the error occurred in the input. To add that context, the byte offset of
the offending row can be included in the error message.
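As a rough illustration of the idea, the sketch below reports the absolute byte offset of a bad row inside one independently parsed chunk. It is not Arrow's parser: `RowError`, `CheckChunk`, the `chunk_start_offset` parameter, and the message wording are all hypothetical, and the "parser" only counts commas. The assumption is that each chunk knows where it starts in the input even when it cannot know its absolute row number.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical error record; not part of Arrow's API.
struct RowError {
  int64_t byte_offset;  // absolute offset of the row start in the whole input
  std::string message;
};

// Toy check standing in for a real parser: every newline-separated row in
// `chunk` must have `expected_columns` comma-separated fields.
// `chunk_start_offset` is the position of the chunk in the overall input.
std::vector<RowError> CheckChunk(std::string_view chunk,
                                 int64_t chunk_start_offset,
                                 int expected_columns) {
  std::vector<RowError> errors;
  std::size_t row_start = 0;
  while (row_start < chunk.size()) {
    std::size_t row_end = chunk.find('\n', row_start);
    if (row_end == std::string_view::npos) row_end = chunk.size();
    std::string_view row = chunk.substr(row_start, row_end - row_start);

    int columns = 1;
    for (char c : row) {
      if (c == ',') ++columns;
    }

    if (columns != expected_columns) {
      // Offset within the chunk plus the chunk's own position in the input.
      int64_t offset = chunk_start_offset + static_cast<int64_t>(row_start);
      errors.push_back({offset,
                        "CSV parse error: row at byte offset " +
                            std::to_string(offset) + ": expected " +
                            std::to_string(expected_columns) +
                            " columns, got " + std::to_string(columns)});
    }
    row_start = row_end + 1;  // skip past the newline
  }
  return errors;
}
```

For example, calling `CheckChunk("a,b\nc\nd,e\n", 100, 2)` flags the row `c`, which starts 4 bytes into a chunk that itself starts at byte 100 of the input.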
This can be done relatively easily for the parser, but associating byte offsets
with the data or row being decoded would require more metadata to be maintained
in the DataBatch, potentially doubling the size of ParsedValueDesc.
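The size concern can be made concrete with a small sketch. The structs below are hypothetical stand-ins, not Arrow's actual definitions: they assume a value descriptor packed into 32 bits and a 32-bit byte offset stored next to it. Under those assumptions the per-value metadata grows from 4 to 8 bytes.

```cpp
#include <cstdint>

// Hypothetical stand-in for a packed per-value descriptor (assumed layout).
struct ValueDesc {
  uint32_t offset : 31;  // position of the value in the parsed data buffer
  uint32_t quoted : 1;   // whether the value was quoted in the input
};

// The same descriptor with a byte offset into the original input attached.
struct ValueDescWithByteOffset {
  uint32_t offset : 31;
  uint32_t quoted : 1;
  uint32_t input_byte_offset;  // assumed 32-bit; limits inputs to 4 GiB
};

static_assert(sizeof(ValueDesc) == 4, "descriptor packs into 32 bits");
static_assert(sizeof(ValueDescWithByteOffset) == 2 * sizeof(ValueDesc),
              "adding a 32-bit byte offset doubles the descriptor size");
```

Storing one offset per row rather than per value would cost less, but it would still be extra metadata carried alongside the parsed data.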
This was discussed in the comments
[here|https://github.com/apache/arrow/pull/10202#issuecomment-870796708].