bbannier opened a new pull request, #5759:
URL: https://github.com/apache/arrow-rs/pull/5759
This patch adds reader support for a comment character when reading CSV
files. Comments, like almost everything else about the CSV format, are not
truly standardized, but a common convention supported by many CSV
readers[^1][^2] is to ignore full lines starting with a comment character
(often `#`). Inline and end-of-line comments are not supported.
Example:
# This is a comment in a CSV file without header.
1,2
# Comment inside the data block.
11,22
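To make the intended semantics concrete, here is a minimal standalone sketch of the behavior described above. It is not the `csv_core` or Arrow implementation: `parse_skipping_comments` is a hypothetical helper written only for illustration, and it assumes simple input with no quoting or escaping.

```rust
/// Hypothetical helper, for illustration only: split simple CSV text into
/// rows, skipping every line whose first byte is the `comment` byte.
/// Assumes no quoted fields or escapes; a real reader must handle both.
fn parse_skipping_comments(input: &str, comment: u8) -> Vec<Vec<String>> {
    input
        .lines()
        // Drop only lines that *start* with the comment byte. A comment
        // byte appearing later in a line is left alone as ordinary data.
        .filter(|line| line.as_bytes().first() != Some(&comment))
        // Split each remaining line into its comma-separated fields.
        .map(|line| line.split(',').map(str::to_string).collect())
        .collect()
}
```

Called on the example input with `b'#'` as the comment byte, this yields the two data rows `1,2` and `11,22`. A line such as `1,2 # note` is not treated as a comment, so its second field keeps the trailing text.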
The implementation of this for Arrow is straightforward: all we need to do
is expose the existing `comment` option of `csv_core`, which is used to read
CSV files.
Closes #5758.
[^1]: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
[^2]: https://stat.ethz.ch/R-manual/R-devel/library/utils/html/read.table.html