seddonm1 opened a new pull request #9001: URL: https://github.com/apache/arrow/pull/9001
The current CSV reader cannot parse strings with leading or trailing whitespace into typed values because the parsers are very strict. As a result, it is not possible to read and parse the [tpch-dbgen included answers](https://github.com/databricks/tpch-dbgen/tree/master/answers) files.

The underlying csv crate supports four different [behaviors for trimming strings](https://docs.rs/csv/1.1.5/csv/enum.Trim.html):

- `None` (default): does no trimming.
- `Headers`: trim only header fields.
- `Fields`: trim only field values.
- `All`: trim both headers and field values.

Rather than exposing all of these options and forcing users to understand the underlying csv crate, this PR simplifies the decision to a boolean: `None` (false) or `All` (true), while retaining the default behavior of false.
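The boolean-to-trim mapping described in this PR can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the `Trim` enum here is a local stand-in that mirrors the four variants of the csv crate's `Trim`, and the names `trim_from_flag` and `parse_i64_field` are hypothetical helpers introduced only for this example.

```rust
/// Local stand-in mirroring the four variants of the csv crate's `Trim` enum
/// (an assumption for illustration; the real type lives in the csv crate).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trim {
    None,
    Headers,
    Fields,
    All,
}

/// Hypothetical helper: collapse the four trimming modes into a boolean,
/// where `false` keeps the default (`None`) and `true` selects `All`.
pub fn trim_from_flag(trim: bool) -> Trim {
    if trim {
        Trim::All
    } else {
        Trim::None
    }
}

/// Hypothetical helper: parse one field as `i64`, trimming surrounding
/// whitespace first only when the trim mode covers field values.
pub fn parse_i64_field(raw: &str, trim: Trim) -> Option<i64> {
    let field = match trim {
        Trim::Fields | Trim::All => raw.trim(),
        Trim::None | Trim::Headers => raw,
    };
    // Rust's integer parser rejects surrounding whitespace, which is the
    // strictness the PR description refers to.
    field.parse::<i64>().ok()
}
```

With the flag left at `false`, a padded field such as `" 42 "` fails to parse, which matches the strict behavior described above; with `true`, the same field parses to `42`.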