seddonm1 opened a new pull request #9001:
URL: https://github.com/apache/arrow/pull/9001


   The current CSV reader cannot parse string fields that have leading or 
trailing whitespace into typed values, because the type parsers are strict. 
As a result, the [tpch-dbgen included 
answers](https://github.com/databricks/tpch-dbgen/tree/master/answers) files 
cannot be read and parsed.
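
   To illustrate the strictness described above, here is a minimal sketch using only the Rust standard library. The helper names `parse_strict` and `parse_trimmed` are hypothetical and are not part of the Arrow CSV reader; they only show that a padded field fails to parse until it is trimmed.

```rust
// Hypothetical helpers for illustration only; not the Arrow CSV reader's code.

/// Parse a field exactly as given. Surrounding whitespace causes a failure.
fn parse_strict(field: &str) -> Option<i64> {
    field.parse().ok()
}

/// Trim surrounding whitespace first, then parse.
fn parse_trimmed(field: &str) -> Option<i64> {
    field.trim().parse().ok()
}

fn main() {
    // A padded field, similar to what whitespace-aligned answer files contain.
    let padded = " 42 ";
    assert_eq!(parse_strict(padded), None);
    assert_eq!(parse_trimmed(padded), Some(42));
}
```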
   
   The underlying csv crate supports four [behaviors for trimming 
strings](https://docs.rs/csv/1.1.5/csv/enum.Trim.html): 
   - `None` (default): does no trimming.
   - `Headers`: trim only header fields.
   - `Fields`: trim only field values.
   - `All`: trim both headers and field values.
   
   Rather than exposing all of these options and forcing users to understand 
the underlying csv crate, this PR reduces the choice to a boolean: `None` 
(false) or `All` (true), keeping false as the default.
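
   The mapping from the boolean to the two trimming modes could be sketched as follows. This is a dependency-free illustration, not the PR's implementation: the `Trim` enum here is a local stand-in that mirrors the variants listed above, and `trim_mode` and `apply_trim` are hypothetical helper names.

```rust
/// Local stand-in mirroring the four variants of the csv crate's `Trim` enum.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Trim {
    None,
    Headers,
    Fields,
    All,
}

/// Hypothetical helper: collapse the four modes into a single boolean choice.
fn trim_mode(trim: bool) -> Trim {
    if trim { Trim::All } else { Trim::None }
}

/// Hypothetical helper: apply a mode to one header and one field value.
fn apply_trim<'a>(mode: Trim, header: &'a str, field: &'a str) -> (&'a str, &'a str) {
    let trim_header = matches!(mode, Trim::Headers | Trim::All);
    let trim_field = matches!(mode, Trim::Fields | Trim::All);
    (
        if trim_header { header.trim() } else { header },
        if trim_field { field.trim() } else { field },
    )
}

fn main() {
    // The two modes that the boolean does not expose.
    assert_eq!(apply_trim(Trim::Headers, " qty ", " 7 "), ("qty", " 7 "));
    assert_eq!(apply_trim(Trim::Fields, " qty ", " 7 "), (" qty ", "7"));
    // The two modes reachable through the boolean.
    assert_eq!(apply_trim(trim_mode(false), " qty ", " 7 "), (" qty ", " 7 "));
    assert_eq!(apply_trim(trim_mode(true), " qty ", " 7 "), ("qty", "7"));
}
```

   The trade-off in this design is that header-only and field-only trimming are not reachable through the boolean.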


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

