connec opened a new issue, #6066:
URL: https://github.com/apache/arrow-rs/issues/6066

   ### Is your feature request related to a problem or challenge?
   
   I'm trying to read CSVs that include newlines in (quoted) values.
   
   ### Describe the solution you'd like
   
   As far as I can tell, the `arrow-csv` crate does not currently support this, 
whereas the C++ 
([`ParseOptions::newlines_in_values`](https://arrow.apache.org/docs/cpp/api/formats.html#_CPPv4N5arrow3csv12ParseOptions18newlines_in_valuesE))
 and Python 
([`ParseOptions.newlines_in_values`](https://arrow.apache.org/docs/python/generated/pyarrow.csv.ParseOptions.html#pyarrow.csv.ParseOptions.newlines_in_values))
 implementations do.
   
   Ideally, a `newlines_in_values` field could be added to 
[`arrow_csv::reader::Format`](https://docs.rs/arrow-csv/latest/arrow_csv/reader/struct.Format.html)
 to support this functionality.
   
   Note that the Python docs call out the performance implications of this:
   
   > Setting this to True reduces the performance of multi-threaded CSV reading.
   
   I haven't dug into the implementation, but I don't think `arrow-rs` itself 
supports parallel CSV reading, so this implication might not hold at this layer 
(my original use case is with `datafusion`, where it would be relevant).
   
   ### Describe alternatives you've considered
   
   The only alternative I can see would be to preprocess the CSV before feeding 
it to arrow. I haven't explored this option: my use case goes through 
`datafusion`, so it would require a lot of plumbing. It also seems valuable to 
have parity with other arrow CSV packages (C++ and Python, at least).
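
   For illustration only, here is a minimal sketch of what such a preprocessing 
step could look like. It uses only the Rust standard library and assumes 
RFC 4180-style quoting (double quotes, with `""` as the escape). The function 
name `flatten_quoted_newlines` and the choice to substitute a replacement 
string are hypothetical, not part of `arrow-csv`, and the substitution is lossy 
because the original line breaks are not preserved.

```rust
/// Replace line breaks that occur inside double-quoted CSV fields with
/// `replacement`, so that every record ends up on a single line.
///
/// Hypothetical helper for illustration; assumes `"` is the quote character
/// and that a literal quote inside a field is written as `""`.
fn flatten_quoted_newlines(input: &str, replacement: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut in_quotes = false;
    let mut chars = input.chars().peekable();

    while let Some(c) = chars.next() {
        match c {
            '"' => {
                // An escaped quote (`""`) toggles the state twice, so the
                // net effect is that we stay inside the quoted field.
                in_quotes = !in_quotes;
                out.push(c);
            }
            '\r' if in_quotes => {
                // Treat CRLF inside a quoted field as a single line break.
                if chars.peek() == Some(&'\n') {
                    chars.next();
                }
                out.push_str(replacement);
            }
            '\n' if in_quotes => out.push_str(replacement),
            // Everything else, including record-terminating newlines,
            // is copied through unchanged.
            _ => out.push(c),
        }
    }
    out
}
```

   Calling `flatten_quoted_newlines(raw, " ")` on the file contents before 
handing them to the reader would keep each record on one line, at the cost of 
an extra pass over the data and of altering the values themselves, which is 
part of why native support seems preferable.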
   
   ### Additional context
   
   N/A

