Dandandan opened a new pull request #9488: URL: https://github.com/apache/arrow/pull/9488
This redesigns the CSV parser to use `csv-core` instead of `csv`, which avoids the overhead of `StringRecord` (a per-row allocation). The parser now uses a `Vec` for the offsets and the data. In the future, it could write directly to Arrow arrays instead.

The `test_csv` test passes.

TODO:

* Reuse the buffer between batches
* Support headers / skipping rows
* Fix issues
* Benchmark
* Look at tests
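
To illustrate the offsets-plus-data idea described in the PR, here is a minimal sketch using only the standard library. It is not the PR's code: `FlatRecords` and its methods are hypothetical names, the use of two separate `Vec`s is an assumption about the layout, and a naive `split(',')` (no quoting or escaping) stands in for the `csv-core` reader. The point is only that all fields of a batch share one byte buffer, so no per-row record is allocated.

```rust
/// Hypothetical flat storage for a batch of parsed CSV fields.
/// All field bytes live back to back in `data`; `ends[i]` is the
/// end offset of field `i`, counted row-major across the batch.
pub struct FlatRecords {
    data: Vec<u8>,
    ends: Vec<usize>,
    num_columns: usize,
}

impl FlatRecords {
    /// Parse `input` into flat storage. The comma split below is a
    /// stand-in for a real CSV state machine and ignores quoting.
    pub fn parse(input: &str, num_columns: usize) -> Result<Self, String> {
        let mut data = Vec::with_capacity(input.len());
        let mut ends = Vec::new();
        for (line_no, line) in input.lines().enumerate() {
            if line.is_empty() {
                continue; // skip blank lines
            }
            let mut fields = 0;
            for field in line.split(',') {
                // Append the field bytes and record where the field ends.
                data.extend_from_slice(field.as_bytes());
                ends.push(data.len());
                fields += 1;
            }
            if fields != num_columns {
                return Err(format!(
                    "line {}: expected {} fields, found {}",
                    line_no + 1,
                    num_columns,
                    fields
                ));
            }
        }
        Ok(FlatRecords { data, ends, num_columns })
    }

    /// Number of complete rows stored.
    pub fn num_rows(&self) -> usize {
        self.ends.len() / self.num_columns
    }

    /// Borrow one field as a string slice without copying.
    pub fn get(&self, row: usize, col: usize) -> &str {
        let i = row * self.num_columns + col;
        let start = if i == 0 { 0 } else { self.ends[i - 1] };
        // Fields were copied from a `&str` at field boundaries,
        // so each slice is valid UTF-8.
        std::str::from_utf8(&self.data[start..self.ends[i]]).unwrap()
    }
}
```

With this layout, `FlatRecords::parse("a,b\n1,2\n", 2)` yields two rows, and `get(1, 0)` borrows `"1"` straight from the shared buffer. An offsets-plus-bytes layout like this also resembles how Arrow stores variable-length string arrays, which may be part of why writing directly to Arrow arrays is mentioned as a future step.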
