lidavidm opened a new pull request #10270: URL: https://github.com/apache/arrow/pull/10270
This does not implement a fast path for CSV. However, it does configure the CSV reader to not actually deserialize any data, which results in a large gain: when scanning 85 million rows of the NYC Taxi dataset, scan time dropped from 11 seconds to 2 seconds. This also sneaks in an implementation of the fast path for InMemoryFragment.
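For readers unfamiliar with the idea, here is a rough, library-independent sketch in Python of why skipping deserialization helps. It is not the Arrow C++ code from this PR; the function names and the sample data are hypothetical, and it assumes every field is numeric. Both functions still tokenize every row (so neither is a true fast path), but the second never converts field values into typed data.

```python
import csv
import io


def count_rows_with_conversion(text: str) -> int:
    """Baseline: parse every row and convert every field to a float."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    count = 0
    for row in reader:
        # Conversion work that a pure row count does not need.
        _ = [float(value) for value in row]
        count += 1
    return count


def count_rows_without_conversion(text: str) -> int:
    """Still splits the input into rows, but never converts any field."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return sum(1 for _ in reader)


# Hypothetical sample data, for illustration only.
sample = "fare,tip\n10.5,2.0\n7.25,0.0\n30.0,5.5\n"
print(count_rows_with_conversion(sample))
print(count_rows_without_conversion(sample))
```

Both functions return the same count; the second simply does less work per row, which is the same kind of saving described above.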
