Dandandan commented on PR #3365: URL: https://github.com/apache/arrow-rs/pull/3365#issuecomment-1357239281
> The performance gain is significantly better than I expected, to the point where I wonder if I've messed something up 😅
>
> In particular the timings not scaling with batch size seems somewhat suspect to me...

The timings make sense to me. For a single batch the performance difference will be pretty large (as in the benchmark), but for a full CSV file with many batches the difference is probably smaller, because the `csv`-based implementation re-uses the allocated `StringRecord`s across batches (as long as they are large enough to hold the record).
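
To illustrate the re-use effect, here is a minimal, std-only sketch. It is not the code from this PR or from the `csv` crate: `Record`, `fill`, `get`, and `read_batch` are hypothetical names, and `Record` is only a simplified stand-in for a `StringRecord`-like buffer. It shows that record buffers are allocated for the first batch only, and later batches of the same size re-use them.

```rust
/// Hypothetical stand-in for a reusable record buffer
/// (an assumption for illustration, not the `csv` crate's `StringRecord`).
#[derive(Default)]
struct Record {
    fields: String,   // bytes of all fields, stored back to back
    ends: Vec<usize>, // end offset of each field within `fields`
}

impl Record {
    /// Overwrites the record with the fields of `line`.
    /// `clear` keeps the existing capacity, so a buffer that is already
    /// large enough does not need to grow again.
    fn fill(&mut self, line: &str) {
        self.fields.clear();
        self.ends.clear();
        for field in line.split(',') {
            self.fields.push_str(field);
            self.ends.push(self.fields.len());
        }
    }

    /// Returns field `i`, or `None` if the record has fewer fields.
    fn get(&self, i: usize) -> Option<&str> {
        let end = *self.ends.get(i)?;
        let start = if i == 0 { 0 } else { self.ends[i - 1] };
        Some(&self.fields[start..end])
    }
}

/// Parses one batch into `records`, re-using buffers left over from
/// earlier batches. Returns how many new `Record`s had to be created.
fn read_batch(lines: &[&str], records: &mut Vec<Record>) -> usize {
    let mut newly_created = 0;
    for (i, line) in lines.iter().enumerate() {
        if i == records.len() {
            records.push(Record::default());
            newly_created += 1;
        }
        records[i].fill(line);
    }
    newly_created
}

fn main() {
    let batch = ["a,1", "b,2", "c,3"];
    let mut records = Vec::new();

    // The first batch pays for creating every record buffer.
    let first = read_batch(&batch, &mut records);
    // A second batch of the same size re-uses all of them.
    let second = read_batch(&batch, &mut records);

    println!("new records: first batch = {}, second batch = {}", first, second);
    println!("row 0, column 1 = {:?}", records[0].get(1));
}
```

A single-batch benchmark only ever exercises the first call, where every buffer is new, which is why the gap there would look larger than over a whole file.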
