andygrove commented on issue #6127: URL: https://github.com/apache/arrow-datafusion/issues/6127#issuecomment-1523680537
Many of the benchmarks I currently see online query a single CSV file, so we may want to benchmark that to measure the "first impression" of performance. A more realistic use case, IMO, is querying partitioned Parquet files, so it would be ideal to benchmark both. Personally, I think there is less value in benchmarking partitioned CSV or single Parquet.
