alamb opened a new pull request, #7210: URL: https://github.com/apache/arrow-datafusion/pull/7210
## Which issue does this PR close?

Part of https://github.com/apache/arrow-datafusion/issues/7052

## Rationale for this change

1. This is not a standard benchmark.
2. The benchmark code currently checked into the repo is a single low-cardinality grouping query, a scenario that is already well covered by both TPC-H and ClickBench.
3. I also think the benchmark has atrophied: for example, https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page now only supplies Parquet, while the benchmark still has references to CSV data.

FWIW the query is:

```sql
SELECT passenger_count, MIN(fare_amount), MAX(fare_amount), SUM(fare_amount)
FROM tripdata
GROUP BY passenger_count;
```

## What changes are included in this PR?

Remove the NY taxi benchmark code and its entry in the readme.

## Are these changes tested?

## Are there any user-facing changes?
