lidavidm opened a new pull request #9638: URL: https://github.com/apache/arrow/pull/9638
This adds a microbenchmark for SimplifyWithGuarantee. It is used to evaluate partition expressions against the filter, so it can account for a significant share of the time spent reading a dataset, especially a large one.

Two different filters are tested: one is fully simplified, and one has had casts inserted (which will happen if you Bind() against a schema with different types).

Two different partition expressions are tested: one is fully simplified, and one compares against dictionary-encoded values (which will happen by default if you infer the schema for a Hive-partitioned dataset, for example).

All four filter/partition-expression combinations are additionally tested both when the filter matches the partition expression and when it does not.
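For readers unfamiliar with the operation being benchmarked, here is a rough illustration of what simplifying a filter against a partition expression means. This is a toy model in plain Python, not Arrow's implementation or API: the tuple-based filter representation, the `simplify_with_guarantee` helper, and the example partitions are all assumptions made up for this sketch. It only handles a conjunction of comparisons against a partition whose fields have known values, and it ignores casts and dictionary encoding entirely.

```python
import operator

# Comparison operators supported by this toy model.
OPS = {"==": operator.eq, "!=": operator.ne, "<": operator.lt,
       "<=": operator.le, ">": operator.gt, ">=": operator.ge}


def simplify_with_guarantee(filter_conjuncts, guarantee):
    """Simplify an AND of (field, op, value) terms given known field values.

    `guarantee` maps a field name to the value every row in a partition has.
    Returns False if no row can match, otherwise the list of terms that
    still have to be checked against the data (an empty list means every
    row in the partition matches).
    """
    remaining = []
    for field, op, value in filter_conjuncts:
        if field in guarantee:
            if not OPS[op](guarantee[field], value):
                return False  # contradicted by the partition: skip it entirely
            # Otherwise the term is always true in this partition; drop it.
        else:
            remaining.append((field, op, value))
    return remaining


# Hypothetical Hive-style partitions, e.g. year=2020/month=3/.
partitions = [{"year": 2020, "month": m} for m in range(1, 13)]
query = [("year", "==", 2020), ("month", ">=", 11), ("value", ">", 0.5)]

# The simplification runs once per partition, so its cost grows with the
# number of partitions in the dataset.
to_scan = []
for part in partitions:
    residual = simplify_with_guarantee(query, part)
    if residual is not False:
        to_scan.append((part["month"], residual))
```

In this sketch, ten of the twelve partitions are rejected without reading any data, and the two that remain only need the `value` term checked. The matching and non-matching benchmark cases correspond loosely to these two outcomes.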
