minyyy commented on PR #36121:
URL: https://github.com/apache/spark/pull/36121#issuecomment-1095480110
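I ran the TPC-DS benchmark twice locally, on master and with this change (per rule, the columns are effective / total time and effective / total runs):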
~/Work/spark (expr_set ✔) cat /tmp/master_result | grep Filter
org.apache.spark.sql.catalyst.optimizer.PruneFilters
20420542 / 499545296 4 / 1740
org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints
383832301 / 386115187 271 / 312
org.apache.spark.sql.execution.dynamicpruning.CleanupDynamicPruningFilters
0 / 41344027 0 / 312
org.apache.spark.sql.catalyst.optimizer.EliminateAggregateFilter
0 / 12488623 0 / 1428
org.apache.spark.sql.catalyst.optimizer.CombineFilters
0 / 11010749 0 / 1428
org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithFilter
0 / 2485281 0 / 352
org.apache.spark.sql.catalyst.optimizer.InjectRuntimeFilter
0 / 1394478 0 / 312
org.apache.spark.sql.catalyst.optimizer.CombineTypedFilters
0 / 1097207 0 / 312
org.apache.spark.sql.catalyst.optimizer.InferFiltersFromGenerate
0 / 1063153 0 / 312
~/Work/spark (expr_set ✔) cat /tmp/exprset_result | grep Filter
org.apache.spark.sql.catalyst.optimizer.PruneFilters
12251268 / 475647800 4 / 1740
org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints
320604026 / 322550195 271 / 312
org.apache.spark.sql.execution.dynamicpruning.CleanupDynamicPruningFilters
0 / 40531409 0 / 312
org.apache.spark.sql.catalyst.optimizer.EliminateAggregateFilter
0 / 12426889 0 / 1428
org.apache.spark.sql.catalyst.optimizer.CombineFilters
0 / 10278258 0 / 1428
org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithFilter
0 / 2384373 0 / 352
org.apache.spark.sql.catalyst.optimizer.InjectRuntimeFilter
0 / 1285621 0 / 312
org.apache.spark.sql.catalyst.optimizer.CombineTypedFilters
0 / 1065992 0 / 312
org.apache.spark.sql.catalyst.optimizer.InferFiltersFromGenerate
0 / 1025776 0 / 312
~/Work/spark (master ✔) cat /tmp/master_result2 | grep Filter
org.apache.spark.sql.catalyst.optimizer.PruneFilters
12819415 / 505890443 4 / 1740
org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints
390196686 / 392300726 271 / 312
org.apache.spark.sql.execution.dynamicpruning.CleanupDynamicPruningFilters
0 / 40328844 0 / 312
org.apache.spark.sql.catalyst.optimizer.EliminateAggregateFilter
0 / 12271679 0 / 1428
org.apache.spark.sql.catalyst.optimizer.CombineFilters
0 / 10468968 0 / 1428
org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithFilter
0 / 2274441 0 / 352
org.apache.spark.sql.catalyst.optimizer.InjectRuntimeFilter
0 / 1453612 0 / 312
org.apache.spark.sql.catalyst.optimizer.CombineTypedFilters
0 / 1008569 0 / 312
org.apache.spark.sql.catalyst.optimizer.InferFiltersFromGenerate
0 / 970068 0 / 312
~/Work/spark (master ✔) cat /tmp/exprset_result2 | grep Filter
org.apache.spark.sql.catalyst.optimizer.PruneFilters
12891206 / 468780255 4 / 1740
org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints
319539076 / 321154860 271 / 312
org.apache.spark.sql.execution.dynamicpruning.CleanupDynamicPruningFilters
0 / 47870664 0 / 312
org.apache.spark.sql.catalyst.optimizer.EliminateAggregateFilter
0 / 13480991 0 / 1428
org.apache.spark.sql.catalyst.optimizer.CombineFilters
0 / 10508080 0 / 1428
org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithFilter
0 / 2389351 0 / 352
org.apache.spark.sql.catalyst.optimizer.InjectRuntimeFilter
0 / 1501485 0 / 312
org.apache.spark.sql.catalyst.optimizer.CombineTypedFilters
0 / 1328054 0 / 312
org.apache.spark.sql.catalyst.optimizer.InferFiltersFromGenerate
0 / 973707 0 / 312
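The major improvement is in InferFiltersFromConstraints, which on master makes
many unnecessary accesses to `.canonicalized`. Its effective time drops from
about 384M to 321M in the first run and from about 390M to 320M in the second,
roughly 16-18% less.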
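
To make the comparison easier to reproduce, here is a minimal Python sketch that parses the grep output above and reports the relative change in effective time per rule. It assumes each rule name is followed by one line of the form `effective / total  effective_runs / total_runs`, and that the times are in nanoseconds; the helper names (`parse_rule_timings`, `percent_change`) are made up for this illustration and are not part of Spark. The sample data is copied from the first pair of runs.

```python
import re

# Two rules copied from the first master / expr_set runs above.
MASTER = """\
org.apache.spark.sql.catalyst.optimizer.PruneFilters
20420542 / 499545296 4 / 1740
org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints
383832301 / 386115187 271 / 312
"""

EXPRSET = """\
org.apache.spark.sql.catalyst.optimizer.PruneFilters
12251268 / 475647800 4 / 1740
org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints
320604026 / 322550195 271 / 312
"""

# Assumed layout of a metrics line: "<eff> / <total> <eff runs> / <total runs>".
METRICS = re.compile(r"(\d+) / (\d+)\s+(\d+) / (\d+)")


def parse_rule_timings(text):
    """Map short rule name -> dict of the four numbers on the following line."""
    timings = {}
    rule = None
    for raw in text.splitlines():
        line = raw.strip()
        match = METRICS.fullmatch(line)
        if match and rule is not None:
            eff, total, eff_runs, total_runs = map(int, match.groups())
            timings[rule] = {
                "effective_time": eff,   # assumed to be nanoseconds
                "total_time": total,     # assumed to be nanoseconds
                "effective_runs": eff_runs,
                "total_runs": total_runs,
            }
            rule = None
        elif line.startswith("org.apache.spark."):
            # Keep only the class name, e.g. "InferFiltersFromConstraints".
            rule = line.rsplit(".", 1)[-1]
    return timings


def percent_change(before, after):
    """Relative change from before to after, in percent (negative = faster)."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    master = parse_rule_timings(MASTER)
    exprset = parse_rule_timings(EXPRSET)
    for name in master:
        change = percent_change(
            master[name]["effective_time"], exprset[name]["effective_time"]
        )
        print(f"{name}: {change:+.1f}% effective time")
```

Since timings from a single local run are noisy (PruneFilters differs a lot between the two master runs), it is probably safer to compare the rules with large, stable totals such as InferFiltersFromConstraints.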