IvanVergiliev commented on a change in pull request #24068: [SPARK-27105][SQL]
Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r289731562
##########
File path:
sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/FilterPushdownBenchmark.scala
##########
@@ -135,6 +139,34 @@ object FilterPushdownBenchmark extends BenchmarkBase with
SQLHelper {
benchmark.run()
}
+ def filterPushDownBenchmarkWithColumn(
Review comment:
I think we should definitely have some automated benchmark for this.
Otherwise there's nothing in the codebase exercising the behaviour being
changed, and so nothing to prevent future regressions.
I didn't get initially that the proposal is to remove both new benchmarks
here. I'm fine with removing the one that only does the predicate conversion
(`"Predicate conversion benchmark with unbalanced Column"`). However, I'm
strongly opposed to removing both as it would then be too easy to re-introduce
the perf issue. This is a perf area that's already had to be fixed / worked
around before (see the PR introducing the `buildTree` function to avoid skewed
trees, for example) so I would feel uncomfortable leaving it without automated
coverage.
If there's a more appropriate place to put the remaining benchmark (for
example, `OrcV1FilterSuite`), I'm completely open to moving it. However, I'm
under the impression from other code reviews that the regular test suites in
Spark are not meant to test performance, and this would go against that.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]