gengliangwang opened a new pull request #24783: [WIP][SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion URL: https://github.com/apache/spark/pull/24783 This is still WIP, I will come up with benchmark result. ## What changes were proposed in this pull request? In https://github.com/apache/spark/pull/24068, @IvanVergiliev reports that `OrcFilters.createBuilder` has exponential complexity in the height of the filter tree due to the way the check-and-build pattern is implemented. This is because the same method `createBuilder` is called twice recursively for any children under `And`/`Or`/`Not` nodes, so that inside the first call, the second call is called as well(See description in https://github.com/apache/spark/pull/24068 for details). Comparing to the approach in https://github.com/apache/spark/pull/24068, I propose a very simple solution for the issue. We can rely on the result of `convertibleFilters`, which can build a fully convertible tree. With it, we don't need to concern about the children of a certain node is not convertible in method `createBuilder`. ## How was this patch tested? Unit test
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
