gengliangwang opened a new pull request #24783: [WIP][SPARK-27105][SQL] 
Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24783
 
 
   This is still WIP, I will come up with benchmark result.
   
   ## What changes were proposed in this pull request?
   
   In https://github.com/apache/spark/pull/24068,  @IvanVergiliev reports that 
`OrcFilters.createBuilder` has exponential complexity in the height of the 
filter tree due to the way the check-and-build pattern is implemented. 
   This is because the same method `createBuilder` is called twice recursively 
for any children under `And`/`Or`/`Not` nodes, so that inside the first call, 
the second call is called as well(See description in 
https://github.com/apache/spark/pull/24068 for details).
   
   Comparing to the  approach in https://github.com/apache/spark/pull/24068, I 
propose a very simple solution for the issue.  We can rely on the result of 
`convertibleFilters`, which can build a fully convertible tree. With it, we 
don't need to concern about the children of a certain node is not convertible 
in method `createBuilder`.
   ## How was this patch tested?
   
   Unit test
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to