VenuReddy2103 opened a new pull request #3616: Polygon expression processing 
using unknown expression and filtering performance improvement
URL: https://github.com/apache/carbondata/pull/3616
 
 
   
    ### Why is this PR needed?
   This PR improves the query processing performance of the in_polygon UDF.
    
    ### What changes were proposed in this PR?
    At present, PolygonExpression processing leverages the existing 
InExpression. PolygonExpression internally creates an InExpression as its 
child. The InExpression is built from the result of the quadtree algorithm, 
which returns a sorted list of ranges (each range having a min and max Id).
    The InExpression consists of 2 children. One child is a 
ColumnExpression (for the geohash column) and the other is a ListExpression 
(with a list of LiteralExpressions, one LiteralExpression for each Id returned 
from the algorithm).
   **Problems associated with this approach:**
   - We expand the list of ranges (each range having a min and max) into all 
individual Ids and create a LiteralExpression for each Id. Since there can be 
many ranges, and each range can be large, this consumes a huge amount of 
memory during processing.
   - For the same reason, it slows down the filter execution.
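
To give a rough sense of the expansion cost described above, here is a minimal sketch that counts how many individual Ids (and hence LiteralExpressions) a list of ranges would expand into. The class name `ExpansionCostSketch`, the method `expandedIdCount`, and the `long[]{minId, maxId}` representation of a range are assumptions made for illustration and are not taken from the CarbonData code.

```java
import java.util.List;

// Hypothetical helper, not part of CarbonData: estimates how many literals
// the InExpression approach would need for a given list of ranges.
final class ExpansionCostSketch {

    // Each range is assumed to be {minId, maxId}, both ends inclusive.
    static long expandedIdCount(List<long[]> ranges) {
        long total = 0;
        for (long[] range : ranges) {
            // An inclusive range [min, max] expands to (max - min + 1) Ids.
            total += range[1] - range[0] + 1;
        }
        return total;
    }
}
```

With this representation, a single range such as `{0, 999999}` would already expand to one million Ids, while the range list itself holds only one entry.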
   
   Modifications with this PR:
   Instead, we can use UnknownExpression with RowLevelFilterResolverImpl and 
RowLevelFilterExecuterImpl processing, and override the evaluate() method to do 
a binary search on the list of ranges directly. This will significantly improve 
the polygon filter query performance.
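
The range lookup described above could be sketched roughly as follows. This is not the code from the PR: the class name `RangeFilterSketch`, the method `contains`, and the `long[]{minId, maxId}` range representation are hypothetical, and the sketch assumes the ranges are inclusive, sorted by min Id, and non-overlapping.

```java
import java.util.List;

// Hypothetical illustration of a per-row membership check against sorted ranges.
final class RangeFilterSketch {

    // Returns true if id falls inside any range. Each range is {minId, maxId},
    // inclusive; the list is assumed sorted ascending and non-overlapping.
    static boolean contains(List<long[]> ranges, long id) {
        int low = 0;
        int high = ranges.size() - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;   // overflow-safe midpoint
            long[] range = ranges.get(mid);
            if (id < range[0]) {
                high = mid - 1;             // id lies before this range
            } else if (id > range[1]) {
                low = mid + 1;              // id lies after this range
            } else {
                return true;                // minId <= id <= maxId
            }
        }
        return false;
    }
}
```

Under these assumptions, each row check costs a number of comparisons logarithmic in the number of ranges, and no per-Id literals need to be materialized.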
       
    ### Does this PR introduce any user interface change?
    - Yes. Need to update the design document.
   
    ### Is any new testcase added?
    - Yes. Added an end to end test case
   
       
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
