VenuReddy2103 opened a new pull request #3772:
URL: https://github.com/apache/carbondata/pull/3772


    ### Why is this PR needed?
    At present, carbon does not do block/blocklet pruning for polygon filter 
queries. It does row-level filtering at the carbon layer and returns the result. 
With this approach, all carbon files are scanned irrespective of whether there 
are any matching rows in the block. Spark also incurs the overhead of launching 
many jobs and tasks to process them. This degrades the overall performance of 
polygon queries.
    
    ### What changes were proposed in this PR?
   Leverage the existing block pruning mechanism in carbon to skip unwanted 
blocks, which reduces the number of splits. On the executor side, use blocklet 
pruning to reduce the number of blocklets that are read and scanned.
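
The two-level pruning described above can be illustrated with a minimal sketch. It is not the code from this PR. It assumes that each blocklet stores a min/max value for a numeric spatial id column and that a polygon filter has already been converted into a list of inclusive id ranges. The names `Blocklet`, `Block`, `prune_blocks`, and `prune_blocklets` are hypothetical and do not refer to CarbonData classes.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Inclusive [start, end] range of spatial ids covered by the polygon (assumed form).
Range = Tuple[int, int]


@dataclass
class Blocklet:
    name: str
    min_id: int  # smallest spatial id stored in this blocklet
    max_id: int  # largest spatial id stored in this blocklet


@dataclass
class Block:
    name: str
    blocklets: List[Blocklet]

    @property
    def min_id(self) -> int:
        return min(b.min_id for b in self.blocklets)

    @property
    def max_id(self) -> int:
        return max(b.max_id for b in self.blocklets)


def overlaps(lo: int, hi: int, ranges: List[Range]) -> bool:
    """True if [lo, hi] intersects at least one query range."""
    return any(lo <= end and start <= hi for start, end in ranges)


def prune_blocks(blocks: List[Block], ranges: List[Range]) -> List[Block]:
    """Driver-side step: drop blocks that cannot contain a matching row."""
    return [b for b in blocks if overlaps(b.min_id, b.max_id, ranges)]


def prune_blocklets(block: Block, ranges: List[Range]) -> List[Blocklet]:
    """Executor-side step: within a surviving block, keep only candidate blocklets."""
    return [bl for bl in block.blocklets if overlaps(bl.min_id, bl.max_id, ranges)]


# Made-up sample data for illustration only.
blocks = [
    Block("block-0", [Blocklet("0-0", 0, 99), Blocklet("0-1", 100, 199)]),
    Block("block-1", [Blocklet("1-0", 200, 299), Blocklet("1-1", 300, 399)]),
    Block("block-2", [Blocklet("2-0", 400, 499)]),
]
query_ranges = [(250, 260), (410, 420)]

for block in prune_blocks(blocks, query_ranges):
    kept = [bl.name for bl in prune_blocklets(block, query_ranges)]
    print(block.name, kept)
```

In this sketch, a block that overlaps no query range never becomes a split, and row-level filtering only has to run on the blocklets that survive the second step. Min/max pruning can keep blocklets that turn out to hold no matching rows, so the row-level filter is still needed for correctness.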
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - Yes
   
       
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

