VenuReddy2103 opened a new pull request #3772:
URL: https://github.com/apache/carbondata/pull/3772
### Why is this PR needed?
At present, carbon doesn't do block/blocklet pruning for polygon filter
queries. It does row-level filtering at the carbon layer and returns the result.
With this approach, all the carbon files are scanned irrespective of whether
there are any matching rows in the block. It also incurs spark overhead to
launch many jobs and tasks to process them. This affects the overall
performance of polygon queries.
### What changes were proposed in this PR?
Leverage the existing block pruning mechanism in carbon to skip the
unwanted blocks, thus reducing the number of splits. On the executor side,
use blocklet pruning to reduce the number of blocklets to be read and
scanned.
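
The pruning idea can be illustrated with a minimal sketch. This is not the
code in this PR. It assumes each block records the min/max of a spatial index
value and that the polygon has already been reduced to a list of index ranges.
`Block`, `overlaps`, `prune` and the sample values are hypothetical names used
only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    # Hypothetical block metadata: min/max of the spatial index value.
    name: str
    min_id: int
    max_id: int


def overlaps(block: Block, ranges: List[Tuple[int, int]]) -> bool:
    # A block may contain matching rows only if its [min_id, max_id]
    # interval intersects at least one inclusive polygon range.
    return any(lo <= block.max_id and block.min_id <= hi for lo, hi in ranges)


def prune(blocks: List[Block], ranges: List[Tuple[int, int]]) -> List[Block]:
    # Keep only the blocks that can contain matching rows.
    return [b for b in blocks if overlaps(b, ranges)]


# Assumed sample data, not taken from the PR.
blocks = [
    Block("part-0", 0, 99),
    Block("part-1", 100, 199),
    Block("part-2", 200, 299),
]
polygon_ranges = [(120, 130), (180, 210)]

kept = prune(blocks, polygon_ranges)
print([b.name for b in kept])
```

The same interval check could be repeated per blocklet on the executor side,
so that only blocklets whose min/max intersect a polygon range are read.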
### Does this PR introduce any user interface change?
- No
### Is any new testcase added?
- Yes