marchpure opened a new pull request #3620: [CARBONDATA-3700] Optimize prune performance when prunning with multi…
URL: https://github.com/apache/carbondata/pull/3620

…-threads

### Why is this PR needed?
When pruning with multiple threads, a bug severely hampers pruning performance. When pruning yields no blocklets that match the query filter, the getExtendblocklet function is still called to fetch the extended blocklet metadata. Given an empty blocklet list, this function is expected to return an empty extended blocklet list directly, but a bug currently causes a meaningless HashSet add operation, which is pure overhead. In addition, when pruning with multiple threads, getExtendblocklet is called once per blocklet; this should be avoided by calling it once per segment.

### What changes were proposed in this PR?
1) If the input to getExtendblocklet is an empty blocklet list, return an empty extended blocklet list directly.
2) Call getExtendblocklet once per segment instead of once per blocklet.

### Does this PR introduce any user interface change?
No.

### Is any new testcase added?
Yes.
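The two proposed changes can be illustrated with a minimal, single-threaded Java sketch. It is not the CarbonData implementation: `PruneSketch`, `ExtendedBlocklet`, `getExtendedBlocklets`, `extendPerSegment`, and the `metadataLookups` counter are hypothetical names introduced only for illustration, and the `HashSet` merely stands in for the per-call work that the PR describes as overhead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from the CarbonData code base.
final class PruneSketch {

    // Stand-in for a blocklet enriched with extra metadata.
    static final class ExtendedBlocklet {
        final String segmentId;
        final String blockletId;

        ExtendedBlocklet(String segmentId, String blockletId) {
            this.segmentId = segmentId;
            this.blockletId = blockletId;
        }
    }

    // Counts how often the (assumed expensive) metadata path is actually taken.
    static int metadataLookups = 0;

    // Change 1: an empty input returns an empty list before any other work is done.
    static List<ExtendedBlocklet> getExtendedBlocklets(String segmentId, List<String> blockletIds) {
        if (blockletIds.isEmpty()) {
            return Collections.emptyList();
        }
        metadataLookups++;
        // Illustrative per-call work; the real function's internals are not shown in the PR text.
        Set<String> seen = new HashSet<>();
        List<ExtendedBlocklet> result = new ArrayList<>(blockletIds.size());
        for (String id : blockletIds) {
            if (seen.add(id)) {
                result.add(new ExtendedBlocklet(segmentId, id));
            }
        }
        return result;
    }

    // Change 2: one call per segment with all of its pruned blocklets,
    // rather than one call per blocklet.
    static Map<String, List<ExtendedBlocklet>> extendPerSegment(Map<String, List<String>> prunedBySegment) {
        Map<String, List<ExtendedBlocklet>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : prunedBySegment.entrySet()) {
            out.put(entry.getKey(), getExtendedBlocklets(entry.getKey(), entry.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> pruned = new LinkedHashMap<>();
        pruned.put("seg0", Arrays.asList("b0", "b1", "b2"));
        pruned.put("seg1", Collections.<String>emptyList()); // the filter matched nothing here

        Map<String, List<ExtendedBlocklet>> extended = extendPerSegment(pruned);
        System.out.println("seg0 blocklets: " + extended.get("seg0").size());
        System.out.println("seg1 blocklets: " + extended.get("seg1").size());
        System.out.println("metadata lookups: " + metadataLookups);
    }
}
```

In this sketch the segment whose pruning result is empty never reaches the metadata path, and a segment with several blocklets triggers a single call. A multi-threaded version would need a thread-safe counter and result map, which are omitted here.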
