jasperjiaguo commented on PR #9420: URL: https://github.com/apache/pinot/pull/9420#issuecomment-1254495063
> The underlying assumption for this optimization is that a filter on a high-cardinality column usually has higher selectivity. This assumption holds only when the filter is not a range filter. Should we consider re-ordering the scan iterators during the actual scan, since at that point we know the selectivity of each filter (based on how many docs were skipped)?

@Jackie-Jiang Good point. I updated the code to take care of this. It now re-orders only (not)eq and (not)in predicates and prioritizes them. The ranking approximates org.apache.pinot.controller.recommender.rules.utils.QueryInvertedSortedIndexRecommender#percentSelected; basically, it assumes an even distribution of values. Please bear with me while I come up with a valid test case for this code later.
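
For readers following the thread, the ranking idea described above can be sketched roughly as follows. This is an illustration only, not the code in the PR: the class `FilterOrderingSketch`, the `Filter` type, and the method names are hypothetical, and the formulas are the simple even-distribution estimates (1/cardinality per value) that the comment alludes to. Range filters are assumed to keep their original relative order after the ranked ones.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; none of these names are Pinot classes.
public class FilterOrderingSketch {
    enum PredicateType { EQ, NOT_EQ, IN, NOT_IN, RANGE }

    static final class Filter {
        final String column;
        final PredicateType type;
        final int numValues;    // number of literal values in the predicate
        final int cardinality;  // number of distinct values in the column

        Filter(String column, PredicateType type, int numValues, int cardinality) {
            this.column = column;
            this.type = type;
            this.numValues = numValues;
            this.cardinality = cardinality;
        }
    }

    // Estimated fraction of docs that match, assuming every value is equally frequent.
    static double estimatedFractionSelected(Filter f) {
        if (f.cardinality <= 0) {
            return 1.0; // no statistics: assume the filter matches everything
        }
        double matched = Math.min(f.numValues, f.cardinality) / (double) f.cardinality;
        switch (f.type) {
            case EQ:
            case IN:
                return matched;
            case NOT_EQ:
            case NOT_IN:
                return 1.0 - matched;
            default:
                return 1.0; // RANGE: cardinality says nothing about selectivity
        }
    }

    // Ranks (not)eq and (not)in filters by estimated fraction selected (smallest first),
    // then appends all other filters in their original order.
    static List<Filter> reorder(List<Filter> filters) {
        List<Filter> ranked = new ArrayList<>();
        List<Filter> others = new ArrayList<>();
        for (Filter f : filters) {
            if (f.type == PredicateType.RANGE) {
                others.add(f);
            } else {
                ranked.add(f);
            }
        }
        // List.sort is stable, so ties keep their original relative order.
        ranked.sort(Comparator.comparingDouble(FilterOrderingSketch::estimatedFractionSelected));
        ranked.addAll(others);
        return ranked;
    }
}
```

Under these assumptions, an eq filter on a column with 1000 distinct values (estimate 0.001) would be evaluated before an in filter with 3 values on a column with 100 distinct values (estimate 0.03), and a range filter would come last regardless of its column's cardinality.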
