[GitHub] [pinot] kishoreg commented on issue #7642: numEntriesScannedInFilter > numDocsScanned in certain scenarios

GitBox Wed, 27 Oct 2021 07:44:23 -0700


kishoreg commented on issue #7642:
URL: https://github.com/apache/pinot/issues/7642#issuecomment-953001970



   If you use the new range index added by @richardstartin,
numEntriesScannedInFilter will come down to 0. Apologies for repeating this,
but don't try to correlate numEntriesScannedInFilter and numDocsScanned. Think
of query processing as having two parts:
   
   1. Filter (applies all the predicates to find the matching rows)
   2. Post filter (aggregation/group-by, or just select + order by). In this
phase, all the matching rows are scanned.
   
   numEntriesScannedInFilter is a proxy for the work done in the filter phase.
Its value can be anywhere between:
   - totalDocs * (number of columns in the filter clause): the worst case,
which happens when there is no pruning and no indexes
   - 0: the best case, when the right indexes with the right configuration
are used
   
   numDocsScanned is a proxy for the work done in the post-filter phase. Its
value is the number of rows that match the filter condition. However, this can
be lowered by using a star-tree index.
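
To make the two counters concrete, here is a minimal toy model in Python. It is not Pinot code: `run_query`, `indexed_columns`, and the sample rows are hypothetical names made up for illustration. It assumes a column with no index costs one scanned entry per row, and a column with a suitable index costs none.

```python
# Toy model of the two query phases; this is NOT how Pinot is implemented.
ROWS = [
    {"country": "US", "age": 25},
    {"country": "US", "age": 35},
    {"country": "IN", "age": 40},
    {"country": "UK", "age": 31},
    {"country": "US", "age": 50},
    {"country": "IN", "age": 20},
]

# One predicate per column in the filter clause.
PREDICATES = {
    "country": lambda v: v == "US",
    "age": lambda v: v > 30,
}


def run_query(rows, predicates, indexed_columns=frozenset()):
    """Return (entries_scanned_in_filter, docs_scanned) for the toy model."""
    entries_scanned_in_filter = 0
    matching_rows = []

    # Phase 1: filter. Assumption: an unindexed column costs one scanned
    # entry per row, while an indexed column is resolved without scanning.
    for row in rows:
        matches = True
        for column, predicate in predicates.items():
            if column not in indexed_columns:
                entries_scanned_in_filter += 1
            if not predicate(row[column]):
                matches = False
        if matches:
            matching_rows.append(row)

    # Phase 2: post filter. Every matching row is scanned once.
    docs_scanned = len(matching_rows)
    return entries_scanned_in_filter, docs_scanned


# Worst case: no indexes, so 6 rows * 2 filter columns are scanned.
worst = run_query(ROWS, PREDICATES)

# Best case: both filter columns are indexed, so nothing is scanned in the
# filter phase, yet the number of matching rows is unchanged.
best = run_query(ROWS, PREDICATES, indexed_columns={"country", "age"})

print(worst, best)
```

In this sketch the filter counter drops from 12 to 0 once both columns are indexed while the post-filter counter stays at 2, which is why the two numbers should not be compared with each other.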
   
   Hope that helps.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



