kishoreg opened a new pull request #3525: Adding execution stats for 
numSegmentsQueried/Processed/Matched
URL: https://github.com/apache/incubator-pinot/pull/3525
 
 
   This PR lets us measure how effective our pruning strategy is. The flow is 
as follows:
   1. Total number of segments (NumSegments)
   2. Broker side Pruning (NumSegmentsPrunedByBroker)
   3. Broker queries all servers after pruning (NumSegmentsQueried)
   4. Each server applies further pruning (NumSegmentsPrunedByServer)
   5. Each server processes all segments left after pruning (NumSegmentsProcessed)
   6. Some of the segments processed have no matching rows 
(NumSegmentsWithNoMatch)
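
   The flow above can be read as a funnel. The following is a minimal sketch, 
not code from this PR: the SegmentFunnel class, its field names, and the sample 
numbers are hypothetical, and it assumes each stage's count is simply the 
previous count minus the segments pruned at that stage.

```python
from dataclasses import dataclass


@dataclass
class SegmentFunnel:
    """Hypothetical container for the counters named in the flow above."""

    num_segments: int                    # step 1: total segments
    num_segments_pruned_by_broker: int   # step 2: removed by the broker
    num_segments_pruned_by_server: int   # step 4: removed by the servers
    num_segments_with_no_match: int      # step 6: processed, but no matching rows

    @property
    def num_segments_queried(self) -> int:
        # Step 3: segments the broker sends to the servers.
        return self.num_segments - self.num_segments_pruned_by_broker

    @property
    def num_segments_processed(self) -> int:
        # Step 5: segments the servers actually run the query on.
        return self.num_segments_queried - self.num_segments_pruned_by_server

    @property
    def num_segments_matched(self) -> int:
        # Segments that contributed at least one row to the result.
        return self.num_segments_processed - self.num_segments_with_no_match


# Made-up numbers, for illustration only.
funnel = SegmentFunnel(
    num_segments=1000,
    num_segments_pruned_by_broker=600,
    num_segments_pruned_by_server=264,
    num_segments_with_no_match=128,
)
print(funnel.num_segments_queried,
      funnel.num_segments_processed,
      funnel.num_segments_matched)
```

   With these made-up inputs, 400 segments are queried, 136 are processed, and 
only 8 contribute rows.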
    
   If the pruning (broker + server) is effective, NumSegmentsWithNoMatch should 
be close to zero. Today, we don't have any metrics on numSegmentsProcessed. 
This PR allows us to identify use cases where pruning is ineffective, so that 
we can add additional pruners, e.g. BloomFilter.
   
   In the perf benchmarks, we have seen that the number of segments can impact 
latency and max throughput.
   
   I could not find any test case for the BrokerNativeResponse class that 
validates the metadata fields in a response. I will add one and update the PR.
   
   Sample response after this change (note numSegmentsProcessed and 
numSegmentsWithNoMatch):
   {
       "aggregationResults": [
       {
           "function": "count_star",
           "value": "9"
       }],
       "exceptions": [],
       "numServersQueried": 1,
       "numServersResponded": 1,
       "numDocsScanned": 9,
       "numEntriesScannedInFilter": 0,
       "numEntriesScannedPostFilter": 0,
       "totalDocs": 66942316,
       "numGroupsLimitReached": false,
       "timeUsedMs": 9,
       "segmentStatistics": [],
       "traceInfo": {},
       "numSegmentsProcessed": 136,
       "numSegmentsWithNoMatch": 128
   }
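
   As a rough illustration of how a client might use the two highlighted 
fields, the sketch below computes the fraction of processed segments that 
matched nothing. The no_match_fraction helper and the 0.5 threshold are 
assumptions made for this example; they are not part of Pinot or of this PR.

```python
import json

# Only the two highlighted fields from the sample response; the rest is omitted.
response_json = '{"numSegmentsProcessed": 136, "numSegmentsWithNoMatch": 128}'


def no_match_fraction(response: dict) -> float:
    """Fraction of processed segments that produced no matching rows."""
    processed = response["numSegmentsProcessed"]
    if processed == 0:
        # Nothing was processed, so nothing was wasted.
        return 0.0
    return response["numSegmentsWithNoMatch"] / processed


# Arbitrary cutoff chosen for this sketch, not a recommended value.
HYPOTHETICAL_THRESHOLD = 0.5

response = json.loads(response_json)
fraction = no_match_fraction(response)
pruning_looks_ineffective = fraction > HYPOTHETICAL_THRESHOLD
print(f"{fraction:.2f}", pruning_looks_ineffective)  # prints: 0.94 True
```

   For the sample response, 128 of the 136 processed segments matched no rows, 
which is the kind of query where an additional pruner could help.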
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
