gortiz commented on code in PR #12704:
URL: https://github.com/apache/pinot/pull/12704#discussion_r1582735688
##########
pinot-common/src/main/java/org/apache/pinot/common/response/BrokerResponse.java:
##########
@@ -123,6 +127,17 @@ String toJsonString()
*/
long getMinConsumingFreshnessTimeMs();
+ /**
+ * Get the max number of rows seen by a single operator in the query
processing chain.
+ * <p>
+ * In single-stage this value is equal to {@link
#getNumEntriesScannedPostFilter()} because single-stage operators
+ * cannot generate more rows than the number of rows received. This is not
the case in multi-stage operators, where
+ * operators like join or union can emit more rows than the ones received on
each of its children operators.
+ */
+ default long getMaxRowsInOperator() {
Review Comment:
~This is not a new method but a method that existed in
`BrokerResponseNative` and was inherited by `BrokerResponseV2` . Given it is
already there (even it doesn't make sense) and it was used to calculate
`isPartialResponse` I think it is useful to have it in the base interface now
that `BrokerResponseNativeV2` does not inherit `BrokerResponseNative` anymore.~
~The alternative would be to do not make `isPartialResponse` `default` and
include the same code in both `BrokerResponseNative` and
`BrokerResponseNativeV2`. If you think that is better I can change it, but it
looks a bit repetitive. For sure currently how this stats are being declared
doesn't make sense, but we can try to improve that in the future~
I thought this was `isMaxRowsInJoinReached` not `getMaxRowsInOperator`. That
is why I wrote the two paragraphs above. I'm keeping them just to have the
context.
The reason to define `getMaxRowsInOperator` is different. We need this
method here because this is an important stat. This PR adds this value into the
table that shows stats in Pinot UI and it is easier to assume this exists for
both V1 and V2 than dealing with its absence. Also, it makes sense to have a
stat that shows the max number of rows seen in any operator. In V1 that is
limited by the number of elements returned by a filter, but in V2 that is not
the case. I think having this attribute in both V1 and V2 is going to make the
life easier for users who can be trained to just look for this value instead of
reading `numEntriesScannedPostFilter` in V1 and `maxRowsInOperator` in V2
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]