9aman commented on code in PR #14908:
URL: https://github.com/apache/pinot/pull/14908#discussion_r1929478227
##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java:
##########
@@ -1974,6 +1975,22 @@ private Set<String> findConsumingSegments(IdealState idealState) {
}
}
});
+ // For pauseless tables, a segment marked ONLINE in the ideal state may not have been committed yet.
Review Comment:
Have addressed this below.
##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java:
##########
@@ -1974,6 +1975,22 @@ private Set<String> findConsumingSegments(IdealState idealState) {
}
}
});
+ // For pauseless tables, a segment marked ONLINE in the ideal state may not have been committed yet.
+ // We rely on SegmentZkMetadata to determine whether a segment has been committed (status is DONE)
+ // instead of relying solely on the ideal state.
+ // A segment in COMMITTING state is treated as consuming for pauseStatus.
+ String tableNameWithType = idealState.getResourceName();
+ if (PauselessConsumptionUtils.isPauselessEnabled(getTableConfig(tableNameWithType))) {
+ Map<Integer, SegmentZKMetadata> metadataMap = getLatestSegmentZKMetadataMap(tableNameWithType);
Review Comment:
I started with an assumption that the user has issued a `pause request`
before the status is checked.
I am not sure whether this assumption is correct.
No new consuming segments are created for the table once the `force
commit` message has been sent as part of the pause table call.
This ensures that the last segment is either:
1. Marked `CONSUMING` in the `IS`, because the commit protocol has not started.
2. Marked `ONLINE` in the `IS` but with status `COMMITTING` in the ZK metadata,
because the commit protocol has not finished.
I did not want to add a function `findCommittingSegments` here for two
reasons:
1. Correctness: an older `COMMITTING` segment that **failed** to complete the
commit protocol would also be returned in the pause status, and I feel that
gives a misleading indication of the pause status. The user is only concerned
with the latest segment's commit status. Previously this status was derived
from the IS; for pauseless tables it is derived from ZK metadata.
2. Performance: Fetching ZK metadata for all the segments is an expensive
operation.
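To make the intended check concrete, here is a minimal, self-contained sketch of the rule described above. Everything in it is a hypothetical stand-in: `PauseStatusSketch`, `LatestSegment`, the two enums, and the segment names are illustrative and are not the classes used in this PR. It assumes the caller has already picked only the latest segment per partition, so older `COMMITTING` segments are never inspected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; none of these types are Pinot's real classes.
class PauseStatusSketch {
  // Hypothetical stand-in for the segment state recorded in the ideal state (IS).
  enum IdealStateState { CONSUMING, ONLINE, OFFLINE }

  // Hypothetical stand-in for the status stored in the segment ZK metadata.
  enum ZkStatus { IN_PROGRESS, COMMITTING, DONE }

  // Latest segment of one partition, with both views of its state.
  static final class LatestSegment {
    final String name;
    final IdealStateState idealState;
    final ZkStatus zkStatus;

    LatestSegment(String name, IdealStateState idealState, ZkStatus zkStatus) {
      this.name = name;
      this.idealState = idealState;
      this.zkStatus = zkStatus;
    }
  }

  // A latest segment counts as "still consuming" for pause status when:
  //   1. it is CONSUMING in the IS (commit protocol not started), or
  //   2. it is ONLINE in the IS but COMMITTING in ZK metadata (commit not finished).
  static boolean isStillConsuming(LatestSegment segment) {
    if (segment.idealState == IdealStateState.CONSUMING) {
      return true;
    }
    return segment.idealState == IdealStateState.ONLINE
        && segment.zkStatus == ZkStatus.COMMITTING;
  }

  // Only the latest segment per partition is examined, so the cost is one
  // metadata entry per partition rather than one per segment in the table.
  static List<String> findConsumingSegments(Map<Integer, LatestSegment> latestByPartition) {
    List<String> consuming = new ArrayList<>();
    for (LatestSegment segment : latestByPartition.values()) {
      if (isStillConsuming(segment)) {
        consuming.add(segment.name);
      }
    }
    return consuming;
  }

  public static void main(String[] args) {
    Map<Integer, LatestSegment> latest = new TreeMap<>();
    latest.put(0, new LatestSegment("seg__0__5", IdealStateState.CONSUMING, ZkStatus.IN_PROGRESS));
    latest.put(1, new LatestSegment("seg__1__7", IdealStateState.ONLINE, ZkStatus.COMMITTING));
    latest.put(2, new LatestSegment("seg__2__3", IdealStateState.ONLINE, ZkStatus.DONE));
    System.out.println(findConsumingSegments(latest)); // prints [seg__0__5, seg__1__7]
  }
}
```

With this shape, the table reports as fully paused only when the returned list is empty, i.e. every partition's latest segment is `ONLINE` with status `DONE`.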
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]