jtao15 commented on code in PR #8645:
URL: https://github.com/apache/pinot/pull/8645#discussion_r866372387
##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/retention/RetentionManager.java:
##########
@@ -237,6 +237,21 @@ private void manageSegmentLineageCleanupForTable(TableConfig tableConfig) {
// entry and its segments
Set<String> destinationSegments = new HashSet<>(lineageEntry.getSegmentsTo());
destinationSegments.retainAll(segmentsForTable);
+
+ // Retain zombie segments for the case of rerunning the replace protocol with overlapping segment names.
+ // For the following example, we should not try to delete s1 and s2, but remove entry1 directly.
+ // entry1: { segmentsFrom: [], segmentsTo: [s1, s2], status: REVERTED}
+ // entry2: { segmentsFrom: [], segmentsTo: [s1, s2], status: IN_PROGRESS/COMPLETED}
+ if (lineageEntry.getState() == LineageEntryState.REVERTED) {
Review Comment:
Currently, startReplaceSegment() deletes the segments asynchronously.
Deleting the reverted entry immediately would require making that call blocking
(wait until the deletion is done, write the lineage to ZooKeeper, then return
the response).
Moreover, the reverted entry can also be useful for future improvements. If
we add the lineageEntryId as a parameter for segment upload (I'm planning to
work on this), the client side can short-circuit when the lineage entry is
reverted. This helps when a run shuts down ungracefully and the flow is rerun:
with the reverted entry still in place, we can avoid the potential race
condition of segments from the two runs overwriting each other.
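
To make the short-circuit idea concrete, here is a minimal, self-contained sketch. It is not Pinot code: `LineageUploadGuard`, `Entry`, `put`, and `shouldAcceptUpload` are hypothetical names, the enum is a local stand-in for `LineageEntryState`, and rejecting unknown entry IDs is an assumption of the sketch rather than a proposed behavior.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these types exist in Pinot under these names.
final class LineageUploadGuard {
  // Local stand-in for the lineage entry states mentioned in the diff.
  enum LineageEntryState { IN_PROGRESS, COMPLETED, REVERTED }

  // Simplified lineage entry: only the fields needed for the upload check.
  static final class Entry {
    final List<String> segmentsTo;
    final LineageEntryState state;

    Entry(List<String> segmentsTo, LineageEntryState state) {
      this.segmentsTo = List.copyOf(segmentsTo);
      this.state = state;
    }
  }

  // In-memory stand-in for the lineage stored in ZooKeeper.
  private final Map<String, Entry> entries = new HashMap<>();

  void put(String lineageEntryId, Entry entry) {
    entries.put(lineageEntryId, entry);
  }

  // Decides whether an upload tagged with lineageEntryId may proceed.
  boolean shouldAcceptUpload(String lineageEntryId, String segmentName) {
    Entry entry = entries.get(lineageEntryId);
    if (entry == null) {
      // Assumption of this sketch: an unknown entry ID is rejected.
      return false;
    }
    if (entry.state == LineageEntryState.REVERTED) {
      // Short circuit: a stale run must not overwrite segments of a newer run.
      return false;
    }
    return entry.segmentsTo.contains(segmentName);
  }

  public static void main(String[] args) {
    LineageUploadGuard guard = new LineageUploadGuard();
    // Same shape as the example in the diff: two entries with overlapping segment names.
    guard.put("entry1", new Entry(List.of("s1", "s2"), LineageEntryState.REVERTED));
    guard.put("entry2", new Entry(List.of("s1", "s2"), LineageEntryState.IN_PROGRESS));

    System.out.println(guard.shouldAcceptUpload("entry1", "s1")); // false: entry1 is reverted
    System.out.println(guard.shouldAcceptUpload("entry2", "s1")); // true: entry2 is in progress
  }
}
```

Because entry1 is kept around in the REVERTED state, an upload from the first (stale) run can be told apart from an upload of the same segment name from the rerun, which is the distinction the check relies on.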
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]