Jackie-Jiang commented on a change in pull request #6483:
URL: https://github.com/apache/incubator-pinot/pull/6483#discussion_r563974667
##########
File path:
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java
##########
@@ -793,6 +793,27 @@ void updateInstanceStatesForNewConsumingSegment(Map<String, Map<String, String>>
       LOGGER.info("Updating segment: {} to ONLINE state", committingSegmentName);
     }
+    // There used to be a race condition in pinot (caused by heavy GC on the controller during segment commit)
+    // that ended up creating multiple consuming segments for the same stream partition, named somewhat like
+    // tableName__1__25__20210920T190005Z and tableName__1__25__20210920T190007Z. It was fixed by checking the
+    // Zookeeper Stat object before updating the segment metadata.
+    // These conditions can happen again due to manual operations considered as fixes in Issues #5559 and #5263.
+    // The following check prevents the table from going into such a state (but does not prevent the root cause
+    // of attempting such a zk update).
+    if (instanceStatesMap != null) {
Review comment:
`instanceStatesMap` won't be `null` here
##########
File path:
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java
##########
@@ -793,6 +793,27 @@ void updateInstanceStatesForNewConsumingSegment(Map<String, Map<String, String>>
       LOGGER.info("Updating segment: {} to ONLINE state", committingSegmentName);
     }
+    // There used to be a race condition in pinot (caused by heavy GC on the controller during segment commit)
Review comment:
Is this caused by the controller running into GC while leadership changed, with 2 controllers then trying to create the segment with the same seq num? If so, this change won't really work because the `instanceStatesMap` won't be up-to-date.
##########
File path:
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java
##########
@@ -793,6 +793,27 @@ void updateInstanceStatesForNewConsumingSegment(Map<String, Map<String, String>>
       LOGGER.info("Updating segment: {} to ONLINE state", committingSegmentName);
     }
+    // There used to be a race condition in pinot (caused by heavy GC on the controller during segment commit)
Review comment:
I think it can only happen in the following order:
- The current leader controller starts committing a segment
- The current leader controller runs into GC and stops responding; leadership changes to another controller
- The new leader controller commits the segment
- The previous leader controller wakes up and commits the segment again

If this is the order, the fix won't work because the `instanceStatesMap` is not up-to-date if it was already calculated before the controller ran into GC. The right fix might be to check the leadership again right before actually committing the segment.
@jackjlli What was the fix?
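To illustrate why the Stat-version check mentioned in the code comment rejects the stale leader's write, here is a minimal self-contained sketch. The `VersionedZnode` class is hypothetical (not a Pinot or ZooKeeper API); it only simulates ZooKeeper's conditional `setData(path, data, expectedVersion)` semantics, where a write with a stale expected version fails with BadVersion:

```java
// Hypothetical in-memory stand-in for a ZooKeeper znode, used to show the
// compare-and-set behavior of a versioned update. Not Pinot code.
public class StaleCommitDemo {

  static class VersionedZnode {
    private String data;
    private int version = 0;

    synchronized int readVersion() {
      return version;
    }

    // Succeeds only if the caller's expected version matches the current one,
    // mirroring ZooKeeper's setData(path, data, expectedVersion).
    synchronized boolean setData(String newData, int expectedVersion) {
      if (expectedVersion != version) {
        return false; // BadVersion: another writer committed in between
      }
      data = newData;
      version++;
      return true;
    }

    synchronized String getData() {
      return data;
    }
  }

  public static void main(String[] args) {
    VersionedZnode segmentMetadata = new VersionedZnode();

    // Old leader reads the Stat version, then stalls in a long GC pause.
    int staleVersion = segmentMetadata.readVersion();

    // Leadership moves; the new leader commits the segment metadata.
    boolean newLeaderCommit =
        segmentMetadata.setData("committed-by-new-leader", segmentMetadata.readVersion());

    // Old leader wakes up and retries with its stale version: rejected.
    boolean oldLeaderCommit = segmentMetadata.setData("duplicate-commit", staleVersion);

    System.out.println(newLeaderCommit + " " + oldLeaderCommit);
  }
}
```

This only shows why the duplicate metadata write is rejected; as noted above, it does not help if `instanceStatesMap` itself was computed before the GC pause, which is why re-checking leadership before the commit may be the more complete fix.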
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]