mcvsubbu commented on code in PR #12045:
URL: https://github.com/apache/pinot/pull/12045#discussion_r1406907002
##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java:
##########
@@ -821,6 +821,19 @@ public void segmentStoppedConsuming(LLCSegmentName
llcSegmentName, String instan
_controllerMetrics.addMeteredTableValue(realtimeTableName,
ControllerMeter.LLC_ZOOKEEPER_UPDATE_FAILURES, 1L);
throw e;
}
+ // We know that we have successfully set the idealstate to be OFFLINE.
+ // We can now do a best effort to reset the externalview to be OFFLINE if
it is in ERROR state.
+ // If the externalview is not in error state, then this reset will be
ignored by the helix participant
+ // in the server when it receives the ERROR to OFFLINE state transition.
+ // Helix throws an exception if we try to reset state of a partition that
is NOT in ERROR state in EV,
+ // So, if any exceptions are thrown, ignore it here.
+ try {
+ _helixAdmin.resetPartition(_helixManager.getClusterName(), instanceName,
+
TableNameBuilder.REALTIME.tableNameWithType(llcSegmentName.getTableName()),
+ Collections.singletonList(segmentName));
+ } catch (Exception e) {
+ // Ignore
Review Comment:
@sajjad-moradi and I discussed this, and we think the ideal solution is to:
- Create specific reason codes (not strings) that we can exchange between
the controller and server when the server posts any event to the controller
(not just to indicate stop consumption)
- Use the reason code to decide whether the reset API should be invoked
(e.g. a reason code that indicates that the consuming segment was never
created, will end up invoking the helix reset API). I will file an issue for
this as a cleanup task.
Another alternative can be to look at the externalview in this same method
before invoking the API. I had considered that alternative, and discarded it
since it involves another Helix API call that may come in with its own errors
and exceptions to be handled. It seemed to me to better to just call the reset
API once and for all.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]