xyuanlu commented on a change in pull request #1362:
URL: https://github.com/apache/helix/pull/1362#discussion_r488322440
##########
File path:
helix-core/src/main/java/org/apache/helix/messaging/handling/HelixTaskExecutor.java
##########
@@ -919,9 +912,25 @@ public void onMessage(String instanceName, List<Message>
messages,
// discard the message. Controller will resend if this is a valid
message
throw new HelixException(String.format(
"Another state transition for %s:%s is in progress with msg:
%s, p2p: %s, read: %d, current:%d. Discarding %s->%s message",
- message.getResourceName(), message.getPartitionName(),
msg.getMsgId(), String.valueOf(msg.isRelayMessage()),
- msg.getReadTimeStamp(), System.currentTimeMillis(),
message.getFromState(),
- message.getToState()));
+ message.getResourceName(), message.getPartitionName(),
msg.getMsgId(),
+ String.valueOf(msg.isRelayMessage()), msg.getReadTimeStamp(),
+ System.currentTimeMillis(), message.getFromState(),
message.getToState()));
+ }
+ if (createHandler instanceof HelixStateTransitionHandler) {
+ // We only check to state if there is no ST task
scheduled/executing.
+ Exception err = ((HelixStateTransitionHandler)
createHandler).validateStaleMessage(true /*inSchedulerCheck*/);
+ if (err != null) {
+ throw err;
Review comment:
We do continue when catch the exception (line 957). We could change the
message if I think if the wording is not accurate in the catch block.
##########
File path:
helix-core/src/main/java/org/apache/helix/messaging/handling/HelixTaskExecutor.java
##########
@@ -919,9 +912,25 @@ public void onMessage(String instanceName, List<Message>
messages,
// discard the message. Controller will resend if this is a valid
message
throw new HelixException(String.format(
"Another state transition for %s:%s is in progress with msg:
%s, p2p: %s, read: %d, current:%d. Discarding %s->%s message",
- message.getResourceName(), message.getPartitionName(),
msg.getMsgId(), String.valueOf(msg.isRelayMessage()),
- msg.getReadTimeStamp(), System.currentTimeMillis(),
message.getFromState(),
- message.getToState()));
+ message.getResourceName(), message.getPartitionName(),
msg.getMsgId(),
+ String.valueOf(msg.isRelayMessage()), msg.getReadTimeStamp(),
+ System.currentTimeMillis(), message.getFromState(),
message.getToState()));
+ }
+ if (createHandler instanceof HelixStateTransitionHandler) {
+ // We only check to state if there is no ST task
scheduled/executing.
+ Exception err = ((HelixStateTransitionHandler)
createHandler).validateStaleMessage(true /*inSchedulerCheck*/);
+ if (err != null) {
+ throw err;
+ }
+ }
+ if (stateTransitionHandlers.containsKey(messageTarget)) {
Review comment:
Please correct me if I am wrong. I think if we have an old O->S and a
new S->M with current state is S. We do want to ignore the O->S and continue
with S->M.
##########
File path:
helix-core/src/main/java/org/apache/helix/messaging/handling/HelixStateTransitionHandler.java
##########
@@ -463,6 +442,35 @@ public void onError(Exception e, ErrorCode code, ErrorType
type) {
}
+ // Verify the fromState and current state of the stateModel.
+ public Exception validateStaleMessage(boolean inSchedulerCheck) {
Review comment:
TFTR. Will update.
##########
File path:
helix-core/src/main/java/org/apache/helix/messaging/handling/HelixTaskExecutor.java
##########
@@ -919,9 +912,25 @@ public void onMessage(String instanceName, List<Message>
messages,
// discard the message. Controller will resend if this is a valid
message
throw new HelixException(String.format(
"Another state transition for %s:%s is in progress with msg:
%s, p2p: %s, read: %d, current:%d. Discarding %s->%s message",
- message.getResourceName(), message.getPartitionName(),
msg.getMsgId(), String.valueOf(msg.isRelayMessage()),
- msg.getReadTimeStamp(), System.currentTimeMillis(),
message.getFromState(),
- message.getToState()));
+ message.getResourceName(), message.getPartitionName(),
msg.getMsgId(),
+ String.valueOf(msg.isRelayMessage()), msg.getReadTimeStamp(),
+ System.currentTimeMillis(), message.getFromState(),
message.getToState()));
+ }
+ if (createHandler instanceof HelixStateTransitionHandler) {
+ // We only check to state if there is no ST task
scheduled/executing.
+ Exception err = ((HelixStateTransitionHandler)
createHandler).validateStaleMessage(true /*inSchedulerCheck*/);
Review comment:
The handling for these two cases are different.
For the first case when toState == currentState, we just treat the task as
succeeded and finish task. There is no difference between doing the no-op when
executing and bypass the task at scheduling phase.
In the second case when currentState != fromState, executor will set the
task as failed and report state transition error. Ir will also update ZK.
Having same logic causes a test fail. Controller keeps sending many
'INIT->RUNNING' message when currentState is TASK_ERROR if we don't report task
error. Currently task error is reported in HelixStateTransitionHandler.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]