alirezazamani opened a new issue #1143:
URL: https://github.com/apache/helix/issues/1143
There is a flag in the cache called
_existsLiveInstanceOrCurrentStateChange, which is mainly used by the Task
Framework (TF) logic to check whether a task's target partition has moved to
a new instance. However, there is a possible race condition here. Since
CurrentState and Message live in two different folders and are updated
separately, if a cache refresh happens in between the two updates, we might
lose the notification that the target partition has moved. (This theory has
been confirmed with a test, and it may be a reason for some of the existing
flaky tests for targeted jobs.) Also, if the CurrentState has changed and we
still have a pending message for the partition, we do not make any decision
for this partition/task. To resolve this issue, we might want to consider
message changes for this flag as well. The code could look something like
this:
```
private void refreshClusterStateChangeFlags(
    Set<HelixConstants.ChangeType> propertyRefreshed) {
  _existsLiveInstanceOrCurrentStateChange =
      _propertyDataChangedMap.get(HelixConstants.ChangeType.CURRENT_STATE).getAndSet(false)
          || _propertyDataChangedMap.get(HelixConstants.ChangeType.MESSAGE).getAndSet(false)
          || propertyRefreshed.contains(HelixConstants.ChangeType.CURRENT_STATE)
          || propertyRefreshed.contains(HelixConstants.ChangeType.LIVE_INSTANCE);
}
```
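
One detail to keep in mind with the snippet above: Java's || short-circuits,
so when the CURRENT_STATE flag is already true, getAndSet(false) is never
called on the MESSAGE flag, and that flag stays set until a later refresh.
Whether that matters depends on the intended semantics. If both flags should
be consumed on every refresh, a minimal standalone sketch could look like the
following. The class ChangeFlagSketch, its nested ChangeType enum, and
notifyDataChange are hypothetical stand-ins written for illustration; they
are not the actual Helix classes.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, simplified model of the cache's change flags (not Helix code).
class ChangeFlagSketch {
  // Stand-in for HelixConstants.ChangeType; only the values used here.
  enum ChangeType { CURRENT_STATE, MESSAGE, LIVE_INSTANCE }

  private final Map<ChangeType, AtomicBoolean> propertyDataChangedMap =
      new EnumMap<>(ChangeType.class);
  private boolean existsLiveInstanceOrCurrentStateChange;

  ChangeFlagSketch() {
    for (ChangeType type : ChangeType.values()) {
      propertyDataChangedMap.put(type, new AtomicBoolean(false));
    }
  }

  // Assumed callback-side hook: marks a property type as changed.
  void notifyDataChange(ChangeType type) {
    propertyDataChangedMap.get(type).set(true);
  }

  boolean refreshClusterStateChangeFlags(Set<ChangeType> propertyRefreshed) {
    // Read and clear both flags up front, so short-circuit evaluation of ||
    // cannot skip clearing the MESSAGE flag.
    boolean currentStateChanged =
        propertyDataChangedMap.get(ChangeType.CURRENT_STATE).getAndSet(false);
    boolean messageChanged =
        propertyDataChangedMap.get(ChangeType.MESSAGE).getAndSet(false);

    existsLiveInstanceOrCurrentStateChange = currentStateChanged
        || messageChanged
        || propertyRefreshed.contains(ChangeType.CURRENT_STATE)
        || propertyRefreshed.contains(ChangeType.LIVE_INSTANCE);
    return existsLiveInstanceOrCurrentStateChange;
  }

  public static void main(String[] args) {
    ChangeFlagSketch cache = new ChangeFlagSketch();
    cache.notifyDataChange(ChangeType.CURRENT_STATE);
    cache.notifyDataChange(ChangeType.MESSAGE);

    Set<ChangeType> nothingRefreshed = EnumSet.noneOf(ChangeType.class);
    // First refresh sees the changes and clears both flags.
    System.out.println(cache.refreshClusterStateChangeFlags(nothingRefreshed));
    // Second refresh has nothing new to report.
    System.out.println(cache.refreshClusterStateChangeFlags(nothingRefreshed));
  }
}
```

With the short-circuiting version, the second refresh in this scenario would
still report a change, because the MESSAGE flag would not have been cleared
by the first one.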