pkuwm commented on a change in pull request #1066:
URL: https://github.com/apache/helix/pull/1066#discussion_r460207219
##########
File path: helix-core/src/main/java/org/apache/helix/HelixProperty.java
##########
@@ -148,7 +154,12 @@ public String toString() {
}
}
- private Stat _stat;
+ protected Stat _stat;
+
+ public HelixProperty() {
+ _record = DEFAULT_ZNRECORD;
+ _stat = DEFAULT_STAT;
+ }
Review comment:
If we don't have this default constructor, in Message's constructor, we
need to call `super(ZNRecord)`. HelixProperty defines `ZNRecord` but not
`SessionAwareZNRecord`. Then if we want Message's SessionAwareZNRecord, we may
override `getRecord()` to convert `ZNRecord` to `SessionAwareZNRecord`:
```
Override
public ZNRecord getRecord() {
return new SessionAwareZNRecord(_record);
}
```
The problem is this creates a new record from the original one `_record`. We
could not access the original one: `message.getRecord().setSimpleField()`.
So we need to create SessionAwareZNRecord in Message's constructor and
override parent's ZNRecord. Without the default constructor, we won't be able
to do `_record = new SessionAwareZNRecord()` in Message.
##########
File path:
helix-core/src/main/java/org/apache/helix/controller/GenericHelixController.java
##########
@@ -757,9 +762,22 @@ private void handleEvent(ClusterEvent event,
BaseControllerDataProvider dataProv
}
event.addAttribute(AttributeName.ControllerDataProvider.name(),
dataProvider);
- logger.info(String.format("START: Invoking %s controller pipeline for
cluster %s event: %s %s",
- manager.getClusterName(), dataProvider.getPipelineName(),
event.getEventType(),
- event.getEventId()));
+ String eventSessionId =
event.getAttribute(AttributeName.EVENT_SESSION.name());
+ if (eventSessionId != null) {
+ String managerSessionId = manager.getSessionId();
+ if (!eventSessionId.equals(managerSessionId)) {
Review comment:
I got your point. To achieve one level if statement, we could move
managerSessionId outside. Then the scope of managerSessionId is bigger.
```
String eventSessionId =
event.getAttribute(AttributeName.EVENT_SESSION.name());
String managerSessionId = manager.getSessionId();
if (eventSessionId != null && !eventSessionId.equals(managerSessionId)) {
...
}
```
But after this if statement we don't need managerSessionId. This what I try
to do, following `Effective Java Item - 45 : Minimize the scope of local
variables`: https://gist.github.com/kpgalligan/4cf93abd88044df71da5
##########
File path:
helix-core/src/main/java/org/apache/helix/controller/stages/MessageDispatchStage.java
##########
@@ -78,7 +78,17 @@ protected void processEvent(ClusterEvent event,
MessageOutput messageOutput) thr
batchMessage(dataAccessor.keyBuilder(), messagesToSend, resourceMap,
liveInstanceMap,
manager.getProperties());
+ String expectedSession =
event.getAttribute(AttributeName.EVENT_SESSION.name());
+ // An early check for expected leader session. If the sessions don't
match, it means the
+ // controller lost leadership, then messages should not be sent and
pipeline should stop.
+ // This avoids potential double masters for a single partition.
+ if (expectedSession != null &&
!expectedSession.equals(manager.getSessionId())) {
+ throw new StageException(
+ "Controller: " + manager.getInstanceName() + " lost leadership!
Expected session: "
+ + expectedSession + ", actual: " + manager.getSessionId());
+ }
Review comment:
What I was thinking about is, we also check this stage in
`ReadClusterDataStage()` because this place is the highly possible one that
session expires. In our issue, the session expiry also happened in
`ReadClusterDataStage`.
If we don't check it at every stage, it just costs extra
resource(CPU/memory/network IO) which we could tolerate. I think checking at
every stage may be excessive because most of stages are pretty
fast(`ReadClusterDataStage` takes longer time) and very tiny possibility that
session changes. As long as we protect sending message, correctness is ensured.
We wanted to make minimum code change.
##########
File path: helix-core/src/main/java/org/apache/helix/model/Message.java
##########
@@ -953,4 +970,35 @@ public boolean isValid() {
}
return true;
}
+
+ // A class represents session aware ZNRecord for message. The message should
be written to zk
+ // by the expected session.
+ // TODO: remove this class once public session-aware ZNRecord is available
+ private static class SessionAwareZNRecord extends ZNRecord implements
SessionAwareZkWriteData {
Review comment:
We still want to make minimum changes in this PR. Putting it private
gives us more flexibility to change in future: may add more stuff to this
class. If in future we find that we need `SessionAwareZNRecord` in other
places, we should create such thing as public. We need to discuss more and have
a more comprehensive design for public `SessionAwareZNRecord`. What do you
think?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]