Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23204 )

Change subject: IMPALA-14234: Fix a version mismatch DCHECK hit when admissiond 
cluster membership recovering
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/23204/2/be/src/scheduling/cluster-membership-mgr.cc
File be/src/scheduling/cluster-membership-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/23204/2/be/src/scheduling/cluster-membership-mgr.cc@297
PS2, Line 297: // The recovering membership is not exposed to clients unless 
the post-recovery
             :       // grace pe
> I was more concerned about if we could try to dereference the recovering_me
Thanks for raising this. From what I see, recovering_membership_ is a 
shared_ptr and is only accessed before it’s reset, and new_state holds a shared 
reference, I don't see the underlying Snapshot object would be somehow 
destroyed. Also, this logic has been around for a while and we haven’t observed 
any crashes in this area. Let me know if I’m missing something, are you 
concerned that another thread might modify it unexpectedly?
The flow here looks like:
if (recovering_membership_.get() != nullptr) {
  new_state = recovering_membership_;
}
...
if (ss_is_recovering) {
  recovering_membership_ = new_state;
  return;
}
...
SetState(new_state); // Set to current
...
recovering_membership_.reset();


http://gerrit.cloudera.org:8080/#/c/23204/2/be/src/scheduling/cluster-membership-mgr.cc@300
PS2, Line 300:       // to current membership and cleared.
             :       new_state = recovering_membership_;
             :       if (current_membership_ != nullptr
             : 
> I think we should not remove the DCHECK. The ComputeGroupScheduleStates() c
The version from recovering_membership_ starts at 0, and this seems to be the 
case even when not using the standalone admissiond. I’ll need to take a closer 
look at where it’s coming from, it might just be that the version wasn’t copied 
over?



--
To view, visit http://gerrit.cloudera.org:8080/23204
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iea77347bd4775abd7866817146e326c7c5042f5e
Gerrit-Change-Number: 23204
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu <wydbaggio...@gmail.com>
Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Yida Wu <wydbaggio...@gmail.com>
Gerrit-Comment-Date: Wed, 30 Jul 2025 21:09:26 +0000
Gerrit-HasComments: Yes

Reply via email to