Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20372 )

Change subject: IMPALA-12156: Support High Availability for Statestore
......................................................................


Patch Set 13:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore-subscriber.cc
File be/src/statestore/statestore-subscriber.cc:

http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore-subscriber.cc@178
PS11, Line 178:           "statestored (protocol version $0), the request is 
ignored.",
> added log messages
Done


http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore-subscriber.cc@326
PS11, Line 326:     return statestore_->Start();
> If FLAGS_tolerate_statestore_startup_delay is enabled, subscriber enter rec
Done


http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore-subscriber.cc@358
PS11, Line 358:     IpAddr ip_address;
> statestore_id is received in registration response. Due to race, subscriber
Ack


http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore-subscriber.cc@410
PS11, Line 410:   } else {
> done
Done


http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore.h@137
PS11, Line 137: /// the subscribers. Active statestored also sends heartbeats 
to standby statestored.
> done
Done


http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore.h@140
PS11, Line 140: /// heartbeats from active statestored.
> Will write warning message on standby statestored in this case.
Ack


http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore.h@782
PS11, Line 782:
> Updated comment to the purpose of this variable.
Ack


http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore.cc
File be/src/statestore/statestore.cc:

http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore.cc@654
PS11, Line 654:   DCHECK(metrics != NULL);
> Thanks to catch this. Least significant 64 bits of statestore_id is not use
Oh, I actually missed the UUIDToTUniqueId above. But it still wasn't clear what 
setting lower bits was used for.


http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore.cc@1248
PS11, Line 1248:       } else if (active_conn_states_[subscriber->id()] == 
TStatestoreConnState::FAILED) {
> added DCHECK
Done


http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore.cc@1250
PS11, Line 1250:             || conn_state == TStatestoreConnState::UNKNOWN);
> added DCHECK
Done


http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore.cc@1251
PS11, Line 1251:         --failed_conn_state_count_;
> added DCHECK
Done


http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore.cc@1573
PS11, Line 1573:   // to all subscribers.
> timeout is used for re-sending HA updates if there are subscribers which fa
But it holds a unique_lock on ha_lock_. Won't that prevent 
MonitorStatestoredHaHeartbeat from doing anything while WaitFor is waiting?


http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore.cc@1654
PS11, Line 1654:           
request.__set_catalogd_version(active_catalogd_version);
> updated as suggested
Done


http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore.cc@1792
PS11, Line 1792:   // Negotiate role for HA with peer statestore instance on 
startup.
> Both designate themselves as active, and start to send heartbeats to each o
Done


http://gerrit.cloudera.org:8080/#/c/20372/11/be/src/statestore/statestore.cc@1963
PS11, Line 1963:       // Check if standby statestored is reachable.
> Don't have good plan to handle split-brain in network. Updated comments. An
You added a warning below. I don't have any better ideas at the moment.


http://gerrit.cloudera.org:8080/#/c/20372/11/common/thrift/StatestoreService.thrift
File common/thrift/StatestoreService.thrift:

http://gerrit.cloudera.org:8080/#/c/20372/11/common/thrift/StatestoreService.thrift@499
PS11, Line 499:   // statestore should set itself as inactive if 
dst_statestore_active is set as true.
> done
Done


http://gerrit.cloudera.org:8080/#/c/20372/11/common/thrift/StatestoreService.thrift@500
PS11, Line 500:   // Otherwise, source statestore sets itself as active.
> done
Done


http://gerrit.cloudera.org:8080/#/c/20372/11/tests/custom_cluster/test_statestored_ha.py
File tests/custom_cluster/test_statestored_ha.py:

http://gerrit.cloudera.org:8080/#/c/20372/11/tests/custom_cluster/test_statestored_ha.py@42
PS11, Line 42:
> done
Done


http://gerrit.cloudera.org:8080/#/c/20372/11/tests/custom_cluster/test_statestored_ha.py@48
PS11, Line 48:     request = Subscriber.TSetStatestoreDebugActionRequest(
> removed
Done



--
To view, visit http://gerrit.cloudera.org:8080/20372
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd2c814bbad5c04c1d50c2edaa5b910c82a6fd87
Gerrit-Change-Number: 20372
Gerrit-PatchSet: 13
Gerrit-Owner: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Andrew Sherman <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Wed, 27 Sep 2023 18:59:19 +0000
Gerrit-HasComments: Yes

Reply via email to