Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16648 )

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
......................................................................


Patch Set 9:

(7 comments)

> Patch Set 9:
>
> (7 comments)
>
> IWYU isn't happy yet:
>
> >>> Fixing #includes in 
> >>> '/home/jenkins-slave/workspace/kudu-master/2/src/kudu/transactions/txn_status_manager.cc'
> @@ -32,6 +32,7 @@
>  #include "kudu/common/wire_protocol.h"
>  #include "kudu/consensus/metadata.pb.h"
>  #include "kudu/consensus/raft_consensus.h"
> +#include "kudu/gutil/casts.h"
>  #include "kudu/gutil/macros.h"
>  #include "kudu/gutil/map-util.h"
>  #include "kudu/gutil/port.h"
> IWYU would have edited 1 files on your behalf.

Thanks a lot for looking into the failure!

http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tablet/tablet_replica.cc@304
PS9, Line 304: WaitUntilTxnCoordinatorTasksFinished();
> How do we protect against the race when right after this 'txn_coordinator_t
Are you asking after L304 which we waited 'txn_coordinator_task_counter_' to 
become zero. and then it again becomes non-zero and tablet is shutdown state. 
what happens if TxnStatusManager tries to access the tablet? In that case,  the 
task will not be enabled/ran when the tablet is shutting down. So that 
ShouldRunTxnCoordinatorStalenessTask() will return false and no new tasks will 
run.


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tablet/tablet_replica.cc@471
PS9, Line 471:   if (state_ == STOPPING || state_ == STOPPED) {
             :     return false;
             :   }
             :   return true;
> This looks a bit strange: it returns 'true' even if 'state_' is RUNNING?
Ah, good catch. Sorry it is the opposite meaning as what the method name 
implies here. Adjust the caller as well.


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tablet/tablet_replica.cc@997
PS9, Line 997: state_
> What about SHUTDOWN state?
SHUTDOWN is only set after the tablet is fully stopped. Which the raft 
consensus has been stopped and the call back should no longer be triggered.


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tablet/tablet_replica.cc@1009
PS9, Line 1009: txn_coordinator_task_counter_
> nit: is it expected that this counter goes negative?  If not, maybe add CHE
Done


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/transactions/txn_status_manager.h
File src/kudu/transactions/txn_status_manager.h:

http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/transactions/txn_status_manager.h@101
PS9, Line 101: txn_coordinator
> nit: it would be great to add a comment about the expectations about the ty
Done


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tserver/ts_tablet_manager.cc@1439
PS9, Line 1439: !r->CheckRunning().ok()
> Does it mean that the TxnStalenessTrackerTask() will cease to run at this t
Hmm, yeah, I guess we wouldn't want to stop the TxnStalenessTrackerTask() if 
some txn status tablets is shutdown. We may want to resume on checking the 
other tablets.


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tserver/ts_tablet_manager.cc@1457
PS9, Line 1457: continue
> Maybe, we should jump not just to the next element 'replicas' container, bu
I am not sure why we want to do that? Why don't we check for the rest tablets?



--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 9
Gerrit-Owner: Hao Hao <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Hao Hao <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 19 Jan 2021 01:15:04 +0000
Gerrit-HasComments: Yes

Reply via email to