Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16648 )

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
......................................................................


Patch Set 9:

(7 comments)

IWYU isn't happy yet:

>>> Fixing #includes in 
>>> '/home/jenkins-slave/workspace/kudu-master/2/src/kudu/transactions/txn_status_manager.cc'
@@ -32,6 +32,7 @@
 #include "kudu/common/wire_protocol.h"
 #include "kudu/consensus/metadata.pb.h"
 #include "kudu/consensus/raft_consensus.h"
+#include "kudu/gutil/casts.h"
 #include "kudu/gutil/macros.h"
 #include "kudu/gutil/map-util.h"
 #include "kudu/gutil/port.h"
IWYU would have edited 1 files on your behalf.

http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tablet/tablet_replica.cc@304
PS9, Line 304: WaitUntilTxnCoordinatorTasksFinished();
How do we protect against the race when right after this 
'txn_coordinator_task_counter_' becomes non-zero again and tablet_->Shutdown() 
is called when TxnStatusManager tries to access the tablet?

Should we just nullify txn_coordinator_ in 
WaitUntilTxnCoordinatorTasksFinished() or set another variable that controls 
the return status of ShouldRunTxnCoordinatorStalenessTask() along with 'state_'?


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tablet/tablet_replica.cc@471
PS9, Line 471:   if (state_ == STOPPING || state_ == STOPPED) {
             :     return false;
             :   }
             :   return true;
This looks a bit strange: it returns 'true' even if 'state_' is RUNNING?


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tablet/tablet_replica.cc@997
PS9, Line 997: state_
What about SHUTDOWN state?


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tablet/tablet_replica.cc@1009
PS9, Line 1009: txn_coordinator_task_counter_
nit: is it expected that this counter goes negative?  If not, maybe add 
CHECK()/DCHECK() to enforce the invariant here?


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/transactions/txn_status_manager.h
File src/kudu/transactions/txn_status_manager.h:

http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/transactions/txn_status_manager.h@101
PS9, Line 101: txn_coordinator
nit: it would be great to add a comment about the expectations about the type 
of this parameter.


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tserver/ts_tablet_manager.cc@1439
PS9, Line 1439: !r->CheckRunning().ok()
Does it mean that the TxnStalenessTrackerTask() will cease to run at this 
tablet server once one of its txn status tablets is shutdown (e.g., moved by 
other server by the rebalancer tool)?  I'm not sure that's the behavior we want.


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tserver/ts_tablet_manager.cc@1457
PS9, Line 1457: continue
Maybe, we should jump not just to the next element 'replicas' container, but 
onto next iteration of the outer 'while (true)' loop?



--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 9
Gerrit-Owner: Hao Hao <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Hao Hao <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 16 Jan 2021 06:10:15 +0000
Gerrit-HasComments: Yes

Reply via email to