Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17037 )

Change subject: KUDU-2612 tablet servers automatically register txn participants
......................................................................


Patch Set 6:

(12 comments)

Still need to digest the life cycle of the dispatcher and its callbacks, but 
leaving some questions so far.

http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/client/client-test.cc@7694
PS6, Line 7694: }
Maybe count the rows to verify the state?


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.h
File src/kudu/tablet/tablet_replica.h:

http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.h@155
PS6, Line 155: txn_participant_registrant
nit: doc this?


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.h@405
PS6, Line 405:     //
nit: spurious line


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.h@456
PS6, Line 456: ready_, ops_queue_.
nit: if so, would be nice to at least mention how users should reason about 
concurrency control for the other members


http://gerrit.cloudera.org:8080/#/c/17037/5/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

http://gerrit.cloudera.org:8080/#/c/17037/5/src/kudu/tablet/tablet_replica.cc@1262
PS5, Line 1262:       const bool has_request_id = op->has_request_id();
> Yep, that's done for all pending write operations in the queue.  Here the q
If we've failed, and updated Init() such that failure calls the client 
callback, we would have already responded to the client and wouldn't need to 
add it to the queue.

In any case, I think your new approach is good and less intrusive. I was at 
first worried that the response could be garbage after the std::move() but 
seems it's only freed after the callback is called.


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc@552
PS6, Line 552: dispatcher = FindOrNull(txn_op_dispatchers_, txn_id);
             :     if (!dispatcher) {
             :       dispatcher = &EmplaceOrDie(
             :           &txn_op_dispatchers_,
             :           std::piecewise_construct,
             :           std::forward_as_tuple(txn_id),
             :           std::forward_as_tuple(this, 
FLAGS_tablet_max_pending_txn_write_ops));
             :     }
Does LookupOrEmplace work here?


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc@567
PS6, Line 567: SubmitTxnWriteCb
nit: seems we typically name callbacks based on what we're calling back from, 
rather than what the callback does. Maybe call this RegisteredParticipantCb() 
or somesuch?


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc@590
PS6, Line 590:     txn_op_dispatchers_.erase(it);
Do we have any guarantees that we've submitted all the pending writes? Or at 
least, called all the callbacks?


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc@642
PS6, Line 642: txn_registrator
nit: dispatcher? Same elsewhere


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc@1227
PS6, Line 1227: transaction is not open"
Don't we also unregister if we cancel due to not being the leader? Eg in 
SubmitTxnParticipantBeginOp() we may fail to submit the participant op?


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc@1258
PS6, Line 1258: // Store the information necessary to recreate WriteOp 
instance: this is
              :       // useful if TabletReplica::SubmitWrite() returns non-OK 
status.
nit: it'd be nice if you could mention the lifecycle of the request and 
response pointers. At first glance, it seems like these values could all be 
bogus upon the std::move() call, but I don't think that's the case.


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc@1304
PS6, Line 1304: Cleanup
What's the point of returning a status here? Doesn't this always return 
'status'?



--
To view, visit http://gerrit.cloudera.org:8080/17037
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia383f7afd208c44695c57aab82e3818fa1712ce6
Gerrit-Change-Number: 17037
Gerrit-PatchSet: 6
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 17 Feb 2021 21:12:01 +0000
Gerrit-HasComments: Yes

Reply via email to