Andrew Wong has posted comments on this change. Change subject: Allow tablet shutdown without completing txs ......................................................................
Patch Set 11: (12 comments) Not yet ready for review. http://gerrit.cloudera.org:8080/#/c/7439/11//COMMIT_MSG Commit Message: Line 21: Testing is done by adding the following: > forgot to ask: does TestDeleteTableWithConcurrentWrites in delete_table-ite Ah failed occasionally on a previous iteration, but since updating it, I haven't looped it. Will do that now. http://gerrit.cloudera.org:8080/#/c/7439/11/src/kudu/tablet/mvcc.h File src/kudu/tablet/mvcc.h: PS11, Line 270: transactions must not durably complete operations > is this precisely true? or is it that transactions are _allowed_ to abort? They _may_ not durably complete operations. http://gerrit.cloudera.org:8080/#/c/7439/11/src/kudu/tablet/tablet.cc File src/kudu/tablet/tablet.cc: PS11, Line 764: message > use ToString so we retain the original code? Done Line 765: LOG(ERROR) << s.ToString(); > add context of the tablet id, maybe the opid which should be available? Done PS11, Line 800: exitted > typo? exited Done PS11, Line 799: if (PREDICT_FALSE(mvcc_.IsShuttingDown())) { : return Status::Aborted("Transaction exitted early because tablet is shutting down"); : } > mind making a short function like Tablet::CheckNotShuttingDown with this co Done Line 852: LOG(ERROR) << "Failed to check if row is present; ending transaction early"; > we should be able to CHECK(IsShuttingDown()) or something here, right? Done Line 853: CHECK(s.IsDiskFailure()); > I think this is a bit too restrictive. For example if there is some local h Done http://gerrit.cloudera.org:8080/#/c/7439/11/src/kudu/tablet/transactions/transaction_driver.cc File src/kudu/tablet/transactions/transaction_driver.cc: Line 347: if (!state()->tablet_replica()->tablet()->mvcc_manager()->IsShuttingDown()) { > I think reaching into MvccManager is a bit strange here. Even though there The difficulty with using the TabletReplica enum is that it's protected by a lock. We need an immediately-available state to ensure that transactions are not committed, since we've lost the durability guarantee. My justification for this sort of desynchronization of tablet_replica/tablet is that the tablet--the physical guts storing all of the tablet meta/data--is what's failing when we enter this state; the replica merely follows suit asynchronously. Line 352: } > can you add FALLTHROUGH_INTENDED here? Done PS11, Line 356: VLOG_WITH_PREFIX(1) << "Transaction " << ToString() << " failed prior to " : "replication success: " << s.ToString(); > this is no longer accurate in all cases Done http://gerrit.cloudera.org:8080/#/c/7439/11/src/kudu/tserver/ts_tablet_manager.cc File src/kudu/tserver/ts_tablet_manager.cc: PS11, Line 640: if (replica->tablet()) { : // If we're deleting the tablet, there's no point in ensuring that applies : // complete. Currently-running Applies will henceforth return early. : replica->tablet()->mvcc_manager()->SetShuttingDown(); : } : > why not put this in TabletReplica::Shutdown() implementation itself so as t Done -- To view, visit http://gerrit.cloudera.org:8080/7439 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I983620f27e7226806a2cca253db7619731914d42 Gerrit-PatchSet: 11 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <mpe...@apache.org> Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-HasComments: Yes