Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12462 )
Change subject: KUDU-2690: don't roll log schema on failed alter ...................................................................... Patch Set 4: (27 comments) http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/integration-tests/alter_table-test.cc File src/kudu/integration-tests/alter_table-test.cc: http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/integration-tests/alter_table-test.cc@2151 PS2, Line 2151: > I think you can omit this? Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/integration-tests/alter_table-test.cc@2167 PS2, Line 2167: > Can you convert the CHECK_OKs in here to ASSERT_OK? Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/integration-tests/alter_table-test.cc@2175 PS2, Line 2175: > unique_ptr for new code Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/integration-tests/alter_table-test.cc@2217 PS2, Line 2217: : : : > Maybe combine this? Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/integration-tests/alter_table-test.cc@2237 PS2, Line 2237: > Nit: indentation Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/integration-tests/alter_table-test.cc@2254 PS2, Line 2254: : : : > Not sure why we need to involve the master at all. Seems like we could simp Mentioned in private slack; you're right that this test seems overly complicated for the change it's making. The goal of this test was less about testing how the behavior could surface given the care we take in the master wrt table locking and such. Parts of it seem inorganic, but such behavior has been seen in the wild, and this test tries to emulate it while being as organic as possible (even if the way we do so is inorganic). I'll add a more targeted test. This is added as a gist in the commit message. http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/integration-tests/alter_table-test.cc@2263 PS2, Line 2263: > It'll unlock automatically when it goes out of scope. Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/integration-tests/alter_table-test.cc@2270 PS2, Line 2270: > using There is client::sp::share_ptr usage in this file. http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/integration-tests/alter_table-test.cc@2271 PS2, Line 2271: > Can use master-> here Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/integration-tests/alter_table-test.cc@2276 PS2, Line 2276: > Nit: we usually use 'ts' variables to name tservers; could you rename this? Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/integration-tests/alter_table-test.cc@2282 PS2, Line 2282: > I think the test would be net simpler if you avoided sending RPCs altogethe Same as above, this test is targeted as an end-to-end repro for KUDU-2690. http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/integration-tests/alter_table-test.cc@2295 PS2, Line 2295: > Where's the chomping? Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/integration-tests/alter_table-test.cc@2305 PS2, Line 2305: > Why does this need to be wrapped in ASSERT_EVENTUALLY? I can take my answer Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/integration-tests/alter_table-test.cc@2323 PS2, Line 2323: > Probably not necessary; the test will pass if the ASSERT_EQ is satisfied. Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/tablet/tablet_bootstrap.cc File src/kudu/tablet/tablet_bootstrap.cc: http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/tablet/tablet_bootstrap.cc@1450 PS2, Line 1450: // 1. There is no 'result' in the commit message. The alter succeeds, and the : // log updates its schema. > This doesn't match up with the code just below, which allows for there to b Yeah, I was being overly conservative wrt what we can expect on disk and just allowing replay in the bad case (which _should_ result in an error anyway). http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/tablet/tablet_bootstrap.cc@1476 PS2, Line 1476: RETURN_NOT_OK(tablet_->CreatePreparedAlterSchema(&tx_state, &schema)); : : // Apply the alter schema to the tablet > Could you pull this info and the info on L1450-1451 into one block comment Done http://gerrit.cloudera.org:8080/#/c/12462/3/src/kudu/tablet/transactions/alter_schema_transaction.h File src/kudu/tablet/transactions/alter_schema_transaction.h: http://gerrit.cloudera.org:8080/#/c/12462/3/src/kudu/tablet/transactions/alter_schema_transaction.h@124 PS3, Line 124: AlterSchemaTransaction(std::unique_ptr<AlterSchemaTransactionState> state, > warning: function 'kudu::tablet::AlterSchemaTransaction::AlterSchemaTransac Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/tablet/transactions/alter_schema_transaction.h File src/kudu/tablet/transactions/alter_schema_transaction.h: http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/tablet/transactions/alter_schema_transaction.h@97 PS2, Line 97: boost::optional<OperationResultPB> error() const { : return error_; : } : : std::string ToString() const override; : : private: : DISALLOW_COPY_AND_ASSIG > Why not combine these two and return an optional<OperationResultPB>? Distin Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/tablet/transactions/alter_schema_transaction.cc File src/kudu/tablet/transactions/alter_schema_transaction.cc: http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/tablet/transactions/alter_schema_transaction.cc@50 PS2, Line 50: > warning: using decl 'AlterSchemaRequestPB' is unused [misc-unused-using-dec Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/tablet/transactions/alter_schema_transaction.cc@51 PS2, Line 51: void AlterSchemaTransactionState::AcquireSchemaLock(rw_semaphore* l) { > warning: using decl 'AlterSchemaResponsePB' is unused [misc-unused-using-de Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/tablet/transactions/alter_schema_transaction.cc@67 PS2, Line 67: } : > This is a non-atomic "set" operation. Doesn't seem like that matters (error I don't think so because transactions are generally controlled by a driver (or are otherwise driven in a single thread), and so we would only call Apply() from one thread, and hence, call SetError() from one thread. http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/tablet/transactions/alter_schema_transaction.cc@84 PS2, Line 84: void AlterSchemaTransaction::NewReplicateMsg(gscoped_ptr<ReplicateMsg>* replicate_msg) { > Want to include error_ here? Or does that complicate locking? Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/tablet/transactions/alter_schema_transaction.cc@145 PS2, Line 145: void AlterSchemaTransaction::Finish(TransactionResult result) { > Can we std::move() the error into the op? Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/tablet/transactions/alter_schema_transaction.cc@153 PS2, Line 153: // Normally, we would release it in tablet.cc after applying the operation, : // but currently we need to wait until after the COMMIT message is logged : // to release this lock as a workaround for KUDU-915. See the TODO > Do we need to do this if there was an error? Nothing has changed, so I'm inclined to say no. That said, I'm not sure if it's worth including anyway, eg in case the consensus state has changed? Though I'd imagine in that case, whatever changed the consensus state would mark the tablet as dirty. WDYT? http://gerrit.cloudera.org:8080/#/c/12462/3/src/kudu/tablet/transactions/alter_schema_transaction.cc File src/kudu/tablet/transactions/alter_schema_transaction.cc: http://gerrit.cloudera.org:8080/#/c/12462/3/src/kudu/tablet/transactions/alter_schema_transaction.cc@49 PS3, Line 49: using tserver::TabletServerErrorPB; > warning: using decl 'AlterSchemaRequestPB' is unused [misc-unused-using-dec Done http://gerrit.cloudera.org:8080/#/c/12462/3/src/kudu/tablet/transactions/alter_schema_transaction.cc@50 PS3, Line 50: > warning: using decl 'AlterSchemaResponsePB' is unused [misc-unused-using-de Done http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/tablet/transactions/transaction_driver.cc File src/kudu/tablet/transactions/transaction_driver.cc: http://gerrit.cloudera.org:8080/#/c/12462/2/src/kudu/tablet/transactions/transaction_driver.cc@27 PS2, Line 27: #include <glog/logging.h> > warning: #includes are not sorted properly [llvm-include-order] Ack -- To view, visit http://gerrit.cloudera.org:8080/12462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id761851741297e29a4666bec0c34fc4f7285f715 Gerrit-Change-Number: 12462 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Wong <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Tue, 19 Feb 2019 03:03:19 +0000 Gerrit-HasComments: Yes
