Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/7439 )
Change subject: mvcc: allow tablet shutdown without completing txs ...................................................................... Patch Set 21: (13 comments) http://gerrit.cloudera.org:8080/#/c/7439/21/src/kudu/integration-tests/stop_tablet-itest.cc File src/kudu/integration-tests/stop_tablet-itest.cc: http://gerrit.cloudera.org:8080/#/c/7439/21/src/kudu/integration-tests/stop_tablet-itest.cc@108 PS21, Line 108: // Insert data until the tablet becomes visible to the server. you don't need to insert data to get it to be visible - just creating the table should be enough to create the tablet, right? http://gerrit.cloudera.org:8080/#/c/7439/21/src/kudu/integration-tests/stop_tablet-itest.cc@202 PS21, Line 202: // Now that the leader has been stopped, wait longer and ensure that no : // more writes get through. : SleepFor(MonoDelta::FromSeconds(5)); : ASSERT_EQ(num_rows_after_stop, writes.rows_inserted()); this is odd. Wouldn't you expect it to re-elect a new leader? Or is this because you're reaching into the server and calling 'Stop()' but not actually shutting down the tablet, so the leader keeps sending heartbeats? Worth a comment. http://gerrit.cloudera.org:8080/#/c/7439/21/src/kudu/tablet/mvcc.h File src/kudu/tablet/mvcc.h: http://gerrit.cloudera.org:8080/#/c/7439/21/src/kudu/tablet/mvcc.h@276 PS21, Line 276: should nit: s/should/will/ http://gerrit.cloudera.org:8080/#/c/7439/21/src/kudu/tablet/mvcc.h@280 PS21, Line 280: inline > Is "inline" useful here? I thought methods defined in their header declarat yea, not necessary http://gerrit.cloudera.org:8080/#/c/7439/21/src/kudu/tablet/mvcc.h@280 PS21, Line 280: is_closed > nit: Perhaps is_open() would be more intuitive than is_closed()? This is be I could go either way on this. I think it's natural to talk about whether something is or is not closed (I don't interpret "closed" as a negative adjective in the same way as "disabled") http://gerrit.cloudera.org:8080/#/c/7439/21/src/kudu/tablet/mvcc.h@281 PS21, Line 281: .load() > Can we get away with std::memory_order_relaxed? I actually looked at the assembly generated by the default memory order, and load() just turns into a normal load instruction with nothing expensive about it, so I think this is fine as is. http://gerrit.cloudera.org:8080/#/c/7439/21/src/kudu/tablet/mvcc.h@382 PS21, Line 382: closed_ > perhaps open_ and default to true? I kinda disagree on this one since we often have this "shutdown_" type member variable. http://gerrit.cloudera.org:8080/#/c/7439/21/src/kudu/tablet/tablet.h File src/kudu/tablet/tablet.h: http://gerrit.cloudera.org:8080/#/c/7439/21/src/kudu/tablet/tablet.h@121 PS21, Line 121: Unlike Shutdown(), this is not a lifecycle change, but rather a functional : // one. Calls to this are orthogonal to calls to Shutdown(). > It's not clear when one vs. the other should be called. Can you clarify her yea, agree this can still be clarified better whether it's required or optional, should be called before/after Shutdown(), etc. http://gerrit.cloudera.org:8080/#/c/7439/21/src/kudu/tablet/tablet.h@175 PS21, Line 175: error the tablet nit: missing "if" http://gerrit.cloudera.org:8080/#/c/7439/21/src/kudu/tablet/tablet.h@176 PS21, Line 176: It is critical for : // transactional integrity that errors here Stop() the tablet. this bit of the comment is confusing in this context. General rule of thumb for header comments is that you should try to focus on what external callers need to know about the function, and be clear which things are "instructions" for the caller vs "FYI" for the caller, vs implementation details (which probably should go in the .cc instead). In this particular example, the caller might be confused by the phrasing -- does this mean that, if this returns an error, the caller is responsible for calling Stop()? Or that if this returns an error, the tablet is guaranteed to have already stopped itself? Structuring it as a positive statement on a post-condition may help make it clearer: "If this returns an error, the Tablet will be in stopped state." (and then if you feel it's critical to explain why this is critical for maintaining correctness, do so in the implementation?) http://gerrit.cloudera.org:8080/#/c/7439/21/src/kudu/tablet/tablet.h@183 PS21, Line 183: // Returns an error the tablet has been stopped or if there was an error : // (e.g. a disk failure) Applying the operations. It is critical for : // transactional integrity that errors here Stop() the tablet. same http://gerrit.cloudera.org:8080/#/c/7439/21/src/kudu/tablet/tablet.h@480 PS21, Line 480: // Returns an error the tablet has been stopped or if there was an error : // (e.g. a disk failure) when checking the keys. It is critical for : // transactional integrity that errors here Stop() the tablet. same http://gerrit.cloudera.org:8080/#/c/7439/21/src/kudu/tablet/tablet.cc File src/kudu/tablet/tablet.cc: http://gerrit.cloudera.org:8080/#/c/7439/21/src/kudu/tablet/tablet.cc@905 PS21, Line 905: // In order for failures here to be safe, the tablet must have been : // stopped, ensuring no further writes can commit durably. : // : // TODO(awong): for now, the tablet will only be Stopped() when it : // shuts down. In the future, the tablet should be Stopped() at a lower : // level when handling recoverable errors (e.g. a disk failures). instead of copy-pasting this comment block a few times, maybe move this to a single spot and refer to it here? -- To view, visit http://gerrit.cloudera.org:8080/7439 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I983620f27e7226806a2cca253db7619731914d42 Gerrit-Change-Number: 7439 Gerrit-PatchSet: 21 Gerrit-Owner: Andrew Wong <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Comment-Date: Wed, 01 Nov 2017 00:10:33 +0000 Gerrit-HasComments: Yes
