Todd Lipcon has posted comments on this change. Change subject: KUDU-495 (part 2): ensure all catalog writes for an operation are batched ......................................................................
Patch Set 2: (9 comments) http://gerrit.cloudera.org:8080/#/c/2695/2/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: Line 36: // in iteration order (i.e. the same order as TableInfo::GetAllTablets()). Rather than saying 'iteration order', can you say that it's alphabetic order by tablet ID? TableInfo::GetAllTablets() returns in that order today because it's a std::map underneath, right? I think it's also worth adding a comment on GetAllTablets() that it returns in such an order and that callers rely on this property because it defines the locking order. Otherwise I can see this being a subtle regression down the road. Line 37: // This strict ordering prevents deadlocks. Along the same lines, COMMIT must I remember chatting on slack about this at one point recently, but wanted to suggest again: would it be possible for us to have a mode in which our fancy COW lock would fall back to simple RW locking and then might get the TSAN lock-order inversion tracking to catch these issues for us? Or some annotations we could use? This stuff is subtle enough that I feel like we'll make mistakes over and over again without some assurance from sanitizers. And we probably won't have great test coverage of these timing issues. Or would switching to a less fancy lock in itself *cause* the thing to deadlock? Line 234: // Lookup the table. nit (while you are fixing other such nits): it's "Look up" as a verb and "lookup" as a noun. http://gerrit.cloudera.org:8080/#/c/2695/2/src/kudu/master/master-test.cc File src/kudu/master/master-test.cc: Line 556: MasterMetadataVerifier(const unordered_set<string>& live_tables, not sure from here or the class doc above what live and dead table sets are (or why dead is a multiset) Line 607: return Status::Corruption("Live table set mismatch"); would be nice to actually show the set difference here (eg dump a JoinStrings of the expected and actual, or use STLSetDifference or something) Line 626: multiset<string> referenced_tables; instead of a multiset, can you use map<string, int> here and then down below use referenced_tables[tablet.tablet_id]++? Seems slightly more straight-forward Line 707: ColumnSchema("v1", UINT64), are the actual columns relevant here? can you just use one of the existing CreateTestSchema functions or somesuch? Or just use one column Line 754: ASSERT_TRUE(s.IsAlreadyPresent() || s.IsRuntimeError()) << s.ToString(); what RuntimeError do you expect? worht a comment http://gerrit.cloudera.org:8080/#/c/2695/2/src/kudu/util/fault_injection.h File src/kudu/util/fault_injection.h: Line 58: Status DoMaybeReturnFailure(const char* failure_str, double fraction); > I think this is useful but I fear that Status::Failure is not generic enoug Oh, yea, now I see why you were checking RuntimeError in the test. I agree that it would be nice for the macro to look like MAYBE_RETURN_FAILURE(fraction_flag, Status::IOError("Injected IO error")) and with the magic of macros make it not bother evaluating the Status object unless necessary -- To view, visit http://gerrit.cloudera.org:8080/2695 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5cbccf5ce22c005d7aa25bbdefe7502873a8ed7d Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
