Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15506 )
Change subject: tablet: plumb delta stats into delta compaction outputs
......................................................................


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/15506/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15506/3//COMMIT_MSG@15
PS3, Line 15: I considered also plumbing stats into merge compactions, but opted not
            : to
I'm trying to conceptualize what happens to the delta stats of N REDO files and M UNDO files following a merge compaction. All of the deltas wind up in one UNDO file, except for the "most recent" delta of a row that happens to be a DELETE; that'll stick around in a new singleton REDO file.

If there are no DELETEs, I can see how you could preserve delta stats: find the smallest min_timestamp and the largest max_timestamp, and declare that to be the delta stats of the new UNDO file. But if there are DELETEs, how can you make sense of the pre-compaction delta stats, which were associated with file boundaries that are now hopelessly scrambled?


http://gerrit.cloudera.org:8080/#/c/15506/3/src/kudu/tablet/delta_compaction.cc
File src/kudu/tablet/delta_compaction.cc:

http://gerrit.cloudera.org:8080/#/c/15506/3/src/kudu/tablet/delta_compaction.cc@135
PS3, Line 135:   unique_ptr<DeltaStats> redo_stats(new DeltaStats);
             :   unique_ptr<DeltaStats> undo_stats(new DeltaStats);
Aren't DeltaStats trivially movable (i.e. doesn't the compiler generate move-constructor/move-assignment-operator for them)? If not, could you add those and avoid the allocations?


http://gerrit.cloudera.org:8080/#/c/15506/3/src/kudu/tablet/delta_compaction.cc@421
PS3, Line 421:   vector<unique_ptr<DeltaStats>> redo_stats;
Also hoping we could avoid these allocations by moving DeltaStats instances directly.
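
For illustration only, and not taken from the patch: a minimal sketch of the two ideas raised so far, assuming a hypothetical stand-in type SketchDeltaStats (Kudu's real DeltaStats carries more state and has a different interface). It shows (a) taking the smallest min_timestamp and the largest max_timestamp across input stats, which is only meaningful in the no-DELETE case described above, and (b) keeping stats by value in a vector and moving them in, instead of heap-allocating each one behind a unique_ptr. Whether the real DeltaStats is movable is the open question in the comment; the sketch only demonstrates that a struct made of value-type members gets implicit move operations.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for DeltaStats: only a timestamp range and
// per-column update counts are modeled here.
struct SketchDeltaStats {
  uint64_t min_timestamp = std::numeric_limits<uint64_t>::max();
  uint64_t max_timestamp = 0;
  std::vector<int64_t> update_counts;  // heap-backed member, cheap to move

  // Widen the tracked range to include 'ts'.
  void UpdateRange(uint64_t ts) {
    min_timestamp = std::min(min_timestamp, ts);
    max_timestamp = std::max(max_timestamp, ts);
  }
};

// No user-declared copy/move/destructor, so the compiler declares
// move operations implicitly.
static_assert(std::is_move_constructible<SketchDeltaStats>::value,
              "stand-in stats type should be movable");
static_assert(std::is_move_assignable<SketchDeltaStats>::value,
              "stand-in stats type should be move-assignable");

// Builds two example stats objects and stores them by value.
// std::move avoids copying update_counts; no unique_ptr is involved.
std::vector<SketchDeltaStats> CollectStats() {
  std::vector<SketchDeltaStats> out;

  SketchDeltaStats redo;
  redo.UpdateRange(10);
  redo.UpdateRange(25);
  redo.update_counts = {3, 0, 1};
  out.push_back(std::move(redo));

  SketchDeltaStats undo;
  undo.UpdateRange(5);
  undo.UpdateRange(8);
  out.push_back(std::move(undo));

  return out;
}

// No-DELETE case only: the merged range is the smallest min_timestamp
// and the largest max_timestamp over all inputs. Update counts are
// deliberately left empty; this sketch does not model them.
SketchDeltaStats MergeTimestampRanges(
    const std::vector<SketchDeltaStats>& inputs) {
  assert(!inputs.empty());
  SketchDeltaStats merged;
  for (const auto& s : inputs) {
    merged.min_timestamp = std::min(merged.min_timestamp, s.min_timestamp);
    merged.max_timestamp = std::max(merged.max_timestamp, s.max_timestamp);
  }
  return merged;
}
```

With the two example inputs covering [10, 25] and [5, 8], MergeTimestampRanges yields the range [5, 25]. The sketch says nothing about the DELETE case, where the input file boundaries no longer map onto the outputs.
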


http://gerrit.cloudera.org:8080/#/c/15506/3/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

http://gerrit.cloudera.org:8080/#/c/15506/3/src/kudu/tablet/delta_tracker.cc@108
PS3, Line 108:   DCHECK(stats.empty() || blocks.size() == stats.size()) <<
             :       Substitute("Unexpected number of stats: expected 0 or $0, got $1",
             :                  blocks.size(), stats.size());
This suggests that there should be just a single vector of a struct.


http://gerrit.cloudera.org:8080/#/c/15506/3/src/kudu/tablet/tablet_history_gc-test.cc
File src/kudu/tablet/tablet_history_gc-test.cc:

http://gerrit.cloudera.org:8080/#/c/15506/3/src/kudu/tablet/tablet_history_gc-test.cc@101
PS3, Line 101: erroring out if it failed or our
             :     // metrics don't make sense.
Well, we ASSERT fail in this case rather than "erroring out".



--
To view, visit http://gerrit.cloudera.org:8080/15506
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iea2f28fb2905ddcc007c88ab80ae2185587400f0
Gerrit-Change-Number: 15506
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 20 Mar 2020 04:41:44 +0000
Gerrit-HasComments: Yes
