Adar Dembo has posted comments on this change. Change subject: KUDU-1943: Add BlockTransaction to Block Manager ......................................................................
Patch Set 20: (30 comments) http://gerrit.cloudera.org:8080/#/c/7207/19//COMMIT_MSG Commit Message: PS19, Line 22: s-per > Done You still need 'localhost' though; that's not an optional parameter I think. I was just saying you could remove the port, which is optional. http://gerrit.cloudera.org:8080/#/c/7207/20/src/kudu/fs/block_manager-stress-test.cc File src/kudu/fs/block_manager-stress-test.cc: Line 282: Slice seed_slice(reinterpret_cast<const uint8_t *>(&seed), sizeof(seed)); Should be <const uint8_t*> PS20, Line 297: WritableBlock *block = dirty_blocks[next_block_idx].get(); : Random &rand = dirty_block_rands[next_block_idx]; You changed the pointer/ref style here. Line 310: // Close all dirty blocks. Update. Line 311: for (const auto &dirty_block : dirty_blocks) { const auto& Line 318: for (auto &dirty_block : dirty_blocks) { auto& Line 326: for (const auto &block : all_dirty_blocks) { const auto& http://gerrit.cloudera.org:8080/#/c/7207/20/src/kudu/fs/block_manager.cc File src/kudu/fs/block_manager.cc: PS20, Line 44: "Control when to flush a block, either at 'finalize', 'close'," : " or 'never'. 'finalize' indicates to flush when writing to " : "the block is finished. 'close' indicates to defer the flushing " : "to when the entire BlockTransaction, that the blocks belong " : "to, is committed. And 'never' is not flush at all."); Rewrite: Controls when to flush a block. Valid values are 'finalize', 'close', or 'never'. If 'finalize', blocks will be flushed when writing is finished. If 'close', blocks will be flushed when their transaction is committed. If 'never', blocks will never be flushed. http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: Line 1410: } > Updated. To clarify, I did this because we were saying 'Finalize indicates Okay, but why bother maintaining FlushDataAsync() as a separate method? It certainly shouldn't be part of the WritableBlock interface because there's no need for anyone outside the block manager to know about the distinction. I suppose you could keep it as per-impl methods in LogWritableBlock/FileWritableBlock, but it's just one line of code, so the decomposition doesn't seem that useful. http://gerrit.cloudera.org:8080/#/c/7207/20/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: PS20, Line 260: finalize it if has not finalizing it if it has not yet been finalized. PS20, Line 260: And update "Also updates" PS20, Line 336: . , PS20, Line 337: th the PS20, Line 434: // Finalize a fully written block. It round up this container, truncate it if full : // and mark the container as available. : void FinalizeBlock(int64_t block_offset, int64_t block_length); Nit: would prefer to keep BlockCreated and BlockDeleted next to each other since they're fairly symmetric. PS20, Line 543: // : // Declared atomic because it is mutated from BlockDeleted(). Since all counters are now atomic, these comments are no longer interesting. PS20, Line 878: container "data" (to be consistent with "Syncing metadata file") PS20, Line 882: does so at any given point Replace with "is already made durable". Or rewrite as "Append metadata only after data is synced so that there's no chance of metadata landing on the disk before the data." PS20, Line 883: const auto & const auto* PS20, Line 890: if (mode == SYNC) I don't think this one needs to be conditioned on SYNC. Line 891: for (LogWritableBlock* block : blocks) { Would be nice if we could DCHECK that if blocks.size() > 1, all of the blocks are FINALIZED. If they aren't, we have a big problem: we'll end up calling FinalizeBlock() on the same container multiple times from within DoClose() and could corrupt on-disk data via truncate calls. PS20, Line 1040: // Note that this take places _after_ the container has been synced to disk. : // That's OK; truncation isn't needed for correctness, and in the event of a : // crash or error, it will be retried at startup. This needs to be updated. Line 1068: if (PREDICT_TRUE(new_next_block_offset >= next_block_offset())) { Don't need the if statement anymore. PS20, Line 1071: total_bytes_ .IncrementBy(fs_aligned_length(block_offset, block_length)); : total_blocks_.Increment(); Let's defer this to BlockCreated(). Then RoundUpContainer can be renamed to the more clear UpdateNextBlockOffset. PS20, Line 1206: {this} Nit: { this } Line 1216: RETURN_NOT_OK(container_->DoCloseBlocks({this}, LogBlockContainer::SyncMode::NO_SYNC)); Nit: { this } PS20, Line 1289: // We do not mark the container as read-only if encounters failure : // of FlushData(), since the corresponding metadata has not been : // appended yet. Rewrite: "We do not mark the container as read-only if FlushDataAsync() fails since the corresponding metadata has not yet been appended." Line 1317: LogBlockManager* lbm = container_->block_manager(); 'lbm' is only used once; don't bother storing it in a local variable. PS20, Line 1319: BytesAppended() Just use block_length_ here; it's more clear. PS20, Line 1319: id() and block_id_ here. PS20, Line 1751: <LogWritableBlock *> Nit: <LogWritableBlock*> -- To view, visit http://gerrit.cloudera.org:8080/7207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a88884bd Gerrit-PatchSet: 20 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Dan Burkert <danburk...@apache.org> Gerrit-Reviewer: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes