Hao Hao has posted comments on this change. Change subject: WIP KUDU-1943: Add BlockTransaction to Block Manager ......................................................................
Patch Set 8: (41 comments) http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/cfile/bloomfile.h File src/kudu/cfile/bloomfile.h: PS8, Line 47: blocks > It's just one block, right? Why 'blocks'? Done http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/cfile/cfile_writer.cc File src/kudu/cfile/cfile_writer.cc: Line 283: block_->Finalize(); > RETURN_NOT_OK. Done Line 289: block_->Finalize(); > RETURN_NOT_OK. Done http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/cfile/cfile_writer.h File src/kudu/cfile/cfile_writer.h: PS8, Line 109: blocks > Just one block. Done http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/fs/block_manager-stress-test.cc File src/kudu/fs/block_manager-stress-test.cc: Line 396: > Nit: why add an empty line here and not in the WriterThread/ReaderThread fu Removed. http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/fs/block_manager-test.cc File src/kudu/fs/block_manager-test.cc: Line 995: ASSERT_OK(transaction.CommitWrites()); > Before committing, how about trying to open all the blocks in block_ids and It looks like it is not returning not found error for FileBlockManager before the block is closed. Not sure if it is that expected? http://gerrit.cloudera.org:8080/#/c/7207/3/src/kudu/fs/block_manager.h File src/kudu/fs/block_manager.h: Line 143: > I think FLAGS_cfile_do_on_finish has mostly outlived its usefulness (it was Done http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/fs/block_manager.h File src/kudu/fs/block_manager.h: PS8, Line 73: For LogWritableBlock, container : // is released for further usage to avoid unnecessarily creating new containers. > Generalize this example so that it doesn't refer to implementation details. Done PS8, Line 76: For LogWritableBlock, : // it also finishes the blocks belong to the same container together to : // reduce fsync() usage. > Generalize this so that it doesn't refer to a particular implementation. Done Line 91: // The block is finalized as it has been fully written. > How about: "There is some dirty data in the block and no more may be writte Updated. But I think FINALIZED does not necessarily mean 'There is some dirty data in the block' Line 146: // Finalize a fully written block. > This doesn't explain what restrictions the FINALIZED state places on the Wr Done Line 147: virtual Status Finalize() = 0; > Nit: since this has a real effect on the state of the block, please move it Done Line 149: virtual int64_t offset() const { return 0; } > I don't think this belongs in WritableBlock. It doesn't make any sense to a Removed. Line 285: // Group a set of blocks writes together in a transaction. > The names and comments here conflate "write" and "create". Please choose on Done PS8, Line 289: BlockTransaction > Besides the name, what's the difference between ScopedWritableBlockCloser a Done. PS8, Line 305: Status CommitWrites > Can you add docs to this and AddCreatedBlock? Their names aren't as straigh Done Line 320: std::vector<WritableBlock*> created_blocks_; > Since we're now writing C++11 code, could you change this to be a vector of Done http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/fs/file_block_manager.cc File src/kudu/fs/file_block_manager.cc: Line 237: virtual Status Finalize() OVERRIDE; > Nit: move to be just above BytesAppended(). Done http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: Line 377: // Test a container can be reused when the block is finalized > Please also add a variant of this for block_manager-test, to exercise FBM c Done Line 382: unique_ptr<WritableBlock> writer; > The writer goes out of scope (and thus is closed) with every iteration of t Done Line 385: writer->Finalize(); > ASSERT_OK Done http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: PS8, Line 270: have been finalized > What happens if this is called and the blocks are not finalized? Slightly updated the logic. This should only be called if a block is finished. Line 272: Status FinishBlocks(std::vector<WritableBlock*> blocks); > Don't need std:: prefix. Done PS8, Line 365: 'should_update' indicates if it should round up > I think we try to avoid boolean parameters with enums (e.g. a AlignmentAdju Done PS8, Line 810: ner file " << data_file_->filename(); > This is syncing the data file and metadata file, right? may need another VL Done PS8, Line 812: ata()); : RETURN_NOT_OK(SyncMet > Why do we need this here and not in FinishBlock? FinishBlock is called within DoClose() which is SyncMode is 'Sync'. That indicates data/metadata is already been synced or will be synced later. PS8, Line 1174: true > maybe enum here too? Done Line 1174: return DoClose(SYNC, true); > What happens if I Close() (or Abort()) a block that's FINALIZED? Won't we c For the first case, added RoundMode to indicate if Finalize() is called, we should not neither round up the container not call MakeContainerAvailable(). For the second case, you mean thread safety for container's in-memory accounting? I slightly modified the logic to only refer/update the container's in-memory accounting when RoundMode is ROUND_UP which means the container is only available to the current writable block. And for SyncData() and SyncMetadata() they should be thread safe. But to avoid to call fsync() on the synced data, I added a per-container lock to track the current synced offset. Will add test cases. PS8, Line 1260: logBlockManager > How about just 'lbm' or 'bm'? Done PS8, Line 1264: VLOG(3) << "Finalizing block " << id(); > Should we be logging even if it's CLEAN? What is the reason to do so? Line 1265: logBlockManager->RoundUpContainer(container_, block_offset_, > How does this work when a FINALIZED block fails to Close() (or Abort())? Wo Right, updated that so when a FINALIZED block fails to Close() (or Abort()) the container would be marked as 'read only'. Line 1304: if (!should_finish) Finalize(); > Finalize() can fail; you should update s if it does. Done Line 1726: std::lock_guard<simple_spinlock> l(lock_); > Why do you need to take the lock? Removed. Line 1732: for (const auto &entry : created_block_map) { > const auto& Done Line 1813: void LogBlockManager::RoundUpContainer(LogBlockContainer* container, > AFAICT, the only reason to go through the LBM for this (and not call contai Right, your suggestion makes sense. I did some modification to the code so that we always round up the container before make it available in Finalize(). This would maintain the same behavior (which only has one write thread at a time for the container) as before so that we do not need to add per-container lock for this any more. Line 1816: RoundUpContainerUnlocked(container, offset, length); > Since RoundUpContainerUnlocked is only ever called here, you can omit the U Removed. http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/fs/log_block_manager.h File src/kudu/fs/log_block_manager.h: Line 195: FRIEND_TEST(LogBlockManagerTest, TestFinalizeBlock); > Nit: alphabetize. Done Line 264: int64_t offset, int64_t length); > Nit: indentation Removed. http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/tablet/deltafile.h File src/kudu/tablet/deltafile.h: Line 72: // Closes the delta file, releasing the underlying block to 'closer'. > Update Done http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/tablet/diskrowset.h File src/kudu/tablet/diskrowset.h: PS8, Line 91: it > them Done http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/tablet/multi_column_writer.h File src/kudu/tablet/multi_column_writer.h: Line 69: // to 'closer'. > Update. Done -- To view, visit http://gerrit.cloudera.org:8080/7207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a88884bd Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Hao Hao <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Dan Burkert <[email protected]> Gerrit-Reviewer: Hao Hao <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
