Hao Hao has posted comments on this change.

Change subject: WIP KUDU-1943: Add BlockTransaction to Block Manager
......................................................................


Patch Set 8:

(41 comments)

http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/cfile/bloomfile.h
File src/kudu/cfile/bloomfile.h:

PS8, Line 47: blocks
> It's just one block, right? Why 'blocks'?
Done


http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/cfile/cfile_writer.cc
File src/kudu/cfile/cfile_writer.cc:

Line 283:     block_->Finalize();
> RETURN_NOT_OK.
Done


Line 289:     block_->Finalize();
> RETURN_NOT_OK.
Done


http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/cfile/cfile_writer.h
File src/kudu/cfile/cfile_writer.h:

PS8, Line 109: blocks
> Just one block.
Done


http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/fs/block_manager-stress-test.cc
File src/kudu/fs/block_manager-stress-test.cc:

Line 396: 
> Nit: why add an empty line here and not in the WriterThread/ReaderThread fu
Removed.


http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/fs/block_manager-test.cc
File src/kudu/fs/block_manager-test.cc:

Line 995:   ASSERT_OK(transaction.CommitWrites());
> Before committing, how about trying to open all the blocks in block_ids and
It looks like it is not returning not found error for FileBlockManager before 
the block is closed. Not sure if it is that expected?


http://gerrit.cloudera.org:8080/#/c/7207/3/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

Line 143: 
> I think FLAGS_cfile_do_on_finish has mostly outlived its usefulness (it was
Done


http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

PS8, Line 73: For LogWritableBlock, container
            : //    is released for further usage to avoid unnecessarily 
creating new containers.
> Generalize this example so that it doesn't refer to implementation details.
Done


PS8, Line 76:  For LogWritableBlock,
            : //    it also finishes the blocks belong to the same container 
together to
            : //    reduce fsync() usage.
> Generalize this so that it doesn't refer to a particular implementation.
Done


Line 91:     // The block is finalized as it has been fully written.
> How about: "There is some dirty data in the block and no more may be writte
Updated. But I think FINALIZED does not necessarily mean 'There is some dirty 
data in the block'


Line 146:   // Finalize a fully written block.
> This doesn't explain what restrictions the FINALIZED state places on the Wr
Done


Line 147:   virtual Status Finalize() = 0;
> Nit: since this has a real effect on the state of the block, please move it
Done


Line 149:   virtual int64_t offset() const { return 0; }
> I don't think this belongs in WritableBlock. It doesn't make any sense to a
Removed.


Line 285: // Group a set of blocks writes together in a transaction.
> The names and comments here conflate "write" and "create". Please choose on
Done


PS8, Line 289: BlockTransaction
> Besides the name, what's the difference between ScopedWritableBlockCloser a
Done.


PS8, Line 305: Status CommitWrites
> Can you add docs to this and AddCreatedBlock? Their names aren't as straigh
Done


Line 320:   std::vector<WritableBlock*> created_blocks_;
> Since we're now writing C++11 code, could you change this to be a vector of
Done


http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

Line 237:   virtual Status Finalize() OVERRIDE;
> Nit: move to be just above BytesAppended().
Done


http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

Line 377: // Test a container can be reused when the block is finalized
> Please also add a variant of this for block_manager-test, to exercise FBM c
Done


Line 382:     unique_ptr<WritableBlock> writer;
> The writer goes out of scope (and thus is closed) with every iteration of t
Done


Line 385:     writer->Finalize();
> ASSERT_OK
Done


http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS8, Line 270: have been finalized
> What happens if this is called and the blocks are not finalized?
Slightly updated the logic. This should only be called if a block is finished.


Line 272:   Status FinishBlocks(std::vector<WritableBlock*> blocks);
> Don't need std:: prefix.
Done


PS8, Line 365: 'should_update' indicates if it should round up
> I think we try to avoid boolean parameters with enums (e.g. a AlignmentAdju
Done


PS8, Line 810: ner file " << data_file_->filename();
> This is syncing the data file and metadata file, right? may need another VL
Done


PS8, Line 812: ata());
             :   RETURN_NOT_OK(SyncMet
> Why do we need this here and not in FinishBlock?
FinishBlock is called within DoClose() which is SyncMode is 'Sync'. That 
indicates data/metadata is already been synced or will be synced later.


PS8, Line 1174: true
> maybe enum here too?
Done


Line 1174:   return DoClose(SYNC, true);
> What happens if I Close() (or Abort()) a block that's FINALIZED? Won't we c
For the first case, added RoundMode to indicate if Finalize() is called, we 
should not neither round up the container not call MakeContainerAvailable().

For the second case, you mean thread safety for container's in-memory 
accounting? I slightly modified the logic to only refer/update the container's 
in-memory accounting when RoundMode is ROUND_UP which means the container is 
only available to the current writable block. And for SyncData() and 
SyncMetadata() they should be thread safe. But to avoid to call fsync() on the 
synced data, I added a per-container lock to track the current synced offset.

Will add test cases.


PS8, Line 1260: logBlockManager
> How about just 'lbm' or 'bm'?
Done


PS8, Line 1264: VLOG(3) << "Finalizing block " << id();
> Should we be logging even if it's CLEAN?
What is the reason to do so?


Line 1265:     logBlockManager->RoundUpContainer(container_, block_offset_,
> How does this work when a FINALIZED block fails to Close() (or Abort())? Wo
Right, updated that so when a FINALIZED block fails to Close() (or Abort()) the 
container would be marked as 'read only'.


Line 1304:       if (!should_finish) Finalize();
> Finalize() can fail; you should update s if it does.
Done


Line 1726:       std::lock_guard<simple_spinlock> l(lock_);
> Why do you need to take the lock?
Removed.


Line 1732:   for (const auto &entry : created_block_map) {
> const auto&
Done


Line 1813: void LogBlockManager::RoundUpContainer(LogBlockContainer* container,
> AFAICT, the only reason to go through the LBM for this (and not call contai
Right, your suggestion makes sense. I did some modification to the code so that 
we always round up the container before make it available in Finalize(). This 
would maintain the same behavior (which only has one write thread at a time for 
the container) as before so that we do not need to add per-container lock for 
this any more.


Line 1816:   RoundUpContainerUnlocked(container, offset, length);
> Since RoundUpContainerUnlocked is only ever called here, you can omit the U
Removed.


http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/fs/log_block_manager.h
File src/kudu/fs/log_block_manager.h:

Line 195:   FRIEND_TEST(LogBlockManagerTest, TestFinalizeBlock);
> Nit: alphabetize.
Done


Line 264:                        int64_t offset, int64_t length);
> Nit: indentation
Removed.


http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/tablet/deltafile.h
File src/kudu/tablet/deltafile.h:

Line 72:   // Closes the delta file, releasing the underlying block to 'closer'.
> Update
Done


http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/tablet/diskrowset.h
File src/kudu/tablet/diskrowset.h:

PS8, Line 91: it
> them
Done


http://gerrit.cloudera.org:8080/#/c/7207/8/src/kudu/tablet/multi_column_writer.h
File src/kudu/tablet/multi_column_writer.h:

Line 69:   // to 'closer'.
> Update.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a88884bd
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: Hao Hao <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes

Reply via email to