Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
......................................................................


Patch Set 20:

(26 comments)

http://gerrit.cloudera.org:8080/#/c/7207/19//COMMIT_MSG
Commit Message:

PS19, Line 22: s-per
> Nit: can just omit this.
Done


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/cfile/cfile-test.cc
File src/kudu/cfile/cfile-test.cc:

Line 38: DECLARE_bool(cfile_write_checksums);
> warning: redundant 'FLAGS_block_time_to_flush' declaration [readability-red
Done


Line 892:   unique_ptr<WritableBlock> sink;
> What happened to parameterizing this test for all block_time_to_flush value
Refactored the code and the flag specific code in Cfile are removed.


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/cfile/cfile_writer.cc
File src/kudu/cfile/cfile_writer.cc:

PS17, Line 266:   RETURN_NOT_OK(block_->Finalize());
              :   transaction->AddCreatedBlock(std::move(block_));
              :   return Status::OK();
              : }
              : 
              : void CFileWriter::AddMetadataPair(const Slice &key, const Slice 
&value) {
              :   C
> This is incorrect. We should always Finalize() the block; whether or not we
Done


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/fs/block_manager.cc
File src/kudu/fs/block_manager.cc:

Line 43: DEFINE_string(block_manager_flush_control, "finalize",
> Let's rename to "block_manager_flush_control":
Done


Line 43: DEFINE_string(block_manager_flush_control, "finalize",
> warning: redundant 'FLAGS_block_time_to_flush' declaration [readability-red
Done


Line 44:               "Control when to flush a block, either at 'finalize', 
'close',"
> This should apply to any WritableBlock, not just those in a BlockTransactio
Done


PS17, Line 48: neve
> Let's use "never" instead, since we're talking about a point in time.
Done


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

Line 32: namespace kudu {
> This doesn't belong here (block_coalesce_close didn't belong either).
Done


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

Line 340: Status FileWritableBlock::Finalize() {
> This is where we should check the value of block_time_to_flush. We should o
Done


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 1176:   deleted_ = true;
> I mentioned this to Dan in our impromptu meeting yesterday; perhaps you wer
Done


PS17, Line 1397: ReadableBlock::Close() {
> You can use 'lbm' here.
Done


Line 1410: }
> And here we also need to gate on block_time_to_flush == 'finalize'.
Updated. To clarify, I did this because we were saying 'Finalize indicates 
Flushing'. But after thought more, I think your suggestion makes more sense. So 
I separated the flushing (I kept FlushDataAsync()) from Finalize and use the 
flag to control when to call FlushDataAsync().


http://gerrit.cloudera.org:8080/#/c/7207/19/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 245: 
> Don't need this; just change "enum CloseFlag" to "enum CloseFlags".
Done


Line 499:   // Note: 'record' may be swapped into 'report'; do not use it after 
calling
> Don't need this; it's only ever used internally (by LogBlockContainer).
Done


Line 582:       total_bytes_(0),
> Can we get by with a boolean? Seems like we could set it in RoundUpContaine
Removed as is not necessary.


PS19, Line 594: }
              : 
> I don't understand what protection the lock offers. Furthermore, PosixRWFil
Discussed this with Adar offline. I will remove this lock as 
PosixRWFile::Sync() already tries to keeps track of whether a sync is 
necessary. Meanwhile, fix KUDU-2102 in another commit.


Line 906: }
> You should add a note here about mutating result_status on failure, like in
Refactored the code and removed FinishBlock.


Line 939:     
RETURN_NOT_OK_HANDLE_ERROR(data_dir_->RefreshIsFull(DataDir::RefreshMode::ALWAYS));
> I don't understand this, and it's making me nervous.
Done


PS19, Line 955: Status LogBlockContainer::AppendMetadata(const BlockRecordPB& 
pb) {
              :   DCHECK(!read_only());
              :   // Note: We don't check for sufficient disk space for 
metadata writes in
              :   // order to allow for block deletion on full disks.
              :   RETURN_NOT_OK_HANDLE_ERROR(metadata_file_->Append(pb));
              :   return Status::OK();
              : }
              : 
> Here's another case where, even though FinishBlock() can be called concurre
Done


Line 1076:         "Container $0 with size $1 is now full, max size is $2",
> These two accesses to next_sync_offset_ aren't atomic. Use StoreMax() inste
Removed.


PS19, Line 1173: 
               : void LogBlock::Delete() {
               :   DCHECK(!deleted_);
               :   deleted_ = true;
               : }
               : 
               : ////////////////////////////////////////////////////////////
               : // LogWritableBlock (definition)
               : ////////////////////////////////////////////////////////////
               : 
               : Log
> I keep circling back to this as a potential synchronization issue. The comp
With the current change since RoundUpContainer is always called before 
MakeContainerAvailable, I think it should also be guaranteed to be called by 
just one writer thread at a time. I refactored the DoClose(), SyncBlocks(), 
FinishBlock() as you suggested, let me know if you think it is still not clear.
 
Updated.


Line 1188:       block_length_(0),
> Surely we needn't duplicate this from LogBlock; if you need it at the conta
Sorry about this, I originally removed the duplicate code by calling the 
container level fs_alighed_length(). You can see that in patch set 4 L1069. But 
apparently I missed this after rebasing the patch later. Updated it now.


PS19, Line 1464: ////////////////////////////////////////////////////////////
               : // LogBlockManager
               : ////////////////////////////////////////////////////////////
               : 
               : const char* LogBlockManager::kContainerMetadataFileSuffix = 
".metadata";
               : const char* LogBlockManager::kContainerDataFileSuffix = 
".data";
               : 
               : static const char* kBlockManagerType = "log";
               : 
               : // These values were arrived at via experimentation. See 
commit 4923a74 for
               : // more details.
               : const map<int64_t, int64_t> 
LogBlockManager::kPerFsBlockSizeBlockLimits({
               :   { 1024, 673 },
               :   { 2048, 1353 },
               :   { 4096, 2721 }});
               : 
               : LogBlockManager::LogBlockManager(Env* env,
               :                                  FsErrorManager* error_manager,
               :                                  const BlockManagerOptions& 
opts)
               :   : mem_tracker_(MemTracker::CreateTracker(-1,
               :                                            "log_block_manager",
               :    
> Rewrite:
Done


Line 1905:   open_block_ids_.erase(lb->block_id());
> I think this control flow (and the corresponding surgery to DoClose) is mor
Refactored the code as suggested. Added DoCloseBlocks() in LogBlockContainer.


PS19, Line 1910: 
> Nit: lwb
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a88884bd
Gerrit-PatchSet: 20
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes

Reply via email to