Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14555 )
Change subject: KUDU-2977 sharding block map to speed up tserver startup ...................................................................... Patch Set 2: (7 comments) Thanks for the detailed perf testing. Could you summarize your findings in the commit description? You don't need to provide all of the numbers, but just indicate, given a particular choice X, which option seemed best from the data you collected. http://gerrit.cloudera.org:8080/#/c/14555/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14555/1//COMMIT_MSG@11 PS1, Line 11: > Yea, I ran the benchmark multiple times, pretty stable. Could you summarize this finding and add it to the commit message? http://gerrit.cloudera.org:8080/#/c/14555/2/src/kudu/fs/log_block_manager.h File src/kudu/fs/log_block_manager.h: http://gerrit.cloudera.org:8080/#/c/14555/2/src/kudu/fs/log_block_manager.h@409 PS2, Line 409: // Block IDs container used to prevent collisions when creating new anonymous blocks. Nit: the word "container" is already used elsewhere in the log block manager, so it's confusing to see it here too. Perhaps we can call these shards? So the struct could be called ManagedBlockShard, and managed_blocks_ could be managed_block_shards_? http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.h File src/kudu/fs/log_block_manager.h: http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.h@410 PS1, Line 410: struct ManagedBlocks { > unique_ptr is OK, Done. Why do we need the indirection provided by a unique_ptr? Couldn't we just declare the members directly in the struct? http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.cc@1861 PS1, Line 1861: static const uint64_t kBlockMapMask = kBlockMapChunk - 1; > I also ran StartupBenchmark on 2 servers with different kBlockMapChunk valu What was the significance of running on two servers? Different hardware? Also, could you summarize this and add it to the commit message? http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.cc@2288 PS1, Line 2288: Status s = RemoveLogBlock(block_id, &lb); > I did some benchmark for RemoveLogBlocks(in LogBlockManagerTest.StartupBenc Summarize and add to commit message? http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.cc@2612 PS1, Line 2612: << " which already is alive from another container when " > If we lock once before per ith shard's sub-map insertion, concurrency will Same. http://gerrit.cloudera.org:8080/#/c/14555/2/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/14555/2/src/kudu/fs/log_block_manager.cc@1916 PS2, Line 1916: LogBlockManager::~LogBlockManager() { When doing something on each shard, could you iterate on the number of elements in managed_blocks_? Something like: for (const auto& mb : managed_blocks_) { ... } -- To view, visit http://gerrit.cloudera.org:8080/14555 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0d5c13e051a2c1d6cfd1c9ad7db8a3cd195459d Gerrit-Change-Number: 14555 Gerrit-PatchSet: 2 Gerrit-Owner: Yingchun Lai <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yingchun Lai <[email protected]> Gerrit-Comment-Date: Fri, 01 Nov 2019 00:07:19 +0000 Gerrit-HasComments: Yes
