Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14555 )

Change subject: KUDU-2977 sharding block map to speed up tserver startup
......................................................................


Patch Set 1:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/14555/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14555/1//COMMIT_MSG@11
PS1, Line 11:
Have you run the benchmark multiple times? Are the numbers stable? Could you 
see how they vary when the number of blocks is changed?


http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager-test.cc@1001
PS1, Line 1001: // - only one data directory (typical servers have several)
Remove this.


http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager-test.cc@1017
PS1, Line 1017: true
Nit: prepend with /* force= */ so it's more clear what this argument means.


http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.h
File src/kudu/fs/log_block_manager.h:

http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.h@311
PS1, Line 311:   bool AddLogBlock(LogBlockRefPtr lb);
By renaming this it becomes an overload of the existing AddLogBlock, which is 
confusing. Could you rename it (or the other function) so the difference 
between them is clearer.


http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.h@408
PS1, Line 408:
If the size of each of these three vectors is the same, I'd replace them with a 
single vector holding a struct representing a "shard".


http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.h@410
PS1, Line 410:   std::vector<std::shared_ptr<simple_spinlock>> blocks_locks_;
Why do the entries here and in blocks_by_block_id_arr_ need shared ownership?


http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.cc@1861
PS1, Line 1861: static const uint64_t kBlockMapChunk = 1 << 3;
Would be interesting to see how performance (and overhead) changes as this 
value changes. What about 2, 4, or 16 shards?


http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.cc@2288
PS1, Line 2288:   for (const auto& block_id : block_ids) {
If block_ids is large (i.e. we're deleting a large tablet), this will result in 
a lot of lock activity. Would it be better to acquire all shards' locks up 
front?


http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.cc@2609
PS1, Line 2609:     LOG_TIMING(INFO, "AddLogBlock...") {
This isn't super helpful as what we really care about is total startup time, 
not so much the time spent starting one data directory, right?


http://gerrit.cloudera.org:8080/#/c/14555/1/src/kudu/fs/log_block_manager.cc@2612
PS1, Line 2612:         if (!AddLogBlock(std::move(e.second))) {
Likewise this can result in a lot of lock activity due to the number of entries 
in live_blocks.



--
To view, visit http://gerrit.cloudera.org:8080/14555
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0d5c13e051a2c1d6cfd1c9ad7db8a3cd195459d
Gerrit-Change-Number: 14555
Gerrit-PatchSet: 1
Gerrit-Owner: Yingchun Lai <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 28 Oct 2019 04:33:23 +0000
Gerrit-HasComments: Yes

Reply via email to