Yingchun Lai has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18569 )

Change subject: KUDU-3371 [fs] Use RocksDB to store LBM metadata
......................................................................


Patch Set 71:

(71 comments)

http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@12
PS71, Line 12: long time bootstrap consumption
> nit: long bootstrap times
Done


http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@24
PS71, Line 24: different
> nit: difference
Done


http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@25
PS71, Line 25: is
> nit: is that
Done


http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@26
PS71, Line 26:  a
             : native
> nit: in a native
Done


http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@27
PS71, Line 27: ,
> nit: end the sentence here, it's long enough already
Done


http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@27
PS71, Line 27: the
> nit: The
Done


http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@28
PS71, Line 28: clase
> class
Done


http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@29
PS71, Line 29:  container
> nit: a container
Done


http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@35
PS71, Line 35:  container
> nit: a container
Done


http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@38
PS71, Line 38: load containers when bootstrap
> nit: loading containers during the bootstrap phase
Done


http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@39
PS71, Line 39:  container
> nit: a container
Done


http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@43
PS71, Line 43:  container
> nit: a container
Done


http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@56
PS71, Line 56: use
> nit: uses
Done


http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@57
PS71, Line 57: , it is specified by flag
             :   '--block_manager'.
> The new block manager is enabled by setting --block_manager=logr
Done


http://gerrit.cloudera.org:8080/#/c/18569/71//COMMIT_MSG@66
PS71, Line 66: it
             : shows that reopen staged reduced upto 90% time cost.
> nit: it shows that the time spent to re-open tablet server's metadata when
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/client/CMakeLists.txt
File src/kudu/client/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/client/CMakeLists.txt@284
PS71, Line 284: rocksdb
> This looks a bit odd: how does it happen that Kudu C++ client tests need ro
I guess it's because the tests depends mini_cluster, and mini_cluster depends 
rocksdb.


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/CMakeLists.txt
File src/kudu/fs/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/CMakeLists.txt@44
PS71, Line 44: kudu_test_util
> This looks wrong: a non-test libkudu_fs library is now dependent on libkudu
In https://gerrit.cloudera.org/c/18569/71/src/kudu/fs/dir_manager.cc#221, I use 
the function GetTestDataDirectory() in kudu_test_util (from 
src/kudu/util/test_util.h) to judge whether the proccess is in running in test 
environments.

I updated it to use NO_TESTS macro instead.


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/block_manager-test.cc
File src/kudu/fs/block_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/block_manager-test.cc@1205
PS71, Line 1205: larger
> nit: greater
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/block_manager-test.cc@1243
PS71, Line 1243: to
> into
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/block_manager-test.cc@1243
PS71, Line 1243: is not take effect on
> does not affect
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/block_manager-test.cc@1243
PS71, Line 1243: so the metadata updating of "logr"
               :   // block_manager works well
> updating the metadata stored in the logr-based block manager succeeds witho
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.h
File src/kudu/fs/dir_manager.h:

http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.h@121
PS71, Line 121: perpare
> preparatory
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.h@202
PS71, Line 202: public
> style nit: add a space before the 'public' label, similar to the other visi
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.h@220
PS71, Line 220: private
> ditto for the 'private' label
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc
File src/kudu/fs/dir_manager.cc:

http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc@207
PS71, Line 207: shared_ptr<rocksdb::Cache> RdbDir::s_block_cache_;
> Why do you need it here?  Isn't it enough to have it initialized using std:
Because it's a static member, it causes a link error if not define it here.

https://stackoverflow.com/questions/272900/undefined-reference-to-static-class-member


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc@229
PS71, Line 229: will be
> is
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc@229
PS71, Line 229: The
> A
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc@229
PS71, Line 229: is not
> does not
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc@233
PS71, Line 233: opts.error_if_exists
> Should at least this option be enabled to avoid overriding existing metadat
It's a good point.
We should distinguish creating new data directory and opening existing data 
directory, and set proper options to avoid mishaps.
I added a TODO comment to complete this work because it's a bit of complex, and 
may introduce much changes.


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc@281
PS71, Line 281: wait flush jobs to finish is enough
> it's enough to wait for the flush jobs to finish
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc@282
PS71, Line 282: may consume longer time which cause longer time to shut down 
server.
> nit: ... may take more time, which results in longer times to shut down a s
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_manager.cc@295
PS71, Line 295: const
> Either remove 'const' or add 'const' to the result pointer.  Otherwise, it'
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_util.cc
File src/kudu/fs/dir_util.cc:

http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/dir_util.cc@169
PS71, Line 169: dir_type_ == "log" || dir_type_ == "logr"
> This pattern is seen in multiple places.  Does it make sense to introduce a
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/file_block_manager.h
File src/kudu/fs/file_block_manager.h:

http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/file_block_manager.h@74
PS71, Line 74: kName
> This should have been 'name'; 'kSomething' is for variables, not method/fun
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/fs_manager-test.cc
File src/kudu/fs/fs_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/fs_manager-test.cc@224
PS71, Line 224: done
> nit: addressed
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/fs_report.h
File src/kudu/fs/fs_report.h:

http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/fs_report.h@189
PS71, Line 189:     std::string container;
              :     std::string rocksdb_key;;
> Could these be 'const'?
No, because std::move is used in constructor to init these members.


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/fs_report.h@190
PS71, Line 190: ;
> nit: remove the extra semi-colon
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test-util.cc
File src/kudu/fs/log_block_manager-test-util.cc:

http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test-util.cc@671
PS71, Line 671: slice_key
> nit: use rocksdb::Slice(key) in-place here as well?
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test-util.cc@678
PS71, Line 678:       DCHECK(dir);
> I'm not sure this makes sense -- it 'dir' was nullptr, it would crash one l
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test-util.cc@697
PS71, Line 697: Kinds
> nit: Types
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test-util.cc@700
PS71, Line 700:   auto r = rand_.Uniform(1);
              :   DCHECK_EQ(r, 0);
> What is this for?
Useless, remvoed.


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test-util.cc@734
PS71, Line 734: rocksdb::WriteOptions()
> nit: could this be replaced with {} ?
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test-util.cc@734
PS71, Line 734: slice_key
> nit: use rocksdb::Slice(key) here as well as for the value?
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test-util.cc@741
PS71, Line 741: slice_key
> nit: use rocksdb::Slice(key) in-place here?
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test.cc@1629
PS71, Line 1629:     CHECK_EQ(FLAGS_block_manager, "logr");
> nit: is this necessary given of the condition of the enclosing if() clause?
Not necessary, I removed it.


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test.cc@2612
PS71, Line 2612: CHECK_OK
> Is it possible to use ASSERT_OK() here as well?
No, because this lambda needs a return value. An error returns if using ASSERT*

 /.../src/kudu/fs/log_block_manager-test.cc: In lambda function:
 /.../src/kudu/util/test_macros.h:38:45: error: void value not ignored as it 
ought to be
     FAIL() << "Bad status: " << _s.ToString();  \
                                             ^
 /.../src/kudu/fs/log_block_manager-test.cc:2612:5: note: in expansion of macro 
‘ASSERT_OK’
     ASSERT_OK(s);
     ^~~~~~~~~


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager-test.cc@2773
PS71, Line 2773: // TODO(yingchun): need to clear up dead metadata
> Is this still relevant?  If yes, please add corresponding comment into the
Yes, it's relevant. I added comment in the latest PS in log_block_manager.cc.


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h
File src/kudu/fs/log_block_manager.h:

http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@546
PS71, Line 546: kName
> style nit: this should be 'name'; 'kSomething' is for variables
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@546
PS71, Line 546:   static const char* kName() { return "log"; }
> Could this be constexpr?
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@592
PS71, Line 592: to an RocksDB
> into a RocksDB
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@592
PS71, Line 592: The metadata part of a container
> All the container's metadata
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@594
PS71, Line 594: will be
> are
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@594
PS71, Line 594: in
> of
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@595
PS71, Line 595: are removed
> are being removed
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@596
PS71, Line 596: in RocksDB will be compacted
> ... in the RocksDB instance is compacted ...
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@595
PS71, Line 595: block
              : // manager
> the block manager
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@599
PS71, Line 599: Compare
> Comparing
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@601
PS71, Line 601: to scan all records
> to scan through all the records
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@601
PS71, Line 601: may
              : // cause lower performance
> may adversely affect the performance
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@603
PS71, Line 603: a lot of options
> many configuration options
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@604
PS71, Line 604:  flexibly.
> nit: drop this part
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.h@666
PS71, Line 666:   static std::string ConstructRocksDBKey(const std::string& 
container_id,
> Could this be 'constexpr' as well?
It seens it couldn't, because the parameters container_id and block_id are 
variables, don't match The requirement of "each of its parameters must be of a 
LiteralType" [1].

1. https://en.cppreference.com/w/cpp/language/constexpr


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.cc@123
PS71, Line 123: useful
> effective
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/fs/log_block_manager.cc@4080
PS71, Line 4080:   // The 'keys' is used to keep the lifetime of the data 
referenced by Slices in 'batch'.
               :   vector<string> keys;
> Why to have this 'keys' array when in the while() cycle below only the curr
It's not enough, we have to keep their lifetime before rdb->Write() the batch.


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/tools/kudu-tool-test.cc@2117
PS71, Line 2117: only logr block manager is not supported
> logr block manager is not yet supported
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/tools/kudu-tool-test.cc@4574
PS71, Line 4574:   if (FLAGS_block_manager == "logr") {
               :     // Exclude the RocksDB data size.
               :     uint64_t size_of_rdb;
               :     
ASSERT_OK(env_->GetFileSizeOnDiskRecursively(JoinPathSegments(data_dir, "rdb"), 
&size_of_rdb));
               :     size_before_delete -= size_of_rdb;
               :   }
> nit: this block and another one at lines 4615 look very similar.  Does it m
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/tserver/tablet_server-test.cc@761
PS71, Line 761: to reset 'dd_manager' to release
              :   // the last reference of dd_manager() to release the RocksDB 
LOCK file, otherwise
> ... to reset 'dd_manager', releasing the last 'dd_manager' reference.  That
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/tserver/tablet_server-test.cc@763
PS71, Line 763: require
> acquire
Done


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/util/CMakeLists.txt
File src/kudu/util/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/util/CMakeLists.txt@a453
PS71, Line 453:
> Why to remove this?  This doesn't look to be a rocksdb-related update.
Because I used a function in [1], now it has been updated, and this change is 
not needed now.


1. https://gerrit.cloudera.org/c/18569/71/src/kudu/fs/dir_manager.cc#221


http://gerrit.cloudera.org:8080/#/c/18569/71/src/kudu/util/CMakeLists.txt@658
PS71, Line 658:     rocksdb)
> Just curious: why lz4 isn't needed here, but is needed for jwt-util-test be
I'm not sure, maybe the order of some dependent libraries take effect?


http://gerrit.cloudera.org:8080/#/c/18569/71/thirdparty/build-definitions.sh
File thirdparty/build-definitions.sh:

http://gerrit.cloudera.org:8080/#/c/18569/71/thirdparty/build-definitions.sh@1192
PS71, Line 1192:     -DWITH_LZ4=ON
> With current settings, does this use LZ4 from Kudu's thirdparty or it someh
I added a new option "-Dlz4_ROOT_DIR=$PREFIX" for RocksDB to find lz4 according 
to [1].
I tried again to build Kudu on a clean node (without LZ4 library installed in 
system), everything built well, and I can see the LZ4 is found in Kudu's 
thirdparty path when build RocksDB.

 -- Found lz4: /root/kudu/thirdparty/installed/uninstrumented/lib/liblz4.so

or

 -- Found lz4: /root/kudu/thirdparty/installed/tsan/lib/liblz4.so

1. 
https://github.com/facebook/rocksdb/blob/v7.7.3/cmake/modules/Findlz4.cmake#L10



--
To view, visit http://gerrit.cloudera.org:8080/18569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie72f6914eb5653a9c034766c6cd3741a8340711f
Gerrit-Change-Number: 18569
Gerrit-PatchSet: 71
Gerrit-Owner: Yingchun Lai <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: KeDeng <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <[email protected]>
Gerrit-Reviewer: Yifan Zhang <[email protected]>
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Reviewer: Yuqi Du <[email protected]>
Gerrit-Comment-Date: Tue, 23 Jan 2024 17:05:22 +0000
Gerrit-HasComments: Yes

Reply via email to