helifu has posted comments on this change. (
http://gerrit.cloudera.org:8080/12121 )
Change subject: [fs]: wrapping up containers in scoped_refptr
......................................................................
Patch Set 6:
I tested this on a tserver node from our online cluster that has been running
for months. It has more than 38k containers, of which almost 20k are dead.
Test steps:
1) rebooted the node to repair the containers;
2) rebooted the node with the non-scoped_refptr version;
3) rebooted the node with the scoped_refptr version.
Step 2's result:
I0109 13:37:40.599534 159639 fs_manager.cc:419] Time spent opening block
manager: real 58.437s user 0.000s sys 0.000s
I0109 13:37:40.599802 159639 fs_manager.cc:436] Opened local filesystem:
/mnt/dfs/0/kudu/tserver/data,/mnt/dfs/1/kudu/tserver/data,/mnt/dfs/2/kudu/tserver/data,/mnt/dfs/3/kudu/tserver/data,/mnt/dfs/4/kudu/tserver/data,/mnt/dfs/5/kudu/tserver/data,/mnt/dfs/6/kudu/tserver/data,/mnt/dfs/7/kudu/tserver/data,/mnt/dfs/8/kudu/tserver/data,/mnt/dfs/9/kudu/tserver/data,/mnt/dfs/10/kudu/tserver/data,/mnt/dfs/ssd0/kudu/tserver/wal
uuid: "89c5dd62ad734d54b7d25bc2d52263d3"
format_stamp: "Formatted at 2018-03-12 07:20:41 on kudu14.lt.163.org"
I0109 13:37:40.599843 159639 fs_report.cc:352] FS layout report
--------------------
wal directory: /mnt/dfs/ssd0/kudu/tserver/wal
metadata directory: /mnt/dfs/0/kudu/tserver/data
11 data directories: /mnt/dfs/0/kudu/tserver/data/data,
/mnt/dfs/1/kudu/tserver/data/data, /mnt/dfs/2/kudu/tserver/data/data,
/mnt/dfs/3/kudu/tserver/data/data, /mnt/dfs/4/kudu/tserver/data/data,
/mnt/dfs/5/kudu/tserver/data/data, /mnt/dfs/6/kudu/tserver/data/data,
/mnt/dfs/7/kudu/tserver/data/data, /mnt/dfs/8/kudu/tserver/data/data,
/mnt/dfs/9/kudu/tserver/data/data, /mnt/dfs/10/kudu/tserver/data/data
Total live blocks: 23590554
Total live bytes: 1449771979998
Total live bytes (after alignment): 1529197359104
Total number of LBM containers: 206682 (147 full)
Did not check for missing blocks
Did not check for orphaned blocks
Total full LBM containers with extra space: 0 (0 repaired)
Total full LBM container extra space in bytes: 0 (0 repaired)
Total incomplete LBM containers: 0 (0 repaired)
Total LBM partial records: 0 (0 repaired)
I0109 13:37:40.625555 159639 env_posix.cc:1676] Not raising this process'
running threads per effective uid limit of 1033414; it is already as high as it
can go
I0109 13:37:40.633038 159639 ts_tablet_manager.cc:344] Loading tablet metadata
(0/1989 complete)
Step 3's result:
I0109 13:43:06.445422 160217 fs_manager.cc:419] Time spent opening block
manager: real 59.316s user 0.000s sys 0.004s
I0109 13:43:06.445688 160217 fs_manager.cc:436] Opened local filesystem:
/mnt/dfs/0/kudu/tserver/data,/mnt/dfs/1/kudu/tserver/data,/mnt/dfs/2/kudu/tserver/data,/mnt/dfs/3/kudu/tserver/data,/mnt/dfs/4/kudu/tserver/data,/mnt/dfs/5/kudu/tserver/data,/mnt/dfs/6/kudu/tserver/data,/mnt/dfs/7/kudu/tserver/data,/mnt/dfs/8/kudu/tserver/data,/mnt/dfs/9/kudu/tserver/data,/mnt/dfs/10/kudu/tserver/data,/mnt/dfs/ssd0/kudu/tserver/wal
uuid: "89c5dd62ad734d54b7d25bc2d52263d3"
format_stamp: "Formatted at 2018-03-12 07:20:41 on kudu14.lt.163.org"
I0109 13:43:06.445729 160217 fs_report.cc:352] FS layout report
--------------------
wal directory: /mnt/dfs/ssd0/kudu/tserver/wal
metadata directory: /mnt/dfs/0/kudu/tserver/data
11 data directories: /mnt/dfs/0/kudu/tserver/data/data,
/mnt/dfs/1/kudu/tserver/data/data, /mnt/dfs/2/kudu/tserver/data/data,
/mnt/dfs/3/kudu/tserver/data/data, /mnt/dfs/4/kudu/tserver/data/data,
/mnt/dfs/5/kudu/tserver/data/data, /mnt/dfs/6/kudu/tserver/data/data,
/mnt/dfs/7/kudu/tserver/data/data, /mnt/dfs/8/kudu/tserver/data/data,
/mnt/dfs/9/kudu/tserver/data/data, /mnt/dfs/10/kudu/tserver/data/data
Total live blocks: 23590554
Total live bytes: 1449771979998
Total live bytes (after alignment): 1529197359104
Total number of LBM containers: 206682 (147 full)
Did not check for missing blocks
Did not check for orphaned blocks
Total full LBM containers with extra space: 0 (0 repaired)
Total full LBM container extra space in bytes: 0 (0 repaired)
Total incomplete LBM containers: 0 (0 repaired)
Total LBM partial records: 0 (0 repaired)
I0109 13:43:06.471539 160217 env_posix.cc:1676] Not raising this process'
running threads per effective uid limit of 1033414; it is already as high as it
can go
I0109 13:43:06.479038 160217 ts_tablet_manager.cc:344] Loading tablet metadata
(0/1989 complete)
------------------------------------------------------------------
So the performance impact is limited: opening the block manager took 58.437s
in step 2 and 59.316s in step 3.
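For reference, the relative difference between the two "Time spent opening block manager" timings can be worked out with a small sketch like the one below. The function name RelativeSlowdown is only illustrative and is not part of Kudu.

```cpp
// Relative slowdown of a candidate run versus a baseline run, as a fraction
// of the baseline. A positive result means the candidate was slower.
// (Illustrative helper; not a Kudu function.)
double RelativeSlowdown(double baseline_secs, double candidate_secs) {
  return (candidate_secs - baseline_secs) / baseline_secs;
}
```

Calling RelativeSlowdown(58.437, 59.316) with the step 2 and step 3 "real" timings gives roughly 0.015, i.e. about 1.5% on a single run of each version.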
Below is part of the perf report from step 3 (long symbol names are truncated
at the terminal width):
1.99%  data dir 9 [wor  kudu-tserver.2     [.] kudu::fs::LogBlockManager::AddLogBlockUnlocked(scoped_refptr<kudu::fs::internal::LogBlock>)
0.59%  data dir 9 [wor  [kernel.kallsyms]  [k] copy_user_enhanced_fast_string
0.34%  data dir 9 [wor  kudu-tserver.2     [.] operator new[](unsigned long)
0.27%  data dir 9 [wor  kudu-tserver.2     [.] kudu::fs::internal::LogBlockContainer::ProcessRecord(kudu::BlockRecordPB*, kudu::fs::FsReport*, std::unordered_map<kudu::BlockId const, sc
0.26%  data dir 9 [wor  [kernel.kallsyms]  [k] copy_page_to_iter
0.20%  data dir 9 [wor  [kernel.kallsyms]  [k] generic_file_read_iter
0.19%  data dir 9 [wor  kudu-tserver.2     [.] std::pair<kudu::BlockId const, scoped_refptr<kudu::fs::internal::LogBlock> >& spp::sparse_hashtable<std::pair<kudu::BlockId const, scoped_
0.19%  data dir 9 [wor  kudu-tserver.2     [.] operator delete[](void*, std::nothrow_t const&)
0.17%  data dir 9 [wor  kudu-tserver.2     [.] kudu::Status kudu::pb_util::(anonymous namespace)::ReadPBStartingAt<kudu::RandomAccessFile>(kudu::RandomAccessFile*, int, boost::optional<
0.16%  data dir 9 [wor  [kernel.kallsyms]  [k] do_readv_writev
0.16%  data dir 9 [wor  [kernel.kallsyms]  [k] fsnotify
0.14%  data dir 9 [wor  libc-2.19.so       [.] preadv64
0.14%  data dir 9 [wor  kudu-tserver.2     [.] bool InsertIfNotPresent<std::unordered_map<kudu::BlockId const, scoped_refptr<kudu::fs::internal::LogBlock>, kudu::BlockIdHash, kudu::Bloc
0.14%  data dir 9 [wor  kudu-tserver.2     [.] kudu::BlockRecordPB::MergePartialFromCodedStream(google::protobuf::io::CodedInputStream*)
0.14%  data dir 9 [wor  kudu-tserver.2     [.] kudu::subtle::RefCountedThreadSafeBase::Release() const
0.13%  data dir 9 [wor  kudu-tserver.2     [.] std::unordered_map<kudu::BlockId const, scoped_refptr<kudu::fs::internal::LogBlock>, kudu::BlockIdHash, kudu::BlockIdEqual, std::allocator
0.12%  data dir 9 [wor  kudu-tserver.2     [.] kudu::fs::internal::LogBlockContainer::BlockCreated(scoped_refptr<kudu::fs::internal::LogBlock> const&)
0.12%  data dir 9 [wor  [kernel.kallsyms]  [k] do_iter_readv_writev
0.12%  data dir 9 [wor  [kernel.kallsyms]  [k] system_call
0.12%  data dir 9 [wor  [kernel.kallsyms]  [k] __fget
0.12%  data dir 9 [wor  kudu-tserver.2     [.] void std::__rotate<std::pair<kudu::BlockId, scoped_refptr<kudu::fs::internal::LogBlock> >*>(std::pair<kudu::BlockId, scoped_refptr<kudu::f
0.12%  data dir 9 [wor  kudu-tserver.2     [.] kudu::subtle::RefCountedThreadSafeBase::AddRef() const
0.10%  data dir 9 [wor  kudu-tserver.2     [.] kudu::BlockRecordPB::~BlockRecordPB()
0.10%  data dir 9 [wor  [kernel.kallsyms]  [k] find_get_entry
0.09%  data dir 9 [wor  kudu-tserver.2     [.] kudu::(anonymous namespace)::DoReadV(int, std::string const&, unsigned long, kudu::ArrayView<kudu::Slice>) [clone .constprop.169]
0.09%  data dir 9 [wor  kudu-tserver.2     [.] tcmalloc::CentralFreeList::FetchFromOneSpans(int, void**, void**)
0.09%  data dir 9 [wor  kudu-tserver.2     [.] std::_Hashtable<kudu::BlockId const, std::pair<kudu::BlockId const, kudu::BlockRecordPB>, std::allocator<std::pair<kudu::BlockId const, ku
0.09%  data dir 9 [wor  kudu-tserver.2     [.] crcutil::Crc32cSSE4::Crc32c(void const*, unsigned long, unsigned long) const
0.09%  data dir 9 [wor  [kernel.kallsyms]  [k] __radix_tree_lookup
0.09%  data dir 9 [wor  kudu-tserver.2     [.] google::protobuf::io::CodedInputStream::ReadVarint64Fallback()
0.08%  data dir 9 [wor  [kernel.kallsyms]  [k] rw_copy_check_uvector
0.08%  data dir 9 [wor  [kernel.kallsyms]  [k] system_call_after_swapgs
0.08%  data dir 9 [wor  kudu-tserver.2     [.] kudu::Status kudu::pb_util::(anonymous namespace)::ValidateAndReadData<kudu::RandomAccessFile>(kudu::RandomAccessFile*, unsigned long, uns
0.08%  data dir 9 [wor  kudu-tserver.2     [.] google::protobuf::io::CodedInputStream::Refresh()
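
The AddRef() and Release() entries in the profile are the reference-count updates that happen whenever a scoped_refptr is copied or destroyed. As a rough illustration of that pattern only, here is a minimal sketch of an intrusive, atomically ref-counted pointer. RefCounted, RefPtr and FakeContainer are hypothetical stand-ins written for this comment and are not Kudu's actual classes.

```cpp
#include <atomic>
#include <utility>

// Hypothetical stand-in for an intrusive, thread-safe ref-counted base class.
class RefCounted {
 public:
  void AddRef() const { count_.fetch_add(1, std::memory_order_relaxed); }

  // Returns true if this call dropped the last reference.
  bool Release() const {
    return count_.fetch_sub(1, std::memory_order_acq_rel) == 1;
  }

  int RefCount() const { return count_.load(std::memory_order_acquire); }

 protected:
  RefCounted() = default;
  virtual ~RefCounted() = default;

 private:
  mutable std::atomic<int> count_{0};
};

// Hypothetical smart pointer: one AddRef() per copy, one Release() per
// destruction; the object is deleted when the last reference goes away.
template <typename T>
class RefPtr {
 public:
  RefPtr() = default;
  explicit RefPtr(T* p) : ptr_(p) { if (ptr_) ptr_->AddRef(); }
  RefPtr(const RefPtr& o) : ptr_(o.ptr_) { if (ptr_) ptr_->AddRef(); }
  RefPtr(RefPtr&& o) noexcept : ptr_(o.ptr_) { o.ptr_ = nullptr; }
  // Copy-and-swap: the by-value parameter releases the old pointee on exit.
  RefPtr& operator=(RefPtr o) noexcept {
    std::swap(ptr_, o.ptr_);
    return *this;
  }
  ~RefPtr() { if (ptr_ && ptr_->Release()) delete ptr_; }

  T* get() const { return ptr_; }
  T* operator->() const { return ptr_; }

 private:
  T* ptr_ = nullptr;
};

// Hypothetical container type used only to exercise RefPtr.
struct FakeContainer : RefCounted {
  explicit FakeContainer(int id) : id(id) {}
  int id;
};
```

In this sketch each copy costs one atomic increment and each destruction one atomic decrement, while a move transfers ownership without touching the count.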
--
To view, visit http://gerrit.cloudera.org:8080/12121
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3c5c620014782b3d57dcbe047d0df73c949590c7
Gerrit-Change-Number: 12121
Gerrit-PatchSet: 6
Gerrit-Owner: helifu <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: helifu <[email protected]>
Gerrit-Comment-Date: Wed, 09 Jan 2019 06:14:48 +0000
Gerrit-HasComments: No