[
https://issues.apache.org/jira/browse/KUDU-2707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17736308#comment-17736308
]
Alexey Serbin edited comment on KUDU-2707 at 6/29/23 4:27 PM:
--------------------------------------------------------------
The stack trace mentioned points at the per-shard lock that guards both the
shard's lookup table and the updates of its LRU-related stats. As a first
improvement step, we might try to:
* perform the lookup and the update of the LRU-related stats in separate
critical sections
* use a concurrent dictionary instead of guarding the non-concurrent container
with the lock.
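The first bullet could look roughly like the following stdlib-only sketch. This is a hypothetical illustration, not Kudu's actual {{ShardedLRUCache}} code: the lock is held only while probing the table, and the LRU-related recency stamp is updated afterwards on an atomic, outside the critical section.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical sketch: split the single critical section that covers
// both the lookup-table probe and the LRU bookkeeping into two
// independent steps.
struct Entry {
  std::string value;
  std::atomic<uint64_t> last_used{0};  // recency stamp for LRU stats
};

class ShardSketch {
 public:
  void Insert(const std::string& key, const std::string& value) {
    std::lock_guard<std::mutex> l(map_mu_);
    map_[key].value = value;
  }

  // Hold map_mu_ only while probing the table; update the LRU-related
  // recency stamp outside that critical section.
  bool Lookup(const std::string& key, std::string* out) {
    Entry* e = nullptr;
    {
      std::lock_guard<std::mutex> l(map_mu_);
      auto it = map_.find(key);
      if (it == map_.end()) return false;
      e = &it->second;
      *out = e->value;
    }
    // Pointers into std::unordered_map stay valid across rehashes, but a
    // real cache would also need to pin the entry (e.g. a ref count) so
    // that concurrent eviction cannot free it under us.
    e->last_used.store(clock_.fetch_add(1, std::memory_order_relaxed) + 1,
                       std::memory_order_relaxed);
    return true;
  }

 private:
  std::mutex map_mu_;
  std::unordered_map<std::string, Entry> map_;
  std::atomic<uint64_t> clock_{0};
};
```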
There are a few implementations of concurrent dictionaries in C++; at least the
following are available:
* concurrent_hash_map and concurrent_unordered_map of [Intel's
TBB|https://github.com/oneapi-src/oneTBB] (Apache 2.0 license)
* concurrent hash maps from the [Junction
project|https://github.com/preshing/junction] (BSD license), see [this
article|https://preshing.com/20160201/new-concurrent-hash-maps-for-cpp/] for
more details
Either might be used as a third-party component in Kudu, at least based on its
license type.
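To illustrate the second bullet without pulling in TBB or Junction, here is a stdlib-only lock-striped map as a crude stand-in for a true concurrent dictionary (hypothetical names; Kudu's cache is already sharded, and striping just takes the same idea one level further: keys that hash to different stripes no longer contend on one shard-wide lock).

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <mutex>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical sketch: a lock-striped map approximating a concurrent
// dictionary such as TBB's concurrent_hash_map. Each stripe owns its
// own mutex and its own unordered_map.
template <std::size_t kStripes = 16>
class StripedMap {
 public:
  void Put(const std::string& key, const std::string& value) {
    Stripe& s = StripeFor(key);
    std::lock_guard<std::mutex> l(s.mu);
    s.map[key] = value;
  }

  std::optional<std::string> Get(const std::string& key) {
    Stripe& s = StripeFor(key);
    std::lock_guard<std::mutex> l(s.mu);
    auto it = s.map.find(key);
    if (it == s.map.end()) return std::nullopt;
    return it->second;
  }

 private:
  struct Stripe {
    std::mutex mu;
    std::unordered_map<std::string, std::string> map;
  };

  Stripe& StripeFor(const std::string& key) {
    // Route each key to a stripe by its hash.
    return stripes_[std::hash<std::string>{}(key) % kStripes];
  }

  std::array<Stripe, kStripes> stripes_;
};
```

A real concurrent hash map goes further (e.g. fine-grained per-bucket locking or lock-free probing), but even simple striping removes the single point of contention that the stack trace above shows.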
> Improve the performance of the block cache under contention
> -----------------------------------------------------------
>
> Key: KUDU-2707
> URL: https://issues.apache.org/jira/browse/KUDU-2707
> Project: Kudu
> Issue Type: Improvement
> Affects Versions: 1.10.0
> Reporter: William Berkeley
> Priority: Major
> Fix For: NA
>
>
> While looking at a random write workload where flushes outpace compactions
> (i.e. the typical case when inserting as fast as possible), there are
> occasional consensus service queue overflows. Analyzing the stacks of the
> service threads when this occurs (using the diagnostics log), I see many
> stacks like
> {noformat}
> 0x3b6720f710 <unknown>
> 0x1fb900a base::internal::SpinLockDelay()
> 0x1fb8ea7 base::SpinLock::SlowLock()
> 0x1ef7394 kudu::(anonymous namespace)::ShardedLRUCache::Lookup()
> 0x1ce379f kudu::cfile::BlockCache::Lookup()
> 0x1cec948 kudu::cfile::CFileReader::ReadBlock()
> 0x1ce5d36 kudu::cfile::BloomFileReader::CheckKeyPresent()
> 0xb311a1 kudu::tablet::CFileSet::CheckRowPresent()
> 0xac46c4 kudu::tablet::DiskRowSet::CheckRowPresent()
> 0xa6b017
> _ZZN4kudu6tablet6Tablet17BulkCheckPresenceEPKNS_2fs9IOContextEPNS0_21WriteTransactionStateEENKUlvE1_clEv
> 0xa7427e
> _ZNSt17_Function_handlerIFvPN4kudu6tablet6RowSetEiEZNS1_6Tablet17BulkCheckPresenceEPKNS0_2fs9IOContextEPNS1_21WriteTransactionStateEEUlS3_iE2_E9_M_invokeERKSt9_Any_dataS3_i
> 0xaee074
> _ZNK4kudu22interval_tree_internal6ITNodeINS_6tablet20RowSetIntervalTraitsEE31ForEachIntervalContainingPointsIZNKS2_10RowSetTree27ForEachRowSetContainingKeysERKSt6vectorINS_5SliceESaIS8_EERKSt8functionIFvPNS2_6RowSetEiEEEUlRKNS2_12_GLOBAL__N_111QueryStructEPNS2_16RowSetWithBoundsEE_N9__gnu_cxx17__normal_iteratorIPSM_S7_ISL_SaISL_EEEEEEvT0_SX_RKT_
> 0xaee1b3
> _ZNK4kudu22interval_tree_internal6ITNodeINS_6tablet20RowSetIntervalTraitsEE31ForEachIntervalContainingPointsIZNKS2_10RowSetTree27ForEachRowSetContainingKeysERKSt6vectorINS_5SliceESaIS8_EERKSt8functionIFvPNS2_6RowSetEiEEEUlRKNS2_12_GLOBAL__N_111QueryStructEPNS2_16RowSetWithBoundsEE_N9__gnu_cxx17__normal_iteratorIPSM_S7_ISL_SaISL_EEEEEEvT0_SX_RKT_
> 0xaee3a3 kudu::tablet::RowSetTree::ForEachRowSetContainingKeys()
> 0xa80c17 kudu::tablet::Tablet::BulkCheckPresence()
> 0xa8108a kudu::tablet::Tablet::ApplyRowOperations()
> {noformat}
> Note that the slow step in writes for these workloads is generally CPU usage
> in the apply phase, once they have been running for a while.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)