[ 
https://issues.apache.org/jira/browse/HBASE-27699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Kumar Maheshwari reassigned HBASE-27699:
-----------------------------------------------

    Assignee: Vineet Kumar Maheshwari

> User metrics for filtered and read rows are too expensive
> ---------------------------------------------------------
>
>                 Key: HBASE-27699
>                 URL: https://issues.apache.org/jira/browse/HBASE-27699
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Bryan Beaudreault
>            Assignee: Vineet Kumar Maheshwari
>            Priority: Major
>
> MetricsUserAggregateImpl has a pattern like this:
> {code:java}
> String user = getActiveUser();
> if (user != null) {
>   MetricsUserSource userSource = getOrCreateMetricsUser(user);
>   incrementFilteredReadRequests(userSource);
> } {code}
> So every update involves a getOrCreate call, which does a ConcurrentHashMap
> lookup. This overhead is not too bad for most requests, because it's just
> executed once per request (i.e. updatePut gets called once at the end, though
> for multis it happens for every action).
> For updateFilteredReadRequests and updateReadRequestCount, these are 
> currently called in RegionScannerImpl for every row scanned or filtered.  
> Doing the map lookup over and over adds up. Profiling the regionserver under 
> load, I see over 5% of the time spent updating these metrics.
> We should try to collect these metrics, perhaps in the RpcCallContext, and
> then translate them into user metrics once at the end of the request. Or
> otherwise find a way to minimize querying the ConcurrentHashMap multiple
> times in the context of a request. Maybe we should actually stash the
> MetricsUserSource in the RpcCallContext so that all user metrics only need to
> do the lookup once, even for multis.
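
A minimal sketch of the idea in the description above, not the actual HBase code: counters are accumulated in a request-scoped object and flushed to the per-user source with a single map lookup at the end of the request. Every name here (UserSource, Registry, RequestContext, and the two scan helpers) is a hypothetical stand-in; none of them is the real MetricsUserSource, MetricsUserAggregateImpl, or RpcCallContext API. It also assumes one thread handles a given request, so the local counters need no synchronization.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

public class UserMetricsSketch {

  /** Hypothetical stand-in for a per-user metrics source. */
  static final class UserSource {
    final LongAdder readRequests = new LongAdder();
    final LongAdder filteredReadRequests = new LongAdder();
  }

  /** Hypothetical registry; counts how often the map is consulted. */
  static final class Registry {
    private final ConcurrentHashMap<String, UserSource> users = new ConcurrentHashMap<>();
    final AtomicLong lookups = new AtomicLong();

    UserSource getOrCreate(String user) {
      lookups.incrementAndGet();
      return users.computeIfAbsent(user, u -> new UserSource());
    }
  }

  /** Hypothetical request-scoped context: accumulate locally, flush once. */
  static final class RequestContext {
    private final String user;
    private long readRows;
    private long filteredRows;

    RequestContext(String user) {
      this.user = user;
    }

    void rowRead() {
      readRows++;
    }

    void rowFiltered() {
      filteredRows++;
    }

    /** One map lookup per request, regardless of how many rows were seen. */
    void flush(Registry registry) {
      if (user == null) {
        return;
      }
      UserSource source = registry.getOrCreate(user);
      source.readRequests.add(readRows);
      source.filteredReadRequests.add(filteredRows);
      readRows = 0;
      filteredRows = 0;
    }
  }

  /** Pattern from the issue: a map lookup for every scanned row. */
  static long scanWithPerRowLookup(Registry registry, String user, int rows) {
    long before = registry.lookups.get();
    for (int i = 0; i < rows; i++) {
      registry.getOrCreate(user).readRequests.increment();
    }
    return registry.lookups.get() - before;
  }

  /** Proposed pattern: count rows locally, then flush at the end. */
  static long scanWithRequestContext(Registry registry, String user, int rows) {
    long before = registry.lookups.get();
    RequestContext ctx = new RequestContext(user);
    for (int i = 0; i < rows; i++) {
      ctx.rowRead();
    }
    ctx.flush(registry);
    return registry.lookups.get() - before;
  }

  public static void main(String[] args) {
    Registry registry = new Registry();
    System.out.println("per-row lookups: " + scanWithPerRowLookup(registry, "alice", 1000));
    System.out.println("per-request lookups: " + scanWithRequestContext(registry, "alice", 1000));
  }
}
```

Both helpers leave the same totals in the per-user source; only the number of map lookups differs. The alternative mentioned in the description, caching the resolved source itself in the request context, would look similar, with the lookup done on first use instead of at flush time.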



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
