[ https://issues.apache.org/jira/browse/HBASE-27699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vineet Kumar Maheshwari reassigned HBASE-27699:
-----------------------------------------------
Assignee: Vineet Kumar Maheshwari
> User metrics for filtered and read rows are too expensive
> ---------------------------------------------------------
>
> Key: HBASE-27699
> URL: https://issues.apache.org/jira/browse/HBASE-27699
> Project: HBase
> Issue Type: Improvement
> Reporter: Bryan Beaudreault
> Assignee: Vineet Kumar Maheshwari
> Priority: Major
>
> MetricsUserAggregateImpl has a pattern like this:
> {code:java}
> String user = getActiveUser();
> if (user != null) {
>   MetricsUserSource userSource = getOrCreateMetricsUser(user);
>   incrementFilteredReadRequests(userSource);
> }
> {code}
> So every update involves a getOrCreate call, which does a ConcurrentHashMap
> lookup. This overhead is not too bad for most requests, because it is
> executed just once per request (i.e. updatePut gets called once at the end,
> though for multis it happens for every action).
> For updateFilteredReadRequests and updateReadRequestCount, these are
> currently called in RegionScannerImpl for every row scanned or filtered.
> Doing the map lookup over and over adds up. Profiling the regionserver under
> load, I see over 5% of the time spent updating these metrics.
> We should try to collect these metrics, maybe in the RpcCallContext, and then
> translate them into user metrics once at the end of the request. Or otherwise
> find a way to avoid querying the ConcurrentHashMap multiple times in the
> context of a request. Maybe we should actually stash the MetricsUserSource in
> the RpcCallContext so that all user metrics only need to do the lookup once,
> even for multis.
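
A minimal sketch of the first idea in the description: accumulate row counts on a per-request object and translate them into user metrics once at the end. All class and method names here (UserSourceSketch, UserRegistrySketch, CallContextSketch, onRowRead, onRowFiltered, flush) are hypothetical stand-ins, not the HBase API, and the real RpcCallContext has no such fields. The sketch also assumes a request is handled by a single thread, so the per-request counters are plain longs.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for MetricsUserSource: just the two counters discussed.
class UserSourceSketch {
    final LongAdder readRequests = new LongAdder();
    final LongAdder filteredReadRequests = new LongAdder();
}

// Hypothetical stand-in for the per-user registry kept by the aggregate.
class UserRegistrySketch {
    private final ConcurrentHashMap<String, UserSourceSketch> sources =
        new ConcurrentHashMap<>();
    // Counts map lookups; present only to make the saving visible.
    final AtomicInteger lookups = new AtomicInteger();

    UserSourceSketch getOrCreate(String user) {
        lookups.incrementAndGet();
        return sources.computeIfAbsent(user, u -> new UserSourceSketch());
    }
}

// Hypothetical per-request context (assumption: one handler thread per request).
class CallContextSketch {
    private final String user; // may be null when there is no active user
    private long readRows;
    private long filteredRows;

    CallContextSketch(String user) {
        this.user = user;
    }

    // Called per scanned row: a plain field increment, no map access.
    void onRowRead() {
        readRows++;
    }

    // Called per filtered row: likewise no map access.
    void onRowFiltered() {
        filteredRows++;
    }

    // Called once at the end of the request: a single map lookup for the user.
    void flush(UserRegistrySketch registry) {
        if (user == null) {
            return;
        }
        UserSourceSketch source = registry.getOrCreate(user);
        source.readRequests.add(readRows);
        source.filteredReadRequests.add(filteredRows);
        readRows = 0;
        filteredRows = 0;
    }
}
```

In this sketch a scan over N rows costs N field increments plus one map lookup in flush, instead of N map lookups. The second idea, stashing the resolved source on the context, would look similar, with the context holding a UserSourceSketch reference obtained on first use.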
--
This message was sent by Atlassian Jira
(v8.20.10#820010)