Bryan Beaudreault created HBASE-27699:
-----------------------------------------

             Summary: User metrics for filtered and read rows are too expensive
                 Key: HBASE-27699
                 URL: https://issues.apache.org/jira/browse/HBASE-27699
             Project: HBase
          Issue Type: Improvement
            Reporter: Bryan Beaudreault


MetricsUserAggregateImpl has a pattern like this:
{code:java}
String user = getActiveUser();
if (user != null) {
  MetricsUserSource userSource = getOrCreateMetricsUser(user);
  incrementFilteredReadRequests(userSource);
}
{code}
So every update involves a getOrCreate call, which does a ConcurrentHashMap 
lookup. This overhead is not too bad for most requests, because it's just 
executed once per request (e.g. updatePut gets called once at the end, though 
for multis it happens for every action).

updateFilteredReadRequests and updateReadRequestCount, however, are currently 
called in RegionScannerImpl for every row scanned or filtered. Doing the map 
lookup over and over adds up. Profiling the regionserver under load, I see over 
5% of the time spent updating these metrics.

We should try to collect these metrics in the RpcCallContext, perhaps, and 
then translate them into user metrics once at the end of the request. Or 
otherwise find a way to avoid querying the ConcurrentHashMap multiple times in 
the context of a request. Maybe we should actually stash the MetricsUserSource 
in the RpcCallContext so that all user metrics only need to do the lookup once, 
even for multis.
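
To make the first idea concrete, here is a rough, self-contained sketch of 
accumulating per-row counts on the request and flushing them with a single map 
lookup at the end. It is not HBase code: UserCounters, UserRegistry and 
RequestMetrics are hypothetical stand-ins for MetricsUserSource, the per-user 
map in MetricsUserAggregateImpl, and whatever state would hang off the 
RpcCallContext. It also assumes a request's rows are counted on one thread, so 
the per-request fields are plain longs.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.LongAdder;

public class PerRequestUserMetricsSketch {

  /** Hypothetical stand-in for MetricsUserSource: counters for one user. */
  static final class UserCounters {
    final LongAdder readRequests = new LongAdder();
    final LongAdder filteredReadRequests = new LongAdder();
  }

  /** Hypothetical stand-in for the per-user map; counts lookups for illustration only. */
  static final class UserRegistry {
    private final ConcurrentHashMap<String, UserCounters> users = new ConcurrentHashMap<>();
    final AtomicInteger lookups = new AtomicInteger();

    UserCounters getOrCreate(String user) {
      lookups.incrementAndGet();
      return users.computeIfAbsent(user, u -> new UserCounters());
    }
  }

  /** Hypothetical per-request state (assumed to be touched by a single thread). */
  static final class RequestMetrics {
    private long readRows;
    private long filteredRows;

    // Called per row: plain field increments, no map lookup.
    void rowRead() { readRows++; }
    void rowFiltered() { filteredRows++; }

    // Called once when the request finishes: one lookup, then bulk adds.
    void flush(UserRegistry registry, String user) {
      if (user == null) {
        return; // mirrors the null check in the existing pattern
      }
      UserCounters counters = registry.getOrCreate(user);
      counters.readRequests.add(readRows);
      counters.filteredReadRequests.add(filteredRows);
      readRows = 0;
      filteredRows = 0;
    }
  }

  public static void main(String[] args) {
    UserRegistry registry = new UserRegistry();
    RequestMetrics request = new RequestMetrics();

    // Simulate a scan over 1000 rows where every fourth row is filtered out.
    for (int row = 0; row < 1000; row++) {
      if (row % 4 == 0) {
        request.rowFiltered();
      } else {
        request.rowRead();
      }
    }
    request.flush(registry, "alice");

    System.out.println("map lookups during the request: " + registry.lookups.get());
  }
}
```

The alternative of stashing the resolved MetricsUserSource on the call context 
would look similar, except the context would cache the looked-up object and 
per-row updates would go straight to it rather than being buffered.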



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
