Bryan Beaudreault created HBASE-27699:
-----------------------------------------
Summary: User metrics for filtered and read rows are too expensive
Key: HBASE-27699
URL: https://issues.apache.org/jira/browse/HBASE-27699
Project: HBase
Issue Type: Improvement
Reporter: Bryan Beaudreault
MetricsUserAggregateImpl has a pattern like this:
{code:java}
String user = getActiveUser();
if (user != null) {
  MetricsUserSource userSource = getOrCreateMetricsUser(user);
  incrementFilteredReadRequests(userSource);
}
{code}
So every update involves a getOrCreate call, which does a ConcurrentHashMap
lookup. This overhead is not too bad for most requests, because it's executed
just once per request (e.g. updatePut gets called once at the end, though for
multis it happens for every action).
For updateFilteredReadRequests and updateReadRequestCount, these are currently
called in RegionScannerImpl for every row scanned or filtered. Doing the map
lookup over and over adds up. Profiling the regionserver under load, I see over
5% of the time spent updating these metrics.
We should try to collect these metrics, maybe in the RpcCallContext, and then
translate them into user metrics once at the end of the request. Or otherwise
find a way to avoid querying the ConcurrentHashMap multiple times in the
context of a request. Maybe we should actually stash the MetricsUserSource in
the RpcCallContext so that all user metrics only need to do the lookup once,
even for multis.
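
A rough sketch of the two ideas above combined (accumulate per-row counts in a
per-request object, resolve the user's metrics source at most once, flush at
the end). All class and method names here are hypothetical stand-ins, not the
actual HBase types: CallContextSketch plays the role RpcCallContext might
play, and UserSourceSketch stands in for MetricsUserSource. It also assumes a
single thread drives a given request, so the per-request counters are plain
fields.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for MetricsUserSource: shared per-user counters.
final class UserSourceSketch {
  final LongAdder readRequests = new LongAdder();
  final LongAdder filteredReadRequests = new LongAdder();
}

// Hypothetical stand-in for MetricsUserAggregateImpl.
final class UserAggregateSketch {
  private final ConcurrentHashMap<String, UserSourceSketch> sources =
    new ConcurrentHashMap<>();
  // Counts map lookups; exists only to make the saving visible in this sketch.
  final AtomicInteger lookups = new AtomicInteger();

  UserSourceSketch getOrCreateMetricsUser(String user) {
    lookups.incrementAndGet();
    return sources.computeIfAbsent(user, u -> new UserSourceSketch());
  }
}

// Hypothetical per-request holder (assumed to be used by one thread only).
final class CallContextSketch {
  private final String user;
  private final UserAggregateSketch aggregate;
  private UserSourceSketch cachedSource; // resolved lazily, once per request
  private long readRows;
  private long filteredRows;

  CallContextSketch(String user, UserAggregateSketch aggregate) {
    this.user = user;
    this.aggregate = aggregate;
  }

  // Called for every row: plain field increments, no map access.
  void incrementReadRows() { readRows++; }
  void incrementFilteredRows() { filteredRows++; }

  // The single map lookup for this request; later calls reuse the result.
  UserSourceSketch userSource() {
    if (cachedSource == null && user != null) {
      cachedSource = aggregate.getOrCreateMetricsUser(user);
    }
    return cachedSource;
  }

  // Called once at the end of the request to publish the accumulated counts.
  void flush() {
    UserSourceSketch source = userSource();
    if (source != null) {
      source.readRequests.add(readRows);
      source.filteredReadRequests.add(filteredRows);
    }
    readRows = 0;
    filteredRows = 0;
  }
}
```

With something like this, the scanner would call incrementReadRows() and
incrementFilteredRows() per row, and the RPC layer would call flush() once
when the call completes. Other per-request updates (for example each action
in a multi) could go through userSource() and share the same lookup.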