[
https://issues.apache.org/jira/browse/HBASE-15518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248876#comment-15248876
]
Enis Soztutar commented on HBASE-15518:
---------------------------------------
I have tested this with YCSB to see whether there is any perf impact. As
expected (aggregation happens in a background thread, so no metrics are updated
inline), I could not find any significant difference across 3 runs with 3M
records, 10M ops and an average of about 90K gets/s (all served from the block
cache); the average throughput with and without the patch differs by about 1.2%.
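To illustrate why background aggregation keeps the request path cheap, here is a minimal Python sketch. It is not the code in the patch: the class `PerTableAggregator`, its method names, and the 5 second period are all hypothetical, and a lock is used only to keep the sketch short.

```python
import threading
from collections import defaultdict


class PerTableAggregator:
    """Hypothetical sketch: the request path only bumps per-region counters;
    a background thread periodically rolls them up into per-table totals."""

    def __init__(self, period_seconds=5.0):
        self._period = period_seconds          # assumed refresh period
        self._region_reads = defaultdict(int)  # (table, region) -> read count
        self._table_reads = {}                 # table -> read count (last snapshot)
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = None

    def record_read(self, table, region):
        # Hot path: touch one per-region counter, do no per-table work here.
        with self._lock:
            self._region_reads[(table, region)] += 1

    def aggregate_once(self):
        # Copy the per-region counters, then sum them per table.
        with self._lock:
            snapshot = dict(self._region_reads)
        totals = defaultdict(int)
        for (table, _region), count in snapshot.items():
            totals[table] += count
        self._table_reads = dict(totals)
        return self._table_reads

    def table_reads(self):
        # Readers (for example a metrics exporter) see the last snapshot.
        return dict(self._table_reads)

    def start(self):
        def loop():
            # Event.wait returns False on timeout, True once stop() is called.
            while not self._stop.wait(self._period):
                self.aggregate_once()

        self._thread = threading.Thread(target=loop, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()


aggregator = PerTableAggregator()
aggregator.record_read("usertable", "region-a")
aggregator.record_read("usertable", "region-b")
aggregator.record_read("TestTable", "region-c")
print(aggregator.aggregate_once())
```

In this sketch the per-table totals are only as fresh as the last aggregation pass, which is the trade-off for keeping per-table work off the request path.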
Here are the raw YCSB results:
{code}
With patch:
Run 1
[OVERALL], RunTime(ms), 109634.0
[OVERALL], Throughput(ops/sec), 91212.58003903898
[READ], Operations, 1.0E7
[READ], AverageLatency(us), 1058.0882089
[READ], MinLatency(us), 187.0
[READ], MaxLatency(us), 393471.0
[READ], 95thPercentileLatency(us), 2621.0
[READ], 99thPercentileLatency(us), 5367.0
[READ], Return=OK, 10000000
Run 2
[OVERALL], RunTime(ms), 110720.0
[OVERALL], Throughput(ops/sec), 90317.91907514451
[READ], Operations, 1.0E7
[READ], AverageLatency(us), 1066.0874503
[READ], MinLatency(us), 193.0
[READ], MaxLatency(us), 586751.0
[READ], 95thPercentileLatency(us), 2707.0
[READ], 99thPercentileLatency(us), 5391.0
[READ], Return=OK, 10000000
Run 3
[OVERALL], RunTime(ms), 111142.0
[OVERALL], Throughput(ops/sec), 89974.9869536269
[READ], Operations, 1.0E7
[READ], AverageLatency(us), 1071.664219
[READ], MinLatency(us), 200.0
[READ], MaxLatency(us), 500223.0
[READ], 95thPercentileLatency(us), 2707.0
[READ], 99thPercentileLatency(us), 5583.0
[READ], Return=OK, 10000000
Runtime avg: (109634.0 + 110720.0 + 111142.0) / 3 = 110498.66666666667
Throughput avg: (91212.58003903898 + 90317.91907514451 + 89974.9869536269) / 3
= 90501.82868927013
Without patch:
Run 1
[OVERALL], RunTime(ms), 108801.0
[OVERALL], Throughput(ops/sec), 91910.91993639764
[READ], Operations, 1.0E7
[READ], AverageLatency(us), 1048.1014036
[READ], MinLatency(us), 187.0
[READ], MaxLatency(us), 514815.0
[READ], 95thPercentileLatency(us), 2733.0
[READ], 99thPercentileLatency(us), 5743.0
[READ], Return=OK, 10000000
Run 2
[OVERALL], RunTime(ms), 109120.0
[OVERALL], Throughput(ops/sec), 91642.22873900294
[READ], Operations, 1.0E7
[READ], AverageLatency(us), 1062.9898565
[READ], MinLatency(us), 192.0
[READ], MaxLatency(us), 321791.0
[READ], 95thPercentileLatency(us), 2713.0
[READ], 99thPercentileLatency(us), 5707.0
[READ], Return=OK, 10000000
Run 3
[OVERALL], RunTime(ms), 109543.0
[OVERALL], Throughput(ops/sec), 91288.35251910209
[READ], Operations, 1.0E7
[READ], AverageLatency(us), 1067.1782325
[READ], MinLatency(us), 196.0
[READ], MaxLatency(us), 435711.0
[READ], 95thPercentileLatency(us), 2805.0
[READ], 99thPercentileLatency(us), 5831.0
[READ], Return=OK, 10000000
Runtime avg: (108801.0 + 109120.0 + 109543.0) / 3 = 109154.66666666667
Throughput avg: (91910.91993639764 + 91642.22873900294 + 91288.35251910209) / 3
= 91613.8337315009
{code}
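As a rough cross-check of the averages above, the following Python sketch computes the relative difference between the two configurations. The numbers are copied from the YCSB output; the helper name `percent_change` and the choice of the unpatched build as the baseline are my own assumptions.

```python
from statistics import mean

# Values copied from the [OVERALL] lines of the three runs in each configuration.
with_patch = {
    "runtime_ms": [109634.0, 110720.0, 111142.0],
    "throughput_ops": [91212.58003903898, 90317.91907514451, 89974.9869536269],
}
without_patch = {
    "runtime_ms": [108801.0, 109120.0, 109543.0],
    "throughput_ops": [91910.91993639764, 91642.22873900294, 91288.35251910209],
}


def percent_change(baseline, candidate):
    """Relative change of candidate versus baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# Baseline is the unpatched build (an assumption of this sketch).
runtime_delta = percent_change(
    mean(without_patch["runtime_ms"]), mean(with_patch["runtime_ms"])
)
throughput_delta = percent_change(
    mean(without_patch["throughput_ops"]), mean(with_patch["throughput_ops"])
)

print(f"runtime change with patch:    {runtime_delta:+.2f}%")
print(f"throughput change with patch: {throughput_delta:+.2f}%")
```

With only three runs per configuration this is an indication rather than a statistical test; both differences come out at roughly 1.2%.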
Example output from JMX will be:
{code}
{
"name" : "Hadoop:service=HBase,name=RegionServer,sub=Tables",
"modelerType" : "RegionServer,sub=Tables",
"tag.Context" : "regionserver",
"tag.Hostname" : "cn017.l42scl.hortonworks.com",
"Namespace_hbase_table_meta_metric_readRequestCount" : 4510,
"Namespace_hbase_table_meta_metric_writeRequestCount" : 218,
"Namespace_hbase_table_meta_metric_totalRequestCount" : 4728,
"Namespace_hbase_table_namespace_metric_readRequestCount" : 4,
"Namespace_hbase_table_namespace_metric_writeRequestCount" : 0,
"Namespace_hbase_table_namespace_metric_totalRequestCount" : 4,
"Namespace_default_table_tsdb-uid_metric_readRequestCount" : 0,
"Namespace_default_table_tsdb-uid_metric_writeRequestCount" : 0,
"Namespace_default_table_tsdb-uid_metric_totalRequestCount" : 0,
"Namespace_default_table_tsdb-meta_metric_readRequestCount" : 0,
"Namespace_default_table_tsdb-meta_metric_writeRequestCount" : 0,
"Namespace_default_table_tsdb-meta_metric_totalRequestCount" : 0,
"Namespace_default_table_usertable_metric_readRequestCount" : 80000000,
"Namespace_default_table_usertable_metric_writeRequestCount" : 0,
"Namespace_default_table_usertable_metric_totalRequestCount" : 80000000,
"Namespace_default_table_tsdb-tree_metric_readRequestCount" : 0,
"Namespace_default_table_tsdb-tree_metric_writeRequestCount" : 0,
"Namespace_default_table_tsdb-tree_metric_totalRequestCount" : 0,
"Namespace_default_table_TestTable_metric_readRequestCount" : 0,
"Namespace_default_table_TestTable_metric_writeRequestCount" : 0,
"Namespace_default_table_TestTable_metric_totalRequestCount" : 0,
"Namespace_default_table_tsdb_metric_readRequestCount" : 0,
"Namespace_default_table_tsdb_metric_writeRequestCount" : 0,
"Namespace_default_table_tsdb_metric_totalRequestCount" : 0,
"Namespace_default_table_usertable-empty_metric_readRequestCount" : 0,
"Namespace_default_table_usertable-empty_metric_writeRequestCount" : 0,
"Namespace_default_table_usertable-empty_metric_totalRequestCount" : 0,
"numTables" : 9
}
{code}
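For anyone consuming this bean, here is a minimal Python sketch that groups the flat attribute names by table. The key layout `Namespace_<namespace>_table_<table>_metric_<metric>` is inferred only from the sample above, and the function name `group_by_table` is hypothetical. The regex would be ambiguous for a namespace or table name that itself contains `_table_` or `_metric_`.

```python
import re
from collections import defaultdict

# Assumed key layout, inferred from the sample output only:
#   Namespace_<namespace>_table_<table>_metric_<metricName>
KEY_PATTERN = re.compile(
    r"^Namespace_(?P<namespace>.+?)_table_(?P<table>.+)_metric_(?P<metric>[A-Za-z0-9]+)$"
)


def group_by_table(bean):
    """Return {(namespace, table): {metric: value}} for per-table attributes."""
    tables = defaultdict(dict)
    for key, value in bean.items():
        match = KEY_PATTERN.match(key)
        if match is None:
            continue  # skip attributes such as "name", "tag.Hostname", "numTables"
        tables[(match["namespace"], match["table"])][match["metric"]] = value
    return dict(tables)


# A subset of the bean shown above.
sample_bean = {
    "name": "Hadoop:service=HBase,name=RegionServer,sub=Tables",
    "Namespace_hbase_table_meta_metric_readRequestCount": 4510,
    "Namespace_hbase_table_meta_metric_writeRequestCount": 218,
    "Namespace_hbase_table_meta_metric_totalRequestCount": 4728,
    "Namespace_default_table_tsdb-uid_metric_readRequestCount": 0,
    "Namespace_default_table_usertable_metric_readRequestCount": 80000000,
    "Namespace_default_table_usertable_metric_writeRequestCount": 0,
    "Namespace_default_table_usertable_metric_totalRequestCount": 80000000,
    "numTables": 9,
}

per_table = group_by_table(sample_bean)
for (namespace, table), metrics in sorted(per_table.items()):
    print(f"{namespace}:{table} {metrics}")
```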
The test failure is not related. I'll commit this shortly.
> Add Per-Table metrics back
> --------------------------
>
> Key: HBASE-15518
> URL: https://issues.apache.org/jira/browse/HBASE-15518
> Project: HBase
> Issue Type: Sub-task
> Reporter: Enis Soztutar
> Assignee: Alicia Ying Shu
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-15518-v1.patch, HBASE-15518-v2.patch,
> HBASE-15518-v3.patch, HBASE-15518.patch
>
>
> We used to have per-table metrics, but they were removed in some
> restructuring. We have per-region metrics and per-regionserver metrics, but
> nothing in between.
> For the majority of users, per-region is too granular; they are mostly
> interested in table-level aggregates. This is especially useful in
> multi-tenant cases, where a table's disk usage, number of requests, etc. can
> be made much more visible.
> In this jira, we'll add the basic infrastructure for a single (or a few)
> per-table metrics. Then we can improve on that by adding the remaining
> metrics from the region server level.
> The plan is to NOT aggregate per-table metrics at the master for now. We will
> only aggregate per-region metrics at the per-table level on every
> regionserver.