[
https://issues.apache.org/jira/browse/AMBARI-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15904867#comment-15904867
]
Chuan Jin commented on AMBARI-20392:
------------------------------------
Below is my original query:
{code:sql}
0: jdbc:phoenix:my-zk > select count(1)
. . . . . . . . . . . > FROM METRIC_AGGREGATE
. . . . . . . . . . . > WHERE METRIC_NAME IN ('pkts_out','pkts_in','cpu_wio',
'cpu_idle', 'cpu_nice','cpu_user', 'cpu_system','mem_total','mem_free',
'yarn.NodeManagerMetrics.ContainersCompleted',
'yarn.NodeManagerMetrics.ContainersRunning',
'yarn.NodeManagerMetrics.ContainersFailed',
'yarn.NodeManagerMetrics.ContainersLaunched',
'yarn.NodeManagerMetrics.ContainersKilled',
'yarn.NodeManagerMetrics.ContainersIniting')
. . . . . . . . . . . > AND APP_ID = 'nodemanager'
. . . . . . . . . . . > AND SERVER_TIME >= 1489121698000
. . . . . . . . . . . > AND SERVER_TIME < 1489125298000;
+-----------+
| COUNT(1) |
+-----------+
| 1800 |
+-----------+
1 row selected (37.821 seconds)
{code}
I then split it into four queries:
{code:sql}
0: jdbc:phoenix:my-zk > SELECT count(1)
. . . . . . . . . . . > FROM METRIC_AGGREGATE
. . . . . . . . . . . > WHERE METRIC_NAME IN ('pkts_out','pkts_in')
. . . . . . . . . . . > AND APP_ID = 'nodemanager'
. . . . . . . . . . . > AND SERVER_TIME >= 1489121698000
. . . . . . . . . . . > AND SERVER_TIME < 1489125298000;
+-----------+
| COUNT(1) |
+-----------+
| 240 |
+-----------+
1 row selected (0.142 seconds)
0: jdbc:phoenix:my-zk > SELECT count(1)
. . . . . . . . . . . > FROM METRIC_AGGREGATE
. . . . . . . . . . . > WHERE METRIC_NAME IN ('cpu_wio', 'cpu_idle',
'cpu_nice','cpu_user', 'cpu_system')
. . . . . . . . . . . > AND APP_ID = 'nodemanager'
. . . . . . . . . . . > AND SERVER_TIME >= 1489121698000
. . . . . . . . . . . > AND SERVER_TIME < 1489125298000;
+-----------+
| COUNT(1) |
+-----------+
| 600 |
+-----------+
1 row selected (0.266 seconds)
0: jdbc:phoenix:my-zk > SELECT count(1)
. . . . . . . . . . . > FROM METRIC_AGGREGATE
. . . . . . . . . . . > WHERE METRIC_NAME IN ('mem_total','mem_free')
. . . . . . . . . . . > AND APP_ID = 'nodemanager'
. . . . . . . . . . . > AND SERVER_TIME >= 1489121698000
. . . . . . . . . . . > AND SERVER_TIME < 1489125298000;
+-----------+
| COUNT(1) |
+-----------+
| 240 |
+-----------+
1 row selected (0.12 seconds)
0: jdbc:phoenix:my-zk > SELECT count(1)
. . . . . . . . . . . > FROM METRIC_AGGREGATE
. . . . . . . . . . . > WHERE METRIC_NAME IN
('yarn.NodeManagerMetrics.ContainersCompleted',
'yarn.NodeManagerMetrics.ContainersRunning',
'yarn.NodeManagerMetrics.ContainersFailed',
'yarn.NodeManagerMetrics.ContainersLaunched',
'yarn.NodeManagerMetrics.ContainersKilled',
'yarn.NodeManagerMetrics.ContainersIniting')
. . . . . . . . . . . > AND APP_ID = 'nodemanager'
. . . . . . . . . . . > AND SERVER_TIME >= 1489121698000
. . . . . . . . . . . > AND SERVER_TIME < 1489125298000;
+-----------+
| COUNT(1) |
+-----------+
| 720 |
+-----------+
1 row selected (0.154 seconds)
{code}
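The split above can be sketched as client-side batching. This is a minimal sketch under stated assumptions, not the AMS collector's code: the helpers `chunked`, `build_count_query`, and `count_in_batches` and the `run_query` callback are hypothetical names, and fixed-size batches are a simplification (the four queries above were grouped by metric family instead). The last lines only re-add the counts and timings printed in the transcript.

```python
from typing import Callable, List, Sequence, Tuple


def chunked(names: Sequence[str], size: int) -> List[List[str]]:
    """Split names into consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be >= 1")
    return [list(names[i:i + size]) for i in range(0, len(names), size)]


def build_count_query(names: Sequence[str], app_id: str,
                      start_ms: int, end_ms: int) -> Tuple[str, list]:
    """Build one COUNT query with '?' placeholders and its parameter list."""
    placeholders = ", ".join("?" for _ in names)
    sql = (
        "SELECT COUNT(1) FROM METRIC_AGGREGATE "
        f"WHERE METRIC_NAME IN ({placeholders}) "
        "AND APP_ID = ? AND SERVER_TIME >= ? AND SERVER_TIME < ?"
    )
    return sql, [*names, app_id, start_ms, end_ms]


def count_in_batches(run_query: Callable[[str, list], int],
                     names: Sequence[str], app_id: str,
                     start_ms: int, end_ms: int, size: int) -> int:
    """Run one small query per batch and sum the counts.

    `run_query` is a hypothetical callback that executes the SQL with the
    given parameters and returns the single COUNT value.
    """
    total = 0
    for group in chunked(names, size):
        sql, params = build_count_query(group, app_id, start_ms, end_ms)
        total += run_query(sql, params)
    return total


# (row count, seconds) for the four split queries, copied from the transcript.
SPLIT_RESULTS = [(240, 0.142), (600, 0.266), (240, 0.12), (720, 0.154)]
total_rows = sum(count for count, _ in SPLIT_RESULTS)
total_secs = sum(secs for _, secs in SPLIT_RESULTS)
print(total_rows, round(total_secs, 3))
```

With the figures from the transcript, the four small queries return the same 1800 rows in roughly 0.68 seconds combined, versus 37.821 seconds for the single query.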
> Get aggregate metric records from HBase encounters performance issues
> ---------------------------------------------------------------------
>
> Key: AMBARI-20392
> URL: https://issues.apache.org/jira/browse/AMBARI-20392
> Project: Ambari
> Issue Type: Improvement
> Components: ambari-metrics
> Affects Versions: 2.4.2
> Reporter: Chuan Jin
>
> I have a small cluster (~6 nodes) managed by Ambari, and I use a distributed
> HBase (~3 nodes) to hold the metrics collected from these nodes. After
> deploying the YARN service, I noticed that some widgets (Cluster Memory,
> Cluster Disk, ...) do not display properly on the YARN service dashboard
> page. Ambari Server also logs continuous timeout exceptions, complaining
> that it cannot get timeline metrics because the connection was refused.
> The request timeout parameter is 5s, which means the query that fetches
> metrics from HBase takes longer than that. I then logged in with the Phoenix
> shell and ran the same query against HBase, and it took nearly 30s to finish.
> But if I split the big query into smaller pieces, that is, use fewer values
> for "metric_name" in the WHERE ... IN clause, the results of the several
> small queries come back within about 1s.
> Query performance in HBase depends heavily on the row key design and on
> using it properly. When fetching aggregate metrics, the AMS collector
> queries the METRIC_AGGREGATE table in a way that may cause the coprocessor
> to scan several regions across different RegionServers (RS). If we add more
> metrics to the service dashboard, the situation will get worse.