Gabor Boros created AMBARI-25326:
------------------------------------

             Summary: AMS - no HBase and Hive metrics post-upgrade when using 2 collectors
                 Key: AMBARI-25326
                 URL: https://issues.apache.org/jira/browse/AMBARI-25326
             Project: Ambari
          Issue Type: Bug
          Components: ambari-metrics
    Affects Versions: 2.7.3
            Reporter: Gabor Boros
            Assignee: Gabor Boros


This looks like a bug that occurs when 2 Metrics Collectors are deployed: the 
Hive and HBase services are not able to send metrics.

{code}
Error : 2019-06-10 02:42:59,215 INFO timeline 
timeline.HadoopTimelineMetricsSink: No live collector to send metrics to. 
Metrics to be sent will be discarded. This message will be skipped for the next 
20
Debug Error shows this :
2019-06-14 20:35:29,538 DEBUG main timeline.HadoopTimelineMetricsSink: Trying 
to find live collector host from : 
bolhdppname5.micron.com,bolhdppname4.micron.com
2019-06-14 20:35:29,538 DEBUG main timeline.HadoopTimelineMetricsSink: 
Requesting live collector nodes : 
http://bolhdppname5.micron.com,bolhdppname4.micron.com:6188/ws/v1/timeline/metrics/livenodes
2019-06-14 20:35:29,557 DEBUG main timeline.HadoopTimelineMetricsSink: Unable 
to connect to collector, 
http://bolhdppname5.micron.com,bolhdppname4.micron.com:6188/ws/v1/timeline/metrics/livenodes
2019-06-14 20:35:29,557 DEBUG main timeline.HadoopTimelineMetricsSink: 
java.net.UnknownHostException: bolhdppname5.micron.com,bolhdppname4.micron.com
2019-06-14 20:35:29,558 DEBUG main timeline.HadoopTimelineMetricsSink: 
Collector bolhdppname5.micron.com,bolhdppname4.micron.com is not longer live. 
Removing it from list of know live collector hosts : []
2019-06-14 20:35:29,558 DEBUG main timeline.HadoopTimelineMetricsSink: No live 
collectors from configuration.
{code}
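
The debug log above shows the whole comma-separated collector list being placed into a single URL, which then fails with UnknownHostException. As a rough illustration of the expected behaviour (not the actual sink code, which is Java), the list would be split first and one livenodes URL built per host. The function names here are hypothetical; the port 6188 and the URL path are taken from the log.

```python
from typing import List


def split_collector_hosts(raw: str) -> List[str]:
    """Split a comma-separated collector host list, dropping empty entries."""
    return [host.strip() for host in raw.split(",") if host.strip()]


def livenodes_urls(raw_hosts: str, port: int = 6188, protocol: str = "http") -> List[str]:
    """Build one livenodes URL per collector host (hypothetical helper)."""
    return [
        f"{protocol}://{host}:{port}/ws/v1/timeline/metrics/livenodes"
        for host in split_collector_hosts(raw_hosts)
    ]


# Host list exactly as it appears in the debug log.
hosts = "bolhdppname5.micron.com,bolhdppname4.micron.com"
urls = livenodes_urls(hosts)
for url in urls:
    print(url)
```

With two collectors this produces two separate URLs, each with a resolvable hostname, instead of the single malformed one seen in the log.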

The sink is incorrectly parsing the hostnames when there are 2 collectors: the 
comma-separated host list is treated as a single hostname.
The Hive and HBase services can determine the live collectors either through 
curl or through ZooKeeper, but the configs don't support fetching the live 
collector node from ZooKeeper.
To work around this, we added the following for HBase:
{code}
*.sink.timeline.zookeeper.quorum=bolhdppname5.micron.com:2181,bolhdppname1.micron.com:2181,bolhdppname4.micron.com:2181,bolhdppname2.micron.com:2181,bolhdppname3.micron.com:2181
{code}
in
/var/lib/ambari-server/resources/stacks/HDP/3.0/services/HBASE/package/templates/hadoop-metrics2-hbase.properties-GANGLIA-MASTER.j2
For Hive, add
{code}
*.sink.timeline.zookeeper.quorum=bolhdppname5.micron.com:2181,bolhdppname1.micron.com:2181,bolhdppname4.micron.com:2181,bolhdppname2.micron.com:2181,bolhdppname3.micron.com:2181
{code}

to all 4 files under 
/var/lib/ambari-server/resources/stacks/HDP/3.0/services/HIVE/package/templates/
(on the Ambari server):

{code}
root@c1207-node1 templates# ll | grep metr
-rwxr-xr-x 1 root root 3032 Sep 18 2018 hadoop-metrics2-hivemetastore.properties.j2
-rwxr-xr-x 1 root root 3016 Sep 18 2018 hadoop-metrics2-hiveserver2.properties.j2
-rwxr-xr-x 1 root root 2959 Sep 18 2018 hadoop-metrics2-llapdaemon.j2
-rwxr-xr-x 1 root root 3015 Sep 18 2018 hadoop-metrics2-llaptaskscheduler.j2
{code}
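
For anyone applying the same workaround to several template files, the manual edit can be sketched as a small script. This is only an illustration under assumptions: the helper name ensure_property is hypothetical, it appends the property at the end of the file, and it skips files that already define the key so that it can be re-run safely.

```python
from pathlib import Path

# Property key used in the workaround above.
QUORUM_KEY = "*.sink.timeline.zookeeper.quorum"


def ensure_property(template: Path, key: str, value: str) -> bool:
    """Append key=value to a template unless the key is already set.

    Returns True if the file was modified, False if it was left untouched.
    """
    text = template.read_text()
    for line in text.splitlines():
        if line.strip().startswith(key + "="):
            return False  # already configured, do not add a duplicate
    if text and not text.endswith("\n"):
        text += "\n"  # keep the new property on its own line
    template.write_text(text + f"{key}={value}\n")
    return True
```

It would be called once per template (the HBase file and the 4 Hive files listed above) with the ZooKeeper quorum string as the value. Back up the original templates first.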



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)