Custom dashboards are kep in Grafana sqlite db and so it does make perfect sense that those are still around.
If you see metrics on Storm dashboard the issue is not on the AMS backend, need to make sure STORM is sending the metrics to AMS. Which metrics are you specifically looking for? BR, Sid ________________________________ From: Aaron Bossert <[email protected]> Sent: Thursday, September 21, 2017 10:10 PM To: [email protected]; [email protected] Subject: Re: At my wits end with Ambari Metrics Errors. No Storm and Kafka data and unable to start Metrics collector after cleanup I looked at those and couldn’t find the duplicates. I just finished wiping everything I could find related to ambari metrics and re-installed in distributed mode. Now I am past the duplicate key. Sorry, I was too quick to nuke it, so I can’t get the confine that were problematic. The issue I am left with is the lack of data related to storm beyond the “storm home” dashboard. Though I am suspecting there may be some lingering ambari metrics files and folders I didn’t nuke because there is still a custom dashboard I created. Also, on the “storm components” dashboard I get an error “dashboard unit failed template variables could not be initialized: evaluating ‘b.metricFindQuery(a.query).then’) Get Outlook for iOS<https://aka.ms/o0ukef> ________________________________ From: Siddharth Wagle <[email protected]> Sent: Friday, September 22, 2017 12:19:40 AM To: [email protected] Subject: Re: At my wits end with Ambari Metrics Errors. No Storm and Kafka data and unable to start Metrics collector after cleanup Hi Aaron, Can you make sure to go to the AMS service configs and find these 2 properties: "timeline.metrics.host.aggregate.splitpoints" "timeline.metrics.cluster.aggregate.splitpoints" Click on set recommended and also provide the values on this thread. Ambari stack advisor logic calculates these based on predefined split points picked up from the stack and the error below suggests that the calculated split points have a duplicate value. Once you remove the duplicate value from the comma separated list of strings the service should start up fine. Note: The logic to calculate split points depends on parameters like memory and also services installed. Could you file an Apache Jira for this and provide the following: - /etc/ambari-metrics-collector/conf/ams-env.sh - /etc/ambari-hbase/conf/hbase-env.sh - Screenshot of dashboard to get list of services BR, Sid ________________________________ From: Aaron Bossert <[email protected]> Sent: Thursday, September 21, 2017 8:23 PM To: [email protected] Subject: At my wits end with Ambari Metrics Errors. No Storm and Kafka data and unable to start Metrics collector after cleanup All right, I have an installation of Hortonworks HDP 2.6.1 and HDF 3.0.1. I have been developing a storm topology and had issues with the Ambari Metrics Collector the entire time I have had the system installed, but have just now gotten to a point where I can devote more attention to the broken AMS issues: Upon installation, when going to Grafana, I was able to see some of the metrics, but not all. Notably, Nothing beyond the main Storm dashboard showed any data (“no data points” showed in any of the other dashboards), the same issue exists with Kafka. I can see my topics, but no data points. I have tried uninstalling and reinstalling Ambari Metrics several times. I have tried clearing out all the data as well. I have tried running in both distributed and embedded modes. Between each change, I have cleared out all data in the hbase.rootdir and zookeeper datadir as suggested here: (https://cwiki.apache.org/confluence/display/AMBARI/Cleaning+up+Ambari+Metrics+System+Data) to no avail. Now, I cannot even get the collector to start. I am seeing a weird error as follows: The closest corollary I have been able to find is a bug that is supposedly fixed in my version of Ambari that had to do with very large clusters. My cluster is only 10 nodes. Any help at this point would be greatly appreciated! As I see it, the two issues are entirely unrelated, but who knows…the missing metrics for storm and kafka have been an issue ever since I installed the cluster. The failure to start is new. 2017-09-22 02:54:29,173 INFO org.apache.hadoop.hbase.client.ZooKeeperRegistry: ClusterId read in ZooKeeper is null 2017-09-22 02:54:29,203 INFO org.apache.phoenix.metrics.Metrics: Initializing metrics system: phoenix 2017-09-22 02:54:29,241 WARN org.apache.hadoop.metrics2.impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-phoenix.properties,hadoop-metrics2.properties 2017-09-22 02:54:29,307 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 2017-09-22 02:54:29,307 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: phoenix metrics system started 2017-09-22 02:54:29,438 WARN org.apache.hadoop.hbase.io.util.HeapMemorySizeUtil: hbase.regionserver.global.memstore.upperLimit is deprecated by hbase.regionserver.global.memstore.size 2017-09-22 02:54:36,044 WARN org.apache.hadoop.hbase.io.util.HeapMemorySizeUtil: hbase.regionserver.global.memstore.upperLimit is deprecated by hbase.regionserver.global.memstore.size 2017-09-22 02:54:36,751 WARN org.apache.hadoop.hbase.io.util.HeapMemorySizeUtil: hbase.regionserver.global.memstore.upperLimit is deprecated by hbase.regionserver.global.memstore.size 2017-09-22 02:54:41,275 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created SYSTEM.CATALOG 2017-09-22 02:54:46,235 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created SYSTEM.SEQUENCE 2017-09-22 02:54:47,515 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created SYSTEM.STATS 2017-09-22 02:54:48,821 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created SYSTEM.FUNCTION 2017-09-22 02:54:50,102 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created METRICS_METADATA 2017-09-22 02:54:57,444 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created HOSTED_APPS_METADATA 2017-09-22 02:54:58,711 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created INSTANCE_HOST_METADATA 2017-09-22 02:54:59,995 INFO org.apache.hadoop.hbase.client.HBaseAdmin: Created CONTAINER_METRICS 2017-09-22 02:55:00,110 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore failed in state INITED; cause: java.lang.IllegalArgumentException: All split keys must be unique, found duplicate: FSDatasetState.org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.LastVolumeFailureDate\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00, FSDatasetState.org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.LastVolumeFailureDate\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 java.lang.IllegalArgumentException: All split keys must be unique, found duplicate: FSDatasetState.org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.LastVolumeFailureDate\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00, FSDatasetState.org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.LastVolumeFailureDate\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsyncV2(HBaseAdmin.java:740) at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:669) at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1067) at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1427) at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2190) at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:872) at org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:194) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:343) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:331) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:329) at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1421) at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.PhoenixHBaseAccessor.initMetricSchema(PhoenixHBaseAccessor.java:440) at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.initializeSubsystem(HBaseTimelineMetricStore.java:103) at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.serviceInit(HBaseTimelineMetricStore.java:95) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:84) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:137) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:147) 2017-09-22 02:55:00,114 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer failed in state INITED; cause: java.lang.IllegalArgumentException: All split keys must be unique, found duplicate: FSDatasetState.org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.LastVolumeFailureDate\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00, FSDatasetState.org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.LastVolumeFailureDate\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 java.lang.IllegalArgumentException: All split keys must be unique, found duplicate: FSDatasetState.org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.LastVolumeFailureDate\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00, FSDatasetState.org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.LastVolumeFailureDate\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsyncV2(HBaseAdmin.java:740) at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:669) at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1067) at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1427) at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2190) at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:872) at org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:194) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:343) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:331) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:329) at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1421) at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.PhoenixHBaseAccessor.initMetricSchema(PhoenixHBaseAccessor.java:440) at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.initializeSubsystem(HBaseTimelineMetricStore.java:103) at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.serviceInit(HBaseTimelineMetricStore.java:95) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:84) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:137) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:147) 2017-09-22 02:55:00,116 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping phoenix metrics system... 2017-09-22 02:55:00,116 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: phoenix metrics system stopped. 2017-09-22 02:55:00,117 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: phoenix metrics system shutdown complete. 2017-09-22 02:55:00,117 INFO org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl: Stopping ApplicationHistory 2017-09-22 02:55:00,117 FATAL org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer: Error starting ApplicationHistoryServer java.lang.IllegalArgumentException: All split keys must be unique, found duplicate: FSDatasetState.org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.LastVolumeFailureDate\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00, FSDatasetState.org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.LastVolumeFailureDate\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsyncV2(HBaseAdmin.java:740) at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:669) at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1067) at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1427) at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2190) at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:872) at org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:194) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:343) at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:331) at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:329) at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1421) at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.PhoenixHBaseAccessor.initMetricSchema(PhoenixHBaseAccessor.java:440) at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.initializeSubsystem(HBaseTimelineMetricStore.java:103) at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.serviceInit(HBaseTimelineMetricStore.java:95) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:84) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:137) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:147) 2017-09-22 02:55:00,119 INFO org.apache.hadoop.util.ExitUtil: Exiting with status -1 2017-09-22 02:55:00,124 INFO org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer: SHUTDOWN_MSG: /************************************************************ SHUTDOWN_MSG: Shutting down ApplicationHistoryServer at r7u08.thanos.gotgdt.net/10.55.50.201 ************************************************************/ 2017-09-22 02:55:00,154 WARN org.apache.hadoop.hbase.io.util.HeapMemorySizeUtil: hbase.regionserver.global.memstore.upperLimit is deprecated by hbase.regionserver.global.memstore.size
