Hi Stan,

Could you try running hbck and see if that fixes the inconsistency?
http://hbase.apache.org/0.94/book/hbck.in.depth.html

To get to the shell:

~]$ su - ams
~]$ export JAVA_HOME=/usr/jdk64/jdk1.8.0_40/
~]$ cd /usr/lib/ams-hbase/bin
~]$ ./hbase --config /etc/ams-hbase/conf hbck METRIC_RECORD
~]$ ./hbase --config /etc/ams-hbase/conf hbck METRIC_AGGREGATE
~]$ ./hbase --config /etc/ams-hbase/conf hbck METRIC_RECORD_MINUTE
~]$ ./hbase --config /etc/ams-hbase/conf hbck SYSTEM.CATALOG
~]$ ./hbase --config /etc/ams-hbase/conf hbck SYSTEM.STATS

The 'list' command in the hbase shell will give you the full set of tables.
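If hbck reports inconsistencies for any of these (for example holes in the region chain or a bad hbase:meta entry), a repair pass along the following lines may help. This is only a sketch: the -details, -fixAssignments and -fixMeta options are described in the hbck guide linked above, so please verify the exact flags against the embedded HBase version before running anything:

~]$ # report-only pass first; -details prints every inconsistency found
~]$ ./hbase --config /etc/ams-hbase/conf hbck -details
~]$ # if problems are flagged, try repairing region assignments and hbase:meta
~]$ ./hbase --config /etc/ams-hbase/conf hbck -fixAssignments -fixMeta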
- Sid

________________________________
From: [email protected] <[email protected]>
Sent: Wednesday, October 21, 2015 5:33 PM
To: Siddharth Wagle; [email protected]
Cc: Daryl Heinz
Subject: Re: Ambari Metrics

Hello Sid,

Here are the configs. The logs are very large, so instead of emailing them I have exposed them on a server that supports ssh:

host: 24.14.3.243
login: sid
passwd: sid

Thanks again,
Stan

--
Ad Altiora Tendo
Stanley J. Mlynarczyk - Ph.D.
Chief Technology Officer
Mobile: +1 630-607-2223

On 10/21/15 1:36 PM, Siddharth Wagle wrote:

Hi Stan,

Do not worry about the Mac comment below. It was only to suggest a workaround for incompatible native binaries, for example using the centos6 repo to install AMS on a SLES machine.

If you can provide the hbase-ams-master-<host>.log and ambari-metrics-collector.log files, I can provide more info. Also, the configs from:

/etc/ams-hbase/conf
/etc/ambari-metrics-collector/conf

- Sid

________________________________
From: [email protected] <[email protected]>
Sent: Wednesday, October 21, 2015 10:38 AM
To: Siddharth Wagle; [email protected]
Cc: Daryl Heinz
Subject: Re: Ambari Metrics

Hello Sid,

I checked both the cluster with the issue and another of our clusters that is working fine, though it runs a later version of Ambari (2.1). Both have SNAPPY as the compression.

Sid, I am not sure I understand the comment below about "Mac". The cluster is a 48-node Dell system.

In your prior email you suggested checking the yum and rpm repositories along with the OS version; I am still doing this and should have the results shortly.

Thanks,
Stan

--
Ad Altiora Tendo
Stanley J. Mlynarczyk - Ph.D.
Chief Technology Officer
Mobile: +1 630-607-2223

On 10/21/15 12:26 PM, Siddharth Wagle wrote:

AMS uses SNAPPY compression by default, so the service would start up fine but fail when Phoenix tried to CREATE TABLE. The workaround is to set the compression property in ams-site to "NONE" instead of SNAPPY. So it will work on the Mac, just not with compression enabled.
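For reference, and assuming I am remembering the property name right (please double-check it under Ambari Metrics > Configs, ams-site, in the Ambari UI), the change would look like:

timeline.metrics.hbase.compression.scheme=NONE   (default is SNAPPY)

The Metrics Collector would then need a restart so that the Phoenix CREATE TABLE statements run without the SNAPPY codec.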
- Sid

________________________________
From: Hitesh Shah <[email protected]>
Sent: Wednesday, October 21, 2015 10:20 AM
To: [email protected]
Cc: [email protected]; Daryl Heinz
Subject: Re: Ambari Metrics

@Siddharth,

"17:29:40,698 WARN [main] NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable"

The above message is usually meant to be harmless, as it only warns that non-performant Java implementations are being used instead of native code paths. Could you explain why this would affect functionality? Does this mean that one would never be able to deploy/run AMS on a Mac, since Hadoop has never had any native libs built for Darwin?

thanks,
- Hitesh

On Oct 20, 2015, at 6:50 PM, Siddharth Wagle <[email protected]> wrote:

Hi Stan,

Based on the col.txt attached, the real problem is:

17:29:40,698 WARN [main] NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

This would mean incorrect binaries were installed for AMS, possibly because the wrong repo URL was used to install the components. Can you please provide the ambari.repo URL used to install the service, and the version and flavor of the OS on which the Metrics Collector is installed?
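On a typical RHEL/CentOS node, something along these lines should collect all of that in one go (paths assume a yum-based install; adjust accordingly for SLES/zypper):

~]$ cat /etc/yum.repos.d/ambari.repo    # repo URL(s) the packages came from
~]$ cat /etc/*release                   # OS version and flavor
~]$ rpm -qa | grep ambari-metrics       # installed AMS package versions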
The hb.txt looks like a clean log file.

Here is a link to all the info that is useful for debugging:
https://cwiki.apache.org/confluence/display/AMBARI/Troubleshooting+Guide

Best Regards,
Sid

________________________________
From: [email protected] <[email protected]>
Sent: Monday, October 19, 2015 12:33 PM
To: Siddharth Wagle
Cc: Daryl Heinz
Subject: Ambari Metrics

Hello Siddharth,

I am hoping to get your input on an issue that has arisen with the Ambari Metrics Collector. This is with Ambari 2.0.1 and HDP 2.2.6. The error messages received were:

Caused by: java.sql.SQLException: ERROR 1102 (XCL02): Cannot get all table regions
Caused by: java.io.IOException: HRegionInfo was null in hbase:meta

------- CUT partial collector log -----

11:13:35,203 WARN [main] ConnectionManager$HConnectionImplementation:1228 - Encountered problems when prefetch hbase:meta table:
java.io.IOException: HRegionInfo was null or empty in Meta for SYSTEM.CATALOG, row=SYSTEM.CATALOG,,99999999999999
    at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:170)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.prefetchRegionCache(ConnectionManager.java:1222)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1286)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1135)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1118)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1075)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getRegionLocation(ConnectionManager.java:909)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.getAllTableRegions(ConnectionQueryServicesImpl.java:401)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.checkClientServerCompatibility(ConnectionQueryServicesImpl.java:853)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:797)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1107)
    at org.apache.phoenix.query.DelegateConnectionQueryServices.createTable(DelegateConnectionQueryServices.java:110)
    at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:1527)
    at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:535)
    at org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:184)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:260)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:252)
    at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
    at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:250)
    at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1026)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl$9.call(ConnectionQueryServicesImpl.java:1532)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl$9.call(ConnectionQueryServicesImpl.java:1501)
    at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:77)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:1501)
    at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:162)
    at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:126)
    at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:133)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:233)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.DefaultPhoenixDataSource.getConnection(DefaultPhoenixDataSource.java:69)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.PhoenixHBaseAccessor.getConnection(PhoenixHBaseAccessor.java:149)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.PhoenixHBaseAccessor.getConnectionRetryingOnException(PhoenixHBaseAccessor.java:127)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.PhoenixHBaseAccessor.initMetricSchema(PhoenixHBaseAccessor.java:268)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.initializeSubsystem(HBaseTimelineMetricStore.java:64)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.serviceInit(HBaseTimelineMetricStore.java:58)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:84)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:137)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:147)

The (partial) contents of the embedded hbase and collector logs are attached. Any light you could shed on this would be appreciated. I believe the incident started after an upgrade on July 20th at 17:29.

Thanks in advance,
Stan

--
Ad Altiora Tendo
Stanley J. Mlynarczyk - Ph.D.
Chief Technology Officer
Mobile: +1 630-607-2223
