Hi,
For your information, increasing the heap size for the embedded HBase master/regionserver fixed it.

Thanks

On 22/09/2016 at 20:21, Siddharth Wagle wrote:
> Hi Eric,
>
> Any reason for using 2.2.0? There were a bunch of issues fixed in the
> later versions with regard to AMS. I would suggest using at least
> 2.2.2 for performance reasons.
>
> Cluster-level data is much smaller in volume than host-level data,
> so in terms of TTL, tuning *.host.aggregator.* makes more sense than
> *.cluster.aggregator.*.ttl.
>
> I would set the memory configs to Collector = 1 GB, Master = 2 GB,
> and RS = 1 GB in an embedded-mode AMS (ams-env and ams-hbase-env).
>
> Other recommendations are documented on the wiki.
>
> BR,
> Sid
>
> ------------------------------------------------------------------------
> *From:* Eric Troies <erictro...@gmail.com>
> *Sent:* Thursday, September 22, 2016 12:57 AM
> *To:* user@ambari.apache.org
> *Subject:* Re: [metrics collector] stopping by itself
>
> Hi Siddharth,
>
> Thank you.
>
> We're using version 2.2.0.0 with about 150 hosts.
> I did not find any error like ours on Confluence.
>
> I've set timeline.metrics.cluster.aggregator.second.ttl from 15 days
> down to 3; we'll see if that helps.
>
> Regards,
>
> Olivier
>
> On Wed, Sep 21, 2016 at 6:04 PM, Siddharth Wagle
> <swa...@hortonworks.com> wrote:
>
> Hi Eric,
>
> Please take a look at the troubleshooting section on the wiki:
> https://cwiki.apache.org/confluence/display/AMBARI/Troubleshooting
>
> How many nodes does your cluster have?
> What is the version of Ambari?
> BR,
> Sid
>
> ------------------------------------------------------------------------
> *From:* Eric Troies <erictro...@gmail.com>
> *Sent:* Wednesday, September 21, 2016 6:48 AM
> *To:* user@ambari.apache.org
> *Subject:* [metrics collector] stopping by itself
>
> Hi,
>
> After running for a few minutes, my Ambari Metrics Collector stops,
> with these final lines in the log:
>
> 2016-09-21 13:13:58,573 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping phoenix metrics system...
> 2016-09-21 13:13:58,577 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: phoenix metrics system stopped.
> 2016-09-21 13:13:58,577 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: phoenix metrics system shutdown complete.
> 2016-09-21 13:13:58,578 INFO org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl: Stopping ApplicationHistory
> 2016-09-21 13:13:58,578 INFO org.apache.hadoop.ipc.Server: Stopping server on 60200
> 2016-09-21 13:13:58,581 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
> 2016-09-21 13:13:58,581 INFO org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down ApplicationHistoryServer at hostname
> ************************************************************/
> 2016-09-21 13:13:58,581 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 60200
>
> Note that I had previously increased the heap size to 1 GB because I
> was getting GC errors.
>
> Before the shutdown, the log contains many stack traces like the following.
> Thanks,
>
> Eric
>
> 2016-09-21 13:13:58,534 WARN org.apache.hadoop.yarn.webapp.GenericExceptionHandler: INTERNAL_SERVER_ERROR
> javax.ws.rs.WebApplicationException: org.apache.phoenix.execute.CommitException: java.io.InterruptedIOException
>     at org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TimelineWebServices.postMetrics(TimelineWebServices.java:279)
>     at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
>     at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185)
>     at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
>     at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
>     at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>     at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
>     at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>     at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
>     at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
>     at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
>     at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
>     at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
>     at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
>     at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
>     at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:895)
>     at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:843)
>     at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:804)
>     at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
>     at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
>     at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
>     at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
>     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>     at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
>     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>     at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1243)
>     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>     at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>     at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:767)
>     at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
>     at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>     at org.mortbay.jetty.Server.handle(Server.java:326)
>     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
>     at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
>     at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
>     at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
>     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
>     at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
>     at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> Caused by: org.apache.phoenix.execute.CommitException: java.io.InterruptedIOException
>     at org.apache.phoenix.execute.MutationState.commit(MutationState.java:444)
>     at org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:461)
>     at org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:458)
>     at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>     at org.apache.phoenix.jdbc.PhoenixConnection.commit(PhoenixConnection.java:458)
>     at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.PhoenixHBaseAccessor.insertMetricRecords(PhoenixHBaseAccessor.java:429)
>     at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.putMetrics(HBaseTimelineMetricStore.java:323)
>     at org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TimelineWebServices.postMetrics(TimelineWebServices.java:275)
>     ... 46 more
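[Editor's note, for readers finding this thread later: the settings discussed above live in the ams-env, ams-hbase-env, and ams-site configuration sections in the Ambari UI. Below is a minimal sketch of values matching Sid's recommendation for embedded-mode AMS. Property names are as they appeared in Ambari 2.2.x-era documentation and the TTL value is a hypothetical example, not a recommendation; verify both against your own Ambari version before applying, and restart AMS afterwards.]

```
# ams-env : Metrics Collector JVM heap, in MB  (Collector = 1 GB)
metrics_collector_heapsize = 1024

# ams-hbase-env : embedded HBase daemons, in MB
hbase_master_heapsize = 2048        # Master = 2 GB
hbase_regionserver_heapsize = 1024  # RS = 1 GB

# ams-site : per Sid, trim host-level TTLs first, since host-level
# data dominates the stored volume. Values are in seconds;
# 259200 = 3 days (illustrative value only).
timeline.metrics.host.aggregator.ttl = 259200
```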