I have a version of the memcached example running in a Docker image, and now
I'd like to port it to a real cluster (to get a working starting point for
the actual service I want to run in Slider).
I suspect the problem lies in the ZooKeeper or YARN service registry
configuration.
Running the following (sanitized) commands:
slider install-package \
  --package /home/foolish_ewe/mybuild/incubator-slider/app-packages/memcached/jmemcached-1.0.1.zip \
  --name jmemcached --debug --replacepkg
slider create jmemcached \
  --template /home/foolish_ewe/mybuild/incubator-slider/app-packages/memcached/appConfig.json \
  --resources /home/foolish_ewe/mybuild/incubator-slider/app-packages/memcached/resources-default.json \
  --manager rm.yarn.cluster.mycompany.com:8032 --debug \
  --zkhosts zookeeper.cluster.mycompany.com:2181 \
  --zkpath /slider_test/clustername/
I'm seeing failed ZooKeeper connections to localhost:2181 in the AM logs:
2017-05-02 16:16:07,992 [main-SendThread(localhost:2181)] WARN zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
How can I tweak the connection string?
If I look at slider/conf/slider-client.xml, I am still using the default
configuration and see the following setting:
<property>
  <name>hadoop.registry.zk.quorum</name>
  <value>@ZK-QUORUM</value>
</property>
First off, I'm not sure what the @ZK-QUORUM syntax means. Overriding it with
a connection string that names a single host provides no relief from the
dreaded symptom.
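One way to double-check which quorum value a given slider-client.xml really carries is sketched below. This is only an illustration: the function name zk_quorum_value is made up here (it is not part of Slider), and it assumes the <name> and <value> elements sit on consecutive lines, as in the snippet above.

```shell
# Hypothetical helper, not part of Slider: print the <value> that directly
# follows the hadoop.registry.zk.quorum <name> line in a Hadoop-style XML file.
# Assumes <name> and <value> are on consecutive lines.
zk_quorum_value() {
  grep -A1 '<name>hadoop.registry.zk.quorum</name>' "$1" \
    | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}
```

It would be called as `zk_quorum_value slider/conf/slider-client.xml`. Note that this only shows what the client-side file contains; it says nothing about which configuration the AM itself ends up reading.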
The AM logs look like:
2017-05-02 16:16:07,401 [main] INFO appmaster.SliderAppMaster - Registry service username =fooolish_ewe
2017-05-02 16:16:07,462 [main] INFO appmaster.SliderAppMaster - Service Record
ServiceRecord{description='Slider Application Master'; external endpoints: {{
"api" : "http://",
"addressType" : "uri",
"protocolType" : "webui",
"addresses" : [ {
"uri" : "http://cluster.mycompany.com:42734"
} ]
}; {
"api" : "classpath:org.apache.slider.management",
"addressType" : "uri",
"protocolType" : "REST",
"addresses" : [ {
"uri" : "http://cluster.mycompany.com:42734/ws/v1/slider/mgmt"
} ]
}; {
"api" : "classpath:org.apache.slider.publisher",
"addressType" : "uri",
"protocolType" : "REST",
"addresses" : [ {
"uri" : "http://cluster.mycompany.com:42734/ws/v1/slider/publisher"
} ]
}; {
"api" : "classpath:org.apache.slider.registry",
"addressType" : "uri",
"protocolType" : "REST",
"addresses" : [ {
"uri" : "http://cluster.mycompany.com:42734/ws/v1/slider/registry"
} ]
}; {
"api" : "classpath:org.apache.slider.publisher.configurations",
"addressType" : "uri",
"protocolType" : "REST",
"addresses" : [ {
"uri" : "http://cluster.mycompany.com:42734/ws/v1/slider/publisher/slider"
} ]
}; {
"api" : "classpath:org.apache.slider.publisher.exports",
"addressType" : "uri",
"protocolType" : "REST",
"addresses" : [ {
"uri" : "http://cluster.mycompany.com:42734/ws/v1/slider/publisher/exports"
} ]
}; }; internal endpoints: {{
"api" : "classpath:org.apache.slider.agents.secure",
"addressType" : "uri",
"protocolType" : "REST",
"addresses" : [ {
"uri" : "https://cluster.mycompany.com:40466/ws/v1/slider/agents"
} ]
}; {
"api" : "classpath:org.apache.slider.agents.oneway",
"addressType" : "uri",
"protocolType" : "REST",
"addresses" : [ {
"uri" : "https://cluster.mycompany.com:59141/ws/v1/slider/agents"
} ]
}; }, attributes: {"yarn:id"="application_1492599342357_0064"
"yarn:persistence"="application" }}
2017-05-02 16:16:07,992 [main-SendThread(localhost:2181)] WARN zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
[Several repetitions of the previous error omitted for clarity and then...]
2017-05-02 16:16:12,877 [780172372@qtp-747004588-0] ERROR webapp.Dispatcher - error handling URI: /slideram
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:164)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1286)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.lang.NullPointerException
at org.apache.slider.providers.AbstractProviderService.buildEndpointDetails(AbstractProviderService.java:352)
at org.apache.slider.providers.AbstractProviderService.buildMonitorDetails(AbstractProviderService.java:337)
at org.apache.slider.providers.agent.AgentProviderService.buildMonitorDetails(AgentProviderService.java:810)
at org.apache.slider.server.appmaster.web.view.IndexBlock.addProviderServiceOptions(IndexBlock.java:129)
at org.apache.slider.server.appmaster.web.view.IndexBlock.doIndex(IndexBlock.java:85)
at org.apache.slider.server.appmaster.web.view.IndexBlock.render(IndexBlock.java:60)
at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:67)
at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77)
at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
at org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845)
at org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:56)
at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212)
at org.apache.slider.server.appmaster.web.SliderAMController.index(SliderAMController.java:47)
... 39 more
2017-05-02 16:16:13,495 [main-SendThread(localhost:2181)] WARN zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
[More repetitions of the previous error deleted]
2017-05-02 16:16:22,474 [main] ERROR curator.ConnectionState - Connection timed out for connection string (localhost:2181) and timeout (15000) / elapsed (18944)
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:198)
at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88)
at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:113)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:457)
at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:239)
at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:234)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at org.apache.curator.framework.imps.DeleteBuilderImpl.pathInForeground(DeleteBuilderImpl.java:230)
at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:215)
at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:42)
at org.apache.hadoop.registry.client.impl.zk.CuratorService.zkDelete(CuratorService.java:673)
at org.apache.hadoop.registry.client.impl.zk.RegistryOperationsService.delete(RegistryOperationsService.java:160)
at org.apache.slider.server.services.yarnregistry.YarnRegistryViewForProviders.putService(YarnRegistryViewForProviders.java:186)
at org.apache.slider.server.services.yarnregistry.YarnRegistryViewForProviders.registerSelf(YarnRegistryViewForProviders.java:224)
at org.apache.slider.server.appmaster.SliderAppMaster.registerServiceInstance(SliderAppMaster.java:1084)
at org.apache.slider.server.appmaster.SliderAppMaster.createAndRunCluster(SliderAppMaster.java:885)
at org.apache.slider.server.appmaster.SliderAppMaster.runService(SliderAppMaster.java:525)
at org.apache.slider.core.main.ServiceLauncher.launchService(ServiceLauncher.java:188)
at org.apache.slider.core.main.ServiceLauncher.launchServiceRobustly(ServiceLauncher.java:475)
at org.apache.slider.core.main.ServiceLauncher.launchServiceAndExit(ServiceLauncher.java:403)
at org.apache.slider.core.main.ServiceLauncher.serviceMain(ServiceLauncher.java:630)
at org.apache.slider.server.appmaster.SliderAppMaster.main(SliderAppMaster.java:2240)
2017-05-02 16:16:23,403 [main-SendThread(localhost:2181)] WARN zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2017-05-02 16:16:24,504 [main-SendThread(localhost:2181)] WARN zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
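Because the whole question turns on which host the ZooKeeper client actually dials, the targets can be tallied straight from the AM log. The following is a rough sketch rather than anything Slider provides: zk_targets is a name invented for this post, the log is assumed to arrive on standard input, and the parsing relies only on the "SendThread(host:port)" thread name visible in the WARN lines above.

```shell
# Hypothetical helper: count ZooKeeper connection attempts per target,
# reading an AM log on stdin. Extracts host:port from the
# "SendThread(host:port)" thread name that appears in the ClientCnxn warnings.
zk_targets() {
  grep -o 'SendThread([^)]*)' \
    | sed 's/^SendThread(\(.*\))$/\1/' \
    | sort | uniq -c
}
```

Usage would be along the lines of `zk_targets < am.log`, where am.log stands in for wherever the AM log was saved. If every attempt lists localhost:2181, the value given to --zkhosts is evidently not the one the AM's registry client is using.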
With best regards,
Bill