Hi Mauricio,

Sorry for the delayed reply.

This can sometimes happen during startup or shutdown. It's a race between the leading scheduler (de)registering itself in ZooKeeper (and the client or UI noticing) and the scheduler starting up (or tearing down) its storage. It's usually transient, but definitely a wart. I've filed https://issues.apache.org/jira/browse/AURORA-1509 to track improving this.

On Wed, Sep 23, 2015 at 9:33 PM Mauricio Garavaglia <[email protected]> wrote:

> Hi guys, I have this problem when suddenly one scheduler stops working,
> every request ended up in the following stack trace (since this was not the
> leader it also fails to redirect the request to the proper host).
>
> I have 5 schedulers configured with 5zk; this error goes away if I restart
> the scheduler; but I couldn't find yet how to reproduce the issue
> consistently. Please let me know if I could give you more info to
> troubleshoot. Thanks
>
>
> I0924 04:01:46.763665 244 leveldb.cpp:343] Persisting action (86 bytes) to leveldb took 3.839569ms
> I0924 04:01:46.763694 244 replica.cpp:679] Persisted action at 3413
> I0924 04:01:46.765290 217 replica.cpp:658] Replica received learned notice for position 3413
> I0924 04:01:46.768878 217 leveldb.cpp:343] Persisting action (88 bytes) to leveldb took 3.558083ms
> I0924 04:01:46.768910 217 replica.cpp:679] Persisted action at 3413
> I0924 04:01:46.768925 217 replica.cpp:664] Replica learned APPEND action at position 3413
> I0924 04:01:46.853 THREAD133 com.twitter.common.zookeeper.ServerSetImpl$ServerSetWatcher.logChange: server set /aurora/scheduler change: from 0 members to 1
> joined:
> ServiceInstance(serviceEndpoint:Endpoint(host:10.192.255.21, port:8081), additionalEndpoints:{http=Endpoint(host:10.192.255.21, port:8081)}, status:ALIVE)
> I0924 04:01:46.853 THREAD133 org.apache.aurora.scheduler.http.LeaderRedirect$SchedulerMonitor.onChange: Found leader scheduler at [ServiceInstance(serviceEndpoint:Endpoint(host:10.192.255.21, port:8081),
> additionalEndpoints:{http=Endpoint(host:10.192.255.21, port:8081)}, status:ALIVE)]
> I0924 04:01:57.685 THREAD133 com.twitter.common.zookeeper.CandidateImpl$4.onGroupChange: Candidate /aurora/scheduler/singleton_candidate_0000000196 waiting for the next leader election, current voting: [singleton_candidate_0000000198, singleton_candidate_0000000200, singleton_candidate_0000000193, singleton_candidate_0000000194, singleton_candidate_0000000196]
> I0924 04:01:57.736865 261 network.hpp:424] ZooKeeper group memberships changed
> I0924 04:01:57.737118 234 group.cpp:659] Trying to get '/aurora/replicated-log/0000000193' in ZooKeeper
> I0924 04:01:57.737802 234 group.cpp:659] Trying to get '/aurora/replicated-log/0000000194' in ZooKeeper
> I0924 04:01:57.738327 234 group.cpp:659] Trying to get '/aurora/replicated-log/0000000195' in ZooKeeper
> I0924 04:01:57.738834 234 group.cpp:659] Trying to get '/aurora/replicated-log/0000000196' in ZooKeeper
> I0924 04:01:57.739236 234 group.cpp:659] Trying to get '/aurora/replicated-log/0000000197' in ZooKeeper
> I0924 04:01:57.739910 241 network.hpp:466] ZooKeeper group PIDs: { log-replica(1)@10.192.255.21:8083, log-replica(1)@10.192.255.22:8083, log-replica(1)@10.192.255.23:8083, log-replica(1)@10.192.255.24:8083, log-replica(1)@10.192.255.25:8083 }
> I0924 04:02:59.327913 219 replica.cpp:511] Replica received write request for position 3414
> I0924 04:02:59.333317 219 leveldb.cpp:343] Persisting action (217 bytes) to leveldb took 5.351646ms
> I0924 04:02:59.333377 219 replica.cpp:679] Persisted action at 3414
> I0924 04:02:59.334916 241 replica.cpp:658] Replica received learned notice for position 3414
> I0924 04:02:59.338354 241 leveldb.cpp:343] Persisting action (219 bytes) to leveldb took 3.403633ms
> I0924 04:02:59.338392 241 replica.cpp:679] Persisted action at 3414
> I0924 04:02:59.338404 241 replica.cpp:664] Replica learned APPEND action at position 3414
> I0924 04:03:24.492 THREAD171
> org.apache.aurora.scheduler.http.RequestLogger$1.log: 10.192.255.23 10.207.4.10 [24/Sep/2015:04:03:24 +0000] "POST /api HTTP/1.1" 200 0 "http://10.192.255.23:8081/scheduler" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:40.0) Gecko/20100101 Firefox/40.0" 1
> I0924 04:03:24.505 THREAD171 org.apache.aurora.scheduler.thrift.aop.LoggingInterceptor.invoke: getRoleSummary()
> W0924 04:03:24.512 THREAD171 org.apache.aurora.scheduler.thrift.aop.LoggingInterceptor.invoke: Uncaught transient exception while handling getRoleSummary()
> org.apache.aurora.scheduler.storage.Storage$TransientStorageException: Storage is not READY
> at org.apache.aurora.scheduler.storage.CallOrderEnforcingStorage.checkInState(CallOrderEnforcingStorage.java:78)
> at org.apache.aurora.scheduler.storage.CallOrderEnforcingStorage.read(CallOrderEnforcingStorage.java:114)
> at org.apache.aurora.scheduler.thrift.ReadOnlySchedulerImpl.getRoleSummary(ReadOnlySchedulerImpl.java:230)
> at org.apache.aurora.scheduler.thrift.SchedulerThriftInterface.getRoleSummary(SchedulerThriftInterface.java:463)
> at org.apache.aurora.scheduler.thrift.aop.ThriftStatsExporterInterceptor.invoke(ThriftStatsExporterInterceptor.java:47)
> at org.apache.aurora.scheduler.thrift.aop.FeatureToggleInterceptor.invoke(FeatureToggleInterceptor.java:38)
> at org.apache.aurora.scheduler.thrift.aop.LoggingInterceptor.invoke(LoggingInterceptor.java:102)
> at org.apache.aurora.scheduler.thrift.aop.ServerInfoInterceptor.invoke(ServerInfoInterceptor.java:30)
> at org.apache.aurora.gen.ReadOnlyScheduler$Processor$getRoleSummary.getResult(ReadOnlyScheduler.java:886)
> at org.apache.aurora.gen.ReadOnlyScheduler$Processor$getRoleSummary.getResult(ReadOnlyScheduler.java:871)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at org.apache.thrift.server.TServlet.doPost(TServlet.java:83)
> at
javax.servlet.http.HttpServlet.service(HttpServlet.java:727) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) > at > > com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) > at > > com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) > at > > com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > 
com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > org.apache.aurora.scheduler.http.HttpStatsFilter.doFilter(HttpStatsFilter.java:69) > at > > org.apache.aurora.scheduler.http.AbstractFilter.doFilter(AbstractFilter.java:44) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > 
com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:82) > at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:294) > at > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163) > at > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > at > > com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118) > at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113) > at > > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1288) > at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:443) > at > > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1044) > at > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:372) > at > > org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:978) > at > > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) > at > > org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) > at > > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) > at > > org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:317) > at > > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) > at org.eclipse.jetty.server.Server.handle(Server.java:369) > at > > org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:486) > at > > org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:944) > at > > org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1005) > at 
org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:865) > at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240) > at > > org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82) > at > > org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667) > at > > org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52) > at > > org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) > at > > org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) > at java.lang.Thread.run(Thread.java:745) >
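
For illustration only: while the race described above is unfixed, a caller can treat "Storage is not READY" as a retryable error. The sketch below is a minimal Python example, not Aurora's client API. `TransientStorageError` and `call_with_retry` are hypothetical names, and `rpc` stands for whatever zero-argument wrapper you put around your own client call.

```python
import time


class TransientStorageError(Exception):
    """Hypothetical stand-in for the scheduler's 'Storage is not READY' error."""


def call_with_retry(rpc, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call rpc(), retrying on TransientStorageError with exponential backoff.

    rpc is any zero-argument callable (an assumption; wrap your own client call).
    The last error is re-raised once `attempts` calls have failed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return rpc()
        except TransientStorageError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))
```

Retrying only helps when the condition really is transient. If the error persists across every attempt, as in the report quoted above, restarting the affected scheduler remains the workaround.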
