Great, thanks for the answer!

On Wed, Sep 30, 2015 at 10:05 PM, Kevin Sweeney <[email protected]> wrote:
> Hi Mauricio,
>
> Sorry for the delay in reply.
>
> This can sometimes happen during startup or shutdown. It's a race between
> the leading scheduler (de)registering itself in ZooKeeper (and the client or
> UI noticing) and starting up (tearing down) its storage. It's usually
> transient, but definitely a wart. I've filed
> https://issues.apache.org/jira/browse/AURORA-1509 to track improving this.
>
> On Wed, Sep 23, 2015 at 9:33 PM Mauricio Garavaglia <[email protected]> wrote:
>
> > Hi guys, I have this problem when suddenly one scheduler stops working,
> > every request ended up in the following stack trace (since this was not the
> > leader it also fails to redirect the request to the proper host).
> >
> > I have 5 schedulers configured with 5zk; this error goes away if I restart
> > the scheduler; but I couldn't find yet how to reproduce the issue
> > consistently. Please let me know if I could give you more info to
> > troubleshoot. Thanks
> >
> > I0924 04:01:46.763665 244 leveldb.cpp:343] Persisting action (86 bytes) to leveldb took 3.839569ms
> > I0924 04:01:46.763694 244 replica.cpp:679] Persisted action at 3413
> > I0924 04:01:46.765290 217 replica.cpp:658] Replica received learned notice for position 3413
> > I0924 04:01:46.768878 217 leveldb.cpp:343] Persisting action (88 bytes) to leveldb took 3.558083ms
> > I0924 04:01:46.768910 217 replica.cpp:679] Persisted action at 3413
> > I0924 04:01:46.768925 217 replica.cpp:664] Replica learned APPEND action at position 3413
> > I0924 04:01:46.853 THREAD133 com.twitter.common.zookeeper.ServerSetImpl$ServerSetWatcher.logChange: server set /aurora/scheduler change: from 0 members to 1
> >     joined:
> >     ServiceInstance(serviceEndpoint:Endpoint(host:10.192.255.21, port:8081), additionalEndpoints:{http=Endpoint(host:10.192.255.21, port:8081)}, status:ALIVE)
> > I0924 04:01:46.853 THREAD133 org.apache.aurora.scheduler.http.LeaderRedirect$SchedulerMonitor.onChange: Found leader scheduler at [ServiceInstance(serviceEndpoint:Endpoint(host:10.192.255.21, port:8081), additionalEndpoints:{http=Endpoint(host:10.192.255.21, port:8081)}, status:ALIVE)]
> > I0924 04:01:57.685 THREAD133 com.twitter.common.zookeeper.CandidateImpl$4.onGroupChange: Candidate /aurora/scheduler/singleton_candidate_0000000196 waiting for the next leader election, current voting: [singleton_candidate_0000000198, singleton_candidate_0000000200, singleton_candidate_0000000193, singleton_candidate_0000000194, singleton_candidate_0000000196]
> > I0924 04:01:57.736865 261 network.hpp:424] ZooKeeper group memberships changed
> > I0924 04:01:57.737118 234 group.cpp:659] Trying to get '/aurora/replicated-log/0000000193' in ZooKeeper
> > I0924 04:01:57.737802 234 group.cpp:659] Trying to get '/aurora/replicated-log/0000000194' in ZooKeeper
> > I0924 04:01:57.738327 234 group.cpp:659] Trying to get '/aurora/replicated-log/0000000195' in ZooKeeper
> > I0924 04:01:57.738834 234 group.cpp:659] Trying to get '/aurora/replicated-log/0000000196' in ZooKeeper
> > I0924 04:01:57.739236 234 group.cpp:659] Trying to get '/aurora/replicated-log/0000000197' in ZooKeeper
> > I0924 04:01:57.739910 241 network.hpp:466] ZooKeeper group PIDs: { log-replica(1)@10.192.255.21:8083, log-replica(1)@10.192.255.22:8083, log-replica(1)@10.192.255.23:8083, log-replica(1)@10.192.255.24:8083, log-replica(1)@10.192.255.25:8083 }
> > I0924 04:02:59.327913 219 replica.cpp:511] Replica received write request for position 3414
> > I0924 04:02:59.333317 219 leveldb.cpp:343] Persisting action (217 bytes) to leveldb took 5.351646ms
> > I0924 04:02:59.333377 219 replica.cpp:679] Persisted action at 3414
> > I0924 04:02:59.334916 241 replica.cpp:658] Replica received learned notice for position 3414
> > I0924 04:02:59.338354 241 leveldb.cpp:343] Persisting action (219 bytes) to leveldb took 3.403633ms
> > I0924 04:02:59.338392 241 replica.cpp:679] Persisted action at 3414
> > I0924 04:02:59.338404 241 replica.cpp:664] Replica learned APPEND action at position 3414
> > I0924 04:03:24.492 THREAD171 org.apache.aurora.scheduler.http.RequestLogger$1.log: 10.192.255.23 10.207.4.10 [24/Sep/2015:04:03:24 +0000] "POST /api HTTP/1.1" 200 0 "http://10.192.255.23:8081/scheduler" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:40.0) Gecko/20100101 Firefox/40.0" 1
> > I0924 04:03:24.505 THREAD171 org.apache.aurora.scheduler.thrift.aop.LoggingInterceptor.invoke: getRoleSummary()
> > W0924 04:03:24.512 THREAD171 org.apache.aurora.scheduler.thrift.aop.LoggingInterceptor.invoke: Uncaught transient exception while handling getRoleSummary()
> > org.apache.aurora.scheduler.storage.Storage$TransientStorageException: Storage is not READY
> >     at org.apache.aurora.scheduler.storage.CallOrderEnforcingStorage.checkInState(CallOrderEnforcingStorage.java:78)
> >     at org.apache.aurora.scheduler.storage.CallOrderEnforcingStorage.read(CallOrderEnforcingStorage.java:114)
> >     at org.apache.aurora.scheduler.thrift.ReadOnlySchedulerImpl.getRoleSummary(ReadOnlySchedulerImpl.java:230)
> >     at org.apache.aurora.scheduler.thrift.SchedulerThriftInterface.getRoleSummary(SchedulerThriftInterface.java:463)
> >     at org.apache.aurora.scheduler.thrift.aop.ThriftStatsExporterInterceptor.invoke(ThriftStatsExporterInterceptor.java:47)
> >     at org.apache.aurora.scheduler.thrift.aop.FeatureToggleInterceptor.invoke(FeatureToggleInterceptor.java:38)
> >     at org.apache.aurora.scheduler.thrift.aop.LoggingInterceptor.invoke(LoggingInterceptor.java:102)
> >     at org.apache.aurora.scheduler.thrift.aop.ServerInfoInterceptor.invoke(ServerInfoInterceptor.java:30)
> >     at org.apache.aurora.gen.ReadOnlyScheduler$Processor$getRoleSummary.getResult(ReadOnlyScheduler.java:886)
> >     at
org.apache.aurora.gen.ReadOnlyScheduler$Processor$getRoleSummary.getResult(ReadOnlyScheduler.java:871) > > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > > at org.apache.thrift.server.TServlet.doPost(TServlet.java:83) > > at javax.servlet.http.HttpServlet.service(HttpServlet.java:727) > > at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) > > at > > > > > com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) > > at > > > > > com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) > > at > > > > > com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at 
> > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > org.apache.aurora.scheduler.http.HttpStatsFilter.doFilter(HttpStatsFilter.java:69) > > at > > > > > org.apache.aurora.scheduler.http.AbstractFilter.doFilter(AbstractFilter.java:44) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163) 
> > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:168) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:82) > > at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:294) > > at > > > > > com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163) > > at > > > > > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) > > at > > > > > com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118) > > at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113) > > at > > > > > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1288) > > at > > > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:443) > > at > > > > > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1044) > > at > > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:372) > > at > > > > > org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:978) > > at > > > > > 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
> >     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
> >     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
> >     at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:317)
> >     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
> >     at org.eclipse.jetty.server.Server.handle(Server.java:369)
> >     at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:486)
> >     at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:944)
> >     at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1005)
> >     at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:865)
> >     at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
> >     at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
> >     at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:667)
> >     at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
> >     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
> >     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
> >     at java.lang.Thread.run(Thread.java:745)
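
Since the quoted reply describes the "Storage is not READY" failure as usually transient, one possible client-side workaround until AURORA-1509 is addressed is to retry with backoff. The sketch below is only an illustration: `TransientStorageError` and `call_with_retry` are hypothetical names, not part of any Aurora client API, and `request` stands in for whatever call the client makes to the scheduler.

```python
import time


class TransientStorageError(Exception):
    """Hypothetical stand-in for a 'Storage is not READY' response."""


def call_with_retry(request, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call request(), retrying with exponential backoff on transient errors.

    request: zero-argument callable (assumed to raise TransientStorageError
             while the scheduler's storage is not ready).
    sleep:   injectable so the backoff can be observed or skipped in tests.
    """
    for attempt in range(attempts):
        try:
            return request()
        except TransientStorageError:
            if attempt == attempts - 1:
                # Still failing after the last attempt: probably not the
                # short-lived startup/shutdown race, so surface the error.
                raise
            sleep(base_delay * (2 ** attempt))
```

If the error persists through every attempt, as in the report above where only a restart cleared it, the exception is re-raised so the caller can alert instead of waiting indefinitely.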
