keith-turner opened a new issue, #5714:
URL: https://github.com/apache/accumulo/issues/5714

   **Describe the bug**
   
   The test org.apache.accumulo.test.functional.ShutdownIT.shutdownDuringDeleteTable fails intermittently. When it fails, the client sees the following exception.
   
   ```
   org.apache.accumulo.core.client.AccumuloException: Internal error processing waitForFateOperation
        at org.apache.accumulo.core.clientImpl.TableOperationsImpl.handleFateOperation(TableOperationsImpl.java:433)
        at org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:451)
        at org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:441)
        at org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1840)
        at org.apache.accumulo.core.clientImpl.TableOperationsImpl.delete(TableOperationsImpl.java:814)
        at org.apache.accumulo.test.functional.ShutdownIT.lambda$shutdownDuringDeleteTable$0(ShutdownIT.java:89)
        at java.base/java.lang.Thread.run(Thread.java:840)
   ```
   
   This is caused by the following exception in the manager:
   
   ```
   2025-07-03T14:16:06,515 109 [thrift.ProcessFunction] ERROR: Internal error processing finishFateOperation
   java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@3328ca62[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@b9dc721[Wrapped task = org.apache.accumulo.core.trace.TraceWrappedRunnable@332f9385]] rejected from org.apache.accumulo.core.util.threads.ThreadPools$2@7806a7c5[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 353]
           at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2065) ~[?:?]
           at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:833) ~[?:?]
           at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:340) ~[?:?]
           at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:562) ~[?:?]
           at org.apache.accumulo.core.util.threads.ThreadPools$2.schedule(ThreadPools.java:683) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.execute(ScheduledThreadPoolExecutor.java:705) ~[?:?]
           at org.apache.accumulo.core.util.threads.ThreadPools$2.execute(ThreadPools.java:669) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.core.clientImpl.ConditionalWriterImpl.queue(ConditionalWriterImpl.java:322) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at java.base/java.util.HashMap.forEach(HashMap.java:1421) ~[?:?]
           at org.apache.accumulo.core.clientImpl.ConditionalWriterImpl.queue(ConditionalWriterImpl.java:311) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.core.clientImpl.ConditionalWriterImpl.write(ConditionalWriterImpl.java:434) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.core.clientImpl.ConditionalWriterImpl.write(ConditionalWriterImpl.java:841) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.core.fate.user.FateMutatorImpl.tryMutate(FateMutatorImpl.java:245) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.core.fate.user.UserFateStore.tryReserve(UserFateStore.java:227) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.core.fate.AbstractFateStore.reserve(AbstractFateStore.java:146) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.core.logging.FateLogger$1.reserve(FateLogger.java:113) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.core.fate.Fate.delete(Fate.java:434) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.manager.FateServiceHandler.finishFateOperation(FateServiceHandler.java:862) ~[accumulo-manager-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at jdk.internal.reflect.GeneratedMethodAccessor17.invoke(Unknown Source) ~[?:?]
           at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
           at java.base/java.lang.reflect.Method.invoke(Method.java:569) ~[?:?]
           at org.apache.accumulo.core.trace.TraceUtil.lambda$wrapService$0(TraceUtil.java:203) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at jdk.proxy2/jdk.proxy2.$Proxy27.finishFateOperation(Unknown Source) ~[?:?]
           at org.apache.accumulo.core.manager.thrift.FateService$Processor$finishFateOperation.getResult(FateService.java:633) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.accumulo.core.manager.thrift.FateService$Processor$finishFateOperation.getResult(FateService.java:609) ~[accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:40) [libthrift-0.17.0.jar:0.17.0]
           at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:40) [libthrift-0.17.0.jar:0.17.0]
           at org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:147) [libthrift-0.17.0.jar:0.17.0]
           at org.apache.accumulo.server.rpc.TimedProcessor.process(TimedProcessor.java:50) [accumulo-server-base-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:492) [libthrift-0.17.0.jar:0.17.0]
           at org.apache.accumulo.server.rpc.CustomNonBlockingServer$CustomFrameBuffer.invoke(CustomNonBlockingServer.java:129) [accumulo-server-base-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at org.apache.thrift.server.Invocation.run(Invocation.java:18) [libthrift-0.17.0.jar:0.17.0]
           at org.apache.accumulo.core.trace.TraceWrappedRunnable.run(TraceWrappedRunnable.java:52) [accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
           at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
           at org.apache.accumulo.core.trace.TraceWrappedRunnable.run(TraceWrappedRunnable.java:52) [accumulo-core-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
           at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
   
   ```
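
For reference, the rejection at the top of the manager trace can be reproduced in isolation with plain JDK classes. The sketch below is not Accumulo code: the class and method names are made up for illustration, and it only assumes that the pool in the trace behaves like a standard `ScheduledThreadPoolExecutor` with the default abort policy, which the `AbortPolicy.rejectedExecution` frame suggests.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledThreadPoolExecutor;

// Illustrative only; not part of Accumulo.
public class RejectedAfterShutdownDemo {

  // Returns true if a task handed to an already shut down pool is rejected.
  static boolean isRejectedAfterShutdown() {
    ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(1);
    pool.shutdown(); // nothing is queued, so no work is left to run
    try {
      pool.execute(() -> {}); // same entry point as ThreadPools$2.execute in the trace
      return false;
    } catch (RejectedExecutionException e) {
      // The default AbortPolicy throws instead of silently dropping the task.
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("rejected after shutdown: " + isRejectedAfterShutdown());
  }
}
```

Any RPC handler that still reaches such a pool after it has been shut down will surface this exception to the caller as an internal error.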
   
   **Expected behavior**
   
   The client should not see an exception. Instead, it should wait for a new manager and make the RPC request there.
   
   One way this could be fixed is to stop the thrift server in the manager 
before stopping fate.  If workable, then this seems like it could be a simple 
solution.
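
A minimal sketch of why the ordering matters, using hypothetical stand-ins rather than the real manager classes: `RpcFrontEnd` plays the role of the thrift server and a plain `ExecutorService` plays the role of the pool that fate depends on. A request that arrives between the two shutdown steps is either refused at the front door (and the client can go look for another manager) or fails with an internal error, depending on which step ran first.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative only; all names here are hypothetical.
public class ShutdownOrderSketch {

  // Stand-in for the RPC server: forwards work to a backend pool while open.
  static class RpcFrontEnd {
    private final AtomicBoolean accepting = new AtomicBoolean(true);
    private final ExecutorService backend;

    RpcFrontEnd(ExecutorService backend) {
      this.backend = backend;
    }

    void stop() {
      accepting.set(false);
    }

    String handle(Runnable work) {
      if (!accepting.get()) {
        return "REFUSED"; // client can retry against a new manager
      }
      try {
        backend.execute(work);
        return "OK";
      } catch (RejectedExecutionException e) {
        return "INTERNAL_ERROR"; // what the client saw in this issue
      }
    }
  }

  // Simulates a request arriving between the two shutdown steps.
  static String requestBetweenSteps(boolean stopRpcFirst) {
    ExecutorService backend = Executors.newSingleThreadExecutor();
    RpcFrontEnd rpc = new RpcFrontEnd(backend);
    if (stopRpcFirst) {
      rpc.stop();
    } else {
      backend.shutdown();
    }
    String outcome = rpc.handle(() -> {});
    if (stopRpcFirst) {
      backend.shutdown();
    } else {
      rpc.stop();
    }
    return outcome;
  }

  public static void main(String[] args) {
    System.out.println("backend stopped first: " + requestBetweenSteps(false));
    System.out.println("rpc stopped first:     " + requestBetweenSteps(true));
  }
}
```

Whether the real thrift server can be stopped this early without affecting other manager shutdown steps is not something this sketch addresses.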
   
   Another possible fix is to add an exception to the Fate-related RPCs indicating that fate is not currently available and that the client should retry later. This would be similar to the NotServingTablet thrift exceptions that some tablet-related RPCs throw. This would be more complicated and does not seem worthwhile unless there is no simpler solution.
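
As a rough illustration of this second option, the client side could look something like the sketch below. `FateUnavailableException` and `callWithRetry` are hypothetical names invented here; no such thrift exception exists in the source, and a real implementation would also need the server-side change and a way to locate the new manager between attempts.

```java
import java.util.concurrent.Callable;

// Illustrative only; all names here are hypothetical.
public class FateRetrySketch {

  // Hypothetical signal that fate is not currently available on this manager.
  static class FateUnavailableException extends Exception {
    FateUnavailableException(String message) {
      super(message);
    }
  }

  // Retries the call while the server reports that fate is unavailable.
  static <T> T callWithRetry(Callable<T> rpc, int maxAttempts, long backoffMillis)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    FateUnavailableException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return rpc.call();
      } catch (FateUnavailableException e) {
        last = e;
        Thread.sleep(backoffMillis); // give a new manager time to come up
      }
    }
    throw last;
  }

  // Simulates an RPC that fails a fixed number of times before succeeding.
  static String simulate(int failuresBeforeSuccess) {
    int[] calls = {0};
    try {
      return callWithRetry(() -> {
        if (calls[0]++ < failuresBeforeSuccess) {
          throw new FateUnavailableException("fate is shutting down");
        }
        return "done after " + calls[0] + " calls";
      }, 5, 10L);
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(simulate(2));
  }
}
```

The extra exception type, the retry policy, and the thrift interface change are the added complexity mentioned above.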
   

