keith-turner commented on issue #3759:
URL: https://github.com/apache/accumulo/issues/3759#issuecomment-2251132771

   > Re-opening due to possible issue found in 
https://github.com/apache/accumulo/pull/4751 . See 
https://github.com/apache/accumulo/pull/4751#issuecomment-2250181639.
   
   I plan to revert the commit related to #4751 and move this issue to 4.0.0 
milestone for now because I can not see how to fix this w/o a major refactoring 
of the tablet close code (the comments on #4751 provide some insight into one 
refactoring that is needed, but that may be the tip of an ice berg).  A major 
refactoring of the code could introduce new bugs and this bug probably has a 
low chance of happening, so it does not seem worth the risk to me in 2.1.X at 
the moment
   
   The reason I am thinking moving this to 4.0.0 is because a lot of code was 
removed from the tablet server in 4.0.0 which makes a major refactoring of the 
Tablet code easier to do.  The issue   #3670 is already open for 4.0.0 and this 
bug could possibly be fixed as part of that cleanup or it could be done after 
that cleanup and would be much easier to do after a cleanup like #3670.
   
   If we figure out a way to fix this in 2.1.X w/o rewriting large parts of the 
Tablet synchronization code, then this issue could be moved back to a 2.1.X 
milestone.
   
   This bug involves two different kinds of locks, one is a Java lock object 
and the other is Java object monitor lock.  I was curious if jstack could 
detect deadlocks between these different kinds of locks.  Did a bit of 
searching and did not find anything definitive.  So wrote the following test 
program and for this jstack did detect the deadlock.  So its good to know if it 
does happen that jstack can detect it (at least w/ jdk 17).
   
   
   ```java
   public class TestLocks {
       public static void main(String[] args) throws Exception {
           ExecutorService executorService = Executors.newFixedThreadPool(2);
   
           var logLock = new ReentrantLock();
           var tablet = new HashMap<String,String>();
   
           var future1 = executorService.submit(()-> {
               // This simulates how Tablet close may acquire locks.
               synchronized (tablet) {
                   Thread.sleep(5000);
                   logLock.lock();
                   System.out.println("1");
               }
               return null;
           });
   
   
           var future2 = executorService.submit(()-> {
               //This simulates how other code in Tablet outside of close will 
acquire locks
               logLock.lock();
               Thread.sleep(5000);
               synchronized (tablet){
                   System.out.println("2");
               }
               return null;
           });
   
           future1.get();
           future2.get();
       }
   }
   ```
   
   The following is the output of running jstack against the above program 
using OpenJDK17, so at least that version will detect deadlock between 
different lock types.
   
   ```
   Found one Java-level deadlock:
   =============================
   "pool-1-thread-1":
     waiting for ownable synchronizer 0x000000062ae3eed0, (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync),
     which is held by "pool-1-thread-2"
   
   "pool-1-thread-2":
     waiting to lock monitor 0x00007562c4000fe0 (object 0x000000062ae3efd8, a 
java.util.HashMap),
     which is held by "pool-1-thread-1"
   
   Java stack information for the threads listed above:
   ===================================================
   "pool-1-thread-1":
        at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
        - parking to wait for  <0x000000062ae3eed0> (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at 
java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:211)
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:715)
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:938)
        at 
java.util.concurrent.locks.ReentrantLock$Sync.lock([email protected]/ReentrantLock.java:153)
        at 
java.util.concurrent.locks.ReentrantLock.lock([email protected]/ReentrantLock.java:322)
        at org.apache.accumulo.server.TestLocks.lambda$main$0(TestLocks.java:19)
        - locked <0x000000062ae3efd8> (a java.util.HashMap)
        at 
org.apache.accumulo.server.TestLocks$$Lambda$14/0x0000756324001208.call(Unknown 
Source)
        at 
java.util.concurrent.FutureTask.run([email protected]/FutureTask.java:264)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1136)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:635)
        at java.lang.Thread.run([email protected]/Thread.java:840)
   "pool-1-thread-2":
        at org.apache.accumulo.server.TestLocks.lambda$main$1(TestLocks.java:31)
        - waiting to lock <0x000000062ae3efd8> (a java.util.HashMap)
        at 
org.apache.accumulo.server.TestLocks$$Lambda$16/0x0000756324001430.call(Unknown 
Source)
        at 
java.util.concurrent.FutureTask.run([email protected]/FutureTask.java:264)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1136)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:635)
        at java.lang.Thread.run([email protected]/Thread.java:840)
   
   Found 1 deadlock.
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to