lhotari commented on pull request #9764: URL: https://github.com/apache/pulsar/pull/9764#issuecomment-787866127
> But I am not sure we can touch so many sensible parts of the codebase in one single patch.
> We could break things without knowing and it will be hard to track down to the cause.
>
> I don't know which is the best strategy to incorporate this work safely

That's a valid concern when a major change is made. However, the problems that the long-lived locks cause are nondeterministic and much harder to track down. One of the issues with the StampedLocks used by ConcurrentOpenHashMap is that a deadlock doesn't get detected by the JVM. I made [a thread dump of an experiment](https://jstack.review/?https://gist.github.com/lhotari/66524bc10f7768a0bfc6bceb7d523b84) where I caused a deadlock by modifying `testDeadlockPreventionWithForEachInSnapshot` to use `forEach` and a timeout (`@Test(timeOut = 10000L)`). I cherry-picked #9766 to get a standard thread dump with lock information. StampedLock information doesn't seem to be covered.

StampedLocks aren't reentrant, so that is yet another way deadlocks can occur: a single thread can deadlock itself by trying to acquire a lock it already holds.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: [email protected]
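To illustrate the non-reentrancy point from the comment above, here is a minimal sketch using only `java.util.concurrent.locks`. It is not code from Pulsar or from `ConcurrentOpenHashMap`; the class and method names are hypothetical. It uses a timed `tryWriteLock` so that the demo returns instead of hanging, whereas a plain `writeLock()` call at the same point would block the thread forever.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.StampedLock;

// Hypothetical demo class, not part of Pulsar.
final class StampedLockReentrancyDemo {

    // Returns true if a thread that already holds the StampedLock write lock
    // can acquire it a second time within 100 ms.
    static boolean canReacquireStampedWriteLock() {
        StampedLock lock = new StampedLock();
        long stamp = lock.writeLock();
        try {
            // A plain lock.writeLock() here would never return.
            long second = lock.tryWriteLock(100, TimeUnit.MILLISECONDS);
            return second != 0L; // 0 means the lock was not acquired
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    // Same experiment with ReentrantLock, for comparison.
    static boolean canReacquireReentrantLock() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        try {
            boolean acquired = lock.tryLock(100, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.unlock(); // release the nested hold
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("StampedLock re-acquired: " + canReacquireStampedWriteLock());
        System.out.println("ReentrantLock re-acquired: " + canReacquireReentrantLock());
    }
}
```

The second acquisition fails for `StampedLock` and succeeds for `ReentrantLock`. `StampedLock` does not track an owning thread, which is consistent with the observation above that lock information for it is missing from the thread dump.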
