Jason918 opened a new issue #13004: URL: https://github.com/apache/pulsar/issues/13004
**Describe the bug**
The unit test `org.apache.pulsar.metadata.LockManagerTest#updateValueWhenKeyDisappears` has a small chance of failing with the following exception:

> java.util.concurrent.CompletionException: org.apache.pulsar.metadata.api.MetadataStoreException$LockBusyException: Resource at /my/path/1 is already locked
> at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
> at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
> at java.base/java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:777)
> at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
> at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
> at org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl.lambda$acquireWithNoRevalidation$7(ResourceLockImpl.java:167)
> at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986)
> at java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970)
> at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
> at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
> at org.apache.pulsar.metadata.impl.DelayInjectionMetadataStore.lambda$getRandomDelayStage$0(DelayInjectionMetadataStore.java:83)
> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: org.apache.pulsar.metadata.api.MetadataStoreException$LockBusyException: Resource at /my/path/1 is already locked
> ... 13 more

It fails on this line: https://github.com/apache/pulsar/blob/693a066d73ea4012fb2bb750d7450474f210cccd/pulsar-metadata/src/test/java/org/apache/pulsar/metadata/LockManagerTest.java#L198

After some digging, I found that the cause is a race condition in the method `org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl#revalidate`.

Call stack A:
1. `store.delete("/my/path/1", Optional.empty()).join();`
2. Node Delete Event
3. LockManagerImpl#handleDataNotification
4. ResourceLockImpl#lockWasInvalidated
5. **ResourceLockImpl#revalidate**

Call stack B:
1. `lock.updateValue("value-2").join();`
2. org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl#acquire
3. ResourceLockImpl#acquireWithNoRevalidation fails with LockBusyException
4. **ResourceLockImpl#revalidate**, see: https://github.com/apache/pulsar/blob/693a066d73ea4012fb2bb750d7450474f210cccd/pulsar-metadata/src/main/java/org/apache/pulsar/metadata/coordination/impl/ResourceLockImpl.java#L130

Once the node is deleted and two `ResourceLockImpl#revalidate` calls run at the same time, one of them is going to fail. So in the case above, `lock.updateValue` fails.

**To Reproduce**
Steps to reproduce the behavior:
1. Add a 5ms delay in `MetadataStore#get`, which makes the failure easier to reproduce.
2. Run `updateValueWhenKeyDisappears` a few times.
3. See the error.

**Expected behavior**
`lock.updateValue` should always succeed in this case.

**Screenshots**
NA

**Desktop (please complete the following information):**
- OS: [e.g. iOS]

**Additional context**
NA
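The interleaving described in this issue can be modelled with a small, deterministic toy. This is only a sketch and not Pulsar code: `ToyStore`, `naiveInterleaving`, and `serializedInterleaving` are hypothetical names, the store is a plain in-memory map, and "serializing the two revalidations" is just one assumed way to avoid the race, not necessarily the fix the project would adopt.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model of the race; none of these types exist in Pulsar.
final class RevalidateRaceSketch {

    /** Toy stand-in for a metadata store: path -> value. */
    static final class ToyStore {
        private final Map<String, String> nodes = new HashMap<>();

        boolean exists(String path) { return nodes.containsKey(path); }

        /** Create-if-absent; returns false when the node already exists. */
        boolean createIfAbsent(String path, String value) {
            return nodes.putIfAbsent(path, value) == null;
        }

        void put(String path, String value) { nodes.put(path, value); }

        void delete(String path) { nodes.remove(path); }
    }

    /** Both callers read before either writes. Returns true only if both succeed. */
    static boolean naiveInterleaving() {
        ToyStore store = new ToyStore();
        String path = "/my/path/1";
        store.createIfAbsent(path, "value-1");
        store.delete(path); // the node disappears

        boolean aSawMissing = !store.exists(path); // caller A: delete-notification path
        boolean bSawMissing = !store.exists(path); // caller B: updateValue path

        boolean aOk = aSawMissing && store.createIfAbsent(path, "value-1");
        // B still believes the node is missing, so its create loses ("already locked").
        boolean bOk = bSawMissing && store.createIfAbsent(path, "value-2");
        return aOk && bOk;
    }

    /** Assumed remedy: B starts only after A has finished, so B re-reads fresh state. */
    static boolean serializedInterleaving() {
        ToyStore store = new ToyStore();
        String path = "/my/path/1";
        store.createIfAbsent(path, "value-1");
        store.delete(path);

        // Caller A runs to completion and re-creates the node.
        boolean aOk = !store.exists(path) && store.createIfAbsent(path, "value-1");

        // Caller B now sees the node (held by the same lock in this toy) and updates it.
        boolean bOk;
        if (store.exists(path)) {
            store.put(path, "value-2");
            bOk = true;
        } else {
            bOk = store.createIfAbsent(path, "value-2");
        }
        return aOk && bOk;
    }

    public static void main(String[] args) {
        System.out.println("naive: both succeed = " + naiveInterleaving());
        System.out.println("serialized: both succeed = " + serializedInterleaving());
    }
}
```

In the naive ordering the second create is rejected, which corresponds to the `LockBusyException` seen by `lock.updateValue`. In the serialized ordering both callers succeed.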
