Hi there,

I get an error when trying to start a Raft node that is deployed inside a
Kubernetes cluster. Here's the error:



Caused by: java.io.IOException: Failed to lock storage /data/ratis-data/dynamic-service-2.dynamic-service-gcek/43dea5d8-f076-11ec-8ea0-0242ac120002. The directory is already locked
    at org.apache.ratis.server.storage.RaftStorageDirectoryImpl.tryLock(RaftStorageDirectoryImpl.java:236) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.storage.RaftStorageDirectoryImpl.lambda$lock$0(RaftStorageDirectoryImpl.java:194) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.util.JavaUtils.attempt(JavaUtils.java:166) ~[ratis-common-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.util.FileUtils.attempt(FileUtils.java:40) ~[ratis-common-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.storage.RaftStorageDirectoryImpl.lock(RaftStorageDirectoryImpl.java:194) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.storage.RaftStorageDirectoryImpl.analyzeStorage(RaftStorageDirectoryImpl.java:153) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.storage.RaftStorageImpl.analyzeAndRecoverStorage(RaftStorageImpl.java:97) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.storage.RaftStorageImpl.<init>(RaftStorageImpl.java:67) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.storage.RaftStorageImpl.<init>(RaftStorageImpl.java:52) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.impl.ServerState.<init>(ServerState.java:116) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:201) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$5(RaftServerProxy.java:274) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    at java.lang.Thread.run(Thread.java:829) ~[?:?]
Caused by: java.nio.channels.OverlappingFileLockException
    at org.apache.ratis.server.storage.RaftStorageDirectoryImpl.tryLock(RaftStorageDirectoryImpl.java:227) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.storage.RaftStorageDirectoryImpl.lambda$lock$0(RaftStorageDirectoryImpl.java:194) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.util.JavaUtils.attempt(JavaUtils.java:166) ~[ratis-common-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.util.FileUtils.attempt(FileUtils.java:40) ~[ratis-common-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.storage.RaftStorageDirectoryImpl.lock(RaftStorageDirectoryImpl.java:194) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.storage.RaftStorageDirectoryImpl.analyzeStorage(RaftStorageDirectoryImpl.java:153) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.storage.RaftStorageImpl.analyzeAndRecoverStorage(RaftStorageImpl.java:97) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.storage.RaftStorageImpl.<init>(RaftStorageImpl.java:67) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.storage.RaftStorageImpl.<init>(RaftStorageImpl.java:52) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.impl.ServerState.<init>(ServerState.java:116) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:201) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$5(RaftServerProxy.java:274) ~[ratis-server-2.3.0.jar!/:2.3.0]
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    at java.lang.Thread.run(Thread.java:829) ~[?:?]




I've also tried recreating the Raft directory (by recreating the PVC) and
restarting the pod, but I still get the same error.




Each pod has its own data storage, so there's no reason the directory would be
locked by two Ratis processes.

So I guess it might be some kind of bug? I found a JIRA issue that looks
almost the same: https://issues.apache.org/jira/browse/RATIS-538
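
One detail from the trace that may help narrow this down: the root cause is
java.nio.channels.OverlappingFileLockException. As far as I understand the JDK
documentation, FileChannel.tryLock throws it when the same JVM already holds an
overlapping lock on the file, whereas a lock held by a different process makes
tryLock return null. So this may be one process locking the directory twice
rather than two processes competing for it. Below is a minimal sketch of that
behaviour in plain Java NIO. It is not Ratis code, and the class name, method
name and demo.lock file are made up for illustration:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of Ratis.
class OverlappingLockDemo {

    /**
     * Locks the file through one channel, then tries to lock it again through
     * a second channel in the same JVM. Returns true if the second attempt
     * fails with OverlappingFileLockException.
     */
    static boolean secondLockOverlaps(Path lockFile) throws IOException {
        try (FileChannel first = FileChannel.open(lockFile,
                 StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileChannel second = FileChannel.open(lockFile,
                 StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock held = first.tryLock();
            if (held == null) {
                // null means another process (not this JVM) holds the lock.
                throw new IOException("Lock is held by another process");
            }
            try {
                second.tryLock();   // same JVM, same file, overlapping region
                return false;
            } catch (OverlappingFileLockException e) {
                return true;
            } finally {
                held.release();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("lock-demo");
        Path lockFile = dir.resolve("demo.lock");   // hypothetical file name
        System.out.println("overlap within one JVM: " + secondLockOverlaps(lockFile));
    }
}
```

If that reading is right, the thing to check on my side would be whether the
same storage directory is being initialized twice inside a single server
process, but that is only a guess from the exception type.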




Any ideas how to fix it? 




Thanks,




Riguz Lee
