[ 
https://issues.apache.org/jira/browse/RATIS-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chung En Lee updated RATIS-2194:
--------------------------------
    Priority: Blocker  (was: Major)

> The FileLock wasn't properly released
> -------------------------------------
>
>                 Key: RATIS-2194
>                 URL: https://issues.apache.org/jira/browse/RATIS-2194
>             Project: Ratis
>          Issue Type: Bug
>            Reporter: Chung En Lee
>            Assignee: Chung En Lee
>            Priority: Blocker
>
> Found OverlappingFileLockException when restarting RaftServer.
> {code:java}
> 2024-11-15 10:20:34,913 [ec434f7b-7b08-41b6-ab49-e0081c7628e0-impl-thread2] ERROR storage.RaftStorageDirectory (RaftStorageDirectoryImpl.java:tryLock(232)) - It appears that another process has already locked the storage directory: /home/runner/work/ozone/ozone/hadoop-ozone/integration-test/target/test-dir/MiniOzoneClusterImpl-39b72d62-70b5-4dca-b921-16ab22773f65/ozone-meta/datanode-5/ratis/6307bde8-27c4-4e8e-a05b-c0f893560afd
> java.nio.channels.OverlappingFileLockException
>     at sun.nio.ch.SharedFileLockTable.checkList(FileLockTable.java:255)
>     at sun.nio.ch.SharedFileLockTable.add(FileLockTable.java:152)
>     at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:1107)
>     at java.nio.channels.FileChannel.tryLock(FileChannel.java:1155)
>     at org.apache.ratis.server.storage.RaftStorageDirectoryImpl.tryLock(RaftStorageDirectoryImpl.java:223)
>     at org.apache.ratis.server.storage.RaftStorageDirectoryImpl.lambda$lock$0(RaftStorageDirectoryImpl.java:193)
>     at org.apache.ratis.util.JavaUtils.lambda$attempt$7(JavaUtils.java:212)
>     at org.apache.ratis.util.JavaUtils.attempt(JavaUtils.java:225)
>     at org.apache.ratis.util.JavaUtils.attempt(JavaUtils.java:212)
>     at org.apache.ratis.util.FileUtils.attempt(FileUtils.java:45)
>     at org.apache.ratis.server.storage.RaftStorageDirectoryImpl.lock(RaftStorageDirectoryImpl.java:193)
>     at org.apache.ratis.server.storage.RaftStorageDirectoryImpl.analyzeStorage(RaftStorageDirectoryImpl.java:156)
>     at org.apache.ratis.server.storage.RaftStorageImpl.analyzeAndRecoverStorage(RaftStorageImpl.java:106)
>     at org.apache.ratis.server.storage.RaftStorageImpl.initialize(RaftStorageImpl.java:66)
>     at org.apache.ratis.server.storage.StorageImplUtils$Op.recover(StorageImplUtils.java:176)
>     at org.apache.ratis.server.storage.StorageImplUtils$Op.run(StorageImplUtils.java:129)
>     at org.apache.ratis.server.storage.StorageImplUtils.initRaftStorage(StorageImplUtils.java:100)
>     at org.apache.ratis.server.impl.ServerState.lambda$new$2(ServerState.java:118)
>     at org.apache.ratis.util.MemoizedCheckedSupplier.get(MemoizedCheckedSupplier.java:68)
>     at org.apache.ratis.server.impl.ServerState.initialize(ServerState.java:140)
>     at org.apache.ratis.server.impl.RaftServerImpl.start(RaftServerImpl.java:384)
>     at org.apache.ratis.util.ConcurrentUtils.accept(ConcurrentUtils.java:203)
>     at org.apache.ratis.util.ConcurrentUtils.lambda$null$4(ConcurrentUtils.java:182)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:750)
> {code}
> When `RaftServerProxy` is closed, every `RaftServerImpl` in `implMap` should
> be closed too. However, a new `RaftServerImpl` started by
> `RaftServerProxy#groupManagementAsync` might not be closed properly: the log
> reports that the new `RaftServerImpl` has been closed, while its Raft storage
> is still being initialised by `implExecutor`. As a result, the storage
> directory lock appears to be acquired after the close and is never released.
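
The race described above can be illustrated with a small, self-contained sketch. It is not Ratis code: the `Server` class, its `initStorage`/`close` methods, and the temp file standing in for the storage directory lock are all hypothetical, and only the executor name `implExecutor` is borrowed from the report. A latch forces `close()` to run before the asynchronous initialisation, so the lock is taken after the close and leaks; a later lock attempt in the same JVM then fails with `OverlappingFileLockException`.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LeakedLockSketch {
  /** Hypothetical stand-in for a server whose storage is locked asynchronously. */
  static final class Server {
    private final Path lockFile;
    private FileChannel channel;
    private FileLock lock;

    Server(Path lockFile) {
      this.lockFile = lockFile;
    }

    /** Runs on the executor; takes the lock even if close() has already run. */
    synchronized void initStorage() throws IOException {
      channel = FileChannel.open(lockFile,
          StandardOpenOption.CREATE, StandardOpenOption.WRITE);
      lock = channel.tryLock();
    }

    /** Releases only what is held at the time of the call. */
    synchronized void close() throws IOException {
      if (lock != null) {
        lock.release();
        lock = null;
      }
      if (channel != null) {
        channel.close();
        channel = null;
      }
    }
  }

  /** Returns true if a simulated restart hits OverlappingFileLockException. */
  public static boolean restartFailsAfterEarlyClose() throws Exception {
    Path lockFile = Files.createTempFile("storage", ".lock");
    ExecutorService implExecutor = Executors.newSingleThreadExecutor();
    CountDownLatch closedFirst = new CountDownLatch(1);
    Server server = new Server(lockFile);
    try {
      Future<?> init = implExecutor.submit(() -> {
        closedFirst.await();      // make sure close() has already happened
        server.initStorage();
        return null;
      });
      server.close();             // nothing is locked yet, so nothing is released
      closedFirst.countDown();
      init.get();                 // the lock is now held by a "closed" server

      // Simulated restart: a second channel on the same file, in the same JVM.
      try (FileChannel ch = FileChannel.open(lockFile, StandardOpenOption.WRITE)) {
        ch.tryLock();
        return false;
      } catch (OverlappingFileLockException e) {
        return true;
      }
    } finally {
      implExecutor.shutdown();
      server.close();             // clean up the leaked lock
      Files.deleteIfExists(lockFile);
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("restart fails: " + restartFailsAfterEarlyClose());
  }
}
```

One possible remedy, under the same assumptions, is for `close()` to wait for any pending initialisation task before releasing resources, or for `initStorage()` to check a closed flag before taking the lock. Whether the actual fix for RATIS-2194 takes either approach is not stated here.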



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
