Clay B. created RATIS-692:
-----------------------------
Summary: RaftStorageDirectory.tryLock throws a very deep re-tried IOException
Key: RATIS-692
URL: https://issues.apache.org/jira/browse/RATIS-692
Project: Ratis
Issue Type: Bug
Components: server
Reporter: Clay B.
Working with our Namazu infrastructure, the first issue I hit when dialing up
the faulty I/O injection rate is as follows:
{code}
2019-09-27 14:13:45 ERROR RaftStorageDirectory:336 - Failed to acquire lock on /home/vagrant/test_data/data0_slowed/64656d6f-5261-6674-4772-6f7570313233/in_use.lock. If this storage directory is mounted via NFS, ensure that the appropriate nfs lock services are running.
java.io.IOException: Input/output error
at java.io.RandomAccessFile.writeBytes(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:512)
at org.apache.ratis.server.storage.RaftStorageDirectory.tryLock(RaftStorageDirectory.java:327)
at org.apache.ratis.server.storage.RaftStorageDirectory.lock(RaftStorageDirectory.java:291)
at org.apache.ratis.server.storage.RaftStorageDirectory.analyzeStorage(RaftStorageDirectory.java:264)
at org.apache.ratis.server.storage.RaftStorage.analyzeAndRecoverStorage(RaftStorage.java:100)
at org.apache.ratis.server.storage.RaftStorage.<init>(RaftStorage.java:63)
at org.apache.ratis.server.impl.ServerState.<init>(ServerState.java:109)
at org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:110)
at org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:208)
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Exception in thread "main" java.io.IOException: Input/output error
at java.io.RandomAccessFile.writeBytes(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:512)
at org.apache.ratis.server.storage.RaftStorageDirectory.tryLock(RaftStorageDirectory.java:327)
at org.apache.ratis.server.storage.RaftStorageDirectory.lock(RaftStorageDirectory.java:291)
at org.apache.ratis.server.storage.RaftStorageDirectory.analyzeStorage(RaftStorageDirectory.java:264)
at org.apache.ratis.server.storage.RaftStorage.analyzeAndRecoverStorage(RaftStorage.java:100)
at org.apache.ratis.server.storage.RaftStorage.<init>(RaftStorage.java:63)
at org.apache.ratis.server.impl.ServerState.<init>(ServerState.java:109)
at org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:110)
at org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:208)
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}
However, it looks like nothing in the call chain actually retries.
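For discussion, here is a rough sketch of what a bounded retry around the lock attempt might look like. This is not the existing Ratis code and not a proposed patch: the class LockRetrySketch, the IoCall interface, the retry and lockOnce helpers, the marker bytes, and the attempt/backoff parameters are all hypothetical names made up for illustration. It only uses standard java.io / java.nio calls and assumes, as the trace suggests, that the failing step is a write to the lock file.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.charset.StandardCharsets;

public class LockRetrySketch {
  /** Hypothetical functional interface: an I/O step that may fail transiently. */
  @FunctionalInterface
  public interface IoCall<T> {
    T call() throws IOException;
  }

  /**
   * Runs op up to maxAttempts times, sleeping backoffMillis * attempt between
   * failures. Rethrows the last IOException if every attempt fails.
   */
  public static <T> T retry(IoCall<T> op, int maxAttempts, long backoffMillis)
      throws IOException, InterruptedException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return op.call();
      } catch (IOException e) {
        last = e;
        if (attempt < maxAttempts) {
          Thread.sleep(backoffMillis * attempt); // linear backoff
        }
      }
    }
    throw last;
  }

  /**
   * One lock attempt: write a marker to the lock file, then try to take an
   * exclusive lock. The marker content is an assumption for illustration.
   */
  public static FileLock lockOnce(File lockFile) throws IOException {
    RandomAccessFile file = new RandomAccessFile(lockFile, "rws");
    try {
      file.write("owner".getBytes(StandardCharsets.UTF_8));
      FileLock lock = file.getChannel().tryLock();
      if (lock == null) {
        throw new IOException("Lock held by another process: " + lockFile);
      }
      return lock; // the caller releases the lock and closes its channel
    } catch (IOException e) {
      file.close();
      throw e;
    } catch (OverlappingFileLockException e) {
      file.close();
      throw new IOException("Lock already held in this JVM: " + lockFile, e);
    }
  }
}
```

A caller could then wrap the single attempt, e.g. LockRetrySketch.retry(() -> LockRetrySketch.lockOnce(lockFile), 3, 100L). Whether retrying is the right behaviour here (versus failing fast and surfacing a clearer error) is an open question; the sketch retries every IOException, including "held by another process", which a real fix would probably want to treat differently.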