[ 
https://issues.apache.org/jira/browse/HDDS-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16697459#comment-16697459
 ] 

Shashikant Banerjee commented on HDDS-850:
------------------------------------------

Thanks [~msingh] for the review. Patch v3 addresses your review comments.

> ReadStateMachineData hits OverlappingFileLockException in 
> ContainerStateMachine
> -------------------------------------------------------------------------------
>
>                 Key: HDDS-850
>                 URL: https://issues.apache.org/jira/browse/HDDS-850
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>    Affects Versions: 0.4.0
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>            Priority: Major
>             Fix For: 0.4.0
>
>         Attachments: HDDS-850.000.patch, HDDS-850.001.patch, 
> HDDS-850.002.patch, HDDS-850.003.patch
>
>
> {code:java}
> 2018-11-16 09:54:41,599 ERROR org.apache.ratis.server.impl.LogAppender: 
> GrpcLogAppender(0813f1a9-61be-4cab-aa05-d5640f4a8341 -> 
> c6ad906f-7e71-4bac-bde3-d22bc1aa8c7d) hit IOException while loading raft log
> org.apache.ratis.server.storage.RaftLogIOException: 
> 0813f1a9-61be-4cab-aa05-d5640f4a8341: Failed readStateMachineData for (t:2, 
> i:1), STATEMACHINELOGENTRY, client-7D19FB803B1E, cid=0
>         at 
> org.apache.ratis.server.storage.RaftLog$EntryWithData.getEntry(RaftLog.java:370)
>         at 
> org.apache.ratis.server.impl.LogAppender$LogEntryBuffer.getAppendRequest(LogAppender.java:167)
>         at 
> org.apache.ratis.server.impl.LogAppender.createRequest(LogAppender.java:216)
>         at 
> org.apache.ratis.grpc.server.GrpcLogAppender.appendLog(GrpcLogAppender.java:152)
>         at 
> org.apache.ratis.grpc.server.GrpcLogAppender.runAppenderImpl(GrpcLogAppender.java:96)
>         at 
> org.apache.ratis.server.impl.LogAppender.runAppender(LogAppender.java:100)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.nio.channels.OverlappingFileLockException
>         at sun.nio.ch.SharedFileLockTable.checkList(FileLockTable.java:255)
>         at sun.nio.ch.SharedFileLockTable.add(FileLockTable.java:152)
>         at 
> sun.nio.ch.AsynchronousFileChannelImpl.addToFileLockTable(AsynchronousFileChannelImpl.java:178)
>         at 
> sun.nio.ch.SimpleAsynchronousFileChannelImpl.implLock(SimpleAsynchronousFileChannelImpl.java:185)
>         at 
> sun.nio.ch.AsynchronousFileChannelImpl.lock(AsynchronousFileChannelImpl.java:118)
>         at 
> org.apache.hadoop.ozone.container.keyvalue.helpers.ChunkUtils.readData(ChunkUtils.java:178)
>         at 
> org.apache.hadoop.ozone.container.keyvalue.impl.ChunkManagerImpl.readChunk(ChunkManagerImpl.java:197)
>         at 
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handleReadChunk(KeyValueHandler.java:542)
>         at 
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(KeyValueHandler.java:174)
>         at 
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:178)
>         at 
> org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:290)
>         at 
> org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.readStateMachineData(ContainerStateMachine.java:404)
>         at 
> org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$readStateMachineData$6(ContainerStateMachine.java:462)
>         at 
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         ... 1 more
> {code}
> This happens on the Ratis leader when the stateMachineData for an entry is 
> not present in Ratis's cached log segments and a readStateMachineData 
> request arrives while the corresponding writeStateMachineData has not yet 
> completed. The proposed approach is to cache the stateMachineData inside 
> ContainerStateMachine rather than inside Ratis.
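
A minimal sketch of the caching idea described above, independent of the actual patch: keep the write-chunk payload in an in-memory map keyed by Raft log index when writeStateMachineData starts, serve readStateMachineData from that map, and evict the entry once the transaction is applied, so the read path never takes a second file lock on a chunk that is still being written. The class and method names below (StateMachineDataCache, put/get/remove) are illustrative assumptions, not the real ContainerStateMachine API.

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Illustrative cache for stateMachineData keyed by Raft log index.
 * writeStateMachineData populates the cache before the async disk write;
 * readStateMachineData for a not-yet-applied entry is served from memory,
 * avoiding an overlapping lock on the chunk file.
 */
public class StateMachineDataCache {
  private final ConcurrentMap<Long, byte[]> cache = new ConcurrentHashMap<>();

  /** Called from writeStateMachineData before the chunk is written to disk. */
  public void put(long logIndex, byte[] stateMachineData) {
    cache.put(logIndex, stateMachineData);
  }

  /** Called from readStateMachineData; a null result means fall back to disk. */
  public byte[] get(long logIndex) {
    return cache.get(logIndex);
  }

  /** Called from applyTransaction once the entry has been applied. */
  public void remove(long logIndex) {
    cache.remove(logIndex);
  }
}
{code}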



