[ https://issues.apache.org/jira/browse/HDDS-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917241#comment-16917241 ]
Hudson commented on HDDS-2026:
------------------------------
FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17191 (See
[https://builds.apache.org/job/Hadoop-trunk-Commit/17191/])
HDDS-2026. Overlapping chunk region cannot be read concurrently (aengineer: rev
0883ce102113cdc9527ab8aa548895a8418cb6bb)
* (edit) hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/ChunkUtils.java
* (add) hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/helpers/TestChunkUtils.java
> Overlapping chunk region cannot be read concurrently
> ----------------------------------------------------
>
> Key: HDDS-2026
> URL: https://issues.apache.org/jira/browse/HDDS-2026
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Datanode
> Reporter: Doroszlai, Attila
> Assignee: Doroszlai, Attila
> Priority: Critical
> Labels: pull-request-available
> Fix For: 0.5.0
>
> Attachments: HDDS-2026-repro.patch, changes.diff,
> first-cut-proposed.diff
>
> Time Spent: 2h
> Remaining Estimate: 0h
>
> Concurrent requests to datanode for the same chunk may result in the
> following exception in datanode:
> {code}
> java.nio.channels.OverlappingFileLockException
> at java.base/sun.nio.ch.FileLockTable.checkList(FileLockTable.java:229)
> at java.base/sun.nio.ch.FileLockTable.add(FileLockTable.java:123)
> at java.base/sun.nio.ch.AsynchronousFileChannelImpl.addToFileLockTable(AsynchronousFileChannelImpl.java:178)
> at java.base/sun.nio.ch.SimpleAsynchronousFileChannelImpl.implLock(SimpleAsynchronousFileChannelImpl.java:185)
> at java.base/sun.nio.ch.AsynchronousFileChannelImpl.lock(AsynchronousFileChannelImpl.java:118)
> at org.apache.hadoop.ozone.container.keyvalue.helpers.ChunkUtils.readData(ChunkUtils.java:175)
> at org.apache.hadoop.ozone.container.keyvalue.impl.ChunkManagerImpl.readChunk(ChunkManagerImpl.java:213)
> at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handleReadChunk(KeyValueHandler.java:574)
> at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(KeyValueHandler.java:195)
> at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:271)
> at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:148)
> at org.apache.hadoop.ozone.container.common.transport.server.GrpcXceiverService$1.onNext(GrpcXceiverService.java:73)
> at org.apache.hadoop.ozone.container.common.transport.server.GrpcXceiverService$1.onNext(GrpcXceiverService.java:61)
> {code}
> It seems this is covered by retry logic, as key read is eventually successful
> at client side.
> The problem is that:
> bq. File locks are held on behalf of the entire Java virtual machine. They
> are not suitable for controlling access to a file by multiple threads within
> the same virtual machine.
> ([source|https://docs.oracle.com/javase/8/docs/api/java/nio/channels/FileLock.html])
> code ref:
> [{{ChunkUtils.readData}}|https://github.com/apache/hadoop/blob/c92de8209d1c7da9e7ce607abeecb777c4a52c6a/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/ChunkUtils.java#L175]
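
The behaviour described above can be reproduced outside Ozone. The following is a minimal sketch, not the change committed to ChunkUtils: the class name OverlappingLockDemo, the methods secondSharedLockRejected and readGuarded, and the single CHUNK_LOCK are all hypothetical names chosen for illustration. It assumes a default JVM configuration, where overlapping-lock checks apply across channels in the same virtual machine. The first method requests two shared locks on overlapping regions of one file through two channels; the second shows one possible alternative, guarding reads with an in-JVM ReadWriteLock so that concurrent readers do not conflict.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class OverlappingLockDemo {

  // Hypothetical in-JVM guard. A real datanode would likely need one lock
  // per chunk file rather than a single global lock.
  private static final ReadWriteLock CHUNK_LOCK = new ReentrantReadWriteLock();

  /**
   * Holds a shared lock on bytes 0..7 and then asks for a shared lock on
   * bytes 4..11 through a second channel in the same JVM.
   * Returns true if the second request is rejected.
   */
  static boolean secondSharedLockRejected(Path file) throws IOException {
    try (FileChannel first = FileChannel.open(file, StandardOpenOption.READ);
         FileChannel second = FileChannel.open(file, StandardOpenOption.READ);
         FileLock held = first.lock(0, 8, true)) {
      try (FileLock ignored = second.tryLock(4, 8, true)) {
        return false; // the overlapping request was not rejected
      } catch (OverlappingFileLockException e) {
        return true; // even shared locks may not overlap within one JVM
      }
    }
  }

  /** Reads up to length bytes at offset while holding the in-JVM read lock. */
  static byte[] readGuarded(Path file, long offset, int length)
      throws IOException {
    CHUNK_LOCK.readLock().lock();
    try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
      ByteBuffer buffer = ByteBuffer.allocate(length);
      while (buffer.hasRemaining()
          && channel.read(buffer, offset + buffer.position()) >= 0) {
        // keep reading until the buffer is full or end of file is reached
      }
      return Arrays.copyOf(buffer.array(), buffer.position());
    } finally {
      CHUNK_LOCK.readLock().unlock();
    }
  }

  public static void main(String[] args) throws Exception {
    Path file = Files.createTempFile("chunk", ".data");
    try {
      Files.write(file, new byte[16]);
      System.out.println("second lock rejected: "
          + secondSharedLockRejected(file));
      System.out.println("guarded read length: "
          + readGuarded(file, 4, 8).length);
    } finally {
      Files.deleteIfExists(file);
    }
  }
}
```

A read lock of this kind only coordinates threads inside one JVM; it does not protect the file from other processes, which is the case FileLock is designed for.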
--
This message was sent by Atlassian Jira
(v8.3.2#803003)