[ 
https://issues.apache.org/jira/browse/HDDS-2026?focusedWorklogId=301784&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301784
 ]

ASF GitHub Bot logged work on HDDS-2026:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/Aug/19 08:37
            Start Date: 27/Aug/19 08:37
    Worklog Time Spent: 10m 
      Work Description: bshashikant commented on issue #1349: HDDS-2026. 
Overlapping chunk region cannot be read concurrently
URL: https://github.com/apache/hadoop/pull/1349#issuecomment-525200958
 
 
   Thanks @adoroszlai for working on this. The locking logic seems good to me.
   But in the Ozone world, a chunk is immutable once written, and hence we might
   not need the lock at all while reading a chunk.
   
   @anuengineer @elek @mukul1987 what do you think?
   
   There is also a race condition in the system: a chunk might get deleted by
   the BlockDeleting service in the Datanode while readChunk is in progress.
   That is where I think we need synchronization at the container level (not
   precisely at the chunk level) for closed containers, but this problem is
   beyond the scope of this jira.
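
To make the container-level idea above concrete, here is a minimal sketch, not the Ozone implementation. `ContainerGuard`, `readChunk` and `deleteChunk` are hypothetical names chosen for illustration. It assumes one in-JVM read/write lock per closed container: any number of readers may proceed together, while a deleter waits for them and excludes new ones.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical per-container guard; one instance would exist per container. */
public class ContainerGuard {
  // In-JVM lock: unlike java.nio FileLock, it coordinates threads of one JVM.
  private final ReadWriteLock lock = new ReentrantReadWriteLock();

  /** Reads a whole chunk file; concurrent readers do not block each other. */
  public byte[] readChunk(Path chunkFile) throws IOException {
    lock.readLock().lock();
    try {
      return Files.readAllBytes(chunkFile);
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Deletes a chunk file; waits until no read is in progress. */
  public boolean deleteChunk(Path chunkFile) throws IOException {
    lock.writeLock().lock();
    try {
      return Files.deleteIfExists(chunkFile);
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

Since chunks are immutable once written, the read lock here only protects against deletion, not against concurrent writes. Whether one lock per container is granular enough is a design question this sketch does not answer.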
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 301784)
    Time Spent: 1.5h  (was: 1h 20m)

> Overlapping chunk region cannot be read concurrently
> ----------------------------------------------------
>
>                 Key: HDDS-2026
>                 URL: https://issues.apache.org/jira/browse/HDDS-2026
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>            Reporter: Doroszlai, Attila
>            Assignee: Doroszlai, Attila
>            Priority: Critical
>              Labels: pull-request-available
>         Attachments: HDDS-2026-repro.patch, changes.diff, 
> first-cut-proposed.diff
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Concurrent requests to datanode for the same chunk may result in the 
> following exception in datanode:
> {code}
> java.nio.channels.OverlappingFileLockException
>    at java.base/sun.nio.ch.FileLockTable.checkList(FileLockTable.java:229)
>    at java.base/sun.nio.ch.FileLockTable.add(FileLockTable.java:123)
>    at java.base/sun.nio.ch.AsynchronousFileChannelImpl.addToFileLockTable(AsynchronousFileChannelImpl.java:178)
>    at java.base/sun.nio.ch.SimpleAsynchronousFileChannelImpl.implLock(SimpleAsynchronousFileChannelImpl.java:185)
>    at java.base/sun.nio.ch.AsynchronousFileChannelImpl.lock(AsynchronousFileChannelImpl.java:118)
>    at org.apache.hadoop.ozone.container.keyvalue.helpers.ChunkUtils.readData(ChunkUtils.java:175)
>    at org.apache.hadoop.ozone.container.keyvalue.impl.ChunkManagerImpl.readChunk(ChunkManagerImpl.java:213)
>    at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handleReadChunk(KeyValueHandler.java:574)
>    at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(KeyValueHandler.java:195)
>    at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:271)
>    at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:148)
>    at org.apache.hadoop.ozone.container.common.transport.server.GrpcXceiverService$1.onNext(GrpcXceiverService.java:73)
>    at org.apache.hadoop.ozone.container.common.transport.server.GrpcXceiverService$1.onNext(GrpcXceiverService.java:61)
> {code}
> It seems this is covered by retry logic, as key read is eventually successful 
> at client side.
> The problem is that:
> bq. File locks are held on behalf of the entire Java virtual machine. They 
> are not suitable for controlling access to a file by multiple threads within 
> the same virtual machine. 
> ([source|https://docs.oracle.com/javase/8/docs/api/java/nio/channels/FileLock.html])
> code ref: 
> [{{ChunkUtils.readData}}|https://github.com/apache/hadoop/blob/c92de8209d1c7da9e7ce607abeecb777c4a52c6a/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/ChunkUtils.java#L175]
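
The JVM-wide behaviour quoted above can be reproduced outside Ozone with a small sketch. It is an illustration under assumptions: it uses a plain `FileChannel` rather than the `AsynchronousFileChannel` seen in the stack trace, and `OverlappingLockDemo` and `secondLockRejected` are made-up names. Two channels in the same JVM request shared locks on overlapping byte ranges of one file. Per the `FileChannel.lock` documentation, the second request should fail with `OverlappingFileLockException` even though both locks are shared.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class OverlappingLockDemo {

  /** Returns true if a second overlapping lock in the same JVM is rejected. */
  static boolean secondLockRejected(Path file) throws IOException {
    try (FileChannel first = FileChannel.open(file, StandardOpenOption.READ);
         FileChannel second = FileChannel.open(file, StandardOpenOption.READ)) {
      // Shared lock on bytes [0, 100) through the first channel.
      FileLock held = first.lock(0, 100, true);
      try {
        // Shared lock on bytes [50, 150) through a different channel.
        second.lock(50, 100, true).release();
        return false;
      } catch (OverlappingFileLockException e) {
        return true;
      } finally {
        held.release();
      }
    }
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("chunk", ".data");
    try {
      Files.write(file, new byte[200]);
      System.out.println("second lock rejected: " + secondLockRejected(file));
    } finally {
      Files.delete(file);
    }
  }
}
```

The point of the demo is that file locks guard against other processes, not other threads. Coordinating threads within one datanode would need an in-JVM primitive, or no lock at all if chunk data never changes after it is written.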



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
