[ 
https://issues.apache.org/jira/browse/HDFS-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082110#comment-14082110
 ] 

Steve Loughran commented on HDFS-6803:
--------------------------------------

Stack, Liang, the HADOOP-9361 spec of the FS API tries to define what is going 
on, though the semantics of concurrent relative reads are vague. Some of the 
javadocs say "isolated operation"; a lot of the implementations just do {{seek, 
read, seek-back}}.
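
To make the {{seek, read, seek-back}} pattern concrete, here is a minimal 
sketch over an in-memory byte array. The class {{SeekReadSeekBackStream}} and 
all of its methods are hypothetical names invented for illustration; this is 
not the code of any Hadoop stream. It assumes every method synchronizes on the 
stream itself, which is what makes the positioned read look isolated to other 
callers, and also what makes it block them.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical in-memory stream, for illustration only; not a Hadoop class. */
final class SeekReadSeekBackStream {
    private final byte[] data;
    private long pos; // shared cursor used by relative reads

    SeekReadSeekBackStream(byte[] data) {
        this.data = data;
    }

    synchronized void seek(long newPos) {
        if (newPos < 0 || newPos > data.length) {
            throw new IllegalArgumentException("bad position: " + newPos);
        }
        pos = newPos;
    }

    synchronized long getPos() {
        return pos;
    }

    /** Relative read: reads at the cursor and advances it. Returns -1 at EOF. */
    synchronized int read(byte[] buf, int off, int len) {
        if (pos >= data.length) {
            return -1;
        }
        int n = (int) Math.min(len, data.length - pos);
        System.arraycopy(data, (int) pos, buf, off, n);
        pos += n;
        return n;
    }

    /**
     * Positioned read implemented as seek, read, seek-back. The whole sequence
     * holds the stream lock, so the cursor appears unchanged to other threads,
     * but every other read on this stream is blocked until it completes.
     */
    synchronized int read(long position, byte[] buf, int off, int len) {
        long oldPos = pos;
        try {
            seek(position);
            return read(buf, off, len);
        } finally {
            pos = oldPos; // seek back, even if the read threw
        }
    }

    public static void main(String[] args) {
        SeekReadSeekBackStream in = new SeekReadSeekBackStream(
            "0123456789".getBytes(StandardCharsets.US_ASCII));
        byte[] rel = new byte[3];
        in.read(rel, 0, 3);        // relative read moves the cursor to 3
        byte[] abs = new byte[2];
        in.read(7L, abs, 0, 2);    // positioned read at offset 7
        System.out.println(new String(abs, StandardCharsets.US_ASCII)
            + " pos=" + in.getPos()); // prints "78 pos=3"
    }
}
```

If any one of these methods dropped {{synchronized}}, a relative read on 
another thread could observe the cursor while it is temporarily parked at 
{{position}}, which is the kind of behaviour a spec has to either forbid or 
explicitly allow.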

# Any tenets on concurrency are things that should go there.
# I had put in "There is no requirement for the stream implementation to be 
thread-safe.", which was the conclusion I'd reached from looking at the code, 
especially the handling of {{read(offset)}} in most implementations. That is 
even though {{PositionedReadable.read()}} says "thread safe". That statement 
needs to be looked at.

I would propose that you add a new section to 
{{hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md}} which lays 
down more rigorously what the thread-safety requirements of all implementations 
are, with reviews of all the existing input streams to see that they conform to 
what is defined. This isn't something that can be easily tested, so a 
code review is all we have.

Then for HDFS the concurrency tenets can be defined to state that certain 
operations are non-blocking *as an optimisation*. That's because other 
filesystems may behave differently, and it's potentially dangerous to make 
assumptions about concurrency that don't hold everywhere, including future 
versions of HDFS.

> Documenting DFSClient#DFSInputStream expectations reading and preading in 
> concurrent context
> --------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6803
>                 URL: https://issues.apache.org/jira/browse/HDFS-6803
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>    Affects Versions: 2.4.1
>            Reporter: stack
>         Attachments: DocumentingDFSClientDFSInputStream (1).pdf
>
>
> Reviews of the patch posted to the parent task suggest that we be more 
> explicit about how DFSIS is expected to behave when being read by contending 
> threads. It is also suggested that presumptions made internally be made 
> explicit by documenting expectations.
> Before we put up a patch we've made a document of assertions we'd like to 
> make into tenets of DFSInputStream.  If there is agreement, we'll attach to 
> this issue a patch that weaves the assumptions into DFSIS as javadoc and 
> class comments. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
