[ 
https://issues.apache.org/jira/browse/HDFS-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195980#comment-14195980
 ] 

Steve Loughran commented on HDFS-6803:
--------------------------------------

Maybe we should say:


MUST be consistent with serialized operations
SHOULD be concurrent

What we really want is for two parallel operations to always produce the right 
data; concurrency boosts throughput, but is not guaranteed.
{code}
 read(pos1, dest, len) ->
  dest[0..len-1] = [data(FS, path, pos1), data(FS, path, pos1+1) ... data(FS, 
path, pos1+len-1)]
{code}
and {{read(pos2, dest2, len2)}} does the same for pos2..pos2+len2-1.
 
This defines the isolation; the SHOULD/MAY sets the policy.
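
To make the isolation requirement concrete, here is a minimal sketch of how it could be checked. It does not use DFSInputStream: it assumes java.nio's FileChannel positional read as a stand-in for pread, and the class name, the data() helper and the offsets are all hypothetical, chosen only for illustration. Two threads issue positional reads against one shared channel, and each result is compared with what the definition above says it must contain.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PreadIsolationSketch {

    // Hypothetical stand-in for data(FS, path, pos): the byte stored at offset pos.
    static byte data(long pos) {
        return (byte) (pos % 251);
    }

    // Positional read: fills dest[0..len-1] starting at pos, without using a shared cursor.
    static byte[] pread(FileChannel ch, long pos, int len) throws IOException {
        ByteBuffer dest = ByteBuffer.allocate(len);
        while (dest.hasRemaining()) {
            int n = ch.read(dest, pos + dest.position());
            if (n < 0) {
                throw new IOException("unexpected EOF at " + (pos + dest.position()));
            }
        }
        return dest.array();
    }

    // What the definition says a read of len bytes at pos must return.
    static byte[] expected(long pos, int len) {
        byte[] out = new byte[len];
        for (int i = 0; i < len; i++) {
            out[i] = data(pos + i);
        }
        return out;
    }

    // Runs two positional reads in parallel and reports whether both match the definition.
    static boolean concurrentReadsAreIsolated() throws Exception {
        Path path = Files.createTempFile("pread-sketch", ".bin");
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Files.write(path, expected(0, 1 << 16));
            try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
                long pos1 = 100, pos2 = 40_000;
                int len = 4096, len2 = 8192;
                Future<byte[]> r1 = pool.submit(() -> pread(ch, pos1, len));
                Future<byte[]> r2 = pool.submit(() -> pread(ch, pos2, len2));
                return Arrays.equals(r1.get(), expected(pos1, len))
                        && Arrays.equals(r2.get(), expected(pos2, len2));
            }
        } finally {
            pool.shutdownNow();
            Files.deleteIfExists(path);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("isolated = " + concurrentReadsAreIsolated());
    }
}
```

The check only covers the MUST half: it asserts that each caller sees the bytes for its own range. Whether the two reads actually overlap in time is left to the implementation, which is the SHOULD half.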



> Documenting DFSClient#DFSInputStream expectations reading and preading in 
> concurrent context
> --------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6803
>                 URL: https://issues.apache.org/jira/browse/HDFS-6803
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>    Affects Versions: 2.4.1
>            Reporter: stack
>         Attachments: 9117.md.txt, DocumentingDFSClientDFSInputStream (1).pdf, 
> DocumentingDFSClientDFSInputStream.v2.pdf
>
>
> Reviews of the patch posted to the parent task suggest that we be more explicit 
> about how DFSIS is expected to behave when being read by contending threads. 
> It is also suggested that presumptions made internally be made explicit by 
> documenting expectations.
> Before we put up a patch, we've made a document of assertions we'd like to 
> make into tenets of DFSInputStream. If there is agreement, we'll attach to 
> this issue a patch that weaves the assumptions into DFSIS as javadoc and 
> class comments. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
