[ 
https://issues.apache.org/jira/browse/HDFS-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-9092:
--------------------------------
    Attachment: HDFS-9092.002.patch

Thanks a lot [~brandonli]!

Trunk has changed so that the patch no longer compiles; I am uploading rev 
002 to address that. Rev 002 also removes the extra spaces reported by the 
last Jenkins job.

I originally planned to get this into 2.6.2; however, both 2.7 and 2.6 are 
missing some changes, so the patch cannot be applied cleanly to them. I'm 
targeting this change to 2.8 for now.


> Nfs silently drops overlapping write requests and causes data copying to fail
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-9092
>                 URL: https://issues.apache.org/jira/browse/HDFS-9092
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: nfs
>    Affects Versions: 2.7.1
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-9092.001.patch, HDFS-9092.002.patch
>
>
> When the 'sync' option is NOT used, NFS writes may produce the following warning:
> org.apache.hadoop.hdfs.nfs.nfs3.OpenFileCtx: Got an overlapping write 
> (1248751616, 1249677312), nextOffset=1248752400. Silently drop it now
> and the size of the data copied via NFS stays at 1248752400.
> What happens is:
> 1. The write requests from the client are sent asynchronously.
> 2. The NFS gateway has a handler that processes each incoming request by 
> creating an internal write request structure and putting it into a cache.
> 3. In parallel, a separate thread in the NFS gateway takes requests out of 
> the cache and writes the data to HDFS.
> The current offset is how much data has been written by the write thread in 
> step 3. The detection of overlapping write requests happens in step 2, but it 
> only checks each write request against the current offset and trims the 
> request if necessary. Because the write requests are sent asynchronously, if 
> two requests are both beyond the current offset and they overlap, the overlap 
> is not detected and both are put into the cache. This causes the symptom 
> reported in this case at step 3.
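
The two checks described above can be illustrated with a minimal sketch. This is not the code in OpenFileCtx and not the attached patch; the class PendingWrites and its methods are hypothetical names chosen for illustration, and the sketch assumes half-open byte ranges [start, end). It contrasts the existing check against the current offset with an additional check against requests that are already queued:

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; names and structure are assumptions,
// not the NFS gateway's real data structures.
class PendingWrites {
    // Queued write ranges: key = start offset, value = exclusive end offset.
    // Invariant: queued ranges are non-empty and never overlap each other.
    private final TreeMap<Long, Long> pending = new TreeMap<>();
    // Offset up to which the writer thread has already written to HDFS.
    private final long nextOffset;

    PendingWrites(long nextOffset) {
        this.nextOffset = nextOffset;
    }

    // The check described in the report: the request straddles the current offset.
    boolean overlapsWritten(long start, long end) {
        return start < nextOffset && end > nextOffset;
    }

    // The missing check: the request overlaps a range that is already queued.
    boolean overlapsPending(long start, long end) {
        // Nearest queued range starting at or before 'start'.
        Map.Entry<Long, Long> before = pending.floorEntry(start);
        if (before != null && before.getValue() > start) {
            return true;
        }
        // Nearest queued range starting at or after 'start'.
        Long after = pending.ceilingKey(start);
        return after != null && after < end;
    }

    // Queue the range only if it is non-empty and overlaps nothing queued.
    boolean add(long start, long end) {
        if (start >= end || overlapsPending(start, end)) {
            return false;
        }
        pending.put(start, end);
        return true;
    }
}
```

With the values from the warning above, new PendingWrites(1248752400L).overlapsWritten(1248751616L, 1249677312L) returns true, which is the case the gateway already detects. Two requests that both start beyond the current offset and overlap each other are only caught by something like overlapsPending. What to do with a detected overlap (reject, trim, or merge) is a separate design decision that this sketch does not address.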



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
