[
https://issues.apache.org/jira/browse/HDFS-7259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175510#comment-14175510
]
Brandon Li commented on HDFS-7259:
----------------------------------
Uploaded a patch.
The basic idea is to provide a configurable property, "nfs.large.file.upload",
which is turned on by default. When it is enabled:
1. If the client asks to commit a non-sequential chunk of data, the NFS gateway
returns success in the hope that the client will send the prerequisite writes.
2. If the client asks to commit a sequential chunk (meaning it can be flushed to
HDFS), the NFS gateway returns a special error, NFS3ERR_JUKEBOX, indicating that
the client needs to retry. Meanwhile, the NFS gateway keeps flushing data to
HDFS and eventually does the sync.
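The two cases above can be sketched roughly as follows. This is only an
illustration of the decision, not the code in the patch: OpenFileState,
handle_commit, DEFERRED and the offset fields are hypothetical names, and
replying success for a range that is already flushed is an assumption. The
status values are the ones defined in RFC 1813.

```python
from dataclasses import dataclass

NFS3_OK = 0              # success status from RFC 1813
NFS3ERR_JUKEBOX = 10008  # "try again later" status from RFC 1813
DEFERRED = None          # hypothetical marker: response is queued, not sent yet


@dataclass
class OpenFileState:
    """Hypothetical per-file bookkeeping kept by the gateway."""
    flushed_offset: int   # bytes already flushed to HDFS
    contiguous_end: int   # end of staged data with no gap after flushed_offset


def handle_commit(state, commit_offset, large_file_upload=True):
    """Pick the reply for a COMMIT that covers data up to commit_offset."""
    if commit_offset <= state.flushed_offset:
        # Assumption: data already in HDFS can be acknowledged right away.
        return NFS3_OK
    if not large_file_upload:
        # Behavior described in the issue: queue the COMMIT and answer later.
        return DEFERRED
    if commit_offset > state.contiguous_end:
        # Case 1: non-sequential data, so some prerequisite writes are missing.
        # Reply success and hope the client sends them.
        return NFS3_OK
    # Case 2: sequential data that can be flushed. Ask the client to retry
    # while the gateway keeps flushing in the background.
    return NFS3ERR_JUKEBOX
```

For example, with 100 bytes flushed and staged data contiguous up to offset
300, a commit up to 200 gets NFS3ERR_JUKEBOX, while a commit up to 500 gets
success because there is a gap before it.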
The reason for making the client retry is that we want it to wait for the last
commit. Otherwise, the client thinks the file upload has finished (e.g., the cp
command returns success) while the NFS gateway could still be flushing staged
data to HDFS.
However, we don't know which commit is the last one, so we assume that a commit
after sequential writes may be the last.
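To illustrate why the retry makes the client wait, here is a small simulation
of the client side. Real NFS clients handle NFS3ERR_JUKEBOX inside the kernel;
commit_with_retry, send_commit, max_attempts and delay_seconds are hypothetical
names used only for this sketch.

```python
import time

NFS3_OK = 0              # success status from RFC 1813
NFS3ERR_JUKEBOX = 10008  # "try again later" status from RFC 1813


def commit_with_retry(send_commit, max_attempts=5, delay_seconds=0.0):
    """Call send_commit() until it stops returning NFS3ERR_JUKEBOX.

    Returns (final_status, attempts_used). The caller (e.g. a copy command)
    does not see success until the gateway has stopped asking for a retry.
    """
    for attempt in range(1, max_attempts + 1):
        status = send_commit()
        if status != NFS3ERR_JUKEBOX:
            return status, attempt
        time.sleep(delay_seconds)  # back off before retrying the COMMIT
    return NFS3ERR_JUKEBOX, max_attempts


# Simulated gateway: still flushing for two COMMITs, then done.
replies = iter([NFS3ERR_JUKEBOX, NFS3ERR_JUKEBOX, NFS3_OK])
result = commit_with_retry(lambda: next(replies))
```

In this simulation the upload is reported as finished only on the third
COMMIT, after the gateway has had time to flush the staged data.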
> Unresponsive NFS mount point due to deferred COMMIT response
> ------------------------------------------------------------
>
> Key: HDFS-7259
> URL: https://issues.apache.org/jira/browse/HDFS-7259
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: nfs
> Affects Versions: 2.2.0
> Reporter: Brandon Li
> Assignee: Brandon Li
> Attachments: HDFS-7259.001.patch
>
>
> Since the gateway can't commit random writes, it caches the COMMIT requests in
> a queue and sends back a response only when the data can be committed or the
> stream times out (a failure in the latter case). This can cause problems in
> two patterns:
> (1) file upload failure
> (2) the mount dir is stuck on the same client, but other NFS clients can
> still access the NFS gateway.
> Error pattern (2) occurs because there are too many COMMIT requests pending,
> so the NFS client, having reached its pending-request limit, can't send any
> other requests (e.g., for "ls") to the NFS gateway.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)