[
https://issues.apache.org/jira/browse/HDFS-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vinayakumar B updated HDFS-3405:
--------------------------------
Attachment: HDFS-3405.patch
Hi All,
Sorry for the late response here.
Changes in the latest patch:
1. Verified uploading of big image files and confirmed that the timeout
(*dfs.image.transfer.timeout*) set while uploading the file is just a socket
timeout, not a timeout for the entire transfer. I confirmed this by uploading
a 1GB image file with a timeout of only 5 seconds. The upload succeeded even
though the whole transfer took ~5 minutes on a very slow network. So we can
reduce the default value of *dfs.image.transfer.timeout* to 60 seconds, which
is the default socket timeout in Hadoop.
2. I was hitting an OutOfMemoryError (OOME) while uploading 2GB files because
of the internal buffering of HTTP streaming. Even after increasing the heap
size up to 15GB, the upload still took a lot of time.
So I used {{connection.setChunkedStreamingMode(64 * 1024);}} with one extra
parameter, {{File-Length}}, instead of {{Content-Length}} to indicate the file
length for verification. Since {{Content-Length}} is an integer, it will not
be correct for files larger than 2GB.
After that, I verified that uploads of 2GB+ files succeed.
3. Addressed all previous review comments.
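
To make points 1 and 2 concrete, here is a minimal, self-contained sketch of
a chunked upload that carries the length in a separate {{File-Length}} header.
It is not the code from the patch: the class name, the {{/upload}} path, the
use of PUT, and the small local receiver built on the JDK's
{{com.sun.net.httpserver}} are all assumptions made for illustration only.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;

// Hypothetical sketch, not the HDFS-3405 patch itself.
final class ChunkedUploadSketch {
    // Header name taken from the comment above; everything else is assumed.
    static final String FILE_LENGTH_HEADER = "File-Length";
    static final int CHUNK_SIZE = 64 * 1024;

    // Streams 'data' to 'target' in 64KB chunks and returns the HTTP status.
    static int upload(URL target, InputStream data, long declaredLength,
                      int timeoutMs) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) target.openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);
        // These bound a single blocking connect/read on the socket,
        // not the duration of the whole transfer.
        conn.setConnectTimeout(timeoutMs);
        conn.setReadTimeout(timeoutMs);
        // Without a streaming mode the request body is buffered in memory.
        conn.setChunkedStreamingMode(CHUNK_SIZE);
        // The length travels as a long in a custom header for verification.
        conn.setRequestProperty(FILE_LENGTH_HEADER, Long.toString(declaredLength));
        try (OutputStream out = conn.getOutputStream()) {
            byte[] buf = new byte[CHUNK_SIZE];
            int n;
            while ((n = data.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return conn.getResponseCode();
    }

    // Assumed receiver: counts body bytes and compares with File-Length.
    static HttpServer startReceiver() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/upload", exchange -> {
            long expected = Long.parseLong(
                exchange.getRequestHeaders().getFirst(FILE_LENGTH_HEADER));
            long received = 0;
            byte[] buf = new byte[CHUNK_SIZE];
            try (InputStream in = exchange.getRequestBody()) {
                int n;
                while ((n = in.read(buf)) != -1) {
                    received += n;
                }
            }
            // 200 if the byte count matches the declared length, else 400.
            exchange.sendResponseHeaders(received == expected ? 200 : 400, -1);
            exchange.close();
        });
        server.start();
        return server;
    }

    // Uploads 'payloadBytes' zero bytes while declaring 'declaredLength'.
    static int roundTrip(int payloadBytes, long declaredLength) throws IOException {
        HttpServer server = startReceiver();
        try {
            URL url = URI.create("http://127.0.0.1:"
                + server.getAddress().getPort() + "/upload").toURL();
            return upload(url, new ByteArrayInputStream(new byte[payloadBytes]),
                          declaredLength, 5000);
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip(200_000, 200_000));
    }
}
```

In this sketch the receiver rejects an upload whose byte count differs from
the declared {{File-Length}}, which is the kind of verification described in
point 2. Because the length is carried as a long, values above 2GB fit.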
> Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged
> fsimages
> ------------------------------------------------------------------------------------
>
> Key: HDFS-3405
> URL: https://issues.apache.org/jira/browse/HDFS-3405
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 1.0.0, 3.0.0, 2.0.5-alpha
> Reporter: Aaron T. Myers
> Assignee: Vinayakumar B
> Attachments: HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch,
> HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch,
> HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch,
> HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch,
> HDFS-3405.patch
>
>
> As Todd points out in [this
> comment|https://issues.apache.org/jira/browse/HDFS-3404?focusedCommentId=13272986&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13272986],
> the current scheme for a checkpointing daemon to upload a merged fsimage
> file to an NN is to issue an HTTP GET request to tell the target NN to issue
> another GET request back to the checkpointing daemon to retrieve the merged
> fsimage file. There's no fundamental reason the checkpointing daemon can't
> just use an HTTP POST or PUT to send back the merged fsimage file, rather
> than the double-GET scheme.
--
This message was sent by Atlassian JIRA
(v6.2#6252)