[ 
https://issues.apache.org/jira/browse/HADOOP-15961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16748045#comment-16748045
 ] 

Steve Loughran commented on HADOOP-15961:
-----------------------------------------

BTW, looking at this patch, I think the progress call could go in the inner 
loop:
{code}
...
UploadPartResult partResult = writeOperations.uploadPart(part);
offset += uploadPartSize;
parts.add(partResult.getPartETag());
progress.progress();   // HERE
}
{code}

That way, it'll be invoked after every part upload, i.e. every 32 or 64 MB. 
If the task created 4 GB of data, then without per-part progress calls you 
could still get a timeout just from the time the upload takes. A progress 
event per block eliminates this problem.
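
To make the suggestion concrete, here is a minimal standalone sketch of the per-part loop. It is not the code from the patch: the class name, the {{PartUploader}} interface and {{uploadInParts}} are hypothetical, and {{Progressable}} is redeclared locally as a stand-in for org.apache.hadoop.util.Progressable so that the example compiles without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

public class PartUploadProgressSketch {

    /** Local stand-in for org.apache.hadoop.util.Progressable (assumption: same single method). */
    interface Progressable {
        void progress();
    }

    /** Hypothetical uploader; the patch itself calls writeOperations.uploadPart(part). */
    interface PartUploader {
        String uploadPart(long offset, long size); // returns an etag for the part
    }

    /** Uploads a file of the given length part by part, reporting progress after each part. */
    static List<String> uploadInParts(long length, long uploadPartSize,
                                      PartUploader uploader, Progressable progress) {
        if (uploadPartSize <= 0) {
            throw new IllegalArgumentException("uploadPartSize must be positive");
        }
        List<String> parts = new ArrayList<>();
        long offset = 0;
        while (offset < length) {
            // The last part may be shorter than uploadPartSize.
            long size = Math.min(uploadPartSize, length - offset);
            parts.add(uploader.uploadPart(offset, size));
            offset += size;
            // One callback per part, so the task reports liveness throughout the upload.
            progress.progress();
        }
        return parts;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        int[] calls = {0};
        // 4 GB of data in 64 MB parts; the uploader here only fabricates an etag string.
        List<String> etags = uploadInParts(4096 * mb, 64 * mb,
                (offset, size) -> "etag-" + offset,
                () -> calls[0]++);
        System.out.println(etags.size() + " parts, " + calls[0] + " progress calls");
        // prints "64 parts, 64 progress calls"
    }
}
```

With 32 MB parts the same 4 GB would give 128 callbacks, so the longest gap between progress reports is the time taken by a single part rather than by the whole upload.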

> S3A committers: make sure there's regular progress() calls
> ----------------------------------------------------------
>
>                 Key: HADOOP-15961
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15961
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Steve Loughran
>            Assignee: lqjacklee
>            Priority: Minor
>         Attachments: HADOOP-15961-001.patch, HADOOP-15961-002.patch
>
>
> MAPREDUCE-7164 highlights how inside job/task commit more context.progress() 
> callbacks are needed, just for HDFS.
> The S3A committers should be reviewed similarly.
> At a glance:
> StagingCommitter.commitTaskInternal() is at risk if a task writes 
> enough data to the localfs that the upload takes longer than the timeout.
> It should call progress after every single file commit, or better: modify 
> {{uploadFileToPendingCommit}} to take a Progressable for progress callbacks 
> after every part upload.
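
The two options in the description differ only in how often the callback fires. The following is a rough sketch of that difference, not the committer's real code: the class and method names are hypothetical, {{Progressable}} is again a local stand-in for org.apache.hadoop.util.Progressable, and the uploads themselves are elided.

```java
import java.util.Arrays;
import java.util.List;

public class CommitProgressGranularity {

    /** Local stand-in for org.apache.hadoop.util.Progressable. */
    interface Progressable {
        void progress();
    }

    /** Number of parts needed for a file, rounding up for a short final part. */
    static long partsIn(long fileLength, long partSize) {
        return (fileLength + partSize - 1) / partSize;
    }

    /** Option 1: one callback after each file is committed. */
    static void commitPerFile(List<Long> fileLengths, Progressable progress) {
        for (long length : fileLengths) {
            // ... the whole file would be uploaded here ...
            progress.progress();
        }
    }

    /** Option 2: one callback after each part of each file. */
    static void commitPerPart(List<Long> fileLengths, long partSize, Progressable progress) {
        for (long length : fileLengths) {
            long parts = partsIn(length, partSize);
            for (long p = 0; p < parts; p++) {
                // ... a single part would be uploaded here ...
                progress.progress();
            }
        }
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Illustrative task output: one 4 GB file and one 10 MB file.
        List<Long> files = Arrays.asList(4096 * mb, 10 * mb);
        int[] perFile = {0};
        int[] perPart = {0};
        commitPerFile(files, () -> perFile[0]++);
        commitPerPart(files, 64 * mb, () -> perPart[0]++);
        System.out.println("per file: " + perFile[0] + ", per part: " + perPart[0]);
        // prints "per file: 2, per part: 65"
    }
}
```

With per-file callbacks the 4 GB file still produces a single report at the end of its upload, which is why the per-part variant is described as the better option.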



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

