[ https://issues.apache.org/jira/browse/HADOOP-14028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884305#comment-15884305 ]

Hudson commented on HADOOP-14028:
---------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11308 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11308/])
HADOOP-14028. S3A BlockOutputStreams doesn't delete temporary files in (stevel: rev dab00da19f25619ccc71c7f803a235b21766bf1e)
* (edit) hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ABlockOutputArray.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ABlockOutputStream.java
* (edit) hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ABlockOutputByteBuffer.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AUtils.java
* (edit) hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ABlockOutputDisk.java
* (edit) hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/scale/AbstractSTestS3AHugeFiles.java
* (edit) hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/S3ATestUtils.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInstrumentation.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
* (edit) hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestDataBlocks.java
* (edit) hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ADataBlocks.java


> S3A BlockOutputStreams doesn't delete temporary files in multipart uploads or 
> handle part upload failures
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-14028
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14028
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 2.8.0
>         Environment: JDK 8 + ORC 1.3.0 + hadoop-aws 3.0.0-alpha2
>            Reporter: Seth Fitzsimmons
>            Assignee: Steve Loughran
>            Priority: Critical
>             Fix For: 2.8.0
>
>         Attachments: HADOOP-14028-006.patch, HADOOP-14028-007.patch, 
> HADOOP-14028-branch-2-001.patch, HADOOP-14028-branch-2-008.patch, 
> HADOOP-14028-branch-2-009.patch, HADOOP-14028-branch-2.8-002.patch, 
> HADOOP-14028-branch-2.8-003.patch, HADOOP-14028-branch-2.8-004.patch, 
> HADOOP-14028-branch-2.8-005.patch, HADOOP-14028-branch-2.8-007.patch, 
> HADOOP-14028-branch-2.8-008.patch
>
>
> I have `fs.s3a.fast.upload` enabled with 3.0.0-alpha2 (it's exactly what I 
> was looking for after running into the same OOM problems), but I don't see it 
> cleaning up the disk-cached blocks.
> I'm generating a ~50GB file on an instance with ~6GB of disk space free when 
> the process starts. I expected the local copies of the blocks to be deleted 
> after those parts finished uploading, but I'm seeing more than 15 blocks in 
> /tmp (and none of them have been deleted thus far).
> I see that DiskBlock deletes temporary files when closed, but is it closed 
> after individual blocks have finished uploading or when the entire file has 
> been fully written to the FS (full upload completed, including all parts)?
> As a temporary workaround to avoid running out of space, I'm listing files, 
> sorting by atime, and deleting anything older than the first 20: `ls -ut | 
> tail -n +21 | xargs rm`
> Steve Loughran says:
> > They should be deleted as soon as the upload completes; the close() call 
> > that the AWS httpclient makes on the input stream triggers the deletion. 
> > Though there aren't tests for it, as I recall.
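
For reference, the disk-buffered "fast upload" path the reporter enables corresponds to configuration along these lines (a minimal sketch, assuming the s3a property names documented for Hadoop 2.8+; the reporter's exact settings are not shown in the issue):

    import org.apache.hadoop.conf.Configuration;

    public class S3AFastUploadConfig {
        public static Configuration create() {
            Configuration conf = new Configuration();
            // Enable the incremental block upload path.
            conf.setBoolean("fs.s3a.fast.upload", true);
            // "disk" buffers each block in a local temporary file before its
            // part upload; these are the files accumulating in /tmp.
            conf.set("fs.s3a.fast.upload.buffer", "disk");
            // Local directory for the buffered blocks.
            conf.set("fs.s3a.buffer.dir", "/tmp");
            return conf;
        }
    }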

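Steve's point about close() is the crux: deletion of a disk block's temporary file is tied to the input stream handed to the AWS httpclient, so cleanup only happens if and when that stream is actually closed. A minimal Java sketch of the delete-on-close pattern (a hypothetical class for illustration, not the actual S3ADataBlocks code):

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.IOException;

    // Hypothetical illustration of the delete-on-close pattern described
    // above, not the actual S3ADataBlocks implementation: the part upload
    // reads from this stream, and when the httpclient calls close(), the
    // backing temporary file is removed.
    class SelfDeletingFileInputStream extends FileInputStream {
        private final File file;

        SelfDeletingFileInputStream(File file) throws FileNotFoundException {
            super(file);
            this.file = file;
        }

        @Override
        public void close() throws IOException {
            try {
                super.close();
            } finally {
                // If close() is never reached (for example, a part upload
                // fails), the temporary file leaks: the behaviour this
                // issue reports.
                if (!file.delete()) {
                    file.deleteOnExit();
                }
            }
        }
    }

As the issue title says, the committed fix also covers the failure path, so temporary files are deleted even when a part upload does not complete.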


--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
