[ https://issues.apache.org/jira/browse/HADOOP-18706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17716847#comment-17716847 ]
ASF GitHub Bot commented on HADOOP-18706:
-----------------------------------------
cbevard1 commented on code in PR #5563:
URL: https://github.com/apache/hadoop/pull/5563#discussion_r1178227622
##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ABlockOutputArray.java:
##########
@@ -79,6 +80,42 @@ public void testRegularUpload() throws IOException {
     verifyUpload("regular", 1024);
   }
+  /**
+   * Test that the DiskBlock's local file doesn't result in an error when
+   * the S3 key exceeds the max char limit of the local file system.
+   * Currently {@link java.io.File#createTempFile(String, String, File)} is
+   * relied on to handle the truncation.
+   * @throws IOException on any failure
+   */
+  @Test
+  public void testDiskBlockCreate() throws IOException {
+    S3ADataBlocks.BlockFactory diskBlockFactory =
+        new S3ADataBlocks.DiskBlockFactory(getFileSystem());
+    String s3Key = // 1024 chars
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key";
+    S3ADataBlocks.DataBlock dataBlock = diskBlockFactory.create("spanId",
+        s3Key, 1, getFileSystem().getDefaultBlockSize(), null);
+    // the block file name and location can be viewed in the failsafe report
+    LOG.info(dataBlock.toString());
+
+    // delete the block file
+    dataBlock.innerClose();
Review Comment:
I added an assertion to make sure the tmp file is created.
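
A minimal sketch of what that assertion could look like, continuing the test
body above. getFile() is a hypothetical accessor, not a method the
S3ADataBlocks.DataBlock API is known to expose; the AssertJ Assertions class
is already used widely in the hadoop-aws test suite.

    // verify the temporary block file was actually created before closing
    File blockFile = dataBlock.getFile(); // hypothetical accessor
    Assertions.assertThat(blockFile)
        .describedAs("temporary block file %s", blockFile)
        .exists()
        .isFile();
    // the truncated name must fit the common 255-char file-name limit
    Assertions.assertThat(blockFile.getName())
        .describedAs("block file name length")
        .hasSizeLessThanOrEqualTo(255);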
> The temporary files for disk-block buffer aren't unique enough to recover
> partial uploads.
> -------------------------------------------------------------------------------------------
>
> Key: HADOOP-18706
> URL: https://issues.apache.org/jira/browse/HADOOP-18706
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/s3
> Reporter: Chris Bevard
> Priority: Minor
> Labels: pull-request-available
>
> If an application crashes during an S3ABlockOutputStream upload, it's
> possible to complete the upload when fast.upload.buffer is set to disk by
> uploading the s3ablock file with putObject as the final part of the
> multipart upload. If the application has multiple uploads running in
> parallel when it fails, though, and they're on the same part number, then
> there is no way to determine which buffer file belongs to which object,
> and recovery of either upload is impossible.
> If the temporary file name for disk buffering included the S3 key, then
> every partial upload would be recoverable.
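
For illustration, a minimal sketch (not the PR's actual implementation) of how
a disk-buffer file name could embed the S3 key while staying within common
file-system name limits. BlockFileNames, createBlockFile and the 200-character
cap are all hypothetical:

    import java.io.File;
    import java.io.IOException;

    final class BlockFileNames {
      private BlockFileNames() {
      }

      // Sketch only: derive the temp-file prefix from the S3 key so a
      // partial upload can be matched back to its object during recovery.
      static File createBlockFile(String s3Key, long partNumber, File dir)
          throws IOException {
        // replace path separators so the key is a legal file-name component
        String sanitized = s3Key.replace('/', '_');
        // bound the prefix so prefix + part number + random suffix stays
        // under the common 255-character file-name limit
        String prefix = sanitized.length() > 200
            ? sanitized.substring(0, 200)
            : sanitized;
        return File.createTempFile(prefix + "-part" + partNumber + "-",
            ".s3ablock", dir);
      }
    }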