[
https://issues.apache.org/jira/browse/HADOOP-18706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715856#comment-17715856
]
ASF GitHub Bot commented on HADOOP-18706:
-----------------------------------------
steveloughran commented on code in PR #5563:
URL: https://github.com/apache/hadoop/pull/5563#discussion_r1175384535
##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ABlockOutputArray.java:
##########
@@ -79,6 +80,42 @@ public void testRegularUpload() throws IOException {
verifyUpload("regular", 1024);
}
+ /**
+ * Test that the DiskBlock's local file doesn't result in error when the S3 key exceeds the max
+ * char limit of the local file system. Currently
+ * {@link java.io.File#createTempFile(String, String, File)} is being relied on to handle the
+ * truncation.
+ * @throws IOException
+ */
+ @Test
+ public void testDiskBlockCreate() throws IOException {
+ S3ADataBlocks.BlockFactory diskBlockFactory =
Review Comment:
Use try-with-resources here, even though I doubt this is at risk of leaking anything.
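For illustration, a minimal sketch of the try-with-resources pattern being asked for. `DemoFactory` is a hypothetical stand-in, not the real `S3ADataBlocks.BlockFactory`; the only assumption is that the factory is closeable.

```java
import java.io.Closeable;

final class TryWithResourcesSketch {
  /** Hypothetical closeable factory; stands in for whatever the test creates. */
  static final class DemoFactory implements Closeable {
    boolean closed = false;

    String create(String key) {
      return "block-for-" + key;
    }

    @Override
    public void close() {
      // Narrowed signature: this stand-in never throws on close.
      closed = true;
    }
  }

  /** Kept only so the sketch can show that close() really ran. */
  static DemoFactory lastFactory;

  static String run() {
    // The factory is closed when the block exits, even if create() throws.
    try (DemoFactory factory = new DemoFactory()) {
      lastFactory = factory;
      return factory.create("key");
    }
  }

  public static void main(String[] args) {
    System.out.println(run());
    System.out.println("closed=" + lastFactory.closed);
  }
}
```

In the test under review the same shape would wrap the factory (and possibly the data block), so cleanup no longer depends on the test reaching its last line.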
##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ABlockOutputArray.java:
##########
@@ -79,6 +80,42 @@ public void testRegularUpload() throws IOException {
verifyUpload("regular", 1024);
}
+ /**
+ * Test that the DiskBlock's local file doesn't result in error when the S3 key exceeds the max
+ * char limit of the local file system. Currently
+ * {@link java.io.File#createTempFile(String, String, File)} is being relied on to handle the
+ * truncation.
+ * @throws IOException
+ */
+ @Test
+ public void testDiskBlockCreate() throws IOException {
+ S3ADataBlocks.BlockFactory diskBlockFactory =
+ new S3ADataBlocks.DiskBlockFactory(getFileSystem());
+ String s3Key = // 1024 char
+ "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+ "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+ "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+ "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+ "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+ "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+ "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+ "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+ "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+ "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+ "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+ "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+ "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+ "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+ "very_long_s3_key";
+ S3ADataBlocks.DataBlock dataBlock = diskBlockFactory.create("spanId", s3Key, 1,
+ getFileSystem().getDefaultBlockSize(), null);
+ LOG.info(dataBlock.toString()); // block file name and location can be viewed in failsafe-report
+
+
+ // delete the block file
+ dataBlock.innerClose();
Review Comment:
Are there any more asserts here, e.g. that the file exists afterwards?
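As a rough sketch of the kind of check meant here, the following uses only `java.nio.file.Files` on an ordinary temp file. The class and method names are hypothetical, and the real test would use the project's own assertion helpers against the block's actual file rather than this stand-in.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class BlockFileAssertSketch {
  /**
   * Creates a temp file, checks it exists, deletes it, and checks it is gone.
   * Returns true only if both checks hold.
   */
  static boolean createCheckDelete() {
    try {
      // Hypothetical prefix; the real block file name comes from the factory.
      Path blockFile = Files.createTempFile("s3ablock-0001-", ".tmp");
      boolean existedBefore = Files.exists(blockFile);
      Files.delete(blockFile); // stands in for closing the block
      boolean goneAfter = Files.notExists(blockFile);
      return existedBefore && goneAfter;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("checks passed: " + createCheckDelete());
  }
}
```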
> The temporary files for disk-block buffer aren't unique enough to recover
> partial uploads.
> -------------------------------------------------------------------------------------------
>
> Key: HADOOP-18706
> URL: https://issues.apache.org/jira/browse/HADOOP-18706
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/s3
> Reporter: Chris Bevard
> Priority: Minor
> Labels: pull-request-available
>
> If an application crashes during an S3ABlockOutputStream upload, it's
> possible to complete the upload if fast.upload.buffer is set to disk by
> uploading the s3ablock file with putObject as the final part of the multipart
> upload. If the application has multiple uploads running in parallel though
> and they're on the same part number when the application fails, then there is
> no way to determine which file belongs to which object, and recovery of
> either upload is impossible.
> If the temporary file name for disk buffering included the s3 key, then every
> partial upload would be recoverable.
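One possible shape for such a file name, as a sketch only: the prefix format, the `MAX_KEY_CHARS` limit, and the escaping rule below are all assumptions made for illustration and are not taken from the PR.

```java
import java.io.File;
import java.io.IOException;
import java.util.Locale;

final class BlockFileNameSketch {
  /** Hypothetical cap on how much of the key goes into the file name. */
  static final int MAX_KEY_CHARS = 64;

  /** Builds a temp-file prefix from the part number and an escaped S3 key. */
  static String prefixFor(String s3Key, long partNumber) {
    // Replace anything that is unsafe in a local file name, including '/'.
    String safe = s3Key.replaceAll("[^A-Za-z0-9._-]", "_");
    // Keep the tail of long keys so the name stays within local limits.
    if (safe.length() > MAX_KEY_CHARS) {
      safe = safe.substring(safe.length() - MAX_KEY_CHARS);
    }
    return String.format(Locale.ROOT, "s3ablock-%04d-%s-", partNumber, safe);
  }

  /** Creates the buffer file; createTempFile appends a unique component. */
  static File createBlockFile(String s3Key, long partNumber, File dir)
      throws IOException {
    return File.createTempFile(prefixFor(s3Key, partNumber), ".tmp", dir);
  }

  public static void main(String[] args) throws IOException {
    File f = createBlockFile("logs/2023/a.csv", 1, null);
    System.out.println(f.getName());
    f.delete();
  }
}
```

With this scheme, two uploads on the same part number produce distinguishable files as long as their escaped, truncated keys differ. Keys that share the same last `MAX_KEY_CHARS` characters would still collide in the prefix, so a real implementation would need to decide how to handle that case.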