[ https://issues.apache.org/jira/browse/HADOOP-18706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17810868#comment-17810868 ]

ASF GitHub Bot commented on HADOOP-18706:
-----------------------------------------

steveloughran commented on code in PR #5771:
URL: https://github.com/apache/hadoop/pull/5771#discussion_r1466282702


##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##########
@@ -1369,12 +1369,69 @@ public S3AEncryptionMethods getS3EncryptionAlgorithm() {
   File createTmpFileForWrite(String pathStr, long size,
       Configuration conf) throws IOException {
     initLocalDirAllocatorIfNotInitialized(conf);
-    Path path = directoryAllocator.getLocalPathForWrite(pathStr,
-        size, conf);
+    Path path = directoryAllocator.getLocalPathForWrite(pathStr, size, conf);
     File dir = new File(path.getParent().toUri().getPath());
-    String prefix = path.getName();
-    // create a temp file on this directory
-    return File.createTempFile(prefix, null, dir);
+    return safeCreateTempFile(pathStr, null, dir);
+  }
+
+  // TODO remove this method when hadoop upgrades to a newer version of java than 1.8
+  /**
+   * Ensure that the temp file prefix and suffix don't exceed the maximum number of characters
+   * allowed by the underlying file system. This validation isn't required in Java 9+ since
+   * {@link java.io.File#createTempFile(String, String, File)} automatically truncates file names.
+   *
+   * @param prefix prefix for the temporary file
+   * @param suffix suffix for the temporary file
+   * @param dir directory to create the temporary file in
+   * @return a unique temporary file
+   * @throws IOException
+   */
+  static File safeCreateTempFile(String prefix, String suffix, File dir) throws IOException
+  {
+    // avoid validating multiple times.
+    // if the jvm running is version 9+ then defer to java.io.File validation implementation
+    if(Float.parseFloat(System.getProperty("java.class.version")) >= 53) {
+      return File.createTempFile(prefix, null, dir);
+    }
+
+    // if no suffix was defined assume the default
+    if(suffix == null) {
+      suffix = ".tmp";
+    }
+    // Use only the file name from the supplied prefix
+    prefix = (new File(prefix)).getName();
+
+    int prefixLength = prefix.length();
+    int suffixLength = suffix.length();
+    int maxRandomSuffixLen = 19; // Long.toUnsignedString(Long.MAX_VALUE).length()
+
+    String name;
+    int nameMax = 255; // unable to access the underlying FS directly, so assume 255

Review Comment:
   make a constant, e.g ASSUMED_MAX_FILENAME
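
A minimal sketch of the suggested change (the name ASSUMED_MAX_FILENAME comes from the review; its placement and javadoc are assumptions, not part of the patch):

```
  /**
   * File name length to assume when the limit of the underlying local
   * filesystem cannot be queried; most local filesystems cap names at 255.
   */
  private static final int ASSUMED_MAX_FILENAME = 255;

  // then inside safeCreateTempFile():
  int excess = prefixLength + maxRandomSuffixLen + suffixLength - ASSUMED_MAX_FILENAME;
```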



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##########
@@ -1369,12 +1369,69 @@ public S3AEncryptionMethods getS3EncryptionAlgorithm() {
   File createTmpFileForWrite(String pathStr, long size,
       Configuration conf) throws IOException {
     initLocalDirAllocatorIfNotInitialized(conf);
-    Path path = directoryAllocator.getLocalPathForWrite(pathStr,
-        size, conf);
+    Path path = directoryAllocator.getLocalPathForWrite(pathStr, size, conf);
     File dir = new File(path.getParent().toUri().getPath());
-    String prefix = path.getName();
-    // create a temp file on this directory
-    return File.createTempFile(prefix, null, dir);
+    return safeCreateTempFile(pathStr, null, dir);
+  }
+
+  // TODO remove this method when hadoop upgrades to a newer version of java than 1.8
+  /**
+   * Ensure that the temp file prefix and suffix don't exceed the maximum number of characters
+   * allowed by the underlying file system. This validation isn't required in Java 9+ since
+   * {@link java.io.File#createTempFile(String, String, File)} automatically truncates file names.
+   *
+   * @param prefix prefix for the temporary file
+   * @param suffix suffix for the temporary file
+   * @param dir directory to create the temporary file in
+   * @return a unique temporary file
+   * @throws IOException
+   */
+  static File safeCreateTempFile(String prefix, String suffix, File dir) throws IOException
+  {
+    // avoid validating multiple times.
+    // if the jvm running is version 9+ then defer to java.io.File validation implementation
+    if(Float.parseFloat(System.getProperty("java.class.version")) >= 53) {

Review Comment:
   this should go in org.apache.hadoop.util.Shell; there's already something similar. will need a test somehow.
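
A sketch of the kind of helper that could live in org.apache.hadoop.util.Shell (whether an equivalent already exists there, and under what name, is exactly what needs checking; this version is an assumption):

```
  /**
   * True if the running JVM's specification version is at least the
   * given major version (e.g. 9 for Java 9).
   */
  public static boolean isJavaVersionAtLeast(int major) {
    // "1.8" style for Java 8 and earlier, plain "9", "11", ... from Java 9 on
    String spec = System.getProperty("java.specification.version", "1.8");
    int version = spec.startsWith("1.")
        ? Integer.parseInt(spec.substring(2))
        : Integer.parseInt(spec.split("\\.")[0]);
    return version >= major;
  }
```

A unit test can then pin the expected result against the JVM the test suite actually runs on.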



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ADataBlocks.java:
##########
@@ -798,6 +800,8 @@ public String toString() {
    * Buffer blocks to disk.
    */
   static class DiskBlockFactory extends BlockFactory {
+    private static final String ESCAPED_FORWARD_SLASH = "EFS";

Review Comment:
   think I might prefer something more distinguishable from text, e.g "_FS_" and "_BS_" so it's easier to read in a dir listing



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##########
@@ -1369,12 +1369,69 @@ public S3AEncryptionMethods getS3EncryptionAlgorithm() {
   File createTmpFileForWrite(String pathStr, long size,
       Configuration conf) throws IOException {
     initLocalDirAllocatorIfNotInitialized(conf);
-    Path path = directoryAllocator.getLocalPathForWrite(pathStr,
-        size, conf);
+    Path path = directoryAllocator.getLocalPathForWrite(pathStr, size, conf);
     File dir = new File(path.getParent().toUri().getPath());
-    String prefix = path.getName();
-    // create a temp file on this directory
-    return File.createTempFile(prefix, null, dir);
+    return safeCreateTempFile(pathStr, null, dir);
+  }
+
+  // TODO remove this method when hadoop upgrades to a newer version of java than 1.8
+  /**
+   * Ensure that the temp file prefix and suffix don't exceed the maximum number of characters
+   * allowed by the underlying file system. This validation isn't required in Java 9+ since
+   * {@link java.io.File#createTempFile(String, String, File)} automatically truncates file names.
+   *
+   * @param prefix prefix for the temporary file
+   * @param suffix suffix for the temporary file
+   * @param dir directory to create the temporary file in
+   * @return a unique temporary file
+   * @throws IOException
+   */
+  static File safeCreateTempFile(String prefix, String suffix, File dir) throws IOException
+  {
+    // avoid validating multiple times.
+    // if the jvm running is version 9+ then defer to java.io.File validation implementation
+    if(Float.parseFloat(System.getProperty("java.class.version")) >= 53) {
+      return File.createTempFile(prefix, null, dir);
+    }
+
+    // if no suffix was defined assume the default
+    if(suffix == null) {
+      suffix = ".tmp";
+    }
+    // Use only the file name from the supplied prefix
+    prefix = (new File(prefix)).getName();
+
+    int prefixLength = prefix.length();
+    int suffixLength = suffix.length();
+    int maxRandomSuffixLen = 19; // Long.toUnsignedString(Long.MAX_VALUE).length()
+
+    String name;
+    int nameMax = 255; // unable to access the underlying FS directly, so assume 255
+    int excess = prefixLength + maxRandomSuffixLen + suffixLength - nameMax;
+
+    // shorten the prefix length if the file name exceeds 255 chars

Review Comment:
   and replace explicit size with "too long"



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ABlockOutputArray.java:
##########
@@ -79,6 +82,46 @@ public void testRegularUpload() throws IOException {
     verifyUpload("regular", 1024);
   }
 
+  /**
+   * Test that the DiskBlock's local file doesn't result in error when the S3 key exceeds the max
+   * char limit of the local file system. Currently
+   * {@link java.io.File#createTempFile(String, String, File)} is being relied on to handle the
+   * truncation.
+   * @throws IOException
+   */
+  @Test
+  public void testDiskBlockCreate() throws IOException {
+    String s3Key = // 1024 char
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+        "very_long_s3_key";
+    long blockSize = getFileSystem().getDefaultBlockSize();
+    try (S3ADataBlocks.BlockFactory diskBlockFactory =
+           new S3ADataBlocks.DiskBlockFactory(getFileSystem());
+         S3ADataBlocks.DataBlock dataBlock =
+             diskBlockFactory.create("spanId", s3Key, 1, blockSize, null);
+    ) {
+      String tmpDir = getConfiguration().get("hadoop.tmp.dir");
+      boolean created = Arrays.stream(
+          Objects.requireNonNull(new File(tmpDir).listFiles()))
+          .anyMatch(f -> f.getName().contains("very_long_s3_key"));
+      assertTrue(String.format("tmp file should have been created locally in %s", tmpDir), created);

Review Comment:
   * use AssertJ. it should actually be possible to add an assert that the array matches the requirement.
   * should be exactly one; test suite setup should delete everything with "very_long_s3_key" in it; so should teardown too, but to support things like IDE debugging, it's best to do both
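
A hedged sketch of what that assertion could look like with AssertJ (variable names are illustrative; the listing and filter logic mirrors the existing test):

```
      File[] tmpFiles = Objects.requireNonNull(new File(tmpDir).listFiles());
      List<File> matches = Arrays.stream(tmpFiles)
          .filter(f -> f.getName().contains("very_long_s3_key"))
          .collect(Collectors.toList());
      // exactly one buffered block file should have been created for this stream
      Assertions.assertThat(matches)
          .describedAs("temp block files under %s containing the long key", tmpDir)
          .hasSize(1);
```

with `org.assertj.core.api.Assertions` and `java.util.stream.Collectors` imported, plus matching cleanup of "very_long_s3_key" files in both setup and teardown as the review asks.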



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3ABlockOutputStream.java:
##########
@@ -59,6 +63,9 @@ private S3ABlockOutputStream.BlockOutputStreamBuilder mockS3ABuilder() {
         mock(S3ADataBlocks.BlockFactory.class);
     long blockSize = Constants.DEFAULT_MULTIPART_SIZE;
     WriteOperationHelper oHelper = mock(WriteOperationHelper.class);
+    AuditSpan auditSpan = mock(AuditSpan.class);

Review Comment:
   you can just use `org.apache.hadoop.fs.s3a.audit.impl.NoopSpan` here; one less thing to mock.
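
A minimal sketch of the substitution (how a NoopSpan is obtained, a shared INSTANCE versus a constructor, should be checked against the actual class; it is shown here as an assumption):

```
    // replace the Mockito mock with the no-op audit span from hadoop-aws
    AuditSpan auditSpan = NoopSpan.INSTANCE;
```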



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ABlockOutputArray.java:
##########
@@ -79,6 +82,46 @@ public void testRegularUpload() throws IOException {
     verifyUpload("regular", 1024);
   }
 
+  /**
+   * Test that the DiskBlock's local file doesn't result in error when the S3 key exceeds the max
+   * char limit of the local file system. Currently
+   * {@link java.io.File#createTempFile(String, String, File)} is being relied on to handle the
+   * truncation.
+   * @throws IOException
+   */
+  @Test
+  public void testDiskBlockCreate() throws IOException {
+    String s3Key = // 1024 char

Review Comment:
   1. you could do this with a shorter string and simply concatenate it a few times
   2. it's duplicated in TestS3aFilesystem...it should be a constant in S3ATestConstants
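
A sketch of what that shared constant could look like (the name and its placement in S3ATestConstants are suggestions, not existing code):

```
  /** A key comfortably longer than the usual 255-char file name limit (18 chars x 57 = 1026). */
  String VERY_LONG_S3_KEY = StringUtils.repeat("very_long_s3_key__", 57);
```

using `org.apache.commons.lang3.StringUtils`; a plain loop appending to a StringBuilder would do equally well.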



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AFileSystem.java:
##########
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a;
+
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.Timeout;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.util.regex.Pattern;
+
+/**
+ * Unit tests for {@link S3AFileSystem}.
+ */
+public class TestS3AFileSystem extends Assert {
+  final File TEMP_DIR = new File("target/build/test/TestS3AFileSystem");
+  final String longStr = // 1024 char
+    "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" 
+
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      "very_long_s3_key";
+  final String longStrTruncated = "very_long_s3_key__very_long_s3_key__";
+
+  @Rule
+  public Timeout testTimeout = new Timeout(30 * 1000);
+
+  @Before
+  public void init() throws IOException {
+    Files.createDirectories(TEMP_DIR.toPath());
+  }
+
+  @After
+  public void teardown() throws IOException {
+    File[] testOutputFiles = TEMP_DIR.listFiles();
+    for(File file: testOutputFiles) {
+      Files.delete(file.toPath());
+    }
+    Files.deleteIfExists(TEMP_DIR.toPath());
+  }
+
+  @Before
+  public void nameThread() {
+    Thread.currentThread().setName("JUnit");
+  }
+
+  /**
+   * Test the {@link S3AFileSystem#safeCreateTempFile(String, String, File)}.
+   * The code verifies that the input prefix and suffix don't exceed the file system's max name
+   * length and cause an exception.
+   *
+   * This test verifies the basic contract of the process.
+   */
+  @Test
+  public void testSafeCreateTempFile() throws Throwable {
+    // fitting name isn't changed
+    File noChangesRequired = S3AFileSystem.safeCreateTempFile("noChangesRequired", ".tmp", TEMP_DIR);
+    assertTrue(noChangesRequired.exists());
+    String noChangesRequiredName = noChangesRequired.getName();
+    assertTrue(noChangesRequiredName.startsWith("noChangesRequired"));
+    assertTrue(noChangesRequiredName.endsWith(".tmp"));
+
+    // a long prefix should be truncated
+    File excessivelyLongPrefix = S3AFileSystem.safeCreateTempFile(longStr, ".tmp", TEMP_DIR);

Review Comment:
   each of these should be split into their own test method so they can fail independently. I know, we ignore that in ITests, but that is because they're integration tests with longer setup overhead
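
A sketch of that split (method names and the AssertJ assertions are illustrative, not part of the patch):

```
  @Test
  public void testSafeCreateTempFileShortPrefixUnchanged() throws Throwable {
    File file = S3AFileSystem.safeCreateTempFile("noChangesRequired", ".tmp", TEMP_DIR);
    Assertions.assertThat(file.getName())
        .describedAs("temp file name from a short prefix")
        .startsWith("noChangesRequired")
        .endsWith(".tmp");
  }

  @Test
  public void testSafeCreateTempFileLongPrefixTruncated() throws Throwable {
    File file = S3AFileSystem.safeCreateTempFile(longStr, ".tmp", TEMP_DIR);
    Assertions.assertThat(file.getName().length())
        .describedAs("length of truncated temp file name %s", file.getName())
        .isLessThanOrEqualTo(255);
  }
```

with `org.assertj.core.api.Assertions` imported.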



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AFileSystem.java:
##########
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a;
+
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.Timeout;
+
+import java.io.File;

Review Comment:
   nit: ordering...these should come first.
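
For reference, the grouping being asked for reads roughly like this (java.* block first, then the third-party block; a sketch of the usual Hadoop ordering, not a definitive checkstyle ruling):

```
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.regex.Pattern;

import org.junit.After;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.Timeout;
```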



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AFileSystem.java:
##########
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a;
+
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.Timeout;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.util.regex.Pattern;
+
+/**
+ * Unit tests for {@link S3AFileSystem}.
+ */
+public class TestS3AFileSystem extends Assert {
+  final File TEMP_DIR = new File("target/build/test/TestS3AFileSystem");
+  final String longStr = // 1024 char
+    "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" 
+
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      "very_long_s3_key";
+  final String longStrTruncated = "very_long_s3_key__very_long_s3_key__";
+
+  @Rule
+  public Timeout testTimeout = new Timeout(30 * 1000);
+
+  @Before
+  public void init() throws IOException {
+    Files.createDirectories(TEMP_DIR.toPath());
+  }
+
+  @After
+  public void teardown() throws IOException {
+    File[] testOutputFiles = TEMP_DIR.listFiles();
+    for(File file: testOutputFiles) {
+      Files.delete(file.toPath());
+    }
+    Files.deleteIfExists(TEMP_DIR.toPath());
+  }
+
+  @Before
+  public void nameThread() {
+    Thread.currentThread().setName("JUnit");
+  }
+
+  /**
+   * Test the {@link S3AFileSystem#safeCreateTempFile(String, String, File)}.
+   * The code verifies that the input prefix and suffix don't exceed the file system's max name
+   * length and cause an exception.
+   *
+   * This test verifies the basic contract of the process.
+   */
+  @Test
+  public void testSafeCreateTempFile() throws Throwable {
+    // fitting name isn't changed
+    File noChangesRequired = S3AFileSystem.safeCreateTempFile("noChangesRequired", ".tmp", TEMP_DIR);
+    assertTrue(noChangesRequired.exists());

Review Comment:
   bad news, use AssertJ. More verbose, but the assertions it raises include details about the failure, which is what I insist on, sorry. If a yetus build fails, I want more than a line number. If this isn't done automatically, use .describedAs(), which takes a String.format() string + varargs list
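
A hedged sketch of the same check written that way (the assertion choice is illustrative):

```
    Assertions.assertThat(noChangesRequired)
        .describedAs("temp file created in %s from prefix %s", TEMP_DIR, "noChangesRequired")
        .exists();
```

On failure this reports the described context and the offending file, rather than just a line number.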



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##########
@@ -1369,12 +1369,69 @@ public S3AEncryptionMethods getS3EncryptionAlgorithm() {
   File createTmpFileForWrite(String pathStr, long size,
       Configuration conf) throws IOException {
     initLocalDirAllocatorIfNotInitialized(conf);
-    Path path = directoryAllocator.getLocalPathForWrite(pathStr,
-        size, conf);
+    Path path = directoryAllocator.getLocalPathForWrite(pathStr, size, conf);
     File dir = new File(path.getParent().toUri().getPath());
-    String prefix = path.getName();
-    // create a temp file on this directory
-    return File.createTempFile(prefix, null, dir);
+    return safeCreateTempFile(pathStr, null, dir);
+  }
+
+  // TODO remove this method when hadoop upgrades to a newer version of java than 1.8
+  /**
+   * Ensure that the temp file prefix and suffix don't exceed the maximum number of characters
+   * allowed by the underlying file system. This validation isn't required in Java 9+ since
+   * {@link java.io.File#createTempFile(String, String, File)} automatically truncates file names.
+   *
+   * @param prefix prefix for the temporary file
+   * @param suffix suffix for the temporary file
+   * @param dir directory to create the temporary file in
+   * @return a unique temporary file
+   * @throws IOException
+   */
+  static File safeCreateTempFile(String prefix, String suffix, File dir) throws IOException
+  {
+    // avoid validating multiple times.
+    // if the jvm running is version 9+ then defer to java.io.File validation implementation
+    if(Float.parseFloat(System.getProperty("java.class.version")) >= 53) {
+      return File.createTempFile(prefix, null, dir);
+    }
+
+    // if no suffix was defined assume the default
+    if(suffix == null) {
+      suffix = ".tmp";
+    }
+    // Use only the file name from the supplied prefix
+    prefix = (new File(prefix)).getName();
+
+    int prefixLength = prefix.length();
+    int suffixLength = suffix.length();
+    int maxRandomSuffixLen = 19; // Long.toUnsignedString(Long.MAX_VALUE).length()
+
+    String name;
+    int nameMax = 255; // unable to access the underlying FS directly, so assume 255
+    int excess = prefixLength + maxRandomSuffixLen + suffixLength - nameMax;
+
+    // shorten the prefix length if the file name exceeds 255 chars
+    if (excess > 0) {
+      // Attempt to shorten the prefix length to no less than 3
+      prefixLength = shortenSubName(prefixLength, excess, 3);
+      prefix = prefix.substring(0, prefixLength);
+    }
+    // shorten the suffix if the file name still exceeds 255 chars
+    excess = prefixLength + maxRandomSuffixLen + suffixLength - nameMax;
+    if (excess > 0) {
+      // Attempt to shorten the suffix length to no less than 3
+      suffixLength = shortenSubName(suffixLength, excess, 3);
+      suffix = suffix.substring(0, suffixLength);
+    }
+
+    return File.createTempFile(prefix, suffix, dir);
+  }
+
+  private static int shortenSubName(int subNameLength, int excess, int nameMin) {

Review Comment:
   add javadocs. a unit test would be good too and straightforward to add.
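
A sketch of the javadoc'd shape being asked for (the body shown is a plausible reading of the truncation logic, not the patch's actual implementation, which is cut off in this excerpt):

```
  /**
   * Shorten one component (prefix or suffix) of a temp file name.
   *
   * @param subNameLength current length of the component
   * @param excess number of characters by which the whole name is too long
   * @param nameMin minimum length the component may be reduced to
   * @return the length to which the component should be truncated
   */
  private static int shortenSubName(int subNameLength, int excess, int nameMin) {
    return Math.max(nameMin, subNameLength - excess);
  }
```

A unit test can then cover the three cases: no excess, excess fully absorbed, and excess capped at nameMin.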



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AFileSystem.java:
##########
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a;
+
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.Timeout;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.util.regex.Pattern;
+
+/**
+ * Unit tests for {@link S3AFileSystem}.
+ */
+public class TestS3AFileSystem extends Assert {
+  final File TEMP_DIR = new File("target/build/test/TestS3AFileSystem");
+  final String longStr = // 1024 char
+    "very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" 
+
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      
"very_long_s3_key__very_long_s3_key__very_long_s3_key__very_long_s3_key__" +
+      "very_long_s3_key";
+  final String longStrTruncated = "very_long_s3_key__very_long_s3_key__";
+
+  @Rule
+  public Timeout testTimeout = new Timeout(30 * 1000);
+
+  @Before
+  public void init() throws IOException {
+    Files.createDirectories(TEMP_DIR.toPath());

Review Comment:
   use `S3ATestUtils` code here for handling of parallel test cases and ease of future maintenance
   ```
   Configuration conf = new Configuration(false);
   S3ATestUtils.prepareTestConfiguration(conf);
   conf.get(BUFFER_DIR);
   ```
   
   



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java:
##########
@@ -1369,12 +1369,69 @@ public S3AEncryptionMethods getS3EncryptionAlgorithm() {
   File createTmpFileForWrite(String pathStr, long size,
       Configuration conf) throws IOException {
     initLocalDirAllocatorIfNotInitialized(conf);
-    Path path = directoryAllocator.getLocalPathForWrite(pathStr,
-        size, conf);
+    Path path = directoryAllocator.getLocalPathForWrite(pathStr, size, conf);
     File dir = new File(path.getParent().toUri().getPath());
-    String prefix = path.getName();
-    // create a temp file on this directory
-    return File.createTempFile(prefix, null, dir);
+    return safeCreateTempFile(pathStr, null, dir);
+  }
+
+  // TODO remove this method when hadoop upgrades to a newer version of java than 1.8
+  /**
+   * Ensure that the temp file prefix and suffix don't exceed the maximum number of characters
+   * allowed by the underlying file system. This validation isn't required in Java 9+ since
+   * {@link java.io.File#createTempFile(String, String, File)} automatically truncates file names.
+   *
+   * @param prefix prefix for the temporary file
+   * @param suffix suffix for the temporary file
+   * @param dir directory to create the temporary file in
+   * @return a unique temporary file
+   * @throws IOException
+   */
+  static File safeCreateTempFile(String prefix, String suffix, File dir) throws IOException
+  {
+    // avoid validating multiple times.
+    // if the jvm running is version 9+ then defer to java.io.File validation implementation
+    if(Float.parseFloat(System.getProperty("java.class.version")) >= 53) {
+      return File.createTempFile(prefix, null, dir);
+    }
+
+    // if no suffix was defined assume the default
+    if(suffix == null) {
+      suffix = ".tmp";
+    }
+    // Use only the file name from the supplied prefix
+    prefix = (new File(prefix)).getName();
+
+    int prefixLength = prefix.length();
+    int suffixLength = suffix.length();
+    int maxRandomSuffixLen = 19; // Long.toUnsignedString(Long.MAX_VALUE).length()
+
+    String name;
+    int nameMax = 255; // unable to access the underlying FS directly, so assume 255
+    int excess = prefixLength + maxRandomSuffixLen + suffixLength - nameMax;
+
+    // shorten the prefix length if the file name exceeds 255 chars
+    if (excess > 0) {
+      // Attempt to shorten the prefix length to no less than 3
+      prefixLength = shortenSubName(prefixLength, excess, 3);

Review Comment:
   again, make a constant and use in both places



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ABlockOutputArray.java:
##########
@@ -79,6 +82,46 @@ public void testRegularUpload() throws IOException {
     verifyUpload("regular", 1024);
   }
 
+  /**
+   * Test that the DiskBlock's local file doesn't result in error when the S3 key exceeds the max
+   * char limit of the local file system. Currently
+   * {@link java.io.File#createTempFile(String, String, File)} is being relied on to handle the

Review Comment:
   this comment is out of date





> Improve S3ABlockOutputStream recovery
> -------------------------------------
>
>                 Key: HADOOP-18706
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18706
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>            Reporter: Chris Bevard
>            Assignee: Chris Bevard
>            Priority: Minor
>              Labels: pull-request-available
>
> If an application crashes during an S3ABlockOutputStream upload, it's 
> possible to complete the upload if fast.upload.buffer is set to disk by 
> uploading the s3ablock file with putObject as the final part of the multipart 
> upload. If the application has multiple uploads running in parallel though 
> and they're on the same part number when the application fails, then there is 
> no way to determine which file belongs to which object, and recovery of 
> either upload is impossible.
> If the temporary file name for disk buffering included the s3 key, then every 
> partial upload would be recoverable.
> h3. Important disclaimer
> This change does not directly add the Syncable semantics expected by applications 
> that require {{Syncable.hsync()}} to only return after all pending data has 
> been durably written to the destination path. S3 is not a filesystem and this 
> change does not make it so.
> What it does do is assist anyone trying to implement some post-crash recovery 
> process which
> # interrogates s3 to identify pending uploads to a specific path and get a 
> list of uploaded blocks yet to be committed
> # scans the local fs.s3a.buffer dir directories to identify in-progress-write 
> blocks for the same target destination. That is those which were being 
> uploaded, queued for upload, and the single "new data being written to" 
> block for an output stream
> # uploads all those pending blocks
> # generates a new POST to complete a multipart upload with all the blocks in 
> the correct order
> All this patch does is ensure the buffered block filenames include the final 
> path and block ID, to aid in identifying which blocks need to be uploaded and 
> in what order.
> h2. warning
> causes HADOOP-18744 - always include the relevant fix when backporting


