rdblue commented on a change in pull request #1442: HADOOP-16570. S3A committers encounter scale issues
URL: https://github.com/apache/hadoop/pull/1442#discussion_r330148238
 
 

 ##########
 File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java
 ##########
 @@ -349,16 +368,9 @@ public void recoverTask(TaskAttemptContext taskContext) throws IOException {
    * @throws IOException IO failure
    */
   protected void maybeCreateSuccessMarkerFromCommits(JobContext context,
-      List<SinglePendingCommit> pending) throws IOException {
+      ActiveCommit pending) throws IOException {
     List<String> filenames = new ArrayList<>(pending.size());
-    for (SinglePendingCommit commit : pending) {
-      String key = commit.getDestinationKey();
-      if (!key.startsWith("/")) {
-        // fix up so that FS.makeQualified() sets up the path OK
-        key = "/" + key;
-      }
-      filenames.add(key);
-    }
+    filenames.addAll(pending.committedObjects);
 
 Review comment:
   If this is not going to create the _SUCCESS marker because the file list is too large, why collect all of the committed file names here? I think the collection should happen inside whatever check `maybeCreateSuccessMarker` performs, so the memory consumption is avoided when the marker is skipped.
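   For illustration only, a minimal sketch of the shape being suggested: the filename list is materialized only once it is known the marker will be written. The field names `createJobMarker` and `successFileLimit`, and the truncation policy shown, are assumptions made for the example, not the committer's actual implementation.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: guard the expensive list construction behind the same
// condition that decides whether the _SUCCESS marker is written at all.
class SuccessMarkerSketch {

  private final boolean createJobMarker;   // assumed flag: write _SUCCESS at all?
  private final int successFileLimit;      // assumed cap on listed filenames

  SuccessMarkerSketch(boolean createJobMarker, int successFileLimit) {
    this.createJobMarker = createJobMarker;
    this.successFileLimit = successFileLimit;
  }

  /**
   * Build the filename list only inside the guard, so a large commit never
   * allocates the full list when the marker is skipped or truncated.
   */
  void maybeCreateSuccessMarkerFromCommits(List<String> committedObjects)
      throws IOException {
    if (!createJobMarker) {
      return;                               // marker disabled: copy nothing
    }
    List<String> filenames;
    if (committedObjects.size() > successFileLimit) {
      filenames = Collections.emptyList();  // assumed policy: too many files, list none
    } else {
      filenames = new ArrayList<>(committedObjects);
    }
    writeSuccessMarker(filenames);
  }

  private void writeSuccessMarker(List<String> filenames) throws IOException {
    // placeholder for the actual marker write
  }
}
```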

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
