[ 
https://issues.apache.org/jira/browse/HADOOP-17833?focusedWorklogId=756382&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-756382
 ]

ASF GitHub Bot logged work on HADOOP-17833:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/Apr/22 13:23
            Start Date: 13/Apr/22 13:23
    Worklog Time Spent: 10m 
      Work Description: steveloughran commented on PR #3289:
URL: https://github.com/apache/hadoop/pull/3289#issuecomment-1098045775

   ```
   ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/CommitContext.java:348: private class PoolSubmitter implements TaskPool.Submitter, Closeable {: Class PoolSubmitter should be declared as final. [FinalClass]

   ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/files/PersistentCommitData.java:105: return serializer.load(fs, path,status);:36: ',' is not followed by whitespace. [WhitespaceAfter]

   ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/CreateFileBuilder.java:22: import java.util.Collections;:8: Unused import - java.util.Collections. [UnusedImports]

   ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/MkdirOperation.java:190: void createFakeDirectory(final Path dir) throws IOException;:30: Redundant 'final' modifier. [RedundantModifier]

   ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/WriteOperationHelper.java:326: * {@link S3AFileSystem#finishedWrite(String, long, String, String, org.apache.hadoop.fs.s3a.impl.PutObjectOptions)}: Line is longer than 100 characters (found 118). [LineLength]

   ./hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/commit/ITestCommitOperationCost.java:256: commitOperations.commitOrFail(singleCommit);: 'block' child has incorrect indentation level 10, expected level should be 6. [Indentation]

   ./hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/commit/ITestCommitOperationCost.java:257: IOStatistics st = commitOperations.getIOStatistics();: 'block' child has incorrect indentation level 10, expected level should be 6. [Indentation]

   ./hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/commit/ITestCommitOperationCost.java:258: return ioStatisticsToPrettyString(st);: 'block' child has incorrect indentation level 10, expected level should be 6. [Indentation]

   ./hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/performance/ITestS3ADeleteCost.java:284: );: 'method call rparen' has incorrect indentation level 8, expected level should be 4. [Indentation]
   ```
   

Issue Time Tracking
-------------------

    Worklog Id:     (was: 756382)
    Time Spent: 5h 50m  (was: 5h 40m)

> Improve Magic Committer Performance
> -----------------------------------
>
>                 Key: HADOOP-17833
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17833
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 3.3.1
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Magic committer tasks can be slow because every file created with 
> overwrite=false triggers a HEAD (to verify there's no file) and a LIST (to 
> verify there's no directory). And because the files are only manifested 
> later, these checks may not behave as expected.
> ParquetOutputFormat is one example of a library which does this.
> We could fix Parquet to use overwrite=true, but (a) there may be surprises in 
> other uses, (b) it would still leave the LIST, and (c) it would do nothing 
> for calls made by other formats.
> Proposed: createFile() under a magic path to skip all probes for file/dir at 
> end of path
> Only a single task attempt will be writing to that directory and it should 
> know what it is doing. If there are conflicting file names and parts across 
> tasks, that won't even get picked up at this point. Oh, and none of the 
> committers ever check for this: you'll get the last file manifested (s3a) or 
> renamed (file).
> If we skip the checks we will save 2 HTTP requests/file.
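
To make the request arithmetic in the description concrete, here is a minimal sketch of a toy in-memory store that counts requests. It is not the S3A implementation: `MagicCreateSketch`, `CountingStore`, `create`, `requestsForCreate` and the `skipMagicProbes` flag are hypothetical names, the `/__magic/` path element is assumed as the marker for a magic path, and a single PUT stands in for whatever upload the real committer performs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashSet;
import java.util.Set;

/** Toy model of the request cost of creating a file; NOT the S3A code. */
final class MagicCreateSketch {

  /** Assumed marker for paths under a magic committer directory. */
  static final String MAGIC = "/__magic/";

  /** In-memory stand-in for an object store that counts its requests. */
  static final class CountingStore {
    private final Set<String> keys = new HashSet<>();
    private int head;
    private int list;
    private int put;

    boolean headObject(String key) {
      head++;
      return keys.contains(key);
    }

    boolean listHasChildren(String key) {
      list++;
      String prefix = key + "/";
      for (String k : keys) {
        if (k.startsWith(prefix)) {
          return true;
        }
      }
      return false;
    }

    void putObject(String key) {
      put++;
      keys.add(key);
    }

    int total() {
      return head + list + put;
    }
  }

  /**
   * Create a file. The HEAD probe is only issued when overwrite is false;
   * the LIST probe is always issued unless probes are skipped for a
   * magic path (the hypothetical skipMagicProbes switch).
   */
  static void create(CountingStore store, String key,
      boolean overwrite, boolean skipMagicProbes) {
    boolean skip = skipMagicProbes && key.contains(MAGIC);
    if (!skip) {
      if (!overwrite && store.headObject(key)) {
        throw new UncheckedIOException(new IOException("File exists: " + key));
      }
      if (store.listHasChildren(key)) {
        throw new UncheckedIOException(new IOException("Is a directory: " + key));
      }
    }
    store.putObject(key);
  }

  /** Number of requests one create() issues against an empty store. */
  static int requestsForCreate(String key, boolean overwrite,
      boolean skipMagicProbes) {
    CountingStore store = new CountingStore();
    create(store, key, overwrite, skipMagicProbes);
    return store.total();
  }

  public static void main(String[] args) {
    String key = "dest/__magic/task_01/part-0000.parquet";
    System.out.println("overwrite=false, probes on:  "
        + requestsForCreate(key, false, false));
    System.out.println("overwrite=true,  probes on:  "
        + requestsForCreate(key, true, false));
    System.out.println("overwrite=false, probes off: "
        + requestsForCreate(key, false, true));
  }
}
```

In this model, switching to overwrite=true only removes the HEAD, which mirrors point (b) above, while skipping both probes under the magic path removes two requests per file regardless of the overwrite flag.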



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
