[ https://issues.apache.org/jira/browse/MAPREDUCE-7500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931982#comment-17931982 ]
ASF GitHub Bot commented on MAPREDUCE-7500:
-------------------------------------------

robreeves opened a new pull request, #7425:
URL: https://github.com/apache/hadoop/pull/7425

### Description of PR

This PR adds an option to commit files optimistically, that is, assuming no conflicting file or directory exists at the destination, which avoids a `FileSystem.getFileStatus` RPC for each committed file. The default behavior is unchanged. To enable the feature, set `mapreduce.fileoutputcommitter.optimistic.file.commit.enabled=true`.

This is useful for cases like Spark, where no destination conflict is expected and the `FileSystem.getFileStatus` RPC is wasted time. When I profiled the commit for a Spark job before this enhancement, this call accounted for 50% of the commit time (HDFS with intermittent latency in our environment).

### How was this patch tested?

**Correctness**

I converted the `FileOutputCommitter` test class to parameterized tests so that every test runs both with the default configuration and with this feature enabled. There may also be an opportunity to move the v1/v2 algorithm tests into the parameterized test, but I left that refactor for later to minimize unnecessary changes.

```
[INFO] Running org.apache.hadoop.mapreduce.lib.output.TestFileOutputCommitter
[INFO] Tests run: 44, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.946 s - in org.apache.hadoop.mapreduce.lib.output.TestFileOutputCommitter
```

**Performance**

I tested the performance of the changes using Spark writing to HDFS for partitioned and non-partitioned datasets.
The summary of the improvement is:
- For the non-partitioned commit, the average commit time decreased from 16.6min to 4.8min (71% improvement).
- For the partitioned commit, the average commit time decreased from 4.3min to 1.5min (65% improvement).

Non-partitioned test Spark script:
```scala
// Assumes a spark-shell session, which provides `spark`.
import org.apache.spark.sql.SaveMode

val fileCount = 5000
val path = "/path/temp_data_no_part"
// One row per partition, so the job writes `fileCount` output files.
spark.range(0, fileCount, 1, fileCount).write
  .mode(SaveMode.Overwrite)
  .option("path", path)
  .save()
```

Partitioned test Spark script:
```scala
// Assumes a spark-shell session, which provides `spark` and the $"..." column syntax.
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.functions.lit

val fileCount = 1000
val partitionCount = 5
val path = "/path/temp_data_part"
spark
  .range(0, fileCount, 1, fileCount)
  // Spread the rows across `partitionCount` values of the partition column.
  .withColumn("part", $"id" % lit(partitionCount))
  .write
  .mode(SaveMode.Overwrite)
  .option("path", path)
  .partitionBy("part")
  .save()
```

### For code changes:

- [x] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?

> Support optimistic file renames in the commit protocol
> ------------------------------------------------------
>
> Key: MAPREDUCE-7500
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7500
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: client
> Environment: The commit protocol in FileOutputCommitter now supports
> optimistic commits for files. This saves a FileSystem.getFileStatus call in
> cases where no conflict is expected at the destination location at commit
> time (e.g. Spark). This feature is disabled by default. To enable it, set
> mapreduce.fileoutputcommitter.optimistic.file.commit.enabled=true.
> Reporter: Rob Reeves
> Priority: Minor
> Labels: pull-request-available
> Attachments: flamegraph_commit.png
>
>
> During a file commit, FileOutputCommitter assumes a file may already exist
> at the destination location and, if so, deletes it first. This means that
> for every file commit it calls FileSystem.getFileStatus for the destination.
> In the Spark use case, nothing is expected to exist at the destination
> location, so the getFileStatus call is wasted in all but exceptional and
> unexpected cases.
> The getFileStatus call can take significant time. When I profiled a commit in
> our environment (HDFS, intermittent latency issues), the
> FileSystem.getFileStatus call took 50% of the commit time. We have an
> aggressive auto-msync setting, but even when I disabled msync I saw the same
> behavior. I attached an example flame graph for the commit time
> (getFileStatus time is highlighted in pink).
> To avoid the time spent on getFileStatus, there should be an option to
> optimistically commit the file, assuming there will be no conflict in the
> destination.
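
For reference, a minimal sketch of how the option might be enabled from a Spark application. It assumes a Hadoop build that includes this change; on other builds the property is simply ignored. The `spark.hadoop.` prefix is Spark's standard way of passing a property through to the Hadoop configuration, and the application name is a placeholder.

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: the flag takes effect only on a Hadoop build that includes
// MAPREDUCE-7500, and it is disabled by default.
val spark = SparkSession.builder()
  .appName("optimistic-commit-example") // hypothetical application name
  // Properties prefixed with "spark.hadoop." are copied into the Hadoop Configuration.
  .config("spark.hadoop.mapreduce.fileoutputcommitter.optimistic.file.commit.enabled", "true")
  .getOrCreate()
```

The same property can be passed at submit time with `--conf spark.hadoop.mapreduce.fileoutputcommitter.optimistic.file.commit.enabled=true`.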