[
https://issues.apache.org/jira/browse/MAPREDUCE-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated MAPREDUCE-7341:
--------------------------------------
Labels: pull-request-available (was: )
> Add a task-manifest output committer for Azure and GCS
> ------------------------------------------------------
>
> Key: MAPREDUCE-7341
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7341
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: client
> Affects Versions: 3.3.1
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The S3A committers are very popular in Spark on S3, as they are both correct
> and fast.
> The classic FileOutputCommitter v1 and v2 algorithms are all that is
> available for Azure ABFS and Google GCS, and they have limitations.
> The v2 algorithm isn't safe in the presence of failed task attempt commits,
> so we recommend the v1 algorithm for Azure. But v1 is slow because its job
> commit sequentially lists then renames files and directories, one by one,
> so it pays the latency of every list and rename call in turn.
> Google GCS lacks the atomic directory rename required for v1 correctness;
> v2 can be used (which doesn't have the job commit performance limitations),
> but it's not safe.
> Proposed:
> * Add a new FileOutputFormat committer which uses an intermediate manifest to
> pass the list of files created by a task attempt (TA) to the job committer.
> * Have the job committer parallelise reading these task manifests and submit
> all the rename operations to a pool of worker threads (also: mkdir, and
> directory deletions on cleanup).
> * Use the committer plugin mechanism added for S3A to make this the default
> committer for ABFS (i.e. no need to make any changes to FileOutputCommitter).
> * Add lots of IOStatistics instrumentation and logging of operations in the
> job commit, for visibility of where delays are occurring.
> * Reuse the S3A committer _SUCCESS JSON structure to publish IOStats and other
> data for testing/support.
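
The job commit steps proposed above can be sketched roughly as follows. This is
a minimal illustration against a local filesystem in Python, not the proposed
Hadoop code: the manifest layout (one JSON file per task holding
[source, destination] pairs), the function names and the default thread count
are all assumptions made for the example, and destinations are assumed to be
absolute paths.

```python
import json
import os
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_manifest(path):
    """Read one task manifest: a JSON list of [source, destination] pairs."""
    with open(path, encoding="utf-8") as f:
        return [(src, dst) for src, dst in json.load(f)]


def commit_job(manifest_dir, threads=8):
    """Rename every file listed in the task manifests into place, in parallel.

    Hypothetical layout: manifest_dir holds one *.json manifest per task.
    """
    manifests = sorted(Path(manifest_dir).glob("*.json"))
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Stage 1: load all task manifests in parallel.
        renames = [pair
                   for entries in pool.map(load_manifest, manifests)
                   for pair in entries]
        # Stage 2: create each distinct destination directory once.
        dest_dirs = {os.path.dirname(dst) for _, dst in renames}
        list(pool.map(lambda d: os.makedirs(d, exist_ok=True), dest_dirs))
        # Stage 3: submit every rename to the pool; list() re-raises failures.
        list(pool.map(lambda pair: os.replace(*pair), renames))
    return len(renames)
```

The point of the sketch is that the job committer never lists the task output
directories: the manifests already say what to rename, so the only serial part
left is the ordering between the three stages.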
> This committer will be faster than the v1 algorithm because of the
> parallelisation. And because a manifest written by create-and-rename is
> exclusive to a single task attempt, it delivers the isolation which the v2
> committer lacks.
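
To make the create-and-rename point concrete, here is a small sketch of how a
task attempt might save its manifest. Again this is illustrative only: the file
naming scheme and the JSON fields are hypothetical, and it relies on
os.replace being atomic on a local POSIX filesystem, which stands in for the
store's own rename here.

```python
import json
import os
from pathlib import Path


def save_manifest(manifest_dir, task_id, attempt_id, entries):
    """Write to a per-attempt temp file, then rename it onto the per-task name.

    The naming scheme (<task>-<attempt>.tmp, <task>.json) is an assumption.
    """
    tmp = Path(manifest_dir, f"{task_id}-{attempt_id}.tmp")
    final = Path(manifest_dir, f"{task_id}.json")
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump({"attempt": attempt_id, "files": entries}, f)
    # The rename is the commit point: a crash before it leaves only the
    # .tmp file, which a job committer reading *.json never sees.
    os.replace(tmp, final)
    return final
```

A job committer that only reads the *.json files therefore sees the complete
file list of exactly one attempt per task, never a mix of files from a failed
attempt and its replacement.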
> This is not an attempt to do an iceberg/hudi/delta-lake style manifest-only
> format for describing the contents of a table; the final output is still a
> directory tree which must be scanned during query planning.
> As such the format is still suboptimal for cloud storage, but at least we
> will have faster job execution during the commit phases.
>
> Note: this will also work on HDFS, where again it should be faster than
> the v1 committer. However, the target is very much Spark with ABFS and GCS.
> There are no plans to worry about MR, as leaving it out simplifies the
> challenge of dealing with job restart (i.e. you don't have to).
--
This message was sent by Atlassian Jira
(v8.3.4#803005)