[
https://issues.apache.org/jira/browse/HADOOP-17318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran updated HADOOP-17318:
------------------------------------
Description:
Reported failure of magic committer block uploads: the pending upload ID is
unknown. Likely cause: the upload has been aborted by another job.
# Make it possible to turn off cleanup of pending uploads in the magic committer
# Log more about uploads being deleted in the committers
# Include the upload ID in S3ABlockOutputStream errors
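A sketch of what the cleanup switch in item 1 might look like in core-site.xml; the property name here is illustrative only, not a confirmed S3A option:

```xml
<!-- Hypothetical switch: stop the committer from aborting pending
     multipart uploads it did not create. Name is illustrative only. -->
<property>
  <name>fs.s3a.committer.abort.pending.uploads</name>
  <value>false</value>
</property>
```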
There are other concurrency issues when you look closer; see SPARK-33230.
* The magic committer uses the app attempt ID as the path under __magic; if
there are duplicates, they will conflict
* The staging committer's local temp dir uses the app attempt ID
The fix will be to have a job UUID. For Spark this will be picked up from the
SPARK-33230 changes (with an option to self-generate in job setup, for Hadoop
3.3.1+ with older Spark builds); fall back to the app attempt ID *unless that
fallback has been disabled*
MR: configure to use the app attempt ID
Spark: configure to fail job setup if the app attempt ID is the source of the job UUID
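A minimal sketch of the resolution order described above. The property names are assumptions for illustration (only `spark.sql.sources.writeJobUUID` is the SPARK-33230 key; the two `fs.s3a.*` names are hypothetical, not the final Hadoop options):

```java
import java.util.Map;
import java.util.UUID;

// Sketch of the proposed job UUID resolution order. Property names
// other than the Spark one are illustrative, not confirmed options.
public class JobUuidSketch {

    static final String SPARK_WRITE_UUID = "spark.sql.sources.writeJobUUID"; // from SPARK-33230
    static final String GENERATE_UUID = "fs.s3a.committer.generate.uuid";    // hypothetical
    static final String REQUIRE_UUID = "fs.s3a.committer.require.uuid";      // hypothetical

    /**
     * Resolve the UUID identifying this job:
     * 1. Spark-provided UUID from the SPARK-33230 changes, if present.
     * 2. A self-generated UUID, if generation is enabled in job setup.
     * 3. The app attempt ID, unless that fallback is disabled, in which
     *    case job setup fails.
     */
    static String resolveJobUuid(Map<String, String> conf, String appAttemptId) {
        String sparkUuid = conf.get(SPARK_WRITE_UUID);
        if (sparkUuid != null && !sparkUuid.isEmpty()) {
            return sparkUuid;
        }
        if (Boolean.parseBoolean(conf.getOrDefault(GENERATE_UUID, "false"))) {
            return UUID.randomUUID().toString();
        }
        if (Boolean.parseBoolean(conf.getOrDefault(REQUIRE_UUID, "false"))) {
            throw new IllegalStateException(
                "Job setup failed: no job UUID and the app attempt ID fallback is disabled");
        }
        return appAttemptId; // MR-style fallback: app attempt ID as the job UUID
    }

    public static void main(String[] args) {
        // Spark path: the UUID supplied by the query engine wins.
        if (!resolveJobUuid(Map.of(SPARK_WRITE_UUID, "uuid-1"), "attempt_1").equals("uuid-1")) {
            throw new AssertionError("Spark-provided UUID not used");
        }
        // MR path: fall back to the app attempt ID.
        if (!resolveJobUuid(Map.of(), "attempt_1").equals("attempt_1")) {
            throw new AssertionError("app attempt ID fallback not used");
        }
        System.out.println("ok");
    }
}
```

Two concurrent jobs sharing an app attempt ID then get distinct __magic paths whenever either the Spark UUID or self-generation supplies the identity.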
was:
Reported failure of magic committer block uploads as pending upload ID is
unknown. Likely cause: it's been aborted by another job
# Make it possible to turn off cleanup of pending uploads in magic committer
# log more about uploads being deleted in committers
# and upload ID in the S3aBlockOutputStream errors
> S3A committer to support concurrent jobs with same app attempt ID & dest dir
> ----------------------------------------------------------------------------
>
> Key: HADOOP-17318
> URL: https://issues.apache.org/jira/browse/HADOOP-17318
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.3.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]