[
https://issues.apache.org/jira/browse/HADOOP-17318?focusedWorklogId=509312&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-509312
]
ASF GitHub Bot logged work on HADOOP-17318:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 09/Nov/20 19:17
Start Date: 09/Nov/20 19:17
Worklog Time Spent: 10m
Work Description: steveloughran edited a comment on pull request #2399:
URL: https://github.com/apache/hadoop/pull/2399#issuecomment-724222044
Running integration tests on this with spark + patch and the 3.4.0-SNAPSHOT
builds. Ignoring compilation issues with spark trunk, hadoop-trunk, scala
versions and scalatest, I'm running tests in
[cloud-integration](https://github.com/hortonworks-spark/cloud-integration)
```
S3AParquetPartitionSuite:
2020-11-09 10:55:36,664 [ScalaTest-main-running-S3AParquetPartitionSuite]
INFO commit.AbstractS3ACommitter (AbstractS3ACommitter.java:<init>(180)) - Job
UUID d6b6cd70-0303-46a6-8ff4-240dd14511d6 source spark.sql.sources.writeJobUUID
2020-11-09 10:55:36,733 [ScalaTest-main-running-S3AParquetPartitionSuite]
INFO output.FileOutputCommitter (FileOutputCommitter.java:<init>(141)) - File
Output Committer Algorithm version is 1
2020-11-09 10:55:36,733 [ScalaTest-main-running-S3AParquetPartitionSuite]
INFO output.FileOutputCommitter (FileOutputCommitter.java:<init>(156)) -
FileOutputCommitter skip cleanup _temporary folders under output
directory:false, ignore cleanup failures: false
2020-11-09 10:55:36,734 [ScalaTest-main-running-S3AParquetPartitionSuite]
INFO commit.AbstractS3ACommitterFactory
(S3ACommitterFactory.java:createTaskCommitter(83)) - Using committer directory
to output data to
s3a://stevel-ireland/cloud-integration/DELAY_LISTING_ME/S3AParquetPartitionSuite/part-columns/p1=1/p2=foo
2020-11-09 10:55:36,734 [ScalaTest-main-running-S3AParquetPartitionSuite]
INFO commit.AbstractS3ACommitterFactory
(AbstractS3ACommitterFactory.java:createOutputCommitter(54)) - Using Committer
StagingCommitter{AbstractS3ACommitter{role=Task committer
attempt_20201109105536_0000_m_000000_0, name=directory,
outputPath=s3a://stevel-ireland/cloud-integration/DELAY_LISTING_ME/S3AParquetPartitionSuite/part-columns/p1=1/p2=foo,
workPath=file:/Users/stevel/Projects/sparkwork/cloud-integration/cloud-examples/target/test/s3a/d6b6cd70-0303-46a6-8ff4-240dd14511d6-attempt_20201109105536_0000_m_000000_0/_temporary/0/_temporary/attempt_20201109105536_0000_m_000000_0,
uuid='d6b6cd70-0303-46a6-8ff4-240dd14511d6', uuid
source=JobUUIDSource{text='spark.sql.sources.writeJobUUID'}},
commitsDirectory=file:/Users/stevel/Projects/sparkwork/cloud-integration/cloud-examples/tmp/staging/stevel/d6b6cd70-0303-46a6-8ff4-240dd14511d6/staging-uploads,
uniqueFilenames=true, conflictResolution=APPEND. uploadPartSize=67108864,
wrappedCommitter=FileOutputCommitter{PathOutputCommitter{context=TaskAttemptContextImpl{JobContextImpl{jobId=job_20201109105536_0000};
taskId=attempt_20201109105536_0000_m_000000_0, status=''};
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter@759c53e5};
outputPath=file:/Users/stevel/Projects/sparkwork/cloud-integration/cloud-examples/tmp/staging/stevel/d6b6cd70-0303-46a6-8ff4-240dd14511d6/staging-uploads,
workPath=null, algorithmVersion=1, skipCleanup=false,
ignoreCleanupFailures=false}} for
s3a://stevel-ireland/cloud-integration/DELAY_LISTING_ME/S3AParquetPartitionSuite/part-columns/p1=1/p2=foo
2020-11-09 10:55:36,736 [ScalaTest-main-running-S3AParquetPartitionSuite]
INFO staging.DirectoryStagingCommitter
(DirectoryStagingCommitter.java:setupJob(71)) - Conflict Resolution mode is
APPEND
2020-11-09 10:55:36,879 [ScalaTest-main-running-S3AParquetPartitionSuite]
INFO commit.AbstractS3AC
```
1. Spark is passing down a unique job ID (the committer is configured to require
it): `Job UUID d6b6cd70-0303-46a6-8ff4-240dd14511d6 source
spark.sql.sources.writeJobUUID`
1. This is used for the local FS work path of the staging committer:
`file:/Users/stevel/Projects/sparkwork/cloud-integration/cloud-examples/target/test/s3a/d6b6cd70-0303-46a6-8ff4-240dd14511d6-attempt_20201109105536_0000_m_000000_0/_temporary/0/_temporary/attempt_20201109105536_0000_m_000000_0`
1. And for the cluster FS (which is file:// here):
`file:/Users/stevel/Projects/sparkwork/cloud-integration/cloud-examples/tmp/staging/stevel/d6b6cd70-0303-46a6-8ff4-240dd14511d6/staging-uploads`

That is: Spark is setting the UUID, and the committer is picking it up and
using it as appropriate.
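
The way the job UUID shows up in the two logged paths can be sketched as follows. This is an illustrative Python sketch, not the committer's Java code: the function names `staging_work_path` and `staging_uploads_path` and their parameters are hypothetical, and the layout is inferred only from the `workPath` and `commitsDirectory` values in the log above.

```python
def staging_work_path(local_base: str, job_uuid: str, task_attempt_id: str) -> str:
    """Local FS work path of one task attempt (layout inferred from the logged workPath)."""
    return (f"{local_base}/{job_uuid}-{task_attempt_id}"
            f"/_temporary/0/_temporary/{task_attempt_id}")


def staging_uploads_path(cluster_tmp: str, user: str, job_uuid: str) -> str:
    """Cluster FS directory for pending uploads (layout inferred from commitsDirectory)."""
    return f"{cluster_tmp}/staging/{user}/{job_uuid}/staging-uploads"


# Values copied from the log output.
EXAMPLES = "file:/Users/stevel/Projects/sparkwork/cloud-integration/cloud-examples"
JOB_UUID = "d6b6cd70-0303-46a6-8ff4-240dd14511d6"
ATTEMPT = "attempt_20201109105536_0000_m_000000_0"

work = staging_work_path(f"{EXAMPLES}/target/test/s3a", JOB_UUID, ATTEMPT)
uploads = staging_uploads_path(f"{EXAMPLES}/tmp", "stevel", JOB_UUID)
print(work)
print(uploads)
```

Because the UUID is part of both paths, two jobs with different UUIDs get disjoint directories even when their attempt IDs are identical.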
Issue Time Tracking
-------------------
Worklog Id: (was: 509312)
Time Spent: 2h 50m (was: 2h 40m)
> S3A committer to support concurrent jobs with same app attempt ID & dest dir
> ----------------------------------------------------------------------------
>
> Key: HADOOP-17318
> URL: https://issues.apache.org/jira/browse/HADOOP-17318
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.3.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 2h 50m
> Remaining Estimate: 0h
>
> Reported failure of magic committer block uploads because the pending upload
> ID is unknown. Likely cause: it has been aborted by another job.
> # Make it possible to turn off cleanup of pending uploads in the magic committer
> # Log more about uploads being deleted in committers
> # Add the upload ID to the S3ABlockOutputStream errors
> There are other concurrency issues when you look closely; see SPARK-33230
> * The magic committer uses the app attempt ID as the path under __magic; if
> there are duplicates then they will conflict
> * The staging committer local temp dir uses the app attempt ID
> The fix will be to have a job UUID, which for Spark will be picked up from the
> SPARK-33230 changes (with an option to self-generate in job setup for Hadoop
> 3.3.1+ with older Spark builds); fall back to the app attempt ID *unless that
> fallback has been disabled*.
> MR: configure to use the app attempt ID
> Spark: configure to fail job setup if the app attempt ID is the source of the
> job UUID
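
The fallback order described in the issue can be sketched as below. This is a minimal Python sketch under stated assumptions, not the Hadoop implementation: `resolve_job_uuid` and its parameters `generate_uuid` and `require_uuid` are hypothetical names standing in for the "self-generate" and "fallback disabled" options mentioned above.

```python
import uuid
from typing import Optional, Tuple


def resolve_job_uuid(spark_write_job_uuid: Optional[str],
                     app_attempt_id: str,
                     generate_uuid: bool = False,
                     require_uuid: bool = False) -> Tuple[str, str]:
    """Return (job UUID, source), following the order given in the issue text."""
    # 1. Prefer the UUID Spark passes down (the SPARK-33230 changes).
    if spark_write_job_uuid:
        return spark_write_job_uuid, "spark.sql.sources.writeJobUUID"
    # 2. Optionally self-generate one in job setup.
    if generate_uuid:
        return str(uuid.uuid4()), "generated in job setup"
    # 3. Fall back to the app attempt ID unless that fallback is disabled.
    if require_uuid:
        raise ValueError("no job UUID supplied and app attempt ID fallback is disabled")
    return app_attempt_id, "app attempt ID"


# Spark supplied a UUID, so it wins.
print(resolve_job_uuid("d6b6cd70-0303-46a6-8ff4-240dd14511d6", "app-attempt-0000"))
```

Under these assumptions an MR job would leave `require_uuid` off and use the app attempt ID, while a Spark job would turn it on so that job setup fails instead of silently sharing a path with a concurrent job.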