steveloughran commented on pull request #2399:
URL: https://github.com/apache/hadoop/pull/2399#issuecomment-724222044


   Running integration tests on this with Spark + patch and the 3.4.0-SNAPSHOT 
builds. Ignoring compilation issues between Spark trunk, hadoop-trunk, Scala 
versions and scalatest, I'm running the tests in 
[cloud-integration](https://github.com/hortonworks-spark/cloud-integration).
   
   ```
   S3AParquetPartitionSuite:
   2020-11-09 10:55:36,664 [ScalaTest-main-running-S3AParquetPartitionSuite] 
INFO  commit.AbstractS3ACommitter (AbstractS3ACommitter.java:<init>(180)) - Job 
UUID d6b6cd70-0303-46a6-8ff4-240dd14511d6 source spark.sql.sources.writeJobUUID
   2020-11-09 10:55:36,733 [ScalaTest-main-running-S3AParquetPartitionSuite] 
INFO  output.FileOutputCommitter (FileOutputCommitter.java:<init>(141)) - File 
Output Committer Algorithm version is 1
   2020-11-09 10:55:36,733 [ScalaTest-main-running-S3AParquetPartitionSuite] 
INFO  output.FileOutputCommitter (FileOutputCommitter.java:<init>(156)) - 
FileOutputCommitter skip cleanup _temporary folders under output 
directory:false, ignore cleanup failures: false
   2020-11-09 10:55:36,734 [ScalaTest-main-running-S3AParquetPartitionSuite] 
INFO  commit.AbstractS3ACommitterFactory 
(S3ACommitterFactory.java:createTaskCommitter(83)) - Using committer directory 
to output data to 
s3a://stevel-ireland/cloud-integration/DELAY_LISTING_ME/S3AParquetPartitionSuite/part-columns/p1=1/p2=foo
   2020-11-09 10:55:36,734 [ScalaTest-main-running-S3AParquetPartitionSuite] 
INFO  commit.AbstractS3ACommitterFactory 
(AbstractS3ACommitterFactory.java:createOutputCommitter(54)) - Using Committer 
StagingCommitter{AbstractS3ACommitter{role=Task committer 
attempt_20201109105536_0000_m_000000_0, name=directory, 
outputPath=s3a://stevel-ireland/cloud-integration/DELAY_LISTING_ME/S3AParquetPartitionSuite/part-columns/p1=1/p2=foo,
 
workPath=file:/Users/stevel/Projects/sparkwork/cloud-integration/cloud-examples/target/test/s3a/d6b6cd70-0303-46a6-8ff4-240dd14511d6-attempt_20201109105536_0000_m_000000_0/_temporary/0/_temporary/attempt_20201109105536_0000_m_000000_0,
 uuid='d6b6cd70-0303-46a6-8ff4-240dd14511d6', uuid 
source=JobUUIDSource{text='spark.sql.sources.writeJobUUID'}}, 
commitsDirectory=file:/Users/stevel/Projects/sparkwork/cloud-integration/cloud-examples/tmp/staging/stevel/d6b6cd70-0303-46a6-8ff4-240dd14511d6/staging-uploads,
 uniqueFilenames=true, conflictResolution=APPEND. uploadPartSize=67108864, 
wrappedCommitter=FileOutputCommitter{PathOutputCommitter{context=TaskAttemptContextImpl{JobContextImpl{jobId=job_20201109105536_0000};
 taskId=attempt_20201109105536_0000_m_000000_0, status=''}; 
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter@759c53e5}; 
outputPath=file:/Users/stevel/Projects/sparkwork/cloud-integration/cloud-examples/tmp/staging/stevel/d6b6cd70-0303-46a6-8ff4-240dd14511d6/staging-uploads,
 workPath=null, algorithmVersion=1, skipCleanup=false, 
ignoreCleanupFailures=false}} for 
s3a://stevel-ireland/cloud-integration/DELAY_LISTING_ME/S3AParquetPartitionSuite/part-columns/p1=1/p2=foo
   2020-11-09 10:55:36,736 [ScalaTest-main-running-S3AParquetPartitionSuite] 
INFO  staging.DirectoryStagingCommitter 
(DirectoryStagingCommitter.java:setupJob(71)) - Conflict Resolution mode is 
APPEND
   2020-11-09 10:55:36,879 [ScalaTest-main-running-S3AParquetPartitionSuite] 
INFO  commit.AbstractS3AC
   ```
   
   1. Spark is passing down a unique job ID (the committer is configured to 
require it): `Job UUID d6b6cd70-0303-46a6-8ff4-240dd14511d6 source 
spark.sql.sources.writeJobUUID`
   1. This is used for the local FS work path of the staging committer: 
`file:/Users/stevel/Projects/sparkwork/cloud-integration/cloud-examples/target/test/s3a/d6b6cd70-0303-46a6-8ff4-240dd14511d6-attempt_20201109105536_0000_m_000000_0/_temporary/0/_temporary/attempt_20201109105536_0000_m_000000_0`
  
   1. It is also used for the staging path on the cluster FS (which is `file://` here): 
`file:/Users/stevel/Projects/sparkwork/cloud-integration/cloud-examples/tmp/staging/stevel/d6b6cd70-0303-46a6-8ff4-240dd14511d6/staging-uploads`
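
To make the layout of those two paths easier to see, here is a rough Python sketch that rebuilds them from the values in the log. It is only an illustration of the directory structure as it appears above, not the committer's actual code: the function names `task_work_path` and `staging_uploads_path` are hypothetical, and reading the `0` in `_temporary/0` as the application attempt ID is an assumption based on the usual `FileOutputCommitter` layout.

```python
from pathlib import PurePosixPath


def task_work_path(local_tmp: str, job_uuid: str, attempt_id: str,
                   app_attempt: int = 0) -> str:
    """Rebuild the local work path seen in the log (hypothetical helper).

    Layout: <local_tmp>/<job_uuid>-<attempt_id>/_temporary/<app_attempt>/
            _temporary/<attempt_id>
    """
    base = PurePosixPath(local_tmp) / f"{job_uuid}-{attempt_id}"
    return str(base / "_temporary" / str(app_attempt) / "_temporary" / attempt_id)


def staging_uploads_path(staging_root: str, user: str, job_uuid: str) -> str:
    """Rebuild the cluster FS staging path seen in the log (hypothetical helper).

    Layout: <staging_root>/<user>/<job_uuid>/staging-uploads
    """
    return str(PurePosixPath(staging_root) / user / job_uuid / "staging-uploads")


# Values copied from the log output above.
PROJECT = "/Users/stevel/Projects/sparkwork/cloud-integration/cloud-examples"
JOB_UUID = "d6b6cd70-0303-46a6-8ff4-240dd14511d6"
ATTEMPT = "attempt_20201109105536_0000_m_000000_0"

work = task_work_path(PROJECT + "/target/test/s3a", JOB_UUID, ATTEMPT)
staging = staging_uploads_path(PROJECT + "/tmp/staging", "stevel", JOB_UUID)
print(work)
print(staging)
```

Because the job UUID is part of both paths, two jobs with different UUIDs get different local and cluster directories even if their task attempt IDs are identical.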

