Thanks Johnny for sharing your experience. Have you tried the S3A
committers? It looks like they were introduced in recent Hadoop releases
to solve the problems with the other committers.
https://hadoop.apache.org/docs/r3.1.1/hadoop-aws/tools/hadoop-aws/committers.html
- ND
On 6/22/21 6:41 PM, Johnny Burns wrote:
Hello.
I’m Johnny, I work at Stripe. We’re heavy Spark users and we’ve been
exploring using s3 committers. Currently we first write the data to
HDFS and then upload it to S3. However, now that S3 offers strong
consistency guarantees, we are evaluating whether we can write data
directly to S3.
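For context, the direct write we’re evaluating is just the normal
dataframe write pointed at an s3a:// path. A minimal sketch
(spark-shell style; the bucket and paths are made-up placeholders, not
our real layout):

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder().appName("direct-s3-write").getOrCreate()
  val df = spark.read.parquet("hdfs:///warehouse/events")

  // Today: write to HDFS, then a separate step uploads the files to S3.
  df.write.mode("overwrite").parquet("hdfs:///staging/events")

  // What we're evaluating: commit parquet output directly to S3 via s3a.
  df.write.mode("overwrite").parquet("s3a://example-bucket/warehouse/events")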
We’re having some trouble with performance, so we’re hoping someone
might have guidance that can unblock this.
File Format
We are using parquet as the file format. We do have iceberg tables as
well, and they are indeed able to commit directly to S3 (with minimal
local disk usage). We can’t migrate all of our jobs to iceberg right
now. Hence, we are looking for a committer that is performant and can
directly write parquet files to S3 (with minimal local disk usage).
What have we tried?
We’ve tried using both the “magic” and “directory” committers. We're
setting the following configs (in addition to the “magic”/“directory”
fs.s3a.committer.name setting).
"spark.hadoop.fs.s3a.committer.magic.enabled":"true",
"spark.hadoop.mapreduce.outputcommitter.factory.scheme.s3a":"org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory",
"spark.sql.sources.commitProtocolClass":"org.apache.spark.internal.io.cloud.PathOutputCommitProtocol",
"spark.sql.parquet.output.committer.class":"org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter",
Both committers have shown performance regressions on large jobs.
We’re currently focused on trying to make the directory committer work
because we’ve seen fewer slowdowns with that one, but I’ll describe
the problems with each.
We’ve been testing the committers on a large job with 100k tasks
(creating 7.3 TB of output).
Observations for magic committer
Using the magic committer, we see slowdowns in two places:
* S3 Writing (inside the task)
  * The slowdown seems to occur just after the s3 multipart write. The
    finishedWrite function
    <https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L4253>
    tries to do some cleanup and kicks off the
    deleteUnnecessaryFakeDirectories function
    <https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L4350-L4373>.
  * This causes 503s due to hitting AWS rate limits on
    com.amazonaws.services.s3.model.DeleteObjectsRequest (a possible
    mitigation is sketched after this list).
  * I'm not sure what directories are actually getting cleaned up here
    (I assume the _magic directories are still needed up until the job
    commit).
* Job Commit
  * I have not dug into the details here, but I assume it is something
    similar to what we’re seeing in the directory committer case below.
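Regarding the delete storm above, one mitigation we may try (this is an
assumption on my part from the Hadoop S3A docs, not something we've
verified): Hadoop 3.3.1+ can retain directory markers instead of
deleting them, which should skip those DeleteObjects calls:

  import org.apache.spark.sql.SparkSession

  // Assumption (from the Hadoop S3A docs, untested by us): on Hadoop
  // 3.3.1+, retaining directory markers (HADOOP-13230) stops
  // deleteUnnecessaryFakeDirectories from issuing DeleteObjects calls
  // after every multipart write. All clients reading the bucket must
  // be recent enough to understand retained markers.
  val spark = SparkSession.builder()
    .config("spark.hadoop.fs.s3a.directory.marker.retention", "keep")
    .getOrCreate()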
Observations for directory committer
We’ve observed that the “directory” s3 committer performance is on-par
with our existing HDFS commit for task execution and task commit. The
slowdowns we’re seeing are in the job commit phase.
The job commit happens almost instantaneously in the HDFS case, versus
about an hour for the s3 directory committer.
We’ve enabled DEBUG logging for the s3 committer. It seems like that
hour is mostly spent doing things you would expect (completing 100k
delayedComplete s3 uploads). I've attached an example of some of the
logs we see repeated over-and-over during the 1-hour job commit (I
redacted some of the directories and SHAs but the logs are otherwise
unchanged).
One thing I notice is that we see object_delete_requests += 1 in the
logs. I’m not sure if that means it’s doing an s3 delete, or deleting
the HDFS manifest files (to clean up the task).
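If that hour really is spent completing the 100k pending uploads a few
at a time, one knob we plan to test (again an assumption from the S3A
committer docs, not verified on our workload) is the committer thread
pool:

  import org.apache.spark.sql.SparkSession

  // Assumption (from the S3A committer docs, not yet verified by us):
  // fs.s3a.committer.threads (default 8) bounds how many pending
  // multipart uploads job commit completes in parallel; raising it may
  // shorten the job-commit phase at the cost of more concurrent S3 calls.
  val spark = SparkSession.builder()
    .config("spark.hadoop.fs.s3a.committer.threads", "64")
    .getOrCreate()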
Alternatives - Should we check out directCommitter?
We’ve also considered using the directCommitter. We understand that
the directCommitter is discouraged because it does not support
speculative execution (and because of some failure cases). Given that we do not
use speculative execution at Stripe, would the directCommitter be a
viable option for us? What are the failure scenarios to consider?
Alternatives - Can S3FileIO work well with parquet files?
Netflix has a tool called S3FileIO
<https://iceberg.apache.org/aws/#s3-fileio>. We’re wondering if it can
be used with Spark, or only with Iceberg.