[
https://issues.apache.org/jira/browse/HADOOP-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17750405#comment-17750405
]
ASF GitHub Bot commented on HADOOP-18757:
-----------------------------------------
dongjoon-hyun commented on code in PR #5706:
URL: https://github.com/apache/hadoop/pull/5706#discussion_r1282145500
##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/impl/CommitContext.java:
##########
@@ -236,7 +236,7 @@ private ExecutorService buildThreadPool(
.setDaemon(true)
.setNameFormat(THREAD_PREFIX + jobId + "-%d")
.build();
- return new HadoopThreadPoolExecutor(0, numThreads,
+ return new HadoopThreadPoolExecutor(numThreads, numThreads,
Review Comment:
Hi, @steveloughran. In this case, it seems there is no workaround in
Apache Hadoop 3.3.5 or 3.3.6. Do you have any recommendation for this issue?
Currently, the Apache Spark 3.5.0 RC1 tag uses Apache Hadoop 3.3.6.
- https://github.com/apache/spark/releases/tag/v3.5.0-rc1
```
[SPARK-44197][BUILD] Upgrade Hadoop to 3.3.6
[SPARK-42913][BUILD] Upgrade Hadoop to 3.3.5
```
> S3A Committer only finalizes the commits in a single thread
> -----------------------------------------------------------
>
> Key: HADOOP-18757
> URL: https://issues.apache.org/jira/browse/HADOOP-18757
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 3.3.5
> Reporter: Moditha Hewasinghage
> Assignee: Moditha Hewasinghage
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
>
> The S3A Committer is bottlenecked on the driver when finalizing the
> commits: it appears that only a single thread is used to finalize them. In
> our experiment, saving 36,000 files took almost 2 hours to commit, with
> each file taking 0.1 - 0.5 seconds, while all the executors stayed idle as
> the driver committed. I have attached the driver log snippets to support
> this theory, in comparison to Spark 3.4.0.
> The most likely reason is the usage of
> [ThreadPoolExecutor]([https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadPoolExecutor.html])
> in
> [https://github.com/apache/hadoop/blob/706d88266abcee09ed78fbaa0ad5f74d818ab0e9/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/impl/CommitContext.java#L239-L244],
> where a LinkedBlockingQueue is used with a corePoolSize of 0, which ends up
> with a single thread and no new threads ever being created.
> From the ThreadPoolExecutor documentation:
> {code:java}
> Unbounded queues. Using an unbounded queue (for example a LinkedBlockingQueue
> without a predefined capacity) will cause new tasks to wait in the queue when
> all corePoolSize threads are busy. Thus, no more than corePoolSize threads
> will ever be created. (And the value of the maximumPoolSize therefore doesn't
> have any effect.) This may be appropriate when each task is completely
> independent of others, so tasks cannot affect each others execution; for
> example, in a web page server. While this style of queuing can be useful in
> smoothing out transient bursts of requests, it admits the possibility of
> unbounded work queue growth when commands continue to arrive on average
> faster than they can be processed.{code}
> Magic Committer, Spark 3.5.0-SNAPSHOT:
> {code:java}
> 2023-05-26 15:35:04,852 DEBUG impl.CommitContext: creating thread pool of
> size 32
> 2023-05-26 15:35:04,922 INFO yarn.YarnAllocator: Driver requested a total
> number of 0 executor(s) for resource profile id: 0.
> 2023-05-26 15:35:07,910 INFO commit.AbstractS3ACommitter: Starting:
> committing the output of 36000 task(s)
> 2023-05-26 15:35:07,914 INFO commit.AbstractS3ACommitter: Starting: Loading
> and committing files in pendingset
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b9167b1a-35a5-448
> e-8022-31e0f7457354/00/task_202305261454174742717183892533843_0031_m_000000.pendingset
> 2023-05-26 15:35:07,941 DEBUG files.PersistentCommitData: Reading commit data
> from file
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b9167b1a-35a5-448e-8022-31e0f7457354/00/
> task_202305261454174742717183892533843_0031_m_000000.pendingset
> 2023-05-26 15:35:08,036 DEBUG impl.CommitContext: creating thread pool of
> size 32
> 2023-05-26 15:35:08,037 DEBUG impl.CommitOperations: Committing single commit
> SinglePendingCommit
> {version=2,
> uri='s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b9167b1a-35a5-
> 448e-8022-31e0f7457354/00/tasks/attempt_202305261454174742717183892533843_0031_m_000000_1920855/__base/rawRequestType=imp_dsp/part-00000-b196e04e-c87b-43e1-97e0-7ad34ef0bedf.c000.zstd.parquet',
> destination='<PATH>/date=20230520/rawRequestType=imp_dsp/part-00000-b196e04e-c87b-43e1-97e0-7ad34ef0bedf.c000.zstd.parquet',
> uploadId='SCp78J9aYOmnPwrtfd5a.Q6B9Zu6olQw3eZcpyt.W.BCu6.M6fz54nlTe2ZYUDicSskFgPocsjlVXAbfiPn3Xu
> 26MzXNdWcg5j_kBGg9iWpeoWh4K21gt7bbKetML95MXAck15yP.VGFeOleoAspUg--',
> created=1685113555232, saved=1685113555232, size=110636173, date='Fri May 26
> 15:05:55 UTC 2023', jobId='b9167b1a-35a5-448e-8022-31e0f7457354',
> taskId='attempt_202305261454174742717183892533843_0031_m_000000_1920855',
> notes='',
> etags=[64b9d14a89890cf7f931ff05074f6b6a,7a8c22d95d85736806205ad0f84094a5,82f55a1b0595b38420c6b55cb2763a46,212a3105afcf4a9045730f885a622e0a,b563ec91932a66498e8e9d717b625961,0425b7ae17a8a2af80734f0adf4debe9,496e9fbc08a583657fd29779dcd848d6,72bff33d090699baacb6a6171760a675,b2c6b9b76256416d304392bfbabca382,1a8f63d960cf50589b83b407ddfc77c7,d79cf49f16cd61fb00faf67034268529,e3b7ea2bd0fff2e20a89eccd0ebbeb4a,8edc254a94902fee2884350b4a89f223,f0858853a4542b562c6973e28ee40c19]}
> 2023-05-26 15:35:08,037 INFO impl.CommitOperations: Starting: Committing file
> <PATH>/date=20230520/rawRequestType=imp_dsp/part-00000-b196e04e-c87b-43e1-97e0-7ad34ef0bedf.c000.zstd.parquet
> size 110636173
> 2023-05-26 15:35:08,198 DEBUG impl.CommitOperations: Successful commit of
> file length 110636173
> 2023-05-26 15:35:08,199 INFO impl.CommitOperations: Committing file
> <PATH>/date=20230520/rawRequestType=imp_dsp/part-00000-b196e04e-c87b-43e1-97e0-7ad34ef0bedf.c000.zstd.parquet
> size 110636173: duration 0:00.162s
> 2023-05-26 15:35:08,208 INFO commit.AbstractS3ACommitter: Loading and
> committing files in pendingset
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b9167b1a-35a5-448e-8022-31e
> 0f7457354/00/task_202305261454174742717183892533843_0031_m_000000.pendingset:
> duration 0:00.294s
> 2023-05-26 15:35:08,208 INFO commit.AbstractS3ACommitter: Starting: Loading
> and committing files in pendingset
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b9167b1a-35a5-448
> e-8022-31e0f7457354/00/task_202305261454174742717183892533843_0031_m_000001.pendingset
> 2023-05-26 15:35:08,208 DEBUG files.PersistentCommitData: Reading commit data
> from file
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b9167b1a-35a5-448e-8022-31e0f7457354/00/
> task_202305261454174742717183892533843_0031_m_000001.pendingset
> 2023-05-26 15:35:08,232 DEBUG impl.CommitOperations: Committing single commit
> SinglePendingCommit
> {version=2,
> uri='s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b9167b1a-35a5-
> 448e-8022-31e0f7457354/00/tasks/attempt_202305261454174742717183892533843_0031_m_000001_1920856/__base/rawRequestType=imp_dsp/part-00001-b196e04e-c87b-43e1-97e0-7ad34ef0bedf.c000.zstd.parquet',
> destination='<PATH>/date=20230520/rawRequestType=imp_dsp/part-00001-b196e04e-c87b-43e1-97e0-7ad34ef0bedf.c000.zstd.parquet',
> uploadId='I.5Kns.n5JKLbupr5.AWbzApL9YDGhQyI438WQT3JXUjevkQS8u1aQwQhkVbwBoxnFXItCbKJCS8EwPdtoijtc
> UFrjU0eU3pNvt7lwcYRJ0iklLwXexHbYMpZq3pDj0udwsHCq0trnqyq2Ee.o.prw--',
> created=1685113548923, saved=1685113548923, size=110570244, date='Fri May 26
> 15:05:48 UTC 2023', jobId='b9167b1a-35a5-448e-8022-31e0f7457354',
> taskId='attempt_202305261454174742717183892533843_0031_m_000001_1920856',
> notes='',
> etags=[835dd18d4ce995c22c1a84de5cd26fd8,c1f2acd3360b2f94ce90ef0d80edd7e5,208787fce146d917dbc50b1cf4adff4c,a66b228379201a0a278061b096dc2faf,58134b7ae228f0d2a6879ec53a3f35c1,e847a7614639d3f1a43a1b3090b603e1,2d8eac2399238e6d356fe0bdfafd076e,e0c3a964ee2a3c270f67e7e4a7791440,84b05192e896e41e1d16aa363f0eb48b,c3cf764e4361da4573b8dfa6361e4174,78b145f682946c7e18750758da3578cd,d9cd438899d20522668b9cd7f61cc098,e8b3944929f17d927ed7aef31800bdbf,52d279a75e9382ead94fe4d9b81c3bf9]}
> 2023-05-26 15:35:08,232 INFO impl.CommitOperations: Starting: Committing file
> <PATH>/date=20230520/rawRequestType=imp_dsp/part-00001-b196e04e-c87b-43e1-97e0-7ad34ef0bedf.c000.zstd.parquet
> size 110570244
> 2023-05-26 15:35:08,438 DEBUG impl.CommitOperations: Successful commit of
> file length 110570244
> 2023-05-26 15:35:08,438 INFO impl.CommitOperations: Committing file
> <PATH>/date=20230520/rawRequestType=imp_dsp/part-00001-b196e04e-c87b-43e1-97e0-7ad34ef0bedf.c000.zstd.parquet
> size 110570244: duration 0:00.206s
> 2023-05-26 15:35:08,448 INFO commit.AbstractS3ACommitter: Loading and
> committing files in pendingset
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b9167b1a-35a5-448e-8022-31e
> 0f7457354/00/task_202305261454174742717183892533843_0031_m_000001.pendingset:
> duration 0:00.240s
> 2023-05-26 15:35:08,448 INFO commit.AbstractS3ACommitter: Starting: Loading
> and committing files in pendingset
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b9167b1a-35a5-448
> e-8022-31e0f7457354/00/task_202305261454174742717183892533843_0031_m_000002.pendingset
> 2023-05-26 15:35:08,448 DEBUG files.PersistentCommitData: Reading commit data
> from file
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b9167b1a-35a5-448e-8022-31e0f7457354/00/
> task_202305261454174742717183892533843_0031_m_000002.pendingset
> 2023-05-26 15:35:08,489 DEBUG impl.CommitOperations: Committing single commit
> SinglePendingCommit
> {version=2,
> uri='s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b9167b1a-35a5-
> 448e-8022-31e0f7457354/00/tasks/attempt_202305261454174742717183892533843_0031_m_000002_1920857/__base/rawRequestType=imp_dsp/part-00002-b196e04e-c87b-43e1-97e0-7ad34ef0bedf.c000.zstd.parquet',
> destination='<PATH>/date=20230520/rawRequestType=imp_dsp/part-00002-b196e04e-c87b-43e1-97e0-7ad34ef0bedf.c000.zstd.parquet',
> uploadId='AB982jhG0EAfnSxXWphX7R.QfqrkS2ipP6APqJZYGlqDl335WX4X2xn81wy7l2i5NBQrvG.eVcQgYFudk51wi5
> EMDJRNVz__SEWyF0xTncpIHpaC5xn8La.AY8gzCHQrAfRJ83nErQhe4Idmhcs04w--',
> created=1685113553072, saved=1685113553072, size=110136058, date='Fri May 26
> 15:05:53 UTC 2023', jobId='b9167b1a-35a5-448e-8022-31e0f7457354',
> taskId='attempt_202305261454174742717183892533843_0031_m_000002_1920857',
> notes='',
> etags=[f043628528fc9bdd6db4107fd175eb1d,11e36b4bde35bdb1d2e160ffd5fcc3f6,5105cb3ebee1b27cfec66f7682679316,1a85e5c9477e855e38f7bd4e81bed28e,7c57360865e12c19b3125cbfb8c786f8,59d928629e7afff4c4c038a1dadd0ce7,bfcedee2f51b45c20a097fbbae85a48f,7e3ff2665714d8cdc6ec20134856eb4c,0117e5c47f94a9ff4d8d0eafd1f6b76e,3c4f27e69544e4cc5297c56413231639,a32cd6422aaae63bdcbaafa7400ab889,1d5d07f0a0b258773993a456bc7ec7ee,a220e786aa36f643edc330a184e23d88,e19a85ab3accaa4dc697e7dfbf5d5325]}
> 2023-05-26 15:35:08,489 INFO impl.CommitOperations: Starting: Committing file
> <PATH>/date=20230520/rawRequestType=imp_dsp/part-00002-b196e04e-c87b-43e1-97e0-7ad34ef0bedf.c000.zstd.parquet
> size 110136058
> 2023-05-26 15:35:08,646 DEBUG impl.CommitOperations: Successful commit of
> file length 110136058
> 2023-05-26 15:35:08,646 INFO impl.CommitOperations: Committing file
> <PATH>/date=20230520/rawRequestType=imp_dsp/part-00002-b196e04e-c87b-43e1-97e0-7ad34ef0bedf.c000.zstd.parquet
> size 110136058: duration 0:00.157s
> {code}
> Magic Committer, Spark 3.4.0:
>
> {code:java}
> 2023-05-24 09:47:19,906 INFO yarn.YarnAllocator: Driver requested a total
> number of 524 executor(s) for resource profile id: 0.
> 2023-05-24 09:47:23,064 INFO commit.AbstractS3ACommitter: Starting:
> committing the output of 36000 task(s)
> 2023-05-24 09:47:23,069 DEBUG commit.AbstractS3ACommitter: Task committer
> attempt_202305240933298436217226765687147_0000_m_000000_0: creating thread
> pool of size 32
> 2023-05-24 09:47:23,074 DEBUG commit.Tasks: Executing task
> 2023-05-24 09:47:23,074 INFO commit.AbstractS3ACommitter: Starting: Loading
> and committing files in pendingset
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic
> /job-b15cffa2-2764-45f6-883b-7ae3c5785060/task_20230524093329924571364925494992_0031_m_000000.pendingset
> 2023-05-24 09:47:23,075 DEBUG commit.Tasks: Executing task
> 2023-05-24 09:47:23,075 INFO commit.AbstractS3ACommitter: Starting: Loading
> and committing files in pendingset
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic
> /job-b15cffa2-2764-45f6-883b-7ae3c5785060/task_20230524093329924571364925494992_0031_m_000001.pendingset
> 2023-05-24 09:47:23,075 DEBUG commit.Tasks: Executing task
> 2023-05-24 09:47:23,075 INFO commit.AbstractS3ACommitter: Starting: Loading
> and committing files in pendingset
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic
> /job-b15cffa2-2764-45f6-883b-7ae3c5785060/task_20230524093329924571364925494992_0031_m_000002.pendingset
> 2023-05-24 09:47:23,075 DEBUG commit.Tasks: Executing task
> 2023-05-24 09:47:23,075 INFO commit.AbstractS3ACommitter: Starting: Loading
> and committing files in pendingset
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic
> /job-b15cffa2-2764-45f6-883b-7ae3c5785060/task_20230524093329924571364925494992_0031_m_000003.pendingset
> 2023-05-24 09:47:23,075 DEBUG commit.Tasks: Executing task
> 2023-05-24 09:47:23,075 INFO commit.AbstractS3ACommitter: Starting: Loading
> and committing files in pendingset
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic
> /job-b15cffa2-2764-45f6-883b-7ae3c5785060/task_20230524093329924571364925494992_0031_m_000004.pendingset
> 2023-05-24 09:47:23,076 DEBUG commit.Tasks: Executing task
> 2023-05-24 09:47:23,076 INFO commit.AbstractS3ACommitter: Starting: Loading
> and committing files in pendingset
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic
> /job-b15cffa2-2764-45f6-883b-7ae3c5785060/task_20230524093329924571364925494992_0031_m_000005.pendingset
> 2023-05-24 09:47:23,076 DEBUG commit.Tasks: Executing task
> 2023-05-24 09:47:23,076 INFO commit.AbstractS3ACommitter: Starting: Loading
> and committing files in pendingset
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic
> /job-b15cffa2-2764-45f6-883b-7ae3c5785060/task_20230524093329924571364925494992_0031_m_000006.pendingset
> 2023-05-24 09:47:23,076 DEBUG commit.Tasks: Executing task
> 2023-05-24 09:47:23,076 DEBUG files.PendingSet: Reading pending commits in
> file
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b15cffa2-2764-45f6-883b-7a
> e3c5785060/task_20230524093329924571364925494992_0031_m_000004.pendingset
> 2023-05-24 09:47:23,076 INFO commit.AbstractS3ACommitter: Starting: Loading
> and committing files in pendingset
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic
> /job-b15cffa2-2764-45f6-883b-7ae3c5785060/task_20230524093329924571364925494992_0031_m_000007.pendingset
> 2023-05-24 09:47:23,076 DEBUG files.PendingSet: Reading pending commits in
> file
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b15cffa2-2764-45f6-883b-7a
> e3c5785060/task_20230524093329924571364925494992_0031_m_000003.pendingset
> 2023-05-24 09:47:23,076 DEBUG files.PendingSet: Reading pending commits in
> file
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b15cffa2-2764-45f6-883b-7a
> e3c5785060/task_20230524093329924571364925494992_0031_m_000000.pendingset
> 2023-05-24 09:47:23,076 DEBUG files.PendingSet: Reading pending commits in
> file
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b15cffa2-2764-45f6-883b-7a
> e3c5785060/task_20230524093329924571364925494992_0031_m_000002.pendingset
> 2023-05-24 09:47:23,076 DEBUG files.PendingSet: Reading pending commits in
> file
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b15cffa2-2764-45f6-883b-7a
> e3c5785060/task_20230524093329924571364925494992_0031_m_000001.pendingset
> 2023-05-24 09:47:23,078 DEBUG commit.Tasks: Executing task
> 2023-05-24 09:47:23,076 DEBUG files.PendingSet: Reading pending commits in
> file
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b15cffa2-2764-45f6-883b-7ae3c5785060/task_20230524093329924571364925494992_0031_m_000005.pendingset
> 2023-05-24 09:47:23,076 DEBUG files.PendingSet: Reading pending commits in
> file
> s3://<BUCKET_NAME>/<PATH>/date=20230520/__magic/job-b15cffa2-2764-45f6-883b-7ae3c5785060/task_20230524093329924571364925494992_0031_m_000007.pendingset{code}
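The corePoolSize behaviour described in the issue can be reproduced with a short standalone sketch. It builds a plain ThreadPoolExecutor over an unbounded LinkedBlockingQueue (the same shape as the pool in CommitContext) and counts how many distinct worker threads actually run; the class and method names (CorePoolSizeDemo, distinctWorkers) are illustrative, not from the Hadoop codebase:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Counts how many distinct worker threads a ThreadPoolExecutor actually
// uses when fed more tasks than it has core threads.
public class CorePoolSizeDemo {

    static int distinctWorkers(int corePoolSize, int maxPoolSize) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                corePoolSize, maxPoolSize,
                60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>());   // unbounded queue, as in CommitContext
        Set<String> workers = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < 32; i++) {
            pool.execute(() -> {
                workers.add(Thread.currentThread().getName());
                try {
                    Thread.sleep(20);           // keep workers busy so more could spawn
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return workers.size();
    }

    public static void main(String[] args) throws InterruptedException {
        // Buggy shape: corePoolSize = 0 means every task is queued and at
        // most one worker is ever started; maximumPoolSize = 32 has no effect.
        System.out.println("core=0,  max=32 -> " + distinctWorkers(0, 32) + " worker(s)");
        // Fixed shape (corePoolSize == maximumPoolSize, as in the PR change):
        // the pool really does run 32 threads.
        System.out.println("core=32, max=32 -> " + distinctWorkers(32, 32) + " worker(s)");
    }
}
```

Running it shows one worker for the (0, 32) configuration and 32 for (32, 32), matching the documentation quoted in the issue: with an unbounded queue, no more than corePoolSize threads are ever created.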
--
This message was sent by Atlassian Jira
(v8.20.10#820010)