CTTY commented on issue #10415:
URL: https://github.com/apache/hudi/issues/10415#issuecomment-1977660648
This looks similar to this issue: https://github.com/apache/hudi/issues/7487
where a user ran into an S3 throttling issue due to too many S3 calls.
Was wondering if you can check if
ad1happy2go commented on issue #10415:
URL: https://github.com/apache/hudi/issues/10415#issuecomment-1919021038
Thanks for trying, @ergophobiac. @CTTY, any insights here?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use
ergophobiac commented on issue #10415:
URL: https://github.com/apache/hudi/issues/10415#issuecomment-1886257556
Hello @ad1happy2go ,
We ran a test with the same configurations, just one addition:
spark.hadoop.fs.s3a.connection.maximum=2000. (We found a resource saying the
default on
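For reference, a submit-time override like the one described above might look as follows. This is a sketch only: the script name is a placeholder, `fs.s3a.threads.max` is an additional, commonly paired S3A knob not mentioned in this thread, and the value 2000 is simply the one quoted in the comment above.

```shell
# Sketch: raising the S3A connection pool limit when submitting the
# streaming job on EMR. spark.hadoop.* properties are passed through
# to the Hadoop configuration used by the S3A filesystem client.
spark-submit \
  --conf spark.hadoop.fs.s3a.connection.maximum=2000 \
  --conf spark.hadoop.fs.s3a.threads.max=2000 \
  your_streaming_job.py   # placeholder for the actual job
```

The same properties can also be set in `spark-defaults.conf` or on the `SparkSession` builder; all three routes end up in the S3A client's Hadoop configuration.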
ergophobiac commented on issue #10415:
URL: https://github.com/apache/hudi/issues/10415#issuecomment-1874167209
Hey @ad1happy2go, we have a test case running; we'll observe until we're sure it's stable and let you know how it turns out.
ad1happy2go commented on issue #10415:
URL: https://github.com/apache/hudi/issues/10415#issuecomment-1874092759
@ergophobiac Did you get a chance to try this out?
ad1happy2go commented on issue #10415:
URL: https://github.com/apache/hudi/issues/10415#issuecomment-1870277355
@ergophobiac Are you setting fs.s3a.connection.maximum to a higher value?
If not, can you try increasing it and see if that helps?
ergophobiac opened a new issue, #10415:
URL: https://github.com/apache/hudi/issues/10415
**Describe the problem you faced**
Stack: Hudi 0.13.1, EMR 6.13.0, Spark 3.4.1
We are writing to an MOR table in S3, using a Spark Structured Streaming job
on EMR. Once this job has run for
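For context, a minimal pyspark sketch of the kind of job described (a Structured Streaming write to a Hudi MERGE_ON_READ table on S3). Every name, path, and field below is illustrative, not taken from the report, and the rate source stands in for whatever the real input stream is; the S3A override is the one discussed later in this thread.

```python
# Sketch only: Structured Streaming -> Hudi MOR table on S3.
# Assumes the hudi-spark bundle is on the Spark classpath (as on EMR).
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hudi-mor-stream")  # illustrative app name
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .config("spark.hadoop.fs.s3a.connection.maximum", "2000")  # override under discussion
    .getOrCreate()
)

# Stand-in source; the real job would read from Kafka, Kinesis, etc.
source_df = spark.readStream.format("rate").load()

query = (
    source_df.writeStream
    .format("hudi")
    .option("hoodie.table.name", "example_mor")                      # illustrative
    .option("hoodie.datasource.write.table.type", "MERGE_ON_READ")
    .option("hoodie.datasource.write.recordkey.field", "timestamp")  # illustrative
    .option("hoodie.datasource.write.precombine.field", "timestamp") # illustrative
    .option("checkpointLocation", "s3a://bucket/checkpoints/example_mor")  # illustrative
    .start("s3a://bucket/tables/example_mor")                        # illustrative
)
query.awaitTermination()
```

Each micro-batch of such a job issues many S3 list/get/put calls (timeline reads, log-file appends, compaction), which is where the connection-pool and throttling questions below come from.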