tjtoll commented on issue #4682: URL: https://github.com/apache/hudi/issues/4682#issuecomment-1076318960
Good morning. We are experiencing the same issue with .10 and .9 (see UI below). We are also using S3, but with AWS Glue rather than EMR. What stands out to me are the 3 consecutive 'Getting small files from partitions' stages, which run with 4, 20, and 100 tasks respectively. The stages with 4 and 20 tasks are obviously getting very poor parallelization. The identical behavior appears on my UI and [ChiehFu](https://github.com/ChiehFu)'s.
