KnightChess commented on issue #10418:
URL: https://github.com/apache/hudi/issues/10418#issuecomment-1877175659

   @zhangjw123321 It looks like `hoodie.bulkinsert.shuffle.parallelism` does not 
take effect on a non-partitioned table in the code. From the Spark UI, it seems 
you have not set `spark.default.parallelism`, so `reduceByKey` falls back to the 
parent RDD's partition count. Can you try `set spark.default.parallelism=100;`? 
I think it will reduce the parallelism of `stage 10` to 100.
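
   As a sketch, the session-level settings in spark-sql might look like the 
following (the value `100` is illustrative; tune it to your workload):

   ```sql
   -- Controls reduceByKey's partition count when the operator is called
   -- without an explicit numPartitions argument:
   set spark.default.parallelism=100;
   -- Hudi's bulk_insert shuffle parallelism; per the note above, this
   -- appears to apply only to partitioned tables:
   set hoodie.bulkinsert.shuffle.parallelism=100;
   ```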

