[GitHub] [hudi] nsivabalan commented on pull request #9722: [HUDI-6863] Revert auto-tuning of dedup parallelism

via GitHub Fri, 15 Sep 2023 09:43:56 -0700


nsivabalan commented on PR #9722:
URL: https://github.com/apache/hudi/pull/9722#issuecomment-1721566580


   Let's revisit the problem 6802 was tackling. The main issue it addressed 
was making our shuffle parallelism dynamic and relative to the incoming df's 
number of partitions. That way, someone running 1000s of pipelines doesn't need 
to statically set the right shuffle parallelism value for each of those 
pipelines. 
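   
   To make the idea concrete, here is a rough sketch of what "dynamic and 
relative to the incoming df" could look like. This is not Hudi's actual code: 
the function name and the "0 or less means not set" convention are assumptions 
made only for illustration.

```python
def resolve_shuffle_parallelism(configured: int, incoming_partitions: int) -> int:
    """Hypothetical helper: pick a shuffle parallelism for one incoming batch.

    Assumption: a configured value <= 0 means the user did not set one.
    """
    if configured > 0:
        # An explicit user setting always wins.
        return configured
    # Otherwise follow the incoming batch, never dropping below 1.
    return max(1, incoming_partitions)
```

   With something along these lines, a pipeline that leaves the value unset 
would simply follow the partition count of each incoming batch, while an 
explicit setting would still take precedence.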
   
   Can you help me understand what issue we are hitting that warrants 
reverting it?
   Also, this would mean going back to the old state, where we expect 
users to explicitly configure the shuffle parallelism. 
   If so, do we have a plan for dynamically choosing the right shuffle 
partition value depending on the incoming batch? 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
