nsivabalan commented on issue #4656: URL: https://github.com/apache/hudi/issues/4656#issuecomment-1018900717
A couple of observations:

1. May I know why you are setting upsert parallelism to 15k? That is very high. Was it tuned intentionally? If not, I would recommend something like 200 to 300.
2. I see your second table has a lot of file groups that were replaced, which adds some latency while checking for valid file groups. Once the cleaner runs and deletes all the replaced file groups, I expect the latency hit to go away.

In general, do you know the total number of file groups in table1 vs. table2? If they are drastically different, the latency is expected to differ as well.

CC @xushiyan @codope @yihua
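For the parallelism suggestion, here is a minimal sketch, assuming the job passes its Hudi settings to the Spark datasource writer as a dict of options. `hoodie.upsert.shuffle.parallelism` is the Hudi write config that controls upsert parallelism; the helper name `with_upsert_parallelism`, the table name, and the value 200 are illustrative assumptions, not taken from the issue.

```python
# Hypothetical helper: returns a copy of the writer options with the upsert
# shuffle parallelism overridden. Only the config keys are real Hudi settings.
def with_upsert_parallelism(options, parallelism=200):
    if parallelism <= 0:
        raise ValueError("parallelism must be positive")
    updated = dict(options)  # copy so the caller's dict is left untouched
    updated["hoodie.upsert.shuffle.parallelism"] = str(parallelism)
    return updated


# Placeholder options standing in for the job's existing settings.
current_options = {
    "hoodie.table.name": "table2",
    "hoodie.datasource.write.operation": "upsert",
    "hoodie.upsert.shuffle.parallelism": "15000",
}

tuned_options = with_upsert_parallelism(current_options, 200)
```

The resulting dict could then be handed to the writer, for example `df.write.format("hudi").options(**tuned_options)`, with the remaining required Hudi options left as they already are in the job.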
