Re: [I] [SUPPORT] Issue with Repartition on Kafka Input DataFrame and Same Precombine Value Rows In One Batch [hudi]

2024-04-16 Thread via GitHub
ad1happy2go commented on issue #10995: URL: https://github.com/apache/hudi/issues/10995#issuecomment-2058447334 @brightwon Thanks for the update. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [I] [SUPPORT] Issue with Repartition on Kafka Input DataFrame and Same Precombine Value Rows In One Batch [hudi]

2024-04-16 Thread via GitHub
brightwon closed issue #10995: [SUPPORT] Issue with Repartition on Kafka Input DataFrame and Same Precombine Value Rows In One Batch URL: https://github.com/apache/hudi/issues/10995

Re: [I] [SUPPORT] Issue with Repartition on Kafka Input DataFrame and Same Precombine Value Rows In One Batch [hudi]

2024-04-15 Thread via GitHub
ad1happy2go commented on issue #10995: URL: https://github.com/apache/hudi/issues/10995#issuecomment-2056550465 @brightwon Do you have any more doubts? Feel free to close if you are good on this. Thanks.

Re: [I] [SUPPORT] Issue with Repartition on Kafka Input DataFrame and Same Precombine Value Rows In One Batch [hudi]

2024-04-12 Thread via GitHub
ad1happy2go commented on issue #10995: URL: https://github.com/apache/hudi/issues/10995#issuecomment-2051561910 @brightwon Yes, changing the precombine key will not be allowed. I do understand that you are trying to repartition to scale the tagging stage. You can try repartitioning on the record key and see

Re: [I] [SUPPORT] Issue with Repartition on Kafka Input DataFrame and Same Precombine Value Rows In One Batch [hudi]

2024-04-11 Thread via GitHub
brightwon commented on issue #10995: URL: https://github.com/apache/hudi/issues/10995#issuecomment-2050814895 @ad1happy2go Thank you for your reply. What I want is to speed up the tagging stage. Could you suggest a solution? I can achieve this by using repartition with a completely

Re: [I] [SUPPORT] Issue with Repartition on Kafka Input DataFrame and Same Precombine Value Rows In One Batch [hudi]

2024-04-11 Thread via GitHub
ad1happy2go commented on issue #10995: URL: https://github.com/apache/hudi/issues/10995#issuecomment-2050027649 @brightwon I do understand your issue. Since the precombine key is meant to be an ordering field, it should ideally contain different values for the same record key. In your case, if the
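
To illustrate the point about the precombine key acting as an ordering field, here is a minimal sketch in plain Python. It is a simplified model and not Hudi's actual implementation: the `Record` shape, the `dedupe_latest` helper, and the tie-breaking rule (keep the first row seen) are all assumptions made for illustration. It shows that when two rows share a record key and an ordering value, the surviving row can depend on the order in which rows arrive, which is one plausible reason a reordering step such as a repartition might change the outcome.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical record shape: (record_key, ordering_value, payload).
Record = Tuple[str, int, str]


def dedupe_latest(records: Iterable[Record]) -> Dict[str, Record]:
    """Keep one row per record key, preferring the larger ordering value.

    Assumption for this sketch: on a tie, the row seen first is kept.
    """
    latest: Dict[str, Record] = {}
    for rec in records:
        key, ordering, _payload = rec
        current = latest.get(key)
        # Strictly greater, so an equal ordering value never replaces the kept row.
        if current is None or ordering > current[1]:
            latest[key] = rec
    return latest


# "id-1" has two rows with the same ordering value; "id-2" has distinct values.
batch = [
    ("id-1", 100, "a"),
    ("id-1", 100, "b"),
    ("id-2", 5, "x"),
    ("id-2", 7, "y"),
]
# Same rows in a different arrival order.
shuffled = list(reversed(batch))

print(dedupe_latest(batch)["id-1"][2])
print(dedupe_latest(shuffled)["id-1"][2])
print(dedupe_latest(batch)["id-2"][2], dedupe_latest(shuffled)["id-2"][2])
```

In this model the row kept for `id-2` is the same regardless of arrival order because its ordering values differ, while the row kept for `id-1` changes with the order of the input.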