Re: [I] [SUPPORT] Data loss in MOR table after clustering partition [hudi]

via GitHub Sun, 10 Dec 2023 23:22:00 -0800


ad1happy2go commented on issue #9977:
URL: https://github.com/apache/hudi/issues/9977#issuecomment-1849459064


   @mzheng-plaid I looked into your code and found that your dataset has more 
than 100 fields, which is the cause of this issue. I set the config 
`("spark.sql.codegen.maxFields", "120")` and it worked for me. After setting 
this, there are no duplicates.
   
   Can you please try this config? If your original dataset also has more than 
100 columns, it may work for you as well. Thanks.
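
   For reference, a minimal PySpark sketch of one way this setting could be 
applied when the session is created. The helper names `build_session` and 
`spark_submit_flags` and the application name are hypothetical and only for 
illustration; the value `120` is the one that worked in my test, so pick a 
value above your own field count.

```python
# Config override discussed above; "120" is the value that worked in the
# reproduction, not a general recommendation.
SPARK_CONF = {"spark.sql.codegen.maxFields": "120"}


def build_session(app_name="hudi-clustering-repro"):
    """Create a SparkSession with the override applied (app name is hypothetical)."""
    # Imported lazily so this module can be loaded without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in SPARK_CONF.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def spark_submit_flags():
    """Return the same override as spark-submit command-line arguments."""
    flags = []
    for key, value in SPARK_CONF.items():
        flags.extend(["--conf", f"{key}={value}"])
    return flags
```

   Calling `build_session()` before running the clustering job, or passing the 
output of `spark_submit_flags()` to `spark-submit`, should have the same 
effect as the tuple shown above.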


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
