ad1happy2go commented on issue #11213:
URL: https://github.com/apache/hudi/issues/11213#issuecomment-2109419958

   @bibhu107 Why can't this be achieved with the current functionality? You 
can preprocess your DataFrame with something like `groupBy` and `collect_list` 
and then save it to Hudi. You can also implement a custom payload that merges 
the previous and current lists however you need.
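
   To make the list-merging idea concrete, here is a minimal plain-Python 
sketch of the logic a custom payload might apply when it combines the stored 
list with the incoming one. It is not Hudi's payload API; the function name 
`merge_item_lists`, the `item_id` and `qty` fields, and the "incoming item 
wins" rule are all assumptions made for illustration.

```python
def merge_item_lists(previous, current, key="item_id"):
    """Merge two lists of item dicts; on a key clash the current item wins.

    Hypothetical helper: it only illustrates the merge rule, not Hudi's API.
    """
    # Index the previously stored items by their key.
    merged = {item[key]: item for item in previous}
    # Overlay the incoming items: updates replace, new keys are appended.
    for item in current:
        merged[item[key]] = item
    return list(merged.values())


# Example data (made up): item 2 is updated, item 3 is new.
previous = [{"item_id": 1, "qty": 5}, {"item_id": 2, "qty": 1}]
current = [{"item_id": 2, "qty": 9}, {"item_id": 3, "qty": 4}]
result = merge_item_lists(previous, current)
```

   In a real job the per-contract lists would come from the `groupBy` and 
`collect_list` step, and the merge rule would live inside the payload class.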
   
   However, since each contract id has 100,000 items, a nested structure would 
make a single record payload very large, and performance would be very bad. I 
am not sure the JVM could even accommodate a list that big, so the write may 
simply fail. Why not use a denormalised structure with contract_id and item_id 
as the record key?
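
   As a rough sketch of the denormalised alternative, the write options below 
use a composite record key, so each (contract_id, item_id) pair becomes its 
own row. The table name `contract_items` and the precombine column 
`updated_at` are assumptions, not taken from the issue, and partitioning is 
left out.

```python
# Hudi writer options for one row per (contract_id, item_id).
hudi_options = {
    # Hypothetical table name, chosen for this example.
    "hoodie.table.name": "contract_items",
    # Composite record key: both columns together identify a row.
    "hoodie.datasource.write.recordkey.field": "contract_id,item_id",
    # Key generator that supports more than one record key field.
    "hoodie.datasource.write.keygenerator.class":
        "org.apache.hudi.keygen.ComplexKeyGenerator",
    # Assumed ordering column; the newest value wins on upsert.
    "hoodie.datasource.write.precombine.field": "updated_at",
    "hoodie.datasource.write.operation": "upsert",
}

# With a Spark DataFrame `df` and a target path `base_path` (both assumed),
# the write would look roughly like this:
# df.write.format("hudi").options(**hudi_options).mode("append").save(base_path)

record_key_fields = hudi_options[
    "hoodie.datasource.write.recordkey.field"
].split(",")
```

   With this layout an upsert touches only the changed items instead of 
rewriting a 100,000-element list for the whole contract.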

