bvaradar commented on issue #1582:
URL: https://github.com/apache/incubator-hudi/issues/1582#issuecomment-623022854


   Thanks for the details. One of the primary contracts within Hudi is the 
uniqueness of the record key within a partition/dataset. Instead, can you 
materialize the grouping within the record? To elaborate, can you add a nested 
array-of-struct field, "audit_log" (the inner struct having the same structure 
as the top-level struct, minus audit_log), to your schema? It would contain the 
list of record images at each ingest time, and your custom payload would append 
all previous images as part of combineAndGetUpdateValue and preCombine. This 
way, if you want the latest image, you simply skip projecting "audit_log" in 
your query and don't have to deal with reduce-by. 
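
To make the suggestion concrete, here is a minimal sketch of the merge logic in plain Python. It does not use the Hudi API: records are modelled as dicts, the function names only mirror the payload methods mentioned above, and the "ts" ordering field and the sample records are assumptions made for illustration. A real payload would be a Java class implementing HoodieRecordPayload.

```python
from copy import deepcopy

AUDIT_FIELD = "audit_log"  # name taken from the comment above


def strip_audit(record):
    """Return a copy of a record image without its audit_log field."""
    return {k: deepcopy(v) for k, v in record.items() if k != AUDIT_FIELD}


def merge_with_audit(older, newer):
    """Keep `newer` as the latest image and carry all earlier images in audit_log.

    History order is chronological: images already logged on `older`,
    then `older` itself, then images already logged on `newer`.
    """
    merged = strip_audit(newer)
    history = list(older.get(AUDIT_FIELD, []))
    history.append(strip_audit(older))
    history.extend(newer.get(AUDIT_FIELD, []))
    merged[AUDIT_FIELD] = deepcopy(history)
    return merged


def pre_combine(a, b, ordering_field="ts"):
    """Stand-in for preCombine: merge two incoming records with the same key.

    `ordering_field` is a hypothetical column used to decide which is newer.
    """
    older, newer = (a, b) if a[ordering_field] <= b[ordering_field] else (b, a)
    return merge_with_audit(older, newer)


def combine_and_get_update_value(stored, incoming):
    """Stand-in for combineAndGetUpdateValue: merge incoming onto the stored record."""
    return merge_with_audit(stored, incoming)


# Hypothetical sample data: three images of the same record key.
v1 = {"id": 1, "val": "a", "ts": 1}
v2 = {"id": 1, "val": "b", "ts": 2}
v3 = {"id": 1, "val": "c", "ts": 3}

batch = pre_combine(v3, v2)                       # two images in one ingest
stored = combine_and_get_update_value(v1, batch)  # merged onto what is on disk
latest = strip_audit(stored)                      # "skip projecting audit_log"
```

The record key stays unique because only one row per key is ever written; the history travels inside that row, and a reader who only wants the latest image leaves audit_log out of the projection.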

