Ambarish-Giri edited a comment on issue #3605: URL: https://github.com/apache/hudi/issues/3605#issuecomment-917646992
Hi @nsivabalan, sure, I will try bulk-insert once and update. Regarding the "right value for avg record size config": hoodie.copyonwrite.record.size.estimate is specific to Copy On Write. Is there no such config for Merge On Read?

1# Upserts can be spread across partitions or confined to specific ones, depending on the data received for that day, and a batch can also contain only appends.

2# No, the record key doesn't have any timestamp affinity. As mentioned, the record key is concat(segmentId, uuid4). segmentId is an integer value, so it can be the same for multiple records, and uuid4 is a standard unique random value (note: the "-" characters are removed from the uuid4 values). The combination of both identifies a record uniquely. The partition key is again segmentId, as it has low cardinality.
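
For reference, a minimal sketch of the key layout described in this comment, under some assumptions. The helper name `make_record_key`, the column names `recordKey` and `segmentId`, and the size-estimate value are hypothetical and only for illustration. The sketch also assumes the two parts are joined without a separator, which the comment does not state. The option keys are standard Hudi write configs, but the dict is not a tested job configuration.

```python
import uuid


def make_record_key(segment_id: int) -> str:
    """Build a key as concat(segmentId, uuid4) with the dashes removed.

    uuid.uuid4().hex is the 32-character hex form of a random UUID,
    i.e. the usual string form without the "-" characters.
    """
    return f"{segment_id}{uuid.uuid4().hex}"


# Hypothetical column names; the size estimate is a placeholder in bytes,
# not a recommended value.
hudi_options = {
    "hoodie.datasource.write.recordkey.field": "recordKey",
    "hoodie.datasource.write.partitionpath.field": "segmentId",
    "hoodie.copyonwrite.record.size.estimate": "1024",
}

if __name__ == "__main__":
    # Two records in the same segment share the prefix (and the partition)
    # but get different random suffixes, so the keys carry no time ordering.
    print(make_record_key(42))
    print(make_record_key(42))
```

Because the uuid4 part is random, keys within a segment are not ordered by arrival time, which is what "no timestamp affinity" refers to above.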
