ad1happy2go commented on issue #9549: URL: https://github.com/apache/hudi/issues/9549#issuecomment-1694665618
@lvyanquan Yes, this is expected and by design, because you are not providing a precombine field for your table. We normally do that for our immutable datasets. As a result, the write runs with operation type insert, which doesn't use tagging, so you may see duplicates. We also do small file handling, which is why you don't see duplicates in use case 1. If you want duplicates kept there as well, you can turn on the allow-duplicates flag (hoodie.merge.allow.duplicate.on.inserts). I understand this may cause confusion, which is why in the new release we are enabling the flag automatically so that all duplicates flow in. Alternatively, you can provide a precombineField; the write will then behave as an upsert and no duplicates will be created.
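
For illustration, here is a minimal sketch of the two configurations described in the comment, written as a plain Python helper that builds Hudi write options. The helper name `hudi_write_options` and the column names (`id`, `ts`) are hypothetical. Setting `hoodie.datasource.write.operation` explicitly is an assumption made for clarity; the comment describes Hudi choosing the operation on its own depending on whether a precombine field is present.

```python
def hudi_write_options(table_name, record_key, precombine_field=None,
                       allow_duplicates=False):
    """Build a dict of Hudi datasource write options (illustrative sketch only)."""
    options = {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
    }
    if precombine_field is not None:
        # With a precombine field, the write behaves as an upsert:
        # records sharing a key are merged, so no duplicates are created.
        options["hoodie.datasource.write.precombine.field"] = precombine_field
        options["hoodie.datasource.write.operation"] = "upsert"
    else:
        # Without a precombine field, the write runs as an insert (no tagging).
        # The flag below controls whether duplicates are kept when inserts
        # are merged into existing small files.
        options["hoodie.datasource.write.operation"] = "insert"
        options["hoodie.merge.allow.duplicate.on.inserts"] = str(allow_duplicates).lower()
    return options


# Hypothetical table and column names.
insert_opts = hudi_write_options("my_table", "id", allow_duplicates=True)
upsert_opts = hudi_write_options("my_table", "id", precombine_field="ts")
```

With a Spark DataFrame `df` and a target `path` (both assumed here), either dict could be passed along the lines of `df.write.format("hudi").options(**insert_opts).mode("append").save(path)`.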
