This might be a basic question. I'm experimenting with Hudi (PySpark). I have used the insert/upsert options to write deltas into my data lake. However, one thing is not clear to me:
Step 1: I write 50 records.

Step 2: I write 50 records again, of which only *10 records have been changed* (I'm using upsert mode, and I tried both MERGE_ON_READ and COPY_ON_WRITE).

Step 3: I expected only the 10 changed records to be written, but all 50 records are written.

Is this normal behaviour? Does it mean I need to determine the delta myself and write only those records? Am I missing something?
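
To illustrate what I mean by "determining the delta myself", here is a minimal pure-Python sketch of the idea. It does not use Spark or Hudi at all. The `id` record key, the `value` field, and the `changed_records` helper are hypothetical names made up for this example, not anything from the Hudi API.

```python
def changed_records(existing, incoming, key="id"):
    """Return only the incoming records that are new or differ from the existing ones.

    `key` is an assumed record-key field; rows are compared as whole dicts.
    """
    existing_by_key = {row[key]: row for row in existing}
    return [row for row in incoming if existing_by_key.get(row[key]) != row]


# Step 1: the 50 records written first.
step1 = [{"id": i, "value": f"v{i}"} for i in range(50)]

# Step 2: the same 50 keys, of which only the first 10 rows have changed.
step2 = [
    {"id": i, "value": f"v{i}-updated" if i < 10 else f"v{i}"}
    for i in range(50)
]

# The delta I would then pass to the upsert instead of all 50 rows.
delta = changed_records(step1, step2)
print(len(delta))  # prints 10
```

In PySpark I assume the equivalent would be something like a `left_anti` join of the incoming DataFrame against the existing table on all columns before calling the writer, but I have not verified that this is the recommended approach with Hudi.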
