sandyfog opened a new issue, #6862:
URL: https://github.com/apache/paimon/issues/6862

   ### Search before asking
   
   - [x] I searched in the [issues](https://github.com/apache/paimon/issues) 
and found nothing similar.
   
   
   ### Paimon version
   
   1.2.0
   
   ### Compute Engine
   
   flink 1.20
   
   ### Minimal reproduce step
   
   ## Create the source changelog table
   ```
   CREATE TABLE testlog(
     id STRING PRIMARY KEY NOT ENFORCED,
     f1 STRING,
     `delete` INT  -- DELETE is a reserved keyword in Flink SQL, so it is quoted
   ) WITH (
     'merge-engine' = 'deduplicate',
     'changelog-producer' = 'lookup'
   );
   
   -- seed data
   INSERT INTO testlog VALUES ('11', '11', 0), ('12', '12', 0);
   
   -- update id=12 to be logically deleted
   INSERT INTO testlog VALUES ('12', '12', 1), ('13', '13', 0);
   ```
   
   ## Query source table 
   ```
   SELECT * FROM testlog;
   
   op   id   f1   delete
   +I   11   11   0
   +I   12   12   0
   -U   12   12   0
   +U   12   12   1
   +I   13   13   0
   ```
   
   ## Query source table with filter
   ```
   SELECT * FROM testlog WHERE `delete` = 0;
   
   op   id   f1   delete
   +I   11   11   0
   +I   12   12   0
   -U   12   12   0
   +I   13   13   0
   ```
   Note:
   After applying the filter `delete = 0`, the record `+U 12 12 1` is dropped 
entirely, while its matching `-U 12 12 0` still passes the filter.
   Consequently, the last message about id = 12 that reaches the downstream 
Paimon table is `-U 12 12 0`.
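
The effect described in the note can be reproduced with a small toy model. This is only a sketch in Python, not Flink or Paimon code: `filter_changelog` and `last_op` are hypothetical names, and the model assumes the filter is evaluated on each changelog record independently of its op.

```python
# Toy model of a row-wise filter over a changelog stream.
# Not Paimon or Flink code; the names below are illustrative only.

# (op, id, f1, delete) records as emitted by the unfiltered source query.
changelog = [
    ("+I", "11", "11", 0),
    ("+I", "12", "12", 0),
    ("-U", "12", "12", 0),
    ("+U", "12", "12", 1),
    ("+I", "13", "13", 0),
]


def filter_changelog(records, predicate):
    """Keep each record whose payload satisfies the predicate, regardless of op."""
    return [r for r in records if predicate(r)]


# Equivalent of: WHERE delete = 0
filtered = filter_changelog(changelog, lambda r: r[3] == 0)

for op, id_, f1, delete in filtered:
    print(op, id_, f1, delete)

# Last op seen per key after filtering.
last_op = {}
for op, id_, _, _ in filtered:
    last_op[id_] = op
```

Under this assumption the `-U` for id = 12 survives because its payload still has `delete = 0`, so `last_op["12"]` ends up as `-U` with no `+U` following it.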
   
   ## Filter out logically-deleted rows and write into a partial-update table
   ```
   CREATE TABLE testlog01 (
     id STRING PRIMARY KEY NOT ENFORCED,
     f1 STRING,
     `delete` INT  -- DELETE is a reserved keyword in Flink SQL, so it is quoted
   ) WITH (
     'merge-engine' = 'partial-update',
     'partial-update.remove-record-on-delete' = 'true',
     'changelog-producer' = 'lookup'
   );
   
   INSERT INTO testlog01
   SELECT * FROM testlog WHERE `delete` = 0;
   ```
   
   ## Query the target table
   ```
   SELECT * FROM testlog01;
   
   ```
   
   ### What doesn't meet your expectations?
   
   ## Expected result
   ```
   op   id   f1   delete
   +I   11   11   0
   +I   12   12   0
   -D   12   12   0
   +I   13   13   0
   ```
   
   (id=12 should be deleted because its last message is -U and no +U reaches 
the sink)
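
The expectation above can be stated as a toy model. This sketch is in Python and is not Paimon's merge-engine implementation: `materialize` is a hypothetical helper, and it assumes plain retract semantics where `+I`/`+U` upsert a key and `-U`/`-D` remove it.

```python
# Toy model of the expected sink semantics described above.
# Hypothetical helper, not Paimon's merge-engine implementation.


def materialize(records):
    """Replay a changelog into a keyed table: +I/+U upsert, -U/-D retract."""
    table = {}
    for op, key, f1, delete in records:
        if op in ("+I", "+U"):
            table[key] = (f1, delete)
        elif op in ("-U", "-D"):
            table.pop(key, None)
    return table


# Records that pass the filter delete = 0 in the reproduction steps.
filtered = [
    ("+I", "11", "11", 0),
    ("+I", "12", "12", 0),
    ("-U", "12", "12", 0),
    ("+I", "13", "13", 0),
]

expected_table = materialize(filtered)
print(sorted(expected_table))
```

Under these assumed semantics only ids 11 and 13 remain, which matches the expected result: id = 12 is retracted and never re-inserted.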
   
   ## Actual result
   ```
   op   id   f1   delete
   +I   11   11   0
   +I   12   12   0
   -U   12   12   0
   +U   12   12   0
   +I   13   13   0
   ```
   The stale row id=12 is still present, breaking data correctness.
   
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [x] I'm willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
