dailai commented on issue #6831:
URL: https://github.com/apache/seatunnel/issues/6831#issuecomment-2114734192

   > > > > let's focus only on the insert behavior of the sink and Paimon. Take a PK table using the deduplicate merge engine, like: `create table test(id int, tchar string, primary key(id) not enforced);`, with some data already in the table:
   > > > > id   tchar
   > > > > 1    abc
   > > > > 2    abc
   > > > > 3    abc
   > > > > if insert some data where the primary keys already exist:
   > > > > id   tchar
   > > > > 1    ccc
   > > > > 2    ccc
   > > > > querying this table (default: latest snapshot) should return the following (this is the result when using Flink/Spark to insert):
   > > > > id   tchar
   > > > > 1    ccc
   > > > > 2    ccc
   > > > > 3    abc
   > > > > when using SeaTunnel's sink to insert, it looks like Paimon has not correctly merged all of the data:
   > > > > id   tchar
   > > > > 1    abc
   > > > > 2    ccc
   > > > > 3    abc
   > > > > and when I query with time travel, none of Paimon's snapshots is correct; it looks as if some data was lost. That's why I think something may be wrong with the Paimon sink, or that the Paimon sink can't be used in batch mode to insert data with the same PK.
   > > > 
   > > > 
   > > > Which connector does your source use?
   > > 
   > > 
   > > JDBC
   > 
   > Does your source table have a primary key?
   
   If your source table has a primary key, then inserting data with the same primary key is actually a modification (update) operation, isn't it?
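   For reference, the last-write-wins behavior the quoted example expects from the deduplicate merge engine can be sketched in plain Python (this is only an illustration of the semantics, not Paimon's actual implementation; `deduplicate_merge` is a made-up helper name):

   ```python
   def deduplicate_merge(existing_rows, inserted_rows):
       """Last-write-wins merge keyed on the primary key (first tuple field).

       Rows sharing a primary key collapse to the most recently written row,
       which is what Paimon's deduplicate merge engine is expected to do.
       """
       merged = {pk: val for pk, val in existing_rows}
       for pk, val in inserted_rows:
           merged[pk] = val  # newer write replaces the older row with the same PK
       return sorted(merged.items())

   # Data from the example above: three existing rows, two overlapping inserts.
   existing = [(1, "abc"), (2, "abc"), (3, "abc")]
   inserted = [(1, "ccc"), (2, "ccc")]
   print(deduplicate_merge(existing, inserted))
   # Matches the Flink/Spark result: [(1, 'ccc'), (2, 'ccc'), (3, 'abc')]
   ```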

