RoderickAdriance opened a new issue, #11285:
URL: https://github.com/apache/hudi/issues/11285

   We need to synchronize MySQL data to Hudi using Flink CDC. Currently, when a 
row is deleted in MySQL, it is deleted from the Hudi table as well. We want a 
logical (soft) delete instead.
   Here is how we implemented it, but it doesn't work: we add a new op field 
that identifies whether the row has been deleted, then insert the result into 
the Hudi table.
   
   
   Flink SQL:
   CREATE TABLE products (
       id INT,
       name STRING,
       description STRING,
       create_time TIMESTAMP,
       op STRING METADATA FROM 'row_kind' VIRTUAL,
       PRIMARY KEY (id) NOT ENFORCED
     ) WITH (
       'connector' = 'mysql-cdc',
       'hostname' = 'xxx',
       'port' = '3306',
       'username' = 'root',
       'password' = '123456',
       'database-name' = 'app_db',
       'table-name' = 'products',
       'server-time-zone' = 'UTC'
     );
   
   
   CREATE TABLE hudi.test.products(
       id INT,
       name VARCHAR(255),
       description VARCHAR(255),
       create_time TIMESTAMP,
       is_deleted BOOLEAN,
       PRIMARY KEY (id) NOT ENFORCED
   )
   WITH (
     'connector' = 'hudi',
     'path' = '/user/hive/warehouse/test.db/products',
     'table.type' = 'COPY_ON_WRITE',
     'hoodie.datasource.write.recordkey.field'='id',
     'hoodie.datasource.write.precombine.field'='id'
   );
   
   
   INSERT INTO hudi.test.products
   SELECT
       id,
       name,
       description,
       create_time,
       CASE WHEN op = '-D' THEN TRUE ELSE FALSE END AS is_deleted
   FROM products;
   
   
   
   
   
   **Expected behavior**
   
   When a row is deleted in MySQL, the corresponding row should remain in the 
   Hudi table with is_deleted set to TRUE instead of being removed.
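
The logical-delete semantics we are after can be sketched outside Flink as a plain fold over changelog events. This is only an illustration of the intended result, not how Hudi or Flink CDC process rows internally; the helper name `apply_logical_delete` and the sample rows are hypothetical, and the row-kind strings are the ones the query above compares against (`+I`, `-U`, `+U`, `-D`).

```python
from typing import Dict, List, Tuple

# One changelog event: (row_kind, id, name).
# The sample data further down is made up for illustration.
Event = Tuple[str, int, str]


def apply_logical_delete(events: List[Event]) -> Dict[int, dict]:
    """Hypothetical helper: fold CDC events into a table keyed by id,
    flagging deleted rows instead of removing them."""
    table: Dict[int, dict] = {}
    for row_kind, key, name in events:
        if row_kind == "-U":
            # The update-before image carries no new state; skip it.
            continue
        # Upsert by key; a -D event keeps the row and sets the flag.
        table[key] = {"name": name, "is_deleted": row_kind == "-D"}
    return table


events = [
    ("+I", 1, "scooter"),
    ("+I", 2, "hammer"),
    ("-U", 2, "hammer"),
    ("+U", 2, "mallet"),
    ("-D", 1, "scooter"),
]
result = apply_logical_delete(events)
```

In this sketch the row with id 1 is still present after the `-D` event and only its `is_deleted` flag changes, which is the behavior we would like the Hudi table to show.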
   
   **Environment Description**
   
   * Hudi version : 0.14.1
   
   * Flink version : 1.18.1
   
   * Hive version : 3.1.2
   
   * Hadoop version : 3.3.6
   
   * Storage (HDFS/S3/GCS..) : HDFS
   
   * Running on Docker? (yes/no) : no
   
   

