zhaomin1423 commented on issue #988: URL: https://github.com/apache/incubator-seatunnel/issues/988#issuecomment-1049481572
Dirty data management has two aspects. First, we can handle data one record at a time. The database must support transactions, because when a batch containing a few dirty records is written, the whole batch must be rolled back; after the rollback, we can replay the batch record by record to catch the dirty data. In Spark, we can add a datasource strategy that transforms WriteToDataSourceV2 into an extended WriteToDataSourceV2Exec, so the data can be handled record by record to manage dirty data. Then, we can implement a JDBC connector based on the DataSourceV2 API. Comments are welcome.
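A minimal sketch of the strategy described above: attempt the whole batch as one transaction, and on failure roll back and replay record by record, diverting the failing records to a dirty-data collection. The sink is simulated with an in-memory list (a stand-in for a JDBC connection with auto-commit disabled); the class and method names are hypothetical, not part of SeaTunnel or Spark.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the batch-then-row-by-row dirty-data strategy.
public class DirtyDataBatchWriter {

    // Simulated write: rejects records containing "bad",
    // standing in for a database constraint violation.
    static void writeRecord(String record, List<String> sink) {
        if (record.contains("bad")) {
            throw new IllegalArgumentException("dirty record: " + record);
        }
        sink.add(record);
    }

    // Writes a batch; returns the dirty records that were diverted.
    static List<String> writeBatch(List<String> batch, List<String> sink) {
        List<String> dirty = new ArrayList<>();
        List<String> staging = new ArrayList<>(sink); // snapshot acts as the "transaction"
        try {
            for (String r : batch) writeRecord(r, staging);
            sink.clear();
            sink.addAll(staging);                     // "commit" the whole batch
        } catch (RuntimeException batchFailure) {     // "rollback": staging is discarded
            for (String r : batch) {                  // replay the batch one record at a time
                try {
                    writeRecord(r, sink);
                } catch (RuntimeException e) {
                    dirty.add(r);                     // divert the dirty record, keep going
                }
            }
        }
        return dirty;
    }

    public static void main(String[] args) {
        List<String> sink = new ArrayList<>();
        List<String> dirty = writeBatch(List.of("a", "bad-1", "b"), sink);
        System.out.println("written=" + sink + " dirty=" + dirty);
        // → written=[a, b] dirty=[bad-1]
    }
}
```

With a real JDBC sink the same shape applies: `executeBatch()` inside a transaction, and on `BatchUpdateException` roll back and retry each row with a single-row `executeUpdate()`.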
