Jiabao-Sun commented on PR #4: URL: https://github.com/apache/flink-connector-mongodb/pull/4#issuecomment-1491579741
> So...would this also fail with this error in production?
>
> The sink supports AT_LEAST_ONCE semantics, which should imply that duplicate writes are fine and don't cause errors. But now they seemingly do?

The `MongoRowDataSerializationSchema` uses upsert writes, which are idempotent, so there will be no write conflict.

However, if we set `batchIntervalMs != -1` and `batchSize != -1`, we may write some data that has not been checkpointed yet. In that case users are required to ensure their writes are idempotent. I think the same problem exists in the elasticsearch-connector: its `BulkProcessor` periodically flushes data from a scheduled task, before that data has been checkpointed. (Two illustrative sketches follow.)

So do we need to only allow flushing at checkpoint time under AT_LEAST_ONCE semantics? @zentol, what do you think?
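For context, here is a minimal, self-contained sketch of why replace-with-upsert writes keyed on `_id` tolerate replay. This is not the connector's actual code; it uses the plain MongoDB sync driver, and the connection string, database, and collection names are made-up values for the example:

```java
import java.util.List;

import org.bson.Document;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.ReplaceOneModel;
import com.mongodb.client.model.ReplaceOptions;
import com.mongodb.client.model.WriteModel;

public class UpsertIdempotencyDemo {
    public static void main(String[] args) {
        // Assumes a local MongoDB instance; demo names are hypothetical.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> coll =
                    client.getDatabase("demo").getCollection("users");

            Document row = new Document("_id", 42).append("name", "alice");

            // Replace-with-upsert keyed on _id: the first write inserts,
            // every replay overwrites the same document instead of failing
            // with a duplicate-key error.
            WriteModel<Document> upsert = new ReplaceOneModel<>(
                    Filters.eq("_id", row.get("_id")),
                    row,
                    new ReplaceOptions().upsert(true));

            // Writing the same batch twice simulates an at-least-once replay.
            coll.bulkWrite(List.of(upsert));
            coll.bulkWrite(List.of(upsert)); // no conflict, same end state

            System.out.println("documents with _id=42: "
                    + coll.countDocuments(Filters.eq("_id", 42)));
        }
    }
}
```

Replaying the same batch leaves exactly one document with `_id = 42`, which is why upsert mode makes at-least-once delivery safe.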
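And a toy sketch of the buffering behavior in question. The class and field names here are hypothetical, not the actual `BulkProcessor` or the MongoDB writer; it only illustrates how a timer-triggered flush can push records to the external system between two checkpoints, so that a restart replays them:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Toy buffered sink: flushes when the batch is full OR when the timer fires,
// so records can reach the external system before they are checkpointed.
public class PeriodicFlushSketch implements AutoCloseable {

    private final List<String> buffer = new ArrayList<>();
    private final int batchSize = 100;

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public PeriodicFlushSketch(long batchIntervalMs) {
        // Timer-triggered flushes run independently of checkpointing.
        scheduler.scheduleAtFixedRate(
                this::flush, batchIntervalMs, batchIntervalMs, TimeUnit.MILLISECONDS);
    }

    public synchronized void write(String record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush(); // size-triggered flush, also independent of checkpoints
        }
    }

    // Also called on checkpoint; the timer- and size-triggered calls are the
    // ones that emit data that has not been checkpointed yet.
    public synchronized void flush() {
        if (!buffer.isEmpty()) {
            System.out.println("flushing " + buffer.size() + " records");
            buffer.clear();
        }
    }

    @Override
    public void close() {
        scheduler.shutdown();
    }
}
```

If the job fails after a timer flush but before the next checkpoint, those records are replayed on restart, which is exactly where non-idempotent writes would break.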
