Jiabao-Sun commented on PR #4: URL: https://github.com/apache/flink-connector-mongodb/pull/4#issuecomment-1491579741
> So...would this also fail with this error in production?
>
> The sink supports AT_LEAST_ONCE semantics, which should imply that duplicate writes are fine and don't cause errors. But now they seemingly do?

The `MongoRowDataSerializationSchema` uses upsert writes, which are idempotent, so there will be no write conflict.

However, if we set `batchIntervalMs != -1` and `batchSize != -1`, we may write some data that has not been checkpointed yet. In that case users are required to ensure their writes are idempotent. I think the same problem exists in the elasticsearch-connector: its `BulkProcessor` periodically flushes data from a scheduled task, before that data has been checkpointed. (Two illustrative sketches follow.)

So do we need to only allow flushing at checkpoint time under AT_LEAST_ONCE semantics? @zentol, what do you think?
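For context, here is a minimal, self-contained sketch of why replace-with-upsert writes keyed on `_id` tolerate replay. This is not the connector's actual code; it uses the plain MongoDB sync driver, and the connection string, database, and collection names are made-up values for the example:

```java
import java.util.List;

import org.bson.Document;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.ReplaceOneModel;
import com.mongodb.client.model.ReplaceOptions;
import com.mongodb.client.model.WriteModel;

public class UpsertIdempotencyDemo {
    public static void main(String[] args) {
        // Assumes a local MongoDB instance; demo names are hypothetical.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> coll =
                    client.getDatabase("demo").getCollection("users");

            Document row = new Document("_id", 42).append("name", "alice");

            // Replace-with-upsert keyed on _id: the first write inserts,
            // every replay overwrites the same document instead of failing
            // with a duplicate-key error.
            WriteModel<Document> upsert = new ReplaceOneModel<>(
                    Filters.eq("_id", row.get("_id")),
                    row,
                    new ReplaceOptions().upsert(true));

            // Writing the same batch twice simulates an at-least-once replay.
            coll.bulkWrite(List.of(upsert));
            coll.bulkWrite(List.of(upsert)); // no conflict, same end state

            System.out.println("documents with _id=42: "
                    + coll.countDocuments(Filters.eq("_id", 42)));
        }
    }
}
```

Replaying the same batch leaves exactly one document with `_id = 42`, which is why upsert mode makes at-least-once delivery safe.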
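And a toy sketch of the buffering behavior in question. The class and field names here are hypothetical, not the actual `BulkProcessor` or the MongoDB writer; it only illustrates how a timer-triggered flush can push records to the external system between two checkpoints, so that a restart replays them:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Toy buffered sink: flushes when the batch is full OR when the timer fires,
// so records can reach the external system before they are checkpointed.
public class PeriodicFlushSketch implements AutoCloseable {

    private final List<String> buffer = new ArrayList<>();
    private final int batchSize = 100;

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public PeriodicFlushSketch(long batchIntervalMs) {
        // Timer-triggered flushes run independently of checkpointing.
        scheduler.scheduleAtFixedRate(
                this::flush, batchIntervalMs, batchIntervalMs, TimeUnit.MILLISECONDS);
    }

    public synchronized void write(String record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush(); // size-triggered flush, also independent of checkpoints
        }
    }

    // Also called on checkpoint; the timer- and size-triggered calls are the
    // ones that emit data that has not been checkpointed yet.
    public synchronized void flush() {
        if (!buffer.isEmpty()) {
            System.out.println("flushing " + buffer.size() + " records");
            buffer.clear();
        }
    }

    @Override
    public void close() {
        scheduler.shutdown();
    }
}
```

If the job fails after a timer flush but before the next checkpoint, those records are replayed on restart, which is exactly where non-idempotent writes would break.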
