steveniemitz edited a comment on pull request #15863: URL: https://github.com/apache/beam/pull/15863#issuecomment-1017579519
Random musings from me, because we've tried to do something like this as well with our own SQL-ish IO. If you introduce an (implicit) reshuffle between the producer of the rows being written and the writer, you may break an implicit contract that users have been relying on: that mutations are applied to the JDBC destination in the order they were produced. For example, if a GBK is triggering every 10 seconds and the next transform is a JdbcIO, by default the GBK will fuse with the JdbcIO writer and its output is written "inline", so all firings are applied in order. If you apply batching (with autosharding or not), multiple mutations for the same row may land in different batches, which are then applied in a non-deterministic order. Depending on that order, an older firing can overwrite a newer one.
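
To make the failure mode concrete, here is a minimal sketch in plain Python. It does not use Beam or JDBC at all: `apply_batch`, the `row-1` key, and the `firing-N` values are hypothetical stand-ins for an upsert against the destination table, and the batch split and commit order are chosen by hand to show one possible interleaving.

```python
# Hypothetical stand-in for a JDBC upsert: the last write to a key wins.
def apply_batch(table, batch):
    for key, value in batch:
        table[key] = value

# Three successive trigger firings for the same row, oldest first.
firings = [("row-1", "firing-1"), ("row-1", "firing-2"), ("row-1", "firing-3")]

# Fused case: each firing is written inline, in the order it was produced.
fused = {}
for firing in firings:
    apply_batch(fused, [firing])

# Batched case: the firings end up in two separate batches, and nothing
# constrains which batch is committed first. Here the batch holding the
# newest firing happens to commit before the batch holding the older ones.
batch_a = firings[:2]   # firing-1, firing-2
batch_b = firings[2:]   # firing-3
batched = {}
apply_batch(batched, batch_b)
apply_batch(batched, batch_a)

print(fused["row-1"])    # newest firing
print(batched["row-1"])  # stale firing left behind by the older batch
```

In the fused case the row ends up holding the newest firing; in the batched case it ends up holding an older one, even though every mutation was applied exactly once.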
