[ https://issues.apache.org/jira/browse/BEAM-13184?focusedWorklogId=712116&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-712116 ]
ASF GitHub Bot logged work on BEAM-13184:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 20/Jan/22 14:48
Start Date: 20/Jan/22 14:48
Worklog Time Spent: 10m
Work Description: steveniemitz edited a comment on pull request #15863:
URL: https://github.com/apache/beam/pull/15863#issuecomment-1017579519
Some random musings from me, since we've tried to do something like this as
well with our own SQL-ish IO.
If you introduce a reshuffle (even an implicit one) between the producer of
the rows being written and the writer, you may break an unwritten contract
that users have been relying on: that mutations are applied in order to the
JDBC destination.
For example, if a GBK is triggering every 10 seconds and the next transform
is a JdbcIO write, by default the GBK's trigger firings will fuse with the
JdbcIO writer and be applied "inline", so all firings are applied in order.
If you apply batching (with autosharding or not), multiple mutations for the
same row may end up in different batches, which will then be applied in a
non-deterministic order. Depending on that order, an older firing can
overwrite a newer one.
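The ordering hazard described above can be illustrated with a small
simulation. This is only a sketch under stated assumptions: it does not use
Beam or JDBC, the mutation tuples and the apply_batches helper are
hypothetical names invented for illustration, and the "table" is a plain
dict that overwrites blindly, the way an unconditional upsert would.

```python
from itertools import permutations


def apply_batches(batches):
    """Apply batches in the given order as blind, last-write-wins upserts.

    Hypothetical helper for illustration; each mutation is a
    (row_key, firing_number, value) tuple.
    """
    table = {}
    for batch in batches:
        for key, _firing, value in batch:
            table[key] = value  # overwrite regardless of firing number
    return table


# Three successive trigger firings for the same row, in production order.
firings = [("user-1", 1, "v1"), ("user-1", 2, "v2"), ("user-1", 3, "v3")]

# Fused ("inline") writer: each firing is written as it is produced.
inline_result = apply_batches([[m] for m in firings])

# After a reshuffle, each firing may land in its own batch, and the batches
# may be committed in any order. Collect every possible final value.
possible_results = {
    apply_batches(order)["user-1"]
    for order in permutations([[m] for m in firings])
}

print(inline_result)             # {'user-1': 'v3'}
print(sorted(possible_results))  # ['v1', 'v2', 'v3']
```

With in-order application the row always ends at the newest value, while
with reordered batches any of the three firings can end up as the final
state, including the oldest one.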
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
Issue Time Tracking
-------------------
Worklog Id: (was: 712116)
Time Spent: 8h 50m (was: 8h 40m)
> Support autosharding for JdbcIO writers
> ---------------------------------------
>
> Key: BEAM-13184
> URL: https://issues.apache.org/jira/browse/BEAM-13184
> Project: Beam
> Issue Type: Improvement
> Components: io-java-jdbc
> Reporter: Pablo Estrada
> Assignee: Pablo Estrada
> Priority: P2
> Fix For: 2.37.0
>
> Time Spent: 8h 50m
> Remaining Estimate: 0h
>
> This should improve efficiency for JDBC writers on streaming pipelines.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)