Without getting into the specifics of your use case, it sounds like you might want to check out DebeziumIO for CDC (Change Data Capture). I think DebeziumIO can generally handle much more complex use cases than the one you describe.
Some pointers/talks from last year's Beam Summit:
https://www.youtube.com/watch?v=hu5FacAeQ-8
https://www.youtube.com/watch?v=U_RshngpxLc

On Fri, Apr 22, 2022 at 4:41 AM Eric Berryman <eric.berry...@gmail.com> wrote:

> Does an unbounded JdbcIO exist, or would I need to wrap the existing one
> in a splittable DoFn? Or maybe there is an easier way to do it?
>
> Thank you again,
> Eric
>
> On Wed, Apr 20, 2022, 21:59 Ahmet Altay <al...@google.com> wrote:
>
>> /cc @Pablo Estrada <pabl...@google.com> @John Casey
>> <johnjca...@google.com>
>>
>> On Wed, Apr 20, 2022 at 6:29 PM Eric Berryman <eric.berry...@gmail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> I have a rather simple use case where I would like to read a db table,
>>> which acts as a queue (~ hundreds of millions of events in initial load, but only
>>> thousands of events per day), and write that data out to a sink. This
>>> pipeline would be unbounded.
>>>
>>> I'm looking for reading material, and/or code, which displays reading
>>> from the JdbcIO API with checkpoints. I would like to avoid the initial
>>> load on restarts, upgrades, etc. :)
>>>
>>> Thank you for your time!
>>> Eric
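
To make the checkpoint idea from the original question a bit more concrete, here is a rough sketch in plain Python. It is not Beam code and not how DebeziumIO or JdbcIO work internally; it only illustrates resuming from a stored position so that a restart does not redo the initial load. The `events` and `checkpoint` tables, the function names, and the use of an in-memory SQLite database are all assumptions made up for this example, and it assumes the queue table has a monotonically increasing integer `id`.

```python
import sqlite3


def load_checkpoint(conn):
    """Return the last processed event id, or 0 if nothing was processed yet."""
    row = conn.execute(
        "SELECT last_id FROM checkpoint WHERE name = 'events'"
    ).fetchone()
    return row[0] if row else 0


def save_checkpoint(conn, last_id):
    """Persist the last processed event id (hypothetical checkpoint table)."""
    conn.execute(
        "INSERT OR REPLACE INTO checkpoint (name, last_id) VALUES ('events', ?)",
        (last_id,),
    )
    conn.commit()


def drain(conn, sink, batch_size=1000):
    """Copy every event newer than the checkpoint into sink, batch by batch."""
    last_id = load_checkpoint(conn)
    while True:
        rows = conn.execute(
            "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return last_id
        for _row_id, payload in rows:
            sink.append(payload)
        # Checkpoint only after the batch reached the sink. A crash before
        # this line replays the batch on restart (at-least-once delivery).
        last_id = rows[-1][0]
        save_checkpoint(conn, last_id)


def demo():
    """Simulate an initial load followed by a restart with new events."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)"
    )
    conn.execute(
        "CREATE TABLE checkpoint (name TEXT PRIMARY KEY, last_id INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO events (payload) VALUES (?)",
        [(f"event-{i}",) for i in range(1, 6)],
    )
    conn.commit()

    first_run = []
    drain(conn, first_run, batch_size=2)  # initial load

    conn.executemany(
        "INSERT INTO events (payload) VALUES (?)",
        [("event-6",), ("event-7",)],
    )
    conn.commit()

    second_run = []
    drain(conn, second_run, batch_size=2)  # "restart": only new rows are read
    return first_run, second_run
```

Calling `demo()` returns the payloads seen by each run; the second run starts from the stored id instead of rereading the whole table. A log-based CDC tool such as Debezium tracks a position in the database's change log rather than an id column, so treat this as an analogy only.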