Hmm, do you mean that if you have a windowed PCollection as an input for
"JdbcIO.write().withResults()" (the "signal" transform) and then you apply
"Wait.on(signal)", it doesn't work as expected [1]?
[1]
Thanks Cham, but actually we already have a similar open issue for this [1],
and I'm currently working on that one. Also, I created an umbrella Jira for
such tasks across all the other Java SDK IOs, since I believe this behaviour
can be in quite high demand in general [2].
I'd be happy to discuss it in more detail.
In our Beam pipeline we write out the results (to MySQL) in separate
(parallel) stages. When all stages have finished writing successfully, we
want to set a "final" flag on all DB tables (so any client who uses this data
will always get the latest data, but only if it was completely processed).
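For what it's worth, the shape of that pipeline could be sketched roughly like this (untested sketch; it assumes Beam's JdbcIO.write().withResults() and Wait.on() as discussed in this thread, and the table names, statements, marker PCollection, and the SetFinalFlagFn DoFn are all hypothetical):

```java
// Untested sketch. Each write stage produces a "signal" PCollection via
// withResults(); Wait.on() accepts several signals, so the final-flag
// update only runs once all writes for the corresponding window are done.
PCollection<?> wroteOrders = orders.apply("WriteOrders",
    JdbcIO.<KV<Long, String>>write()
        .withDataSourceConfiguration(dbConfig)
        .withStatement("INSERT INTO orders (id, payload) VALUES (?, ?)")
        .withPreparedStatementSetter((kv, stmt) -> {
          stmt.setLong(1, kv.getKey());
          stmt.setString(2, kv.getValue());
        })
        .withResults());

PCollection<?> wroteItems = items.apply("WriteItems",
    JdbcIO.<KV<Long, String>>write()
        .withDataSourceConfiguration(dbConfig)
        .withStatement("INSERT INTO items (id, payload) VALUES (?, ?)")
        .withPreparedStatementSetter((kv, stmt) -> {
          stmt.setLong(1, kv.getKey());
          stmt.setString(2, kv.getValue());
        })
        .withResults());

// finalizeMarker: e.g. one marker element per window. The UPDATE that sets
// the "final" flag would live inside SetFinalFlagFn (hypothetical DoFn).
finalizeMarker
    .apply(Wait.on(wroteOrders, wroteItems))
    .apply("SetFinalFlag", ParDo.of(new SetFinalFlagFn()));
```

The point of the varargs Wait.on(wroteOrders, wroteItems) is that the flag update is held back until every listed signal has completed for that window.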
Yes, it is needed for KafkaIO as well. Thank you for creating the ticket.
For such JDBC operations, are there any suggestions on a way forward?
I am currently trying out an approach using a JDBC template inside a DoFn,
but this adds complexity when performing batch operations.
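On the batching complexity: the bookkeeping a hand-rolled DoFn needs (buffer elements, flush every N elements via JDBC's addBatch()/executeBatch(), and flush the remainder when the bundle finishes) can be isolated in a small helper. This is an illustrative stdlib-only sketch, not a Beam or Spring API; BatchBuffer is a hypothetical name:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper showing the batching bookkeeping a hand-rolled DoFn
// would need: buffer elements, flush every `batchSize`, and flush the
// remainder explicitly (e.g. from @FinishBundle).
final class BatchBuffer<T> {
  private final int batchSize;
  private final List<T> buffer = new ArrayList<>();
  // In a real DoFn this would call PreparedStatement.addBatch()/executeBatch().
  private final Consumer<List<T>> flusher;

  BatchBuffer(int batchSize, Consumer<List<T>> flusher) {
    this.batchSize = batchSize;
    this.flusher = flusher;
  }

  // Call from @ProcessElement: flushes automatically every batchSize elements.
  void add(T element) {
    buffer.add(element);
    if (buffer.size() >= batchSize) {
      flush();
    }
  }

  // Call from @FinishBundle so the last partial batch is not lost.
  void flush() {
    if (!buffer.isEmpty()) {
      flusher.accept(new ArrayList<>(buffer));
      buffer.clear();
    }
  }
}
```

Wiring it up means calling add() from @ProcessElement and flush() from @FinishBundle, with the flusher delegating to the prepared statement's batch calls.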
On Thu, Jan 27, 2022 at
Hi Alexey,
Thank you for your reply. Yes, that's the use case here.
Tried the approach that you suggested, but for the window to get triggered,
shouldn't we apply some transform like GroupByKey? Otherwise the window is
ignored, right? As we don't have such a transform applied, we did not
observe
Hi Yomal de Silva,
IIUC, you need to pass the records downstream only once they have been stored
in the DB with JdbcIO, don't you? And your pipeline is unbounded.
So, why not split your input data PCollection into windows (if that was not
done upstream) and use Wait.on() with
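Concretely, that suggestion might look something like the following (untested sketch; the Record type, the SQL statement, dbConfig, and the downstream PublishFn are hypothetical):

```java
// Window the unbounded input so that Wait.on() has per-window signals.
PCollection<Record> windowed = input.apply("FixedWindows",
    Window.into(FixedWindows.of(Duration.standardMinutes(5))));

// The "signal" transform: completes, per window, once the DB writes are done.
PCollection<?> signal = windowed.apply("WriteToDb",
    JdbcIO.<Record>write()
        .withDataSourceConfiguration(dbConfig)
        .withStatement("INSERT INTO records (id) VALUES (?)")
        .withPreparedStatementSetter((r, stmt) -> stmt.setLong(1, r.getId()))
        .withResults());

// Downstream (e.g. the Kafka publish) sees a window's records only after
// that window has been fully written to the database.
windowed
    .apply(Wait.on(signal))
    .apply("Publish", ParDo.of(new PublishFn()));
```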
Hi Beam team,
I have been working on a data pipeline that consumes data from an SFTP
location; the data is passed through multiple transforms and is finally
published to a Kafka topic. Along the way we need to maintain the state of
the file, so we write to the database during the pipeline.
We came