Thanks Cham, but we actually already have a similar open issue for this [1], and 
I’m currently working on it. I also created an umbrella Jira for such tasks 
across the other Java SDK IOs, since I believe this behaviour will be in demand 
quite generally [2].

I’d be happy to discuss it in more detail.

—
Alexey

[1] https://issues.apache.org/jira/browse/BEAM-13298
[2] https://issues.apache.org/jira/browse/BEAM-13584




> On 26 Jan 2022, at 19:38, Chamikara Jayalath <chamik...@google.com> wrote:
> 
> Yeah, I think the Wait transform will not work here either. We have to replace 
> the Kafka write transform's PDone output with a proper write result. I think 
> +Wei Hsia <weih...@google.com> has been looking into doing something like 
> this. Created https://issues.apache.org/jira/browse/BEAM-13748 for tracking.
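> 
> A minimal sketch of the current behaviour (the `enriched` collection, broker 
> address, topic and serializers below are placeholders, not taken from this 
> thread): KafkaIO's write terminates the branch with PDone, so there is no 
> output that a later "update file state" step could Wait.on().
> 
>     // Assumes: PCollection<KV<String, String>> enriched; imports from
>     // org.apache.beam.sdk.io.kafka.KafkaIO and
>     // org.apache.kafka.common.serialization.StringSerializer.
>     PDone done =
>         enriched.apply(
>             "PublishToKafka",
>             KafkaIO.<String, String>write()
>                 .withBootstrapServers("broker:9092")           // placeholder
>                 .withTopic("file-events")                      // placeholder
>                 .withKeySerializer(StringSerializer.class)
>                 .withValueSerializer(StringSerializer.class));
>     // PDone carries no elements, so it cannot act as a Wait.on() signal;
>     // that is what BEAM-13748 proposes to change.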
> 
> Thanks,
> Cham
> 
> On Wed, Jan 26, 2022 at 10:32 AM Yomal de Silva <yomal.prav...@gmail.com> wrote:
> Hi Alexey,
> Thank you for your reply. Yes, that's the use case here. 
> 
> I tried the approach that you suggested, but for the window to get triggered, 
> shouldn't we apply some transform like GroupByKey? Otherwise the window is 
> ignored, right? As we don't have such a transform applied, we did not observe 
> any windowing behavior in the pipeline. 
> 
> On Wed, Jan 26, 2022 at 10:04 PM Alexey Romanenko <aromanenko....@gmail.com> wrote:
> Hi Yomal de Silva,
> 
> IIUC, you need to pass the records downstream only once they have been stored 
> in the DB with JdbcIO, don’t you? And your pipeline is unbounded. 
> 
> So, why not split your input data PCollection into windows (if that was not 
> done upstream) and use Wait.on() with JdbcIO.write().withResults()? This is 
> exactly the case for which it was initially developed.
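> 
> A minimal sketch of that pattern, assuming a hypothetical FileRecord element 
> type and placeholder connection settings, SQL statement and window size (none 
> of these names are from this thread):
> 
>     // Imports: org.apache.beam.sdk.io.jdbc.JdbcIO, org.apache.beam.sdk.transforms.Wait,
>     // org.apache.beam.sdk.transforms.windowing.{Window, FixedWindows}, org.joda.time.Duration.
> 
>     // Window the unbounded input so Wait.on() can release it window by window.
>     PCollection<FileRecord> windowed =
>         records.apply("Window",
>             Window.<FileRecord>into(FixedWindows.of(Duration.standardMinutes(1))));
> 
>     // withResults() makes the JDBC write return a PCollection<Void> signal
>     // instead of PDone, so it can be passed to Wait.on().
>     PCollection<Void> writeResults =
>         windowed.apply("WriteFileState",
>             JdbcIO.<FileRecord>write()
>                 .withDataSourceConfiguration(
>                     JdbcIO.DataSourceConfiguration.create(
>                         "org.postgresql.Driver",                    // placeholder
>                         "jdbc:postgresql://localhost:5432/files"))  // placeholder
>                 .withStatement("insert into file_state (id, state) values (?, ?)")
>                 .withPreparedStatementSetter((element, stmt) -> {
>                     stmt.setString(1, element.getId());    // placeholder accessors
>                     stmt.setString(2, element.getState());
>                 })
>                 .withResults());
> 
>     // Each window of `windowed` is emitted downstream only after the JDBC
>     // writes for that window have completed (Wait.on handles the per-window
>     // buffering internally).
>     PCollection<FileRecord> persisted =
>         windowed.apply("WaitForJdbc", Wait.on(writeResults));
>     // ... then "enrich data" and the Kafka write are applied to `persisted`.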
> 
> —
> Alexey
> 
> 
> > On 26 Jan 2022, at 10:16, Yomal de Silva <yomal.prav...@gmail.com> wrote:
> > 
> > Hi Beam team,
> > I have been working on a data pipeline that consumes data from an SFTP 
> > location, passes it through multiple transforms, and finally publishes it 
> > to a Kafka topic. During this we need to maintain the state of the file, 
> > so we write to the database within the pipeline. 
> > We found that JdbcIO's write is treated purely as a sink and returns a 
> > PDone, but for our requirement we need to pass down the elements that 
> > were persisted.
> > 
> > Using writeVoid is also not useful, since this is an unbounded stream and 
> > we cannot use Wait.on with it without using windows and triggers. 
> > 
> > Read file --> write the file state in DB --> enrich data --> publish to 
> > Kafka --> update file state
> > 
> > The above is the basic flow required of the pipeline. Any suggestions?
> > 
> > Thank you
> 
