Yes, please use a ParDo. The Custom Sink API is not intended for use with unbounded collections (i.e. in pretty much any streaming pipeline) and it is generally due for a redesign. ParDo is currently almost always the better choice when you want to implement a connector that writes data to a third-party system, unless you are implementing export to a particular file format (in which case FileBasedSink is appropriate).
Concur with what Raghu said about @Setup/@Teardown.

On Thu, Mar 16, 2017 at 3:02 PM Raghu Angadi <[email protected]> wrote:

> ParDo is ok.
>
> Do you open a connection in each processElement() invocation? If you can
> reuse the connection, you can open it once in the @Setup method and close
> it in @Teardown.
>
> Raghu.
>
> On Thu, Mar 16, 2017 at 2:19 PM, sowmya balasubramanian <
> [email protected]> wrote:
>
> > Hi All,
> >
> > I am a newbie who has recently entered the world of GCP and pipelines.
> >
> > I have a streaming pipeline in which I write to a Redis sink at the end.
> > The pipeline writes about 60,000 events per 15-minute window it processes.
> > I implemented the write to Redis using a ParDo.
> >
> > The prototype worked well for a small set of streaming events. However,
> > when I tested with my full dataset, every now and then the Redis client
> > (Jedis) threw a SocketException. (The client opens a connection every
> > time it has to write to Redis, then closes the connection.)
> >
> > A couple of questions I have:
> >
> > 1. Is there a preferred Redis client for the pipeline?
> > 2. Does it make sense to write a custom Redis sink instead of a ParDo?
> >
> > Thanks,
> > Sowmya
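For anyone following along, here is a minimal sketch of the pattern Raghu describes: keep one Jedis connection per DoFn instance, opened in @Setup and closed in @Teardown, instead of opening and closing a socket in every processElement() call. The class name `RedisWriteFn` and the key/value schema are illustrative, not part of any Beam or Jedis API.

```java
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;
import redis.clients.jedis.Jedis;

// Hypothetical DoFn that writes KV pairs to Redis, reusing one connection
// per DoFn instance rather than one per element.
public class RedisWriteFn extends DoFn<KV<String, String>, Void> {
  private final String host;
  private final int port;
  private transient Jedis jedis;  // transient: not serialized with the DoFn

  public RedisWriteFn(String host, int port) {
    this.host = host;
    this.port = port;
  }

  @Setup
  public void setup() {
    // Called once per DoFn instance, before any bundles are processed.
    jedis = new Jedis(host, port);
  }

  @ProcessElement
  public void processElement(ProcessContext c) {
    KV<String, String> kv = c.element();
    jedis.set(kv.getKey(), kv.getValue());
  }

  @Teardown
  public void teardown() {
    // Best-effort cleanup when the runner discards the instance.
    if (jedis != null) {
      jedis.close();
    }
  }
}
```

A single Jedis object is not thread-safe, but Beam only invokes one bundle at a time on a given DoFn instance, so one connection per instance is safe; if you want pooling across instances on a worker, a shared static JedisPool with resources borrowed per bundle is a common alternative.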
