Re: I/O connectors for streaming GRPC source

2020-09-30 Thread Luke Cwik
There is no generic gRPC connector and it is unlikely that there ever will
be one.

Integration with external systems is usually about ingesting large amounts
of data, and that works best with certain features which gRPC doesn't
natively support but which an application protocol built on top of gRPC
usually does: checkpointing and resuming from a position in the stream,
being able to split streams, and acking messages so they aren't republished
in the stream. To learn more, you should take a look at this splittable
DoFn blog post[1].
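
As a rough illustration of what checkpointing, resuming and splitting mean
for a source, here is a minimal sketch in plain Python. It does not use
Beam or gRPC at all; OffsetRange, read_from and the in-memory message list
are hypothetical names made up for this example, and a real source would
track positions through whatever its application protocol offers.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class OffsetRange:
    """Half-open range [start, stop) of stream positions still to be read."""
    start: int
    stop: int

    def split(self) -> Tuple["OffsetRange", "OffsetRange"]:
        # Cut the remaining work in two so separate workers could each take a part.
        mid = (self.start + self.stop) // 2
        return OffsetRange(self.start, mid), OffsetRange(mid, self.stop)


def read_from(messages: List[str], rng: OffsetRange,
              limit: int) -> Tuple[List[str], OffsetRange]:
    """Read up to `limit` messages and return them with a checkpoint.

    The checkpoint is the residual range: everything not yet read.
    """
    end = min(rng.start + limit, rng.stop)
    return messages[rng.start:end], OffsetRange(end, rng.stop)


# Stand-in for a stream whose protocol exposes positions (an assumption).
messages = [f"msg-{i}" for i in range(10)]

first_batch, checkpoint = read_from(messages, OffsetRange(0, 10), limit=4)
# Resume exactly where the previous read stopped.
resumed_batch, checkpoint = read_from(messages, checkpoint, limit=4)
# Split what is left between two readers.
left, right = checkpoint.split()
```

A raw gRPC stream gives you none of these positions on its own, which is
why the existing connectors rely on a protocol layered on top of it.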

A few sources that use gRPC have been written, but in those cases Apache
Beam integrates through a higher-level, application-specific protocol
rather than raw gRPC. Take a look at SpannerIO[2] and PubsubLiteIO[3],
since they wrap gRPC with their client libraries.

You can always start by writing a normal DoFn that connects to this service
and eventually migrate to a splittable DoFn once you have scaling
concerns.
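
A minimal sketch of the shape such a DoFn could take, assuming Python. To
keep it runnable without any dependencies it is a plain class that only
mirrors the setup/process/teardown lifecycle of a Beam DoFn; in a pipeline
it would subclass apache_beam.DoFn. ReadFromStreamFn, open_stream,
max_per_call and fake_stream are hypothetical names, and fake_stream stands
in for the iterator a gRPC server-streaming call would return.

```python
from itertools import islice
from typing import Callable, Iterator


class ReadFromStreamFn:
    """Mirrors the lifecycle methods of a Beam DoFn (setup/process/teardown)."""

    def __init__(self, open_stream: Callable[[], Iterator[str]], max_per_call: int):
        # Hypothetical factory; in practice it would wrap a gRPC stub call.
        self._open_stream = open_stream
        self._max_per_call = max_per_call
        self._stream = None

    def setup(self):
        # Open the connection once per instance rather than once per element.
        self._stream = self._open_stream()

    def process(self, element):
        # The input element only acts as a trigger here. Bound the work done
        # per call so that process() returns instead of reading forever.
        yield from islice(self._stream, self._max_per_call)

    def teardown(self):
        # A real implementation would also close the channel here.
        self._stream = None


def fake_stream() -> Iterator[str]:
    """Endless stand-in for a server-streaming response."""
    n = 0
    while True:
        yield f"event-{n}"
        n += 1


fn = ReadFromStreamFn(fake_stream, max_per_call=3)
fn.setup()
first = list(fn.process("trigger"))
second = list(fn.process("trigger"))
fn.teardown()
```

Note that this has no checkpointing: if the worker restarts, whatever the
stream delivered in the meantime is lost or re-read, which is one of the
concerns a splittable DoFn addresses.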

1: https://beam.apache.org/blog/splittable-do-fn/
2: https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
3: https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsublite/PubsubLiteIO.java

On Wed, Sep 30, 2020 at 11:24 AM Maksim Pilipeyko <
maksim.pilipe...@colada.biz> wrote:

> Hi,
>
> What connector can I use if I should read data from streaming grpc api?
>
> Best regards,
> Maksim
>


I/O connectors for streaming GRPC source

2020-09-30 Thread Maksim Pilipeyko
Hi,

What connector can I use to read data from a streaming gRPC API?

Best regards,
Maksim