On Tue, Jan 23, 2018 at 2:48 PM, Dmitry Demeshchuk <[email protected]> wrote:
> Hi list,
>
> My understanding is that ReadStringsFromPubSub
> <https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/pubsub.py#L42>
> doesn't provide any way of getting the message metadata (attributes,
> publish timestamp, etc). Looking further suggests
> <https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/pubsub.py#L181>
> that the majority of the PubSub functionality is inside Dataflow.
>
> Hence, some questions:
>
> 1. Is my understanding of the current state of things correct?
>

This is correct.

>
> 2. Is there any API I can piggyback on to write my own PubSub source? My
> guess would be that I can use NativeSource, but is that really so?
>

No, unfortunately. Not yet. SDF for Python will be this API.

>
> 3. If the answer to both of the above is "no", is there any idea when this
> will be officially supported?
>

There is no ETA. I am _hoping_ that in 2 releases we will implement an
improved pubsub source that can do (1). And (2) can happen after that.

>
> What I'm doing right now is meant only for small things, so I probably
> don't mind switching to Java for this specific task. Just trying to make
> sure there's no better way.
>

If you have production need, I will recommend using Java. Otherwise stay
tuned.

>
> Thanks!
>
> --
> Best regards,
> Dmitry Demeshchuk.
>
