On Tue, Jan 23, 2018 at 2:48 PM, Dmitry Demeshchuk <[email protected]>
wrote:

> Hi list,
>
> My understanding is that ReadStringsFromPubSub
> <https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/pubsub.py#L42>
> doesn't provide any way of getting the message metadata (attributes,
> publish timestamp, etc). Looking further suggests
> <https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/pubsub.py#L181>
> that the majority of the PubSub functionality is inside Dataflow.
>
> Hence, some questions:
>
> 1. Is my understanding of the current state of things correct?
>

This is correct.


>
> 2. Is there any API I can piggyback on to write my own PubSub source? My
> guess would be that I can use NativeSource, but is that really so?
>

No, unfortunately, not yet. Splittable DoFn (SDF) for Python will be this API.


>
> 3. If the answer to both of the above is "no", is there any idea when this
> will be officially supported?
>

There is no ETA. I am _hoping_ that within 2 releases we will implement an
improved pubsub source that exposes message metadata (the gap from question
1). An API for writing your own source (question 2) can happen after that.


>
> What I'm doing right now is meant only for small things, so I probably
> don't mind switching to Java for this specific task. Just trying to make
> sure there's no better way.
>

If you have a production need, I would recommend using Java. Otherwise, stay
tuned.


>
> Thanks!
>
> --
> Best regards,
> Dmitry Demeshchuk.
>
