[jira] [Commented] (BEAM-8046) Unable to read from bigquery and publish to pubsub using dataflow runner (python SDK)

2020-06-01 Thread Beam JIRA Bot (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17122698#comment-17122698
 ] 

Beam JIRA Bot commented on BEAM-8046:
-

This issue is P2 but has been unassigned without any comment for 60 days so it 
has been labeled "stale-P2". If this issue is still affecting you, we care! 
Please comment and remove the label. Otherwise, in 14 days the issue will be 
moved to P3.

Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed 
explanation of what these priorities mean.


> Unable to read from bigquery and publish to pubsub using dataflow runner 
> (python SDK)
> -
>
> Key: BEAM-8046
> URL: https://issues.apache.org/jira/browse/BEAM-8046
> Project: Beam
> Issue Type: Improvement
> Components: io-py-gcp, runner-dataflow
> Affects Versions: 2.13.0, 2.14.0
> Reporter: James Hutchison
> Priority: P2
> Labels: stale-P2
>
> With the Python SDK:
> The dataflow runner does not allow use of reading from bigquery in streaming 
> pipelines.
> Pubsub is not allowed for batch pipelines.
> Thus, there's no way to create a pipeline on the dataflow runner that reads 
> from bigquery and publishes to pubsub.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8046) Unable to read from bigquery and publish to pubsub using dataflow runner (python SDK)

2019-08-27 Thread Chamikara Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916973#comment-16916973
 ] 

Chamikara Jayalath commented on BEAM-8046:
--

I think the issue there is that the current PubSub implementation in the 
Python SDK is a native source/sink implemented in the Dataflow streaming 
backend.



[jira] [Commented] (BEAM-8046) Unable to read from bigquery and publish to pubsub using dataflow runner (python SDK)

2019-08-27 Thread James Hutchison (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916957#comment-16916957
 ] 

James Hutchison commented on BEAM-8046:
---

I would find more value in being able to publish to pubsub in a batch pipeline. 
It seems odd that it prevents you from using it in a batch pipeline as it 
doesn't feel like it should require anything special.
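
For illustration only, here is a minimal sketch of what publishing from a 
bounded (batch) job amounts to: iterate over a finite set of rows and hand 
them to a publisher in batches. This is not Beam or Dataflow code. The names 
publish_in_batches and publish_fn are hypothetical, and publish_fn stands in 
for whatever Pub/Sub client call a pipeline step would actually make.

```python
import json
from typing import Callable, Iterable, List


def publish_in_batches(
    rows: Iterable[dict],
    publish_fn: Callable[[List[bytes]], None],
    batch_size: int = 100,
) -> int:
    """Encode rows as JSON bytes and pass them to publish_fn in batches.

    publish_fn is a hypothetical stand-in for a real Pub/Sub publish call.
    Returns the number of rows handed to publish_fn.
    """
    batch: List[bytes] = []
    count = 0
    for row in rows:
        batch.append(json.dumps(row, sort_keys=True).encode("utf-8"))
        count += 1
        if len(batch) >= batch_size:
            publish_fn(batch)
            batch = []  # start a fresh list so earlier batches are not mutated
    if batch:
        # Flush the final, possibly partial, batch.
        publish_fn(batch)
    return count


if __name__ == "__main__":
    sent: List[List[bytes]] = []
    total = publish_in_batches(({"id": i} for i in range(5)), sent.append, batch_size=2)
    print(total, [len(b) for b in sent])
```

Because the input is finite, the only extra step compared with a streaming 
sink is the final flush of the partial batch.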



[jira] [Commented] (BEAM-8046) Unable to read from bigquery and publish to pubsub using dataflow runner (python SDK)

2019-08-27 Thread Chamikara Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916899#comment-16916899
 ] 

Chamikara Jayalath commented on BEAM-8046:
--

BigQuery is a bounded source, so we'll need to add an 
unbounded-read-from-bounded-source wrapper to support it in streaming 
pipelines. For example, we have [1] for Java.

Also, we don't have an unbounded source framework for Python yet. Splittable 
DoFn is currently in the works to this end.

[1] https://github.com/apache/beam/blob/master/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/UnboundedReadFromBoundedSource.java
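
To make the wrapper idea concrete, here is a conceptual sketch in plain 
Python. It is not the Beam API and does not mirror the Java class in [1]. The 
class name and its advance/current/watermark methods are assumptions chosen 
for illustration. The idea shown is that a finite input is exposed through an 
unbounded-style reader whose watermark stays at the minimum while records 
remain and jumps to the maximum once the input is exhausted.

```python
import math
from typing import Generic, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


class UnboundedReaderOverBounded(Generic[T]):
    """Hypothetical reader that presents a bounded input as an unbounded one."""

    def __init__(self, bounded: Iterable[T]) -> None:
        self._it: Iterator[T] = iter(bounded)
        self._current: Optional[T] = None
        self._has_current = False
        self._done = False

    def advance(self) -> bool:
        """Move to the next record; return False when none is available."""
        if self._done:
            return False
        try:
            self._current = next(self._it)
            self._has_current = True
            return True
        except StopIteration:
            # The bounded input is exhausted: no more records will ever arrive.
            self._done = True
            self._has_current = False
            return False

    def current(self) -> T:
        if not self._has_current:
            raise ValueError("no current record; call advance() first")
        return self._current  # type: ignore[return-value]

    def watermark(self) -> float:
        """Minimum while reading, maximum once the input is exhausted."""
        return math.inf if self._done else -math.inf


if __name__ == "__main__":
    reader = UnboundedReaderOverBounded(["row-1", "row-2"])
    while reader.advance():
        print(reader.current())
    print(reader.watermark())
```

Advancing the watermark to the maximum is what would let a streaming runner 
know that downstream windows can close even though the source is nominally 
unbounded.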



[jira] [Commented] (BEAM-8046) Unable to read from bigquery and publish to pubsub using dataflow runner (python SDK)

2019-08-27 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916888#comment-16916888
 ] 

Valentyn Tymofieiev commented on BEAM-8046:
---

AFAIK the limitations described here are correct as of now.
[~altay] [~angoenka] [~chamikara] Do we have an issue that tracks support for 
the BigQuery source in streaming pipelines?





[jira] [Commented] (BEAM-8046) Unable to read from bigquery and publish to pubsub using dataflow runner (python SDK)

2019-08-27 Thread James Hutchison (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916837#comment-16916837
 ] 

James Hutchison commented on BEAM-8046:
---

Clarified in the description that you can't _read_ from BigQuery in streaming 
pipelines.



[jira] [Commented] (BEAM-8046) Unable to read from bigquery and publish to pubsub using dataflow runner (python SDK)

2019-08-27 Thread Ismaël Mejía (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916534#comment-16916534
 ] 

Ismaël Mejía commented on BEAM-8046:


[~tvalentyn] or someone else, can you please give an answer to this one?

> Unable to read from bigquery and publish to pubsub using dataflow runner 
> (python SDK)
> -
>
> Key: BEAM-8046
> URL: https://issues.apache.org/jira/browse/BEAM-8046
> Project: Beam
> Issue Type: Improvement
> Components: runner-dataflow
> Affects Versions: 2.13.0, 2.14.0
> Reporter: James Hutchison
> Priority: Major
>
> With the Python SDK:
> The dataflow runner does not allow use of bigquery in streaming pipelines.
> Pubsub is not allowed for batch pipelines.
> Thus, there's no way to create a pipeline on the dataflow runner that reads 
> from bigquery and publishes to pubsub.


