[ 
https://issues.apache.org/jira/browse/BEAM-1440?focusedWorklogId=355248&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-355248
 ]

ASF GitHub Bot logged work on BEAM-1440:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Dec/19 15:01
            Start Date: 06/Dec/19 15:01
    Worklog Time Spent: 10m 
      Work Description: kamilwu commented on issue #9772: [BEAM-1440] Create a 
BigQuery source that implements iobase.BoundedSource for Python
URL: https://github.com/apache/beam/pull/9772#issuecomment-562605478
 
 
   Thanks @robertwb for your comments!
   
   > Why does this not work on the direct runners. Is it an issue of needing to 
be split first?
   
   Yes. I've already created a jira for this: 
https://issues.apache.org/jira/browse/BEAM-8528
   
   > would it make sense to implement this as an SDF instead?
   
   My first attempt was a regular (non-splittable) DoFn that triggers an 
export job, followed by `MatchAll` and `ReadMatches` transforms. This worked, 
but I had trouble implementing the rest: waiting for the query job, waiting 
for the export job, and removing the JSON files after reading. Using the 
Source API turned out to be simpler. 
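
For illustration only (this is not code from the PR), the steps that were hard to fit into a DoFn could be sketched roughly as below. The names `wait_for_job`, `read_and_clean_up` and `demo` are hypothetical, the job is simulated by a plain callable instead of a real BigQuery client, and the export is assumed to be newline-delimited JSON.

```python
import json
import os
import tempfile
import time


def wait_for_job(is_done, poll_interval=1.0, timeout=60.0, sleep=time.sleep):
    """Poll the (hypothetical) is_done() callable until it returns True."""
    deadline = time.monotonic() + timeout
    while not is_done():
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish in time")
        sleep(poll_interval)


def read_and_clean_up(paths):
    """Yield rows from newline-delimited JSON files, then delete the files."""
    try:
        for path in paths:
            with open(path) as f:
                for line in f:
                    if line.strip():
                        yield json.loads(line)
    finally:
        # Runs once the generator is exhausted or closed.
        for path in paths:
            if os.path.exists(path):
                os.remove(path)


def demo():
    # Stand-in for an export job: one JSON file, "done" on the third poll.
    tmp_dir = tempfile.mkdtemp()
    path = os.path.join(tmp_dir, "export-000.json")
    with open(path, "w") as f:
        f.write('{"id": 1}\n{"id": 2}\n')
    polls = iter([False, False, True])
    wait_for_job(lambda: next(polls), poll_interval=0)
    rows = list(read_and_clean_up([path]))
    still_exists = os.path.exists(path)
    os.rmdir(tmp_dir)
    return rows, still_exists
```

In a source-style implementation, the waiting can happen before any splits are produced and the cleanup after the last record is read, which is presumably why that approach felt simpler than chaining separate transforms.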
   
 


Issue Time Tracking
-------------------

    Worklog Id:     (was: 355248)
    Time Spent: 16h 40m  (was: 16.5h)

> Create a BigQuery source (that implements iobase.BoundedSource) for Python SDK
> ------------------------------------------------------------------------------
>
>                 Key: BEAM-1440
>                 URL: https://issues.apache.org/jira/browse/BEAM-1440
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Chamikara Madhusanka Jayalath
>            Assignee: Kamil Wasilewski
>            Priority: Major
>          Time Spent: 16h 40m
>  Remaining Estimate: 0h
>
> Currently we have a BigQuery native source for Python SDK [1].
> This can only be used by Dataflow runner.
> We should implement a Beam BigQuery source that implements 
> iobase.BoundedSource [2] interface so that other runners that try to use 
> Python SDK can read from BigQuery as well. Java SDK already has a Beam 
> BigQuery source [3].
> [1] 
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery.py
> [2] 
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/iobase.py#L70
> [3] 
> https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L1189



