[
https://issues.apache.org/jira/browse/BEAM-4824?focusedWorklogId=152182&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-152182
]
ASF GitHub Bot logged work on BEAM-4824:
----------------------------------------
Author: ASF GitHub Bot
Created on: 08/Oct/18 10:07
Start Date: 08/Oct/18 10:07
Worklog Time Spent: 10m
Work Description: reuvenlax commented on issue #6055: [BEAM-4824] Batch
BigQueryIO returns job results
URL: https://github.com/apache/beam/pull/6055#issuecomment-427781427
@calonso sorry I've been traveling internationally a lot recently (on
vacation now) and haven't stayed on top of this.
I have some concerns about adding this capability given that there will be a
new BigQuery connector that doesn't use these inserts at all, and code that
relies on this functionality simply will not function with the new connector.
However, as long as the capability is guarded behind Experimental (meaning that
the functionality will change, and no compatibility guarantees are made), I
think it's OK to add this.
My biggest concern with the current PR is changing the types of PCollections
(e.g. from String to BigQueryWriteResult). Several runners (Dataflow, Flink)
support updating streaming pipelines. However, if one of these PCollections has
changed types, the update must fail. Is there any way to structure this without
changing intermediate types?
Issue Time Tracking
-------------------
Worklog Id: (was: 152182)
Time Spent: 1h 40m (was: 1.5h)
> Get BigQueryIO batch loads to return something actionable
> ---------------------------------------------------------
>
> Key: BEAM-4824
> URL: https://issues.apache.org/jira/browse/BEAM-4824
> Project: Beam
> Issue Type: Improvement
> Components: io-java-gcp
> Reporter: Carlos Alonso
> Assignee: Carlos Alonso
> Priority: Minor
> Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> At the moment, BigQueryIO batch loads return an empty collection that carries
> no information about how the load job finished. It is even returned before
> the job finishes.
>
> Change it so that:
> # The returned PCollection only appears when the job has actually finished
> # The returned PCollection contains information about the job result