[ 
https://issues.apache.org/jira/browse/BEAM-4824?focusedWorklogId=145452&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145452
 ]

ASF GitHub Bot logged work on BEAM-4824:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Sep/18 19:43
            Start Date: 18/Sep/18 19:43
    Worklog Time Spent: 10m 
      Work Description: reuvenlax commented on issue #6055: [BEAM-4824] Batch 
BigQueryIO returns job results
URL: https://github.com/apache/beam/pull/6055#issuecomment-422524409
 
 
   BTW, I think my second comment still stands. BigQueryIO currently uses load 
jobs as an implementation detail. It might end up creating one load job per 
table, or it might end up creating multiple load jobs per table (if the table 
is very large). Collapsing the multiple jobs together might be very confusing. 
I also think that making information about these jobs part of the public API 
is confusing when the actual logical model is per record.
   
   Another thing: there will be upcoming changes to the BigQuery API, and we 
plan on getting rid of load jobs entirely from BigQueryIO. If we make 
information about load jobs part of the public API, it might be problematic 
when we remove the load jobs.
   
   Is this something that could be accomplished with better logging, or are 
there concrete use cases for wanting the output in a PCollection?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 145452)
    Time Spent: 1h 10m  (was: 1h)

> Get BigQueryIO batch loads to return something actionable
> ---------------------------------------------------------
>
>                 Key: BEAM-4824
>                 URL: https://issues.apache.org/jira/browse/BEAM-4824
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-gcp
>            Reporter: Carlos Alonso
>            Assignee: Carlos Alonso
>            Priority: Minor
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently, BigQueryIO batch loads return an empty collection that has no 
> information about how the load job finished. It is even returned before the 
> job finishes.
>  
> Change it so that:
>  # The returning PCollection only appears when the job has actually finished
>  # The returning PCollection contains information about the job result



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
