BTW, my second comment still stands, I think. BigQueryIO currently uses load jobs as an implementation detail: it might create one load job per table, or multiple load jobs per table (if the table is very large), so collapsing those multiple jobs together could be confusing. More generally, making information about these jobs part of the public API seems confusing when the actual logical model is per record.
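To illustrate the per-record model being referred to (a minimal sketch, not code from this PR; the class name and table reference are placeholders): the user hands BigQueryIO a PCollection of rows, and how many load jobs the sink issues per table is an internal detail of the FILE_LOADS path.

```java
import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.TableRowJsonCoder;
import org.apache.beam.sdk.io.gcp.bigquery.WriteResult;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.values.PCollection;

public class PerRecordWriteSketch {
  public static void main(String[] args) {
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
    Pipeline p = Pipeline.create(options);

    // The user-facing model: a PCollection of individual rows.
    PCollection<TableRow> rows =
        p.apply(
            Create.of(
                    new TableRow().set("name", "alice"),
                    new TableRow().set("name", "bob"))
                .withCoder(TableRowJsonCoder.of()));

    // The sink decides internally whether one or several load jobs are
    // needed for the destination table; that is not part of this API surface.
    WriteResult result =
        rows.apply(
            BigQueryIO.writeTableRows()
                .to("my-project:my_dataset.my_table") // hypothetical table
                .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
                .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER));

    p.run().waitUntilFinish();
  }
}
```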
Another thing: there are upcoming changes to the BigQuery API, and we plan to get rid of load jobs entirely in BigQueryIO. If we make information about load jobs part of the public API, that could become problematic when we remove them. Is this something that could be accomplished with better logging, or are there concrete use cases for having the output in a PCollection?
