pabloem commented on a change in pull request #13170:
URL: https://github.com/apache/beam/pull/13170#discussion_r522405868
##########
File path: sdks/python/apache_beam/io/gcp/bigquery.py
##########
@@ -79,6 +79,41 @@
`ReadFromBigQuery`, you can use the flag `use_json_exports` to export
data as JSON, and receive base64-encoded bytes.
+ReadAllFromBigQuery
+-------------------
+Beam 2.27.0 introduces a new transform called `ReadAllFromBigQuery` which
+allows you to define table and query reads from BigQuery at pipeline
+runtime.:::
+
+  read_requests = p | beam.Create([
+      ReadFromBigQueryRequest(query='SELECT * FROM mydataset.mytable'),
+      ReadFromBigQueryRequest(table='myproject.mydataset.mytable')])
+  results = read_requests | ReadAllFromBigQuery()
+
+A good application for this transform is in streaming pipelines to
+refresh a side input coming from BigQuery. This would work like so:::
+
+  side_input = (
+      p
+      | 'PeriodicImpulse' >> PeriodicImpulse(
+          first_timestamp, last_timestamp, interval, True)
+      | 'MapToReadRequest' >> beam.Map(
+          lambda x: BigQueryReadRequest(table='dataset.table'))
Review comment:
Regarding names - yes, that's a little confusing. The only names should
be:
- `ReadFromBigQueryRequest` - This is an input element for
`ReadAllFromBigQuery`. It represents a query or a table to be read (with a
few other parameters).
- `ReadAllFromBigQuery` - This is the transform that issues BQ reads.

Any other name (e.g. `BigQueryReadRequest` in the example above) is a
misnaming in the configuration.
----
Regarding your example - that's interesting. I recognize that what you show
would be the most common use case (the same query/table every time, rather
than a varying one). The only exception is that some queries could be slightly
updated over time (e.g. to read only the partitions of the last few days).
On the other hand, this would create two ways of using the transform, and
complicate the constructor (all of the parameters in `ReadFromBigQueryRequest`
would need to be available in the constructor).
Users could build this functionality themselves, though. My feeling is that
it's better to build a transform that is more composable, and to provide an
example for users trying to build the functionality you propose. WDYT?
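
As a rough illustration of the "slightly updated over time" case, here is a minimal sketch of a helper that rebuilds the query on every impulse. The name `recent_partitions_query` is hypothetical (it is not part of Beam), and the sketch assumes an ingestion-time partitioned table, so that the `_PARTITIONDATE` pseudo-column is available:

```python
import datetime


def recent_partitions_query(table, days, today=None):
    """Build a standard SQL query over the last `days` daily partitions.

    Hypothetical helper, not part of Beam. Assumes `table` is an
    ingestion-time partitioned table exposing _PARTITIONDATE.
    """
    if days < 1:
        raise ValueError('days must be >= 1')
    today = today or datetime.date.today()
    # The window includes today, so go back (days - 1) days for the start.
    start = today - datetime.timedelta(days=days - 1)
    return "SELECT * FROM `{}` WHERE _PARTITIONDATE >= DATE '{}'".format(
        table, start.isoformat())


# In the pipeline, each impulse would then be mapped to a fresh request:
#   | 'MapToReadRequest' >> beam.Map(
#       lambda _: ReadFromBigQueryRequest(
#           query=recent_partitions_query('myproject.mydataset.mytable', 3),
#           use_standard_sql=True))
#   | ReadAllFromBigQuery()
```

Since the helper runs inside `beam.Map`, the date is evaluated per element rather than at pipeline construction time, which is what keeps the transform itself free of extra constructor parameters.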