kmjung commented on pull request #13083:
URL: https://github.com/apache/beam/pull/13083#issuecomment-731295425
In your examples above:
```
BigQueryIO
    .read<T>(...)
    .from("myproject.mydataset.mytable")
    // withSelectedFields takes a List<String> of column names
    .withSelectedFields(Arrays.asList("my_string_field_1"))
    .withMethod(Method.DIRECT_READ)
```
This would incur only BigQuery Storage API charges, billed on the uncompressed
size of the `my_string_field_1` column (e.g. at $1.10/TiB). The BigQuery query
engine isn't involved here, so the $5/TiB query cost doesn't apply.
```
BigQueryIO
    .read<T>(...)
    .fromQuery(
        "SELECT my_string_field_1 || 'my_concat_business_logic_for_this_field' "
            + "FROM `myproject.mydataset.mytable`")
    .usingStandardSql()
    .withMethod(Method.DIRECT_READ)
```
This is a BigQuery query -- it will be executed as a query job, the query
results will be written to an anonymous table, and then Beam will use the
storage API to read the results from the anonymous table. You'll pay the
standard $5/TiB on-demand query cost here (unless you're using a BigQuery
reservation), but there won't be any costs associated with the storage API
usage in this case because the target is an anonymous table.
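To make the difference concrete, here is a rough back-of-the-envelope sketch of the two billing paths described above. The class and method names are hypothetical, the rates are just the example figures quoted in this comment (actual pricing may differ), and it assumes the query scans only the one referenced column, so both paths touch the same number of bytes:

```java
// Hypothetical helper, not part of Beam or the BigQuery client libraries.
final class ReadCostSketch {
    // Example rates taken from the comment above; check current pricing before relying on them.
    static final double STORAGE_API_USD_PER_TIB = 1.10;
    static final double ON_DEMAND_QUERY_USD_PER_TIB = 5.00;

    /** Direct table read: only storage API charges on the selected columns' uncompressed size. */
    public static double directReadCostUsd(double selectedColumnsTiB) {
        return selectedColumnsTiB * STORAGE_API_USD_PER_TIB;
    }

    /**
     * Query-based read: on-demand query charges on the bytes scanned.
     * Reading the anonymous result table through the storage API adds nothing.
     */
    public static double queryReadCostUsd(double bytesScannedTiB) {
        return bytesScannedTiB * ON_DEMAND_QUERY_USD_PER_TIB;
    }

    public static void main(String[] args) {
        double columnTiB = 2.0; // assumed uncompressed size of my_string_field_1
        System.out.println("direct read (USD): " + directReadCostUsd(columnTiB));
        System.out.println("query read (USD):  " + queryReadCostUsd(columnTiB));
    }
}
```

Under these assumptions, a 2 TiB column would cost roughly $2.20 to read directly versus roughly $10 via the query path without a reservation.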
I think your last example sums things up correctly.