Hi Jeff,

Using Java.

Yeah - we are issuing a query rather than reading a table. Materializing
the results myself and reading them back seems simple enough. I will give
that a try!

Thanks,
Matt

On Thu, Jul 2, 2020 at 9:42 AM Jeff Klukas <[email protected]> wrote:

> It sounds like your pipeline is issuing a query rather than reading a
> whole table.
>
> Are you using Java or Python? I'm only familiar with the Java SDK so my
> answer may be Java-biased.
>
> I would recommend materializing the query results to a table, and then
> configuring your pipeline to read that table rather than reading from a
> query. In that case, no query job is involved so you incur no query cost.
>
> By default, the read from a table will do an export to Avro files. There
> is no GCP cost associated with that export, but there is a quota involved,
> which you may run into if you run your pipeline repeatedly. So an even
> better loop would be to do the export to GCS out of band, and then
> reference those Avro files. But that would require much more extensive code
> changes in your pipeline whereas the switch of reading from a query to
> reading from a table is a one-line code change.
>
> You can also avoid the export to Avro files by configuring BigQueryIO to
> use direct reads from your temporary table rather than file exports. There
> is a cost associated with direct reads, but it should generally be much
> smaller than the cost of repeatedly running a query.
>
> On Thu, Jul 2, 2020 at 9:28 AM Matt Terwilliger <
> [email protected]> wrote:
>
>> Hello,
>>
>> I'm writing a Beam pipeline that does some relatively expensive reads
>> from BigQuery. I want to be able to run the pipeline in a development loop
>> without racking up a huge bill.
>>
>> I know BigQuery has support for query caching, but from the docs, that
>> only works if you don't specify a destination table.
>>
>> For the purposes of development, I don't mind trading off stale data
>> (i.e. reusing an existing destination table if it exists) to save money.
>>
>> Is there any way to do this now, or any relevant open issues? I did a
>> quick pass through JIRA but couldn't find anything.
>>
>> Thanks,
>> Matt
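
For reference, the switch from a query read to a table read discussed in this thread might look roughly like the sketch below in the Beam Java SDK. It assumes the query results were already materialized once, outside the pipeline, into a table; the table name `my-project:dev_scratch.query_results` and the class name `DevLoopRead` are hypothetical placeholders, not names from the thread. The `withMethod(Method.DIRECT_READ)` line is optional and selects direct reads instead of the default Avro export.

```java
import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.TypedRead.Method;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.PCollection;

// Hypothetical example class; not taken from the original pipeline.
public class DevLoopRead {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Before: each run issues a new query job and incurs query cost.
    // PCollection<TableRow> rows = p.apply(
    //     BigQueryIO.readTableRows().fromQuery("SELECT ...").usingStandardSql());

    // After: read the previously materialized table, so no query job runs.
    // The table spec is a placeholder; substitute your own.
    PCollection<TableRow> rows =
        p.apply(
            "ReadMaterializedResults",
            BigQueryIO.readTableRows()
                .from("my-project:dev_scratch.query_results")
                // Optional: direct read instead of the default export to Avro files.
                .withMethod(Method.DIRECT_READ));

    // ... downstream transforms on `rows` go here ...

    p.run().waitUntilFinish();
  }
}
```

Dropping the `withMethod` call falls back to the default export-based read, which carries no GCP cost but is subject to the export quota mentioned above. The data in the table is only as fresh as the last time it was materialized, which matches the stale-data trade-off described in the original question.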
