urquha opened a new issue, #25531: URL: https://github.com/apache/beam/issues/25531
### What happened?

I am trying to write to BigQuery using the Python SDK. I passed my table reference into `WriteToBigQuery` as `project:dataset.Tablename` and I get this error:

`beam.py 162 <module> parsed_results | beam.io.WriteToBigQuery( bigquery.py 1934 __init__ self.table_reference = bigquery_tools.parse_table_reference( bigquery_tools.py 244 parse_table_reference if isinstance(table, TableReference): TypeError: isinstance() arg 2 must be a type, a tuple of types, or a union`

I looked in the SDK and I can't find where `TableReference` is defined. On line 87 of [bigquery tools](https://github.com/apache/beam/blob/6adecd438790d8c1b5182043db16232b68ff7a98/sdks/python/apache_beam/io/gcp/bigquery_tools.py) there is a line which imports `TableReference`, but when I run that import myself it gives me an error:

`from apache_beam.io.gcp.internal.clients.bigquery import TableReference *** ImportError: cannot import name 'TableReference' from 'apache_beam.io.gcp.internal.clients.bigquery' (/opt/homebrew/lib/python3.10/site-packages/apache_beam/io/gcp/internal/clients/bigquery/__init__.py)`

I believe this leads to `isinstance` being passed `None` in place of `TableReference`, which is breaking my code. I have copied code from a few different places which supposedly would work and it all gives the same error. I really hope that I'm doing something dumb.
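The suspected failure mode can be reproduced without Beam. The following is a minimal sketch, assuming the import is guarded by a `try`/`except ImportError` that falls back to `None` (this is an assumption, not confirmed from the SDK source). The module name `nonexistent_optional_package_for_demo` and the function `toy_parse_table_reference` are hypothetical stand-ins, not Beam APIs.

```python
# Assumption: an optional import is wrapped so that a failure leaves the
# name bound to None. The module below is hypothetical and is expected
# to be missing, which forces the fallback branch.
try:
    from nonexistent_optional_package_for_demo import TableReference
except ImportError:
    TableReference = None


def toy_parse_table_reference(table):
    """Toy parser (not Beam's): passes through TableReference instances."""
    # With TableReference set to None, isinstance() itself raises TypeError,
    # because its second argument is not a type.
    if isinstance(table, TableReference):
        return table
    return str(table)


def demo():
    """Return the TypeError message, or None if no error was raised."""
    try:
        toy_parse_table_reference("project:dataset.Tablename")
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(demo())
```

If this matches what happens in the SDK, the `TypeError` is only a symptom and the failed import is the thing to investigate, for example by checking whether the optional GCP dependencies of the Beam install are present.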
I even tried copying the [test](https://github.com/apache/beam/blob/6adecd438790d8c1b5182043db16232b68ff7a98/sdks/python/apache_beam/io/gcp/bigquery_tools_test.py) by importing and using the bigquery client:

`from apache_beam.io.gcp.internal.clients import bigquery bigquery.TableReference() *** AttributeError: module 'apache_beam.io.gcp.internal.clients.bigquery' has no attribute 'TableReference'`

Also, when I try to use the code from the docs, there is a closing bracket missing from the `table_names` assignment (marked with the arrow below), and it uses `beam.Create` twice, which is not allowed and returns an error:

`with Pipeline() as p: elements = (p | beam.Create([ {'type': 'error', 'timestamp': '12:34:56', 'message': 'bad'}, {'type': 'user_log', 'timestamp': '12:34:59', 'query': 'flu symptom'}, ])) table_names = (p | beam.Create([ ('error', 'my_project:dataset1.error_table_for_today'), ('user_log', 'my_project:dataset1.query_table_for_today'), ]) <------ table_names_dict = beam.pvalue.AsDict(table_names) elements | beam.io.gcp.bigquery.WriteToBigQuery( table=lambda row, table_dict: table_dict[row['type']], table_side_inputs=(table_names_dict,))`

### Issue Priority

Priority: 1 (data loss / total loss of function)

### Issue Components

- [X] Component: Python SDK
- [ ] Component: Java SDK
- [ ] Component: Go SDK
- [ ] Component: Typescript SDK
- [ ] Component: IO connector
- [ ] Component: Beam examples
- [ ] Component: Beam playground
- [ ] Component: Beam katas
- [ ] Component: Website
- [ ] Component: Spark Runner
- [ ] Component: Flink Runner
- [ ] Component: Samza Runner
- [ ] Component: Twister2 Runner
- [ ] Component: Hazelcast Jet Runner
- [ ] Component: Google Cloud Dataflow Runner
