Juta Staes created BEAM-7439:
--------------------------------
Summary: Bigquery Write with schema None: TypeError: 'NoneType' object has no attribute '__getitem__'
Key: BEAM-7439
URL: https://issues.apache.org/jira/browse/BEAM-7439
Project: Beam
Issue Type: Bug
Components: sdk-py-core
Reporter: Juta Staes
When running a simple write to BigQuery without a schema on apache-beam==2.12.0:
{code:python}
input_data = [
    {'str': 'test'}
]
(pipeline
 | 'create' >> beam.Create(input_data)
 | 'write' >> beam.io.WriteToBigQuery(
     '<project-id>:beam_test.test'))  # no schema argument, so schema is None
{code}
I get the following error:
{code:java}
WARNING:root:Start running in the cloud
Traceback (most recent call last):
  File "test_pipeline.py", line 193, in <module>
    main()
  File "test_pipeline.py", line 183, in main
    '<project-id>:beam_test.test'))
  File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/pvalue.py", line 112, in __or__
    return self.pipeline.apply(ptransform, self)
  File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 470, in apply
    label or transform.label)
  File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 480, in apply
    return self.apply(transform, pvalueish)
  File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 516, in apply
    pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
  File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 193, in apply
    return m(transform, input, options)
  File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 617, in apply_WriteToBigQuery
    parse_table_schema_from_json(json.dumps(transform.schema)),
  File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/io/gcp/bigquery_tools.py", line 130, in parse_table_schema_from_json
    fields = [_parse_schema_field(f) for f in json_schema['fields']]
TypeError: 'NoneType' object has no attribute '__getitem__'
{code}
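The last two frames of the traceback suggest the mechanism: the schema is None, json.dumps(None) produces the JSON literal 'null', and parsing that back gives None, which is then indexed with ['fields']. The sketch below reproduces this using only the standard library. The functions parse_schema_fields and parse_schema_fields_guarded are hypothetical stand-ins, not Beam's actual parse_table_schema_from_json, and the guard shown is only one possible way to handle a missing schema; it is not necessarily what the proposed fix does.

```python
import json


def parse_schema_fields(schema_json):
    """Hypothetical, simplified stand-in for the schema parser in the traceback."""
    json_schema = json.loads(schema_json)
    # If the schema was None, json_schema is None here and indexing raises TypeError.
    return [f['name'] for f in json_schema['fields']]


def parse_schema_fields_guarded(schema_json):
    """Same sketch, but treats a missing schema as 'no schema' (an assumption)."""
    json_schema = json.loads(schema_json)
    if json_schema is None:
        return None
    return [f['name'] for f in json_schema['fields']]


# json.dumps(None) yields the JSON literal 'null', which loads back as None.
serialized = json.dumps(None)

try:
    parse_schema_fields(serialized)
    failed = False
except TypeError:
    # Same error class as in the report; the message text differs by Python version.
    failed = True
```

With the guard in place, a None schema passes through as None, while a real schema such as {'fields': [{'name': 'str', 'type': 'STRING'}]} is still parsed as before.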
I have already proposed a fix for this as part of a larger PR:
https://github.com/apache/beam/pull/8621/commits/41cdfbda5a4e2a56b6d10046ba265ad68c78675d
Does this also need to be patched for version 2.12.0?
cc: [~tvalentyn] [~pabloem]