Juta Staes created BEAM-7439:
--------------------------------

             Summary: Bigquery Write with schema None: TypeError: 'NoneType' object has no attribute '__getitem__'
                 Key: BEAM-7439
                 URL: https://issues.apache.org/jira/browse/BEAM-7439
             Project: Beam
          Issue Type: Bug
          Components: sdk-py-core
            Reporter: Juta Staes


When running a simple write to BigQuery on apache-beam==2.12.0:

{code:java}
input_data = [
    {'str': 'test'}
]
(pipeline
 | 'create' >> beam.Create(input_data)
 | 'write' >> beam.io.WriteToBigQuery('<project-id>:beam_test.test'))
{code}
 

I get the following error:
{code:java}
WARNING:root:Start running in the cloud
Traceback (most recent call last):
  File "test_pipeline.py", line 193, in <module>
    main()
  File "test_pipeline.py", line 183, in main
    '<project-id>:beam_test.test'))
  File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/pvalue.py", line 112, in __or__
    return self.pipeline.apply(ptransform, self)
  File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 470, in apply
    label or transform.label)
  File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 480, in apply
    return self.apply(transform, pvalueish)
  File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 516, in apply
    pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
  File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 193, in apply
    return m(transform, input, options)
  File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 617, in apply_WriteToBigQuery
    parse_table_schema_from_json(json.dumps(transform.schema)),
  File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/io/gcp/bigquery_tools.py", line 130, in parse_table_schema_from_json
    fields = [_parse_schema_field(f) for f in json_schema['fields']]
TypeError: 'NoneType' object has no attribute '__getitem__'
{code}
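
For illustration, the last two frames of the traceback can be reduced to the sketch below. It uses only the standard library: `json.dumps(None)` yields the string "null", which parses back to `None`, so indexing it with `'fields'` raises the TypeError. The function names `parse_fields` and `parse_fields_or_none` are hypothetical stand-ins, not Beam APIs, and the guard is only one possible shape of a fix under that assumption, not necessarily what the linked commit does. On Python 3 the exception type is the same, but the message reads "'NoneType' object is not subscriptable".

```python
import json


def parse_fields(schema_string):
    """Simplified stand-in (hypothetical) for the parsing step in the traceback."""
    json_schema = json.loads(schema_string)
    # Fails with TypeError when json_schema is None, i.e. schema_string == "null".
    return [field['name'] for field in json_schema['fields']]


def parse_fields_or_none(schema):
    """Hypothetical guard: skip parsing entirely when no schema was supplied."""
    if schema is None:
        return None
    return parse_fields(json.dumps(schema))


if __name__ == '__main__':
    try:
        parse_fields(json.dumps(None))
    except TypeError as err:
        print('unguarded call failed:', err)

    print('guarded call with None:', parse_fields_or_none(None))
    print('guarded call with schema:',
          parse_fields_or_none({'fields': [{'name': 'str', 'type': 'STRING'}]}))
```

Returning `None` from the guard is an assumption made for this sketch; whether a missing schema should be passed through as `None` or handled another way depends on what the runner expects downstream.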

I already proposed a fix for this as part of a larger PR: 
https://github.com/apache/beam/pull/8621/commits/41cdfbda5a4e2a56b6d10046ba265ad68c78675d

Does this also need to be patched for version 2.12.0?

cc: [~tvalentyn] [~pabloem]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
