[ https://issues.apache.org/jira/browse/BEAM-7439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16850288#comment-16850288 ]
Valentyn Tymofieiev commented on BEAM-7439:
-------------------------------------------
[~chamikara], yes, there is a gap in test coverage. [~Juta]'s
https://github.com/apache/beam/pull/8170 is adding such a test:
test_big_query_write_without_schema
> Bigquery Write with schema None: TypeError: 'NoneType' object has no
> attribute '__getitem__'
> --------------------------------------------------------------------------------------------
>
> Key: BEAM-7439
> URL: https://issues.apache.org/jira/browse/BEAM-7439
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Juta Staes
> Assignee: Pablo Estrada
> Priority: Blocker
> Fix For: 2.13.0
>
>
> When running a simple write to BigQuery on apache-beam==2.12.0
> {code:python}
> import apache_beam as beam
>
> # `pipeline` is an existing beam.Pipeline; no schema is passed to the write.
> input_data = [
>     {'str': 'test'}
> ]
> (pipeline | 'create' >> beam.Create(input_data)
>           | 'write' >> beam.io.WriteToBigQuery(
>               '<project-id>:beam_test.test'))
> {code}
>
> I get the following error:
> {code:java}
> WARNING:root:Start running in the cloud
> Traceback (most recent call last):
>   File "test_pipeline.py", line 193, in <module>
>     main()
>   File "test_pipeline.py", line 183, in main
>     '<project-id>:beam_test.test'))
>   File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/pvalue.py", line 112, in __or__
>     return self.pipeline.apply(ptransform, self)
>   File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 470, in apply
>     label or transform.label)
>   File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 480, in apply
>     return self.apply(transform, pvalueish)
>   File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 516, in apply
>     pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
>   File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 193, in apply
>     return m(transform, input, options)
>   File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 617, in apply_WriteToBigQuery
>     parse_table_schema_from_json(json.dumps(transform.schema)),
>   File "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/io/gcp/bigquery_tools.py", line 130, in parse_table_schema_from_json
>     fields = [_parse_schema_field(f) for f in json_schema['fields']]
> TypeError: 'NoneType' object has no attribute '__getitem__'
> {code}
> I already proposed a fix for this as part of a larger PR:
> https://github.com/apache/beam/pull/8621/commits/41cdfbda5a4e2a56b6d10046ba265ad68c78675d
> Does this also need to be patched for version 2.12.0?
> cc: [~tvalentyn] [~pabloem]
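
For readers following the traceback: the failure can be reproduced without Beam or a runner. The sketch below is only an illustration under stated assumptions. parse_fields is a hypothetical stand-in for the parser shown in the last frame, and parse_fields_or_none is a hypothetical guard, not necessarily what the linked commit does. It shows that json.dumps(None) produces the string 'null', which loads back as None, so looking up 'fields' raises TypeError.

```python
import json


def parse_fields(schema_json):
    """Hypothetical stand-in for the failing parser: reads the 'fields' list."""
    json_schema = json.loads(schema_json)
    return [f["name"] for f in json_schema["fields"]]


def parse_fields_or_none(schema):
    """Hypothetical guard: skip parsing when no schema was supplied."""
    if schema is None:
        return None
    return parse_fields(json.dumps(schema))


# json.dumps(None) yields 'null', which json.loads turns back into None,
# so indexing it with ['fields'] raises TypeError, as in the traceback.
try:
    parse_fields(json.dumps(None))
    raised = False
except TypeError:
    raised = True
```

With the guard in place, a missing schema is passed through as None instead of being round-tripped through JSON, while a real schema dict is parsed as before.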