[
https://issues.apache.org/jira/browse/BEAM-7439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16849853#comment-16849853
]
Valentyn Tymofieiev commented on BEAM-7439:
-------------------------------------------
Thanks for reporting this, [~Juta].
Hey [~pabloem], I saw you working on some BQ fixes recently. What's your take
on whether we need to cherry-pick a fix for this into 2.13.0, which is
currently being voted on? It may help to answer these questions:
- Do we know if the error is present in 2.11.0?
- Do we know if this issue is fixed in 2.13.0 RC 1?
- Do we know if it is present in Dataflow runner?
If it is a Direct runner-only issue, we could call it out as a limitation in
the release notes for 2.12.0 and 2.13.0.
cc: [~altay] [~goenka].
> Bigquery Write with schema None: TypeError: 'NoneType' object has no
> attribute '__getitem__'
> --------------------------------------------------------------------------------------------
>
> Key: BEAM-7439
> URL: https://issues.apache.org/jira/browse/BEAM-7439
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Juta Staes
> Priority: Major
>
> When running a simple write to bigquery on apache-beam==2.12.0
> {code:java}
> input_data = [
> {'str': 'test'}
> ]
> (pipeline | 'create' >> beam.Create(input_data)
> | 'write' >> beam.io.WriteToBigQuery(
> '<project-id>:beam_test.test'))
> {code}
>
> I get the following error:
> {code:java}
> WARNING:root:Start running in the cloud
> Traceback (most recent call last):
> File "test_pipeline.py", line 193, in <module>
> main()
> File "test_pipeline.py", line 183, in main
> '<project-id>:beam_test.test'))
> File
> "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/pvalue.py",
> line 112, in __or__
> return self.pipeline.apply(ptransform, self)
> File
> "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/pipeline.py",
> line 470, in apply
> label or transform.label)
> File
> "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/pipeline.py",
> line 480, in apply
> return self.apply(transform, pvalueish)
> File
> "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/pipeline.py",
> line 516, in apply
> pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
> File
> "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/runners/runner.py",
> line 193, in apply
> return m(transform, input, options)
> File
> "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py",
> line 617, in apply_WriteToBigQuery
> parse_table_schema_from_json(json.dumps(transform.schema)),
> File
> "/mnt/c/Users/Juta/Documents/02-projects/apache/beam/sdks/venv2/local/lib/python2.7/site-packages/apache_beam/io/gcp/bigquery_tools.py",
> line 130, in parse_table_schema_from_json
> fields = [_parse_schema_field(f) for f in json_schema['fields']]
> TypeError: 'NoneType' object has no attribute '__getitem__'{code}
> I already proposed a fix for this as part of a larger pr:
> https://github.com/apache/beam/pull/8621/commits/41cdfbda5a4e2a56b6d10046ba265ad68c78675d
> I was wondering if this also needs to be patched for version 2.12.0?
> cc: [~tvalentyn] [~pabloem]
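
For reference, the failure in the quoted traceback can probably be reproduced without Beam or BigQuery. The sketch below assumes that parse_table_schema_from_json decodes its argument with json.loads before indexing into it, which the traceback suggests but does not show. The function here is a simplified stand-in, not the Beam implementation, and parse_schema_or_none is a hypothetical guard for illustration only, not necessarily what the linked PR does.

```python
import json


def parse_table_schema_from_json(schema_string):
    # Simplified stand-in for the Beam helper named in the traceback.
    # Assumption: the real helper decodes the string with json.loads.
    json_schema = json.loads(schema_string)
    # With schema=None the caller passes json.dumps(None) == 'null',
    # so json_schema is None here and the subscript raises TypeError.
    return [f for f in json_schema['fields']]


def parse_schema_or_none(schema):
    # Hypothetical guard: skip parsing when no schema was supplied.
    if schema is None:
        return None
    return parse_table_schema_from_json(json.dumps(schema))


if __name__ == '__main__':
    try:
        parse_table_schema_from_json(json.dumps(None))
    except TypeError as exc:
        print('Reproduced: %s' % exc)
    print(parse_schema_or_none(None))
```

On Python 2.7 the exception text should match the one in the report; on Python 3 the same TypeError is raised with a different message.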