Abacn commented on issue #23029:
URL: https://github.com/apache/beam/issues/23029#issuecomment-1242044282
Could you please share the error message seen when deploying the pipeline to
Dataflow?
I ran a local test and saw the following error when the pipeline could not
connect to the JDBC database:
```
INFO:apache_beam.utils.subprocess_server:WARNING: Configuration class 'org.apache.beam.sdk.extensions.schemaio.expansion.ExternalSchemaIOTransformRegistrar$Configuration' has no schema registered. Attempting to construct with setter approach.
Traceback (most recent call last):
  File "jdbcioTest.py", line 180, in <module>
    test_instance.run_read()
  File "jdbcioTest.py", line 157, in run_read
    p
  File "/Users/yathu/dev/virtualenv/py38beam/lib/python3.8/site-packages/apache_beam/transforms/ptransform.py", line 1095, in __ror__
    return self.transform.__ror__(pvalueish, self.label)
  File "/Users/yathu/dev/virtualenv/py38beam/lib/python3.8/site-packages/apache_beam/transforms/ptransform.py", line 617, in __ror__
    result = p.apply(self, pvalueish, label)
  File "/Users/yathu/dev/virtualenv/py38beam/lib/python3.8/site-packages/apache_beam/pipeline.py", line 663, in apply
    return self.apply(transform, pvalueish)
  File "/Users/yathu/dev/virtualenv/py38beam/lib/python3.8/site-packages/apache_beam/pipeline.py", line 709, in apply
    pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
  File "/Users/yathu/dev/virtualenv/py38beam/lib/python3.8/site-packages/apache_beam/runners/runner.py", line 185, in apply
    return m(transform, input, options)
  File "/Users/yathu/dev/virtualenv/py38beam/lib/python3.8/site-packages/apache_beam/runners/runner.py", line 215, in apply_PTransform
    return transform.expand(input)
  File "/Users/yathu/dev/virtualenv/py38beam/lib/python3.8/site-packages/apache_beam/transforms/external.py", line 526, in expand
    raise RuntimeError(response.error)
RuntimeError: org.apache.beam.sdk.io.jdbc.BeamSchemaInferenceException: Failed to infer Beam schema
	at org.apache.beam.sdk.io.jdbc.JdbcIO$ReadRows.inferBeamSchema(JdbcIO.java:696)
	at org.apache.beam.sdk.io.jdbc.JdbcIO$ReadRows.expand(JdbcIO.java:672)
	at org.apache.beam.sdk.io.jdbc.JdbcIO$ReadRows.expand(JdbcIO.java:592)
	at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:548)
	...
```
If this is also what you see, the cause is that the external transform tries
to infer the schema by connecting to the database at pipeline expansion time,
a step that happens only in the external transform expansion service. I will
investigate whether this can be avoided, and if so, how.
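
To check whether this applies to your setup, a quick pre-flight test along
these lines may help. It is only a sketch, assuming the expansion service runs
on the machine that launches the pipeline (the `subprocess_server` log line
above suggests so) and that the JDBC URL has the form
`jdbc:<subprotocol>://host:port/db`. The helper names `jdbc_host_port` and
`can_reach` are hypothetical and not part of Beam.

```python
import socket
from urllib.parse import urlsplit


def jdbc_host_port(jdbc_url, default_port=None):
    """Extract (host, port) from a URL shaped like jdbc:<subprotocol>://host:port/db.

    Hypothetical helper: other JDBC URL styles (e.g. ';'-separated
    properties) are not handled here.
    """
    if not jdbc_url.startswith("jdbc:"):
        raise ValueError("not a JDBC URL: %r" % jdbc_url)
    # Drop the "jdbc:" prefix so the rest parses like an ordinary URL.
    parts = urlsplit(jdbc_url[len("jdbc:"):])
    port = parts.port if parts.port is not None else default_port
    if parts.hostname is None or port is None:
        raise ValueError("could not find host and port in %r" % jdbc_url)
    return parts.hostname, port


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

If `can_reach(*jdbc_host_port(your_jdbc_url))` returns `False` on the machine
where you launch the pipeline, schema inference at expansion time would be
expected to fail in the same way as the traceback above. A `True` result only
shows that the port is open, not that the credentials or driver are correct.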