[
https://issues.apache.org/jira/browse/BEAM-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bruce Arctor updated BEAM-7871:
-------------------------------
Comment: was deleted
(was: [~zdenulo], is this because there is no IO connector for the Firebase API
(Firestore in native mode)? I see that there is a Datastore connector, so I
was wondering whether it works with Firestore in Datastore mode. I haven't tried
it yet, but this is the task I was hoping to accomplish (PubSub -> Beam/Dataflow ->
Firestore in native mode, meaning the Firebase API). I am just starting to look
into it, which led me here.)
> Streaming from PubSub to Firestore doesn't work on Dataflow
> -----------------------------------------------------------
>
> Key: BEAM-7871
> URL: https://issues.apache.org/jira/browse/BEAM-7871
> Project: Beam
> Issue Type: Bug
> Components: io-py-gcp, runner-dataflow
> Affects Versions: 2.13.0
> Reporter: Zdenko Hrcek
> Priority: Major
>
> I ran into the same error as described here
> [https://stackoverflow.com/questions/57059944/python-package-errors-while-running-gcp-dataflow]
> but I don't see it reported anywhere, so I am creating an issue just in case.
> The pipeline is quite simple: it reads from PubSub and writes to Firestore.
> The Beam version used is 2.13.0, with Python 2.7.
> With DirectRunner it works fine, but on Dataflow it throws the following error:
>
> {code:java}
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error
> received from SDK harness for instruction -81: Traceback (most recent call
> last):
> File
> "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py",
> line 157, in _execute
> response = task()
> File
> "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py",
> line 190, in <lambda>
> self._execute(lambda: worker.do_instruction(work), work)
> File
> "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py",
> line 312, in do_instruction
> request.instruction_id)
> File
> "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py",
> line 331, in process_bundle
> bundle_processor.process_bundle(instruction_id))
> File
> "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/bundle_processor.py",
> line 538, in process_bundle
> op.start()
> File "apache_beam/runners/worker/operations.py", line 554, in
> apache_beam.runners.worker.operations.DoOperation.start
> def start(self):
> File "apache_beam/runners/worker/operations.py", line 555, in
> apache_beam.runners.worker.operations.DoOperation.start
> with self.scoped_start_state:
> File "apache_beam/runners/worker/operations.py", line 557, in
> apache_beam.runners.worker.operations.DoOperation.start
> self.dofn_runner.start()
> File "apache_beam/runners/common.py", line 778, in
> apache_beam.runners.common.DoFnRunner.start
> self._invoke_bundle_method(self.do_fn_invoker.invoke_start_bundle)
> File "apache_beam/runners/common.py", line 775, in
> apache_beam.runners.common.DoFnRunner._invoke_bundle_method
> self._reraise_augmented(exn)
> File "apache_beam/runners/common.py", line 800, in
> apache_beam.runners.common.DoFnRunner._reraise_augmented
> raise_with_traceback(new_exn)
> File "apache_beam/runners/common.py", line 773, in
> apache_beam.runners.common.DoFnRunner._invoke_bundle_method
> bundle_method()
> File "apache_beam/runners/common.py", line 359, in
> apache_beam.runners.common.DoFnInvoker.invoke_start_bundle
> def invoke_start_bundle(self):
> File "apache_beam/runners/common.py", line 363, in
> apache_beam.runners.common.DoFnInvoker.invoke_start_bundle
> self.signature.start_bundle_method.method_value())
> File
> "/home/zdenulo/dev/gcp_stuff/df_firestore_stream/df_firestore_stream.py",
> line 39, in start_bundle
> NameError: global name 'firestore' is not defined [while running
> 'generatedPtransform-64']
>
> {code}
> Interestingly, using Beam version 2.12.0 solves the problem on
> Dataflow and the pipeline works as expected; I am not sure what the cause
> could be.
> Here is a repository with the code that was used:
> [https://github.com/zdenulo/dataflow_firestore_stream]
>
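The report does not identify a root cause, so the following is only a sketch of one possible explanation for a NameError like the one above: a function that depends on a module-level import is rebuilt in an environment where that import was never executed. The example uses only the standard library; `json` stands in for `firestore`, and `run_in_fresh_namespace` is a hypothetical helper that imitates a worker without the main module's globals. It is not Beam code and does not reproduce the Dataflow behaviour itself.

```python
import builtins
import json
import types


def start_bundle_global():
    # Relies on the module-level `json` import, the way start_bundle in the
    # report relies on a module-level `firestore` name.
    return json.dumps({"ok": True})


def start_bundle_local():
    # Imports inside the method, so the name is resolved wherever the
    # function actually runs.
    import json
    return json.dumps({"ok": True})


def run_in_fresh_namespace(fn):
    # Hypothetical helper: rebuild the function with globals that contain
    # only the builtins, as a rough stand-in for a worker process that never
    # executed the main module's imports.
    rebuilt = types.FunctionType(fn.__code__, {"__builtins__": builtins})
    return rebuilt()


if __name__ == "__main__":
    print(start_bundle_global())                       # works in the defining module
    print(run_in_fresh_namespace(start_bundle_local))  # works: import is local
    try:
        run_in_fresh_namespace(start_bundle_global)
    except NameError as exc:
        print("NameError:", exc)
```

If this is indeed what happens here, two commonly used workarounds in Beam Python pipelines are importing the module inside the DoFn method that uses it, or enabling the `save_main_session` pipeline option so that the main module's globals are pickled and restored on the workers. Neither is confirmed as the fix for this issue.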
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)