Hi Robert,

-dev, as this really seems to be related to improper use on my side. Thanks for the pointer (I somehow missed this in the docs). I tried --save_main_session, but without luck. When adding the flag, the serialization fails with

RecursionError: maximum recursion depth exceeded

My modules do not import one another in a cyclic way (in case that could cause this problem).

If I try to use the "standard" way through setup.py, I still get errors, even when I try to import the module in the function (DoFn, actually) as described in [1]. It looks like the module is not known, even though it is referenced in setup.py (via py_modules). Is there anything that I'm still doing wrong?

Thanks,

 Jan

[1] https://cloud.google.com/dataflow/docs/resources/faq#how_do_i_handle_nameerrors

On 9/24/21 6:14 PM, Robert Bradshaw wrote:
On Fri, Sep 24, 2021 at 6:33 AM Jan Lukavský wrote:
+dev

I hit a very similar issue even with a standard module (math). No matter where I
put the import statement (even one line before its use), the module cannot
be found, which causes

NameError: name 'math' is not defined

This sounds like it was imported in the __main__ module, but
save_main_session was not used.
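
As a rough illustration of the failure mode described above, here is a stdlib-only simulation. It is not Beam's actual pickling code: `ship_to_worker` is a hypothetical helper that rebuilds a function with an empty global namespace, roughly like a worker that received the function's code but not the main session.

```python
import builtins
import math  # module-level import, as it would appear in the __main__ script
import types


def area_global(r):
    # Relies on the module-level `math` name being present in globals.
    return math.pi * r * r


def area_local(r):
    # Imports inside the function body, so the name is resolved at call time.
    import math
    return math.pi * r * r


def ship_to_worker(fn):
    """Hypothetical stand-in for shipping a function without its main session.

    The function keeps its code object, but its globals contain only builtins.
    """
    fresh_globals = {"__builtins__": builtins.__dict__}
    return types.FunctionType(fn.__code__, fresh_globals, fn.__name__)


if __name__ == "__main__":
    try:
        ship_to_worker(area_global)(1.0)
    except NameError as err:
        print("global import failed on the 'worker':", err)
    print("local import works:", ship_to_worker(area_local)(1.0))
```

In this sketch the version that imports inside the function keeps working, while the version that depends on the main module's globals raises NameError once those globals are gone.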

I therefore think that --setup_file works fine, but there is a more general
problem (or a misunderstanding on my side) with importing modules. Can this be
runner-dependent? I use FlinkRunner and submit jobs with
--flink_submit_uber_jar; could that be the problem?

  Jan

On 9/23/21 3:12 PM, Jan Lukavský wrote:

Oops, sorry, the illustration of the three files is wrong. It was meant to be

src/

  | ---- script.py

  | ---- service_pb2.py

  | ---- service_pb2_grpc.py

The three files are in the same directory.

On 9/23/21 3:08 PM, Jan Lukavský wrote:

Hi,

I'm facing issues importing the dependencies of my Python Pipeline. I intend to use
gRPC to communicate with a remote RPC service, hence I have the following project
structure:

script.py

     |---- service_pb2.py

     |---- service_pb2_grpc.py

I created setup.py with something like

from setuptools import setup

setup(name='...',
      version='1.0',
      description='...',
      py_modules=['service_pb2', 'service_pb2_grpc'])


That seems to work: it packages the dependencies, for example via 'python3
setup.py sdist'. I pass this file to the Pipeline using --setup_file, but I
have no luck using the module. The script starts executing, but it fails once I
try to open a channel in DoFn.setup:

   def setup(self):
     self.channel = grpc.insecure_channel(self.address)
     self.stub = service_pb2_grpc.RpcServiceStub(self.channel)

with the exception ModuleNotFoundError: No module named 'service_pb2_grpc'.
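
One way to narrow down an error like this is to check which modules the worker's interpreter can actually locate. The sketch below uses only the standard library; `missing_modules` is a hypothetical helper name, and calling it from DoFn.setup and logging the result is an assumption about how it might be used, not something prescribed by Beam.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised for dotted names with a missing parent package, or for
            # already-loaded modules that carry no spec.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Module names taken from the setup.py above.
    print(missing_modules(["service_pb2", "service_pb2_grpc"]))
```

If both names are reported as missing on the worker but not locally, the package built from setup.py most likely never reached the worker's environment.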

Am I doing something obviously wrong?

  Jan
