Yes, you'll need to bundle up these dependencies in a way that they
can be shipped to the workers. See
https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/
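
As a rough sketch of what that page describes for multi-file pipelines, one
option is to put a small setup.py next to main_file.py and utils.py and pass it
with --setup_file when creating the template. The helper names below
(write_setup_py, template_command) and the package name are made up for
illustration, and the command assumes the script is launched as
"python main_file.py"; treat it as a starting point rather than the one right
way to do it.

```python
import sys
from pathlib import Path

# Contents of a minimal setup.py. The package name is hypothetical;
# py_modules lists the top-level modules to ship to the workers.
SETUP_PY = '''\
import setuptools

setuptools.setup(
    name="my_beam_workflow",
    version="0.0.1",
    py_modules=["utils"],
)
'''


def write_setup_py(directory):
    """Write the sketch setup.py into `directory` and return its path."""
    path = Path(directory) / "setup.py"
    path.write_text(SETUP_PY)
    return str(path)


def template_command(setup_file="./setup.py"):
    """Return the template-creation command from the question, plus --setup_file."""
    return [
        sys.executable,
        "main_file.py",
        "--runner=dataflow",
        "--project=myproject",
        "--template_location=gs://mybucket/my_template",
        "--temp_location=gs://mybucket/temp",
        "--staging_location=gs://mybucket/staging",
        "--setup_file=" + setup_file,  # tells Beam to package and stage utils.py
    ]
```

The list returned by template_command() can be handed to subprocess.run() from
the directory that holds both files, or the same flags can simply be typed on
the command line.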

On Thu, Jan 16, 2020 at 2:00 PM Marco Mistroni <[email protected]> wrote:
>
> Hello all
>  I have written an Apache Beam workflow which I have split across two files:
> - main_file.py contains the pipeline
> - utils.py (which contains a few functions used in the pipeline)
>
> I have created a template for this using the command below
>
> python -m main_file.py --runner=dataflow --project=myproject 
> --template_location=gs://mybucket/my_template 
> --temp_location=gs://mybucket/temp --staging_location=gs://mybucket/staging
>
> and I have attempted to create a job using this template.
> However, when I kick off the job I get exceptions such as
>
>
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", line 261, in loads
>     return dill.loads(s)
>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 317, in loads
>     return load(file, ignore)
>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 305, in load
>     obj = pik.load()
>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 474, in find_class
>     return StockUnpickler.find_class(self, module, name)
> ImportError: No module named 'utils'
>
> I am guessing I am missing some steps in packaging the application, or
> perhaps some extra options to specify dependencies?
> I would not imagine writing a whole workflow in one file, so this looks like
> a standard use case?
>
> kind regards
>
>
>
>
