Yes, you'll need to bundle these dependencies so they can be shipped to the workers. See https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/
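For a local module like utils.py, the usual route from that page is to turn the pipeline into a small package: put a setup.py next to main_file.py and pass --setup_file when building the template, so the workers install utils before unpickling your DoFns. A minimal sketch (the package name and version are just placeholders):

    # setup.py, in the same directory as main_file.py and utils.py
    import setuptools

    setuptools.setup(
        name='my_pipeline',              # placeholder name
        version='0.0.1',
        py_modules=['utils'],            # ships the top-level utils.py module
        packages=setuptools.find_packages(),
    )

Then rebuild the template with the extra option, e.g.:

    python main_file.py --runner=DataflowRunner --project=myproject \
        --template_location=gs://mybucket/my_template \
        --temp_location=gs://mybucket/temp \
        --staging_location=gs://mybucket/staging \
        --setup_file ./setup.py

If you'd rather not add a setup.py, you can instead build utils into a tarball and pass it with --extra_package, but --setup_file is the most common approach for multi-file pipelines.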
On Thu, Jan 16, 2020 at 2:00 PM Marco Mistroni <[email protected]> wrote:
>
> Hello all,
> I have written an Apache Beam workflow which I have split across two files:
> - main_file.py, which contains the pipeline
> - utils.py, which contains a few functions used in the pipeline
>
> I have created a template for this using the command below:
>
> python -m main_file.py --runner=dataflow --project=myproject
>   --template_location=gs://mybucket/my_template
>   --temp_location=gs://mybucket/temp --staging_location=gs://mybucket/staging
>
> and I have attempted to create a job using this template.
> However, when I kick off the job I am getting exceptions such as:
>
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", line 261, in loads
>     return dill.loads(s)
>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 317, in loads
>     return load(file, ignore)
>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 305, in load
>     obj = pik.load()
>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 474, in find_class
>     return StockUnpickler.find_class(self, module, name)
> ImportError: No module named 'utils'
>
> I am guessing I am missing some steps in packaging the application, or
> perhaps some extra options to specify dependencies?
> I would not imagine writing a whole workflow in one file, so this looks like
> a standard use case?
>
> Kind regards
