Take a look at https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/
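
For the Datalab case, the requirements-file route described on that page could be sketched roughly as below. This is only a sketch under assumptions: the file name requirements.txt and the helper names write_requirements, build_flags and make_options are made up for illustration, and the project and bucket values are the placeholders from your snippet. The --requirements_file flag itself is the pipeline option that page documents.

```python
from pathlib import Path


def write_requirements(path="requirements.txt", packages=("pandas-gbq",)):
    """Write one PyPI package name per line (hypothetical helper)."""
    Path(path).write_text("\n".join(packages) + "\n")
    return path


def build_flags(requirements_path):
    """Express the options from the original snippet as command-line style flags."""
    return [
        "--runner=DataflowRunner",
        "--project=my-project-id",  # placeholder from the original snippet
        "--job_name=myjob",
        "--staging_location=gs://your-bucket-name-here/staging",
        "--temp_location=gs://your-bucket-name-here/temp",
        # Asks the runner to install the listed packages on the workers.
        "--requirements_file=" + requirements_path,
    ]


def make_options(flags):
    """Build PipelineOptions from the flags (requires apache_beam to be installed)."""
    # Imported lazily so the two helpers above can be used without Beam.
    from apache_beam.options.pipeline_options import PipelineOptions
    return PipelineOptions(flags)
```

From a notebook cell you would then call something like options = make_options(build_flags(write_requirements())) and pass options to beam.Pipeline. If you prefer to keep the attribute style from your snippet, setting options.view_as(SetupOptions).requirements_file = 'requirements.txt' should be equivalent.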
On Tue, Jul 3, 2018 at 2:09 PM OrielResearch Eila Arich-Landkof <[email protected]> wrote:

> Hello all,
>
> I am using Python code to run my pipeline, similar to the following:
>
> options = PipelineOptions()
> google_cloud_options = options.view_as(GoogleCloudOptions)
> google_cloud_options.project = 'my-project-id'
> google_cloud_options.job_name = 'myjob'
> google_cloud_options.staging_location = 'gs://your-bucket-name-here/staging'
> google_cloud_options.temp_location = 'gs://your-bucket-name-here/temp'
> options.view_as(StandardOptions).runner = 'DataflowRunner'
>
> I would like to add installation of the *pandas-gbq* package to my workers.
> What would be the recommended way to do so? Can I add it to the
> PipelineOptions()?
> I remember that there are a few options. One of them was creating a
> requirements text file, but I cannot remember where I saw it or whether it
> is the simplest way when running the pipeline from Datalab.
>
> Thank you for any reference!
>
> --
> Eila
> www.orielresearch.org
> https://www.meetup.com/Deep-Learning-In-Production/
