Hi Eila,

You can find a list of dependencies installed in Dataflow workers in [1].
Dataflow workers will have a set of dependencies that will satisfy the
requirements from setup.py.

Which bigquery library are you using? There is
a google-cloud-bigquery==0.25.0 dependency; I am not sure where 0.23.0
is coming from.
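
One way to narrow this down is to print the installed distribution versions
in both environments and compare them. The snippet below is only a sketch:
it assumes Python 3.8+ (for importlib.metadata), the helper name
installed_version is made up for illustration, and the list of
distributions to check is my guess at what is relevant here.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names are assumptions; adjust to whatever your pipeline imports.
for name in ("google-cloud-bigquery", "google-cloud-core", "apache-beam"):
    print(name, installed_version(name))
```

Running the same helper locally and from inside a step of the pipeline
should show whether the two environments really disagree.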

Workers do not pick up libraries from the client environment as part of the
job submission. I am not sure how the datalab UI integration works; however,
you have a few options for installing any set of dependencies on the workers.
Using requirements.txt is one of those options.
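
As a rough sketch of the requirements.txt route: pin the version you have
locally (0.28.0, per your message) and pass the file through the
--requirements_file pipeline option. The helper names render_requirements
and dataflow_args are hypothetical, and how you hand the resulting
arguments to your pipeline from datalab is something I am assuming you can
adapt.

```python
def render_requirements(pins):
    """Build requirements.txt content from a {distribution: version} mapping."""
    lines = ["{}=={}".format(name, version) for name, version in sorted(pins.items())]
    return "\n".join(lines) + "\n"


def dataflow_args(requirements_path):
    """Pipeline arguments asking Dataflow workers to install the pinned packages."""
    return [
        "--runner=DataflowRunner",
        "--requirements_file={}".format(requirements_path),
    ]


# The pinned version comes from the local environment described in this thread.
content = render_requirements({"google-cloud-bigquery": "0.28.0"})
print(content, end="")
print(dataflow_args("requirements.txt"))
```

You would write the rendered content to requirements.txt next to your
pipeline code and add the returned arguments to the ones you already pass
when constructing the pipeline options.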

Ahmet

[1]
https://cloud.google.com/dataflow/docs/concepts/sdk-worker-dependencies#version-250_1

On Thu, Jul 12, 2018 at 8:51 AM, OrielResearch Eila Arich-Landkof <
[email protected]> wrote:

> Hi all,
>
> I am running a Python pipeline with the google.cloud.bigquery library.
> On the local runner, everything runs great and
> bigquery.__version__ is 0.28.0.
>
> On the Dataflow runner, bigquery.__version__ is 0.23.0,
> and there are many API changes between these versions.
>
> What is the best way to change the installed version on the workers?
> I was assuming that the workers have all of the master machine's libraries
> installed when the execution is done from datalab. Is that true?
> I am not generating any requirements.txt; the execution is done through
> the run button in the datalab UI.
>
>
> Please help me solve this issue.
> Thanks,
> --
> Eila
> www.orielresearch.org
> https://www.meetup.com/Deep-Learning-In-Production/
>
>
>
