[ https://issues.apache.org/jira/browse/BEAM-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kenneth Knowles updated BEAM-1790:
----------------------------------
    Description: 
I am running with {{--requirements_file requirements.txt}}, which contains:
{noformat}
google-cloud-datastore
{noformat}

Unfortunately, when I try to run this on Cloud Dataflow, I get the following 
error while the requirements are being built:
{noformat}
Collecting setuptools (from protobuf>=3.0.0->google-cloud-core<0.24dev,>=0.23.1->google-cloud-datastore->-r requirements.txt (line 3))
  File was already downloaded /var/folders/94/wngs1jw91_n2_jjjrfljtqrc0000gn/T/dataflow-requirements-cache/setuptools-34.3.2.zip
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "setuptools/__init__.py", line 12, in <module>
        import setuptools.version
      File "setuptools/version.py", line 1, in <module>
        import pkg_resources
      File "pkg_resources/__init__.py", line 70, in <module>
        import packaging.version
    ImportError: No module named packaging.version
{noformat}

Looking online (https://github.com/pypa/setuptools/issues/937), it appears 
this is due to "pip asking setuptools to build itself (from source dist), which 
is no longer supported."
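The traceback ends with an ImportError for packaging.version, so one quick check is whether that module can be found by the interpreter that pip runs under. The following is a minimal diagnostic sketch, not part of Beam or pip; the helper name can_import is made up for illustration, and it only reports on the interpreter it is run with.

```python
import importlib.util


def can_import(module_name):
    """Return True if `module_name` can be located by the current interpreter."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example `packaging`) is itself missing.
        return False


if __name__ == "__main__":
    # The two modules that appear in the traceback above.
    for name in ("pkg_resources", "packaging.version"):
        print(name, "importable:", can_import(name))
```

If packaging.version is reported as not importable, that matches the failure in the traceback: the setuptools source distribution imports it before anything has installed it.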

I'm not sure what the correct fix is here, since protobuf depends on 
setuptools and a lot of Google libraries depend on protobuf. There seems to be 
no way to list protobuf/setuptools as being "provided" by the Beam runtime (see 
https://github.com/pypa/pip/issues/3090).
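The "Collecting ..." line in the pip output above already spells out why setuptools is being fetched: each "->" names the package that pulled in the one before it. As a rough sketch, assuming the log line has exactly the shape shown in this report (other pip versions may format it differently), the chain can be extracted like this. The function name dependency_chain is hypothetical.

```python
import re

# Matches lines such as: "Collecting NAME (from A->B->-r requirements.txt (line 3))"
_COLLECTING = re.compile(r"^Collecting (?P<name>\S+) \(from (?P<chain>.+)\)$")


def dependency_chain(line):
    """Return [package, parent, grandparent, ...] for a pip 'Collecting' line.

    Returns None when the line does not have the expected shape.
    """
    match = _COLLECTING.match(line.strip())
    if match is None:
        return None
    # pip lists the chain from the immediate parent outward, separated by "->".
    links = match.group("chain").split("->")
    # Drop the trailing "-r requirements.txt (line N)" entry, which names the
    # requirements file rather than a package.
    parents = [link for link in links if not link.startswith("-r ")]
    return [match.group("name")] + parents


if __name__ == "__main__":
    log_line = (
        "Collecting setuptools (from protobuf>=3.0.0->"
        "google-cloud-core<0.24dev,>=0.23.1->google-cloud-datastore->"
        "-r requirements.txt (line 3))"
    )
    for requirement in dependency_chain(log_line):
        print(requirement)
```

For the line in this report, the chain reads setuptools, then protobuf, then google-cloud-core, then google-cloud-datastore, which is the single entry in requirements.txt.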

I'm going to try using my own setup.py next and see if I can skirt around the 
issue, but this looks like a bug in Beam's requirements packager, which seems 
to ask for more than it needs.
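For reference, a setup.py for this purpose could look roughly like the sketch below. The package name and version are placeholders, and whether this route actually avoids the setuptools build failure is not established in this report. The argv guard is only there so the file does nothing when executed without a command.

```python
# setup.py: hypothetical package metadata; name and version are placeholders.
import sys

SETUP_KWARGS = dict(
    name="my-beam-pipeline",
    version="0.0.1",
    # The same single dependency that requirements.txt listed.
    install_requires=["google-cloud-datastore"],
)


def main():
    # Imported lazily so the metadata above can be inspected without setuptools.
    import setuptools

    setuptools.setup(**SETUP_KWARGS)


# setup() needs a command such as "sdist"; do nothing when run with no arguments.
if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```

The pipeline would then be launched with {{--setup_file ./setup.py}} in place of {{--requirements_file requirements.txt}}.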

In the case of GCE, I compile my dependencies into a Docker image that extends 
the base GCE images (which lets me use binary installs). I'm not sure whether 
something like that would work here.

  was:
I am running with {{--requirements_file requirements.txt}}, which contains:
{noformat}
google-cloud-datastore
{noformat}

Unfortunately, when attempting to run this on the cloud dataflow, I get the 
following error trying to build the requirements:
{noformat}
Collecting setuptools (from 
protobuf>=3.0.0->google-cloud-core<0.24dev,>=0.23.1->google-cloud-datastore->-r 
requirements.txt (line 3))
  File was already downloaded 
/var/folders/94/wngs1jw91_n2_jjjrfljtqrc0000gn/T/dataflow-requirements-cache/setuptools-34.3.2.zip
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "setuptools/__init__.py", line 12, in <module>
        import setuptools.version
      File "setuptools/version.py", line 1, in <module>
        import pkg_resources
      File "pkg_resources/__init__.py", line 70, in <module>
        import packaging.version
    ImportError: No module named packaging.version
{noformat}

Looking online https://github.com/pypa/setuptools/issues/937 , it appears this 
is due to "pip asking setuptools to build itself (from source dist), which is 
no longer supported."

I'm not sure what the correct fix is here...since protobuf depends on 
setuptools, and a lot of Google libraries depend on protobuf. Seems there is no 
way to whitelist protobuf/setuptools as being "provided" by the beam runtime 
(ie https://github.com/pypa/pip/issues/3090).

I'm going to try using my own setup.py next and see if I can skirt around the 
issue, but this definitely seems like a bug with beam's requirements packager 
asking for too much?

In the case of GCE, I compile my dependencies into a docker image that extends 
the base GCE images (and lets me use binary installs), not sure something like 
that would work here?


> Failure to build --requirements.txt when it uses google protobuf
> ----------------------------------------------------------------
>
>                 Key: BEAM-1790
>                 URL: https://issues.apache.org/jira/browse/BEAM-1790
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>            Reporter: Mike Lambert
>            Priority: P3
>              Labels: build, requirements



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
