[
https://issues.apache.org/jira/browse/BEAM-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kenneth Knowles updated BEAM-1790:
----------------------------------
Description:
I am running with {{--requirements_file requirements.txt}}, which contains:
{noformat}
google-cloud-datastore
{noformat}
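For context, a minimal sketch of how such a pipeline invocation might look (the runner flag is the standard Beam option; the project id and GCS path below are placeholders, not values from this report):

```python
# Hypothetical launch arguments: only --requirements_file is the flag
# under discussion; everything else here is a placeholder.
pipeline_args = [
    "--runner=DataflowRunner",
    "--project=my-gcp-project",            # placeholder project id
    "--temp_location=gs://my-bucket/tmp",  # placeholder GCS path
    "--requirements_file=requirements.txt",
]

# The requirements file itself is a single line:
REQUIREMENTS = "google-cloud-datastore\n"
```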
Unfortunately, when attempting to run this on Cloud Dataflow, I get the
following error while building the requirements:
{noformat}
Collecting setuptools (from
protobuf>=3.0.0->google-cloud-core<0.24dev,>=0.23.1->google-cloud-datastore->-r
requirements.txt (line 3))
File was already downloaded
/var/folders/94/wngs1jw91_n2_jjjrfljtqrc0000gn/T/dataflow-requirements-cache/setuptools-34.3.2.zip
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "setuptools/__init__.py", line 12, in <module>
import setuptools.version
File "setuptools/version.py", line 1, in <module>
import pkg_resources
File "pkg_resources/__init__.py", line 70, in <module>
import packaging.version
ImportError: No module named packaging.version
{noformat}
Looking at https://github.com/pypa/setuptools/issues/937, it appears this
is due to "pip asking setuptools to build itself (from source dist), which is
no longer supported."
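If I understand the staging step correctly, Beam populates its requirements cache with source distributions, roughly like the sketch below (the exact flags Beam passes are my assumption, not taken from its source):

```python
import sys

def requirements_download_cmd(requirements_file, cache_dir):
    """Approximate the pip command Beam's stager runs to fill its
    requirements cache. Forcing source distributions (--no-binary :all:)
    is what makes pip try to build setuptools from its sdist, which
    newer setuptools versions refuse. Flag details are an assumption."""
    return [
        sys.executable, "-m", "pip", "download",
        "--dest", cache_dir,
        "-r", requirements_file,
        "--no-binary", ":all:",  # download sdists only, no wheels
    ]
```

Running the equivalent command by hand should reproduce the failure independently of Beam.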
I'm not sure what the correct fix is here, since protobuf depends on
setuptools, and a lot of Google libraries depend on protobuf. There seems to
be no way to list protobuf/setuptools as being "provided" by the Beam runtime
(i.e. https://github.com/pypa/pip/issues/3090).
I'm going to try using my own setup.py next to see if I can skirt around the
issue, but this definitely seems like a bug, with Beam's requirements packager
asking for too much.
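For reference, the setup.py route I plan to try would look roughly like this, passed to the pipeline via Beam's --setup_file flag instead of --requirements_file (the package name and version are placeholders; only the dependency is real):

```python
# Hypothetical minimal setup.py for use with --setup_file.
import setuptools

# Metadata kept in a dict so it can be inspected separately from setup().
SETUP_KWARGS = dict(
    name="my-dataflow-job",  # placeholder package name
    version="0.0.1",         # placeholder version
    install_requires=["google-cloud-datastore"],
    packages=setuptools.find_packages(),
)

if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:  # only run setup() when invoked with a command
        setuptools.setup(**SETUP_KWARGS)
```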
In the case of GCE, I compile my dependencies into a Docker image that extends
the base GCE images (which lets me use binary installs); I'm not sure whether
something like that would work here.
was:
I am running with {{--requirements_file requirements.txt}}, which contains:
{noformat}
google-cloud-datastore
{noformat}
Unfortunately, when attempting to run this on Cloud Dataflow, I get the
following error while building the requirements:
{noformat}
Collecting setuptools (from
protobuf>=3.0.0->google-cloud-core<0.24dev,>=0.23.1->google-cloud-datastore->-r
requirements.txt (line 3))
File was already downloaded
/var/folders/94/wngs1jw91_n2_jjjrfljtqrc0000gn/T/dataflow-requirements-cache/setuptools-34.3.2.zip
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "setuptools/__init__.py", line 12, in <module>
import setuptools.version
File "setuptools/version.py", line 1, in <module>
import pkg_resources
File "pkg_resources/__init__.py", line 70, in <module>
import packaging.version
ImportError: No module named packaging.version
{noformat}
Looking at https://github.com/pypa/setuptools/issues/937, it appears this
is due to "pip asking setuptools to build itself (from source dist), which is
no longer supported."
I'm not sure what the correct fix is here, since protobuf depends on
setuptools, and a lot of Google libraries depend on protobuf. There seems to
be no way to whitelist protobuf/setuptools as being "provided" by the Beam
runtime (i.e. https://github.com/pypa/pip/issues/3090).
I'm going to try using my own setup.py next to see if I can skirt around the
issue, but this definitely seems like a bug, with Beam's requirements packager
asking for too much.
In the case of GCE, I compile my dependencies into a Docker image that extends
the base GCE images (which lets me use binary installs); I'm not sure whether
something like that would work here.
> Failure to build --requirements.txt when it uses google protobuf
> ----------------------------------------------------------------
>
> Key: BEAM-1790
> URL: https://issues.apache.org/jira/browse/BEAM-1790
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Mike Lambert
> Priority: P3
> Labels: build, requirements
>
> I am running with {{--requirements_file requirements.txt}}, which contains:
> {noformat}
> google-cloud-datastore
> {noformat}
> Unfortunately, when attempting to run this on Cloud Dataflow, I get the
> following error while building the requirements:
> {noformat}
> Collecting setuptools (from
> protobuf>=3.0.0->google-cloud-core<0.24dev,>=0.23.1->google-cloud-datastore->-r
> requirements.txt (line 3))
> File was already downloaded
> /var/folders/94/wngs1jw91_n2_jjjrfljtqrc0000gn/T/dataflow-requirements-cache/setuptools-34.3.2.zip
> Complete output from command python setup.py egg_info:
> Traceback (most recent call last):
> File "<string>", line 1, in <module>
> File "setuptools/__init__.py", line 12, in <module>
> import setuptools.version
> File "setuptools/version.py", line 1, in <module>
> import pkg_resources
> File "pkg_resources/__init__.py", line 70, in <module>
> import packaging.version
> ImportError: No module named packaging.version
> {noformat}
> Looking at https://github.com/pypa/setuptools/issues/937, it appears this
> is due to "pip asking setuptools to build itself (from source dist), which
> is no longer supported."
> I'm not sure what the correct fix is here, since protobuf depends on
> setuptools, and a lot of Google libraries depend on protobuf. There seems to
> be no way to list protobuf/setuptools as being "provided" by the Beam
> runtime (i.e. https://github.com/pypa/pip/issues/3090).
> I'm going to try using my own setup.py next to see if I can skirt around the
> issue, but this definitely seems like a bug, with Beam's requirements
> packager asking for too much.
> In the case of GCE, I compile my dependencies into a Docker image that
> extends the base GCE images (which lets me use binary installs); I'm not
> sure whether something like that would work here.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)