Javier Domingo Cansino created BEAM-6149:
--------------------------------------------

             Summary: Resolving gcp extra dependencies breaks ssl support when 
contacting google
                 Key: BEAM-6149
                 URL: https://issues.apache.org/jira/browse/BEAM-6149
             Project: Beam
          Issue Type: Bug
          Components: dependencies, runner-dataflow, sdk-py-core
    Affects Versions: 2.8.0, 2.7.0
            Reporter: Javier Domingo Cansino
            Assignee: Tyler Akidau


It sounds paradoxical, but when apache-beam is fully resolved with the gcp 
extra dependencies, httplib2 is forced to a version that cannot call Google 
services, and any pipeline using Google services (I haven't checked others) 
fails.

I have traced the problem back to its cause; my findings follow.

A quick way to reproduce this is to install the dependencies with pipenv, 
which makes sure sub-dependencies are resolved. Run 
{{pipenv install apache-beam[gcp]}}, and then 
{{python -c 'from google.cloud import bigquery;client=bigquery.Client(); list(client.list_projects())'}}. 
The error is the same when running a pipeline, but I kept the example simple.

It will throw an error like this one:

{code}
/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/auth/_default.py:66: UserWarning: Your application has authenticated using end user credentials from Google Cloud SDK. We recommend that most server applications use service accounts instead. If your application continues to use end user credentials from Cloud SDK, you might receive a "quota exceeded" or "API not enabled" error. For more information about service accounts, see https://cloud.google.com/docs/authentication/
  warnings.warn(_CLOUD_SDK_CREDENTIALS_WARNING)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/cloud/iterator.py", line 218, in _items_iter
    for page in self._page_iter(increment=False):
  File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/cloud/iterator.py", line 247, in _page_iter
    page = self._next_page()
  File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/cloud/iterator.py", line 347, in _next_page
    response = self._get_next_page_response()
  File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/cloud/iterator.py", line 396, in _get_next_page_response
    query_params=params)
  File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/cloud/_http.py", line 299, in api_request
    headers=headers, target_object=_target_object)
  File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/cloud/_http.py", line 193, in _make_request
    return self._do_request(method, url, headers, data, target_object)
  File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/cloud/_http.py", line 223, in _do_request
    body=data)
  File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google_auth_httplib2.py", line 187, in request
    self._request, method, uri, request_headers)
  File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/auth/credentials.py", line 122, in before_request
    self.refresh(request)
  File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/oauth2/credentials.py", line 136, in refresh
    self._client_secret))
  File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/oauth2/_client.py", line 237, in refresh_grant
    response_data = _token_endpoint_request(request, token_uri, body)
  File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google/oauth2/_client.py", line 106, in _token_endpoint_request
    method='POST', url=token_uri, headers=headers, body=body)
  File "/home/javier/.local/share/virtualenvs/bqssltest-obub2LuN/lib/python2.7/site-packages/google_auth_httplib2.py", line 119, in __call__
    raise exceptions.TransportError(exc)
google.auth.exceptions.TransportError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:726)
{code}

I think this problem hasn't been reported before because people ignore pip's 
output, which clearly states that there are some dependency issues:

{code:text}
(bqssltest) javier@ffukn897:~/projects/spinoffs/bqssltest$ pip install 'apache-beam[gcp]==2.7.0'
...
google-gax 0.15.16 has requirement future<0.17dev,>=0.16.0, but you'll have future 0.17.1 which is incompatible.
gapic-google-cloud-pubsub-v1 0.15.4 has requirement oauth2client<4.0dev,>=2.0.0, but you'll have oauth2client 4.1.3 which is incompatible.
googledatastore 7.0.1 has requirement httplib2<0.10,>=0.9.1, but you'll have httplib2 0.11.3 which is incompatible.
googledatastore 7.0.1 has requirement oauth2client<4.0.0,>=2.0.1, but you'll have oauth2client 4.1.3 which is incompatible.
proto-google-cloud-pubsub-v1 0.15.4 has requirement oauth2client<4.0dev,>=2.0.0, but you'll have oauth2client 4.1.3 which is incompatible.
proto-google-cloud-datastore-v1 0.90.4 has requirement oauth2client<4.0dev,>=2.0.0, but you'll have oauth2client 4.1.3 which is incompatible.
...
{code}
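
Since these warnings are easy to miss in a long install log, here is a minimal 
sketch of how they could be pulled out automatically. It is not part of pip or 
Beam; the names {{CONFLICT_RE}}, {{find_conflicts}} and {{SAMPLE}} are made up 
for illustration, and the pattern only assumes the message wording shown in 
the output above.

```python
import re

# Assumed message shape, taken from the pip output quoted in this issue:
# "<pkg> <ver> has requirement <spec>, but you'll have <dep> <ver> which is incompatible."
CONFLICT_RE = re.compile(
    r"(?P<package>\S+) (?P<package_version>\S+) has requirement "
    r"(?P<requirement>\S+), but you'll have "
    r"(?P<installed>\S+) (?P<installed_version>\S+) which is incompatible\."
)


def find_conflicts(pip_output):
    """Return one dict per dependency conflict reported in pip's output."""
    # Collapse all whitespace so warnings wrapped across lines still match.
    text = " ".join(pip_output.split())
    return [match.groupdict() for match in CONFLICT_RE.finditer(text)]


# Two of the warnings from the log above, including terminal line wrapping.
SAMPLE = """
googledatastore 7.0.1 has requirement httplib2<0.10,>=0.9.1, but you'll have
httplib2 0.11.3 which is incompatible.
google-gax 0.15.16 has requirement future<0.17dev,>=0.16.0, but you'll have
future 0.17.1 which is incompatible.
"""

for conflict in find_conflicts(SAMPLE):
    print("{package} {package_version} wants {requirement}, "
          "found {installed} {installed_version}".format(**conflict))
```

A check like this could fail a build whenever the returned list is not empty, 
instead of relying on someone reading the log.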

These warnings are caused by the version pinning in the GCP requirements. 
Specifically, {{googledatastore==7.0.1}} has a direct requirement of 
{{httplib2 [required: >=0.9.1,<0.10, installed: 0.9.2]}}. apache-beam also 
pins httplib2 directly, but that pin doesn't cause the problem because it asks 
for {{<=0.11.3}}.
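
To make the interaction of the two constraints concrete, here is a rough 
sketch that compares the two httplib2 versions mentioned in this report 
against them. The helpers {{parse_version}} and {{in_range}} are hypothetical 
and deliberately naive: they only handle purely numeric dotted versions and 
are not how pip or pipenv evaluate specifiers.

```python
def parse_version(version):
    # Naive parser: only handles purely numeric dotted versions like "0.9.2".
    return tuple(int(part) for part in version.split("."))


def in_range(version, lower, upper):
    # True when lower <= version < upper, mirroring a ">=lower,<upper" pin.
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


for candidate in ("0.9.2", "0.11.3"):
    # googledatastore 7.0.1 requires httplib2>=0.9.1,<0.10
    datastore_ok = in_range(candidate, "0.9.1", "0.10")
    # apache-beam's own upper bound, as described above, is <=0.11.3
    beam_ok = parse_version(candidate) <= parse_version("0.11.3")
    print(candidate, "googledatastore:", datastore_ok, "apache-beam:", beam_ok)
# 0.9.2 googledatastore: True apache-beam: True
# 0.11.3 googledatastore: False apache-beam: True
```

Only 0.9.2 passes both checks, which is consistent with a full resolution 
ending up on the older httplib2.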

I have no idea why googledatastore is pinned to that version. Someone seems to 
be aware of the problem with datastore, as {{googledatastore==7.0.2}} was 
released with just that constraint removed.

The only thing missing here is to upgrade this line to use {{7.0.2}}:

https://github.com/apache/beam/blob/master/sdks/python/setup.py#L143

Can anyone do it and release a minor version? From previous experience, I know 
a PR from a long-standing contributor gets merged much faster than one from 
someone random on the internet.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
