RunOrVeith opened a new issue #1591:
URL: https://github.com/apache/libcloud/issues/1591


   ## Summary
   
   When you create a Google driver with OAuth2 credentials, there is code that determines which auth type is being used. This check tries to reach the Google Compute Engine metadata server to see whether some kind of GCE login is in use. With OAuth credentials this check always fails, but if retries are enabled it nonetheless retries with exponential back-off until the timeout is reached, and only then concludes that OAuth is being used.
   
   ## Detailed Information
   
   The following snippet reproduces the issue on libcloud 3.3.1 (Ubuntu 18.04, 
Python 3.8)
   
   ```python
   import os
   from libcloud.storage.providers import Provider, get_driver
   
   
   def demonstrate_slow_oauth_login(client_id: str, secret: str):
       os.environ["LIBCLOUD_RETRY_FAILED_HTTP_REQUESTS"] = "1"
       driver_type = get_driver(provider=Provider.GOOGLE_STORAGE)
       driver = driver_type(key=client_id, secret=secret)
   ```
   What now happens is this:
   
   1. Up in the call chain of `__init__`, when the driver is created, `BaseDriver.__init__` creates a connection; in this case it is a `GoogleStorageConnection`.
   2. The `GoogleStorageConnection.__init__` calls this:
   ```python
       @classmethod
       def guess_type(cls, user_id):
           if cls._is_sa(user_id):
               return cls.SA
           elif cls._is_gcs_s3(user_id):
               return cls.GCS_S3
           elif cls._is_gce():  # <-- This slows everything down
               return cls.GCE
           else:
               return cls.IA   # <-- we just want to go here with OAuth
   ```
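   To see why OAuth credentials always pay for the probe, here is a minimal, self-contained sketch of the dispatch above. The predicate bodies are simplified assumptions standing in for libcloud's `_is_sa`, `_is_gcs_s3` and `_is_gce`, not the real implementations:
   ```python
# Simplified stand-ins for libcloud's predicates (assumptions, not real code).
probe_calls = []

def _is_sa(user_id):
    # Service-account emails end in ".gserviceaccount.com" (simplified).
    return user_id.endswith(".gserviceaccount.com")

def _is_gcs_s3(user_id):
    # GCS interoperability (HMAC) access ids start with "GOOG" (simplified).
    return user_id.startswith("GOOG")

def _is_gce():
    # Stand-in for the metadata-server probe; off GCE the real check blocks
    # until the retry timeout expires before returning False.
    probe_calls.append(1)
    return False

def guess_type(user_id):
    if _is_sa(user_id):
        return "SA"
    elif _is_gcs_s3(user_id):
        return "GCS_S3"
    elif _is_gce():          # every OAuth client id reaches this branch
        return "GCE"
    else:
        return "IA"          # ...and only then falls through to OAuth

print(guess_type("12345.apps.googleusercontent.com"))  # -> IA
print(len(probe_calls))                                # -> 1
   ```
   The slow probe runs even though the credentials could never be a GCE login.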
   3. The check for `_is_gce` works as follows:
   ```python
   def _get_gce_metadata(path=''):
       try:
           url = 'http://metadata/computeMetadata/v1/' + path.lstrip('/')
           headers = {'Metadata-Flavor': 'Google'}
           response = get_response_object(url, headers=headers)
           return response.status, '', response.body
       except Exception as e:
           return -1, str(e), None
   ```
    This ends up making a request with retries (inside `Connection.request`):
   ```python
   retry_request = retry(timeout=self.timeout,
                         retry_delay=self.retry_delay,
                         backoff=self.backoff)
   retry_request(self.connection.request)(method=method,
                                          url=url,
                                          body=data,
                                          headers=headers,
                                          stream=stream)
   ```
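   One way a fix could look (a hedged sketch, not libcloud's actual API): probe the metadata server with a single request and a short hard timeout, bypassing the retry wrapper entirely. `probe_gce_metadata` and its `host` parameter are hypothetical names introduced here for illustration:
   ```python
import urllib.request

def probe_gce_metadata(path="", host="metadata", timeout=1.0):
    # Single attempt with a hard timeout; no retry wrapper involved.
    url = "http://" + host + "/computeMetadata/v1/" + path.lstrip("/")
    req = urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, "", resp.read()
    except Exception as e:  # DNS failure, refused connection, timeout, ...
        return -1, str(e), None

# Off GCE this fails within roughly `timeout` seconds instead of spinning
# through an exponential back-off loop (localhost:1 refuses immediately):
status, err, body = probe_gce_metadata(host="localhost:1")
print(status)  # -> -1
   ```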
   
   4. Calling this when not running on GCE always results in an error:
   ```
   HTTPConnectionPool(host='metadata', port=80): Max retries exceeded with url: 
/computeMetadata/v1/ (Caused by 
NewConnectionError('<urllib3.connection.HTTPConnection object at 
0x7fe268057760>: Failed to establish a new connection: [Errno -2] Name or 
service not known'))
   ```
   which then gets retried until the timeout is reached.
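   The cost of that loop is easy to see in a simplified simulation of exponential back-off (an illustration, not libcloud's actual `retry` implementation; a fake clock replaces real sleeping):
   ```python
def retry_until_timeout(func, timeout, retry_delay, backoff, clock):
    # `clock` is a single-element list acting as a fake monotonic clock;
    # "sleeping" advances it instead of blocking.
    start = clock[0]
    delay = retry_delay
    attempts = 0
    while True:
        attempts += 1
        try:
            return func(), attempts
        except Exception:
            if clock[0] - start + delay > timeout:
                return None, attempts  # budget exhausted, give up
            clock[0] += delay          # simulated sleep
            delay *= backoff           # exponential back-off

def always_fails():
    # Off GCE, the metadata request can never succeed.
    raise ConnectionError("Name or service not known")

clock = [0.0]
result, attempts = retry_until_timeout(always_fails, timeout=30,
                                       retry_delay=1, backoff=2, clock=clock)
print(result, attempts, clock[0])  # -> None 5 15.0
   ```
   With a 30-second budget, a request that can never succeed is attempted five times before the wrapper finally gives up.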
   
   5. The problem is actually even worse: `GoogleStorageDriver.__init__` runs this code twice. Once as above, and once more when it creates its own `GoogleStorageJSONConnection`, which suffers from the same problem. The result is waiting for two exponential back-off retry loops to expire, each 30 seconds by default.
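   Back-of-envelope, assuming the default 30-second retry budget per connection:
   ```python
DEFAULT_RETRY_TIMEOUT = 30  # seconds each, per the default mentioned above
CONNECTIONS = 2             # GoogleStorageConnection + GoogleStorageJSONConnection
total_stall = DEFAULT_RETRY_TIMEOUT * CONNECTIONS
print(total_stall)  # -> 60 seconds spent in __init__ before any API call
   ```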
   
   ### Workaround
   
   The workaround is to enable retries only after the driver has already been created (creation then takes <1 second), but this is neither obvious nor intuitive.
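   A small helper (hypothetical, stdlib-only) makes the workaround explicit: run driver construction with the retry flag unset, then restore it for subsequent API calls:
   ```python
import os

RETRY_FLAG = "LIBCLOUD_RETRY_FAILED_HTTP_REQUESTS"

def with_retries_disabled_during(create):
    # Unset the retry flag while `create` runs (e.g. driver construction),
    # then re-enable retries for the actual API calls.
    os.environ.pop(RETRY_FLAG, None)
    try:
        return create()
    finally:
        os.environ[RETRY_FLAG] = "1"

# Usage (driver_type, client_id and secret as in the snippet above):
# driver = with_retries_disabled_during(
#     lambda: driver_type(key=client_id, secret=secret))
   ```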
   

