Hi Michael,

Thanks for this well-researched report.

It looks like this is a bug. Can you start the PR with a test to
demonstrate the issue first as a failing test, then work on the fix.

You're right in that libcloud has to support a lot of scenarios. The
library moved to requests (from an old Python 2 urllib module) a few years
ago, and there are lots of edge cases.

Regards,
Anthony Shaw

On Tue, Sep 1, 2020 at 6:32 AM Michael Spagon <mspa...@infinite.io> wrote:

> Hi,
>
> We are seeing HTTP 400 errors in our OpenStack server logs: ERROR WSGI:
> code 400, Bad request syntax ('0'). They appear when sending a GET
> request.
>
> I traced this bug down and it points to a single-line of code.  Removing
> the line fixes the problem, BUT I am emailing to ask if there are larger
> ramifications of doing so.
>
> Does anyone know the history of this line or why it exists?
>
> *The Problem*
>
> HTTP 400 errors appear in our OpenStack server logs when performing a GET
> request: ERROR WSGI: code 400, Bad request syntax ('0').
> [image: Screen Shot 2020-08-28 at 9.12.32 AM.png]
> *Figure 1.0 Wireshark shows HTTP 400 after GET request.*
>
>
> *The Line of Code*
>
> The code is in libcoud/http.py and relates to how requests is being used
> to prepare requests and potentially affects every HTTP request libcloud
> makes.
>
> A PreparedRequest is being constructed and then its body is being
> overwritten. The reassignment seems redundant to me, as Requests does the
> job of translating it.
>
>
> https://github.com/apache/libcloud/blob/72de440a508fdb4ea546d4e586a452472f1cbd5d/libcloud/http.py#L244
>
> *#libcloud/http.py
> #LibcloudConnection class
> *def prepared_request(self, method, url, body=None,
>                      headers=None, raw=False, stream=False):
>     headers = self._normalize_headers(headers=headers)
>
>     req = requests.Request(method, ''.join([self.host, url]),
>                            data=body, headers=headers)
>
>     prepped = self.session.prepare_request(req)
> *    prepped.body = body  <<<<<< This line.
> *
>     self.response = self.session.send(
>         prepped,
>         stream=stream,
>         verify=self.ca_cert if self.ca_cert is not None else self.verify)
>
>
> *Why I Think It’s the Problem*
>
> When a request is prepared, if its body is an empty string '', the
> preparation step ignores the body and sets the resulting PreparedRequest body
> to None. Subsequently setting the body back to an empty string '' breaks
> requests logic when sending the request. Requests now thinks the request
> should be chunked because body='' breaks this logic:
>
> https://github.com/psf/requests/blob/master/requests/adapters.py#L420
>
> chunked = not (request.body is None or 'Content-Length' in request.headers)
>
> A raw HTTP chunked request is sent with no body. As a result only a 0 is
> transmitted to signify the last chunk. This coincides with when our server
> receives the HTTP 400 error.
>
> https://github.com/psf/requests/blob/master/requests/adapters.py#L469
>
> *#requests/adapters.py
> *for i in request.body:
>     low_conn.send(hex(len(i))[2:].encode('utf-8'))
>     low_conn.send(b'\r\n')
>     low_conn.send(i)
>     low_conn.send(b'\r\n')
> low_conn.send(b'0\r\n\r\n')
>
>
> *Replicating the Behavior*
>
> You can use this snippet to reproduce the same behavior outside the
> context of Libcloud.
>
> import requests
>
> session = requests.Session()
>
> req = requests.Request('GET', 'http://www.google.com', data='')
> *# prepare_request() will ignore the empty string and set prepped.body to 
> None.
> *prepped = session.prepare_request(req)
> *# Overwriting None with '' will break requests logic for sending requests.
> *prepped.body = ''
> *# Chunked request is sent with a zero length body. Only terminating 0 is 
> sent.
> *response = session.send(prepped, stream=False, verify=False)
>
>
> *Conclusion*
>
> Is there a reason for the body being overwritten that I am overlooking? I
> am aware that libcloud's use cases vary greatly and I may be missing
> something. Note that this is not a problem for every provider. Some
> connection classes set the default value for data as None instead of '' ,
> and None does not trigger an issue.
>
> *#libcloud/common/openstack.py OpenStackBaseConnection
> *def request(self, action, params=None, *data=''*, headers=None, ...)
> *#libcloud/common/base.py Connection
> *def request(self, action, params=None, *data=None*, headers=None, ...)
>
> I was able to run the tox test cases and everything was fine.  I am in
> the process of opening a formal issue and/or PR.
>
> ~~~~~~~
> Thank for taking time out of your day.
>
> Cheers,
>
> Michael Spagon
> @ InfiniteIO
>

Reply via email to