> On Nov. 20, 2014, 9:11 p.m., Ben Mahler wrote:
> > 3rdparty/libprocess/include/process/http.hpp, lines 415-416
> > <https://reviews.apache.org/r/28253/diff/1/?file=770208#file770208line415>
> >
> > Here's what we'll have if we added join:
> >
> > parse: takes a __decoded__ query string, construct a key,value map of
> > the parameters
> >
> > join: takes a key,value map of parameters, creates an __encoded__ query
> > string.
> >
> > That's where I find this to be non-intuitive:
> >
> > ```
> > join(parse(query)) == encode(query)
> > ```
> >
> > It seems like only request encoding should deal with constructing an
> > encoded query string, is that not the case? Would like to avoid exposing
> > this if possible, and if not possible, would like to make this more
> > intuitive.
>
> Cody Maloney wrote:
> So that behavior of parse is going to produce incorrect results I
> believe. Suppose one of my parameters in my query string is a small blob of
> JSON (Contains '=' and '&' potentially). If we encode/decode on the outside
> of query string parsing/joining, we can't represent this (The & will look
> like the start of a new parameter).
>
> ```
> std::map<> query = {{"a%20","b&c=d!@#foobar"}}
>
> EXPECT_EQ(query, parse(join(query)));
> ```
>
> Based on my reading of rfc3986 2.1 (snippet inline):
> ```
> A percent-encoding mechanism is used to represent a data octet in a
> component when that octet's corresponding character is outside the
> allowed set or is being used as a delimiter of, or within, the
> component.
> ```
>
> This should work, and is the reason why there is percent encoding. From
> what I can tell grepping throuhg the source code nothing currently does the
> decoding before we parse the query string, so making the decode live inside
> of the parse function should work. Currently callers just end up with percent
> encoded chunks in their query string, which causes badness (And could explain
> an issue which a customer was having with files/browse.json and things not
> showing up)
>
> Ben Mahler wrote:
> Hm.. so I did the grep and as expected the callers parse decoded strings.
>
> The original use of parse is in Decoder:
>
> ```
> static int on_message_complete(http_parser* p)
> {
> DataDecoder* decoder = (DataDecoder*) p->data;
>
> // Parse the query key/values.
> Try<std::string> decoded = http::decode(decoder->query);
> if (decoded.isError()) {
> return 1;
> }
> decoder->request->query = http::query::parse(decoded.get());
> ...
> ```
>
> Note that here we parse a decoded string, as expected. The other use
> appears to be within the http code in the master that parses request
> __bodies__:
>
> ```
> Future<Response> Master::Http::shutdown(const Request& request)
> {
> if (request.method != "POST") {
> return BadRequest("Expecting POST");
> }
>
> // Parse the query string in the request body (since this is a POST)
> // in order to determine the framework ID to shutdown.
> hashmap<string, string> values =
> process::http::query::parse(request.body);
> ```
>
> Bodies are not expected to be percent encoded, IIUC.
>
> Anything I missed...?
>
> Cody Maloney wrote:
> So we do, not sure how I missed that...
>
> So I can fix up the decoding to be inside the query string. Simple extra
> patch, then add some round trip tests. Sound good?
>
> The body one is interesting in that it looks like we're doing it
> incorrectly:
>
> http://stackoverflow.com/questions/4007969/application-x-www-form-urlencoded-or-multipart-form-data/4073451#4073451
>
> We're assuming application/x-www-form-urlencoded at the moment which we
> should decode. None of the requests happen to be from non-libprocess things
> or contain illegal characters so it happens to work.
>
> *random side thoughts*: We aren't a very good HTTP/webserver / parsing
> into request/response objects, processing Content-Type, doing simple clean
> routing. The more I dig into this code the more I think it would be nice if
> there was just an nginx, apache, proxygen / etc. which are good at doing the
> HTTP parts and we just take their parsing + methods, getting us out of the
> game of writing an HTTP server, then just have the bits to bridge / convert
> their request objects to the appropriate libprocess dispatches.
>
> Ben Mahler wrote:
> > So I can fix up the decoding to be inside the query string.
>
> Do you mean decoding to be inside query::parse? Sounds bad because it
> cannot assume the string is always encoded and decoding is not idempotent
> IIUC.
>
> If you _really_ need this, let's just provide the map -> string
> functionality, and leave the encoding responsibility to the Request encoder.
> As much as possible, the Encoder / Decoder code should be the only code
> responsible for encoding and decoding, but maybe I'm missing some context on
> this.
>
> Cody Maloney wrote:
> I'm saying the string is always encoded according to the specification at
> all our callsites. Otherwise it isn't a HTTP query string. Decoding a query
> string which has no encoded parameters in it doesn't make any change to the
> query string.
>
> The decoding has to happen after the query string is parsed apart to
> function correctly (The encoding exists so that otherwise illegal characters
> can be placed there). `parse(decode(query))` will not produce the right set
> of query parameters given query parameters/values which contain '&', '='. The
> encoding and parsing are fundamentally entertwined. It's part of transforming
> the wire format to native datastructures.
>
> I can do this by either making parse() decode as it parses or making a
> function which you call that takes the output from a query::parse() and
> decodes the key/value pairs inside of it. Either way works fine for me. I
> think it is important to note here every current callsite of parse should be
> doing a decode(parse()) if the decode function needs to wrap the parse. Parse
> of a decoded query string is not a functional operation, unless you are
> accepting only a subset of what should be valid query strings.
Great observations! How about:
```
namespace query {
// Decodes the query string into the set of query parameters.
inline hashmap<std::string, std::string> http::query::decode(const std::string&
query);
// Returns an encoded version of the query parameters.
inline std::string encode(const hashmap<std::string, std::string>& query);
}
```
Then we can fix the issue in Decoder via just using http::query::decode. This
patch can then introduce query::encode to be used later in an http::Request
Encoder. http::get is a bit of a mess, would be great to just take a Request
there! (Separate patches please :))
Sound good?
- Ben
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28253/#review62422
-----------------------------------------------------------
On Nov. 20, 2014, 12:50 a.m., Cody Maloney wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28253/
> -----------------------------------------------------------
>
> (Updated Nov. 20, 2014, 12:50 a.m.)
>
>
> Review request for mesos.
>
>
> Bugs: MESOS-2132
> https://issues.apache.org/jira/browse/MESOS-2132
>
>
> Repository: mesos-git
>
>
> Description
> -------
>
> This is needed to turn Request objects into http request which we send across
> the wire. The existing HTTP request sending code expects the query string to
> be already constructed.
>
> Ideally people can either just forward the query string they get to send it
> back across the wire or they can manipulate it using the native C++ datatype
> (A hashmap), and then have it automatically re-encoded.
>
> The parsing logic to go from a string to a map is already exists in this
> header, this just adds the other direction.
>
>
> Diffs
> -----
>
> 3rdparty/libprocess/include/process/http.hpp
> 9cf05acbb724ab9af8010d1788621d37a0e48e86
> 3rdparty/libprocess/src/http.cpp b00f33339366f5c06b6f20e38c5ae0c23b8a9358
> 3rdparty/libprocess/src/tests/http_tests.cpp
> a90e65f77904da0a45e1cc0cc9889ae69354a1a5
>
> Diff: https://reviews.apache.org/r/28253/diff/
>
>
> Testing
> -------
>
> make distcheck ubuntu 14.04
>
>
> Thanks,
>
> Cody Maloney
>
>