> On Nov. 20, 2014, 9:11 p.m., Ben Mahler wrote: > > 3rdparty/libprocess/include/process/http.hpp, lines 415-416 > > <https://reviews.apache.org/r/28253/diff/1/?file=770208#file770208line415> > > > > Here's what we'll have if we added join: > > > > parse: takes a __decoded__ query string, construct a key,value map of > > the parameters > > > > join: takes a key,value map of parameters, creates an __encoded__ query > > string. > > > > That's where I find this to be non-intuitive: > > > > ``` > > join(parse(query)) == encode(query) > > ``` > > > > It seems like only request encoding should deal with constructing an > > encoded query string, is that not the case? Would like to avoid exposing > > this if possible, and if not possible, would like to make this more > > intuitive. > > Cody Maloney wrote: > So that behavior of parse is going to produce incorrect results I > believe. Suppose one of my parameters in my query string is a small blob of > JSON (Contains '=' and '&' potentially). If we encode/decode on the outside > of query string parsing/joining, we can't represent this (The & will look > like the start of a new parameter). > > ``` > std::map<> query = {{"a%20","b&c=d!@#foobar"}} > > EXPECT_EQ(query, parse(join(query))); > ``` > > Based on my reading of rfc3986 2.1 (snippet inline): > ``` > A percent-encoding mechanism is used to represent a data octet in a > component when that octet's corresponding character is outside the > allowed set or is being used as a delimiter of, or within, the > component. > ``` > > This should work, and is the reason why there is percent encoding. From > what I can tell grepping throuhg the source code nothing currently does the > decoding before we parse the query string, so making the decode live inside > of the parse function should work. Currently callers just end up with percent > encoded chunks in their query string, which causes badness (And could explain > an issue which a customer was having with files/browse.json and things not > showing up) > > Ben Mahler wrote: > Hm.. so I did the grep and as expected the callers parse decoded strings. > > The original use of parse is in Decoder: > > ``` > static int on_message_complete(http_parser* p) > { > DataDecoder* decoder = (DataDecoder*) p->data; > > // Parse the query key/values. > Try<std::string> decoded = http::decode(decoder->query); > if (decoded.isError()) { > return 1; > } > decoder->request->query = http::query::parse(decoded.get()); > ... > ``` > > Note that here we parse a decoded string, as expected. The other use > appears to be within the http code in the master that parses request > __bodies__: > > ``` > Future<Response> Master::Http::shutdown(const Request& request) > { > if (request.method != "POST") { > return BadRequest("Expecting POST"); > } > > // Parse the query string in the request body (since this is a POST) > // in order to determine the framework ID to shutdown. > hashmap<string, string> values = > process::http::query::parse(request.body); > ``` > > Bodies are not expected to be percent encoded, IIUC. > > Anything I missed...? > > Cody Maloney wrote: > So we do, not sure how I missed that... > > So I can fix up the decoding to be inside the query string. Simple extra > patch, then add some round trip tests. Sound good? > > The body one is interesting in that it looks like we're doing it > incorrectly: > > http://stackoverflow.com/questions/4007969/application-x-www-form-urlencoded-or-multipart-form-data/4073451#4073451 > > We're assuming application/x-www-form-urlencoded at the moment which we > should decode. None of the requests happen to be from non-libprocess things > or contain illegal characters so it happens to work. > > *random side thoughts*: We aren't a very good HTTP/webserver / parsing > into request/response objects, processing Content-Type, doing simple clean > routing. The more I dig into this code the more I think it would be nice if > there was just an nginx, apache, proxygen / etc. which are good at doing the > HTTP parts and we just take their parsing + methods, getting us out of the > game of writing an HTTP server, then just have the bits to bridge / convert > their request objects to the appropriate libprocess dispatches. > > Ben Mahler wrote: > > So I can fix up the decoding to be inside the query string. > > Do you mean decoding to be inside query::parse? Sounds bad because it > cannot assume the string is always encoded and decoding is not idempotent > IIUC. > > If you _really_ need this, let's just provide the map -> string > functionality, and leave the encoding responsibility to the Request encoder. > As much as possible, the Encoder / Decoder code should be the only code > responsible for encoding and decoding, but maybe I'm missing some context on > this. > > Cody Maloney wrote: > I'm saying the string is always encoded according to the specification at > all our callsites. Otherwise it isn't a HTTP query string. Decoding a query > string which has no encoded parameters in it doesn't make any change to the > query string. > > The decoding has to happen after the query string is parsed apart to > function correctly (The encoding exists so that otherwise illegal characters > can be placed there). `parse(decode(query))` will not produce the right set > of query parameters given query parameters/values which contain '&', '='. The > encoding and parsing are fundamentally entertwined. It's part of transforming > the wire format to native datastructures. > > I can do this by either making parse() decode as it parses or making a > function which you call that takes the output from a query::parse() and > decodes the key/value pairs inside of it. Either way works fine for me. I > think it is important to note here every current callsite of parse should be > doing a decode(parse()) if the decode function needs to wrap the parse. Parse > of a decoded query string is not a functional operation, unless you are > accepting only a subset of what should be valid query strings. > > Ben Mahler wrote: > Great observations! How about: > > ``` > namespace query { > > // Decodes the query string into the set of query parameters. > inline hashmap<std::string, std::string> http::query::decode(const > std::string& query); > > // Returns an encoded version of the query parameters. > inline std::string encode(const hashmap<std::string, std::string>& query); > > } > ``` > > Then we can fix the issue in Decoder via just using http::query::decode. > This patch can then introduce query::encode to be used later in an > http::Request Encoder. http::get is a bit of a mess, would be great to just > take a Request there! (Separate patches please :)) > > Sound good?
Sounds good to me. I'll work on the patches soon. - Cody ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28253/#review62422 ----------------------------------------------------------- On Nov. 20, 2014, 12:50 a.m., Cody Maloney wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/28253/ > ----------------------------------------------------------- > > (Updated Nov. 20, 2014, 12:50 a.m.) > > > Review request for mesos. > > > Bugs: MESOS-2132 > https://issues.apache.org/jira/browse/MESOS-2132 > > > Repository: mesos-git > > > Description > ------- > > This is needed to turn Request objects into http request which we send across > the wire. The existing HTTP request sending code expects the query string to > be already constructed. > > Ideally people can either just forward the query string they get to send it > back across the wire or they can manipulate it using the native C++ datatype > (A hashmap), and then have it automatically re-encoded. > > The parsing logic to go from a string to a map is already exists in this > header, this just adds the other direction. > > > Diffs > ----- > > 3rdparty/libprocess/include/process/http.hpp > 9cf05acbb724ab9af8010d1788621d37a0e48e86 > 3rdparty/libprocess/src/http.cpp b00f33339366f5c06b6f20e38c5ae0c23b8a9358 > 3rdparty/libprocess/src/tests/http_tests.cpp > a90e65f77904da0a45e1cc0cc9889ae69354a1a5 > > Diff: https://reviews.apache.org/r/28253/diff/ > > > Testing > ------- > > make distcheck ubuntu 14.04 > > > Thanks, > > Cody Maloney > >
