Re: [racket-users] Missing request-post-data/raw (from web-server/http)
Thanks, Jay.

It is definitely a POST, and there is a Content-Length header, so it seems like the problem is indeed #3. I was expecting the raw data to be there even if it had been parsed: I believe the POST data of #"corpus=austen=corpus.CorpusMetadata" was also parsed into bindings (though not from multipart, obviously).

So it sounds like what I'll need to do is detect when this situation is happening (I guess that would be when the method is POST, request-post-data/raw is #f, and there are some bindings) and convert the bindings back into multipart form data to give to http-sendrecv/url.

-Philip

On Fri, Jun 30, 2017 at 8:20 AM, Jay McCarthy wrote:
> [...]
> 3) POSTs with multipart form data are converted into a
> request-bindings and the raw data is not made available, un-parsed.
> [...]

--
You received this message because you are subscribed to the Google Groups "Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to racket-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
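The plan Philip describes (detect the situation, then turn the bindings back into multipart form data) might look roughly like the sketch below. It is not code from the thread. The names `needs-rebuild?`, `binding->part`, and `bindings->multipart` are hypothetical; the sketch assumes the accessors exported by web-server/http, does no escaping of quotes in field names or filenames, and produces a body that is equivalent to, but not byte-identical with, what the browser sent.

```racket
#lang racket/base
;; Sketch only: rebuild a multipart/form-data body from parsed bindings.
(require web-server/http)

(define CRLF #"\r\n")

;; needs-rebuild? : request? -> boolean?
;; Philip's heuristic: a POST whose raw data is #f but which has bindings.
;; Caveat: a body-less POST with only query-string bindings also matches.
(define (needs-rebuild? req)
  (and (equal? (request-method req) #"POST")
       (not (request-post-data/raw req))
       (pair? (request-bindings/raw req))))

;; content-disposition? : header? -> boolean?
(define (content-disposition? h)
  (regexp-match? #rx#"^(?i:content-disposition)$" (header-field h)))

;; binding->part : binding? -> bytes?
;; Serializes one binding as the headers and content of one part.
(define (binding->part b)
  (define disposition
    (bytes-append #"Content-Disposition: form-data; name=\""
                  (binding-id b) #"\""))
  (cond
    [(binding:form? b)
     (bytes-append disposition CRLF CRLF (binding:form-value b))]
    [(binding:file? b)
     (bytes-append
      disposition #"; filename=\"" (binding:file-filename b) #"\"" CRLF
      ;; Re-emit the part's other headers (e.g. Content-Type); the
      ;; Content-Disposition line is regenerated above, so skip any copy.
      (apply bytes-append
             (for/list ([h (in-list (binding:file-headers b))]
                        #:unless (content-disposition? h))
               (bytes-append (header-field h) #": " (header-value h) CRLF)))
      CRLF
      (binding:file-content b))]
    [else (error 'binding->part "unsupported binding: ~e" b)]))

;; bindings->multipart : (listof binding?) bytes? -> bytes?
;; `boundary` is given without the leading "--".
(define (bindings->multipart bindings boundary)
  (bytes-append
   (apply bytes-append
          (for/list ([b (in-list bindings)])
            (bytes-append #"--" boundary CRLF (binding->part b) CRLF)))
   #"--" boundary #"--" CRLF))
```

If something like this is used, the forwarded Content-Type header has to name the same boundary that is passed to `bindings->multipart`, and any forwarded Content-Length has to be recomputed from the rebuilt body.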
Re: [racket-users] Missing request-post-data/raw (from web-server/http)
Hi Philip,

I don't necessarily know the answer, and it's possible that this is an error. I'll explain what the server is doing, and maybe that will help us move forward.

1) request-bindings/raw is just an abstraction over request-post-data/raw (and the URI).
2) request-post-data/raw is always #f for GETs. Are you sure these are POSTs?
3) POSTs with multipart form data are converted into request-bindings, and the raw, un-parsed data is not made available.
4) If there's no Content-Length header, the data is not exposed, even if there is data.

I think that your problem may be (3). It sounds like you expect to see a copy of the raw data of the request all the time, even if it has been parsed. (The logic of the current behavior is that at the "application" level there is no POST data, only form data, but because of "transport"-level constraints on the length of URIs it had to be sent in the data part of the transport layer.)

Jay

On Thu, Jun 29, 2017 at 9:08 PM, Philip McGrath wrote:
> [...]
> Why doesn't this content show up in request-post-data/raw? Is there a way to
> access the raw, original data for these requests, or do I need to somehow
> reconstruct it from the bindings?
> [...]

--
-=[ Jay McCarthy http://jeapostrophe.github.io ]=-
-=[ Associate Professor PLT @ CS @ UMass Lowell ]=-
-=[ Moses 1:33: And worlds without number have I created; ]=-
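As a rough illustration of Jay's four points, the sketch below inspects a request and guesses why request-post-data/raw is #f. The function name `why-no-post-data` and the result symbols are made up for this example; only the web-server/http accessors are real, and the order of the checks is an assumption rather than a description of what the server does internally.

```racket
#lang racket/base
;; Sketch only: guess why request-post-data/raw is #f for a request.
(require net/url
         racket/promise
         web-server/http)

;; why-no-post-data : request? -> symbol?
(define (why-no-post-data req)
  (define headers (request-headers/raw req))
  (define content-type (headers-assq* #"Content-Type" headers))
  (cond
    [(request-post-data/raw req) 'raw-data-present]
    [(equal? (request-method req) #"GET") 'get-request]          ; point 2
    [(not (headers-assq* #"Content-Length" headers))
     'no-content-length]                                          ; point 4
    [(and content-type
          (regexp-match? #rx#"^(?i:multipart/form-data)"
                         (header-value content-type)))
     'multipart-parsed-into-bindings]                             ; point 3
    [else 'unknown]))

;; A hand-built request resembling the one in the thread: a multipart
;; POST whose raw data field is #f but whose bindings are populated.
(define example
  (make-request #"POST"
                (string->url "http://localhost/upload")
                (list (make-header #"Content-Length" #"512")
                      (make-header #"Content-Type"
                                   #"multipart/form-data; boundary=XYZ"))
                (delay (list (make-binding:form #"tool" #"corpus.CorpusCreator")))
                #f
                "127.0.0.1" 8080 "127.0.0.1"))

(why-no-post-data example)
```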
Re: [racket-users] Missing request-post-data/raw (from web-server/http)
Thanks for your comments. The only legal files to upload in this case are plain text, so I'm not too worried about size. I'm relying on the web-server libraries to deal with any malicious attempts to send overwhelmingly large files (if that's a bad idea, I'd definitely appreciate hearing it!).

Other parts of the application are implemented in #lang web-server, including some access control logic surrounding the requests that are proxied to the external service.

With other requests, the post-data/raw field of the request struct has been #f only when the method field is #"GET". With POST requests, it has otherwise (and I thought it always would) contained the raw POST data, e.g. #"corpus=austen=corpus.CorpusMetadata". I thought the bindings from the bindings/raw-promise field were simply an abstraction over post-data/raw (and/or the query part of the uri field), which is why I'm confused that this POST request has bindings but has #f for its post-data/raw.

-Philip

On Thu, Jun 29, 2017 at 9:44 PM, Neil Van Dyke wrote:
> [...]
> Does your application permit a large file upload (an uploaded DVD-ROM
> ".iso" file, like for a Linux distro install disc 1, is typically a few
> gigabytes, and video files can also get huge), and is your program
> (including libraries it uses) going to try to allocate gigabytes at a time
> just for one HTTP request?
> [...]
Re: [racket-users] Missing request-post-data/raw (from web-server/http)
I don't know the answer to your particular questions with `web-server` (I've made my own implementations of this in the past), and these comments might not apply to your particular application, but I'll mention them here for whoever is interested...

It sounds like you're using this, which might preempt your question:

    post-data/raw : (or/c false/c bytes?)

Does your application permit a large file upload (an uploaded DVD-ROM ".iso" file, like for a Linux distro install disc 1, is typically a few gigabytes, and video files can also get huge), and is your program (including libraries it uses) going to try to allocate gigabytes at a time just for one HTTP request?

If the `POST` data is potentially huge, you might want to think about doing stream reading of it (i.e., not sucking it all into memory before you do something with it), and sending blocks out your proxy approximately as soon as they come in (without buffering too much). That can make your program more robust, lower its latency, and maybe even improve overall speed.

Or, if you want to keep getting a convenient byte string out of the MIME parser, and you plan to reject huge `POST` data before it accidentally or intentionally DoS's your server, that rejection will probably have to happen either as the HTTP request is being read, or in the MIME multipart parser (when the request is in MIME multipart, which `POST` isn't always, and if the HTTP code hands off a pretty raw input port to the multipart parsing code, which it should). This is because you can't assume that HTTP or part headers will tell you the content size before you read the content: sometimes you have to read to find the EOF or the MIME boundary string kludge.

I think streaming algorithms are usually the way to go for potentially huge data. (Well, until you then get into what I'll call "poetic license" situations, in which you know how to do it in streaming, and you know why you don't have to stream in this case.)
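The streaming-with-a-cap idea above can be sketched with plain Racket ports. This is a generic illustration, not something `web-server` provides: the name `copy-with-limit` and its `#:chunk-size` keyword are hypothetical, and whether an application can get at the request body as an input port at all depends on how the request is being read.

```racket
#lang racket/base
;; Sketch only: copy a body from one port to another in small chunks,
;; refusing to pass along more than `limit` bytes in total.

;; copy-with-limit : input-port? output-port? exact-nonnegative-integer?
;;                   -> exact-nonnegative-integer?
;; Returns the number of bytes copied, or raises exn:fail once the data
;; read so far would exceed `limit`.
(define (copy-with-limit in out limit #:chunk-size [chunk-size 4096])
  (define buf (make-bytes chunk-size))
  (let loop ([total 0])
    ;; read-bytes-avail! returns as soon as some bytes are available,
    ;; so each block can be forwarded without waiting for a full buffer.
    (define n (read-bytes-avail! buf in))
    (cond
      [(eof-object? n) total]
      [(> (+ total n) limit)
       (error 'copy-with-limit "body exceeds ~a bytes" limit)]
      [else
       (write-bytes buf out 0 n)
       (loop (+ total n))])))
```

Only one chunk is held in memory at a time, so the cap is enforced without first allocating a byte string for the whole body.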
[racket-users] Missing request-post-data/raw (from web-server/http)
I'm working on a Racket web application for which I need to proxy certain requests to a non-Racket service over HTTP. I've built a very basic proxy on top of http-sendrecv/url that works quite well for the most part.

For POST requests, I pass the request-post-data/raw of the original request as the #:data argument of http-sendrecv/url.

However, I've discovered that certain POST requests (specifically involving file uploads) are not working as expected. On these requests, Chrome reports that it is performing a request with a header Content-Type:multipart/form-data; boundary=WebKitFormBoundaryAJOgATwBujJhhtbY and a payload as follows:

--WebKitFormBoundaryAJOgATwBujJhhtbY
Content-Disposition: form-data; name="tool"

corpus.CorpusCreator
--WebKitFormBoundaryAJOgATwBujJhhtbY
Content-Disposition: form-data; name="palette"

default
--WebKitFormBoundaryAJOgATwBujJhhtbY
Content-Disposition: form-data; name="textarea-1014-inputEl"

Type in one or more URLs on separate lines or paste in a full text.
--WebKitFormBoundaryAJOgATwBujJhhtbY
Content-Disposition: form-data; name="upload"; filename="tmp-file.txt"
Content-Type: text/plain

--WebKitFormBoundaryAJOgATwBujJhhtbY--

However, at the Racket level, request-post-data/raw returns #f for these requests, and, adding to my confusion, the bindings still show up in request-bindings/raw.

Why doesn't this content show up in request-post-data/raw? Is there a way to access the raw, original data for these requests, or do I need to somehow reconstruct it from the bindings?

Thanks very much,
Philip
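For readers who want to picture the proxy described above, a minimal sketch of forwarding a request with http-sendrecv/url follows. It is a guess at the shape of such a proxy, not Philip's code: `forwardable-headers` and `forward-request` are hypothetical names, the set of headers dropped is an assumption, and nothing here handles the multipart case discussed in the replies (where request-post-data/raw is #f).

```racket
#lang racket/base
;; Sketch only: forward an incoming request to another HTTP service.
(require net/url
         web-server/http)

;; forwardable-headers : request? -> (listof bytes?)
;; Renders the request's headers as "Field: value" lines, dropping ones
;; that should be recomputed for the outgoing connection.
(define (forwardable-headers req)
  (for/list ([h (in-list (request-headers/raw req))]
             #:unless (regexp-match?
                       #rx#"^(?i:host|content-length|connection)$"
                       (header-field h)))
    (bytes-append (header-field h) #": " (header-value h))))

;; forward-request : request? url? -> (values bytes? (listof bytes?) input-port?)
;; Sends the same method, headers, and raw POST data to `target` and
;; returns the status line, response headers, and response body port.
(define (forward-request req target)
  (http-sendrecv/url target
                     #:method (request-method req)
                     #:headers (forwardable-headers req)
                     #:data (request-post-data/raw req)))
```

With this shape, a multipart upload reaches `forward-request` with #f as its data, which matches the symptom reported in this message.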