HTTP Proxy URL-encoding woes

Eric Siegerman Mon, 12 Aug 2002 20:44:36 -0700

OK, so the HTTP proxy's URL-encoding of recorded requests is
biting me now.


Why does this happen?  It seems to me that the proxy should pass
along exactly what it got, unless there's a good reason to do
otherwise.  That invites the question:  is there a good reason
for URL-encoding the data?  What purpose is served by modifying
it, in however supposedly-benign a way?

On Fri, Aug 09, 2002 at 10:25:30AM -0400, Mike Stover wrote (over on jmeter-dev):
> It seems that what is needed is a new type of proxy server that generates SOAP 
> Samplers, instead of normal HTTP Samplers.  Not that your communications are 
> using SOAP necessarily, but the form is the same (XML through HTTP).

I wasn't following the quoted thread closely, but this suggestion
seems to be a response to someone else's URL-encoding problems.
If so, it wouldn't help me.  My problem is not with XML-formatted
data, but with bog-standard HTML forms.  My browser encodes a
particular form as "multipart/form-data", but JMeter URL-encodes
the whole request body (all of its MIME parts crunched into one
huge long line) and zaps the encoding to
"application/x-www-form-urlencoded".  Not surprisingly, the app
server hasn't a clue what to do with this.
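
As a rough illustration of the body munging described above (a
sketch only, not JMeter's code; the boundary string, field names,
and the build_multipart helper are made up for the example), this
is what happens when a multipart body is URL-encoded wholesale:

```python
from urllib.parse import quote_plus

BOUNDARY = "XyZ123"  # hypothetical boundary; a browser would generate its own


def build_multipart(fields, boundary):
    """Build a minimal multipart/form-data body with CRLF line endings."""
    lines = []
    for name, value in fields.items():
        lines.append("--" + boundary)
        lines.append('Content-Disposition: form-data; name="%s"' % name)
        lines.append("")  # blank line separates part headers from part data
        lines.append(value)
    lines.append("--" + boundary + "--")  # closing delimiter
    lines.append("")
    return "\r\n".join(lines)


# What the browser sends: several lines, delimited by CRLF.
body = build_multipart({"user": "eric", "note": "a b"}, BOUNDARY)

# What a wholesale URL-encode does to it: every CRLF becomes %0D%0A,
# so all the MIME parts collapse into one long line.
munged = quote_plus(body)

print(body)
print(munged)
```

Once the line breaks are gone, a server that parses the body as
"multipart/form-data" can no longer find the part headers, and one
that parses it as "application/x-www-form-urlencoded" sees no
name=value pairs at all.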

Of course, a special case could be added for
"multipart/form-data" forms too (perhaps not as visibly as a
separate subclass).  But better than multiplying kludges, it
seems to me, would be to just make the existing proxy
transparent.

Here are the ways in which the proxy incorrectly modifies the
data (perhaps only a partial list; this is based only on the form
submission that's causing me grief):
  - form body munging as described above
  - the response headers are changed to be terminated by a bare
    LF; they're supposed to end in CRLF, and they do before the
    proxy gets hold of them [1]
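
The header-terminator problem can be checked mechanically. The
following is a minimal sketch, assuming the raw header bytes have
been captured on both sides of the proxy; the function name and
the sample response are hypothetical:

```python
import re


def bare_line_endings(raw_headers: bytes):
    """Return offsets of CR or LF bytes that are not part of a CRLF pair."""
    pattern = rb"\r(?!\n)|(?<!\r)\n"  # CR not followed by LF, or LF not preceded by CR
    return [m.start() for m in re.finditer(pattern, raw_headers)]


# A well-formed header block, as the origin server might send it.
original = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n"

# The same block with every CRLF reduced to a bare LF.
proxied = original.replace(b"\r\n", b"\n")

print(bare_line_endings(original))
print(bare_line_endings(proxied))
```

An empty list means every terminator is a proper CRLF; any offset
in the list marks a bare CR or LF of the kind [1] forbids.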

There are other modifications, but they're all correct -- either
necessary, or else unnecessary but innocuous (e.g. reordering
headers).  If a list of these would be useful, I can post one.

I'm willing to take a quick stab at this (but haven't time for
more, so I can't promise anything).  But before I do, I'd like to
know that I'm not going off on the wrong track.  So, is there a
good reason for the current behaviour, or is it a bug /
implementation artifact?

Thanks much.


[1] "[...] a bare CR or LF MUST NOT be substituted for CRLF
    within any of the HTTP control structures (such as header
    fields and multipart boundaries)."
        - RFC 2068, p. 25, emphasis in original


P.S. Sorry if this sounds a bit snarky.  Normally I'd hold onto
it overnight and revise it in the morning.  But deadline pressure
kind of prevents that :-(

--

|  | /\
|-_|/  >   Eric Siegerman, Toronto, Ont.        [EMAIL PROTECTED]
|  |  /
Anyone who swims with the current will reach the big music steamship;
whoever swims against the current will perhaps reach the source.
        - Paul Schneider-Esleben
