-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chas,

On 2/8/2010 10:24 PM, c...@munat.com wrote:
>> I'm unaware of any uses of PUT that automatically parse the request body
>> on behalf of the user's code. Instead, the user's code is typically
>> expected to handle the entire request body. Apache Tomcat is not listed
>> on the Wikipedia article for "Representational State Transfer" as a
>> "Java implementation". Instead, there are other tools listed including
>> Apache's CXF. Perhaps this tool may be useful for you moving forward.
> 
> As I've mentioned in previous POSTs, Jetty seems to parse them. It may
> also leave them in the entity-body... I don't know, and I'm too swamped
> with work to find out.
> 
> People seem to be very confused by REST. A server isn't an implementation
> of REST. REST is an architecture.

I am familiar with REST and understand the implications. What you've run
into here is merely conflicting implementations of servlet containers.
It actually has nothing to do with REST except that, in your use case,
the use of PUT has landed you in this position.

>> I'm not interested in getting into a religious argument about this, but
>> I have a few thoughts about why Tomcat shouldn't do this by default:
>>
>> 1. Doing so consumes the input stream. Yes, Tomcat could buffer this
>>    and still make the data available as an InputStream to the servlet
>>    as well as request parameters, but this requires additional
>>    processing on Tomcat's behalf as well as memory and/or disk space
>>    to buffer that data.
> 
> This seems like an architectural issue specific to Tomcat. I can't speak
> to that as I know nothing about how Tomcat is implemented. If it's too
> difficult or requires a significant performance hit to do both, then it's
> probably not worth it.

It's not that it's an architectural issue with Tomcat, but that extra
code would have to be written to handle all this additional processing
(which is not required by the spec and may actually violate it: see
below). In a product where yelling matches arise over whether proper
synchronization should be applied, adding large amounts of buffering
(possibly to the disk) is certain to be rejected.

> 
>> 2. If Tomcat must consume the input stream, it means that Tomcat must
>>    read everything before the user's code can continue. This
>>    prohibits streamed processing of input data. If the input data is a
>>    multi-megabyte XML document intended for SAX processing, Tomcat's
>>    "interference" in this process is a complete waste of time and may
>>    result in unnecessary OOMEs or disk writes and reads.
>>    I would agree that a client sending XML data as www-form-urlencoded
>>    content type would be pretty stupid, though.
> 
> I agree on all counts.
> 
> 
>> 3. It violates the servlet specification. This may sound like a
>>    cop-out, but there is a good reason that specifications exist:
>>    so that users can experience expected behavior under certain
>>    circumstances. In this case, the specification does not /prohibit/
>>    the parsing of www-form-urlencoded data in a PUT request, but it also
>>    does not say that it /will/ be done. From my perspective, Jetty's
>>    handling of this situation is "surprising".
> 
> I'm not clear on how it violates the servlet spec. I agree that following
> specs is best practice and I'm not at all a fan of folks inventing their
> own ways of doing things

The servlet specification basically says that, except under certain
circumstances (POST + www-form-urlencoded + getParameter call), the
request body should be available for the servlet to read and no
parameters are read.

The whole spec section relevant here is SRV.3.1 "HTTP Protocol
Parameters" and SRV.3.1.1 "When Parameters are Available".

"
SRV.3.1 HTTP Protocol Parameters

Request parameters for the servlet are the strings sent by the client to
a servlet container as part of its request. When the request is an
HttpServletRequest object, and conditions set out in “When Parameters
Are Available” on page 26 are met, the container populates the
parameters from the URI query string and POST-ed data.
"

So, the servlet specification basically says that the sources of
"request parameters" are the query string and "POST-ed data". There is
no mention of other HTTP methods.

That section continues after a description of the API methods for
getting parameter values:

"
Data from the query string and the post body are aggregated into the
request parameter set. Query string data is presented before post body
data. For example, if a request is made with a query string of a=hello
and a post body of a=goodbye&a=world, the resulting parameter set would
be ordered a=(hello, goodbye, world).
"

Again, only the query string and POST are specifically mentioned.
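
The ordering rule in that example can be sketched in plain Java. This is
only an illustration of the aggregation the spec describes, not Tomcat's
code: the class ParameterAggregation and its methods are hypothetical
names, and it deliberately avoids the servlet API so it runs on its own.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: not part of the servlet API or of Tomcat.
class ParameterAggregation {

    // Appends each name=value pair of a form-encoded string to the map,
    // keeping values in the order they are encountered.
    static void addPairs(String encoded, Map<String, List<String>> params) {
        if (encoded == null || encoded.isEmpty()) {
            return;
        }
        for (String pair : encoded.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.computeIfAbsent(decode(name), k -> new ArrayList<>())
                  .add(decode(value));
        }
    }

    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Query string data first, then POST body data, as the spec text says.
    static Map<String, List<String>> aggregate(String queryString, String postBody) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        addPairs(queryString, params);
        addPairs(postBody, params);
        return params;
    }

    public static void main(String[] args) {
        // prints {a=[hello, goodbye, world]}
        System.out.println(aggregate("a=hello", "a=goodbye&a=world"));
    }
}
```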

Finally, section 3.1.1 lays down the requirements for when the POST data
will actually be parsed on the servlet's behalf:

"
SRV.3.1.1 When Parameters Are Available

The following are the conditions that must be met before post form data
will be populated to the parameter set:

1. The request is an HTTP or HTTPS request.
2. The HTTP method is POST.
3. The content type is application/x-www-form-urlencoded.
4. The servlet has made an initial call of any of the getParameter
   family of methods on the request object.

If the conditions are not met and the post form data is not included in
the parameter set, the post data must still be available to the servlet
via the request object’s input stream. If the conditions are met, post
form data will no longer be available for reading directly from the
request object’s input stream.
"

So, the spec basically says that the data either will or will not be
read and merged into the "request parameters" available via the
HttpServletRequest object's getParameter family of methods, not both.
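
The four conditions can be written as a single predicate. This is only a
sketch: the class and method names are hypothetical, none of it is
Tomcat code, and ignoring content-type parameters such as a charset is
my assumption rather than something the spec text above states.

```java
import java.util.Locale;

// Hypothetical helper: illustrates SRV.3.1.1, not any container's code.
class ParameterAvailability {

    // True only when all four conditions from SRV.3.1.1 hold, i.e. when
    // the container would populate parameters from the request body.
    static boolean containerParsesBody(boolean httpOrHttps, String method,
                                       String contentType,
                                       boolean getParameterCalled) {
        return httpOrHttps
                && "POST".equals(method)
                && isFormUrlEncoded(contentType)
                && getParameterCalled;
    }

    // Assumption: parameters after ';' (e.g. charset) are ignored.
    static boolean isFormUrlEncoded(String contentType) {
        if (contentType == null) {
            return false;
        }
        int semi = contentType.indexOf(';');
        String mediaType =
                (semi < 0 ? contentType : contentType.substring(0, semi)).trim();
        return mediaType.toLowerCase(Locale.ROOT)
                .equals("application/x-www-form-urlencoded");
    }
}
```

Under this reading, a PUT with a form-encoded body never satisfies the
predicate, so the body stays on the input stream for the servlet to
read.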

> But if you're only parsing the body as an option, and leaving it
> there (having your cake and eating it too, so to speak), I don't see
> how that violates the spec or the RFC. Yes, stripping out the body =
> bad, bad, bad. But params parsed for PUTs, *in addition to the body*.
> How is that a violation?

This is why I believe it is a violation of the specification: parsing
the request body for parameters while also leaving the input stream
available for reading is surprising at the least and disastrous at the
worst. Your code could re-parse the request body and conclude that the
parameters were submitted twice: once via the call to getParameter,
which returns the server-parsed copy, and a second time when the
user-parsed parameters come into play.
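
To make the double-counting hazard concrete, here is a small simulation
in plain Java. It does not use the servlet API; DoubleParseHazard and
its methods are hypothetical names, and URL-decoding is left out for
brevity. It only models what happens if the same body is parsed once by
the container and once more by application code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation: not Tomcat or Jetty code.
class DoubleParseHazard {

    // Minimal form-body parser (no URL-decoding, for brevity).
    static void addPairs(String encoded, Map<String, List<String>> params) {
        for (String pair : encoded.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
    }

    static Map<String, List<String>> simulate(String body) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        // Step 1: the container parses the body to serve getParameter calls.
        addPairs(body, params);
        // Step 2: the body is still readable, so application code parses
        // it again and merges the result with the container's parameters.
        addPairs(body, params);
        return params;
    }

    public static void main(String[] args) {
        // prints {qty=[1, 1]}: one submitted value now appears twice
        System.out.println(simulate("qty=1"));
    }
}
```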

>> One of the sources of confusion at the beginning of this thread was that
>> you were calling this data "params in the header" which muddled these
>> terms together.
> 
> Why do you keep torturing me? OK! OK! No dessert for me tonight. Satisfied?

Yes, actually. We get a lot of posts on the list under the banner of
"frustrated and at my wits' end" and forgive the unnecessary yelling,
attacks, threats, and bold claims that absolutely nothing is wrong with
the poster's application.

Frankly, your bluster outlasted that of most other list members and, I'm
sure, turned a lot of folks off of the thread.

I'm not asking for a grand apology... just for you to relax: we really
are here to help. It's not often that something is found to be "wrong"
with Tomcat, so we like to be observant, methodical, and precise to make
sure we're properly identifying the causes for certain behaviors.

> Thanks a lot for your replies. They've been very educational and have
> caused my paranoia to wane considerably. There are some pretty bright
> people on this list.

There really are: this list is not merely a collection of whiners and
Tomcat fanboys. Many if not all of the Tomcat committers lurk on the
list, as well as committers from other ASF projects. We have regulars
who work on production-quality web applications of fairly large
proportion as well as small. One of our regulars works at a company that
maintains its own JVM and offers insight into such implementations
whenever these minutiae arise in threads.

I encourage you to remain on the list after your questions are answered
and complaints are vented: put your mail reader in "threaded" mode and
read any thread that piques your interest. I've learned a lot from
reading and participating in many discussions on this list, and I think
you probably will, too, if you stick around.

So, welcome to the community.

- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAktxfqUACgkQ9CaO5/Lv0PBr/gCfYtTm7PIegTAyT3mysQlI55bw
zaEAn2GFnTcGSyk8bviB0NPXEY4ZfEIb
=tbSv
-----END PGP SIGNATURE-----
