Jean-frederic Clere wrote:
William A. Rowe, Jr. wrote:

Guys, let me clarify, you are only paying attention to ';' following the
QUERY_STRING delimiter '?', correct?

';' means nothing special before the '?', double check your interpretation of RFC 2616. I can have /foo.bar;bash?v1=a;v2=b (or ...?v1=a&v2=b) and that
semi is part of the foo.bar;bash filename.  Right?

Then what I have just committed is not right...

But in mod_jk the behaviour without the patch is weird.
Try:
JkMount /*.jsp worker1
and a URL like http://localhost/;jsp-examples/jsp2/;simpletag/;hello.jsp
without the patches.

That may mean the core tomcat parser doesn't parse according to RFC 2616...
or it's simply an issue that ; should be escaped.  See RFC 2616 section 3.2.3:

   Characters other than those in the "reserved" and "unsafe" sets (see
   RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.

and RFC 2396, in turn, says

2.2. Reserved Characters

   Many URI include components consisting of or delimited by, certain
   special characters.  These characters are called "reserved", since
   their usage within the URI component is limited to their reserved
   purpose.  If the data for a URI component would conflict with the
   reserved purpose, then the conflicting data must be escaped before
   forming the URI.

      reserved    = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
                    "$" | ","

   The "reserved" syntax class above refers to those characters that are
   allowed within a URI, but which may not be allowed within a
   particular component of the generic URI syntax; they are used as
   delimiters of the components described in Section 3.
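
To illustrate the escaping rule quoted above, here is a minimal sketch in Python. It assumes the standard library's urllib.parse.quote as a stand-in escaper (it is not what httpd or tomcat use), and the file name is only the example from earlier in this thread.

```python
from urllib.parse import quote, unquote

# Example file name from this thread, containing a literal semicolon.
filename = "foo.bar;bash"

# quote() keeps "/" by default but escapes the reserved ";" as %3B,
# so the semicolon can no longer be read as a parameter delimiter.
escaped = quote(filename)
print(escaped)           # foo.bar%3Bbash
print(unquote(escaped))  # foo.bar;bash
```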

Now I realize that tomcat gets its clue on ";" from the same RFC 2396:

3.3. Path Component

   The path component contains data, specific to the authority (or the
   scheme if there is no authority component), identifying the resource
   within the scope of that scheme and authority.

      path          = [ abs_path | opaque_part ]

      path_segments = segment *( "/" segment )
      segment       = *pchar *( ";" param )
      param         = *pchar

      pchar         = unreserved | escaped |
                      ":" | "@" | "&" | "=" | "+" | "$" | ","

   The path may consist of a sequence of path segments separated by a
   single slash "/" character.  Within a path segment, the characters
   "/", ";", "=", and "?" are reserved.  Each path segment may include a
   sequence of parameters, indicated by the semicolon ";" character.
   The parameters are not significant to the parsing of relative
   references.
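
A literal reading of that grammar can be sketched in a few lines of Python. The function name split_path_params is hypothetical, and this only illustrates the segment/param rule; it makes no claim about how tomcat or mod_jk actually parse.

```python
def split_path_params(abs_path):
    """Split an abs_path into (segment, [params]) pairs,
    following segment = *pchar *( ";" param ) literally."""
    result = []
    # abs_path starts with "/", so the first split element is empty; skip it.
    for raw in abs_path.split("/")[1:]:
        name, *params = raw.split(";")
        result.append((name, params))
    return result

print(split_path_params("/foo.bar;bash"))
# [('foo.bar', ['bash'])]
print(split_path_params("/;jsp-examples/jsp2/;simpletag/;hello.jsp"))
# [('', ['jsp-examples']), ('jsp2', []), ('', ['simpletag']), ('', ['hello.jsp'])]
```

Under this reading, every segment of the mod_jk test URL except jsp2 has an empty name and carries its text as a parameter.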

But I was under the belief that RFC 2616 did NOT adopt this structure
of per-path-segment param values.  What we are discussing doesn't inform
tomcat what to do with abs_path values from other protocols,
only from HTTP.

Now that I reread 2616:

3.2.1 General Syntax

   URIs in HTTP can be represented in absolute form or relative to some
   known base URI [11], depending upon the context of their use. The two
   forms are differentiated by the fact that absolute URIs always begin
   with a scheme name followed by a colon. For definitive information on
   URL syntax and semantics, see "Uniform Resource Identifiers (URI):
   Generic Syntax and Semantics," RFC 2396 [42] (which replaces RFCs
   1738 [4] and RFC 1808 [11]). This specification adopts the
   definitions of "URI-reference", "absoluteURI", "relativeURI", "port",
   "host","abs_path", "rel_path", and "authority" from that
   specification.

I see it ***does*** adopt abs_path, and that includes the definition

      segment       = *pchar *( ";" param )

which means, in short, I believe the scheme parser of httpd is at least
partly flawed :)

http://svn.apache.org/repos/asf/apr/apr-util/trunk/uri/apr_uri.c

Note that the definition of a URI abs_path param informs the resource on
a segment-by-segment basis.  This is quite different than the definition
of an http "query" part (not mentioned in 3.2.1 above)

  http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
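
As a rough illustration of that split, assuming Python's urllib.parse.urlsplit as a stand-in parser (not apr_uri.c), a ";" before the "?" stays in abs_path and a ";" after it stays in the query:

```python
from urllib.parse import urlsplit

# Example URL built from the path discussed earlier in this thread.
parts = urlsplit("http://localhost/foo.bar;bash?v1=a;v2=b")

# urlsplit() divides at the first "?" and does not interpret ";" at all.
print(parts.path)   # /foo.bar;bash
print(parts.query)  # v1=a;v2=b
```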

Note especially RFC 2616's section 13.9...

   Unless the origin server explicitly prohibits the caching of their
   responses, the application of GET and HEAD methods to any resources
   SHOULD NOT have side effects that would lead to erroneous behavior if
   these responses are taken from a cache. They MAY still have side
   effects, but a cache is not required to consider such side effects in
   its caching decisions. Caches are always expected to observe an
   origin server's explicit restrictions on caching.

   We note one exception to this rule: since some applications have
   traditionally used GETs and HEADs with query URLs (those containing a
   "?" in the rel_path part) to perform operations with significant side
   effects, caches MUST NOT treat responses to such URIs as fresh unless
   the server provides an explicit expiration time.

If you use a segment of *( ";" param ) in your path, ponder a moment: those
parameters to a GET or HEAD request will be ignored by the proxy in its
determination of whether to invalidate a stale cache entry.  They *are*
treated as unique, but a subsequent call to /deleteme;user=wrowe will *not*
cause the proxy to refetch the action from the origin server.  A subsequent
first request to GET /deleteme;user=jean-frederic would, of course, be passed
to the origin server, as that path is different from /deleteme;user=wrowe and
is not in the cache.
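
The difference can be modelled with a toy freshness check. This is only a sketch of the section 13.9 exception under simplifying assumptions: the function name is hypothetical, Cache-Control and every other header are ignored, and a True result only means the exception does not forbid treating the entry as fresh.

```python
def may_treat_as_fresh(request_uri, has_explicit_expiration):
    """Toy model of the RFC 2616 section 13.9 exception for query URLs."""
    if "?" in request_uri:
        # Query URL: MUST NOT be treated as fresh unless the server
        # supplied an explicit expiration time.
        return has_explicit_expiration
    # No "?": the exception does not apply, so a cache is free to
    # use its normal (possibly heuristic) freshness rules.
    return True

for uri in ("/deleteme?user=wrowe", "/deleteme;user=wrowe"):
    print(uri, may_treat_as_fresh(uri, has_explicit_expiration=False))
# /deleteme?user=wrowe False
# /deleteme;user=wrowe True
```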

I suspect a lot of GET/HEAD requests using this parameter model are not
observing RFC 2616 and its cache control logic, unless they are explicitly
responding that the 'action' is not cacheable in the response headers :)

So please make sure you've thought this through and that tomcat is doing
precisely as RFC2616 declared, and take note that my original objection does
not precisely play out the way I stated it.

Bill

