On 19/02/2013 6:48 a.m., Alex Rousskov wrote:
On 02/16/2013 06:51 PM, Amos Jeffries wrote:

Since we now support HTTP/1.1 storage and revalidation of
Cache-Control:no-cache it is important that we at least detect the cases
where no-cache= and private= contain parameters.
The patch appears to do more than just detection -- Squid starts to
ignore CC:no-cache and, in some cases, CC:private (with or without
parameters, depending on circumstances).


AFAIK these are still rare occurances due to the historic lack of
support. So for now Squid just detects and exempts these responses from
the caching performed. The basic framework for adding future support of
these HTTP/1.1 features is made available
+            } else if (/* p &&*/ !httpHeaderParseQuotedString(p, (ilen-nlen-1), 
&private_)) {
+                debugs(65, 2, "cc: invalid private= specs near '" << item << 
"'");
+                clearPrivate();
+            } else {
It feels wrong to clear and ignore a private CC directive that we know
is there, even though we cannot parse its value. Should we be
conservative and treat this as private without parameters instead of
possibly violating HTTP by ignoring CC:private (e.g., if our parser is
wrong)?

Yes. This is undefined behaviour, but to be on the safe side this defaults again to "CC:private".

Also, if this code stays, please reshape it to handle the successful
parsing case (with side-effect of setting private_) first, for clarity:

   if (!p) {
      private_.clean(); // last CC:private wins
      setMask(true, true);
   } else if (parse(private_)) {
     setMask(true, true); // the above [re]sets private_
   } else {
     clearPrivate(); // ignore this and previous CC:private
   }


Same concerns for no-cache, of course.

They are formed that way to ft in with the surrounding code. In particular the parsing stye of the other 5 options with parameters.


The HTTP semantic meaning of these three cases is very different between themselves but the same syntax<->meaning pattern for both private and no-cache.

CC:private ... means *whole* response is private, MUST NOT cache.

CC:private="foo" ... means the specific listed *headers* are private, response itself MAY be cached.

CC:private=^%%$ ... (or other breakage) means invalid options and SHOULD be ignored.

CC:private, private="foo" ... is treated in HTTP as an explicit hack by the sender - such that caches coping with private="foo" will obey CC:private="foo" and older 1.0 caches will do "CC:private" handling.


This patch is supposed to pull the parameters into a string for later re-use and treat it as CC:private (MUST NOT cache). If the admin uses ignore-private, then we require a revalidate to ensure correct response to later clients, which should hopefully avoid cache poisoning in a lot of cases.


+    void Private(String &v) {setMask(CC_PRIVATE,true); private_.append(v);} // 
uses append for multi-line headers
Would not append() merge two header names together without delimiters if
the directive is split between two header fields? For example, would not
the private_ value become "foobar" in the example below?

   Foo: 1
   Cache-Control: private="foo", private="bar"
   Bar: 2

Fixed.


For the future code to work, the values from multiple directives should
be concatenated using ",". However, it may be better to store them in a
list or map of some kind instead (since that future code will have to
check individual header names for a match anyway). Thus, the best
structure for this may be rather different. This doubles my concern that
we are adding code that should be essentially unused (IMO), that adds
overhead, but that may have to be fixed/rewritten further when it is used.

Yes. SBufList was the idea in future. But we are not there yet. At present we are forced to use String by the current HttpHeader mechanisms.

Also, I think the code will mishandle this case:

   Foo: 1
   Cache-Control: private
   Cache-Control: private="bar"
   Bar: 2

because it will treat the above as the following less restrictive case:

   Foo: 1
   Cache-Control: private="bar"
   Bar: 2

instead of treating it as

   Foo: 1
   Cache-Control: private
   Bar: 2

Does HTTP say that [later] less restrictive CC directives overwrite
[earlier] more restrictive ones?

Yes. These two directives directly contradict each other. See above for the semantics. The resulting behaviour is undefined in the spec, but given that Squid treats both forms as non-cacheable I don't think it is a problem.


+            // CC:no-cache (not only if there are no parameters)
+            const bool CcNoCacheNoParams = (rep->cache_control->hasNoCache() && 
rep->cache_control->noCache().undefined());
The comment says "not only if there are no parameters" but the boolean
condition says "only if there are no parameters".

Dropped the 'not'.

+            // CC:private (if stored)
+            const bool CcPrivate = rep->cache_control->hasPrivate();
Does "if stored" apply to all other CC directives in this context as well?

Yes this is a HIT pathway. Dropped.

Please start local variable names such as CcPrivate with a small letter.

Done.


          // NP: request CC:no-cache only means cache READ is forbidden. STORE 
is permitted.
+        if (rep->cache_control->hasNoCache() && 
rep->cache_control->noCache().defined()) {
+            /* TODO: we are allowed to cache when no-cache= has parameters.
+             * Provided we strip away any of the listed headers unless they 
are revalidated
+             * successfully (ie, must revalidate AND these headers are 
prohibited on stale replies).
+             * That is a bit tricky for squid right now so we avoid caching 
entirely.
+             */
+            debugs(22, 3, HERE << "NO because server reply Cache-Control:no-cache 
has parameters");
+            return 0;
+        }
The new comment and behavior seems inconsistent, especially in light of
the old "NP" comment that is left intact: It seems like we should store
(or not) regardless of the header names presence. Why does presence of
header names make a difference in this case?

The above matches the RFC no-cache definition...

Response CC:no-cache="foo" requires header manipulation Squid cannot do yet. ==> Do not cache.

The if() and TODO is about that missing functionality

 Response CC:no-cache requires validation, Squid does this. ==> Cacheable.

The NP left in the code later on is about that caching.


And yes the whole option is very counter-intuitive.


              // NP: given the must-revalidate exception we should also be able 
to exempt no-cache.
              // HTTPbis WG verdict on this is that it is omitted from the spec 
due to being 'unexpected' by
              // some. The caching+revalidate is not exactly unsafe though with 
Squids interpretation of no-cache
-            // as equivalent to must-revalidate in the reply.
-        } else if (rep->cache_control->noCache() && 
!REFRESH_OVERRIDE(ignore_must_revalidate)) {
+            // (without parameters) as equivalent to must-revalidate in the 
reply.
+        } else if (rep->cache_control->hasNoCache() && 
!rep->cache_control->noCache().defined() && !REFRESH_OVERRIDE(ignore_must_revalidate)) {

Same concern here. Perhaps I am missing the meaning of including those
header names, but I think if we cannot handle them properly, we should
ignore their presence instead of ignoring the presence of the entire
CC:no-cache directive!

You are missing the meaning. Their presence alters the entire no-cache from being scoped to the whole response to be scoped to those headers alone. Handling them properly involves a lot of tricky if-exists post-processing on the old reply and revalidation response - whih at the time Squid has both those sets of details one is locked away behind a const barrier I could not unwind easily. It is far easier to simply treat no-cache="foo" as a no-store option and make this a "Precondition failed" as Co-Advisor says instead of "HTTP violation".

RFC specification has an "unless" clause on both these response directives. This whole patch is about making that "unless" case become non-cacheable again. As Dmitry pointed out it was broken by the earlier no-cache storge update. For the no-cache option we are reverting that behaviour change in the specific case of parameters being present. For the private option we are storing the parameters in a String, but otherwise not changing the current behaviour.


-            debugs(22, 3, HERE << " Authenticated but server reply 
Cache-Control:s-maxage");
+            debugs(22, 3, HERE << "Authenticated but server reply 
Cache-Control:s-maxage");
Please remove this non-change.

Done. Applied separately.

Amos

Reply via email to