Re: mod_cache with Cache-Control no-cache= or private=

2013-03-27 Thread Yann Ylavic
On Mon, Mar 25, 2013 at 11:58 PM, Roy T. Fielding field...@gbiv.com wrote:
 On Mar 13, 2013, at 10:20 AM, Graham Leggett wrote:
 I don't read it that way from the spec, I think it all comes down to the 
 phrase without successful revalidation with the origin server. I read it 
 as without successful revalidation [of the whole request] with the origin 
 server. In other words, the origin server sent the original header, if the 
 origin server doesn't update the header in the 304 response then it means 
 I've had my opportunity to revalidate the entity, current cached value is 
 fine, send it.

 Roy ultimately would be able to answer this?

 It means delete the header fields prior to storing them in the cache
 for later reuse.  If the origin had wanted must-revalidate, it would
 simply use that directive instead.  The successful revalidation bit
 is saying that the cache should forward all of the fields for the response
 to the original request and for any response that is revalidated
 (i.e., forward the new fields received in 304), but not for the
 requests that are entirely handled by the cache.


Thank you for clarification, hence mod_cache is allowed to serve the
cached response (with respect to the other restrictions), no
revalidation is needed (for the CC header fields at least).

The following patch implements this behaviour, with the CC header
fields not being stored while still played with the origin
(validation) response.

Regards,
Yann.

Index: modules/cache/cache_util.c
===
--- modules/cache/cache_util.c  (revision 1461557)
+++ modules/cache/cache_util.c  (working copy)
@@ -542,13 +542,10 @@
 }

 /* These come from the cached entity. */
-if (h-cache_obj-info.control.no_cache
-|| h-cache_obj-info.control.no_cache_header
-|| h-cache_obj-info.control.private_header) {
+if (h-cache_obj-info.control.no_cache) {
 /*
- * The cached entity contained Cache-Control: no-cache, or a
- * no-cache with a header present, or a private with a header
- * present, so treat as stale causing revalidation.
+ * The cached entity contained Cache-Control: no-cache,
+ * so treat as stale causing revalidation.
  */
 return 0;
 }
@@ -915,12 +912,23 @@
 return ap_cache_cacheable_headers(r-pool, r-headers_in, r-server);
 }

-/*
- * Create a new table consisting of those elements from an output
+static int cc_field_doo_doo(void *t, const char *key, const char *val)
+{
+if (val) {
+apr_table_addn(t, key, val);
+return 0;
+}
+return 1;
+}
+
+/* Create a new table consisting of those elements from an output
  * headers table that are allowed to be stored in a cache;
+ * when cc is not NULL, also strip the headers specified by the
+ * Cache-Control private= or no-cache= directives;
  * ensure there is a content type and capture any errors.
  */
-CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_headers_out(request_rec *r)
+CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_headers_out(request_rec *r,
+ const cache_control_t *cc)
 {
 apr_table_t *headers_out;

@@ -944,6 +952,46 @@
r-content_encoding);
 }

+if (cc  (cc-no_cache_header || cc-private_header)) {
+char *token;
+const char *cc_out = apr_table_get(headers_out, Cache-Control);
+while (cc_out  (token = ap_get_list_item(r-pool, cc_out))) {
+apr_size_t len = strlen(token);
+
+/* ap_get_list_item() strips the spurious whitespaces and
+ * lowercases anything (but the quoted-strings) */
+if (len  9  strncmp(token, no-cache=, 9) == 0) {
+token += 9;
+len -= 9;
+}
+else if (len  8  strncmp(token, private=, 8) == 0) {
+token += 8;
+len -= 8;
+}
+else {
+continue;
+}
+
+/* RFC2616 14.9: quoted list of field-names */
+if (len  2  token[0] == ''  token[--len] == '') {
+/* strip the header(s) from the cacheable headers out,
+ * but preserve the ones from the current response by
+ * adding them to the err_headers_out */
+const char *tok, *header;
+(++token)[--len] = '\0';
+tok = token;
+do {
+if ((header = ap_cache_tokstr(r-pool, tok, tok)) 
+!apr_table_do(cc_field_doo_doo, r-err_headers_out,
+  headers_out, header, NULL)) {
+apr_table_unset(r-headers_out, header);
+apr_table_unset(headers_out, header);
+}
+} while (tok);
+}
+}
+}
+
 return headers_out;
 }

@@ -1069,9 +1117,7 @@
  

Re: mod_cache with Cache-Control no-cache= or private=

2013-03-27 Thread Yann Ylavic
I have already created the bugzilla issue #54706 nearly 2 weeks ago,
about mod_cache that may serve cached private= or no-cache= response
headers.

Should I link something discussion from here or the patch to this issue ?

Regards,
Yann.


Re: mod_cache with Cache-Control no-cache= or private=

2013-03-27 Thread Graham Leggett
On 27 Mar 2013, at 6:06 PM, Yann Ylavic ylavic@gmail.com wrote:

 Index: modules/cache/mod_cache.h
 ===
 --- modules/cache/mod_cache.h (revision 1461557)
 +++ modules/cache/mod_cache.h (working copy)
 @@ -152,9 +152,12 @@
 
 /* Create a new table consisting of those elements from an output
  * headers table that are allowed to be stored in a cache;
 + * when cc is not NULL, also strip the headers specified by the
 + * Cache-Control private= or no-cache= directives;
  * ensure there is a content type and capture any errors.
  */
 -CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_headers_out(request_rec *r);
 +CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_headers_out(request_rec *r,
 + const cache_control_t *cc);
 
 /**
  * Parse the Cache-Control and Pragma headers in one go, marking

Been snowed under and haven't had a chance to look at this in detail, but one 
quick thing - we would definitely want to be able to backport this to v2.4 so 
as to get it into people's hands, and to do that, we cannot change the public 
APIs. We would need to find a way to do this without changing the API.

A further thing is the reparsing on the Cache-Control string, I'd like to see 
if I can find a way to avoid this, but need to dig as how to do that.

Regards,
Graham
--



smime.p7s
Description: S/MIME cryptographic signature


Re: mod_cache with Cache-Control no-cache= or private=

2013-03-27 Thread Yann Ylavic
On Wed, Mar 27, 2013 at 5:44 PM, Graham Leggett minf...@sharp.fm wrote:
 Been snowed under and haven't had a chance to look at this in detail, but one 
 quick thing - we would definitely want to be able to backport this to v2.4 so 
 as to get it into people's hands, and to do that, we cannot change the public 
 APIs. We would need to find a way to do this without changing the API.

 A further thing is the reparsing on the Cache-Control string, I'd like to see 
 if I can find a way to avoid this, but need to dig as how to do that.

In the patch below there is no API change and Cache-Control headers
won't be parsed twice, by updating the h-req/resp_hdrs before calling
provider-store_headers() and let it use h-req/resp_headers instead
of recomputing the whole from r-(err_)headers_out.

Regards,
Yann.

Index: modules/cache/cache_util.c
===
--- modules/cache/cache_util.c  (revision 1461557)
+++ modules/cache/cache_util.c  (working copy)
@@ -542,13 +542,10 @@
 }

 /* These come from the cached entity. */
-if (h-cache_obj-info.control.no_cache
-|| h-cache_obj-info.control.no_cache_header
-|| h-cache_obj-info.control.private_header) {
+if (h-cache_obj-info.control.no_cache) {
 /*
- * The cached entity contained Cache-Control: no-cache, or a
- * no-cache with a header present, or a private with a header
- * present, so treat as stale causing revalidation.
+ * The cached entity contained Cache-Control: no-cache,
+ * so treat as stale causing revalidation.
  */
 return 0;
 }
@@ -1069,9 +1066,7 @@
 /* ...then try slowest cases */
 else if (!strncasecmp(token, no-cache, 8)) {
 if (token[8] == '=') {
-if (apr_table_get(headers, token + 9)) {
-cc-no_cache_header = 1;
-}
+cc-no_cache_header = 1;
 }
 else if (!token[8]) {
 cc-no_cache = 1;
@@ -1146,9 +1141,7 @@
 }
 else if (!strncasecmp(token, private, 7)) {
 if (token[7] == '=') {
-if (apr_table_get(headers, token + 8)) {
-cc-private_header = 1;
-}
+cc-private_header = 1;
 }
 else if (!token[7]) {
 cc-private = 1;
Index: modules/cache/mod_cache.c
===
--- modules/cache/mod_cache.c   (revision 1461557)
+++ modules/cache/mod_cache.c   (working copy)
@@ -714,6 +714,65 @@
 }

 /*
+ * Same as ap_cacheable_headers_out(), but also strips the headers
+ * specified by the Cache-Control private= or no-cache= directives.
+ */
+static int cc_field_doo_doo(void *t, const char *key,
+ const char *val)
+{
+if (val) {
+apr_table_addn(t, key, val);
+return 0;
+}
+return 1;
+}
+static apr_table_t *cache_cacheable_headers_cc(request_rec *r,
+ const cache_control_t *cc)
+{
+apr_table_t *headers_out = ap_cache_cacheable_headers_out(r);
+if (cc  (cc-no_cache_header || cc-private_header)) {
+char *token;
+const char *cc_out = apr_table_get(headers_out, Cache-Control);
+while (cc_out  (token = ap_get_list_item(r-pool, cc_out))) {
+apr_size_t len = strlen(token);
+
+/* ap_get_list_item() strips the spurious whitespaces and
+ * lowercases anything (but the quoted-strings) */
+if (len  9  strncmp(token, no-cache=, 9) == 0) {
+token += 9;
+len -= 9;
+}
+else if (len  8  strncmp(token, private=, 8) == 0) {
+token += 8;
+len -= 8;
+}
+else {
+continue;
+}
+
+/* RFC2616 14.9: quoted list of field-names */
+if (len  2  token[0] == ''  token[--len] == '') {
+/* strip the header(s) from the cacheable headers out,
+ * but preserve the ones from the current response by
+ * adding them to the err_headers_out */
+const char *tok, *header;
+(++token)[--len] = '\0';
+tok = token;
+do {
+if ((header = ap_cache_tokstr(r-pool, tok, tok)) 
+!apr_table_do(cc_field_doo_doo, r-err_headers_out,
+  headers_out, header, NULL)) {
+apr_table_unset(r-headers_out, header);
+apr_table_unset(headers_out, header);
+}
+} while (tok);
+}
+}
+}

Re: mod_cache with Cache-Control no-cache= or private=

2013-03-27 Thread Yann Ylavic
In fact this patch is probably better since it does not change
h-resp_hdrs before calling cache_accept_headers() which uses them.

Regars,
Yann.

Index: modules/cache/cache_util.c
===
--- modules/cache/cache_util.c  (revision 1461557)
+++ modules/cache/cache_util.c  (working copy)
@@ -542,13 +542,10 @@
 }

 /* These come from the cached entity. */
-if (h-cache_obj-info.control.no_cache
-|| h-cache_obj-info.control.no_cache_header
-|| h-cache_obj-info.control.private_header) {
+if (h-cache_obj-info.control.no_cache) {
 /*
- * The cached entity contained Cache-Control: no-cache, or a
- * no-cache with a header present, or a private with a header
- * present, so treat as stale causing revalidation.
+ * The cached entity contained Cache-Control: no-cache,
+ * so treat as stale causing revalidation.
  */
 return 0;
 }
@@ -1069,9 +1066,7 @@
 /* ...then try slowest cases */
 else if (!strncasecmp(token, no-cache, 8)) {
 if (token[8] == '=') {
-if (apr_table_get(headers, token + 9)) {
-cc-no_cache_header = 1;
-}
+cc-no_cache_header = 1;
 }
 else if (!token[8]) {
 cc-no_cache = 1;
@@ -1146,9 +1141,7 @@
 }
 else if (!strncasecmp(token, private, 7)) {
 if (token[7] == '=') {
-if (apr_table_get(headers, token + 8)) {
-cc-private_header = 1;
-}
+cc-private_header = 1;
 }
 else if (!token[7]) {
 cc-private = 1;
Index: modules/cache/mod_cache.c
===
--- modules/cache/mod_cache.c   (revision 1461557)
+++ modules/cache/mod_cache.c   (working copy)
@@ -714,6 +714,65 @@
 }

 /*
+ * Same as ap_cacheable_headers_out(), but also strips the headers
+ * specified by the Cache-Control private= or no-cache= directives.
+ */
+static int cc_field_doo_doo(void *t, const char *key,
+ const char *val)
+{
+if (val) {
+apr_table_addn(t, key, val);
+return 0;
+}
+return 1;
+}
+static apr_table_t *cache_cacheable_headers_cc(request_rec *r,
+ const cache_control_t *cc)
+{
+apr_table_t *headers_out = ap_cache_cacheable_headers_out(r);
+if (cc  (cc-no_cache_header || cc-private_header)) {
+char *token;
+const char *cc_out = apr_table_get(headers_out, Cache-Control);
+while (cc_out  (token = ap_get_list_item(r-pool, cc_out))) {
+apr_size_t len = strlen(token);
+
+/* ap_get_list_item() strips the spurious whitespaces and
+ * lowercases anything (but the quoted-strings) */
+if (len  9  strncmp(token, no-cache=, 9) == 0) {
+token += 9;
+len -= 9;
+}
+else if (len  8  strncmp(token, private=, 8) == 0) {
+token += 8;
+len -= 8;
+}
+else {
+continue;
+}
+
+/* RFC2616 14.9: quoted list of field-names */
+if (len  2  token[0] == ''  token[--len] == '') {
+/* strip the header(s) from the cacheable headers out,
+ * but preserve the ones from the current response by
+ * adding them to the err_headers_out */
+const char *tok, *header;
+(++token)[--len] = '\0';
+tok = token;
+do {
+if ((header = ap_cache_tokstr(r-pool, tok, tok)) 
+!apr_table_do(cc_field_doo_doo, r-err_headers_out,
+  headers_out, header, NULL)) {
+apr_table_unset(r-headers_out, header);
+apr_table_unset(headers_out, header);
+}
+} while (tok);
+}
+}
+}
+return headers_out;
+}
+
+/*
  * CACHE_SAVE filter
  * ---
  *
@@ -746,6 +805,7 @@
 apr_time_t exp, date, lastmod, now;
 apr_off_t size = -1;
 cache_info *info = NULL;
+apr_table_t *cc_headers;
 char *reason;
 apr_pool_t *p;
 apr_bucket *e;
@@ -1075,7 +1135,7 @@
  * err_headers_out and we also need to strip any hop-by-hop
  * headers that might have snuck in.
  */
-r-headers_out = ap_cache_cacheable_headers_out(r);
+r-headers_out = cache_cacheable_headers_cc(r, control);

 /* Merge in our cached headers.  However, keep any
updated values. 

Re: mod_cache with Cache-Control no-cache= or private=

2013-03-27 Thread Yann Ylavic
Sorry for my precipitation, the Content-Type is stripped from the
validated (stale) headers with the previous patch, we have to do a
copy like below.

Regards,
Yann.

Index: modules/cache/cache_util.c
===
--- modules/cache/cache_util.c  (revision 1461557)
+++ modules/cache/cache_util.c  (working copy)
@@ -542,13 +542,10 @@
 }

 /* These come from the cached entity. */
-if (h-cache_obj-info.control.no_cache
-|| h-cache_obj-info.control.no_cache_header
-|| h-cache_obj-info.control.private_header) {
+if (h-cache_obj-info.control.no_cache) {
 /*
- * The cached entity contained Cache-Control: no-cache, or a
- * no-cache with a header present, or a private with a header
- * present, so treat as stale causing revalidation.
+ * The cached entity contained Cache-Control: no-cache,
+ * so treat as stale causing revalidation.
  */
 return 0;
 }
@@ -1069,9 +1066,7 @@
 /* ...then try slowest cases */
 else if (!strncasecmp(token, no-cache, 8)) {
 if (token[8] == '=') {
-if (apr_table_get(headers, token + 9)) {
-cc-no_cache_header = 1;
-}
+cc-no_cache_header = 1;
 }
 else if (!token[8]) {
 cc-no_cache = 1;
@@ -1146,9 +1141,7 @@
 }
 else if (!strncasecmp(token, private, 7)) {
 if (token[7] == '=') {
-if (apr_table_get(headers, token + 8)) {
-cc-private_header = 1;
-}
+cc-private_header = 1;
 }
 else if (!token[7]) {
 cc-private = 1;
Index: modules/cache/mod_cache.c
===
--- modules/cache/mod_cache.c   (revision 1461557)
+++ modules/cache/mod_cache.c   (working copy)
@@ -714,6 +714,65 @@
 }

 /*
+ * Same as ap_cacheable_headers_out(), but also strips the headers
+ * specified by the Cache-Control private= or no-cache= directives.
+ */
+static int cc_field_doo_doo(void *t, const char *key,
+ const char *val)
+{
+if (val) {
+apr_table_addn(t, key, val);
+return 0;
+}
+return 1;
+}
+static apr_table_t *cache_cacheable_headers_cc(request_rec *r,
+ const cache_control_t *cc)
+{
+apr_table_t *headers_out = ap_cache_cacheable_headers_out(r);
+if (cc  (cc-no_cache_header || cc-private_header)) {
+char *token;
+const char *cc_out = apr_table_get(headers_out, Cache-Control);
+while (cc_out  (token = ap_get_list_item(r-pool, cc_out))) {
+apr_size_t len = strlen(token);
+
+/* ap_get_list_item() strips the spurious whitespaces and
+ * lowercases anything (but the quoted-strings) */
+if (len  9  strncmp(token, no-cache=, 9) == 0) {
+token += 9;
+len -= 9;
+}
+else if (len  8  strncmp(token, private=, 8) == 0) {
+token += 8;
+len -= 8;
+}
+else {
+continue;
+}
+
+/* RFC2616 14.9: quoted list of field-names */
+if (len  2  token[0] == ''  token[--len] == '') {
+/* strip the header(s) from the cacheable headers out,
+ * but preserve the ones from the current response by
+ * adding them to the err_headers_out */
+const char *tok, *header;
+(++token)[--len] = '\0';
+tok = token;
+do {
+if ((header = ap_cache_tokstr(r-pool, tok, tok)) 
+!apr_table_do(cc_field_doo_doo, r-err_headers_out,
+  headers_out, header, NULL)) {
+apr_table_unset(r-headers_out, header);
+apr_table_unset(headers_out, header);
+}
+} while (tok);
+}
+}
+}
+return headers_out;
+}
+
+/*
  * CACHE_SAVE filter
  * ---
  *
@@ -746,6 +805,7 @@
 apr_time_t exp, date, lastmod, now;
 apr_off_t size = -1;
 cache_info *info = NULL;
+apr_table_t *cc_headers;
 char *reason;
 apr_pool_t *p;
 apr_bucket *e;
@@ -1075,7 +1135,7 @@
  * err_headers_out and we also need to strip any hop-by-hop
  * headers that might have snuck in.
  */
-r-headers_out = ap_cache_cacheable_headers_out(r);
+r-headers_out = cache_cacheable_headers_cc(r, control);

 /* Merge in our cached headers.  However, keep 

Re: mod_cache with Cache-Control no-cache= or private=

2013-03-27 Thread Yann Ylavic
The latest patch is attached to bugzilla #54706.

Regards,
Yann.


Re: mod_cache with Cache-Control no-cache= or private=

2013-03-25 Thread Roy T. Fielding
On Mar 13, 2013, at 10:20 AM, Graham Leggett wrote:

 On 11 Mar 2013, at 12:50 PM, Yann Ylavic ylavic@gmail.com wrote:
 
 The way I read the spec, the specified field-name(s) MUST NOT be sent in 
 the response to a subsequent request without successful revalidation with 
 the origin server. What this means is that if the specified field names 
 are found to be present in the cached response, then the origin server 
 needs to be given the opportunity to update these fields through a 
 conditional request. In the current cache code, we return 0 meaning this 
 is stale, revalidate, and a conditional request is sent to the origin. We 
 hope the origin sends 304 Not Modified, with updated headers 
 corresponding to the fields.
 
 Ok, I see your point, and this is surely the right reading of the rfc,
 but there is necessarily a difference between no-cache and
 no-cache=header(s), particularly with the handling of that (stale)
 header(s).
 
 For what I understand now, if the origin does not send (one of) the
 header(s) in its 304 response, the stale header(s) MUST NOT be
 served.
 
 I don't read it that way from the spec, I think it all comes down to the 
 phrase without successful revalidation with the origin server. I read it as 
 without successful revalidation [of the whole request] with the origin 
 server. In other words, the origin server sent the original header, if the 
 origin server doesn't update the header in the 304 response then it means 
 I've had my opportunity to revalidate the entity, current cached value is 
 fine, send it.
 
 Roy ultimately would be able to answer this?

It means delete the header fields prior to storing them in the cache
for later reuse.  If the origin had wanted must-revalidate, it would
simply use that directive instead.  The successful revalidation bit
is saying that the cache should forward all of the fields for the response
to the original request and for any response that is revalidated
(i.e., forward the new fields received in 304), but not for the
requests that are entirely handled by the cache.

Roy



Re: mod_cache with Cache-Control no-cache= or private=

2013-03-13 Thread Yann Ylavic
Here is the patch that strips the no-cache= or private= specified
headers after the origin server's validation, leaving the only headers
updated by the origin.

Regards,
Yann.

Index: modules/cache/cache_storage.c
===
--- modules/cache/cache_storage.c   (revision 1456050)
+++ modules/cache/cache_storage.c   (working copy)
@@ -156,6 +156,51 @@
 apr_table_unset(h-resp_hdrs, Last-Modified);
 }

+v = apr_table_get(h-resp_hdrs, Cache-Control);
+if (v  (h-cache_obj-info.control.no_cache_header ||
+  h-cache_obj-info.control.private_header)) {
+/*
+ * RFC2616 14.9.1: If the no-cache directive does specify one or more
+ * field-names, then a cache MAY use the response to satisfy a
+ * subsequent request, subject to any other restrictions on caching.
+ * However, the specified field-name(s) MUST NOT be sent in the
+ * response to a subsequent request without successful revalidation
+ * with the origin server.
+ *
+ * Hence we will strip these cached headers (if any) and let the only
+ * ones validated by the origin server.
+ */
+char *token;
+apr_size_t len;
+while ((token = ap_get_list_item(r-pool, v))) {
+/* ap_get_list_item() strips the spurious whitespaces and
+ * lowercases anything (but the quoted-strings) */
+if (strncmp(token, no-cache=, 9) == 0) {
+token += 9;
+}
+else if (strncmp(token, private=, 8) == 0) {
+token += 8;
+}
+else {
+continue;
+}
+
+/* RFC2616 14.9: quoted list of field-names */
+len = strlen(token);
+if (token[0] == ''  token[--len] == '') {
+(++token)[--len] = '\0';
+do {
+const char *name = ap_cache_tokstr(r-pool, token,
+   (const char**)token);
+if (name) {
+/* strip that name header the response */
+apr_table_unset(h-resp_hdrs, name);
+}
+} while (token);
+}
+}
+}
+
 /* The HTTP specification says that it is legal to merge duplicate
  * headers into one.  Some browsers that support Cookies don't like
  * merged headers and prefer that each Set-Cookie header is sent


Re: mod_cache with Cache-Control no-cache= or private=

2013-03-13 Thread Graham Leggett
On 11 Mar 2013, at 12:50 PM, Yann Ylavic ylavic@gmail.com wrote:

 The way I read the spec, the specified field-name(s) MUST NOT be sent in 
 the response to a subsequent request without successful revalidation with 
 the origin server. What this means is that if the specified field names are 
 found to be present in the cached response, then the origin server needs to 
 be given the opportunity to update these fields through a conditional 
 request. In the current cache code, we return 0 meaning this is stale, 
 revalidate, and a conditional request is sent to the origin. We hope the 
 origin sends 304 Not Modified, with updated headers corresponding to the 
 fields.
 
 Ok, I see your point, and this is surely the right reading of the rfc,
 but there is necessarily a difference between no-cache and
 no-cache=header(s), particularly with the handling of that (stale)
 header(s).
 
 For what I understand now, if the origin does not send (one of) the
 header(s) in its 304 response, the stale header(s) MUST NOT be
 served.

I don't read it that way from the spec, I think it all comes down to the phrase 
without successful revalidation with the origin server. I read it as without 
successful revalidation [of the whole request] with the origin server. In 
other words, the origin server sent the original header, if the origin server 
doesn't update the header in the 304 response then it means I've had my 
opportunity to revalidate the entity, current cached value is fine, send it.

Roy ultimately would be able to answer this?

Regards,
Graham
--



smime.p7s
Description: S/MIME cryptographic signature


Re: mod_cache with Cache-Control no-cache= or private=

2013-03-13 Thread Yann Ylavic
On Wed, Mar 13, 2013 at 6:20 PM, Graham Leggett minf...@sharp.fm wrote:
 On 11 Mar 2013, at 12:50 PM, Yann Ylavic ylavic@gmail.com wrote:

 The way I read the spec, the specified field-name(s) MUST NOT be sent in 
 the response to a subsequent request without successful revalidation with 
 the origin server. What this means is that if the specified field names 
 are found to be present in the cached response, then the origin server 
 needs to be given the opportunity to update these fields through a 
 conditional request. In the current cache code, we return 0 meaning this 
 is stale, revalidate, and a conditional request is sent to the origin. We 
 hope the origin sends 304 Not Modified, with updated headers 
 corresponding to the fields.

 Ok, I see your point, and this is surely the right reading of the rfc,
 but there is necessarily a difference between no-cache and
 no-cache=header(s), particularly with the handling of that (stale)
 header(s).

 For what I understand now, if the origin does not send (one of) the
 header(s) in its 304 response, the stale header(s) MUST NOT be
 served.

 I don't read it that way from the spec, I think it all comes down to the 
 phrase without successful revalidation with the origin server. I read it as 
 without successful revalidation [of the whole request] with the origin 
 server. In other words, the origin server sent the original header, if the 
 origin server doesn't update the header in the 304 response then it means 
 I've had my opportunity to revalidate the entity, current cached value is 
 fine, send it.

How would the origin invalidate a Set-Cookie, with an empty one ?

Regards,
Yann.


Re: mod_cache with Cache-Control no-cache= or private=

2013-03-13 Thread Graham Leggett
On 13 Mar 2013, at 7:27 PM, Yann Ylavic ylavic@gmail.com wrote:

 How would the origin invalidate a Set-Cookie, with an empty one ?

I would imagine with a 200 OK.

Roy would be able to confirm.

Regards,
Graham
--



smime.p7s
Description: S/MIME cryptographic signature


Re: mod_cache with Cache-Control no-cache= or private=

2013-03-13 Thread Tom Evans
On Wed, Mar 13, 2013 at 5:27 PM, Yann Ylavic ylavic@gmail.com wrote:

 How would the origin invalidate a Set-Cookie, with an empty one ?

 Regards,
 Yann.

Set it again, with an in the past expiry date.

Cheers

Tom


Re: mod_cache with Cache-Control no-cache= or private=

2013-03-13 Thread Yann Ylavic
On Wed, Mar 13, 2013 at 6:30 PM, Graham Leggett minf...@sharp.fm wrote:
 On 13 Mar 2013, at 7:27 PM, Yann Ylavic ylavic@gmail.com wrote:

 How would the origin invalidate a Set-Cookie, with an empty one ?

 I would imagine with a 200 OK.

 Roy would be able to confirm.

Well, I can't see the difference with the no-cache without a header
specified (the actual code) then...

Regards,
Yann.


Re: mod_cache with Cache-Control no-cache= or private=

2013-03-13 Thread Yann Ylavic
On Wed, Mar 13, 2013 at 6:35 PM, Tom Evans tevans...@googlemail.com wrote:
 On Wed, Mar 13, 2013 at 5:27 PM, Yann Ylavic ylavic@gmail.com wrote:

 How would the origin invalidate a Set-Cookie, with an empty one ?

 Regards,
 Yann.

 Set it again, with an in the past expiry date.

Well, that's not exactly the same thing, the user may have a valid
Cookie (which is not the one cached) the origin wants to keep going
on.
I meant invalidating the cached cookie, not the one of the user.

Cheers,
Yann.


Re: mod_cache with Cache-Control no-cache= or private=

2013-03-13 Thread Tim Bannister
On 13 Mar 2013, at 17:41, Yann Ylavic ylavic@gmail.com wrote:
 On Wed, Mar 13, 2013 at 6:35 PM, Tom Evans tevans...@googlemail.com wrote:
 On Wed, Mar 13, 2013 at 5:27 PM, Yann Ylavic ylavic@gmail.com wrote:
 
 How would the origin invalidate a Set-Cookie, with an empty one ?
 
 Regards,
 Yann.
 
 Set it again, with an in the past expiry date.
 
 Well, that's not exactly the same thing, the user may have a valid Cookie 
 (which is not the one cached) the origin wants to keep going on.
 I meant invalidating the cached cookie, not the one of the user.


Is this the situation you're worried about:

ClientA: GET /foo HTTP/1.1
ClientA: Host: …

ReverseProxy: GET /foo HTTP/1.1
ReverseProxy: Host: …

Origin: HTTP/1.1 200 OK
Origin: Date: …
Origin: Set-Cookie: data=AA
Origin: Cache-Control: private=Set-Cookie

ReverseProxy: HTTP/1.1 200 OK
ReverseProxy: Date: …
ReverseProxy: Set-Cookie: data=AA
ReverseProxy: Cache-Control: private=Set-Cookie



ClientB: GET /foo HTTP/1.1
ClientB: Host: …
ClientB: Cookie: data=BB

ReverseProxy: GET /foo HTTP/1.1
ReverseProxy: Host: …
ReverseProxy: Cookie: data=BBB

Origin: HTTP/1.1 304 Not Modified
Origin: Date: …
Origin: Cache-Control: private=Set-Cookie



This should just work. The final reply from the cacheing reverse proxy should 
look like this:
ReverseProxy: HTTP/1.1 304 Not Modified
ReverseProxy: Date: …

and the Set-Cookie: header from the stored request would not be used (in fact, 
the proxy may have elected not to store it). The origin doesn't have to mention 
that header in the 304 response.


-- 
Tim Bannister – is...@jellybaby.net



Re: mod_cache with Cache-Control no-cache= or private=

2013-03-13 Thread Yann Ylavic
On Wed, Mar 13, 2013 at 9:28 PM, Tim Bannister is...@jellybaby.net wrote:
 Is this the situation you're worried about:

 ClientA: GET /foo HTTP/1.1
 ReverseProxy: GET /foo HTTP/1.1
 Origin: HTTP/1.1 200 OK
 Origin: Set-Cookie: data=AA
 Origin: Cache-Control: private=Set-Cookie
 ReverseProxy: Set-Cookie: data=AA
 ReverseProxy: Cache-Control: private=Set-Cookie

 ClientB: GET /foo HTTP/1.1
 ClientB: Cookie: data=BB
 ReverseProxy: GET /foo HTTP/1.1
 ReverseProxy: Cookie: data=BBB
 Origin: HTTP/1.1 304 Not Modified

Yes, about what happens now, the ReverseProxy (mod_cache) must not
Set-Cookie: data=AA to ClientB.

 This should just work. The final reply from the cacheing reverse proxy should 
 look like this:
 ReverseProxy: HTTP/1.1 304 Not Modified
 ReverseProxy: Date: …

This actually does not work, mod_cache will serve the cached Set-Cookie.
The CacheIgnoreHeaders directive only can prevent this (not controlled
by the origin).

The final reply to ClientB, whose request is not conditional, can also be :
ReverseProxy: HTTP/1.1 200 OK
ReverseProxy: Cache-Control: private=Set-Cookie
ReverseProxy: cached body
That's the main goal I guess (limit backend hits for large things).

 and the Set-Cookie: header from the stored request would not be used (in 
 fact, the proxy may have elected not to store it). The origin doesn't have to 
 mention that header in the 304 response.

In mod_cache the no-store of a particular header is harder to patch
than the no-cache (ie. do not serve without revalidation), but
indeed the former would be more efficient, no need to sanitize each
served response.

For the private=, the rfc does not say more than its BNF...
If private has the same semantic as without the =, the header should
not be stored (the Cache-Control: private response is not stored by
mod_cache).

In all cases, IMHO, no cached Set-Cookie should aver played by a cache
with private/no-cache=Set-Cookie associated with the resource.


Re: mod_cache with Cache-Control no-cache= or private=

2013-03-11 Thread Yann Ylavic
On Sun, Mar 10, 2013 at 1:55 AM, Graham Leggett minf...@sharp.fm wrote:
 On 04 Mar 2013, at 8:22 PM, ylavic dev ylavic@gmail.com wrote:

 For what I understand, mod_cache is allowed to serve its cached entity 
 (though without the specified header(s)).

 I read this through again, this time having slept properly.

Thank you for your enlightened consideration.

 The way I read the spec, the specified field-name(s) MUST NOT be sent in the 
 response to a subsequent request without successful revalidation with the 
 origin server. What this means is that if the specified field names are 
 found to be present in the cached response, then the origin server needs to 
 be given the opportunity to update these fields through a conditional 
 request. In the current cache code, we return 0 meaning this is stale, 
 revalidate, and a conditional request is sent to the origin. We hope the 
 origin sends 304 Not Modified, with updated headers corresponding to the 
 fields.

Ok, I see your point, and this is surely the right reading of the rfc,
but there is necessarily a difference between no-cache and
no-cache=header(s), particularly with the handling of that (stale)
header(s).

For what I understand now, if the origin does not send (one of) the
header(s) in its 304 response, the stale header(s) MUST NOT be
served.
So mod_cache should never send these stale headers to the client, and
either do not cache them at all, or strip them before overlaping the
304's ones.

The actual code does not comply with this requirement since it
overlaps the stale headers with the origin ones, hence the no-cache
headers will still be there if there are not specified in the 304
response.

 If we were to follow this patch, it means that the first time we hit the URL, 
 the client sees the private/no-cache fields, but every cached response after 
 will be treated as fresh with the field missing. This breaks caching.

Indeed this patch is broken, but I can modify it to comply with my
comment above, meaning (at first glance) that the treatment should be
in cache_accept_headers().

Should I propose the new patch or my understanding is definitively broken ?

Regards,
Yann.


Re: mod_cache with Cache-Control no-cache= or private=

2013-03-09 Thread Graham Leggett
On 04 Mar 2013, at 8:22 PM, ylavic dev ylavic@gmail.com wrote:

 I've been working on a patch for mod_cache to deal (fully) with the response 
 header Cache-Control and the no-cache=header or private=header directives.
 This feature is mainly used with the Set-Cookie header, and allows the 
 origin server to control the caching of that particular header.
 
 Although the code is already there to detect their usage with a header, 
 mod_cache still handle these directives as if no header was specified.
 That is, stale entity causing revalidation [by the origin server].
 
 The RFC-2616 (14.9.1 What is Cacheable) says this about the no-cache=header 
 directive :
   If the no-cache directive does specify one or more field-names,
   then a cache MAY use the response to satisfy a subsequent request,
   subject to any other restrictions on caching. However, the
   specified field-name(s) MUST NOT be sent in the response to a
   subsequent request without successful revalidation with the origin
   server. This allows an origin server to prevent the re-use of
   certain header fields in a response, while still allowing caching
   of the rest of the response.
 For what I understand, mod_cache is allowed to serve its cached entity 
 (though without the specified header(s)).

I read this through again, this time having slept properly.

The way I read the spec, the specified field-name(s) MUST NOT be sent in the 
response to a subsequent request without successful revalidation with the 
origin server. What this means is that if the specified field names are found 
to be present in the cached response, then the origin server needs to be given 
the opportunity to update these fields through a conditional request. In the 
current cache code, we return 0 meaning this is stale, revalidate, and a 
conditional request is sent to the origin. We hope the origin sends 304 Not 
Modified, with updated headers corresponding to the fields.

If we were to follow this patch, it means that the first time we hit the URL, 
the client sees the private/no-cache fields, but every cached response after 
will be treated as fresh with the field missing. This breaks caching.

What you're trying to achieve needs to be handled by your origin server, which 
should support conditional requests, and then send updated Set-Cookie headers 
along with the 304 Not Modified responses. This way the body stays cached, but 
your Set-Cookie is updated on every hit.

Regards,
Graham
--



smime.p7s
Description: S/MIME cryptographic signature


Re: mod_cache with Cache-Control no-cache= or private=

2013-03-06 Thread Yann Ylavic
Hi,

On Mon, Mar 4, 2013 at 7:22 PM, ylavic dev ylavic@gmail.com wrote:
 I've been working on a patch for mod_cache to deal (fully) with the
 response header Cache-Control and the no-cache=header or private=header
 directives.

I realize that, maybe, the patch should have been included in the
message, rather than in an attachment, for it to be read quickly.
So let me reply to myself with the patch below (which is not a big deal)...

Or maybe is there a reason not to include that functionality in
mod_cache, with most of the code being already there ?
I could not find any relative discussion in the list nor anywhere
(about mod_cache, but to say it is not implemented).

Regards,
Yann.

Index: modules/cache/cache_util.c
===
--- modules/cache/cache_util.c  (revision 1451191)
+++ modules/cache/cache_util.c  (working copy)
@@ -27,7 +27,7 @@

 extern module AP_MODULE_DECLARE_DATA cache_module;

-#define CACHE_SEPARATOR ,   
+#define CACHE_SEPARATOR , \t

 /* Determine if url matches the hostname, scheme and port and path
  * in filter. All but the path comparisons are case-insensitive.
@@ -542,17 +542,84 @@
 }

 /* These come from the cached entity. */
-if (h-cache_obj-info.control.no_cache
-|| h-cache_obj-info.control.no_cache_header
-|| h-cache_obj-info.control.private_header) {
+if (h-cache_obj-info.control.no_cache) {
 /*
- * The cached entity contained Cache-Control: no-cache, or a
- * no-cache with a header present, or a private with a header
- * present, so treat as stale causing revalidation.
+ * The cached entity contained Cache-Control: no-cache, so
+ * treat as stale causing revalidation.
  */
 return 0;
 }
+if (h-cache_obj-info.control.no_cache_header
+|| h-cache_obj-info.control.private_header) {
+/*
+ * RFC2616 14.9.1: The cached entity contained
+ * Cache-Control: no-cache=, or Cache-Control: private=, with
+ * a header present, hence we are allowed to serve this entity,
+ * but without the specified headers, so let's strip them now,
+ * and fall through the other restrictions.
+ *
+ * Here we assume mixed Cache-Control: no-cache and no-cache=
+ * have been caught above and treated as stale causing revalidation,
+ * leaving here the only no-cache= and/or private= with a header.
+ */
+char *token;
+const char *header = apr_table_get(h-resp_hdrs, Cache-Control);
+while (header  (token = ap_get_list_item(r-pool, header))) {
+/* ap_get_list_item() strips the spurious whitespaces and
+ * lowercases anything (but the quoted-strings) */
+if (strncmp(token, no-cache=, 9) == 0) {
+token += 9;
+}
+else if (strncmp(token, private=, 8) == 0) {
+token += 8;
+}
+else {
+continue;
+}

+if (*token == '') {
+/* RFC2616 2.2: quoted-string
+ * found no ap_*() function to unquote those strings,
+ * so the job is done here... */
+char *pos, *start, *end;
+pos = start = end = token + 1;
+while (*pos  *pos != '') {
+if (*pos == '\\') {
+/* RFC2616 2.2: quoted-pair */
+if (end == pos) {
+/* duplicate to preserve the original token
+ * should the quoted-string be invalid */
+start = apr_pstrdup(r-pool, start);
+pos = end = start + (pos - token) - 1;
+}
+/* skip the quote */
+pos++;
+}
+if (end != pos) {
+*end = *pos;
+}
+end++;
+pos++;
+}
+if (*pos == ''  !*(pos + 1)) {
+/* valid quoted-string */
+token = start;
+*end = '\0';
+}
+else {
+/* invalid quoted-string, continue? fall through?
+ * like ap_get_mime_headers_core() we do not check
+ * headers' names validity, and just fall through,
+ * is there a tiny chance to unset such a header? */
+/*continue;*/
+}
+}
+
+/* strip that header from the response */
+apr_table_unset(h-resp_hdrs, token);
+}
+}
+
 if ((agestr = apr_table_get(h-resp_hdrs, Age))) {
 age_c = apr_atoi64(agestr);
 }


Re: mod_cache with Cache-Control no-cache= or private=

2013-03-06 Thread Graham Leggett
On 06 Mar 2013, at 12:04 PM, Yann Ylavic ylavic@gmail.com wrote:

 I've been working on a patch for mod_cache to deal (fully) with the
 response header Cache-Control and the no-cache=header or private=header
 directives.
 
 I realize that, maybe, the patch should have been included in the
 message, rather than in an attachment, for it to be read quickly.
 So let me reply to myself with the patch below (which is not a big deal)...
 
 Or maybe is there a reason not to include that functionality in
 mod_cache, with most of the code being already there ?
 I could not find any relative discussion in the list nor anywhere
 (about mod_cache, but to say it is not implemented).

I reviewed the patch and it looks sane, but my schedule has been insane, and I 
need some sleep before I commit this.

Should have time over the weekend if not before. Complying with all of the RFC 
is the goal of mod_cache, thank you for contributing this.

Regards,
Graham
--



smime.p7s
Description: S/MIME cryptographic signature


mod_cache with Cache-Control no-cache= or private=

2013-03-04 Thread ylavic dev
Hi,

I've been working on a patch for mod_cache to deal (fully) with the
response header Cache-Control and the no-cache=header or private=header
directives.
This feature is mainly used with the Set-Cookie header, and allows the
origin server to control the caching of that particular header.

Although the code is already there to detect their usage with a header,
mod_cache still handle these directives as if no header was specified.
That is, stale entity causing revalidation [by the origin server].

The RFC-2616 (14.9.1 What is Cacheable) says this about the
no-cache=header directive :

  If the no-cache directive does specify one or more field-names,
  then a cache MAY use the response to satisfy a subsequent request,
  subject to any other restrictions on caching. However, the
  specified field-name(s) MUST NOT be sent in the response to a
  subsequent request without successful revalidation with the origin
  server. This allows an origin server to prevent the re-use of
  certain header fields in a response, while still allowing caching
  of the rest of the response.

For what I understand, mod_cache is allowed to serve its cached entity
(though without the specified header(s)).

mod_cache already provides the configuration directive CacheIgnoreHeaders
to prevent the specified headers from being cached, but this gives no
control to the origin server.

The attached patch works with trunk and 2.4 (I've got one for 2.2 too,
which is the version I use, but I doubt such a feature would be merged into
it...).

Note that the patch also fixes what seems to be an unfortunate replacement
of the tab character (^V) with spaces in the CACHE_SEPARATOR macro
definition.

Thanks for your feedbacks,
regards,
Yann.


cache_util-trunk.patch
Description: Binary data