Re: mod_cache with Cache-Control no-cache= or private=
On Mon, Mar 25, 2013 at 11:58 PM, Roy T. Fielding field...@gbiv.com wrote: On Mar 13, 2013, at 10:20 AM, Graham Leggett wrote: I don't read it that way from the spec, I think it all comes down to the phrase without successful revalidation with the origin server. I read it as without successful revalidation [of the whole request] with the origin server. In other words, the origin server sent the original header, if the origin server doesn't update the header in the 304 response then it means I've had my opportunity to revalidate the entity, current cached value is fine, send it. Roy ultimately would be able to answer this? It means delete the header fields prior to storing them in the cache for later reuse. If the origin had wanted must-revalidate, it would simply use that directive instead. The successful revalidation bit is saying that the cache should forward all of the fields for the response to the original request and for any response that is revalidated (i.e., forward the new fields received in 304), but not for the requests that are entirely handled by the cache. Thank you for clarification, hence mod_cache is allowed to serve the cached response (with respect to the other restrictions), no revalidation is needed (for the CC header fields at least). The following patch implements this behaviour, with the CC header fields not being stored while still played with the origin (validation) response. Regards, Yann. Index: modules/cache/cache_util.c === --- modules/cache/cache_util.c (revision 1461557) +++ modules/cache/cache_util.c (working copy) @@ -542,13 +542,10 @@ } /* These come from the cached entity. */ -if (h-cache_obj-info.control.no_cache -|| h-cache_obj-info.control.no_cache_header -|| h-cache_obj-info.control.private_header) { +if (h-cache_obj-info.control.no_cache) { /* - * The cached entity contained Cache-Control: no-cache, or a - * no-cache with a header present, or a private with a header - * present, so treat as stale causing revalidation. + * The cached entity contained Cache-Control: no-cache, + * so treat as stale causing revalidation. */ return 0; } @@ -915,12 +912,23 @@ return ap_cache_cacheable_headers(r-pool, r-headers_in, r-server); } -/* - * Create a new table consisting of those elements from an output +static int cc_field_doo_doo(void *t, const char *key, const char *val) +{ +if (val) { +apr_table_addn(t, key, val); +return 0; +} +return 1; +} + +/* Create a new table consisting of those elements from an output * headers table that are allowed to be stored in a cache; + * when cc is not NULL, also strip the headers specified by the + * Cache-Control private= or no-cache= directives; * ensure there is a content type and capture any errors. */ -CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_headers_out(request_rec *r) +CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_headers_out(request_rec *r, + const cache_control_t *cc) { apr_table_t *headers_out; @@ -944,6 +952,46 @@ r-content_encoding); } +if (cc (cc-no_cache_header || cc-private_header)) { +char *token; +const char *cc_out = apr_table_get(headers_out, Cache-Control); +while (cc_out (token = ap_get_list_item(r-pool, cc_out))) { +apr_size_t len = strlen(token); + +/* ap_get_list_item() strips the spurious whitespaces and + * lowercases anything (but the quoted-strings) */ +if (len 9 strncmp(token, no-cache=, 9) == 0) { +token += 9; +len -= 9; +} +else if (len 8 strncmp(token, private=, 8) == 0) { +token += 8; +len -= 8; +} +else { +continue; +} + +/* RFC2616 14.9: quoted list of field-names */ +if (len 2 token[0] == '' token[--len] == '') { +/* strip the header(s) from the cacheable headers out, + * but preserve the ones from the current response by + * adding them to the err_headers_out */ +const char *tok, *header; +(++token)[--len] = '\0'; +tok = token; +do { +if ((header = ap_cache_tokstr(r-pool, tok, tok)) +!apr_table_do(cc_field_doo_doo, r-err_headers_out, + headers_out, header, NULL)) { +apr_table_unset(r-headers_out, header); +apr_table_unset(headers_out, header); +} +} while (tok); +} +} +} + return headers_out; } @@ -1069,9 +1117,7 @@
Re: mod_cache with Cache-Control no-cache= or private=
I have already created the bugzilla issue #54706 nearly 2 weeks ago, about mod_cache that may serve cached private= or no-cache= response headers. Should I link something discussion from here or the patch to this issue ? Regards, Yann.
Re: mod_cache with Cache-Control no-cache= or private=
On 27 Mar 2013, at 6:06 PM, Yann Ylavic ylavic@gmail.com wrote: Index: modules/cache/mod_cache.h === --- modules/cache/mod_cache.h (revision 1461557) +++ modules/cache/mod_cache.h (working copy) @@ -152,9 +152,12 @@ /* Create a new table consisting of those elements from an output * headers table that are allowed to be stored in a cache; + * when cc is not NULL, also strip the headers specified by the + * Cache-Control private= or no-cache= directives; * ensure there is a content type and capture any errors. */ -CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_headers_out(request_rec *r); +CACHE_DECLARE(apr_table_t *)ap_cache_cacheable_headers_out(request_rec *r, + const cache_control_t *cc); /** * Parse the Cache-Control and Pragma headers in one go, marking Been snowed under and haven't had a chance to look at this in detail, but one quick thing - we would definitely want to be able to backport this to v2.4 so as to get it into people's hands, and to do that, we cannot change the public APIs. We would need to find a way to do this without changing the API. A further thing is the reparsing on the Cache-Control string, I'd like to see if I can find a way to avoid this, but need to dig as how to do that. Regards, Graham -- smime.p7s Description: S/MIME cryptographic signature
Re: mod_cache with Cache-Control no-cache= or private=
On Wed, Mar 27, 2013 at 5:44 PM, Graham Leggett minf...@sharp.fm wrote: Been snowed under and haven't had a chance to look at this in detail, but one quick thing - we would definitely want to be able to backport this to v2.4 so as to get it into people's hands, and to do that, we cannot change the public APIs. We would need to find a way to do this without changing the API. A further thing is the reparsing on the Cache-Control string, I'd like to see if I can find a way to avoid this, but need to dig as how to do that. In the patch below there is no API change and Cache-Control headers won't be parsed twice, by updating the h-req/resp_hdrs before calling provider-store_headers() and let it use h-req/resp_headers instead of recomputing the whole from r-(err_)headers_out. Regards, Yann. Index: modules/cache/cache_util.c === --- modules/cache/cache_util.c (revision 1461557) +++ modules/cache/cache_util.c (working copy) @@ -542,13 +542,10 @@ } /* These come from the cached entity. */ -if (h-cache_obj-info.control.no_cache -|| h-cache_obj-info.control.no_cache_header -|| h-cache_obj-info.control.private_header) { +if (h-cache_obj-info.control.no_cache) { /* - * The cached entity contained Cache-Control: no-cache, or a - * no-cache with a header present, or a private with a header - * present, so treat as stale causing revalidation. + * The cached entity contained Cache-Control: no-cache, + * so treat as stale causing revalidation. */ return 0; } @@ -1069,9 +1066,7 @@ /* ...then try slowest cases */ else if (!strncasecmp(token, no-cache, 8)) { if (token[8] == '=') { -if (apr_table_get(headers, token + 9)) { -cc-no_cache_header = 1; -} +cc-no_cache_header = 1; } else if (!token[8]) { cc-no_cache = 1; @@ -1146,9 +1141,7 @@ } else if (!strncasecmp(token, private, 7)) { if (token[7] == '=') { -if (apr_table_get(headers, token + 8)) { -cc-private_header = 1; -} +cc-private_header = 1; } else if (!token[7]) { cc-private = 1; Index: modules/cache/mod_cache.c === --- modules/cache/mod_cache.c (revision 1461557) +++ modules/cache/mod_cache.c (working copy) @@ -714,6 +714,65 @@ } /* + * Same as ap_cacheable_headers_out(), but also strips the headers + * specified by the Cache-Control private= or no-cache= directives. + */ +static int cc_field_doo_doo(void *t, const char *key, + const char *val) +{ +if (val) { +apr_table_addn(t, key, val); +return 0; +} +return 1; +} +static apr_table_t *cache_cacheable_headers_cc(request_rec *r, + const cache_control_t *cc) +{ +apr_table_t *headers_out = ap_cache_cacheable_headers_out(r); +if (cc (cc-no_cache_header || cc-private_header)) { +char *token; +const char *cc_out = apr_table_get(headers_out, Cache-Control); +while (cc_out (token = ap_get_list_item(r-pool, cc_out))) { +apr_size_t len = strlen(token); + +/* ap_get_list_item() strips the spurious whitespaces and + * lowercases anything (but the quoted-strings) */ +if (len 9 strncmp(token, no-cache=, 9) == 0) { +token += 9; +len -= 9; +} +else if (len 8 strncmp(token, private=, 8) == 0) { +token += 8; +len -= 8; +} +else { +continue; +} + +/* RFC2616 14.9: quoted list of field-names */ +if (len 2 token[0] == '' token[--len] == '') { +/* strip the header(s) from the cacheable headers out, + * but preserve the ones from the current response by + * adding them to the err_headers_out */ +const char *tok, *header; +(++token)[--len] = '\0'; +tok = token; +do { +if ((header = ap_cache_tokstr(r-pool, tok, tok)) +!apr_table_do(cc_field_doo_doo, r-err_headers_out, + headers_out, header, NULL)) { +apr_table_unset(r-headers_out, header); +apr_table_unset(headers_out, header); +} +} while (tok); +} +} +}
Re: mod_cache with Cache-Control no-cache= or private=
In fact this patch is probably better since it does not change h-resp_hdrs before calling cache_accept_headers() which uses them. Regars, Yann. Index: modules/cache/cache_util.c === --- modules/cache/cache_util.c (revision 1461557) +++ modules/cache/cache_util.c (working copy) @@ -542,13 +542,10 @@ } /* These come from the cached entity. */ -if (h-cache_obj-info.control.no_cache -|| h-cache_obj-info.control.no_cache_header -|| h-cache_obj-info.control.private_header) { +if (h-cache_obj-info.control.no_cache) { /* - * The cached entity contained Cache-Control: no-cache, or a - * no-cache with a header present, or a private with a header - * present, so treat as stale causing revalidation. + * The cached entity contained Cache-Control: no-cache, + * so treat as stale causing revalidation. */ return 0; } @@ -1069,9 +1066,7 @@ /* ...then try slowest cases */ else if (!strncasecmp(token, no-cache, 8)) { if (token[8] == '=') { -if (apr_table_get(headers, token + 9)) { -cc-no_cache_header = 1; -} +cc-no_cache_header = 1; } else if (!token[8]) { cc-no_cache = 1; @@ -1146,9 +1141,7 @@ } else if (!strncasecmp(token, private, 7)) { if (token[7] == '=') { -if (apr_table_get(headers, token + 8)) { -cc-private_header = 1; -} +cc-private_header = 1; } else if (!token[7]) { cc-private = 1; Index: modules/cache/mod_cache.c === --- modules/cache/mod_cache.c (revision 1461557) +++ modules/cache/mod_cache.c (working copy) @@ -714,6 +714,65 @@ } /* + * Same as ap_cacheable_headers_out(), but also strips the headers + * specified by the Cache-Control private= or no-cache= directives. + */ +static int cc_field_doo_doo(void *t, const char *key, + const char *val) +{ +if (val) { +apr_table_addn(t, key, val); +return 0; +} +return 1; +} +static apr_table_t *cache_cacheable_headers_cc(request_rec *r, + const cache_control_t *cc) +{ +apr_table_t *headers_out = ap_cache_cacheable_headers_out(r); +if (cc (cc-no_cache_header || cc-private_header)) { +char *token; +const char *cc_out = apr_table_get(headers_out, Cache-Control); +while (cc_out (token = ap_get_list_item(r-pool, cc_out))) { +apr_size_t len = strlen(token); + +/* ap_get_list_item() strips the spurious whitespaces and + * lowercases anything (but the quoted-strings) */ +if (len 9 strncmp(token, no-cache=, 9) == 0) { +token += 9; +len -= 9; +} +else if (len 8 strncmp(token, private=, 8) == 0) { +token += 8; +len -= 8; +} +else { +continue; +} + +/* RFC2616 14.9: quoted list of field-names */ +if (len 2 token[0] == '' token[--len] == '') { +/* strip the header(s) from the cacheable headers out, + * but preserve the ones from the current response by + * adding them to the err_headers_out */ +const char *tok, *header; +(++token)[--len] = '\0'; +tok = token; +do { +if ((header = ap_cache_tokstr(r-pool, tok, tok)) +!apr_table_do(cc_field_doo_doo, r-err_headers_out, + headers_out, header, NULL)) { +apr_table_unset(r-headers_out, header); +apr_table_unset(headers_out, header); +} +} while (tok); +} +} +} +return headers_out; +} + +/* * CACHE_SAVE filter * --- * @@ -746,6 +805,7 @@ apr_time_t exp, date, lastmod, now; apr_off_t size = -1; cache_info *info = NULL; +apr_table_t *cc_headers; char *reason; apr_pool_t *p; apr_bucket *e; @@ -1075,7 +1135,7 @@ * err_headers_out and we also need to strip any hop-by-hop * headers that might have snuck in. */ -r-headers_out = ap_cache_cacheable_headers_out(r); +r-headers_out = cache_cacheable_headers_cc(r, control); /* Merge in our cached headers. However, keep any updated values.
Re: mod_cache with Cache-Control no-cache= or private=
Sorry for my precipitation, the Content-Type is stripped from the validated (stale) headers with the previous patch, we have to do a copy like below. Regards, Yann. Index: modules/cache/cache_util.c === --- modules/cache/cache_util.c (revision 1461557) +++ modules/cache/cache_util.c (working copy) @@ -542,13 +542,10 @@ } /* These come from the cached entity. */ -if (h-cache_obj-info.control.no_cache -|| h-cache_obj-info.control.no_cache_header -|| h-cache_obj-info.control.private_header) { +if (h-cache_obj-info.control.no_cache) { /* - * The cached entity contained Cache-Control: no-cache, or a - * no-cache with a header present, or a private with a header - * present, so treat as stale causing revalidation. + * The cached entity contained Cache-Control: no-cache, + * so treat as stale causing revalidation. */ return 0; } @@ -1069,9 +1066,7 @@ /* ...then try slowest cases */ else if (!strncasecmp(token, no-cache, 8)) { if (token[8] == '=') { -if (apr_table_get(headers, token + 9)) { -cc-no_cache_header = 1; -} +cc-no_cache_header = 1; } else if (!token[8]) { cc-no_cache = 1; @@ -1146,9 +1141,7 @@ } else if (!strncasecmp(token, private, 7)) { if (token[7] == '=') { -if (apr_table_get(headers, token + 8)) { -cc-private_header = 1; -} +cc-private_header = 1; } else if (!token[7]) { cc-private = 1; Index: modules/cache/mod_cache.c === --- modules/cache/mod_cache.c (revision 1461557) +++ modules/cache/mod_cache.c (working copy) @@ -714,6 +714,65 @@ } /* + * Same as ap_cacheable_headers_out(), but also strips the headers + * specified by the Cache-Control private= or no-cache= directives. + */ +static int cc_field_doo_doo(void *t, const char *key, + const char *val) +{ +if (val) { +apr_table_addn(t, key, val); +return 0; +} +return 1; +} +static apr_table_t *cache_cacheable_headers_cc(request_rec *r, + const cache_control_t *cc) +{ +apr_table_t *headers_out = ap_cache_cacheable_headers_out(r); +if (cc (cc-no_cache_header || cc-private_header)) { +char *token; +const char *cc_out = apr_table_get(headers_out, Cache-Control); +while (cc_out (token = ap_get_list_item(r-pool, cc_out))) { +apr_size_t len = strlen(token); + +/* ap_get_list_item() strips the spurious whitespaces and + * lowercases anything (but the quoted-strings) */ +if (len 9 strncmp(token, no-cache=, 9) == 0) { +token += 9; +len -= 9; +} +else if (len 8 strncmp(token, private=, 8) == 0) { +token += 8; +len -= 8; +} +else { +continue; +} + +/* RFC2616 14.9: quoted list of field-names */ +if (len 2 token[0] == '' token[--len] == '') { +/* strip the header(s) from the cacheable headers out, + * but preserve the ones from the current response by + * adding them to the err_headers_out */ +const char *tok, *header; +(++token)[--len] = '\0'; +tok = token; +do { +if ((header = ap_cache_tokstr(r-pool, tok, tok)) +!apr_table_do(cc_field_doo_doo, r-err_headers_out, + headers_out, header, NULL)) { +apr_table_unset(r-headers_out, header); +apr_table_unset(headers_out, header); +} +} while (tok); +} +} +} +return headers_out; +} + +/* * CACHE_SAVE filter * --- * @@ -746,6 +805,7 @@ apr_time_t exp, date, lastmod, now; apr_off_t size = -1; cache_info *info = NULL; +apr_table_t *cc_headers; char *reason; apr_pool_t *p; apr_bucket *e; @@ -1075,7 +1135,7 @@ * err_headers_out and we also need to strip any hop-by-hop * headers that might have snuck in. */ -r-headers_out = ap_cache_cacheable_headers_out(r); +r-headers_out = cache_cacheable_headers_cc(r, control); /* Merge in our cached headers. However, keep
Re: mod_cache with Cache-Control no-cache= or private=
The latest patch is attached to bugzilla #54706. Regards, Yann.
Re: mod_cache with Cache-Control no-cache= or private=
On Mar 13, 2013, at 10:20 AM, Graham Leggett wrote: On 11 Mar 2013, at 12:50 PM, Yann Ylavic ylavic@gmail.com wrote: The way I read the spec, the specified field-name(s) MUST NOT be sent in the response to a subsequent request without successful revalidation with the origin server. What this means is that if the specified field names are found to be present in the cached response, then the origin server needs to be given the opportunity to update these fields through a conditional request. In the current cache code, we return 0 meaning this is stale, revalidate, and a conditional request is sent to the origin. We hope the origin sends 304 Not Modified, with updated headers corresponding to the fields. Ok, I see your point, and this is surely the right reading of the rfc, but there is necessarily a difference between no-cache and no-cache=header(s), particularly with the handling of that (stale) header(s). For what I understand now, if the origin does not send (one of) the header(s) in its 304 response, the stale header(s) MUST NOT be served. I don't read it that way from the spec, I think it all comes down to the phrase without successful revalidation with the origin server. I read it as without successful revalidation [of the whole request] with the origin server. In other words, the origin server sent the original header, if the origin server doesn't update the header in the 304 response then it means I've had my opportunity to revalidate the entity, current cached value is fine, send it. Roy ultimately would be able to answer this? It means delete the header fields prior to storing them in the cache for later reuse. If the origin had wanted must-revalidate, it would simply use that directive instead. The successful revalidation bit is saying that the cache should forward all of the fields for the response to the original request and for any response that is revalidated (i.e., forward the new fields received in 304), but not for the requests that are entirely handled by the cache. Roy
Re: mod_cache with Cache-Control no-cache= or private=
Here is the patch that strips the no-cache= or private= specified headers after the origin server's validation, leaving the only headers updated by the origin. Regards, Yann. Index: modules/cache/cache_storage.c === --- modules/cache/cache_storage.c (revision 1456050) +++ modules/cache/cache_storage.c (working copy) @@ -156,6 +156,51 @@ apr_table_unset(h-resp_hdrs, Last-Modified); } +v = apr_table_get(h-resp_hdrs, Cache-Control); +if (v (h-cache_obj-info.control.no_cache_header || + h-cache_obj-info.control.private_header)) { +/* + * RFC2616 14.9.1: If the no-cache directive does specify one or more + * field-names, then a cache MAY use the response to satisfy a + * subsequent request, subject to any other restrictions on caching. + * However, the specified field-name(s) MUST NOT be sent in the + * response to a subsequent request without successful revalidation + * with the origin server. + * + * Hence we will strip these cached headers (if any) and let the only + * ones validated by the origin server. + */ +char *token; +apr_size_t len; +while ((token = ap_get_list_item(r-pool, v))) { +/* ap_get_list_item() strips the spurious whitespaces and + * lowercases anything (but the quoted-strings) */ +if (strncmp(token, no-cache=, 9) == 0) { +token += 9; +} +else if (strncmp(token, private=, 8) == 0) { +token += 8; +} +else { +continue; +} + +/* RFC2616 14.9: quoted list of field-names */ +len = strlen(token); +if (token[0] == '' token[--len] == '') { +(++token)[--len] = '\0'; +do { +const char *name = ap_cache_tokstr(r-pool, token, + (const char**)token); +if (name) { +/* strip that name header the response */ +apr_table_unset(h-resp_hdrs, name); +} +} while (token); +} +} +} + /* The HTTP specification says that it is legal to merge duplicate * headers into one. Some browsers that support Cookies don't like * merged headers and prefer that each Set-Cookie header is sent
Re: mod_cache with Cache-Control no-cache= or private=
On 11 Mar 2013, at 12:50 PM, Yann Ylavic ylavic@gmail.com wrote: The way I read the spec, the specified field-name(s) MUST NOT be sent in the response to a subsequent request without successful revalidation with the origin server. What this means is that if the specified field names are found to be present in the cached response, then the origin server needs to be given the opportunity to update these fields through a conditional request. In the current cache code, we return 0 meaning this is stale, revalidate, and a conditional request is sent to the origin. We hope the origin sends 304 Not Modified, with updated headers corresponding to the fields. Ok, I see your point, and this is surely the right reading of the rfc, but there is necessarily a difference between no-cache and no-cache=header(s), particularly with the handling of that (stale) header(s). For what I understand now, if the origin does not send (one of) the header(s) in its 304 response, the stale header(s) MUST NOT be served. I don't read it that way from the spec, I think it all comes down to the phrase without successful revalidation with the origin server. I read it as without successful revalidation [of the whole request] with the origin server. In other words, the origin server sent the original header, if the origin server doesn't update the header in the 304 response then it means I've had my opportunity to revalidate the entity, current cached value is fine, send it. Roy ultimately would be able to answer this? Regards, Graham -- smime.p7s Description: S/MIME cryptographic signature
Re: mod_cache with Cache-Control no-cache= or private=
On Wed, Mar 13, 2013 at 6:20 PM, Graham Leggett minf...@sharp.fm wrote: On 11 Mar 2013, at 12:50 PM, Yann Ylavic ylavic@gmail.com wrote: The way I read the spec, the specified field-name(s) MUST NOT be sent in the response to a subsequent request without successful revalidation with the origin server. What this means is that if the specified field names are found to be present in the cached response, then the origin server needs to be given the opportunity to update these fields through a conditional request. In the current cache code, we return 0 meaning this is stale, revalidate, and a conditional request is sent to the origin. We hope the origin sends 304 Not Modified, with updated headers corresponding to the fields. Ok, I see your point, and this is surely the right reading of the rfc, but there is necessarily a difference between no-cache and no-cache=header(s), particularly with the handling of that (stale) header(s). For what I understand now, if the origin does not send (one of) the header(s) in its 304 response, the stale header(s) MUST NOT be served. I don't read it that way from the spec, I think it all comes down to the phrase without successful revalidation with the origin server. I read it as without successful revalidation [of the whole request] with the origin server. In other words, the origin server sent the original header, if the origin server doesn't update the header in the 304 response then it means I've had my opportunity to revalidate the entity, current cached value is fine, send it. How would the origin invalidate a Set-Cookie, with an empty one ? Regards, Yann.
Re: mod_cache with Cache-Control no-cache= or private=
On 13 Mar 2013, at 7:27 PM, Yann Ylavic ylavic@gmail.com wrote: How would the origin invalidate a Set-Cookie, with an empty one ? I would imagine with a 200 OK. Roy would be able to confirm. Regards, Graham -- smime.p7s Description: S/MIME cryptographic signature
Re: mod_cache with Cache-Control no-cache= or private=
On Wed, Mar 13, 2013 at 5:27 PM, Yann Ylavic ylavic@gmail.com wrote: How would the origin invalidate a Set-Cookie, with an empty one ? Regards, Yann. Set it again, with an in the past expiry date. Cheers Tom
Re: mod_cache with Cache-Control no-cache= or private=
On Wed, Mar 13, 2013 at 6:30 PM, Graham Leggett minf...@sharp.fm wrote: On 13 Mar 2013, at 7:27 PM, Yann Ylavic ylavic@gmail.com wrote: How would the origin invalidate a Set-Cookie, with an empty one ? I would imagine with a 200 OK. Roy would be able to confirm. Well, I can't see the difference with the no-cache without a header specified (the actual code) then... Regards, Yann.
Re: mod_cache with Cache-Control no-cache= or private=
On Wed, Mar 13, 2013 at 6:35 PM, Tom Evans tevans...@googlemail.com wrote: On Wed, Mar 13, 2013 at 5:27 PM, Yann Ylavic ylavic@gmail.com wrote: How would the origin invalidate a Set-Cookie, with an empty one ? Regards, Yann. Set it again, with an in the past expiry date. Well, that's not exactly the same thing, the user may have a valid Cookie (which is not the one cached) the origin wants to keep going on. I meant invalidating the cached cookie, not the one of the user. Cheers, Yann.
Re: mod_cache with Cache-Control no-cache= or private=
On 13 Mar 2013, at 17:41, Yann Ylavic ylavic@gmail.com wrote: On Wed, Mar 13, 2013 at 6:35 PM, Tom Evans tevans...@googlemail.com wrote: On Wed, Mar 13, 2013 at 5:27 PM, Yann Ylavic ylavic@gmail.com wrote: How would the origin invalidate a Set-Cookie, with an empty one ? Regards, Yann. Set it again, with an in the past expiry date. Well, that's not exactly the same thing, the user may have a valid Cookie (which is not the one cached) the origin wants to keep going on. I meant invalidating the cached cookie, not the one of the user. Is this the situation you're worried about: ClientA: GET /foo HTTP/1.1 ClientA: Host: … ReverseProxy: GET /foo HTTP/1.1 ReverseProxy: Host: … Origin: HTTP/1.1 200 OK Origin: Date: … Origin: Set-Cookie: data=AA Origin: Cache-Control: private=Set-Cookie ReverseProxy: HTTP/1.1 200 OK ReverseProxy: Date: … ReverseProxy: Set-Cookie: data=AA ReverseProxy: Cache-Control: private=Set-Cookie ClientB: GET /foo HTTP/1.1 ClientB: Host: … ClientB: Cookie: data=BB ReverseProxy: GET /foo HTTP/1.1 ReverseProxy: Host: … ReverseProxy: Cookie: data=BBB Origin: HTTP/1.1 304 Not Modified Origin: Date: … Origin: Cache-Control: private=Set-Cookie This should just work. The final reply from the cacheing reverse proxy should look like this: ReverseProxy: HTTP/1.1 304 Not Modified ReverseProxy: Date: … and the Set-Cookie: header from the stored request would not be used (in fact, the proxy may have elected not to store it). The origin doesn't have to mention that header in the 304 response. -- Tim Bannister – is...@jellybaby.net
Re: mod_cache with Cache-Control no-cache= or private=
On Wed, Mar 13, 2013 at 9:28 PM, Tim Bannister is...@jellybaby.net wrote: Is this the situation you're worried about: ClientA: GET /foo HTTP/1.1 ReverseProxy: GET /foo HTTP/1.1 Origin: HTTP/1.1 200 OK Origin: Set-Cookie: data=AA Origin: Cache-Control: private=Set-Cookie ReverseProxy: Set-Cookie: data=AA ReverseProxy: Cache-Control: private=Set-Cookie ClientB: GET /foo HTTP/1.1 ClientB: Cookie: data=BB ReverseProxy: GET /foo HTTP/1.1 ReverseProxy: Cookie: data=BBB Origin: HTTP/1.1 304 Not Modified Yes, about what happens now, the ReverseProxy (mod_cache) must not Set-Cookie: data=AA to ClientB. This should just work. The final reply from the cacheing reverse proxy should look like this: ReverseProxy: HTTP/1.1 304 Not Modified ReverseProxy: Date: … This actually does not work, mod_cache will serve the cached Set-Cookie. The CacheIgnoreHeaders directive only can prevent this (not controlled by the origin). The final reply to ClientB, whose request is not conditional, can also be : ReverseProxy: HTTP/1.1 200 OK ReverseProxy: Cache-Control: private=Set-Cookie ReverseProxy: cached body That's the main goal I guess (limit backend hits for large things). and the Set-Cookie: header from the stored request would not be used (in fact, the proxy may have elected not to store it). The origin doesn't have to mention that header in the 304 response. In mod_cache the no-store of a particular header is harder to patch than the no-cache (ie. do not serve without revalidation), but indeed the former would be more efficient, no need to sanitize each served response. For the private=, the rfc does not say more than its BNF... If private has the same semantic as without the =, the header should not be stored (the Cache-Control: private response is not stored by mod_cache). In all cases, IMHO, no cached Set-Cookie should aver played by a cache with private/no-cache=Set-Cookie associated with the resource.
Re: mod_cache with Cache-Control no-cache= or private=
On Sun, Mar 10, 2013 at 1:55 AM, Graham Leggett minf...@sharp.fm wrote: On 04 Mar 2013, at 8:22 PM, ylavic dev ylavic@gmail.com wrote: For what I understand, mod_cache is allowed to serve its cached entity (though without the specified header(s)). I read this through again, this time having slept properly. Thank you for your enlightened consideration. The way I read the spec, the specified field-name(s) MUST NOT be sent in the response to a subsequent request without successful revalidation with the origin server. What this means is that if the specified field names are found to be present in the cached response, then the origin server needs to be given the opportunity to update these fields through a conditional request. In the current cache code, we return 0 meaning this is stale, revalidate, and a conditional request is sent to the origin. We hope the origin sends 304 Not Modified, with updated headers corresponding to the fields. Ok, I see your point, and this is surely the right reading of the rfc, but there is necessarily a difference between no-cache and no-cache=header(s), particularly with the handling of that (stale) header(s). For what I understand now, if the origin does not send (one of) the header(s) in its 304 response, the stale header(s) MUST NOT be served. So mod_cache should never send these stale headers to the client, and either do not cache them at all, or strip them before overlaping the 304's ones. The actual code does not comply with this requirement since it overlaps the stale headers with the origin ones, hence the no-cache headers will still be there if there are not specified in the 304 response. If we were to follow this patch, it means that the first time we hit the URL, the client sees the private/no-cache fields, but every cached response after will be treated as fresh with the field missing. This breaks caching. Indeed this patch is broken, but I can modify it to comply with my comment above, meaning (at first glance) that the treatment should be in cache_accept_headers(). Should I propose the new patch or my understanding is definitively broken ? Regards, Yann.
Re: mod_cache with Cache-Control no-cache= or private=
On 04 Mar 2013, at 8:22 PM, ylavic dev ylavic@gmail.com wrote: I've been working on a patch for mod_cache to deal (fully) with the response header Cache-Control and the no-cache=header or private=header directives. This feature is mainly used with the Set-Cookie header, and allows the origin server to control the caching of that particular header. Although the code is already there to detect their usage with a header, mod_cache still handle these directives as if no header was specified. That is, stale entity causing revalidation [by the origin server]. The RFC-2616 (14.9.1 What is Cacheable) says this about the no-cache=header directive : If the no-cache directive does specify one or more field-names, then a cache MAY use the response to satisfy a subsequent request, subject to any other restrictions on caching. However, the specified field-name(s) MUST NOT be sent in the response to a subsequent request without successful revalidation with the origin server. This allows an origin server to prevent the re-use of certain header fields in a response, while still allowing caching of the rest of the response. For what I understand, mod_cache is allowed to serve its cached entity (though without the specified header(s)). I read this through again, this time having slept properly. The way I read the spec, the specified field-name(s) MUST NOT be sent in the response to a subsequent request without successful revalidation with the origin server. What this means is that if the specified field names are found to be present in the cached response, then the origin server needs to be given the opportunity to update these fields through a conditional request. In the current cache code, we return 0 meaning this is stale, revalidate, and a conditional request is sent to the origin. We hope the origin sends 304 Not Modified, with updated headers corresponding to the fields. If we were to follow this patch, it means that the first time we hit the URL, the client sees the private/no-cache fields, but every cached response after will be treated as fresh with the field missing. This breaks caching. What you're trying to achieve needs to be handled by your origin server, which should support conditional requests, and then send updated Set-Cookie headers along with the 304 Not Modified responses. This way the body stays cached, but your Set-Cookie is updated on every hit. Regards, Graham -- smime.p7s Description: S/MIME cryptographic signature
Re: mod_cache with Cache-Control no-cache= or private=
Hi, On Mon, Mar 4, 2013 at 7:22 PM, ylavic dev ylavic@gmail.com wrote: I've been working on a patch for mod_cache to deal (fully) with the response header Cache-Control and the no-cache=header or private=header directives. I realize that, maybe, the patch should have been included in the message, rather than in an attachment, for it to be read quickly. So let me reply to myself with the patch below (which is not a big deal)... Or maybe is there a reason not to include that functionality in mod_cache, with most of the code being already there ? I could not find any relative discussion in the list nor anywhere (about mod_cache, but to say it is not implemented). Regards, Yann. Index: modules/cache/cache_util.c === --- modules/cache/cache_util.c (revision 1451191) +++ modules/cache/cache_util.c (working copy) @@ -27,7 +27,7 @@ extern module AP_MODULE_DECLARE_DATA cache_module; -#define CACHE_SEPARATOR , +#define CACHE_SEPARATOR , \t /* Determine if url matches the hostname, scheme and port and path * in filter. All but the path comparisons are case-insensitive. @@ -542,17 +542,84 @@ } /* These come from the cached entity. */ -if (h-cache_obj-info.control.no_cache -|| h-cache_obj-info.control.no_cache_header -|| h-cache_obj-info.control.private_header) { +if (h-cache_obj-info.control.no_cache) { /* - * The cached entity contained Cache-Control: no-cache, or a - * no-cache with a header present, or a private with a header - * present, so treat as stale causing revalidation. + * The cached entity contained Cache-Control: no-cache, so + * treat as stale causing revalidation. */ return 0; } +if (h-cache_obj-info.control.no_cache_header +|| h-cache_obj-info.control.private_header) { +/* + * RFC2616 14.9.1: The cached entity contained + * Cache-Control: no-cache=, or Cache-Control: private=, with + * a header present, hence we are allowed to serve this entity, + * but without the specified headers, so let's strip them now, + * and fall through the other restrictions. + * + * Here we assume mixed Cache-Control: no-cache and no-cache= + * have been caught above and treated as stale causing revalidation, + * leaving here the only no-cache= and/or private= with a header. + */ +char *token; +const char *header = apr_table_get(h-resp_hdrs, Cache-Control); +while (header (token = ap_get_list_item(r-pool, header))) { +/* ap_get_list_item() strips the spurious whitespaces and + * lowercases anything (but the quoted-strings) */ +if (strncmp(token, no-cache=, 9) == 0) { +token += 9; +} +else if (strncmp(token, private=, 8) == 0) { +token += 8; +} +else { +continue; +} +if (*token == '') { +/* RFC2616 2.2: quoted-string + * found no ap_*() function to unquote those strings, + * so the job is done here... */ +char *pos, *start, *end; +pos = start = end = token + 1; +while (*pos *pos != '') { +if (*pos == '\\') { +/* RFC2616 2.2: quoted-pair */ +if (end == pos) { +/* duplicate to preserve the original token + * should the quoted-string be invalid */ +start = apr_pstrdup(r-pool, start); +pos = end = start + (pos - token) - 1; +} +/* skip the quote */ +pos++; +} +if (end != pos) { +*end = *pos; +} +end++; +pos++; +} +if (*pos == '' !*(pos + 1)) { +/* valid quoted-string */ +token = start; +*end = '\0'; +} +else { +/* invalid quoted-string, continue? fall through? + * like ap_get_mime_headers_core() we do not check + * headers' names validity, and just fall through, + * is there a tiny chance to unset such a header? */ +/*continue;*/ +} +} + +/* strip that header from the response */ +apr_table_unset(h-resp_hdrs, token); +} +} + if ((agestr = apr_table_get(h-resp_hdrs, Age))) { age_c = apr_atoi64(agestr); }
Re: mod_cache with Cache-Control no-cache= or private=
On 06 Mar 2013, at 12:04 PM, Yann Ylavic ylavic@gmail.com wrote: I've been working on a patch for mod_cache to deal (fully) with the response header Cache-Control and the no-cache=header or private=header directives. I realize that, maybe, the patch should have been included in the message, rather than in an attachment, for it to be read quickly. So let me reply to myself with the patch below (which is not a big deal)... Or maybe is there a reason not to include that functionality in mod_cache, with most of the code being already there ? I could not find any relative discussion in the list nor anywhere (about mod_cache, but to say it is not implemented). I reviewed the patch and it looks sane, but my schedule has been insane, and I need some sleep before I commit this. Should have time over the weekend if not before. Complying with all of the RFC is the goal of mod_cache, thank you for contributing this. Regards, Graham -- smime.p7s Description: S/MIME cryptographic signature
mod_cache with Cache-Control no-cache= or private=
Hi, I've been working on a patch for mod_cache to deal (fully) with the response header Cache-Control and the no-cache=header or private=header directives. This feature is mainly used with the Set-Cookie header, and allows the origin server to control the caching of that particular header. Although the code is already there to detect their usage with a header, mod_cache still handle these directives as if no header was specified. That is, stale entity causing revalidation [by the origin server]. The RFC-2616 (14.9.1 What is Cacheable) says this about the no-cache=header directive : If the no-cache directive does specify one or more field-names, then a cache MAY use the response to satisfy a subsequent request, subject to any other restrictions on caching. However, the specified field-name(s) MUST NOT be sent in the response to a subsequent request without successful revalidation with the origin server. This allows an origin server to prevent the re-use of certain header fields in a response, while still allowing caching of the rest of the response. For what I understand, mod_cache is allowed to serve its cached entity (though without the specified header(s)). mod_cache already provides the configuration directive CacheIgnoreHeaders to prevent the specified headers from being cached, but this gives no control to the origin server. The attached patch works with trunk and 2.4 (I've got one for 2.2 too, which is the version I use, but I doubt such a feature would be merged into it...). Note that the patch also fixes what seems to be an unfortunate replacement of the tab character (^V) with spaces in the CACHE_SEPARATOR macro definition. Thanks for your feedbacks, regards, Yann. cache_util-trunk.patch Description: Binary data