Leif Hedstrom created TS-2245:
---------------------------------

             Summary: Fix the semantics and behavior of e.g. proxy.config.http.cache.ignore_accept_encoding_mismatch
                 Key: TS-2245
                 URL: https://issues.apache.org/jira/browse/TS-2245
             Project: Traffic Server
          Issue Type: Bug
          Components: HTTP
            Reporter: Leif Hedstrom


These four configuration options were added to fix a real problem (content 
duplication in cache): 

{code}
  {RECT_CONFIG, "proxy.config.http.cache.ignore_accept_mismatch", RECD_INT, "0", RECU_DYNAMIC, RR_NULL, RECC_INT, "[0-1]", RECA_NULL}
  ,
  {RECT_CONFIG, "proxy.config.http.cache.ignore_accept_language_mismatch", RECD_INT, "0", RECU_DYNAMIC, RR_NULL, RECC_INT, "[0-1]", RECA_NULL}
  ,
  {RECT_CONFIG, "proxy.config.http.cache.ignore_accept_encoding_mismatch", RECD_INT, "0", RECU_DYNAMIC, RR_NULL, RECC_INT, "[0-1]", RECA_NULL}
  ,
  {RECT_CONFIG, "proxy.config.http.cache.ignore_accept_charset_mismatch", RECD_INT, "0", RECU_DYNAMIC, RR_NULL, RECC_INT, "[0-1]", RECA_NULL}
  ,
{code}

However, as implemented, they are pretty much useless, and if enabled, they 
carry a high risk of serving the wrong content. To make things worse, they are 
global configurations, since they are not passable from the HTTPSM into the cache.

I've examined the code thoroughly, and I actually think these configurations had 
the right intentions, but were just implemented wrong. What they really ought to 
have been is e.g. proxy.config.http.cache.relax_accept_encoding_match .


What *should* happen (IMO) is that these four configs (ideally we'd rename them 
or make new ones) would check whether there is a Vary: header in the cached entry. 
IF there is no Vary: header, *AND* one of these settings is set, we skip the 
matching between the cached client header and the incoming client header 
entirely (give the match a score of 1.0). These configs should ideally also be 
per-remap overridable, but that requires code changes like TS-1919.
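
To make the proposed rule concrete, here is a minimal sketch for the Accept-Encoding case. None of these names exist in Traffic Server: CachedAlternate, strict_accept_encoding_score and the relax_accept_encoding_match flag are hypothetical stand-ins, and the "strict" score is reduced to an exact comparison purely for illustration (the real quality calculation is more involved).

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for a cached entry; not an actual Traffic Server type.
struct CachedAlternate {
  bool has_vary_header;                       // did the cached response carry a Vary: header?
  std::optional<std::string> accept_encoding; // Accept-Encoding of the request that filled the cache
};

// Placeholder for the existing header matching (assumption: exact match only).
double strict_accept_encoding_score(const std::optional<std::string> &cached,
                                    const std::optional<std::string> &incoming)
{
  return cached == incoming ? 1.0 : 0.0;
}

// Proposed semantics: with no Vary: header on the cached entry AND the relax
// setting enabled, skip the header matching entirely and score the match 1.0.
double accept_encoding_score(const CachedAlternate &alt,
                             const std::optional<std::string> &incoming,
                             bool relax_accept_encoding_match)
{
  if (relax_accept_encoding_match && !alt.has_vary_header) {
    return 1.0;
  }
  return strict_accept_encoding_score(alt.accept_encoding, incoming);
}
```

Note that an entry cached with a Vary: header still goes through the normal matching even when the setting is on, which is what keeps the relaxed mode from serving wrong content.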

A real use case is this: assume some content is always served by the origin 
without a Content-Encoding: or Vary: header. This would be typical for e.g. a 
PNG (image).

Upon cache miss, if the first request comes with Accept-Encoding: gzip, 
everything is fine, and we serve this cached item to all clients thereafter. 
However, if the first request comes with no Accept-Encoding: header whatsoever, 
the cached response cannot satisfy a later request with Accept-Encoding: gzip, 
so we get *at least* two copies of the same object in cache.
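
The scenario above can be walked through with a toy model. This is only a sketch: ToyCache and copies_after are hypothetical, and the strict matching is reduced to exactly the two cases described above (a copy stored for a request with Accept-Encoding satisfies everyone; a copy stored without it satisfies only requests without it). It assumes a single URL whose origin response never has a Vary: header.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

using Header = std::optional<std::string>; // empty means no Accept-Encoding: header

// Hypothetical toy cache for one URL; not Traffic Server code.
struct ToyCache {
  bool relax_accept_encoding_match = false;
  std::vector<Header> alternates; // Accept-Encoding of the request that stored each copy

  // Simplified strict matching, following the two cases described above.
  static bool strict_match(const Header &stored, const Header &incoming)
  {
    return stored.has_value() || !incoming.has_value();
  }

  // Returns true on a hit; on a miss, stores another copy of the object.
  bool lookup_or_store(const Header &incoming)
  {
    for (const Header &stored : alternates) {
      if (relax_accept_encoding_match || strict_match(stored, incoming)) {
        return true;
      }
    }
    alternates.push_back(incoming);
    return false;
  }
};

// Number of cached copies after replaying the given requests in order.
std::size_t copies_after(bool relax, const std::vector<Header> &requests)
{
  ToyCache cache;
  cache.relax_accept_encoding_match = relax;
  for (const Header &request : requests) {
    cache.lookup_or_store(request);
  }
  return cache.alternates.size();
}
```

Replaying "no Accept-Encoding, then gzip" with relax off leaves two copies in this model; the same sequence with relax on, or the reverse order with relax off, leaves one.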

I'm curious to get some input on this, and let me know if the explanation 
makes no sense. :)

