On Thu, 26 Feb 2004, Jon Kay wrote:

> So far, there is one difficult and thought-provoking question about
> this design: object storage format. Should it be stored encoded once
> an encoding is done (using Vary headers to figure out circumstances,
> as in the spec)? Should it stay decoded like today, and always be
> reencoded during transfer? Or should both formats be available once
> created?

My vote is to have both available once created, with proper Vary and
ETag type headers, plus some internal information connecting the two
to allow proper cache refreshes etc. Note that Vary is not entirely
suitable here; you want something that more closely represents your
decision logic for choosing an encoding.

While the object is read, have it cached as usual. Then, when you
recode the object with another encoding, stream the resulting object
back to the cache as another object.

> Now, storing both encoded and decoded does present the challenge of
> linking for synchronization purposes all encodings of a particular
> object.

Indeed.

> Joe says you guys had already done something like that in
> implementing Vary functionality.

Not exactly. Different negotiation results produce different entities,
each with its own life. Neither Vary nor ETag provides any such
connection between the different entities of the same URI. The closest
thing they have is invalidation of the variant index when a change in
what the variance is based on is detected.

> Are there ideas / code along these lines that I can glom onto?

When implementing caching of content recodings (and of cached
transfer-encodings, if considered), the criteria are different from
normal caching, and everything must be synched with the original
object.

I would propose to solve this by storing enough information in the
recoded object to verify, on cache hits, that it matches the original
object. You do not need to automatically purge or redo encodings when
the original changes, but you must make sure this is done, at the
latest, before giving it out as a cache hit.

Please note that messing with Content-Encoding in a proxy is somewhat
outside the HTTP specifications, and some care is needed to do it
correctly, or you risk causing major headaches for the HTTP/1.1
content negotiation and equality criteria, and even risk causing
object corruption at the clients.

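To make the verify-on-hit proposal above concrete, here is a minimal sketch. It is in Python rather than Squid's C, and every name in it (RecodingCache, RecodedObject, source_etag, get_gzip) is made up for illustration and is not Squid code. It assumes the original's ETag is a good enough validator to record in the recoded object; a real implementation would likely record more (Last-Modified, size, and so on).

```python
import gzip
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class CachedObject:
    body: bytes
    etag: str  # validator assigned by the origin server


@dataclass
class RecodedObject:
    body: bytes
    encoding: str
    source_etag: str  # validator of the original this recoding was made from


class RecodingCache:
    """Hypothetical cache holding originals and their recodings side by side."""

    def __init__(self) -> None:
        self.originals: Dict[str, CachedObject] = {}
        self.recoded: Dict[Tuple[str, str], RecodedObject] = {}

    def store_original(self, url: str, body: bytes, etag: str) -> None:
        # Storing a new original does NOT purge recodings; they are
        # checked lazily when someone asks for them.
        self.originals[url] = CachedObject(body, etag)

    def get_gzip(self, url: str) -> Optional[bytes]:
        original = self.originals.get(url)
        if original is None:
            return None  # nothing cached; the caller must fetch from the origin
        key = (url, "gzip")
        variant = self.recoded.get(key)
        # Only serve the recoding if it was made from the current original.
        if variant is None or variant.source_etag != original.etag:
            variant = RecodedObject(
                gzip.compress(original.body), "gzip", original.etag
            )
            self.recoded[key] = variant
        return variant.body
```

The point of the design is that a stale recoding may sit in the cache after the original changes, but it is never handed out: the check against the original happens on every hit, and the recoding is redone only then.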
By applying Content-Encoding you technically create a new entity
(which is supposed to be done only by origin servers). This new
entity must use a different ETag from the original so as not to mess
up equality conditions at the clients, and any cache revalidations
etc. must be based on the original, not your recoded version, so that
there is no confusion between the recoding proxy and the origin server
about the cached object's freshness.

Regards
Henrik
