On 01/25/2012 06:53 PM, Amos Jeffries wrote:
>> We created workers as an internal performance optimization that has
>> nothing to do with HTTP. It is our responsibility to make sure that
>> optimization stays internal. If caches are not synchronized, the
>> optimization may negatively affect external HTTP agents.
>
> I see you arguing that IPC messages about purges are a requirement we
> imposed on ourselves. I agree, and focus on IPC so that admins who
> disable ICP/HTCP/PURGE do not cause problems.
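For concreteness only, the kind of purge broadcast between workers we
are both referring to might look roughly like the sketch below. All
names are invented and the datagram transport is just for illustration;
this is not Squid's actual IPC API.

    // Toy sketch: tell every sibling worker to evict its copy of an
    // entry. Invented names; fire-and-forget UNIX datagrams stand in
    // for whatever transport a real implementation would use.
    #include <cstring>
    #include <string>
    #include <vector>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    void broadcastPurge(const std::string &url,
                        const std::vector<std::string> &siblingSockets)
    {
        const std::string msg = "PURGE " + url;
        const int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
        if (fd < 0)
            return; // real code would log and retry

        for (const auto &path : siblingSockets) {
            sockaddr_un addr = {};
            addr.sun_family = AF_UNIX;
            strncpy(addr.sun_path, path.c_str(), sizeof(addr.sun_path) - 1);
            // Fire-and-forget is not enough in production: without
            // delivery confirmation, the worker caches drift apart again.
            sendto(fd, msg.data(), msg.size(), 0,
                   reinterpret_cast<const sockaddr *>(&addr), sizeof(addr));
        }
        close(fd);
    }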
That said, I am not talking about any specific technology to enforce
cache synchronization, just the assertion that either the internal
caches are synchronized or they are violating HTTP.

> I see no evidence that sharing an IP is any more (or less) of a
> violation than each worker having a unique IP and the same FQDN. We
> haven't gone around claiming that sibling relationships or popular CDN
> hierarchies are all violating HTTP, though they hit sync problems too.

If they have sync problems, they may violate HTTP. I am just doing my
best to stay focused on the [local] cache architecture topic; I do not
want to get into a discussion about distributed hierarchies.

>> Again, if HTTP has no text defining when two cooperating caches must
>> be in sync, then it would be difficult to decide which interpretation
>> of the HTTP spirit is "correct".
>
> The new wording for HTTPbis part 6 draft -18 section 2.5 about
> PUT/POST/DELETE/unknown explicitly clarifies the spirit with:
> "
>    Note that this does not guarantee that all appropriate responses
>    are invalidated. For example, the request that caused the change at
>    the origin server might not have gone through the cache where a
>    response is stored.
> "

IMO, this just warns the implementer that network complications (policy
routing, load balancing, cache hierarchies, etc.) may cause HTTP
violations. It does not permit those violations any more than a warning
about a possible DDoS attack makes that attack benign. It just says
"this MUST/SHOULD cannot guarantee anything because it applies to the
caches that received the request and not to the other caches that did
not; your next request may go through those other caches".

> section 2.2 on what responses can be served uses the wording
> "
>    Also, note that unsafe requests might invalidate already stored
>    responses; see Section 2.5.
> "
> *might* invalidate.

I think this just means that while an unsafe request MUST invalidate
the corresponding stored responses, there may be no such responses
stored (see the toy sketch in the P.S. below).

> Two giant loopholes to walk through. Invalidation MUST is a
> best-effort benefit for a hierarchy, not a guarantee of removal.

Clearly, we interpret the same specs differently. You see loopholes
negating explicit MUSTs. I see a non-normative explanation of why
somebody cannot rely on the request path staying the same, and a
non-normative reference to a MUST.

> With the Squid SMP mode design being an entire hierarchy inside one
> box, we have to adjust our viewpoint to that of hierarchy compliance.
> The workers are as compliant as ever individually. We have raised
> awareness of the hierarchy-level interaction problems and need to fix
> them above and beyond the specs. The specs, in word and spirit, focus
> on the requirements of individual cache instances, not on distributed
> hierarchies.
>
> What we do by fixing the problem is improve the friendliness,
> predictability, and usefulness of Squid responses, not the compliance
> level.

I understand your point of view, but I have not seen anything in HTTP
that supports a definition of compliance for a hierarchy of caches
(with a single entry point) that differs from the compliance of a
single cache. I have not read the entire HTTPbis yet, so it is possible
that I will find it.

However, since we seem to agree that worker caches must be in sync, we
can build the implementation on that consensus!

It may be a good idea to polish HTTPbis if it indeed allows both your
and my points of view to coexist even though they contradict each other.

Thank you,

Alex.
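P.S. To make my reading of the *might* concrete, here is a toy model of
the invalidation MUST. The Cache type and helper names are invented for
illustration; a real cache also has to invalidate Location and
Content-Location targets per section 2.5, which this sketch omits.

    // Toy model: an unsafe request always attempts the invalidation;
    // it simply may find nothing stored under that URI. That no-op is
    // all the "might invalidate" wording amounts to, in my reading.
    #include <string>
    #include <unordered_map>

    using Cache = std::unordered_map<std::string, std::string>; // URI -> stored response

    static bool isUnsafe(const std::string &method)
    {
        return method != "GET" && method != "HEAD" &&
               method != "OPTIONS" && method != "TRACE";
    }

    void onRequest(Cache &cache, const std::string &method,
                   const std::string &uri)
    {
        if (isUnsafe(method))
            cache.erase(uri); // no-op when no corresponding response is stored
    }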
