Re: [squid-dev] [RFC] client header mangling
On 06/07/2016 05:09 AM, Amos Jeffries wrote:
> I've been looking at ways to resolve the long Vary discussion going on
> in squid-users with a patch that we can accept into mainline. What they
> (joe and Yuri) have at present works, but only with extra
> request_header_replace config preventing integrity problems.

Disclaimer: I have not read the Vary discussion on squid-users. I suspect your RFC is generic enough to ignore that discussion as far as the RFC is concerned.

> One way to make useful progress would be to finally add the recurring
> request for request_header_access/replace to work on client messages in
> a pre-cache doCallouts hook rather than only a post-cache hook.

I assume that by "finally add" you meant something like "finally implement".

> I am imagining this being done on the adapted request headers after
> ICAP, eCAP and URL-rewrite have all done their things.

Sounds good.

> And using the
> same request_header_* directive ACL lists as for outbound traffic.

I am not sure exactly what you mean, but which headers/transactions to mangle should be up to the Squid admin. The pre- and post-cache mangling will often differ. The pre- and post-cache mangling API will be very similar, of course, but we should not restrict the admin to a single set of rules that is always applied on both sides of the cache.

One elegant way to implement this would be to add a vectoring-point ACL that will match "pre-cache" and "post-cache" vectoring points (at least). That way, you do not need to add new directives but admins can mangle headers differently on each side. I suspect these ACLs would be useful in other contexts as well.

As for "eCAP versus squid.conf" mangling, I suggest the following rules of thumb:

0. Message body mangling belongs to eCAP/ICAP.
1. If header mangling decisions require information contained in the message body, such mangling belongs to eCAP/ICAP.
2. Header field mangling that cannot be expressed using "add field", "delete field", or "a regex substitution of the field value" operation belongs to eCAP/ICAP.
3. All other mangling actions can be supported directly in squid.conf, at any vectoring point.

Thank you,

Alex.

___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev
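To make the vectoring-point idea and the three operations from rule 2 concrete, here is a minimal Python sketch. It is not Squid code and does not reflect any existing squid.conf syntax: the `Rule` class, the `apply_rules` function, and the "pre-cache"/"post-cache" labels are all hypothetical names chosen for illustration. It also simplifies headers to a case-sensitive dictionary with one value per field.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Rule:
    point: str                     # hypothetical label: "pre-cache" or "post-cache"
    op: str                        # "add", "delete", or "sub" (regex substitution)
    name: str                      # header field name
    value: str = ""                # value for "add", replacement text for "sub"
    pattern: Optional[str] = None  # regular expression, used only by "sub"


def apply_rules(headers: Dict[str, str], rules: List[Rule], point: str) -> Dict[str, str]:
    """Return a copy of headers with the rules for one vectoring point applied in order."""
    out = dict(headers)
    for rule in rules:
        if rule.point != point:
            continue  # rule belongs to the other side of the cache
        if rule.op == "add":
            out[rule.name] = rule.value
        elif rule.op == "delete":
            out.pop(rule.name, None)
        elif rule.op == "sub":
            if rule.name in out:
                out[rule.name] = re.sub(rule.pattern, rule.value, out[rule.name])
        else:
            raise ValueError(f"unsupported operation: {rule.op}")
    return out


# One rule set, with different rules selected on each side of the cache.
rules = [
    Rule("pre-cache", "sub", "User-Agent", "Browser", pattern=r".+"),
    Rule("post-cache", "delete", "X-Internal-Tag"),
    Rule("post-cache", "add", "Via", "1.1 example-proxy"),
]
request = {"User-Agent": "ExampleBrowser/1.0", "X-Internal-Tag": "abc"}

pre = apply_rules(request, rules, "pre-cache")   # what a cache lookup would see
post = apply_rules(pre, rules, "post-cache")     # what the origin server would see
```

In this sketch the single rule list plays the role of one set of directives, and the `point` attribute plays the role of the proposed vectoring-point ACL: the admin writes one configuration but gets different mangling before and after the cache.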
Re: [squid-dev] [RFC] client header mangling
On 10/06/2016 9:49 a.m., Eliezer Croitoru wrote:
> I am trying to understand so bear with me couple seconds.
> I have seen that there are pages\servers which doesn't state about the
> User-Agent in the Vary response while still taking it into account.
>
> The caching side of the picture is storing an object which will never be
> served.
> The HIT ratio is a whole other story of the picture.
>
> Since I am not inside the code but I do try to understand, currently what
> happens?

Currently, Squid uses the real client headers or the ICAP/eCAP adapted headers when looking up the cached object's variant.

The admin might be fixing the actual response using request_header_access/replace to craft what the server sees. But that does not help Squid with the variants. Altering the headers prior to cache lookup is needed for that, which today means ICAP/eCAP are needed.

I'm proposing making the header alterations affect input as well as output.

> How many lookups are done for\per a request?

1, or 2 if it is a Vary response.

> Do we run an object lookup after the response headers was received from the
> server?

No.

> Can we predict a Vary object based on the request only?(I assume that it will
> be an estimated and not absolute certainty if at all)

No.

>
> Also let say we have a 1k page ahead, would we want it to be fetched
> from disk\ram store rather then from the origin server after we told
> it we want the object?

This is not relevant. The size of the objects and where each would be placed do not matter to the problem.

The issue is that there is a huge, possibly infinite, number of such objects wasting filenum spaces/slots in the cache. There are only 2^25 object slots per cache location, so these huge sets of variants can really cramp the storage even if they are only 1 byte in size.

>
> I am almost sure that lowering the disk and ram stored objects should
> be a goal by itself if we cannot "dig" them up from ram or disk later
> for any use.

Yes, but not relevant to the current decision.

The question here is whether we should allow pre-cache header mangling by the admin as a way to reduce the object count for Vary responses.

The alternative is requiring them to use ICAP or eCAP to do it, possibly asking someone to write an eCAP module.

>
> A request_header_replace can work only for "generic" ones such as
> without a language preference such as "br" added to some requests by
> browser add-ons.

No. It can and will work for any header. It just requires the admin to know which ones to modify and when, which is still somewhat hard for unusual headers. That is part of why I RFC'd it rather than going ahead and proposing a patch.

>
> Now a step further, I can write a tiny ICAP service that will
> "handle" common Vary headers from FireFox and other browsers to test
> how it affects caches in general.

I'd rather eCAP for this than ICAP. But if you think you can do it and want to try, then we can work on the details of what the code needs to do.

Amos
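The variant problem described above can be pictured with a small Python model. This is only a sketch under simplifying assumptions, not how Squid computes its store keys: it assumes a variant is identified by the URL plus the request values of the headers named in Vary, and the names `variant_key` and `normalize` are hypothetical. The `normalize` function stands in for whatever pre-cache mangling the admin would configure.

```python
from typing import Dict, Iterable, Set, Tuple

VariantKey = Tuple[str, Tuple[Tuple[str, str], ...]]


def variant_key(url: str, request_headers: Dict[str, str], vary: Iterable[str]) -> VariantKey:
    """Simplified variant key: the URL plus the request values of the headers named in Vary."""
    selected = tuple((name.lower(), request_headers.get(name, "")) for name in sorted(vary))
    return (url, selected)


def normalize(request_headers: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical pre-cache mangling: replace every User-Agent with one fixed value."""
    out = dict(request_headers)
    if "User-Agent" in out:
        out["User-Agent"] = "generic"
    return out


url = "http://example.com/page"
vary = ["User-Agent"]  # the response said "Vary: User-Agent"

# One hundred clients that differ only in their User-Agent string.
requests = [{"User-Agent": f"ExampleBrowser/{n}.0"} for n in range(1, 101)]

raw_variants: Set[VariantKey] = {variant_key(url, r, vary) for r in requests}
mangled_variants: Set[VariantKey] = {variant_key(url, normalize(r), vary) for r in requests}

print(len(raw_variants), len(mangled_variants))

# Slot count quoted in the message above: 2^25 object slots per cache location.
SLOTS_PER_LOCATION = 2 ** 25
```

With the headers taken as received, the hundred requests produce a hundred distinct variant keys for one URL; after the pre-cache normalization they collapse to a single key. Each stored variant occupies one slot regardless of its size, which is why the slot limit matters more here than the byte count.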
Re: [squid-dev] [RFC] client header mangling
I am trying to understand, so bear with me for a couple of seconds.

I have seen that there are pages/servers which do not list User-Agent in the Vary response header while still taking it into account.

The caching side of the picture is storing an object which will never be served. The HIT ratio is a whole other part of the picture.

Since I am not inside the code but am trying to understand: what happens currently?

How many lookups are done per request?
Do we run an object lookup after the response headers were received from the server?
Can we predict a Vary object based on the request only? (I assume that it will be an estimate and not an absolute certainty, if at all.)

Also, let's say we have a 1k page ahead: would we want it to be fetched from the disk/RAM store rather than from the origin server after we told it we want the object?

I am almost sure that lowering the number of objects stored on disk and in RAM should be a goal by itself if we cannot "dig" them up from RAM or disk later for any use.

A request_header_replace can work only for "generic" ones, such as those without a language preference such as "br" added to some requests by browser add-ons.

Now a step further: I can write a tiny ICAP service that will "handle" common Vary headers from Firefox and other browsers to test how it affects caches in general.

Eliezer

Eliezer Croitoru
Linux System Administrator
Mobile: +972-5-28704261
Email: elie...@ngtech.co.il

-----Original Message-----
From: squid-dev [mailto:squid-dev-boun...@lists.squid-cache.org] On Behalf Of Amos Jeffries
Sent: Tuesday, June 7, 2016 2:10 PM
To: Squid Developers
Subject: [squid-dev] [RFC] client header mangling

I've been looking at ways to resolve the long Vary discussion going on in squid-users with a patch that we can accept into mainline. What they (joe and Yuri) have at present works, but only with extra request_header_replace config preventing integrity problems.

One way to make useful progress would be to finally add the recurring request for request_header_access/replace to work on client messages in a pre-cache doCallouts hook rather than only a post-cache hook.

I am imagining this being done on the adapted request headers after ICAP, eCAP and URL-rewrite have all done their things. And using the same request_header_* directive ACL lists as for outbound traffic.

Any alternative ideas or objections?

Amos
[squid-dev] [RFC] client header mangling
I've been looking at ways to resolve the long Vary discussion going on in squid-users with a patch that we can accept into mainline. What they (joe and Yuri) have at present works, but only with extra request_header_replace config preventing integrity problems.

One way to make useful progress would be to finally add the recurring request for request_header_access/replace to work on client messages in a pre-cache doCallouts hook rather than only a post-cache hook.

I am imagining this being done on the adapted request headers after ICAP, eCAP and URL-rewrite have all done their things. And using the same request_header_* directive ACL lists as for outbound traffic.

Any alternative ideas or objections?

Amos