Re: [squid-users] Is it possible to mark tcp_outgoing_mark (server side) with SAME MARK as incoming packet (client side)?
Hi

So the documentation is right, but the placement of the statement is possibly wrong; it's not highlighted right up front. I.e. qos_flows applies only to packets from server to client (squid), NOT from client to server. Is it possible to do the reverse too? Or at least have an acl where I can check the incoming MARK on a packet? Then I could make use of tcp_outgoing_mark.

I just noticed that the same discussion took place on this list previously (in 2013); here is the link: http://www.squid-cache.org/mail-archive/squid-users/201303/0421.html

Yes, I'm still really interested to implement this. I got as far as doing some investigation a few weeks back. It seems *most* of the groundwork is there: I think there is space to store the incoming client connection mark, and there are facilities to set the outgoing upstream mark (to an acl value). What is needed is:

- code to connect the two, i.e. set a default outgoing mark
- some thought on handling connection pipelining and re-use. At present squid maintains a pool of connections, say to an upstream proxy; these now need to be selected not just because they are idle, but also because they have the correct connection mark set. This looks do-able, but slightly more tricky.

Ed W
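For reference, later Squid releases grew an acl type that matches the netfilter mark on the client connection (clientside_mark, since renamed client_connection_mark), which can drive tcp_outgoing_mark. A minimal squid.conf sketch, assuming the captive portal sets mark 0x10 via iptables CONNMARK:

    # Match the netfilter mark on the client TCP connection and
    # re-apply the same value on the outgoing upstream connection.
    acl portal_ok clientside_mark 0x10
    tcp_outgoing_mark 0x10 portal_ok

This only covers a fixed, known mark; copying an arbitrary client mark straight through to the server side still needs the code changes discussed above.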
Re: [squid-users] Sponsor etag/vary support for Squid 3.3
On 02/04/2013 23:38, Alex Rousskov wrote:

I hope this will also be useful to others than just me!

Yes, I believe most of the ETag improvements you want will be generally useful, including in environments where ETags come from origin servers (rather than being added by Squid in violation of HTTP rules).

Thanks Alex. Yes, it seems a small number of frameworks are finally using ETags correctly, so there is some possibility of real usage in the wild!

I still think that the earlier comprehensive design with a new/dedicated checksum header would be an overall better solution for your specific problem, but it would take a lot more time to implement, while you should be able to get quite a bit by simply [ab]using ETags. And since the changes you need are mostly generally useful, I do not see any big problems with this simplified approach, at least as the first step.

Sure - I don't disagree, but this will be a great first step to prove whether it can work or not. Also, at least this bit will be standards compliant, so we can work on bending the rules outside of that change later on. I think this will be really good groundwork (and benefit squid).

Cheers Ed W
Re: [squid-users] Sponsor etag/vary support for Squid 3.3
Hi Alex

Sorry, I didn't notice your reply!

I'm picking up an old thread from some time back. I remain interested in getting support for ETag into squid (and related revalidate support). My main requirement is that I have two proxies on either side of a bandwidth-limited link (with high cost). I want the situation that when a client GETs some object,

A client GETs some object currently in the cache and with an ETag, but that cached object is either stale or being forcefully reloaded by the client, right?

Yes. Or some second client requests the same object so we need to do a freshness check, or a client clears their cache, or upstream doesn't correctly implement If-Modified-Since, etc, etc.

I'm not trying to decrease the incidence of squid asking the upstream server if the object is fresh (which could also trigger non-idempotent changes); however, I will try to reduce the amount of bandwidth used over the proxy-to-proxy middle link (which crosses an expensive sat connection) by ensuring that ETags are set on important resources (e.g. creating one where it doesn't exist, using some hash of the content body).

What I have probably failed to consider properly is a change in headers between two otherwise identical responses (i.e. same bodies), but I guess that will become clear later. Also I think Vary support will either drop out or be required. I have a use in mind which would become dependent on browser version (e.g. serving WebP graphics to Chrome).

we can convert this to an If-None-Match and trust the ETag confirms that the object is unchanged. Note, I am aware of the limitations of trusting ETags. In my setup I will have control over the proxy on the high-speed side of the connection and we can use various methods on that side to ensure that the ETags are sane. The main goal is to minimise bandwidth across the intermediate (expensive) link.

Previously we discussed all kinds of complex ideas including implementing trailers and custom headers with hash values. On reflection I think everything required can be done using only ETag revalidation (and some tweaking of ETags, but squid needs to know nothing about that...).

Yes, reload-into-If-None-Match and stale-into-If-None-Match features sound simple. The latter may even be supported already (will check). If something outside of Squid provides reliable-enough ETags to all cachable responses, then the complexities discussed earlier go away. Please confirm whether my understanding of your updated requirements is correct.

I believe so. So, the situation is a downstream client talking to two squid proxies in a chain, through to the eventual upstream web server. Between the two squid proxies is an expensive internet link (charged by the byte), and so we desire to minimise bytes across the link.

Essentially an upstream adaptation proxy will be used on the fast (i.e. cheap) side of the connection. This will examine all responses before they are handed to the fast-side squid, and in this proxy we will beat the ETag into shape, e.g. adding a SHA hash if none exists, etc. Obviously I have to accept all breakage which occurs if I change the upstream's ETag - however, I think we have this covered.
My goal is that if an object has the same response body, and it's already in the squid cache on the slow side of the link, then we freshen the resource by going back to the origin server via our pair of squid servers; however, we avoid the transfer of the body back across the expensive link (between the two squid proxies) if the ETag still matches.

I hope this will also be useful to others than just me!

Thanks Ed W
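For illustration, the revalidation being aimed for looks like this on the wire (hostname and hash are made up; the ETag here is one the fast-side adapter would have synthesized from the body):

    GET /report.pdf HTTP/1.1
    Host: origin.example.com
    If-None-Match: "sha256-9f2c0a"

    HTTP/1.1 304 Not Modified
    ETag: "sha256-9f2c0a"

When the tags match, only headers cross the expensive proxy-to-proxy link and the slow-side squid serves the body from its own cache.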
Re: [squid-users] Re: squid qos_flows - copying mark from client side to upstream request?
On 02/04/2013 21:14, Andrew Beverley wrote:

(I will always have another proxy as my upstream). If so then actually I need to reset the mark for each request?

I *think* you could just set the mark on the upstream connection for each request.

I think this is the correct answer (so that it handles persistent connections). I confess to not yet having continued to explore the code, but I guess I need recommendations on a good place to insert this - basically I need the point where a new request hits the network in the direction of the upstream.

A leg up would be welcome; I will continue to explore the code in the meantime.

Thanks Ed W
Re: [squid-users] Re: squid qos_flows - copying mark from client side to upstream request?
Hi

So for example I mark clients that have passed a captive portal test with some mark; I need that mark copied onto requests coming from squid so that I know they effectively come from a validated client.

As Amos says, this is probably the wrong way to do it. If you want to see an example of how I did it, then check out this page: http://andybev.com/index.php/PortalShaper I use iptables to drop (or redirect) all packets that are received from clients that have not passed the captive portal.

Technically I don't just track pass/fail... Users have a choice of gateways through which to use the internet (each will have a cost). Their choice of gateway is marked on packets from their machine; we then route through the appropriate gateway based on the connection mark (hence why I need it passed upstream through squid).

Also we mark each connection with a unique per-user mark so that iptables can account for the traffic they consume and bill them. Technically this could be done inside squid, but all other traffic is accounted in iptables and there are some hairy calculations needed to bill differently for different gateways, so I don't want to reproduce this in multiple locations. Hence I think I need to implement the reverse of the current code?

Now, as for implementation, I don't have the code in front of me, but I think I noticed there is a single code path to open a new upstream connection? At present this applies a packet mark based on tcp_outgoing_mark. Is the client connection information available at this point, so that I could mark the connection here based on the client connection mark?

However, I think squid uses persistent connections to upstream? (I will always have another proxy as my upstream.) If so then actually I need to reset the mark for each request? Where would be the correct location to put the marking code in this case, i.e. I guess where the packet is sent to the upstream socket? (I guess I need to be careful about pipelining also?)

Thanks for your thoughts Ed W
[squid-users] squid qos_flows - copying mark from client side to upstream request?
Hi Andy,

Sorry to bug you, but I finally got round to trying the qos_flows feature and I think my understanding is completely back to front?

What I need is to copy the packet/connection mark from the client request and apply it to the upstream request. So for example I mark clients that have passed a captive portal test with some mark; I need that mark copied onto requests coming from squid so that I know they effectively come from a validated client.

As near as I can tell, the current qos_flows applies this all backwards, i.e. it assumes that the upstream has some mark on it, and copies this back to the client response connection?

How tricky would it be to offer this option in both directions? Does anyone else have a use for this kind of feature?

Thanks Ed W
[squid-users] Sponsor etag/vary support for Squid 3.3
Hi, Alex,

I'm picking up an old thread from some time back. I remain interested in getting support for ETag into squid (and related revalidate support). My main requirement is that I have two proxies on either side of a bandwidth-limited link (with high cost). I want the situation that when a client GETs some object, we can convert this to an If-None-Match and trust the ETag confirms that the object is unchanged.

Note, I am aware of the limitations of trusting ETags. In my setup I will have control over the proxy on the high-speed side of the connection and we can use various methods on that side to ensure that the ETags are sane. The main goal is to minimise bandwidth across the intermediate (expensive) link.

Previously we discussed all kinds of complex ideas including implementing trailers and custom headers with hash values. On reflection I think everything required can be done using only ETag revalidation (and some tweaking of ETags, but squid needs to know nothing about that...).

If anyone else is interested in such support then please shout. Alex, would you mind picking this up with me again with a view to sponsoring development?

Thanks Ed W
Re: [squid-users] Certificate server validation
Hi Alex

Can squid handle a slightly simpler case where we want to restrict CONNECT access to servers which meet/fail to match a certain SSL cname? E.g. I want to block facebook access, but without sslbump, so I allow SSL proxying but deny connections to servers with an SSL cname *.facebook.com?

If your blocking decision is based on information coming from the HTTP CONNECT request alone, then you can block CONNECT requests using regular http_access rules. For example, you can block CONNECT requests for a given origin server name or a given IP address. The only caveat I am aware of is that most browsers will not display Squid's error page in this case, because browsers cannot separate the secure server response context from the insecure proxy CONNECT response context (and there have been attacks based on the lack of context separation before browsers stopped displaying CONNECT responses).

If your blocking decision is based on information coming from the SSL server certificate itself, then you have to bump the transaction for Squid to see that certificate. Without bumping, Squid only sees CONNECT headers and raw opaque-to-Squid TCP payload bytes. For example, the SslServerCertValidator feature that Amos recommended (thanks Amos!) requires bumping the transaction. The SSL server certificate is not available at HTTP CONNECT inspection time.

You can hack a helper script that will connect to the server using the CONNECT address, receive an SSL certificate, terminate the connection, and tell Squid what the certificate was, but doing so is bad for servers, so I would expect that some of them will eventually complain to your ISP, block your source IPs, or even feed your helper with bogus info. It will also not work with servers using SNI.

FWIW, we are working on a Peek and Splice feature that allows decisions based on the server SSL certificate without bumping the connection, but we still have to solve a few difficult problems before I can be certain that it is doable at all: http://www.mail-archive.com/squid-dev@squid-cache.org/msg19574.html

Finally, I think it is technically possible to peek at the certificate with no intention of bumping the connection (but with the intention of possibly terminating it). I am not aware of anybody working on this, but the Peek and Splice feature mentioned above will provide this functionality as a side effect. However, please note that without bumping you still will not be able to serve an error page to the blocked user, because the browser will expect to communicate with the [secure] origin server, not Squid (the browser already sent its SSL Hello request).

Apologies for the slow reply - this is REALLY interesting.

At the moment I'm playing with nDPI (part of ntop), which does some simple parsing of the raw TCP packets to try and dig out the certificate cname. I'm fairly sure a determined attacker could get past this, but this is probably acceptable for its main requirement.

My situation is that I need to restrict access to certain classes of connection. Mostly we have a situation where bandwidth costs are expensive and the users themselves are volunteering to be restricted so that they don't spend on connections they don't value. So for example they will wish to restrict antivirus scanner updates, windows updates, Skype registrations, iOS push, etc. So users request a specific firewall profile to be applied, but increasingly everything looks like http/https these days...
So we desire to standardise on one method to police, log and restrict http protocols, and ideally that would be squid. So yes, Peek with restrict (and a rather abrupt disconnect) would be superb for our purposes. Will wait and watch!

Cheers Ed W
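A minimal sketch of Alex's first option above (blocking on the CONNECT request alone), using the facebook example from the question; note it matches the hostname the client sends, not the certificate cname, so a client that CONNECTs to a raw IP will slip past:

    acl CONNECT method CONNECT
    acl blocked_ssl dstdomain .facebook.com
    http_access deny CONNECT blocked_ssl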
Re: [squid-users] Certificate server validation
On 20/01/2013 01:24, Amos Jeffries wrote:

On 19/01/2013 3:37 a.m., vincent viard wrote:

Hello, I am asking about the feasibility of validating the server certificates used during SSL/TLS session establishment for HTTPS, at the level of the Squid proxy. The idea is not to break the SSL session with a man-in-the-middle (e.g. SSLBump), but to authenticate (and authorize) the target against a white or black list of CAs. In other words, to perform in Squid the first validation of the SSL handshake that is logically made by the client browser on the server's certificate. In advance, thank you and good day. Vince

Please see http://wiki.squid-cache.org/Features/SslServerCertValidator This feature is merged and will be in the 3.4 series when it is released. To use it now you need to build the 3.HEAD Squid sources.

Can squid handle a slightly simpler case where we want to restrict CONNECT access to servers which meet/fail to match a certain SSL cname? E.g. I want to block facebook access, but without sslbump, so I allow SSL proxying but deny connections to servers with an SSL cname *.facebook.com?

Thanks Ed W
Re: [squid-users] Re: squid with pdns, bandwidth control issue
On 02/07/2012 14:12, Muhammad Yousuf Khan wrote:

after limiting my bandwidth using delaypool, things seem OK. BTW I thought that Squid would be doing it automatically.

All the research on TCP up until very recently has been about how to *maximise* the amount of data flowing down a given pipe. Only recently has there been much thought about single streams NOT trying to maximise the flow to match the pipe size... No systems right now really do anything other than try to download as fast as possible (arguably the latest BitTorrent clients, with their fancy new reinvention of a TCP-alike protocol, are the only exception I can think of). If you DON'T want to max out a given pipe then you need to configure this in some way.

Note you may have other problems which you might want to tune as well as delay pools...

Ed W
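A minimal delay-pools sketch for squid.conf, with illustrative numbers: a single class-1 (aggregate) bucket refilled at 32000 bytes/s (roughly 256 kbit/s) with up to 512 KB of burst:

    delay_pools 1
    delay_class 1 1
    delay_access 1 allow all
    delay_parameters 1 32000/524288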
Re: [squid-users] Re: squid with pdns, bandwidth control issue
On 29/06/2012 14:12, Muhammad Yousuf Khan wrote:

I have made some tests and here are some details and results. I am using two machines:
1. Gateway IPCop (Linux)
2. Debian Lenny (Squid)

I am using a download manager to download a 50 MB file.

IPCop --- when I download via IPCop, my download bursts up to 270 KB/s, with no ping delay, and others can also browse easily.

Squid on Lenny --- via Squid (proxy mode) my download reaches 365 KB/s, which is full throughput and faster than IPCop, but ping delay reaches 4000 ms, which is considered almost near to death, and no other users can browse - they get timeout messages in their browsers. I think this shows that the issue is with the squid box, and I don't know whether I have to tweak squid, the TCP buffers, or anything else.

Run a download using wget from both boxes and observe the download speeds and the effect on ping. This might help you figure out if it's an operating system configuration setting.

The effect is clear though - one of your machines is managing to max out the entire inbound connection (which is exactly what TCP is supposed to try and do). The other machine is only partially using the connection (I know that feels more desirable, but it's likely an accident, and it's not how TCP tries to behave).

So your problem seems to be reduced to figuring out why one machine is performing optimally and hence hogging the whole internet connection. Reduce the problem to the basics and debug from there. Just remember that TCP is supposed to learn how to hog the entire connection; allocating traffic more evenly is a tricky problem, and you might want to use the various features in squid delay pools and linux traffic control to manage this..?

Good luck Ed W
Re: [squid-users] Outlook 2010 crashing on gzip-encoded proxied internet calendars
On 29/06/2012 14:14, Pim Zandbergen wrote:

Could it be squid is feeding Outlook a gzip-encoded cached calendar, which was previously received by Thunderbird? Would that be a squid bug?

Aha, yes, isn't there only partial support for Vary in squid right now? You might want to dump the Vary headers for the responses with the different Accept-Encoding headers and compare?

Ed W
[squid-users] Conditional cache_peer based on transparent/non transparent connection?
Hi, I have an upstream cache_peer which requires authentication. My local network squid accepts explicit proxy clients and does transparent redirection on everyone else.

Clearly I can't use the cache_peer for the transparently proxied clients; however, how could I use it only for explicit proxy clients (i.e. those with a proxy set in their browser)? The cache_peer line uses login=PASSTHRU so we never get involved with the authentication on the local squid.

Thoughts on how I could achieve this please (without two squid instances)? Squid 3.2.0.16

Thanks Ed W
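One approach, sketched with assumed port numbers and peer name: tag the listening ports and gate the peer with cache_peer_access, so only requests that arrived on the explicit-proxy port may use the authenticated parent:

    http_port 3128 name=explicit
    http_port 3129 intercept name=intercepted
    acl explicit_clients myportname explicit
    cache_peer upstream.example.net parent 8080 0 no-query login=PASSTHRU
    cache_peer_access upstream.example.net allow explicit_clients
    cache_peer_access upstream.example.net deny all

Intercepted traffic then falls through to a direct fetch, while explicit clients are routed via the parent.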
Re: [squid-users] Roadmap Squid 3.2
Is Squid-3.2.0.15 the most stable release to be using for deployment on the bleeding edge, or is 3.2.0.12 still the safest bet? In the past you have given some guidance as builds have moved into new-functionality vs bug-squashing phases. Are you imminently about to release 3.2.0.16?

Does someone have some big-picture comments on rock store - benefits, any known issues?

Cheers Ed W
Re: [squid-users] How many proxies to run?
On 13/01/2012 13:29, Eliezer Croitoru wrote:

On 12/01/2012 19:58, Gerson Barreiros wrote:

I have a single server doing this job. My scenario is mostly the same as mentioned above. I just want to know if I can make this server a virtual machine that will use shared hard disk / memory / cpu with other VMs.

A web proxy on a VM is not the best choice in a high-load environment. You can run one on a VM, but in most cases it will mean lower performance. I have a place with a 40Mbps ATM line that is using 2 squid servers on a VM and it works fine. Another one is an ISP with 4 machines with a total of 800Mbps output to the clients. Statistics are one of the musts before reaching a conclusion. Eliezer

I quite like Linux-VServer for my virtualisation solution. It's basically a kind of fancy chroot with kernel enhancements to make the separation almost as complete as full virtualisation. Since it IS basically just a chroot, you are still running on bare metal and there is no virtualisation overhead as such. For my requirements it works very nicely and I don't have any needs that require a full virtualisation solution (KVM, etc).

The main reasons a container solution such as Linux-VServer isn't suitable are when you need full separation from the host OS, e.g. the kernel version is important, or where hardware virtualisation is useful (although there are ways to partly virtualise the network card with vservers), or where you need features of an enterprise virtualisation solution, such as live migration. On the flip side, the performance is very high with a container solution and my machines boot in 1-2 seconds, so it's really very easy for me to manage without a full virtualisation solution.

Good luck Ed W
Re: [squid-users] Facebook page very slow to respond
On 20/10/2011 06:11, Wilson Hernandez wrote:

To tell you the truth I don't know what's the deal: bandwidthd or squid, but it is really getting on my nerves losing users left and right every week. I need to come up with a solution before my whole network goes down the drain.

You need to get a reproducible situation and work from there. Find a tame user with the problem and get network timings, etc. Trace it on the server, set up direct access vs proxy access for them and compare performance, etc.

You can use things like Chrome's developer tools, or Firefox Firebug/Tamper Data, to see network traffic and timings - that alone would probably help you a lot (pages can seem sluggish if certain assets such as javascript or css are slow to load and block page rendering - often these assets are advertising things, or other things which might be handled differently in your proxying situation).

Good luck Ed W
Re: [squid-users] How to filter response in squid-3.1.x?
On 20/10/2011 09:11, Amos Jeffries wrote:

On 20/10/11 20:11, Kaiwang Chen wrote:

So Squid without the adapter will cache one copy of responses in only one encoding.

Yes.

Will a Vary: Accept-Encoding header enable multiple copies?

No. It tells Squid there are multiple variants with the same URL, and to check the Accept-Encoding header against the one stored already when deciding if it is a HIT.

Hi, can I just double-check your response above? Whilst Squid support might not be working (correctly) for Vary caching, my understanding is that Vary: Accept-Encoding is correct here to effectively cache both gzipped and non-gzipped versions of the content?

That said, if so, then this is probably a good example of where post-cache response adaptation (or code in squid) is useful? The problem seems to be that eCAP runs pre-cache: the eCAP adaptation gzips the data, the gzipped version is cached, and now all cache hits return the gzip version regardless of Accept-Encoding?

I think it's a non-problem in that *I think* all modern browsers handle almost all assets gzip-encoded whether they asked for it or not? In the past IE (6 and prior) was the odd one out with some bugs, and I have seen certain ISP proxies somehow mangle the response encoding header, causing gzipped content to be treated as plain text by the browser (not sure what happens, I only have non-technical customer reports). I *think* it's probably safe these days to blindly gzip everything in sight?

Also, when I last looked at the code, I think the eCAP module a) only gzips a very small number of content types (ideally it should be expanded), and b) from memory I don't think it respected Accept-Encoding and compressed everything regardless? These are probably incorrect assertions; I didn't recheck the code.

Thanks Ed W

P.S. If someone wanted to investigate adding gzip to Squid directly then they would be looking for a hook into the code somewhere that handles all the client-bound response bodies, with access to the original request to check Accept-Encoding headers. Where might someone look to add such code?
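For reference, this is how Vary is meant to let both variants coexist (URL made up): the response tells caches to key on the request's Accept-Encoding as well as the URL, so a gzipped copy and an identity copy can be stored and served separately:

    GET /app.js HTTP/1.1
    Host: www.example.com
    Accept-Encoding: gzip

    HTTP/1.1 200 OK
    Vary: Accept-Encoding
    Content-Encoding: gzip

A pre-cache gzip adapter can defeat this if only the compressed variant ever reaches the store.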
Re: [squid-users] handing off usernames to parent proxies
On 20/10/2011 18:11, E.S. Rosenberg wrote:

On the whole I just need the backend to know the username, or what 'browsing plan' the session is using. Sometimes plans are also determined based on src IP (i.e. certain stations aren't allowed to browse no matter who's logged in, or are supposed to only have access to a whitelist even when staff are using them), so I think a 'NAT'-like method is most likely what I need.

Just to highlight a feature that not everyone knows about yet: the 3.2 series has support for conntrack marking, both to copy the original connection mark to the output and to mark connections based on various squid criteria. Conntrack marks don't affect anything outside of the network stack they are running on (i.e. the next hop knows nothing), but they can be used to help integrate a firewall to achieve various clever effects. I'm not sure that they help you that much, so this was more to add an idea on the off chance it helps... At a pinch you can use your firewall to change the IP address or TOS marks to communicate conntrack marks outside of the box, but it's a bit crude...

The other thing is that I believe you can use the auth helpers to set the upstream auth username to be somewhat different from the logged-in user. So I *believe* you can achieve the effect of doing some database lookup on users in group X to get a group name X, and passing that X upstream as the auth user. The point is that you don't need to use IP as your upstream signalling criterion; you can use the auth user, but pre-grouped into the service class names that you need. As an extension to this basic idea, I believe you can use the auth helpers to derive these usernames from other criteria such as client IP address, etc.

Does this help? Good luck Ed W
Re: [squid-users] Add top information to all webpages (like godaddy AD)
So you could be smarter and instead inject some javascript which checks if you are in a frameset and, if not, creates one. This of course has some subtleties with ajax...

Ed W

On 11/10/2011 12:28, Hasanen AL-Bana wrote:

I believe yes! But it will cause lots of trouble with pages like facebook and gmail. You can redirect all requests to a url_rewriter script. Squid will pass the requested url to the script, then the script must generate a page with 2 iframes: the first iframe will hold the ad, and the second iframe goes below the first one and will contain the originally requested page. But think of the problems you will face, because squid will add that to each request, which will break the whole page; hence the script must be smart enough to process only root pages like index.php or index.html.

On Tue, Oct 11, 2011 at 2:20 PM, Jorge Bastos mysql.jo...@decimal.pt wrote:

Howdy, I'd like to do something that I don't know if it's possible somehow. I have squid configured as transparent, and I'd like to add, on every page that the user visits, information at the top of the page, like an AD. Is this possible? For example GoDaddy has this on the free hosting they provide. Thanks in advance, Jorge Bastos
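For the rewriter approach described above, the squid side of the hookup is just a couple of directives; the helper path is hypothetical, and the helper itself would have to implement the root-page detection and iframe wrapping:

    url_rewrite_program /usr/local/bin/ad-injector
    url_rewrite_children 10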
Re: [squid-users] Facebook page very slow to respond
On 08/10/2011 20:25, Wilson Hernandez wrote:

Thanks for replying. Well, our cache.log looks ok. No real problems there, but I will be monitoring it closely to check if there is something unusual. As for the DNS, we have a local DNS server inside our LAN that is used by 95% of the machines. This server uses our provider's servers as well as Google's:

forwarders { 8.8.8.8; 196.3.81.5; 196.3.81.132; };

Our users are just driving me crazy with calls regarding facebook: it's slow, it doesn't work, and a lot of other complaints...

Occasionally you will find that the Google DNS servers get poisoned and take you to a non-local facebook page. I guess run dig against specific servers and make sure you are ending up on a server which doesn't have some massive ping to it? I spent a while debugging a similar problem where the BBC home page suddenly got slow on me because I was being redirected to some German Akamai site rather than the UK one... This is likely to make the difference between snappy and sluggish though, not dead...

Good luck Ed W
Re: [squid-users] Tuning for very expensive bandwidth links
Hi

So the remote (client) side proxy would need an eCAP plugin that would modify the initial request to include an ETag. This would require some ability to interrogate what we have in cache and generate/request the ETag associated with what we have already - do you have a pointer to any API/code that I would need to look at to do this?

I'm unsure, sorry. Alex at The Measurement Factory has better info on the specific details of what the eCAP API can do.

If I wanted to hack on Squid 3.2... Do you have a 60-second overview of the code points to examine with a view to basically: a) creating an ETag and inserting the relevant header on any response content (although perhaps done only in the case that an ETag is not provided by the upstream server); b) adding an ETag header to requests (without one) - i.e. we are looking at the case where client 2 requests content we have cached, but client 2 doesn't know that, only the local squid does. Just looking for a quick heads-up on where to start investigating?

IIRC we have Dimitry with The Measurement Factory assisting with HTTP compliance fixes. I'm sure sponsorship towards a specific fix will be welcomed.

How do I get in contact with Dimitry?

The one public eCAP adapter we have been notified about happens to be for doing gzip. http://code.google.com/p/squid-ecap-gzip/

Hmm... I did already look this over a bit - a very nice and simple API; a shame a huge bunch of eCAP plugins haven't sprung up? The limitation seems to be that the API is really built around mangling requests/responses, but there isn't obviously a way to interrogate squid and ask it questions about what it's caching? Even if there were, you also have a race condition: you might say to upstream that we have content X in cache, but by the time the response comes back that content might have been removed..? It seems that at least parts of this might need to be done internally to squid?

Just to be clear, the point is that few web servers generate useful ETags, and under the condition that bandwidth is the limiting constraint (plus a hierarchy of proxies), it might be useful to generate (and later test) ETags based on some consistent hash algorithm?

Thanks Ed W
Re: [squid-users] Tuning for very expensive bandwidth links
On 30/03/2011 19:17, Marcus Kool wrote:

If your users do not mind, you can block ads and user tracking sites, many of which produce 1x1 gifs. Most ads and tracking code are not cacheable and can consume a lot of bandwidth. This all depends on which sites your users visit, of course.

Thanks - got that covered to some extent. So far I was looking at these two lists for a simple domain-blocking system to catch adverts and tracking: http://www.mvps.org/winhelp2002/hosts.htm http://hosts-file.net/ Any other suggestions/comments?

Additionally we will offer the option to do image recompression and upstream gzip of content (actually we will probably use our own compressing tunnel across the slow link). Anyone know of any already-written ecap/icap servers I might want to investigate?

Cheers Ed W
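A minimal sketch of the domain-blocking side in squid.conf, assuming the lists above have been flattened into a file of one domain per line (e.g. .doubleclick.net); the file path is illustrative:

    acl ad_hosts dstdomain "/etc/squid/ad_domains.txt"
    http_access deny ad_hosts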
Re: [squid-users] Tuning for very expensive bandwidth links
Hi

My thought was to investigate having the internet-side proxy add ETag headers to all content based on some quality hash function, then have the (expensive) remote-side proxy rewrite the request headers to always use If-None-Match. The idea is that the bandwidth is cheap on the internet-connected side, so it can refresh its cache of the whole page, generate a new hash, but still return a not-modified response if the end result is the same string of bytes. How much of that can I implement in Squid 3.x today..?

3.1.10+ will validate If-None-Match and ETag, but will not add them to requests itself.

Thanks - can you expand on what it means to validate in this case? I think you mean that if the content is cached with a given ETag, then requests for that content will be returned from cache if the request has an appropriate If-None-Match - is this the case?

Note, I realise this could lead to some side effects where the action of visiting the web page itself causes some other side effect; however, I think this is a manageable problem for this requirement. Thanks for any pointers to ideas or other products that might help?

ICAP or eCAP would be the way to go here for quick results. Making a plugin to do the ETag generation and alterations before sending off.

Understood. So the remote (client) side proxy would need an eCAP plugin that would modify the initial request to include an ETag. This would require some ability to interrogate what we have in cache and generate/request the ETag associated with what we have already - do you have a pointer to any API/code that I would need to look at to do this?

Then on the internet-side proxy we would do whatever we need to retrieve the content, say fetch the asset. Then our eCAP on that side would generate a consistent ETag using our favourite hash function? The part I'm unsure how to implement would be examining what's in squid's cache in order to generate an ETag based on what we have got (i.e. for the remote side)?

You could also look at cutting bodies off 304 replies at the internet side to avoid the bandwidth-expensive TCP_REFRESH_UNMODIFIED responses.

Hmm, yes, that would be very sensible. Apart from via eCAP, are there other ways I might do that?

NP: if you want to go ahead and alter Squid code, adding If-None-Match on outbound requests is an open bug. As is proper ETag variant caching support.

I don't know if I have the time/ability to hack on squid code. Is there someone who might be interested in working on this for an affordable fee?

Thanks for the very helpful feedback. Note, if there are any existing ecap/icap modules I should look at then please educate me? (I'm currently using Ziproxy and looking at moving the interesting bits to a Squid eCAP module. I have also used Rabbit proxy, which is somewhat similar.)

Thanks for your comments Ed W
[squid-users] Tuning for very expensive bandwidth links
Hi,

Just investigating some tuning of squid for use with satellite links (which are relatively slow, and bandwidth can be charged at $10-100/MB). I'm pondering a dual-proxy configuration with a proxy at both ends of the satellite link. A desired goal would be to force serving from local cache anything which hasn't actually changed (byte for byte) on the internet side.

My thought was to investigate having the internet-side proxy add ETag headers to all content based on some quality hash function, then have the (expensive) remote-side proxy rewrite the request headers to always use If-None-Match. The idea is that the bandwidth is cheap on the internet-connected side, so it can refresh its cache of the whole page, generate a new hash, but still return a not-modified response if the end result is the same string of bytes. How much of that can I implement in Squid 3.x today..?

Note, I realise this could lead to some side effects where the action of visiting the web page itself causes some other side effect; however, I think this is a manageable problem for this requirement.

Thanks for any pointers to ideas or other products that might help? Ed W
[squid-users] Support for detecting if-modified using SHA digest or similar?
Hi,

I am plotting a hierarchical cache with a proxy at the client end of a slow, expensive satellite internet connection, and another on the fast, cheap internet side (the goal is to optimise traffic passing through the slow link). I would specifically like to address the issue that many (smaller, dynamic) sites do not properly support if-modified type headers and always send the same content each time.

I think the only way this can be solved is if the client-end cache notices it has a cached version of a resource and adds its own if-modified-sha header stating which content it's got; the upstream proxy then may need to fetch the object again, but if the upstream finds the content actually is the same then it commutes the response to a 304. (Something like a dynamic, proxy-generated ETag, really.)

Someone may tell me this is already in an RFC? If so, great. If not, could someone advise how difficult this feature might be to add to Squid 3.1? Bonus marks if it doesn't break streaming resources... Any other ways to achieve the same effect?

Thanks Ed W
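A sketch of the proposed exchange, using the hypothetical header name from above and a made-up hash value; the fast-side proxy would re-fetch the object, hash the body, and commute the reply to a 304 on a match:

    GET /news.html HTTP/1.1
    Host: smallsite.example.org
    If-Modified-SHA: "a41be0"

    HTTP/1.1 304 Not Modified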
[squid-users] Modifying content passing through proxy?
For various reasons I am interested in writing a proxy for users on very low bandwidth connections. What ability do I have to modify the content passing through squid?

I'm interested in emulating the ability to resize pictures and generally mangle the HTML that you find in a proxy like Rabbit: http://www.khelekore.org/rabbit/

Grateful for any thoughts Ed W
Re: [squid-users] Modifying content passing through proxy?
I've been looking at similar options, primarily to speed up web browsing on small-screen devices connected via GPRS.

I'm looking at normal-sized clients: IE/Firefox on a laptop/desktop. Again over satellite or GPRS.

What ability do I have to modify the content passing through squid? I'm interested in emulating the ability to resize pictures and generally mangle the HTML that you find in a proxy like Rabbit: http://www.khelekore.org/rabbit/

Have you considered using an instance of the transcoding proxy as a parent proxy for Squid?

Do you mean the IBM software, or do you mean "transcoding proxy" as a generic term for Rabbit? Rabbit doesn't quite do what I want; in particular, it doesn't easily let the user switch back and forth to higher-quality versions. I want more control over this and am quite prepared to write something. I also have other requirements and will probably need a proxy client on the laptop end and also something else at the server end, because I implement a compressing tunnel using an advanced compression algorithm. So I really just wondered how much of this functionality I could push into Squid and how much needs to be on the external proxy...

If anyone knows of any other good proxy applications (open source or commercial) that I could use for this process then please let me know.

Ed W