Re: [squid-users] Re: squidblacklist.org
Ricardo, did you ask a question and answer it yourself?! Anyway, the ideas behind SquidBlacklists are good; I used it initially. Quite good and developing. Yes, some URLs had issues, as well as contradictions when reloading squid while it read the related ACLs. However, I stopped testing them when it all went commercial in the blink of an eye, without warning, unlike when they (the idea developers) initially asked us to contribute (and test). # Edmonds On Sat, Aug 31, 2013 at 8:12 PM, Ricardo Klein klein@gmail.com wrote: Does their proxy list cover a good amount of proxies? Do they have any address ranges for UltraSurf? -- Att... Ricardo Felipe Klein klein@gmail.com On Sat, Aug 31, 2013 at 9:36 AM, Ahmad ahmed.za...@netstream.ps wrote: Hi, I use squidblacklist. It is a very strong ACL and is updated at least every week. Regards - Mr.Ahmad
Re: [squid-users] Well, this is what I concluded about using squid!
Hi Firan, besides what Amos already said, I'd like to add a comment to this last statement of yours: I will give squid a little more time but I think I will give up soon and advise the party I installed squid for to go for another, commercial, cache proxy. The status of Squid as a Free/Open Source Software project doesn't mean it can't be commercially supported. In fact, on the Squid website itself there is a pretty extensive list of companies providing commercial support for squid (http://www.squid-cache.org/Support/services.html). I also expect that most commercial Linux distributions (e.g. Red Hat Enterprise Linux, SUSE, Ubuntu...) will support Squid on their respective distribution as part of their commercial support packages. If that's still not enough, by now in most countries there is a well-developed market of local integrators who - for a price - will support FOSS; in many cases these are companies built by experienced members of the FOSS community. Finally, you can try contacting the Squid developers; some on the team offer consulting services (try getting direct access to engineering from any commercial product). From some of your statements it seems that you want the benefits of a gratis product, but with the support infrastructure of a commercial venture. You want a powerful product in an enterprise- or telco-class environment, but one that can be set up by an inexperienced admin. I'm sorry to say this, but both sets of requirements are contradictory. Setting up a support infrastructure has costs, unfortunately, which can't be sustained by a gratis project (beyond the great community and developer support Squid is already offering); likewise, a powerful product in a demanding environment requires an experienced admin. That said, thanks for your feedback. As Amos already said, we try to do our best to improve Squid and make it even more powerful, flexible and easy to use than it is now.
I hope that the information he and I provided will help you organize a business case that can support your needs, with Squid or with any other product on the market. -- /kinkie
Re: [squid-users] Well, this is what I concluded about using squid!
Hey Golden Shadow, I am always happy to assist if I can. Squid has been used on systems with much higher load than just 25k requests per minute (I have seen that). The real questions are about this: what hardware should be used to serve 25k requests per minute? And then: what Linux version can handle 25k requests per minute? The answer depends strictly on the admin's configuration. It includes choosing the right hardware, the right hardware setup, the right software setup, etc. If you wish to make it work, I would say that Ubuntu is a very nice OS, and also CentOS and SUSE. When you set up this kind of system, you take an experienced system engineer and make sure of the what and the how. As someone asked me before, the first step is to take only the hardware and the OS for a spin: squid on top of RAM, Linux and hardware, no logs at all, just to see how the basic hardware works for the setup. Take into account that there is no need to cache all of the Internet, since this is not the purpose of squid. If some developer thinks he can download the Internet into a tiny 2U or 4U box, or even a 20U rack full of boxes, he is wrong! If you want to understand caching, please just ask... On 09/02/2013 06:09 AM, Amos Jeffries wrote: As always the choice is yours. Thank you for at least mentioning your problems though. All too many people just say it doesn't work and walk away. (StoreID jinx) Eliezer Amos
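The "squid on top of RAM, Linux and hardware, no logs at all" baseline test Eliezer describes could be sketched as a minimal squid.conf; the port, cache_mem size and paths below are assumptions for illustration, not from the thread:

```
# minimal baseline for a hardware/OS spin:
# memory-only cache, no disk cache_dir, logging disabled
http_port 3128
cache_mem 2048 MB
# no cache_dir directive at all => no disk cache to interfere with the test
access_log none
cache_log /dev/null
cache_store_log none
```

The idea is to measure what the hardware and OS can sustain before disk I/O and logging are added back one step at a time.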
[squid-users] external_acl doesn't work after upgrade debian 6 squid 2.7 to debian 7 squid 3.1.20
Hi, I upgraded a Debian 6.0 system with Squid 2.7 to Debian 7.1 with Squid 3.1.20. We have an external ACL included like this: external_acl_type phpauthscript protocol=2.5 children=100 ttl=0 negative_ttl=0 %SRC /etc/squid3/authscript The script updates a MySQL database with the hits of a user, where the user is looked up by the client IP. This worked fine on Debian 5 and Debian 6 with squid 2.7, but on Debian 7 it stops working: the authscript dies as it does not get the IP address. What can I do to get the script working again? best regards thomas
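As a point of comparison, here is a minimal sketch of a %SRC helper in Python. The function names and the database update are hypothetical; only the line-in, OK-or-ERR-out shape is the squid external ACL helper contract:

```python
#!/usr/bin/env python
# Minimal external_acl_type helper sketch: squid sends one line per
# lookup containing the %SRC token (client IP); the helper must answer
# "OK" or "ERR" on one line, unbuffered.
import sys

def handle(line):
    tokens = line.split()
    if not tokens:
        # squid sent nothing usable on this line
        return "ERR"
    # record_hit(tokens[0])  # hypothetical MySQL update keyed on the client IP
    return "OK"

def main():
    for line in sys.stdin:
        sys.stdout.write(handle(line) + "\n")
        sys.stdout.flush()  # replies must not sit in a buffer

if __name__ == "__main__":
    main()
```

Running something this small in place of the PHP script, and logging what it actually receives, should show whether the 2.7-to-3.1 upgrade changed the input format your script expects.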
Re: [squid-users] Eliezer packages
Last time I built squid with LDAP_group I just installed openldap-devel and openldap on the CentOS build machine. If you need, I can check where I left my list of dependencies that I installed for the build (don't know if I still have it) -- Att... Ricardo Felipe Klein klein@gmail.com On Sat, Aug 31, 2013 at 8:09 PM, Eliezer Croitoru elie...@ngtech.co.il wrote: Hey Ricardo, I will be glad to try to add it in the next build but I am not promising anything yet. How do you build your squid today, and on what OS? CentOS? Do you know the dependencies for the helper? If you share more info I will be able to handle it better. Eliezer On 08/30/2013 11:19 PM, Ricardo Klein wrote: Hey Eliezer, can you add LDAP_group to your external-acl-helpers? Then I could use your packages to test new squid versions instead of building my own... any chances? -- Att... Ricardo Felipe Klein klein@gmail.com
[squid-users] tproxy and url-rewrite
I have a squid with tproxy and url-rewrite. Some url-rewrites go to localhost OK: rewrite-url=http://127.0.0.1/; The problem is that squid makes the request using the original client IP (as tproxy has to) and localhost can't answer. Is there a way to force a tcp_outgoing_address (or disable tproxy) for localhost URLs?
Re: [squid-users] Eliezer packages
I will look at the list of files that should be used by the spec file. I am not building a complete and bulletproof RPM, but since it runs on so many systems I assume the tests until now provide enough data on the stability of the RPM. Eliezer On 09/02/2013 09:06 PM, Ricardo Klein wrote: Last time I built squid with LDAP_group I just installed openldap-devel and openldap on the CentOS build machine. If you need, I can check where I left my list of dependencies that I installed for the build (don't know if I still have it) -- Att... Ricardo Felipe Klein klein@gmail.com [...]
Re: [squid-users] Eliezer packages
Let me know when you build something and I will test it ;-) -- Att... Ricardo Felipe Klein klein@gmail.com On Mon, Sep 2, 2013 at 5:11 PM, Eliezer Croitoru elie...@ngtech.co.il wrote: I will look at the list of files that should be used by the spec file. I am not building a complete and bulletproof RPM, but since it runs on so many systems I assume the tests until now provide enough data on the stability of the RPM. Eliezer [...]
Re: [squid-users] tproxy and url-rewrite
On 09/02/2013 11:00 PM, Alfredo Rezinovsky wrote: I have a squid with tproxy and url-rewrite. Some url-rewrites go to localhost OK: rewrite-url=http://127.0.0.1/; The problem is that squid makes the request using the original client IP (as tproxy has to) and localhost can't answer. Is there a way to force a tcp_outgoing_address (or disable tproxy) for localhost URLs? Hey, There is no bug in that: when you do a rewrite like this, it should not work! You should never do TPROXY interception and then connect to a localhost address. Nobody can route a localhost 127.0.0.0/8 address anywhere other than the local machine; this is how computers work. If you describe what you want to achieve using url-rewrite, we can suggest a way to make it work. There is a workaround: use a cache_peer with a no-tproxy flag on it. Eliezer
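The cache_peer workaround Eliezer mentions could look something like the sketch below. The peer name and ACL name are invented for illustration, and the no-tproxy flag is not available in every squid release, so check your version's cache_peer documentation first:

```
# hypothetical sketch: route rewritten-to-127.0.0.1 requests through a
# parent peer so the outgoing connection does not spoof the client IP
acl local_dst dst 127.0.0.0/8
cache_peer 127.0.0.1 parent 80 0 no-tproxy no-query originserver name=localweb
cache_peer_access localweb allow local_dst
cache_peer_access localweb deny all
never_direct allow local_dst
```

With this, only the rewritten localhost traffic goes via the peer with a normal (non-spoofed) source address; all other traffic keeps its TPROXY behaviour.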
Re: [squid-users] Number of Objects Inside aufs Cache Directory
Hi Amos, Thanks for your reply. I've read the few hundreds thing in Squid: The Definitive Guide by Duane Wessels, and I think his recommendation relates to the performance of squid more than to file system constraints. I quote the following from the book: Some people think that Squid performs better, or worse, depending on the particular values for L1 and L2. It seems to make sense, intuitively, that small directories can be searched faster than large ones. Thus, L1 and L2 should probably be large enough so that each L2 directory has no more than a few hundred files. Best regards, Firas From: Amos Jeffries squ...@treenet.co.nz To: squid-users@squid-cache.org Sent: Monday, September 2, 2013 3:33 AM Subject: Re: [squid-users] Number of Objects Inside aufs Cache Directory On 2/09/2013 8:40 a.m., Golden Shadow wrote: Hello there! I've read that the number of first level and second level aufs subdirectories should be selected so that the number of objects inside each second level subdirectory is no more than a few hundred. Not sure where that came from. Few hundreds sounds like advice for working with FAT-16 formatted disks - or worse. Most modern filesystems can handle several thousands easily. The real reason for these parameters is that some filesystems start producing errors. For example: Squid stores up to 2^24 objects in a cache_dir, but FAT16 fails with more than 2^14 files in one directory, and IIRC ext2 and/or ext3 start giving me trouble around 2^16 files in one directory. I'm not sure about other OS; I've not hit their limits myself. In one of the subdirectories of the following cache_dir on my 3.3.8 squid, there are more than 15000 objects! In other subdirectories, there are ZERO objects, is this normal?! Yes. It depends entirely on how many objects are in the cache. The earlier fileno entries fill up first, so the directories those fileno map to will show lots of files while later ones do not.
NOTE: when changing these L1/L2 values the entire cache_dir fileno-to-file mapping gets screwed up. So you need to erase the contents and use an empty location/directory to build the new structure shape inside. cache_dir aufs /mnt/cachedrive2/small 250000 64 256 min-size=32000 max-size=200000 If you want to increase that spread you can change the 64 to 128. It should halve the number of files in the fullest directory. With 250GB storing 32KB-200KB sized objects you are looking at a total of between 1,310,720 and 8,192,000 objects in that particular cache. Amos
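Amos's object-count range can be checked with a few lines of arithmetic. The 250 GiB capacity and the 32 KB / 200 KB object-size bounds are the figures from his message:

```python
# Rough check of the cache_dir object-count bounds for a 250 GiB store
# holding objects between 32 KB (min-size) and 200 KB (max-size).
cache_kb = 250 * 1024 * 1024      # 250 GiB expressed in KiB
max_objects = cache_kb // 32      # every object at the 32 KB minimum
min_objects = cache_kb // 200     # every object at the 200 KB maximum
print(min_objects, max_objects)   # 1310720 8192000
```

This matches the 1,310,720 to 8,192,000 range quoted above.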
Re: [squid-users] Number of Objects Inside aufs Cache Directory
Hey there, Since squid holds an internal DB, no lookup needs to be done at the file system level (such as ls | grep file_name) in order to find a file and fetch all the details from its inode. The L1 and L2 levels exist so as not to reach the FS limit of files per directory. Say with one L1 directory you have a limit of 65k files for an ext FS; when using L1 and L2 directories you reach a much higher number of files at the upper limit. Instead of a 1 x 65k limit you would have 128 x 256 x 65k, which is about 2,129,920,000. The FS by itself won't be the limit, but the CPU, RAM, etc. I have seen a comparison between xfs and ext4, and it seems to me that xfs shows there is a limit to what you can expect a FS to do. Also, GlusterFS showed a very smart way of handling a couple of things, with hash-based distribution of files over a couple of nodes. Using 15k SAS drives you can see that there is a limit to what speed the HDD can do, but still, when you have enough RAM and CPU you can let the OS handle both the current request and the next read/write scheduled for the disk.
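Eliezer's upper-bound figure follows directly from multiplying the directory counts; the ~65k files-per-directory figure is his assumption for an ext filesystem:

```python
# Upper bound on addressable files for an aufs cache_dir with
# L1=128 and L2=256 subdirectories, assuming ~65k files per directory.
per_dir_limit = 65_000
l1, l2 = 128, 256
print(l1 * l2 * per_dir_limit)  # 2129920000
```

As he notes, long before this bound is reached, CPU, RAM and disk I/O become the practical limits.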
In any case there is a limit to how much IO one CPU and HDD can handle at the same time. When a RAID system is used and it has more RAM than the local system, I can understand the usage of such a RAID device, but as I said before, a Squid implementation should be taken one step at a time, and a routing-based LB should be used to distribute the load between a couple of systems. Now let's take the system up from layer 1 to layer 7. Layer 1 would be the copper, and the limit is either 1Gbps, or 10Gbps in the case of optical. I would assume you have a core router that needs to know about the load of each instance periodically (30 secs). A Juniper router can take it at 600-800 MHz while doing routing only; a Linux server at 2-3 GHz can take a bit more than what this Juniper can... if designed right. A small keepalived and some load balancer magic on any of the enterprise-class OSes would do the trick in a basic routing mode. Once layers 2-3 are up, you can work on layer 4 and up, which is the level of squid. From the load balancer to the proxies, create a small internal network and make sure that the traffic is marked when incoming and outgoing, so that the LB sends the egress traffic to the edge and not to the clients. Now try to think it over and add tproxy to the proxies step by step. So the system can be 1-2 LBs that take the full network load, and each proxy takes only 1-2 Gb of the balanced network load. It is nice to have one server with 64 or 128 cores, but the OS needs to know about and use all the resources in a way that lets the application handle only layers 4-7 and leave all the rest to the kernel. For now it's a dream, and still TIER-1, TIER-2 and other providers use squid and they are happy, so it's not the software that is to blame for something that doesn't work as expected. Regards, Eliezer On 09/03/2013 12:45 AM, Golden Shadow wrote: Hi Amos, Thanks for your reply. [...]
Re: [squid-users] Well, this is what I concluded about using squid!
Hi Amos, Kinkie and Eliezer @Kinkie: It's very interesting to know about the companies that provide commercial support for squid on (http://www.squid-cache.org/Support/services.html). Thanks for your reply and for providing this link. @Eliezer: Thanks for your reply. I'm already using CentOS, but yeah, I think I need to consider faster disk drives, perhaps SSDs. @Amos: Thanks a lot for your detailed explanation, you made several things clearer to me. By configure options, I meant the options that should be used with the configure script that is run before make. Is there any good documentation about their meanings and what options to use, with what values, based on a certain environment and/or hardware? Take for example --enable-async-io=; should I use it? What is the best value to use? I don't think there is good documentation about those configure options. As for: http://wiki.squid-cache.org/Features/Tproxy4 Although it takes care of most things a beginner would need to implement TPROXY, it looks summarized and in my opinion does not remove the confusion caused by the many old and obsolete articles about how to implement TPROXY. When I first started, I was puzzled whether I should compile the kernel from source to use the TPROXY patches or just use the kernel that comes with CentOS 6.4. Moreover, the following section really confused me: Use DIVERT to prevent existing connections going through TPROXY twice: iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT Based on my understanding of this iptables rule, I think it is intended for reply packets coming from web servers to squid (with the spoofed IP address), right?! Saying that its purpose is to prevent existing connections going through TPROXY twice really confused me and I still can't understand what this means!
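For context, the DIVERT rule Firas quotes is normally one line of a larger ruleset along the lines of the Tproxy4 wiki page; a sketch (the marks, table number and ports follow the wiki's examples and may need adapting):

```
# DIVERT chain: mark-and-accept packets that already belong to an open
# local socket, so they go straight to the local stack
iptables -t mangle -N DIVERT
iptables -t mangle -A DIVERT -j MARK --set-mark 1
iptables -t mangle -A DIVERT -j ACCEPT
iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT
# only packets with no matching local socket reach the TPROXY target
iptables -t mangle -A PREROUTING -p tcp --dport 80 -j TPROXY \
  --tproxy-mark 0x1/0x1 --on-port 3129
# deliver marked packets locally
ip rule add fwmark 1 lookup 100
ip route add local 0.0.0.0/0 dev lo table 100
```

Read this way, the -m socket match catches every packet that already maps to an existing local socket, which includes the later packets of already-intercepted client connections and the replies arriving for squid's spoofed-source upstream connections; marking them delivers them to the local stack directly instead of re-running interception, which is what "going through TPROXY twice" refers to.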
I know there is a lot of documentation online about tuning file descriptors and similar things, and yeah I think it's not the job of squid documentation to talk about how to do those things. What I meant is that it would perhaps be much easier for newbies to be notified that they may need to tune file descriptors, that they may get SYN floods, that they may have page faults, and all other things that are waiting for them down the way, all in a single document. This will also reduce the number of duplicate questions on this list, I guess. Thanks once again and I really appreciate your support. Best regards, Firas Hi Firan, Sorry to hear you have had such a bad experience with Squid. Some comments inline ... On 2/09/2013 10:13 a.m., Golden Shadow wrote: I've tried squid previously only in a lab environment. Over the last 3 months though, I had the chance to try squid in a real environment, in which the TPROXY squid I installed receives around 25000 http requests per minute. Unfortunately, I've concluded that if someone was to install squid in a real environment, there would be no specific guide that he can follow to avoid all the problems that are waiting for him on the way. A guide about what is the best hardware to choose, ... but there is no best. Squid is designed to work on a *very* wide range of hardware (and virtual machine) systems in an even more varied set of environments, many of which have not even been designed or invented yet (thanks to POSIX). If you want advice you have only to outline what environment you are trying to work with and throw it out for the rest of the community to respond to. If anyone out there has past experience you may hear back, but be aware that even so any response is only _past_ experience and usually a personal opinion. What works best for a coffee shop is not anywhere near what works best for a Tier-1 gateway between countries, which is different again from a mobile network gateway. 
Also, the number of experienced admins capable of dealing with large-scale networks at a debugging level seems to be getting smaller. (If this whole cloud fad actually meets its marketing goal we will probably end up with only one world-wide admin who has nobody else around to ask advice from. Sucks to be learning that job.) the recommended configure options (there is no good documentation for most of them), ... except the configuration manual (http://www.squid-cache.org/Doc/config/). If there are issues with any option please point it out. We are constantly improving it. The authoritative reference for that is the rather long squid.conf.documented which should have been installed with your proxy. how to implement a TPROXY using squid, ... http://wiki.squid-cache.org/Features/Tproxy4#Squid_Configuration. That one sub-section of the page (one single 'tproxy' flag even) really is the whole Squid part required to get TPROXY running on a basic proxy or cache. Everything else listed on that page and elsewhere is system-specific details, policy choices, and troubleshooting help. We cannot decide for you what
Re: [squid-users] Number of Objects Inside aufs Cache Directory
Hi Eliezer, Thanks a lot for your detailed message. It's good to know that L1 and L2 are only there to avoid reaching the file system's limit of files per directory. I think I need to redesign the whole thing from the bottom up as you outlined. Balancing the load between two squid nodes that efficiently use the hardware resources is better than using one squid node that does not efficiently use the available resources. I think using WCCP to redirect traffic and balance the load is a good choice, and it has worked perfectly for me so far. It also offers good reliability in case one node goes down. Thanks once again for your support. Best regards, Firas - Original Message - From: Eliezer Croitoru elie...@ngtech.co.il To: squid-users@squid-cache.org Sent: Tuesday, September 3, 2013 1:29 AM Subject: Re: [squid-users] Number of Objects Inside aufs Cache Directory [...]
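On the squid side, a WCCP setup like the one Firas describes is configured with the wccp2_* directives; a minimal sketch (the router address and methods below are placeholders and depend on the router's capabilities):

```
# hypothetical sketch: register this squid node with a WCCPv2 router,
# which balances redirected traffic across the registered nodes and
# stops redirecting to a node whose keepalives go silent
wccp2_router 192.0.2.1
wccp2_forwarding_method gre
wccp2_return_method gre
wccp2_service standard 0
```

Each squid node registers itself independently, which is what gives the failover behaviour Firas mentions when one node goes down.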
Re: [squid-users] Number of Objects Inside aufs Cache Directory
On 09/03/2013 02:24 AM, Golden Shadow wrote: I think I need to redesign the whole thing from the bottom up as you outlined. Balancing the load between two squid nodes that efficiently use the hardware resources is better than using one squid node that does not efficiently use the available resources. I think using WCCP to redirect traffic and to balance the load is a good choice and it works perfectly for me so far. It also offers good reliability in case one node goes down. In the past I wanted to write a tool that would allow a squid cluster to run smoothly by doing periodic health checks. It's a basic crontab task that sets 1-4 flags per proxy in the cluster, in either a DB or a small FS file, which might hold a record of the stability of the connection etc. If someone can sketch a way this kind of helper could work, I will be glad to write some code in Ruby to make it work. Eliezer
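One way the per-proxy flags Eliezer describes could work is a cron-run probe that turns each proxy's measurements into a small set of flags; the sketch below (in Python rather than the Ruby he proposes, and with invented thresholds) only shows the classification step, not the probing or storage:

```python
# Sketch of a health-check classifier for a squid cluster: a cron job
# would probe each proxy, call classify(), and append the flags to a DB
# or a small file per proxy. Thresholds are illustrative assumptions.
def classify(reachable, response_ms, hit_ratio):
    """Turn one probe result into 1-3 health flags."""
    flags = ["up" if reachable else "down"]
    if reachable:
        flags.append("fast" if response_ms < 200 else "slow")
        flags.append("caching" if hit_ratio >= 0.2 else "low-hit")
    return flags

print(classify(True, 120, 0.35))   # ['up', 'fast', 'caching']
print(classify(False, 0, 0.0))     # ['down']
```

A load balancer or WCCP router could then consult the stored flags to decide which proxies stay in rotation.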