Re: [squid-users] Squid 3.4 very high cpu - strace.
> Another experiment is to try purging and rebuilding the ssl_crtd helper cache.

Hi Amos,

We do the above on every squid restart anyway (via a wrapper script).

> Your config file has some nits (may not be relevant to the problem though):
> * Try switching the order of manager localhost so localhost is tested first. Manager has become a regex ACL.
> * hierarchy_stoplist can be removed completely. It is serving no purpose in your config.

Yeah, I know! This config has pretty much just been tweaked from an original one that's about 11 years old.

I'm still really keen to figure out why we can't proceed to 3.4, and then hopefully get it fixed. Management have already asked to get a reseller in to look at Bluecoat/Barracuda/Websense etc., so I'll try my best to get a good number of users on each config change I make to diagnose the problem.

Thanks for your time.

Cheers,
Alex
Re: [squid-users] Squid 3.4 very high cpu - strace.
On 06/20/2014 03:32 PM, Alex Crow wrote:
> On 19/06/14 18:57, Eliezer Croitoru wrote:
>> OK and what about the squid.conf? I have not seen one until now and it seems to me kind of important...
>> I do know about systems that do not get 100% cpu and it's weird that we have a couple of guys having the issue while others do not.
>> Thanks, Eliezer
> FYI, config attached. The same config works without CPU spikes in 3.3.
> Alex

OK, after reading the config file it seems there are a couple of things that we/you should be aware of when looking at the issue:
1. The external helpers code was changed from 3.3 to 3.4 (one way).
2. You are using delay_pools.
3. You are using NTLM authentication.

In the past there was a suspicion that the new helper-related code might cause an issue like this, but that is yet to be verified. (This needs testing, and an idea of how to prove whether it is a real suspect or a bogus one.)

About NTLM auth: there is certainly some CPU overhead from using NTLM, due to a couple of layers stacked one on top of the other, and it has been shown that there is a difference between using NTLM and not using it at all. That doesn't prove what in NTLM is causing the issue, and I am not sure it will be fixed, given the basic fact that NTLM maintenance stopped in 200X (2003 or 2006; I am not sure of the exact date).

The only option I see is to do two things: remove the NTLM and the group external-helper ACLs for a testing period, to verify that the high CPU usage appears only when these are running, and that the system runs fine while the delay_pools are still intact. This will narrow the list from 3 suspects to 2.

There is also another suspect: over-use of squid ACLs to block or allow domains/regex/etc. Whether these are an issue can be checked by removing the external_acl and NTLM helpers and testing how squid behaves.

** Another tiny detail would be: what bandwidth is this server pushing? How many MBps or Mbps (MBps = Mbps/8)?

I know that it can be painful to run these tests, but if you have the option to verify the issue it will narrow things down pretty fast. Also, I am almost sure that this thread should be summarized into either a bug report or, first, a thread on the squid-dev list, so you would get better help and direction from the developers.

Thanks,
Eliezer
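As a rough sketch of the test period described above, the helper-related lines could be commented out while the delay pool lines stay in place. The actual squid.conf is not reproduced in this thread, so every helper path, ACL name, group name and network range below is a hypothetical placeholder.

```
# --- Test period: NTLM and group-lookup helpers disabled (all names hypothetical) ---
# auth_param ntlm program /usr/bin/ntlm_auth --helper-protocol=squid-2.5-ntlmssp
# auth_param ntlm children 30
# external_acl_type ad_group %LOGIN /usr/lib/squid3/ext_wbinfo_group_acl
# acl allowed_group external ad_group InternetUsers
# http_access allow allowed_group

# Temporary source-address rule so clients can still use the proxy
# without authentication during the test (placeholder range)
acl localnet src 10.0.0.0/8
http_access allow localnet
http_access deny all

# delay_pools / delay_class / delay_parameters / delay_access lines
# are left exactly as they were for this step
```

If CPU usage stays normal with this in place, the helpers remain a suspect; if it still spikes, attention shifts to the delay pools and the remaining ACLs.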
Re: [squid-users] Squid 3.4 very high cpu - strace.
> FYI, config attached. The same config works without CPU spikes in 3.3.
> Alex

Can you try with

delay_access 1 allow !CONNECT

(for each rule)
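A minimal sketch of what that suggestion might look like in context, assuming a single class 1 pool. The pool class, the rate figures and the localnet ACL are illustrative assumptions, since the attached config is not shown in the thread; only the !CONNECT condition comes from the message.

```
# Assumed ACL definitions (placeholders)
acl CONNECT method CONNECT
acl localnet src 10.0.0.0/8

# Assumed pool layout: one aggregate bucket, 64000 bytes/s restore, 64000 bytes max
delay_pools 1
delay_class 1 1
delay_parameters 1 64000/64000

# Each existing delay_access allow rule gains !CONNECT,
# so CONNECT tunnels bypass the pool
delay_access 1 allow localnet !CONNECT
delay_access 1 deny all
```

With more than one pool, the same !CONNECT condition would be appended to the allow rules of each pool.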
Re: [squid-users] Squid 3.4 very high cpu - strace.
> Can you try with delay_access 1 allow !CONNECT (for each rule)

I forgot: http://bugs.squid-cache.org/show_bug.cgi?id=2907
Re: [squid-users] Squid 3.4 very high cpu - strace.
On 20/06/14 14:28, Eliezer Croitoru wrote:
> OK, after reading the config file it seems there are a couple of things that we/you should be aware of when looking at the issue:
> 1. The external helpers code was changed from 3.3 to 3.4 (one way).
> 2. You are using delay_pools.
> 3. You are using NTLM authentication.
>
> In the past there was a suspicion that the new helper-related code might cause an issue like this, but that is yet to be verified. (This needs testing, and an idea of how to prove whether it is a real suspect or a bogus one.)
>
> About NTLM auth: there is certainly some CPU overhead from using NTLM, due to a couple of layers stacked one on top of the other, and it has been shown that there is a difference between using NTLM and not using it at all. That doesn't prove what in NTLM is causing the issue, and I am not sure it will be fixed, given the basic fact that NTLM maintenance stopped in 200X (2003 or 2006; I am not sure of the exact date).
>
> The only option I see is to do two things: remove the NTLM and the group external-helper ACLs for a testing period, to verify that the high CPU usage appears only when these are running, and that the system runs fine while the delay_pools are still intact. This will narrow the list from 3 suspects to 2.
>
> There is also another suspect: over-use of squid ACLs to block or allow domains/regex/etc. Whether these are an issue can be checked by removing the external_acl and NTLM helpers and testing how squid behaves.
>
> ** Another tiny detail would be: what bandwidth is this server pushing? How many MBps or Mbps (MBps = Mbps/8)?
>
> I know that it can be painful to run these tests, but if you have the option to verify the issue it will narrow things down pretty fast. Also, I am almost sure that this thread should be summarized into either a bug report or, first, a thread on the squid-dev list, so you would get better help and direction from the developers.
>
> Thanks,
> Eliezer

Hi,

The first thing I'm going to try is disabling delay pools for CONNECT, then after that for all requests. As disabling NTLM will leave us more open than I'd like, that would be the following step.

Cheers
Alex
Re: [squid-users] Squid 3.4 very high cpu - strace.
On 21/06/2014 12:32 a.m., Alex Crow wrote:
> On 19/06/14 18:57, Eliezer Croitoru wrote:
>> OK and what about the squid.conf? I have not seen one until now and it seems to me kind of important...
>> I do know about systems that do not get 100% cpu and it's weird that we have a couple of guys having the issue while others do not.
>> Thanks, Eliezer
> FYI, config attached. The same config works without CPU spikes in 3.3.

Another experiment is to try purging and rebuilding the ssl_crtd helper cache.

Your config file has some nits (may not be relevant to the problem though):
* Try switching the order of manager localhost so localhost is tested first. Manager has become a regex ACL.
* hierarchy_stoplist can be removed completely. It is serving no purpose in your config.

Amos
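The two nits could be sketched roughly as follows. The exact lines in the attached config are not shown in the thread, so the "before" forms here are assumed from the layout commonly found in older squid.conf files.

```
# Assumed old form (manager tested first):
#   http_access allow manager localhost
# Reordered so the localhost source check runs first; the regex-based
# manager ACL is then only evaluated for requests from localhost.
http_access allow localhost manager
http_access deny manager

# Assumed old line, which can simply be deleted:
#   hierarchy_stoplist cgi-bin ?
```

ACLs on one http_access line are tested left to right and evaluation stops at the first mismatch, which is why the order matters for cost.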
Re: [squid-users] Squid 3.4 very high cpu - strace.
On 21/05/14 08:30, Amos Jeffries wrote:
> On 21/05/2014 8:11 a.m., Alex Crow wrote:
>> Wrong on my part again. Even after changing the memory_replacement_policy, it still got to 100% cpu after Shift-reload in Thunderbird a few times - even disabling cache_mem entirely did not eliminate it. 3.3 never gets above about 67% load no matter how many times the page is reloaded.
> Thunderbird, are these troubles all coming from HTML emails?
> Does using AUFS instead of diskd cache types help? There are a lot of calls in that trace polling the diskd helpers.
> Amos

Hi Amos,

aufs is no better - in fact it seems to build up CPU much faster than diskd on just a couple of page reloads.

Alex
Re: [squid-users] Squid 3.4 very high cpu - strace.
OK and what about the squid.conf? I have not seen one until now and it seems to me kind of important...
I do know about systems that do not get 100% cpu and it's weird that we have a couple of guys having the issue while others do not.

Thanks,
Eliezer

On 05/20/2014 09:54 PM, Alex Crow wrote:
> Hi Amos, all,
>
> I have set up a test box with latest 3.4.5 nightly. I get 95-100% cpu even with one client accessing the cache. I've attached a compressed strace of the child process in case anything is evident from that.
>
> Please tell me what else I might need to do to help resolve this issue. I'm hoping this will help get to the bottom of why a number of people are having this issue on 3.4.x.
>
> Any help much appreciated as always.
>
> Alex
Re: [squid-users] Squid 3.4 very high cpu - strace.
On 21/05/2014 8:11 a.m., Alex Crow wrote:
> Wrong on my part again. Even after changing the memory_replacement_policy, it still got to 100% cpu after Shift-reload in Thunderbird a few times - even disabling cache_mem entirely did not eliminate it. 3.3 never gets above about 67% load no matter how many times the page is reloaded.

Thunderbird, are these troubles all coming from HTML emails?

Does using AUFS instead of diskd cache types help? There are a lot of calls in that trace polling the diskd helpers.

Amos
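For reference, the AUFS-versus-diskd experiment amounts to changing the store type on the cache_dir line. A sketch, with the directory and the size/L1/L2 values as placeholders since the real values are not given in the thread:

```
# Before: diskd store (path and sizes are placeholders)
# cache_dir diskd /var/spool/squid 10000 16 256

# After: aufs store with the same size and directory layout
cache_dir aufs /var/spool/squid 10000 16 256
```

Squid has to be restarted for a store type change to take effect.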
Re: [squid-users] Squid 3.4 very high cpu - strace.
> Thunderbird, are these troubles all coming from HTML emails?

I meant Firefox, sorry - I was writing the email in Thunderbird so typed that in instead. Not quite 40 yet but already losing it!

> Does using AUFS instead of diskd cache types help? there are a lot of calls in that trace polling the diskd helpers.

I've not tried it but I'll have a go.

Cheers
Alex
Re: [squid-users] Squid 3.4 very high cpu - strace.
I think I've just found something. I had this set:

memory_replacement_policy heap GDSF

Replacing this with:

memory_replacement_policy lru

got rid of the high CPU in 3.4 (works OK in 3.3). I will try heap LRU.

Cheers
Alex

On 20/05/14 19:54, Alex Crow wrote:
> Hi Amos, all,
>
> I have set up a test box with latest 3.4.5 nightly. I get 95-100% cpu even with one client accessing the cache. I've attached a compressed strace of the child process in case anything is evident from that.
>
> Please tell me what else I might need to do to help resolve this issue. I'm hoping this will help get to the bottom of why a number of people are having this issue on 3.4.x.
>
> Any help much appreciated as always.
>
> Alex
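The three variants mentioned in this message, side by side as a squid.conf sketch. Only one memory_replacement_policy line should be active at a time, and the follow-up message in this thread reports that the change did not hold up under repeated reloads, so this is a record of the experiment rather than a fix.

```
# Original setting reported in the message
# memory_replacement_policy heap GDSF

# First alternative tried
memory_replacement_policy lru

# Next candidate the message says will be tried
# memory_replacement_policy heap LRU
```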
Re: [squid-users] Squid 3.4 very high cpu - strace.
Wrong on my part again. Even after changing the memory_replacement_policy, it still got to 100% cpu after Shift-reload in Thunderbird a few times - even disabling cache_mem entirely did not eliminate it. 3.3 never gets above about 67% load no matter how many times the page is reloaded.

So again, I hope the trace shows something up.

Cheers
Alex

On 20/05/14 20:04, Alex Crow wrote:
> I think I've just found something. I had this set:
>
> memory_replacement_policy heap GDSF
>
> Replacing this with:
>
> memory_replacement_policy lru
>
> got rid of the high CPU in 3.4 (works OK in 3.3). I will try heap LRU.
>
> Cheers
> Alex
>
> On 20/05/14 19:54, Alex Crow wrote:
>> Hi Amos, all,
>>
>> I have set up a test box with latest 3.4.5 nightly. I get 95-100% cpu even with one client accessing the cache. I've attached a compressed strace of the child process in case anything is evident from that.
>>
>> Please tell me what else I might need to do to help resolve this issue. I'm hoping this will help get to the bottom of why a number of people are having this issue on 3.4.x.
>>
>> Any help much appreciated as always.
>>
>> Alex