Oops, forgot to send the reply about how to make a redirector use less CPU and run faster to the squid-users list, so here it is.

---------- Forwarded message ----------
Date: Tue, 26 Jan 1999 04:32:52 +1100 (EST)
From: Jeffrey Borg <[EMAIL PROTECTED]>
To: Olivier GOSSELET <[EMAIL PROTECTED]>
Subject: Re: SNMP : Info on cacheCpuUsage And Redirector Program Considerations

Hi there.

On Mon, 25 Jan 1999, Olivier GOSSELET wrote:

> I am currently monitoring a Squid/2.1.PATCH2 cache with MRTG 2.5.4c. In the
> MIB description, cacheCpuUsage is defined as the "Amount of cpu seconds
> consumed". The first question is: does it also include the amount of cpu
> seconds consumed by the eventual redirector processes?

Not sure; it may not.

> If not, is there an easy way to monitor it with mrtg? This is important for
> me because a few months ago I posted a mail to this mailing list asking for
> the best way to filter out a big list of web sites (more or less 10,000
> so far, and it's increasing every month). Some people replied that the list
> was too big to handle with squid access lists. They said: use a redirector
> program. We decided to use Squirm for the two advertised reasons: very,
> very fast, and virtually no memory usage!

Write a script for mrtg which takes the output of ps, greps for squirm and
squid, and adds up the times (a rough sketch follows further down).

> But when you look at this:
>
> USER   PID %CPU %MEM  SIZE   RSS TTY STAT START   TIME COMMAND
> squid   81  0.0 26.3 69124 67848  ?  S    Jan 20   2:05 (squid) -sY
> squid   82  0.0  0.1   852   384  ?  S    Jan 20   0:00 (dnsserver)
> squid   87  1.3  6.5 17392 16808  ?  R    Jan 20  99:10 (squirm)
> squid   88  0.5  6.5 17392 16872  ?  R    Jan 20  44:27 (squirm)
> squid   89  0.3  6.5 17392 16872  ?  R    Jan 20  28:38 (squirm)
> squid   90  0.2  6.5 17392 16872  ?  R    Jan 20  19:54 (squirm)
> squid   91  0.1  6.5 17392 16872  ?  R    Jan 20   9:33 (squirm)
>
> First you can see that squirm takes a large amount of memory, and it's also
> a big cpu time consumer when you compare it to Squid!

It will always run that high: every URL that fails to match your list costs
more or less 10,000 pattern checks, and THAT is what is wasting the cpu time.

A) For squirm, have you put abort statements at the start of the patterns
   file?

B) Have you made use of squirm's accelerator statements?

> Does someone know an alternate solution to squirm?

C) Look into alternatives to squirm. Ideas that come to mind:

Run an SQL server (mysql would work fine here). This can always be run on a
different box from the squid, and then the redirector is just an interface to
mysql. The main problem remains that with 10,000 URLs you waste cpu cycles on
unsuccessful matches; there are options for speeding this up in any
redirector.

Make a good list too. Try looking through squid's logs and compiling a "good"
list of web sites, and check these first in the redirector. That saves it from
grinding through the full 10,000+ lines only to not find a match.
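Here's the mrtg script idea from above, sketched out. A minimal sketch in
perl, assuming a ps that understands "ps axo time,comm" (adjust the flags for
your system); the process names are the ones from the listing above.

#!/usr/bin/perl
# Rough sketch of an mrtg external script: add up the TIME column of
# ps for the squid, dnsserver and squirm processes.
my $total = 0;
open(PS, "ps axo time,comm |") or die "ps: $!";
while (<PS>) {
    next unless /\b(squid|squirm|dnsserver)\b/;
    my ($time) = split ' ', $_;                 # TIME is MM:SS or HH:MM:SS
    my $secs = 0;
    $secs = $secs * 60 + $_ for split /:/, $time;
    $total += $secs;
}
close(PS);
# mrtg external scripts print four lines: two values, uptime, target name
print "$total\n$total\n0\n squid+squirm cpu seconds\n";

Point an mrtg target at that as a counter and it will graph the rate the
processes burn cpu seconds, which you can put next to cacheCpuUsage.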
Also keep a cache of recently accessed good sites (cut the filename off) so it
feeds a very quick abort list stored in a third database.

It would work like this: a client requests

http://www.mywebsite.com/someuser/filename.html

and that page probably has 10 other images associated with it in the same
directory. At the moment every one of them gets checked through the
10,000-line list; let's cut the redirector's work by a factor of 10.

Database 1 = bad URLs
Database 2 = good URLs (what is known to be of use)
Database 3 = temporary cache (an entry only lasts, say, 10 minutes, depending
             on how many URLs build up in it)

The redirector gets the URL above. First check it against Database 3 (if it
matches, just let it through). Then check Database 2 (if it matches, just let
it through). Then check Database 1. If there was no match in Database 1, cut
the filename off and put

http://www.mywebsite.com/someuser/

into Database 3, because the client will be requesting the 10 other objects
real soon now and there is no point processing them that hard.

All of this involves writing your own redirector in whatever language you
prefer (perl would work fine, as the database is doing all the work; see the
sketch in the PS below). The database will chew RAM and cpu, BUT it won't take
5x the RAM like those five redirector processes do, and the temporary URL
cache will save your cpu and speed things right along. The good list is
optional and depends on the load that is left once the temp cache is in use.

> Each week we have 25 new clients (we are connecting 600 schools to the
> Internet). Until now there has been no performance problem, but I am really
> worried about the future. I don't want to multiply machines just because of
> cpu limits; it's not cheap to duplicate the hardware and it's harder to
> maintain!

You are going to need something which will reduce the load on that box;
that's a lot of clients.

Jeff
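PS. Here is a rough sketch in perl of the three-database redirector described
above. Plain hashes stand in for the databases to keep it self-contained; in a
real setup Databases 1 and 2 would be indexed mysql tables and Database 3
would carry timestamps so entries can expire. The file names and the block
page URL are made up.

#!/usr/bin/perl
# Three-database redirector sketch: temp cache, good list, bad list.
$| = 1;                                   # squid needs unbuffered replies

my %bad  = list("/etc/squid/bad.urls");   # Database 1: bad urls
my %good = list("/etc/squid/good.urls");  # Database 2: known good urls
my %temp;                                 # Database 3: temporary cache

while (<STDIN>) {
    # squid hands the redirector "URL ip-address/fqdn ident method"
    my ($url) = split ' ';
    (my $dir = $url) =~ s{[^/]*$}{};      # cut the filename off

    if ($temp{$dir} || $good{$url}) {
        print "\n";                       # blank line = leave the URL alone
    }
    elsif ($bad{$url}) {
        print "http://proxy.example/blocked.html\n";   # made-up block page
    }
    else {
        $temp{$dir} = time;   # the client will ask for the other objects
        print "\n";           # in this directory real soon now
        # (expiring old %temp entries after ~10 minutes is left out here)
    }
}

sub list {
    my ($file) = @_;
    my %h;
    open(my $fh, '<', $file) or return;
    while (<$fh>) { chomp; $h{$_} = 1 }
    return %h;
}

The point is that only the first miss in a directory pays for a bad-list
lookup; the next 10 objects hit Database 3 and go straight through.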
