On 06/11/2013 04:13 AM, Jose-Marcio Martins wrote:
Thanks for the info,
On 06/11/2013 05:26 AM, Amos Jeffries wrote:
On 11/06/2013 9:03 a.m., Jose-Marcio Martins wrote:
When Squid reloads it pauses *everything* it is doing while the reload is happening.
* 100% of resources get dedicated to the reload, ensuring the fastest possible recovery, then everything resumes exactly as before.
OK! But, IMHO, one shouldn't need to reload Squid just because a helper needs to reload its database. That is a design error on huge servers.
There is a big misunderstanding:
in the old days, when the only URL filter was squidGuard, Squid had to be reloaded in order for squidGuard to reload its database.
And when Squid reloads, *everything* pauses.
_But things have changed since then_:
- ICAP-based URL filters can reload a URL database without Squid reloading.
- ufdbGuard, which is a URL redirector just like squidGuard, can also reload a URL database without Squid reloading.
The above implies that ICAP-based filters and ufdbGuard are good alternatives to squidGuard or to filtering by ACLs.
When a Squid helper reloads it pauses *just* the transactions which depend on it; other transactions continue processing.
* Some small % of resources get dedicated to the reload.
* Each helper instance of that type must do its own reload, multiplying the work performed during reload by M times.
OK. So the goal is to MINIMIZE the time taken to reload/reopen databases, WITHOUT reloading all of Squid.
This kind of problem arises in many other kinds of online filtering software, e.g. mail filters (milters, ...).
If the looooong time is the time needed to convert, e.g., a text file into a .db file, you can do things like the following:
1. move bl.db bl.db.old
2. makedatabase bl.db
3. tell all helpers to reopen the database (e.g. using some signal).
On (1.) renaming the file doesn't change file descriptors, so from 1. to 3. the helpers will still use the old database. On 3. all helpers just close the old database and open the new one. The time needed to do this is minimal (surely much less than a millisecond). Just lock database access while reopening the database.
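A minimal sketch of such a helper (assuming plain POSIX and a single-threaded request loop; the file name bl.db, the lookup and the reply format are placeholders, not any particular helper's real protocol):

#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t reopen_requested = 0;
static int db_fd = -1;

static void on_sighup(int sig)
{
    (void)sig;
    reopen_requested = 1;          /* only set a flag in the handler */
}

int main(void)
{
    signal(SIGHUP, on_sighup);
    db_fd = open("bl.db", O_RDONLY);

    char line[8192];
    while (fgets(line, sizeof line, stdin)) {
        /* Step 3: between requests, swap to the freshly built bl.db.
         * The rename in step 1 did not invalidate db_fd, so lookups
         * kept working on the old file until this exact point. */
        if (reopen_requested) {
            reopen_requested = 0;
            int new_fd = open("bl.db", O_RDONLY);
            if (new_fd != -1) {
                close(db_fd);
                db_fd = new_fd;    /* sub-millisecond swap */
            }
        }
        /* ... look up the requested URL against db_fd ... */
        fputs("\n", stdout);       /* placeholder helper reply */
        fflush(stdout);
    }
    return 0;
}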
If you use in-memory databases, you can reason the same way, except that you swap memory pointers instead of file descriptors/database handles.
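The same idea as a sketch for an in-memory database (struct db, load_db and free_db are hypothetical): the slow load builds the new copy while the old one is still in use, and the "reopen" is a single pointer assignment.

#include <stddef.h>

struct db;                              /* opaque in-memory URL database */
struct db *load_db(const char *path);   /* hypothetical loader */
void free_db(struct db *);

static struct db *current_db;

/* Called between requests, e.g. when a reload signal was seen. */
static void reload_db(const char *path)
{
    struct db *fresh = load_db(path);   /* slow part, old db still in use */
    if (fresh) {
        struct db *old = current_db;
        current_db = fresh;             /* the "reopen": one pointer store */
        free_db(old);                   /* safe in a single-threaded helper */
    }
}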
Sure, if you have a very big number of helpers, this may be a problem for memory databases. But in that case, maybe you should think about why you need so many helpers. Maybe there are some optimisations to be done on the programming side, or you could use some fast disk-based database, or shared memory.
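For the shared-memory option, a sketch of the helper side, assuming some builder process has already published the database under the hypothetical POSIX shared-memory name /url_db: every helper maps the same physical pages read-only, so e.g. 64 helpers cost roughly the memory of one copy.

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = shm_open("/url_db", O_RDONLY, 0);
    if (fd == -1) { perror("shm_open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) == -1) { perror("fstat"); return 1; }

    /* All helpers share these physical pages: one copy for N processes. */
    const void *db = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (db == MAP_FAILED) { perror("mmap"); return 1; }
    close(fd);                          /* the mapping stays valid */

    /* ... serve lookups against `db` ... */

    munmap((void *)db, (size_t)st.st_size);
    return 0;
}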
Another situation can arise when you have different kinds of helpers: one kind doing URL filtering and another one doing content filtering (e.g. antivirus, ...). If each one needs to reload all of Squid... it's crazy...
It seems that ICAP allows all this in a cleaner way.
What about multithreading? Which solution can be used in multithreaded helpers?
ufdbGuard loads the URL database in memory and is multithreaded.
It can do 50,000 URL verifications per second, which is far more than the
number of requests per second that Squid can handle.
ufdbGuard works very well with a large number of helpers (e.g. 64), since the helpers are lightweight and there is only one physical copy of the URL database in memory.
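How ufdbGuard implements this internally is not shown here, but a common pattern for a multithreaded helper is a reader/writer lock around a database pointer: the expensive load happens outside the lock, so lookups stall only for one pointer swap. All names below are hypothetical:

#include <pthread.h>

struct db;
struct db *load_db(const char *path);
void free_db(struct db *);
int db_lookup(const struct db *, const char *url);

static struct db *current_db;
static pthread_rwlock_t db_rwlock = PTHREAD_RWLOCK_INITIALIZER;

/* Hot path: executed concurrently by every worker thread. */
int check_url(const char *url)
{
    pthread_rwlock_rdlock(&db_rwlock);
    int blocked = db_lookup(current_db, url);
    pthread_rwlock_unlock(&db_rwlock);
    return blocked;
}

/* Reload path: the slow load runs outside the lock, so lookups
 * are blocked only for the duration of one pointer assignment. */
void reload(const char *path)
{
    struct db *fresh = load_db(path);
    if (!fresh)
        return;                        /* keep serving the old database */
    pthread_rwlock_wrlock(&db_rwlock);
    struct db *old = current_db;
    current_db = fresh;
    pthread_rwlock_unlock(&db_rwlock);
    free_db(old);
}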
Best regards,
Marcus
Thanks for the pointer to HotConf, I'll take a look.
Regards,
José-Marcio
When ICAP reloads, it has the option of signalling Squid to send no more transactions and completing the existing ones first, or spawning a new service instance with the new config and then swapping over seamlessly.
* The resources of some other server are usually being applied to the problem, leaving Squid to run happily.
* Squid can fail over to other instances of that ICAP service for handling new transactions.
No matter how you slice it, Squid will eventually need reconfiguring for something, and we come back to Squid needing to accept new configuration without pausing at all.
There is the "HotConf" project (http://wiki.squid-cache.org/Features/HotConf), for which the 3.x releases are being prepared through the code cleanup we are doing in the background on each successive release. There is also the CacheMgr.JS project Kinkie and I have underway to polish up the manager API, which will eventually result in some configuration options being configurable via the web API.
Amos