Re: Varnish, long lived cache and purge on change

2009-08-19 Thread Rob S
phk and other deep Varnish developers,

Do you think it'd ever be viable to have a sort of process that goes 
through the tail of the purge queue, applies the purges and then deletes 
them from the queue?  If so, how much work would it be to implement?  
There are a fair number of us who would really appreciate something like 
this, and I'm sure many would make a contribution if someone were to 
implement it.

Thanks,


Rob

Karl Pietri wrote:
 Hey Ken, =)
 Yeah, this is what I was afraid of.  I think we have a workaround: 
 normalize the hash key to the few select things we want to support, 
 and on change set the TTL of those objects to 0.  That would avoid 
 using url.purge.  All of our URLs in this case are pretty, and not 
 images.
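
(A minimal sketch, in 2.0-era VCL, of the kind of workaround Karl describes: normalize what feeds the hash, then expire the exact object on a purge request so no ban is ever created.  The PURGE method and the regsub rule are illustrative assumptions, not Karl's actual config.)

    sub vcl_recv {
        # Normalize what goes into the hash so each page maps to one
        # predictable object (assumed example: strip an explicit :80).
        set req.http.host = regsub(req.http.host, ":80$", "");
        if (req.request == "PURGE") {
            # Look the object up so vcl_hit can expire it; no ban is added.
            lookup;
        }
    }

    sub vcl_hit {
        if (req.request == "PURGE") {
            set obj.ttl = 0s;
            error 200 "Purged.";
        }
    }

    sub vcl_miss {
        if (req.request == "PURGE") {
            error 404 "Not in cache.";
        }
    }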

 Thanks for the great info, and sorry about the 4th thread on the 
 subject; I did not search thoroughly enough in the archives.

 -Karl

 On Tue, Aug 18, 2009 at 4:34 PM, Ken Brownfield kb+varn...@slide.com wrote:

 Hey Karl. :-)

 The implementation of purge in Varnish is really a queue of
 refcounted ban objects.  Every image hit is compared to the ban
 list to see if the object in cache should be reloaded from a backend.
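
(Each regex purge, whether issued as url.purge on the management port or via purge_url() in VCL, appends one ban to that queue.  A minimal 2.0-era sketch of the VCL route; the acl name and the PURGE request method are assumptions.)

    acl purgers {
        "localhost";
    }

    sub vcl_recv {
        if (req.request == "PURGE") {
            if (!client.ip ~ purgers) {
                error 405 "Not allowed.";
            }
            # The argument is treated as a regex; this adds a ban that
            # later cache hits are checked against.
            purge_url(req.url);
            error 200 "Ban added.";
        }
    }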

 If you have purge_dups off, /every/ request to Varnish will regex
 against every single ban in the list.  If you have purge_dups on,
 it will at least not compare against duplicate bans.

 However, a ban that has been created will stay around until
 /every/ object that was in the cache at the time of that ban has
 been re-requested, dupe or no.  If you have lots of content,
 especially content that may not be accessed very often, the ban
 list can become enormous.  Even with purge_dups, duplicate ban
 entries remain in memory.  And the bans are only freed from RAM
 when their refcount hits 0 /AND/ they're at the very tail end of
 the ban queue.

 Because of the implementation, there's no clear way around this
 AFAICT.

 You can get a list of bans with the purge.list management
 command, but if it's more than ~2400 long you'll need to use
 netcat to get the list.  Also, purged dups will NOT show up in
 this list, even though they're sitting in RAM.  I have a trivial
 patch that will make dups show up in purge.list if you'd like to
 get an idea of how many bans you have.

 The implementation is actually really clever, IMHO, especially
 with regard to how it avoids locks, and there's really no other
 scalable way to implement a regex purge that I've been able to
 dream up.

 The only memory-reducing option within the existing implementation
 is to actually delete/free duplicate bans from the list, and to
 delete/free bans when an object hit causes the associated ban's
 refcount to hit 0.  However, this requires all access to the ban
 list to be locked, which is likely a significant performance hit.
  I've written this patch, and it works, but I haven't put
 significant load on it.

 I'm not sure if Varnish supports non-regex/non-wildcard purges?
  This would at least not have to go through the ban system,  but
 obviously it doesn't work for arbitrary path purges.

 We version our static content, which avoids cache thrash and this
 purge side-effect.  This is very easy if you have a central
 URL-generation system in code (templates, ajax, etc), but probably
 more problematic in situations where the URL needs to be pretty.

 Ken

 On Aug 18, 2009, at 4:06 PM, Karl Pietri wrote:

 Hello everyone,
 Recently we decided that our primary page that everyone views
 doesn't really change all that often.  In fact it changes very
 rarely except for the stats counters (views, downloads, etc).  So
 we decided that we wanted to store everything in varnish for a
 super long time (and tell the client it's not cacheable or
 cacheable for a very short amount of time), flush the page from
 varnish when it truly changes and have a very fast ajax call to
 update the stats.  This worked great for about 2 days.  Then we
 ran out of RAM and Varnish started causing a ton of swap activity
 and it increased the response times of everything on the site to
 unusable.
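
(A minimal 2.0-era VCL sketch of that split, long TTL inside Varnish, short TTL toward the client; the concrete durations are made up, and newer Varnish releases spell the fetched object beresp rather than obj.)

    sub vcl_fetch {
        # Keep the page in cache for a long time; it only leaves when
        # we explicitly flush it.
        set obj.ttl = 30d;
    }

    sub vcl_deliver {
        # Tell browsers the page is barely cacheable, so the ajax stats
        # and any real change show up quickly.
        set resp.http.Cache-Control = "max-age=60";
    }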

 After poking about I seem to have found the culprit.  When you
 use url.purge, Varnish seems to keep a record of it and check every
 object as it is fetched to see if it was purged or not.  To test
 this I set a script to purge a lot of stuff and reproduced the
 problem.


 from varnishstat -1

 n_purge                236369          .   N total active purges
 n_purge_add            236388       2.31   N new purges added
 n_purge_retire             19       0.00   N old purges deleted
 n_purge_obj_test      1651452      16.12   N objects tested
 n_purge_re_test    5052057513   49316.27   N regexps tested

Re: Thread pools

2009-08-19 Thread Poul-Henning Kamp
In message 294d5daa0908171755y44f5c132o587f3c818849...@mail.gmail.com, Mark Moseley writes:
I've seen various things in the wiki and threads on this list talking
about thread pools. In general, the advice is typically conservative,
i.e. don't use more than the default 2 thread pools unless you have
to. I've also seen the occasional comment suggesting one run as many
thread pools as there are cores/cpus.

I think the point here is: don't make 1000 or even 100 pools.

One pool per core should be all you need to practically eliminate
thread contention, but to truly realize this, we would have to pin
pools on cores and other nasty and often backfiring optimizations.

Having a few too many pools probably does not hurt too much, but
may increase the thread create/kill rate a bit.

Also, the wiki mentions that it's mainly appropriate when you run into
locks tying things up. Is that mainly a case of high LRU turnover or
are there other scenarios where locking is an issue? What are the
symptoms of locking becoming an issue with the current configuration
and what fields in varnishstat should I be looking at?

POSIX unfortunately does not offer any standard tools for analysing
lock behaviour, so we have had to make some pretty crude ones
ourselves.

The main sign of lock issues is that the number of context switches
increases drastically; your OS can probably give you a view of that
number.

If you want to go deeper, we have a flag in diag_bitmap that enables
shmlogging of lock contention (or even all lock operations); together
with a suitably filtered varnishtop, that gives a good idea of which
locks we have trouble with.

but I worry about wasting any CPU% on those 8 core 1950s [...]

Varnish is all about wasting CPU%; normally we barely touch the
CPUs, with many systems running 80-90% idle.

Have you played a bit with varnishhist?  I suspect that may be
the most sensitive, if crude, indicator of overall performance
we can offer.

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: Varnish, long lived cache and purge on change

2009-08-19 Thread Poul-Henning Kamp
In message 4a8bb076.50...@gmail.com, Rob S writes:
phk and other deep Varnish developers,

Do you think it'd ever be viable to have a sort of process that goes 
through the tail of the purge queue, applies the purges and then deletes 
them from the queue?  If so, how much work would it be to implement?  

Right now we do not know which objects hold onto a ban, only the
number of objects that do.

To implement it, we would need to put a linked list in each ban
and wire the referencing objects onto it.

My only worry is that it adds a linked list to the objcore structure
taking it from 88 to 104 bytes.

I seem to recall that the locking is benign.

Probably the more interesting question is how aggressive you want it to
be: if it is too militant, it will cause a lot of needless disk activity.

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: Varnish, long lived cache and purge on change

2009-08-19 Thread Rob S
Poul-Henning Kamp wrote:
 My only worry is that it adds a linked list to the objcore structure
 taking it from 88 to 104 bytes.

I realise this could be undesirable, but at the moment Varnish is 
proving quite difficult to use on sites that purge frequently, with 
different users adding their own workarounds (versioning URLs, 
restarting Varnish, tweaking the hash key, etc.).  Everything is a 
trade-off, but I think it's worth increasing the memory footprint per 
object so as not to bring down the server with massive memory growth.

 Probably the more interesting question is how aggressive you want it to
 be: if it is too militant, it will cause a lot of needless disk activity.
I feel that some sort of hysteresis on the size of the purge list would 
make most sense: perhaps start processing when the list exceeds X bytes, 
and stop when it drops below Y bytes.

Having thought a little more about this, I realise I don't know whether 
graced requests respect bans.  If they don't, then processing the ban 
list will change Varnish's behaviour. 

Rob


Re: Varnish, long lived cache and purge on change

2009-08-19 Thread Poul-Henning Kamp
In message 2837.1250697...@critter.freebsd.dk, Poul-Henning Kamp writes:
In message 4a8bb076.50...@gmail.com, Rob S writes:

Just to follow up to myself after trying to hack up a solution in -trunk:

I seem to recall that the locking is benign.

Make that "mostly benign" :-)

Probably the more interesting question is how aggressive you want it to
be: if it is too militant, it will cause a lot of needless disk activity.

There was actually a far more interesting question, or rather issue:

The lurker thread does not have an HTTP request.

That means that we cannot evaluate a ban test like "req.url ~ foo":
we simply don't have a req.url to compare with.

So provided you only have obj.* tests in your bans, it is possible;
for req.* tests it is a no-go...

The obvious workaround: store the req.* fields you need in obscure
obj.* headers (possibly stripping them in vcl_deliver).
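
(A minimal sketch of that workaround in 2.0-era VCL; the X-Ban-URL header name is made up, and newer trees spell the fetched object beresp rather than obj.  Bans would then be written against obj.http.X-Ban-URL instead of req.url.)

    sub vcl_fetch {
        # Stash the request URL on the object itself, so a ban can be
        # expressed purely in obj.* terms and the lurker can evaluate it.
        set obj.http.X-Ban-URL = req.url;
    }

    sub vcl_deliver {
        # Keep the helper header out of responses to clients.
        remove resp.http.X-Ban-URL;
    }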


With that caveat, give r4206 a shot if you dare...


-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.