Re: cache empties itself?

2008-04-04 Thread Sascha Ottolski
Am Freitag 04 April 2008 01:32:28 schrieb DHF:
 Sascha Ottolski wrote:
  however, my main problem is currently that the varnish child processes
  keep restarting, and that this empties the cache, which effectively
  renders the whole setup useless for me :-( once the cache has filled
  up, it works great; if it restarts empty, it obviously doesn't.
 
  is there anything I can do to prevent such restarts?

 Varnish doesn't just restart on its own.  Check to make sure you
 aren't sending a kill signal if you are running logrotate through a
 cronjob. I'm not sure if a HUP will empty the cache or not.

 --Dave

I definitely did nothing like this; I've observed restarts out of the 
blue. I'm now giving the trunk a try, hopefully there's an improvement 
in that regard.

what I did once in a while was a vcl.load followed by a vcl.use. will 
this force a restart of the child, thus flushing the cache?


Thanks again,

Sascha
___
varnish-misc mailing list
varnish-misc@projects.linpro.no
http://projects.linpro.no/mailman/listinfo/varnish-misc


Re: cache empties itself?

2008-04-04 Thread Sascha Ottolski
Am Freitag 04 April 2008 04:37:44 schrieb Ricardo Newbery:
           sub vcl_fetch {
               if (obj.ttl < 120s) {
                   set obj.ttl = 120s;
               }
           }

 Or you can invent your own header... let's call it  X-Varnish-1day

           sub vcl_fetch {
               if (obj.http.X-Varnish-1day) {
                   set obj.ttl = 86400s;
               }
           }

so it seems like I'm on the right track, thanks for clarifying. now, is 
the ttl information local to varnish, or will it set headers as well (if 
I look at the headers of my varnish's responses, it doesn't appear 
so)?

what really confuses me: the man pages state slightly different 
semantics for default_ttl. in man varnishd:

 -t ttl  Specifies a hard minimum time to live for cached
 documents.  This is a shortcut for specifying the
 default_ttl run-time parameter.

 default_ttl
   The default time-to-live assigned to objects if neither
   the backend nor the configuration assign one.  Note
   that changes to this parameter are not applied retroac‐
   tively.

   The default is 120 seconds.


"hard minimum" sounds to me as if it would override any setting the 
backend has given. however, man vcl explains that default_ttl only 
affects documents without a backend-given TTL:

 The following snippet demonstrates how to force a minimum TTL
 for all documents.  Note that this is not the same as setting
 the default_ttl run-time parameter, as that only affects doc‐
 ument for which the backend did not specify a TTL.

 sub vcl_fetch {
     if (obj.ttl < 120s) {
         set obj.ttl = 120s;
     }
 }


the examples have a unit (s) appended, as in the example in the man 
page; does that suggest I could also append units like m, h, or d (for 
minutes, hours, days)?
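
if so, I imagine one could write something like this (untested, just my 
guess at the syntax):

 sub vcl_fetch {
     # assuming the duration suffixes s, m, h and d all work:
     # bump everything with less than 2 hours left up to a day
     if (obj.ttl < 2h) {
         set obj.ttl = 1d;
     }
 }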

BTW, in the trunk version, the examples for a backend definition still 
have the old syntax.

 backend www {
     set backend.host = "www.example.com";
     set backend.port = "80";
 }


instead of

 backend www {
     .host = "www.example.com";
     .port = "80";
 }


Thanks a lot,

Sascha


make varnish still respond if backend dead

2008-04-04 Thread Sascha Ottolski
Hi,

sorry if this is a FAQ: what can I do to make varnish respond to requests 
if its backend is dead? it should return cache hits, of course, and 
a proxy error or something for a miss.
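
something like the following might work, assuming the trunk supports 
grace-style variables (req.grace/obj.grace are an assumption on my part, 
completely untested):

 sub vcl_recv {
     # be willing to deliver objects up to 1 hour past their TTL
     set req.grace = 1h;
 }

 sub vcl_fetch {
     # keep objects around 1 hour past their TTL so they can
     # still be served while the backend is down
     set obj.grace = 1h;
 }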

and how can I prevent varnish from caching a 404 for objects it couldn't 
fetch due to a dead backend? at least I think that is what happened, as 
varnish reported 404 for URLs that definitely exist; the dead backend 
seems to be the only logical explanation for why varnish would think 
they don't.
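
if there's no built-in way, maybe something like this would help 
(untested sketch, assuming obj.status is available in vcl_fetch):

 sub vcl_fetch {
     if (obj.status >= 400) {
         # don't cache error responses from a sick backend
         pass;
     }
 }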

oh, and is there a way to put the local hostname in a header? I have two 
proxies, load balanced by LVS, so using server.ip reports the same IP 
on both nodes.
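
as a workaround I could probably hardcode a per-node value in each 
proxy's VCL (the header and node names here are made up):

 sub vcl_deliver {
     # on the other node this would be "proxy-b"
     set resp.http.X-Served-By = "proxy-a";
 }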


Thanks, Sascha


Re: cache empties itself?

2008-04-04 Thread Sascha Ottolski
Am Freitag 04 April 2008 10:11:52 schrieb Stig Sandbeck Mathisen:
 On Fri, 4 Apr 2008 09:01:57 +0200, Sascha Ottolski [EMAIL PROTECTED] 
said:
  I definitely did nothing like this; I've observed restarts out of
  the blue. I'm now giving the trunk a try, hopefully there's an
  improvement in that regard.

 If the varnish caching process dies for some reason, the parent
 varnish process will start a new one to keep the service running.
 This new one will not re-use the cache of the previous.

 With all that said, the varnish caching process should not die in
 this way, that is undesirable behaviour.

 If you'd like to help debugging this issue, take a look at
 http://varnish.projects.linpro.no/wiki/DebuggingVarnish

I already started my proxies with the latest trunk and coredumps 
enabled, and am crossing my fingers. so far it's been running for about 
11 hours...

BTW, if I have 32 GB of RAM, and 517 GB of cache file, how large will 
the core dump be?


 Note that if you run a released version, your issue may have been
 fixed already in a later release, the related branch, or in trunk.

  what I did once in a while is to vcl.load, vcl.use. will this force
  a restart of the child, thus flushing the cache?

 No.  The reason Varnish has vcl.load and vcl.use is to make sure you
 don't have to restart anything, thus losing your cached data.

excellent. the numbers shown next to each config in vcl.list: are they 
the number of connections that (still) use it?


Thanks again,

Sascha


unable to compile nagios module from trunk

2008-04-04 Thread Sascha Ottolski
after checking out and running autogen.sh, configure stops with this 
error:

./configure: line 19308: syntax error near unexpected token 
`VARNISHAPI,'
./configure: line 19308: `PKG_CHECK_MODULES(VARNISHAPI, varnishapi)'

Cheers, Sascha


Re: cache empties itself?

2008-04-04 Thread Michael S. Fischer
On Fri, Apr 4, 2008 at 3:20 AM, Sascha Ottolski [EMAIL PROTECTED] wrote:
  you are right, _if_ the working set is small. in my case, we're talking
  20+ million small images (5-50 KB each), 400+ GB in total size, and it's
  growing every day. access is very random, but there is still a good
  amount of hot objects. and to be ready for a larger set it cannot
  reside on the webserver, but lives on central storage. access
  performance to the (network) storage is relatively slow, and our
  experiences with mod_cache from apache were bad; that's why I started
  testing varnish.

Ah, I see.

The problem is that you're basically trying to compensate for a
congenital defect in your design: the network storage (I assume NFS)
backend.  NFS read requests are not cacheable by the kernel because
another client may have altered the file since the last read took
place.

If your working set is as large as you say it is, eventually you will
end up with a low cache hit ratio on your Varnish server(s) and you'll
be back to square one again.

The way to fix this problem in the long term is to split your file
library into shards and put them on local storage.

Didn't we discuss this a couple of weeks ago?

Best regards,

--Michael


Re: cache empties itself?

2008-04-04 Thread Sascha Ottolski
Am Freitag 04 April 2008 18:11:23 schrieb Michael S. Fischer:
 On Fri, Apr 4, 2008 at 3:20 AM, Sascha Ottolski [EMAIL PROTECTED] 
wrote:
   you are right, _if_ the working set is small. in my case, we're
  talking 20+ million small images (5-50 KB each), 400+ GB in total
  size, and it's growing every day. access is very random, but there
  is still a good amount of hot objects. and to be ready for a
  larger set it cannot reside on the webserver, but lives on
  central storage. access performance to the (network) storage is
  relatively slow, and our experiences with mod_cache from apache
  were bad; that's why I started testing varnish.

 Ah, I see.

 The problem is that you're basically trying to compensate for a
 congenital defect in your design: the network storage (I assume NFS)
 backend.  NFS read requests are not cacheable by the kernel because
 another client may have altered the file since the last read took
 place.

 If your working set is as large as you say it is, eventually you will
 end up with a low cache hit ratio on your Varnish server(s) and
 you'll be back to square one again.

 The way to fix this problem in the long term is to split your file
 library into shards and put them on local storage.

 Didn't we discuss this a couple of weeks ago?

exactly :-) and what can I say: I did analyze the logfiles, and learned 
that despite the fact that a lot of the accesses are truly random, there 
is still a good amount of requests concentrated on a smaller set of the 
images. of course, the set changes over time, but that's what a cache 
can handle perfectly.

and my experience seems to confirm my theory: varnish has now been 
running like this for about 18 hours *knock on wood*, and the cache hit 
rate is close to 80 %! that takes so much pressure off the backend that 
the overall performance is just awesome.

putting the files on local storage just doesn't scale well. I'm more 
thinking about splitting the proxies like discussed on the list before: 
a loadbalancer could distribute the URLs in a way that each cache holds 
its own share of the objects.


Cheers, Sascha



Re: cache empties itself?

2008-04-04 Thread Ricardo Newbery

On Apr 4, 2008, at 2:50 AM, Michael S. Fischer wrote:

 On Thu, Apr 3, 2008 at 8:59 PM, Ricardo Newbery [EMAIL PROTECTED] 
  wrote:

 Well, first of all you're setting up a false dichotomy.  Not everything
 fits neatly into your apparent definitions of dynamic versus static.
 Your definitions appear to exclude the use case where you have cacheable
 content that is subject to change at unpredictable intervals but which
 is otherwise fairly static for some length of time.

 In my experience, you almost never need a caching proxy for this
 purpose.  Most modern web servers are perfectly capable of serving
 static content at wire speed.  Moreover, if your origin servers have a
 reasonable amount of RAM and the working set size is relatively small,
 the static objects are already likely to be in the buffer cache.  In a
 scenario such as this, having caching proxies upstream for these sorts
 of objects can actually be *worse* in terms of performance -- consider
 the wasted time processing a cache miss for content that's already
 cached downstream.


Again, static content isn't only the stuff that is served from 
filesystems in the classic static web server scenario.  There are 
plenty of dynamic applications that process content from a database -- 
applying skins and compositing multiple elements into a single page 
while filtering every element or otherwise applying special processing 
based on a user's access privileges.  An example of this is a dynamic 
content management system like Plone or Drupal.  In many cases, these 
dynamic responses are fairly static for some period of time, but there 
is still a definite performance hit, especially under load.

Ric




Re: cache empties itself?

2008-04-04 Thread DHF
Sascha Ottolski wrote:
 Am Freitag 04 April 2008 18:11:23 schrieb Michael S. Fischer:
   
 Ah, I see.

 The problem is that you're basically trying to compensate for a
 congenital defect in your design: the network storage (I assume NFS)
 backend.  NFS read requests are not cacheable by the kernel because
 another client may have altered the file since the last read took
 place.

 If your working set is as large as you say it is, eventually you will
 end up with a low cache hit ratio on your Varnish server(s) and
 you'll be back to square one again.

 The way to fix this problem in the long term is to split your file
 library into shards and put them on local storage.

 Didn't we discuss this a couple of weeks ago?
 

 exactly :-) and what can I say: I did analyze the logfiles, and learned 
 that despite the fact that a lot of the accesses are truly random, there 
 is still a good amount of requests concentrated on a smaller set of the 
 images. of course, the set changes over time, but that's what a cache 
 can handle perfectly.

 and my experience seems to confirm my theory: varnish has now been 
 running like this for about 18 hours *knock on wood*, and the cache hit 
 rate is close to 80 %! that takes so much pressure off the backend that 
 the overall performance is just awesome.

 putting the files on local storage just doesn't scale well. I'm more 
 thinking about splitting the proxies like discussed on the list before: 
 a loadbalancer could distribute the URLs in a way that each cache holds 
 its own share of the objects.
   
By putting intermediate caches between the file storage and the client,
you are essentially just spreading the storage locally between cache
boxes, so if this method doesn't scale then you are still in need of a
design change, and frankly so am I :)
What you need to model is the popularity curve for your content: if your
images do not fit an 80/20 rule of popularity, i.e. 20% of your
images soak up less than 80% of requests, then you will spend more time
thrashing the caches than serving the content, and Michael is right, you
would be better served by dedicating web servers with local storage and
sharding your images across them.  If 80% of your content is rarely
viewed, then compared to the same amount of hardware deployed as caching
accelerators, you will see an increase in throughput due to more
hardware serving a smaller number of images.  It all depends on your
content and your users' viewing habits.

--Dave



Re: cache empties itself?

2008-04-04 Thread Michael S. Fischer
On Fri, Apr 4, 2008 at 11:05 AM, Ricardo Newbery [EMAIL PROTECTED] wrote:

  Again, static content isn't only the stuff that is served from
  filesystems in the classic static web server scenario.  There are plenty
  of dynamic applications that process content from a database -- applying
  skins and compositing multiple elements into a single page while
  filtering every element or otherwise applying special processing based
  on a user's access privileges.  An example of this is a dynamic content
  management system like Plone or Drupal.  In many cases, these dynamic
  responses are fairly static for some period of time, but there is still
  a definite performance hit, especially under load.

If that's truly the case, then your CMS should be caching the output locally.

--Michael


Re: cache empties itself?

2008-04-04 Thread Ricardo Newbery

On Apr 4, 2008, at 2:04 PM, Michael S. Fischer wrote:

 On Fri, Apr 4, 2008 at 11:05 AM, Ricardo Newbery [EMAIL PROTECTED] 
  wrote:

 Again, static content isn't only the stuff that is served from
 filesystems in the classic static web server scenario.  There are
 plenty of dynamic applications that process content from a database --
 applying skins and compositing multiple elements into a single page
 while filtering every element or otherwise applying special processing
 based on a user's access privileges.  An example of this is a dynamic
 content management system like Plone or Drupal.  In many cases, these
 dynamic responses are fairly static for some period of time, but there
 is still a definite performance hit, especially under load.

 If that's truly the case, then your CMS should be caching the output
 locally.


Should be?  Why?  If you can provide this capability via a separate
process like Varnish, then why should your CMS do this instead?  Am I
missing some moral dimension to this issue?  ;-)

In any case, both of these examples, Plone and Drupal, can indeed cache
the output locally, but that is still not as fast as placing a dedicated
cache server in front.  It's almost always faster to have a dedicated
single-purpose process do something instead of cranking up the hefty
machinery for requests that can be adequately served by the lighter
process.

Ric

