RE: Strategies for splitting load across varnish instances? And avoiding single-point-of-failure?

2010-01-18 Thread BUSTARRET, Jean-francois

-----Original Message-----
 It's probably simplest to paraphrase the code:
 
   Calculate hash over full complement of backends.
   Is the selected backend sick?
   Calculate hash over subset of healthy backends.

Let's get back to consistent hashing and its use...

Correct me if I am wrong, but doesn't this mean that adding a new varnish 
instance implies a full rehash?

This can be a problem for scalability. Memcached clients typically solve this 
by using consistent hashing (a key stays on the same node, even in case of a 
node failure or node addition/removal).

Jean-François
___
varnish-misc mailing list
varnish-misc@projects.linpro.no
http://projects.linpro.no/mailman/listinfo/varnish-misc


Re: Strategies for splitting load across varnish instances? And avoiding single-point-of-failure?

2010-01-18 Thread Tollef Fog Heen
]] Ken Brownfield 

| 3) Hash/bucket URLs to cache pairs.
| 
| Same as 2), but for every hash bucket you would send those hits to two
| machines (think RAID-10).  This provides redundancy from the effects
| of 2a), and gives essentially infinite scalability for the price of
| doubling your miss rate once (two machines per bucket caching the same
| data).  The caveat from 2b) still applies.

I've pondered having a semi-stable hash algorithm which would hash to
one host, say, 90% of the time and another 10% of the time.  This would
allow you much more flexible scalability here as you would not need
twice the number of servers, only the number you need to have
redundant.  And you could tune the extra cache miss rate versus how much
redundancy you need.

I don't know of any products having this out of the box.  I am fairly
sure you could do it on F5s using iRules, and I would not be surprised
if HAProxy or nginx can either do it or be taught how to do this.

-- 
Tollef Fog Heen 
Redpill Linpro -- Changing the game!
t: +47 21 54 41 73


Re: Strategies for splitting load across varnish instances? And avoiding single-point-of-failure?

2010-01-18 Thread Poul-Henning Kamp
In message 53c652a09719c54da24741d0157cb26904c5f...@tfprdexs1.tf1.groupetf1.fr

 It's probably simplest to paraphrase the code:
 
  Calculate hash over full complement of backends.
  Is the selected backend sick?
  Calculate hash over subset of healthy backends

Let's get back to consistent hashing and its use...

Correct me if I am wrong, but doesn't this mean that adding a new 
varnish instance implies a full rehash?

Yes, that is pretty much guaranteed to be the cost with any
stateless hashing.

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Handling of cache-control

2010-01-18 Thread Tollef Fog Heen

Hi all,

we are considering changing the defaults on how the cache-control header
is handled in Varnish.  Currently, we only look at s-maxage and max-age
to decide if and how long an object should be cached.  (We also look at
expires, but that's not relevant here.)

My suggestion is to also look at Cache-control: no-cache, possibly also
private and no-store and obey those.  You would still be able to
override this in vcl by setting obj.cacheable to true and the ttl to
some value.

The reason I think we should at least consider changing this is the
principle of least surprise: we support max-age and s-maxage, so we
should also support the other common values of the cache-control header
field.
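As a sketch of the proposed default (Python pseudocode of the policy, not Varnish's actual implementation; the function name is made up), the directive handling might look like:

```python
def cache_policy(cache_control: str):
    """Return (cacheable, ttl) for a shared cache from a Cache-Control
    header value, honouring no-cache/no-store/private as proposed.
    ttl is None when the header gives no lifetime (fall back to
    Expires or the configured default TTL)."""
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        directives[name.lower()] = value
    # Any of these makes the object uncacheable for a shared cache.
    if {"no-cache", "no-store", "private"} & directives.keys():
        return (False, None)
    # Shared caches prefer s-maxage over max-age.
    for name in ("s-maxage", "max-age"):
        if directives.get(name, "").isdigit():
            return (True, int(directives[name]))
    return (True, None)
```

The VCL override mentioned above would then simply replace whatever this default decides, e.g. by forcing the object cacheable with an explicit ttl.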

Feedback very welcome,

-- 
Tollef Fog Heen 
Redpill Linpro -- Changing the game!
t: +47 21 54 41 73


Re: Handling of cache-control

2010-01-18 Thread Laurence Rowe
2010/1/18 Tollef Fog Heen tfh...@redpill-linpro.com:

 Hi all,

 we are considering changing the defaults on how the cache-control header
 is handled in Varnish.  Currently, we only look at s-maxage and max-age
 to decide if and how long an object should be cached.  (We also look at
 expires, but that's not relevant here.)

 My suggestion is to also look at Cache-control: no-cache, possibly also
 private and no-store and obey those.  You would still be able to
 override this in vcl by setting obj.cacheable to true and the ttl to
 some value.

 The reason I think we should at least consider changing this is the
 principle of least surprise: we support max-age and s-maxage, so we
 should also support the other common values of the cache-control header
 field.

 Feedback very welcome,

Given the proposed move away from having the default vcl, it would
seem logical to move cache-control header logic into VCL. The less
magic that happens behind the scenes, the easier it is to understand
what a particular configuration will do.

Laurence


Re: Handling of cache-control

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 5:20 AM, Tollef Fog Heen wrote:
 we are considering changing the defaults on how the cache-control header
 is handled in Varnish.  Currently, we only look at s-maxage and max-age
 to decide if and how long an object should be cached.  (We also look at
 expires, but that's not relevant here.)
 
 My suggestion is to also look at Cache-control: no-cache, possibly also
 private and no-store and obey those.

Why wasn't it doing it all along?  

--Michael


Re: Strategies for splitting load across varnish instances? And avoiding single-point-of-failure?

2010-01-18 Thread Ken Brownfield
On Jan 16, 2010, at 7:32 AM, Michael Fischer wrote:

 On Sat, Jan 16, 2010 at 1:54 AM, Bendik Heltne bhel...@gmail.com wrote:
 
 Our Varnish servers have ~ 120.000 - 150.000 objects cached in ~ 4GB
 memory and the backends have a much easier life than before Varnish.
 We are about to upgrade RAM on the Varnish boxes, and eventually we
 can switch to disk cache if needed. 
 
 If you receive more than 100 requests/sec per Varnish instance and you use a 
 disk cache, you will die.  

I was surprised by this; it seems grossly irresponsible guidance, given how 
large the installed base is that happily handles thousands of requests per 
second.

Perhaps there's missing background for this statement?  Do you mean swap 
instead of Varnish file/mmap?  Disk could just as easily mean SSD these days.  
Even years ago on Squid and crappy EIDE drives you could manage 1-2,000 
requests per second.
-- 
Ken



Release schedule for saint mode.

2010-01-18 Thread pablort
Hey there,

Does anybody know what the plan is for releasing saint mode? :D

Thanks a lot,


Re: Varnish use for purely binary files

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 12:58 PM, pub crawler wrote:

 This is an inquiry for the Varnish community.
 
 Wondering how many folks are using Varnish purely for binary storage
 and caching (graphic files, archives, audio files, video files, etc.)?
 
 Interested specifically in large Varnish installations with either
 high number of files or where files are large in size.
 
 Would anyone out there using Varnish this way care to say so?

I guess it depends on your precise configuration.

Most kernels cache recently-accessed files in RAM, and so common web servers 
such as Apache can already serve up static objects very quickly if they are 
located in the buffer cache.  (Varnish's apparent speed is largely based on the 
same phenomenon.)  If the data is already cached in the origin server's buffer 
caches, then interposing an additional caching layer may actually be somewhat 
harmful because it will add some additional latency.

If you've evenly distributed your objects among a number of origin servers, 
assuming they do nothing but serve up these static objects, and the origin 
servers have a sum total of RAM larger than your caching servers, then you 
might be better off just serving directly from the origin servers.

On the other hand, there are some use cases, such as edge-caching, where 
interposing a caching layer can be quite helpful even if the origin servers are 
fast, because making the object available closer to the requestor reduces 
network latency.  (In fact, overcommit may be OK in this situation, provided the 
I/O queue depth is reasonably shallow and you can guarantee that any additional 
I/O overhead is less than the network latency incurred by having to go to the 
origin server.)

--Michael




Re: Strategies for splitting load across varnish instances? And avoiding single-point-of-failure?

2010-01-18 Thread Poul-Henning Kamp
In message a8edc1fb-e3e2-4be7-887a-92b0d1da9...@dynamine.net, Michael S. Fischer writes:

What VM can overcome page-thrashing incurred by constantly referencing a
working set that is significantly larger than RAM?

No VM can overcome the task at hand, but some work a lot better than
others.

Varnish has a significant responsibility, not yet fully met, to tell
the VM system as much about what is going on as possible.

Poul-Henning

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: Strategies for splitting load across varnish instances? And avoiding single-point-of-failure?

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 1:52 PM, Poul-Henning Kamp wrote:

 In message a8edc1fb-e3e2-4be7-887a-92b0d1da9...@dynamine.net, Michael S. Fischer writes:
 
 What VM can overcome page-thrashing incurred by constantly referencing a
 working set that is significantly larger than RAM?
 
 No VM can overcome the task at hand, but some work a lot better than
 others.
 
 Varnish has a significant responsibility, not yet fully met, to tell
 the VM system as much about what is going on as possible.

Can you describe in more detail your comparative analysis and plans?  

Thanks,

--Michael


Re: Varnish use for purely binary files

2010-01-18 Thread pub crawler
 Most kernels cache recently-accessed files in RAM, and so common web servers 
 such as Apache can already serve up static objects very quickly if they are 
 located in the buffer cache.  (Varnish's apparent speed is largely based on 
 the same phenomenon.)  If the data is already cached in the origin server's 
 buffer caches, then interposing an additional caching layer may actually be 
 somewhat harmful because it will add some additional latency.

So far Varnish is performing very well for us as a web server of these
cached objects.   The connection time for an item out of Varnish is
noticeably faster than with web servers we have used - even where the
items have been cached.  We are mostly using 3rd party tools like
webpagetest.org to look at the item times.

 If you've evenly distributed your objects among a number of origin servers, 
 assuming they do nothing but serve up these static objects, and the origin 
 servers have a sum total of RAM larger than your caching servers, then you 
 might be better off just serving directly from the origin servers.

Varnish is good as a slice in a few different places in a cluster, and a
few more when running distributed geographic clusters.  Aside from
Nginx or something highly optimized, I am fairly certain Varnish
provides faster serving of cached objects as an out-of-the-box default
experience.  I'll eventually find some time to test it in our
environment against the web servers we use.

 On the other hand, there are some use cases, such as edge-caching, where 
 interposing a caching layer can be quite helpful even if the origin servers 
 are fast, because making the object available closer to the

Edge caching and distributed cache front ends are exactly what's
needed.  It's a poor man's CDN but can be very effective if done well.

The question I posed is to see whether this type of use (almost purely
binary) is being done and scaling well at large scale (50GB and
beyond).  Binary data usually poses more overhead because the objects are
larger: fewer stored elements in RAM, often no further compression
possible, more FIFO-style purging as a result, etc.

-Paul


Re: Varnish use for purely binary files

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 2:16 PM, pub crawler wrote:

 Most kernels cache recently-accessed files in RAM, and so common web servers 
 such as Apache can already serve up static objects very quickly if they 
 are located in the buffer cache.  (Varnish's apparent speed is largely 
 based on the same phenomenon.)  If the data is already cached in the origin 
 server's buffer caches, then interposing an additional caching layer may 
 actually be somewhat harmful because it will add some additional latency.
 
 So far Varnish is performing very well for us as a web server of these
 cached objects.   The connection time for an item out of Varnish is
 noticeably faster than with web servers we have used - even where the
 items have been cached.  We are mostly using 3rd party tools like
 webpagetest.org to look at the item times.
 
 Varnish is good as a slice in a few different place in a cluster and a
 few more when running distributed geographic clusters.   Aside from
 Nginx or something highly optimized I am fairly certain Varnish
 provides faster serving of cached objects as an out of the box default
 experience.  I'll eventually find some time to test it in our
 environment against web servers we use.

I have a hard time believing that any difference in the total response time of 
a cached static object between Varnish and a general-purpose webserver will be 
statistically significant, especially considering typical Internet network 
latency.  If there's any difference it should be well under a millisecond.

--Michael


Re: Varnish use for purely binary files

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 3:08 PM, Ken Brownfield wrote:

 I have a hard time believing that any difference in the total response time 
 of a cached static object between Varnish and a general-purpose webserver 
 will be statistically significant, especially considering typical Internet 
 network latency.  If there's any difference it should be well under a 
 millisecond.
 
 I would suggest that you get some real-world experience, or at least do some 
 research in this area.  Like your earlier assertion, this is patently untrue 
 as a general conclusion.

 Differences in latency of serving static content can vary widely based on the 
 web server in use, easily tens of milliseconds or more.  There are dozens of 
 web servers out there, some written in interpreted languages, many 
 custom-written for a specific application, many with add-ons and modules and 
 other hijinx that can affect the latency of serving static content.

That's why you don't use those webservers as origin servers for that purpose.  
But you don't use Varnish for it either.  It's not an origin server anyway.

 In the real world, sites run their applications through web servers, and this 
 fact does (and should) guide the decision on the base web server to use, not 
 static file serving.

I meant webservers that more than 50%+ of the world uses, which do not include 
those.  I was assuming, perhaps incorrectly, that the implementor would have at 
least the wisdom/laziness to use a popular general-purpose webserver such as 
Apache for the purpose of serving static objects from the filesystem.   And 
that's not even really a stretch as it's the default for most servers.

  (Though nginx may have an on-disk cache?  And don't get me started on Apache 
 caching. :-)

Doctor, heal thyself before you call me inexperienced.  Using application-level 
caching for serving objects from the filesystem rarely works, which is the main 
point of Varnish.  Just because *you* can't get good performance out of Apache 
doesn't mean it's not worth using.

--Michael


Re: Varnish use for purely binary files

2010-01-18 Thread pub crawler
 Differences in latency of serving static content can vary widely based on
 the web server in use, easily tens of milliseconds or more.  There are
 dozens of web servers out there, some written in interpreted languages, many
 custom-written for a specific application, many with add-ons and modules and

Most webservers as shipped are simply not very speedy.   Nginx,
Cherokee, Lighty are three exceptions :)
Latency is all over the place in web server software.  Caching is still a
black art, whether you are talking about having one or
lacking one :)  Ten milliseconds is easily wasted in a web server on
connection pooling, negotiating the transfer, etc.  Most sites have so
many latency issues and such a lack of performance.  Most folks seem
to just ignore it, though, and think all is well with low performance.

That's why Varnish and the folks here are so awesome.   A band of data
crushers, bandwidth abusers and RAM junkies with lower latency in
mind.  Latency is an ugly multiplier - it gets multiplied by every
request, multiple requests per user, multiplied by all the use in a
period of time.   If your page has 60 elements to be served and you
add a mere 5ms to each element that's 300ms of latency just on serving
static items.  There are other scenarios too like dealing with people
on slow connections (if your audience has lots of these).

  If you're serving pure static content with no need for application logic,
 then yes, there is little benefit to choosing a two-tier infrastructure when
 a one-tier out-of-the-box nginx/lighttpd/thttpd will do just fine.  But, if
 your content does not fit in memory, you're back to reverse-Squid or
 Varnish.  (Though nginx may have an on-disk cache?  And don't get me started
 on Apache caching. :-)

Static sites, if they are big enough, will still be helped to scale by
fronting them with Varnish or a similar cache front end.  A small example
might be offloading images, or items that require longer
connection timeouts, to Varnish - reducing disk I/O and
letting you cut the number of open connections on your web server.  You could
obviously do the same by dissecting your site into multiple servers
and dividing the load, but you would lose some of the functionality that is
appealing in Varnish: the ability to dynamically adjust traffic,
load, direction, etc. within Varnish.  Unsure if anything similar
exists in Nginx - but then you are turning a web server into something
else, likely with some performance reduction.

Mind you, most people here *I think* are dealing with big scaling -
busy sites, respectable and sometimes awe-inspiring amounts of data.
Then there are the slow-as-can-be app servers they might have to work
around too.  So the scale of latency issues is a huge cost center for
most folks.

Plenty of papers have been written about latency and the user
experience.  The slower the load, the less people interact and, in
commerce terms, the less they spend with the site.


Re: Varnish use for purely binary files

2010-01-18 Thread Poul-Henning Kamp
In message 4c3149fb1001181416r7cd1c1c2n923a438d6a0df...@mail.gmail.com, pub crawler writes:

So far Varnish is performing very well for us as a web server of these
cached objects.   The connection time for an item out of Varnish is
noticeably faster than with web servers we have used - even where the
items have been cached.  We are mostly using 3rd party tools like
webpagetest.org to look at the item times.

The average workload of a cache hit, last I looked, was 7 system
calls, with typical service times, from request received from kernel
until response ready to be written to kernel, of 10-20 microseconds.

Compared to the amount of work real webservers do for the same task,
that is essentially nothing.

I don't know if that is THE best performance, but I know of a lot
of software doing a lot worse.

Try running varnishhist if you have not already :-)

Poul-Henning

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: Varnish use for purely binary files

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 3:37 PM, pub crawler wrote:

 Differences in latency of serving static content can vary widely based on
 the web server in use, easily tens of milliseconds or more.  There are
 dozens of web servers out there, some written in interpreted languages, many
 custom-written for a specific application, many with add-ons and modules and
 
 Most webservers as shipped are simply not very speedy.   Nginx,
 Cherokee, Lighty are three exceptions :)
 Latency is all over the place in web server software.  Caching is a
 black art still no matter where you are talking about having one or
 lacking one :)  Ten milliseconds is easily wasted in a web server,
 connection pooling, negotiating the transfer, etc.  Most sites have so
 many latency issues and such a lack of performance.  

Let me be clear, in case I have not been clear enough already:

I am not talking about the edge cases of those low-concurrency, high-latency, 
scripted-language webservers that are becoming tied to web application 
frameworks like Rails and Django and that are the best fit for front-end 
caching because they are slow at serving dynamic content.  

But we are not discussing serving dynamic content in this thread anyway.  We 
are talking about binary files, aren't we?  Yes?  Blobs on disk?  Unless 
everyone is living on a different plane than me, I think that's what we're 
talking about.

For those you should be using a general purpose webserver.  There's no reason 
you can't run both side by side.  And I stand by my original statement about 
their performance relative to Varnish.

--Michael


Re: Varnish use for purely binary files

2010-01-18 Thread pub crawler
 The average workload of a cache hit, last I looked, was 7 system
 calls, with typical service times, from request received from kernel
 until response ready to be written to kernel, of 10-20 microseconds.

Well that explains some of the performance difference in Varnish (in
our experience) versus web servers.

7 calls isn't much and you said MICROSECONDS.   :)

 I don't know if that is THE best performance, but I know of a lot
 of software doing a lot worse.

I haven't done the reproducible testing to share with everyone yet.
But using 3rd party remotely hosted analysis services we know for
certain our page elements are starting faster and the average object
load time has gone down significantly.   We are using one or more of
the fast webservers and still are - just behind Varnish now :)


Re: Varnish use for purely binary files

2010-01-18 Thread Ken Brownfield
On Jan 18, 2010, at 3:16 PM, Michael S. Fischer wrote:
 On Jan 18, 2010, at 3:08 PM, Ken Brownfield wrote:
 
 In the real world, sites run their applications through web servers, and 
 this fact does (and should) guide the decision on the base web server to 
 use, not static file serving.
 
 I meant webservers that more than 50%+ of the world uses, which do not 
 include those.

Depends on whether you mean 50% of companies, or 50% of web property traffic.  
The latter?  Definitely.

  I was assuming, perhaps incorrectly, that the implementor would have at 
 least the wisdom/laziness to use a popular general-purpose webserver such as 
 Apache for the purpose of serving static objects from the filesystem.   And 
 that's not even really a stretch as it's the default for most servers.

This is true, though default Apache configurations run the gamut from clean to 
bloated (1ms of variation, I would say).

 
 (Though nginx may have an on-disk cache?  And don't get me started on Apache 
 caching. :-)
 
 Doctor, heal thyself before you call me inexperienced.  Using 
 application-level caching for serving objects from the filesystem rarely 
 works, which is the main point of Varnish.  Just because *you* can't get good 
 performance out of Apache doesn't mean it's not worth using.

I'm not sure what your definition of application-level is, here.  Much of 
Apache's functionality could be considered an application.  But if you mean an 
embedded app running inside Apache, then that distinction has almost no 
bearing on whether file serving works or not -- an app can serve files just 
as fast as Apache, assuming C/C++.

Adding unnecessary software overhead will add latency to requests to the 
filesystem, and obviously should be avoided.  However, a cache in front of a 
general web server will 1) cause an object miss to incur additional latency 
(though small) and 2) guarantee object-hit latency will be as low as possible.  A 
cache in front of a dedicated static file server is unnecessary, but worst-case 
would introduce additional latency only for cache misses.

I'm not sure what your comment on Apache is about, since I never said Apache 
isn't worth using.  I've been using it in production for 11+ years now.  Does 
it perform well for static files in the absence of any other function?  Yes.  
Would I choose it for anything other than an application server?  No.  There 
are much better solutions out there, and the proof is in the numbers.
-- 
Ken

 --Michael




Re: Varnish use for purely binary files

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 3:54 PM, Ken Brownfield wrote:

 Adding unnecessary software overhead will add latency to requests to the 
 filesystem, and obviously should be avoided.  However, a cache in front of a 
 general web server will 1) cause an object miss to have additional latency 
 (though small) and 2) guarantee object hits will be as low as possible.  A 
 cache in front of a dedicated static file server is unnecessary, but 
 worst-case would introduce additional latency only for cache misses.

Agreed.  This is what I was trying to communicate all along.  It was my 
understanding that this was what the thread was about.

  Does [Apache] perform well for static files in the absence of any other 
 function?  Yes.  Would I choose it for anything other than an application 
 server?  No.  There are much better solutions out there, and the proof is in 
 the numbers.


Not sure what you mean here... at my company it's used for everything but 
proxying (because Apache's process model is contraindicated at high 
concurrencies if you want to support Keep-Alive connections).  And we serve a 
lot of traffic at very low latencies.   

--Michael


Re: Varnish use for purely binary files

2010-01-18 Thread Poul-Henning Kamp
In message 02d0ec1a-d0b0-40ee-b278-b57714e54...@dynamine.net, Michael S. Fischer writes:

But we are not discussing serving dynamic content in this thread
anyway.  We are talking about binary files, aren't we?  Yes?  Blobs
on disk?  Unless everyone is living on a different plane than me,
I think that's what we're talking about.

For those you should be using a general purpose webserver.  There's
no reason you can't run both side by side.  And I stand by my
original statement about their performance relative to Varnish.

Why would you use a general purpose webserver, if Varnish can
deliver 80 or 90% of your content much faster and much cheaper?

It sounds to me like you have not done your homework with respect
to Varnish.

For your information, here is the approximate sequence of systemcalls
Varnish performs for a cache hit:

read(get the HTTP request)
timestamp
timestamp
timestamp
timestamp
writev  (write the response)

With some frequency, depending on your system and OS, you will also
see a few mutex operations.

The difference between the first and the last timestamp is typically
on the order of 10-20 microseconds.  The middle two timestamps
are mostly for my pleasure and could be optimized out, if they
made any difference.

This is why people who run synthetic benchmarks do insane amounts
of req/s on varnish boxes, for values of insane above 100.000.

I suggest you look at how many systems calls and how long time your
general purpose webserver spends doing the same job.

Once you have done that, I can recommend you read the various
architects notes I've written, and maybe browse through

http://phk.freebsd.dk/pubs/varnish_perf.pdf

Where you decide to deposit your conventional wisdom afterwards
is for you to decide, but it is unlikely to be applicable.

Poul-Henning

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: Varnish use for purely binary files

2010-01-18 Thread Poul-Henning Kamp
In message 364f5e3e-0d1e-4c95-b101-b7a00c276...@slide.com, Ken Brownfield writes:

A cache hit under Varnish will be comparable in latency to a
dedicated static server hit, regardless of the backend.

Only provided the dedicated static server is written to work in
a modern SMP/VM system, which few, if any, of them are.

Poul-Henning

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: Varnish use for purely binary files

2010-01-18 Thread Poul-Henning Kamp
In message 87f6439f-76fe-416c-b750-5a53a9712...@dynamine.net, Michael S. Fischer writes:

I'm merely contending that the small amount of added
latency for a cache hit, where neither server is operating at full
capacity, is not enough to significantly affect the user experience.

Which, translated to plain English, becomes:

If you don't need varnish, you don't need varnish.

I'm not sure how much useful information that statement contains :-)


-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: Handling of cache-control

2010-01-18 Thread Poul-Henning Kamp
In message de028c9e-4618-4ebc-8477-6e308753c...@dynamine.net, Michael S. Fischer writes:
On Jan 18, 2010, at 5:20 AM, Tollef Fog Heen wrote:

 My suggestion is to also look at Cache-control: no-cache, possibly also
 private and no-store and obey those.

Why wasn't it doing it all along?  

Because we wanted to give the backend a chance to tell Varnish one
thing with respect to caching, and the client another.

I'm not saying we hit the right decision, and welcome any consistent,
easily explainable policy you guys can agree on.
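The policy being debated can be sketched as a small predicate. This is an illustration only, with a hypothetical function name; the directive set (no-cache, no-store, private) is taken from Tollef's suggestion above, and real Varnish decides this in VCL/C, not Python:

```python
def backend_may_be_cached(cache_control: str) -> bool:
    """Sketch of the debated policy: a shared cache refuses to store a
    backend response whose Cache-Control header carries no-cache,
    no-store, or private. Hypothetical, not Varnish's implementation."""
    # Split the header on commas, drop any "=value" part, normalize case.
    directives = {d.strip().split("=")[0].lower()
                  for d in cache_control.split(",")}
    return not directives & {"no-cache", "no-store", "private"}
```

The open question in the thread is whether the same directives should also be obeyed when sent by the client, or only by the backend.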

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: Varnish use for purely binary files

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 4:15 PM, Ken Brownfield wrote:

 Ironically and IMHO, one of the barriers to Varnish scalability is its thread 
 model, though this problem strikes in the thousands of connections.

Agreed.  In an early thread on varnish-misc in February 2008 I concluded that 
reducing thread_pool_max to well below the default value (to 16 threads/CPU) 
was instrumental in attaining maximum performance on high-hit-ratio workloads.  
 (This was with Varnish 1.1; things may have changed since then but the theory 
remains.)

Funny how there's always a tradeoff:

Overcommit -> page-thrashing death
Undercommit -> context-switch death

:)

--Michael


Re: Varnish use for purely binary files

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 4:35 PM, Poul-Henning Kamp wrote:

 In message 97f066dd-4044-46a7-b3e1-34ce928e8...@slide.com, Ken Brownfield writes:
 
 Ironically and IMHO, one of the barriers to Varnish scalability
 is its thread model, though this problem strikes in the thousands
 of connections.
 
 It's only a matter of work to pool slow clients in Varnish into
 event-driven writer clusters, but so far I have not seen a
 credible argument for doing it.
 
 A thread is pretty cheap to have around if it doesn't do anything,
 and the varnish threads typically do not do anything during the
 delivery-phase:  They are stuck in the kernel in a writev(2) 
 or sendfile(2) system call.

Does Varnish already try to utilize CPU caches efficiently by employing some 
sort of LIFO thread reuse policy or by pinning thread pools to specific CPUs?  
If not, there might be some opportunity for optimization there.
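A LIFO reuse policy of the kind asked about can be sketched in a few lines (hypothetical names, not Varnish internals): parking idle workers on a stack means the next connection is handed to the most recently active thread, the one most likely to still have warm CPU caches.

```python
class IdleWorkers:
    """LIFO pool of parked worker threads (sketch, hypothetical names).
    take() returns the most recently parked worker, which is the one
    most likely to still have a warm CPU cache; a FIFO queue would
    instead hand out the coldest worker."""
    def __init__(self):
        self._stack = []

    def park(self, worker):
        self._stack.append(worker)   # worker goes idle

    def take(self):
        return self._stack.pop()     # LIFO: warmest worker first
```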

--Michael


Re: Varnish use for purely binary files

2010-01-18 Thread pub crawler
Wanted to inject another discussion item into this thread and see if
the idea is confirmed by other folks' current architectures.
Sorry in advance for being verbose.

Often web servers (in my experience) are smaller machines, with less
RAM and fewer CPUs than the app servers and databases.  A typical
webserver might be a 2GB or 4GB machine with two CPUs.  But the disk
storage on any given webserver will far exceed the RAM in the machine.
This means disk IO even when attempting to cache as much as possible
on a webserver, due to the limited RAM.

In this typical web server sizing model, simply plugging a bigger-RAM
Varnish in upstream means less disk IO, faster web servers, less
memory consumed managing threads, etc.  This is the well-proven basic
Varnish adopter model.

Here's a concept that is not specific to the type of data being stored
in Varnish:

With some additional hashing in the mix, you could limit your large
Varnish cache server to the most frequently accessed items, and use
the hash to route other requests to the backend webservers.  There,
ideally, you hit a smaller Varnish instance on the 2-4GB webserver
downstream, which talks to the webserver directly on localhost if it
doesn't have the data.
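A minimal sketch of that routing step, with hypothetical names, using plain modulo bucketing. Note that changing the backend list remaps most URLs, which is exactly the full-rehash problem consistent hashing was raised to address earlier in this thread:

```python
import hashlib

def pick_backend(url: str, backends: list) -> str:
    """Deterministically map a URL to one of the per-webserver Varnish
    instances (hypothetical sketch).  Modulo bucketing is cheap, but
    adding or removing a backend remaps most URLs; consistent hashing
    would keep most keys on their original nodes."""
    h = int(hashlib.md5(url.encode("utf-8")).hexdigest(), 16)
    return backends[h % len(backends)]
```

In practice this hashing would live in the load balancer or front-tier Varnish, not in application code.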

Anyone doing anything remotely like this?  There are lots of big-RAM
installations of Varnish.  I like the Google (or mini-Google) model of
many smaller machines distributing the load.  Does it seem feasible?
2-4GB machines are very affordable compared to 16GB-and-above
machines.  There is certainly more collective horsepower in the
individual smaller servers, and perhaps better performance-per-watt
as well (another of my interests).

Thanks again everyone.  I enjoy hearing about all the creative ways
folks are using Varnish in their very different environments.  The
more scenarios for Varnish, the more adoption and ideally the more
resources and expertise that become available for future development.

There would need to be some sort of cache pruning, the details of
which are beyond me at the moment, to keep the large Varnish from
being overpopulated with unused items, and similarly to keep the
webservers from wasting RAM on them.

Simple concept and probably very typical.  Oh yeah, plus it scales
horizontally on lower cost dummy server nodes.