Re: varnish crashes

2010-01-24 Thread Michael S. Fischer
On Jan 24, 2010, at 10:40 AM, Angelo Höngens wrote:
> According to top, the CPU usage for the varnishd process is 0.0% at 400
> req/sec. The load over the past 15 minutes is 0.45, probably mostly
> because of haproxy running on the same machine. So I don't think load is
> a problem.. My problem is that varnish sometimes just crashes or stops
> responding.
> My cache hit ratio is not that high, around 80%, and the backend servers
> can be slow at times (quite complex .net web apps). But I've changed
> some settings, and I am waiting for the next time varnish starts to stop
> responding.. I'm beginning to think it's something that grows over time,
> after restarting the varnish process things tend to run smooth for a
> while. I'll just keep monitoring it.

The other most common reason why the varnish supervisor can start killing off 
children is when they are blocked waiting on a page-in, which is usually due to 
VM overcommit (i.e., storage file size significantly exceeds RAM and you have a 
very large hot working set).   You can usually see that when "iostat -x" output 
shows the I/O busy % is close to 100, meaning the disk is saturated.   You can 
also see that in vmstat (look at the pi/po columns if you're using a file, or 
si/so if you're using malloc).
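
A minimal sketch of the vmstat check described above, assuming the usual Linux vmstat layout with a column-name row that contains "si" and "so". The function names and the sample text are made up for illustration; on systems that report pi/po instead, the column names would need to change.

```python
def swap_activity(vmstat_text):
    """Return (si, so) pairs from Linux `vmstat` output, one per sample row."""
    columns = None
    samples = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "si" in fields and "so" in fields:
            # Column-name header row: remember where si and so sit.
            columns = (fields.index("si"), fields.index("so"))
            continue
        if columns and fields[0].isdigit():
            samples.append((int(fields[columns[0]]), int(fields[columns[1]])))
    return samples


def is_paging(samples, threshold=0):
    """True if any sample shows swap-in or swap-out above the threshold."""
    return any(si > threshold or so > threshold for si, so in samples)


# Made-up sample in the usual Linux vmstat layout, for illustration only.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 812344  10240 402112    0    0     5     3  120  250  1  0 99  0  0
 0  3 204800  10212   1024  20480  950  870  4100  3900  900 1800  2  5 20 73  0
"""

if __name__ == "__main__":
    print(is_paging(swap_activity(SAMPLE)))
```

Sustained non-zero si/so values while the child is being restarted would point at the overcommit scenario rather than at CPU load.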


varnish-misc mailing list

Re: varnish crashes

2010-01-24 Thread Michael S. Fischer
On Jan 24, 2010, at 7:23 AM, Angelo Höngens wrote:
>> What is thread_pool_max set to?  Have you tried lowering it?   We have
>> found that on systems with very high cache-hit ratios, 16 threads per
>> CPU is the sweet spot to avoid context-switch saturation.
> [ang...@nmt-nlb-03 ~]$ varnishadm -T localhost:81 | grep thread_pool
> thread_pool_add_delay      20 [milliseconds]
> thread_pool_add_threshold  2 [requests]
> thread_pool_fail_delay     200 [milliseconds]
> thread_pool_max            500 [threads]
> thread_pool_min            5 [threads]
> thread_pool_purge_delay    1000 [milliseconds]
> thread_pool_timeout        300 [seconds]
> thread_pools               2 [pools]
> Thread_pool_max is set to 500 threads.. But I just increased it to 4000
> (as per, as 'top'
> shows me it's using around 480~490 threads now..
> You suggest lowering it, what would be the effect of that? I would think
> it would run out of threads or something? Well, we'll see what happens
> with the increased threads..

Increasing concurrency is unlikely to solve the problem, although setting the 
number of thread pools to the number of CPUs is probably a good idea.

Assuming a high hit ratio and high CPU utilization (you haven't posted either), 
lowering concurrency (i.e. reducing thread_pool_max) can help reduce CPU 
contention incurred by context switching.  
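
As a rough sketch of the arithmetic, the two suggestions (one thread pool per CPU, about 16 threads per CPU) can be turned into candidate parameter values. The function name is hypothetical, and the result treats thread_pool_max as a total across all pools, which is an assumption: whether the parameter is a total or a per-pool limit depends on the Varnish version.

```python
import os

THREADS_PER_CPU = 16  # figure quoted in the thread; a starting point, not a rule


def suggested_thread_settings(cpus=None, threads_per_cpu=THREADS_PER_CPU):
    """Return candidate values keyed by varnishd parameter name."""
    if cpus is None:
        cpus = os.cpu_count() or 1
    return {
        "thread_pools": cpus,
        # Assumption: thread_pool_max is interpreted as the total thread count.
        "thread_pool_max": cpus * threads_per_cpu,
    }


if __name__ == "__main__":
    for name, value in suggested_thread_settings().items():
        print(name, value)
```

These are starting values to measure against, not tuned recommendations.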

If maximum concurrency is reached, incoming connections will be deferred to the 
TCP listen(2) backlog (the overflowed_requests counter in varnishstat increases 
when this happens).   When the request reaches the head of the queue, it will 
then be picked up by a processing thread.  The net effect is some additional 
latency, but probably not as much as you're experiencing if your CPU is swamped 
with context switches.

There are a few cases where increasing thread_pool_max can help, in particular, 
where you have a high cache-miss ratio and you have slow origin servers.  But 
if CPU is already high, it will only make the problem worse.

BTW, on FreeBSD you can view the current length of the listen(2) backlog via 
"netstat -aL"  By default, varnishd's listen(2) backlog is 512; as long as you 
don't see the length hit that value you should be ok.
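
A small sketch for watching that backlog, assuming FreeBSD's "netstat -aL" prints one row per listening socket with a qlen/incqlen/maxqlen field in the second column. The function names, the 80% warning fraction, and the sample text are illustrative assumptions.

```python
import re

QUEUE_RE = re.compile(r"^(\d+)/(\d+)/(\d+)$")


def listen_queues(netstat_text):
    """Return (local_address, qlen, maxqlen) for each listening socket row."""
    rows = []
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        match = QUEUE_RE.match(fields[1])
        if match:
            qlen, _incqlen, maxqlen = map(int, match.groups())
            rows.append((fields[2], qlen, maxqlen))
    return rows


def near_limit(rows, fraction=0.8):
    """Addresses whose queue length is at or above a fraction of the maximum."""
    return [addr for addr, qlen, maxqlen in rows
            if maxqlen and qlen >= fraction * maxqlen]


# Made-up sample in the assumed `netstat -aL` layout, for illustration only.
SAMPLE = """\
Current listen queue sizes (qlen/incqlen/maxqlen)
Proto Listen         Local Address
tcp4  0/0/512        *.80
tcp4  480/0/512      *.81
"""

if __name__ == "__main__":
    print(near_limit(listen_queues(SAMPLE)))
```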



Re: Varnish use for purely binary files

2010-01-19 Thread Michael S. Fischer
On Jan 19, 2010, at 12:46 AM, Poul-Henning Kamp wrote:

> In message , "Michael S. Fischer" writes:
>> Does Varnish already try to utilize CPU caches efficiently by employing
>> some sort of LIFO thread reuse policy or by pinning thread pools to
>> specific CPUs?  If not, there might be some opportunity for optimization
>> there.
> You should really read the varnish_perf.pdf slides I linked to yesterday...

They appear to only briefly mention the LIFO issue (in one bullet point toward 
the end), and do not discuss the CPU affinity issue.


Re: Varnish use for purely binary files

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 4:35 PM, Poul-Henning Kamp wrote:

> In message <>, Ken Brownfield writes:
>> Ironically and IMHO, one of the barriers to Varnish scalability
>> is its thread model, though this problem strikes in the thousands
>> of connections.
> It's only a matter of work to pool slow clients in Varnish into
> eventdriven writer clusters, but so far I have not seen a
> credible argument for doing it.
> A thread is pretty cheap to have around if it doesn't do anything,
> and the varnish threads typically do not do anything during the
> delivery-phase:  They are stuck in the kernel in a writev(2) 
> or sendfile(2) system call.

Does Varnish already try to utilize CPU caches efficiently by employing some 
sort of LIFO thread reuse policy or by pinning thread pools to specific CPUs?  
If not, there might be some opportunity for optimization there.


Re: Varnish use for purely binary files

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 4:15 PM, Ken Brownfield wrote:

> Ironically and IMHO, one of the barriers to Varnish scalability is its thread 
> model, though this problem strikes in the thousands of connections.

Agreed.  In an early thread on varnish-misc in February 2008 I concluded that 
reducing thread_pool_max to well below the default value (to 16 threads/CPU) 
was instrumental in attaining maximum performance on high-hit-ratio workloads.  
 (This was with Varnish 1.1; things may have changed since then but the theory 

Funny how there's always a tradeoff:

Overcommit ->  page-thrashing death
Undercommit -> context-switch death



Re: Varnish use for purely binary files

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 4:06 PM, Poul-Henning Kamp wrote:

> In message <>, "Michael S. Fischer" writes:
>> But we are not discussing serving dynamic content in this thread
>> anyway.  We are talking about binary files, aren't we?  Yes?  Blobs
>> on disk?  Unless everyone is living on a different plane than me,
>> then I think that's what we're talking about.
>> For those you should be using a general purpose webserver.  There's
>> no reason you can't run both side by side.  And I stand by my
>> original statement about their performance relative to Varnish.
> Why would you use a general purpose webserver, if Varnish can
> deliver 80 or 90% of your content much faster and much cheaper ?

There's no question that Varnish is faster and that it can handle more peak 
requests per second than a general-purpose webserver at a near-100% cache hit 
rate.  I'm merely contending that the small amount of added latency for a cache 
hit, where neither server is operating at full capacity, is not enough to 
significantly affect the user experience.

There are many competing factors that need to go into the planning process 
other than pure peak capacity, among them the cache hit ratio, the cost of a 
cache miss, and where your money is better spent: installing RAM in cache 
servers or in origin servers.



Re: Varnish use for purely binary files

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 3:54 PM, Ken Brownfield wrote:

> Adding unnecessary software overhead will add latency to requests to the 
> filesystem, and obviously should be avoided.  However, a cache in front of a 
> general web server will 1) cause an object miss to have additional latency 
> (though small) and 2) guarantee object hits will be as low as possible.  A 
> cache in front of a dedicated static file server is unnecessary, but 
> worst-case would introduce additional latency only for cache misses.

Agreed.  This is what I was trying to communicate all along.  It was my 
understanding that this was what the thread was about.

>  Does [Apache] perform "well" for static files in the absence of any other 
> function?  Yes.  Would I choose it for anything other than an application 
> server?  No.  There are much better solutions out there, and the proof is in 
> the numbers.

Not sure what you mean here... at my company it's used for everything but 
proxying (because Apache's process model is contraindicated at high 
concurrencies if you want to support Keep-Alive connections).  And we serve a 
lot of traffic at very low latencies.   


Re: Varnish use for purely binary files

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 3:47 PM, Poul-Henning Kamp wrote:

> In message , "Michael S. Fischer" writes:
>> That's why you don't use those webservers as origin servers for
>> that purpose.  But you don't use Varnish for it either.  It's not
>> an origin server anyway.
> Actually, for protocol purposes, Varnish is an origin server.
> If you read RFC2616 very carefully, you can find the one place where
> they failed to evict server-side caches from the text, when they
> realized that a cache under the control of the webmaster, is
> indistinguishable from a webserver, for protocol purposes.

I meant it for practical purposes, Poul-Henning.  But I'm sure you knew that. :)


Re: Varnish use for purely binary files

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 3:37 PM, pub crawler wrote:

>> Differences in latency of serving static content can vary widely based on
>> the web server in use, easily tens of milliseconds or more.  There are
>> dozens of web servers out there, some written in interpreted languages, many
>> custom-written for a specific application, many with add-ons and modules and
> Most webservers as shipped are simply not very speedy.   Nginx,
> Cherokee, Lighty are three exceptions :)
> Latency is all over the place in web server software.  Caching is a
> black art still no matter where you are talking about having one or
> lacking one :)  Ten milliseconds is easily wasted in a web server,
> connection pooling, negotiating the transfer, etc.  Most sites have so
> many latency issues and such a lack of performance.  

Let me be clear, in case I have not been clear enough already:

I am not talking about the edge cases of those low-concurrency, high-latency, 
scripted-language webservers that are becoming tied to web application 
frameworks like Rails and Django and that are the best fit for front-end 
caching because they are slow at serving dynamic content.  

But we are not discussing serving dynamic content in this thread anyway.  We 
are talking about binary files, aren't we?  Yes?  Blobs on disk?  Unless 
everyone is living on a different plane than me, I think that's what we're 
talking about.

For those you should be using a general purpose webserver.  There's no reason 
you can't run both side by side.  And I stand by my original statement about 
their performance relative to Varnish.


Re: Varnish use for purely binary files

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 3:08 PM, Ken Brownfield wrote:

>> I have a hard time believing that any difference in the total response time 
>> of a cached static object between Varnish and a general-purpose webserver 
>> will be statistically significant, especially considering typical Internet 
>> network latency.  If there's any difference it should be well under a 
>> millisecond.
> I would suggest that you get some real-world experience, or at least do some 
> research in this area.  Like your earlier assertion, this is patently untrue 
> as a general conclusion.

> Differences in latency of serving static content can vary widely based on the 
> web server in use, easily tens of milliseconds or more.  There are dozens of 
> web servers out there, some written in interpreted languages, many 
> custom-written for a specific application, many with add-ons and modules and 
> other hijinx that can affect the latency of serving static content.

That's why you don't use those webservers as origin servers for that purpose.  
But you don't use Varnish for it either.  It's not an origin server anyway.

> In the real world, sites run their applications through web servers, and this 
> fact does (and should) guide the decision on the base web server to use, not 
> static file serving.

I meant webservers that more than 50% of the world uses, which do not include 
those.  I was assuming, perhaps incorrectly, that the implementor would have at 
least the wisdom/laziness to use a popular general-purpose webserver such as 
Apache for the purpose of serving static objects from the filesystem.   And 
that's not even really a stretch as it's the default for most servers.

>  (Though nginx may have an on-disk cache?  And don't get me started on Apache 
> caching. :-)

Doctor, heal thyself before you call me inexperienced.  Using application-level 
caching for serving objects from the filesystem rarely works, which is the main 
point of Varnish.  Just because *you* can't get good performance out of Apache 
doesn't mean it's not worth using.


Re: Varnish use for purely binary files

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 2:16 PM, pub crawler wrote:

>> Most kernels cache recently-accessed files in RAM, and so common web servers 
>> such as Apache can already serve up static objects very quickly if they 
>> are located in the buffer cache.  (Varnish's apparent speed is largely 
>> based on the same phenomenon.)  If the data is already cached in the origin 
>> server's buffer caches, then interposing an additional caching layer may 
>> actually be somewhat harmful because it will add some additional latency.
> So far Varnish is performing very well for us as a web server of these
> cached objects.   The connection time for an item out of Varnish is
> noticeably faster than with web servers we have used - even where the
> items have been cached.  We are mostly using 3rd party tools like
> to look at the item times.
> Varnish is good as a slice in a few different place in a cluster and a
> few more when running distributed geographic clusters.   Aside from
> Nginx or something highly optimized I am fairly certain Varnish
> provides faster serving of cached objects as an out of the box default
> experience.  I'll eventually find some time to test it in our
> environment against web servers we use.

I have a hard time believing that any difference in the total response time of 
a cached static object between Varnish and a general-purpose webserver will be 
statistically significant, especially considering typical Internet network 
latency.  If there's any difference it should be well under a millisecond.


Re: Strategies for splitting load across varnish instances? And avoiding single-point-of-failure?

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 1:52 PM, Poul-Henning Kamp wrote:

> In message , "Michael S. Fischer" writes:
>> What VM can overcome page-thrashing incurred by constantly referencing a
>> working set that is significantly larger than RAM?
> No VM can "overcome" the task at hand, but some work a lot better than
> others.
> Varnish has a significant responsibility, not yet fully met, to tell
> the VM system as much about what is going on as possible.

Can you describe in more detail your comparative analysis and plans?  



Re: Varnish use for purely binary files

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 12:58 PM, pub crawler wrote:

> This is an inquiry for the Varnish community.
> Wondering how many folks are using Varnish purely for binary storage
> and caching (graphic files, archives, audio files, video files, etc.)?
> Interested specifically in large Varnish installations with either
> high number of files or where files are large in size.
> Can anyone out there using Varnish for such care to say they are?

I guess it depends on your precise configuration.

Most kernels cache recently-accessed files in RAM, and so common web servers 
such as Apache can already serve up static objects very quickly if they are 
located in the buffer cache.  (Varnish's apparent speed is largely based on the 
same phenomenon.)  If the data is already cached in the origin server's buffer 
caches, then interposing an additional caching layer may actually be somewhat 
harmful because it will add some additional latency.

If you've evenly distributed your objects among a number of origin servers, 
assuming they do nothing but serve up these static objects, and the origin 
servers have a sum total of RAM larger than your caching servers, then you 
might be better off just serving directly from the origin servers.

On the other hand, there are some use cases, such as edge-caching, where 
interposing a caching layer can be quite helpful even if the origin servers are 
fast, because making the object available closer to the requestor can conserve 
network latency.  (In fact, overcommit may be OK in this situation if the I/O 
queue depth is reasonably shallow if you can guarantee that any additional I/O 
overhead is less than network latency incurred by having to go to the origin 



Re: Strategies for splitting load across varnish instances? And avoiding single-point-of-failure?

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 1:05 PM, Poul-Henning Kamp wrote:

> In message <>, "Michael S. Fischer" writes:
>> I should have been more clear.  If you overcommit and use disk you  
>> will die.  Even SSD is a problem as the write latencies are high.
> That is still very much dependent on the quality of the VM subsystem
> in your OS kernel.

What VM can overcome page-thrashing incurred by constantly referencing a 
working set that is significantly larger than RAM?


Re: Strategies for splitting load across varnish instances? And avoiding single-point-of-failure?

2010-01-18 Thread Michael S. Fischer

On Jan 18, 2010, at 12:31 PM, Ken Brownfield wrote:

> On Jan 16, 2010, at 7:32 AM, Michael Fischer wrote:
>> On Sat, Jan 16, 2010 at 1:54 AM, Bendik Heltne wrote:
>>> Our Varnish servers have ~ 120.000 - 150.000 objects cached in ~ 4GB
>>> memory and the backends have a much easier life than before Varnish.
>>> We are about to upgrade RAM on the Varnish boxes, and eventually we
>>> can switch to disk cache if needed.
>> If you receive more than 100 requests/sec per Varnish instance and
>> you use a disk cache, you will die.
> I was surprised by this, what appears to be grossly irresponsible
> guidance, given how large the installed base is that does thousands
> per second quite happily.
> Perhaps there's missing background for this statement?  Do you mean
> swap instead of Varnish file/mmap?  Disk could just as easily mean
> SSD these days.  Even years ago on Squid and crappy EIDE drives you
> could manage 1-2,000 requests per second

I should have been more clear.  If you overcommit and use disk you  
will die.  Even SSD is a problem as the write latencies are high.



Re: Handling of cache-control

2010-01-18 Thread Michael S. Fischer
On Jan 18, 2010, at 5:20 AM, Tollef Fog Heen wrote:
> we are considering changing the defaults on how the cache-control header
> is handled in Varnish.  Currently, we only look at s-maxage and max-age
> to decide if and how long an object should be cached.  (We also look at
> expires, but that's not relevant here.)
> My suggestion is to also look at Cache-control: no-cache, possibly also
> private and no-store and obey those.

Why wasn't it doing it all along?  
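
To make the proposal concrete, here is a minimal sketch of a shared cache that honours no-cache, private and no-store in addition to s-maxage and max-age. This is an illustration only, not Varnish's implementation: the function names are hypothetical, the parser splits on commas and so does not handle quoted directive arguments containing commas, and treating no-cache as "do not cache" is a simplification of the HTTP semantics (which allow storing with revalidation).

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def shared_cache_ttl(value):
    """Return a TTL in seconds, 0 for "do not cache", or None if unspecified."""
    directives = parse_cache_control(value)
    if any(name in directives for name in ("no-cache", "no-store", "private")):
        return 0
    # s-maxage takes precedence over max-age for a shared cache.
    for name in ("s-maxage", "max-age"):
        if name in directives:
            try:
                return max(0, int(directives[name]))
            except (TypeError, ValueError):
                return 0  # malformed argument: be conservative
    return None  # caller falls back to Expires or a default TTL


if __name__ == "__main__":
    print(shared_cache_ttl("s-maxage=60, max-age=20"))
```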


Re: Varnish logging and data merging

2010-01-10 Thread Michael S. Fischer
Varnish does keep a log if you ask it to.

On Jan 10, 2010, at 10:37 PM, pub crawler wrote:

> Alright, up and running with Varnish successfully. Quite happy with
> Varnish.  Our app servers no longer are failing / overwhelmed.
> Here's our new problem...
> We have a lot of logging going on in our applications. Logs pages, IP
> info, time date, URL parameters, etc.  Since many pages are being
> served out of Varnish cache,  they don't get logged by our
> application.
> How is anyone else out there working around this sort of problem with
> an existing application?  Considering a 1x1 graphic file inclusion
> into our pages to facilitate logging and ensuring Varnish doesn't
> cache it.
> Share your ideas.
> -Paul

Re: AW: Varnish poisoned cache avoidance

2010-01-10 Thread Michael S. Fischer
It has been my experience that anti-DoS is usually easiest to implement at the 
origin server level, where the request handlers are typically more flexible and 
easier to program.  Even forking servers like Apache can issue 4xx responses 
lightning fast, without many resources being consumed.


On Jan 10, 2010, at 3:07 AM, pub crawler wrote:

> The antiDoS features would be a good enhancement to Varnish.  I
> realize it's a very complex and resource intensive thing to approach.
> Some of these functions could likely be used in other ways for
> other solutions.
> In our instance we are not experiencing a true denial of service
> attack, but rather are being overwhelmed on our app servers by a high
> sustained per second request rate.  The antiDoS features would
> certainly help us in this situation.
> Is the "delay 100ms" something that could be made available in Varnish
> near term?
> I could use this delay in conjunction with IP range of known bots
> causing problems or by defining User Agent with this.
> On Sun, Jan 10, 2010 at 5:12 AM, Poul-Henning Kamp  
> wrote:
>> In message <01cf01ca91db$8c29b790$a47d26...@de>, "Mike Schiessl" writes:
>>> How can varnishd help me prevent DDOS / DOS attacks ?
>> Firstly, by being damn fast.
>> Originally we had some plans for specific antiDoS measures, something
>> like:
>>    sub vcl_recv {
>>        if (client.bandwidth > 100 mbit/s) {
>>            delay 100ms;
>>        if (client.missratio > 20%) {
>> et cetera...
>> There are some issues and fine details to doing it, amongst other things
>> that we need to have a data structure for the client which survives
>> the individual session long enough for it to make any difference
>> in the above context.
>> The trouble of course is that a DDoS cannot be identified by IP#,
>> prompting ideas along the lines of
>>    sub vcl_recv {
>>        if (backend.hitrate < 70%) {
>>            /* do something... */
>> etc.
>> But before we get anywhere, somebody needs to figure out what we
>> can do.
>> Basically any countermeasure has two equally troublesome components:
>> 1. detection.  Knowing that you need to do something.
>> 2. mitigation.  What are we going to do ?
>> Poul-Henning
>> --
>> Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
>> | TCP/IP since RFC 956
>> FreeBSD committer   | BSD since 4.3-tahoe
>> Never attribute to malice what can adequately be explained by incompetence.

Re: Slow connections

2009-12-28 Thread Michael S. Fischer
That kind of VM overcommit (400GB on an 8GB box) is hazardous for performance 
anyway.  I strongly advise configuring Varnish cache sizes at slightly under 
the actual RAM size of the box.  If your working set size is larger, you need 
more boxes or more RAM anyway, as paging I/O will significantly impede your 

Another useful data point would be the TCP stats from the haproxy host, 
e.g., netstat -s -t (Linux).  Unfortunately you can't determine retransmits on 
a per-destination basis using netstat alone, but there might be open source 
tools out there that can help. 

Finally, sometimes I've seen these issues where the network interface or the 
switch were having speed/duplex autonegotiation issues or bad network cables; 
manually configuring the switch (e.g. to 1Gb full duplex) sometimes solved the 


On Dec 28, 2009, at 12:55 PM, Joe Williams wrote:

> Thanks Cosimo, I'll have to give that a shot. I use 400GB because I have 
> a dedicated EC2 machine for varnish and figured I would use as much 
> space as I have for it, looks like I'm actually only using around 8GB.
> -Joe
> On 12/28/09 12:45 PM, Cosimo Streppone wrote:
>> On 28 Dec 2009 19:59:02, Joe Williams  wrote:
>>> Any other suggestions? Another analysis of the logs shows that varnish
>>> vs other backends (CouchDB) I see an order of magnitude higher
>>> percentage of 3 second connection times with varnish.
>>> Here's my varnish command options
>>>  and my sysctl changes
>>> the sysctl on haproxy is
>>> identical.
>> Hi Joe,
>> I didn't follow all the discussion, but a quick look made me
>> think about my case.
>> I practically did the same steps, with the same results
>> as yours. Kernel tuning, somaxconn, listen_depth, etc...
>> In my case I was experiencing dropped connections, no
>> or very delayed syn-ack packets from varnish server, random varnish
>> restarts (crashes?), and sudden system load spikes,
>> even as high as 200/300.
>> The solution to this problem was really simple.
>> Switch from the "file" allocation to "malloc".
>> In the config file I had:
>>  -s file,/var/lib/varnish/varnish.cache,20G
>> and I changed it to:
>>  -s malloc,20G
>> I see you have 400G, so this solution probably won't work
>> for you, but my suggestion would be to at least try it
>> with as much RAM as you have, to see if the issue disappears.
>> Then you can set up some swap partition maybe.
>> Check this mail,

Re: Compressed and uncompressed cached object handling

2009-11-17 Thread Michael S. Fischer
Are you returning a "Vary: Accept-Encoding" in your origin server's  
response headers?
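
The reason the Vary header matters can be shown with a toy cache. This is a sketch of the general idea only, not how Varnish stores objects: the class and method names are hypothetical, and response header lookup is case-sensitive here for brevity. A cache that honours Vary keeps one variant per distinct value of the named request headers; without Vary, every request for the URL maps to the same stored object.

```python
class VaryCache:
    """Toy in-memory cache keyed by URL plus the request headers named in Vary."""

    def __init__(self):
        self._vary = {}     # url -> tuple of lower-cased header names from Vary
        self._objects = {}  # (url, variant key) -> body

    @staticmethod
    def _variant_key(vary_names, request_headers):
        lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
        return tuple((name, lowered.get(name, "")) for name in vary_names)

    def store(self, url, request_headers, response_headers, body):
        vary = response_headers.get("Vary", "")
        names = tuple(sorted(n.strip().lower() for n in vary.split(",") if n.strip()))
        self._vary[url] = names
        self._objects[(url, self._variant_key(names, request_headers))] = body

    def lookup(self, url, request_headers):
        if url not in self._vary:
            return None
        key = self._variant_key(self._vary[url], request_headers)
        return self._objects.get((url, key))


if __name__ == "__main__":
    cache = VaryCache()
    # Origin omits Vary: the plain object is served to a gzip client as well.
    cache.store("/test/prueba.php", {}, {}, b"plain body")
    print(cache.lookup("/test/prueba.php", {"Accept-Encoding": "gzip"}))
```

With "Vary: Accept-Encoding" on the stored response, the same gzip lookup would be a miss and trigger a separate fetch for the compressed variant.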


On Nov 17, 2009, at 4:01 PM, Daniel Rodriguez wrote:

> Hi guys,
> I'm having a problem with a Varnish implementation that we are testing
> to replace an ugly appliance. We were almost ready to place our server
> in a more realistic environment (some of our production sites), but I
> found out that there is something not working properly with the
> compression handling in my varnishd (varnish-2.0.5 - Debian).
> Varnish is returning the first object cached no matter if I ask for a
> clear object (no Accept-Encoding specified) or a gzip/deflate object.
> If the object cached is a gzip object it will return that even if
> I later ask for a clear one.
> According to what I have seen in the documentation, Varnish should keep
> both object versions (compressed and uncompressed) in the cache and
> deliver the one that is asked for by the client.
> Step 1
> I ask for a non-compressed object (no Accept-encoding specified). This
> works "great"
> GET -H "TE:" -sed "";
> 200 OK
> Cache-Control: max-age=20
> Connection: close
> Date: Mon, 16 Nov 2009 16:56:06 GMT
> Via: 1.1 varnish
> Age: 0
> Server: Apache
> Content-Length: 11013
> Content-Type: text/html; charset=iso-8859-15
> Last-Modified: Mon, 16 Nov 2009 16:56:06 GMT
> Client-Date: Mon, 16 Nov 2009 16:56:06 GMT
> Client-Peer:
> Client-Response-Num: 1
> X-Varnish: 1655545411
> The request goes like this in the log:
>    12 SessionOpen  c 57909 :80
>    12 ReqStart     c 57909 1655545411
>    12 RxRequest    c GET
>    12 RxURL        c /test/prueba.php
>    12 RxProtocol   c HTTP/1.1
>    12 RxHeader     c Connection: TE, close
>    12 RxHeader     c Host:
>    12 RxHeader     c TE:
>    12 RxHeader     c User-Agent: lwp-request/5.827 libwww-perl/5.831
>    12 VCL_call     c recv
>    12 VCL_return   c lookup
>    12 VCL_call     c hash
>    12 VCL_return   c hash
>    12 VCL_call     c miss
>    12 VCL_return   c fetch
>    14 BackendOpen  b default 33484 80
>    12 Backend      c 14 default default
>    14 TxRequest    b GET
>    14 TxURL        b /test/prueba.php
>    14 TxProtocol   b HTTP/1.1
>    14 TxHeader     b Host:
>    14 TxHeader     b User-Agent: lwp-request/5.827 libwww-perl/5.831
>    14 TxHeader     b X-Varnish: 1655545411
>    14 TxHeader     b X-Forwarded-For:
>     0 CLI          - Rd ping
>     0 CLI          - Wr 0 200 PONG 1258390564 1.0
>    14 RxProtocol   b HTTP/1.1
>    14 RxStatus     b 200
>    14 RxResponse   b OK
>    14 RxHeader     b Date: Mon, 16 Nov 2009 16:56:01 GMT
>    14 RxHeader     b Server: Apache
>    14 RxHeader     b Cache-control: max-age=20
>    14 RxHeader     b Last-Modified: Mon, 16 Nov 2009 16:56:06 GMT
>    14 RxHeader     b Connection: close
>    14 RxHeader     b Transfer-Encoding: chunked
>    14 RxHeader     b Content-Type: text/html; charset=iso-8859-15
>    12 ObjProtocol  c HTTP/1.1
>    12 ObjStatus    c 200
>    12 ObjResponse  c OK
>    12 ObjHeader    c Date: Mon, 16 Nov 2009 16:56:01 GMT
>    12 ObjHeader    c Server: Apache
>    12 ObjHeader    c Cache-control: max-age=20
>    12 ObjHeader    c Last-Modified: Mon, 16 Nov 2009 16:56:06 GMT
>    12 ObjHeader    c Content-Type: text/html; charset=iso-8859-15
>    14 BackendClose b default
>    12 TTL          c 1655545411 RFC 20 1258390566 0 0 20 0
>    12 VCL_call     c fetch
>    12 VCL_return   c deliver
>    12 Length       c 11013
>    12 VCL_call     c deliver
>    12 VCL_return   c deliver
>    12 TxProtocol   c HTTP/1.1
>    12 TxStatus     c 200
>    12 TxResponse   c OK
>    12 TxHeader     c Server: Apache
>    12 TxHeader     c Cache-control: max-age=20
>    12 TxHeader     c Last-Modified: Mon, 16 Nov 2009 16:56:06 GMT
>    12 TxHeader     c Content-Type: text/html; charset=iso-8859-15
>    12 TxHeader     c Content-Length: 11013
>    12 TxHeader     c Date: Mon, 16 Nov 2009 16:56:06 GMT
>    12 TxHeader     c X-Varnish: 1655545411
>    12 TxHeader     c Age: 0
>    12 TxHeader     c Via: 1.1 varnish
>    12 TxHeader     c Connection: close
>    12 ReqEnd       c 1655545411 1258390561.316438675 1258390566.327898026 0.000134945 5.010995150 0.000464201
>    12 SessionClose c Connection: close
>    12 StatSess     c 57909 5 1 1 0 0 1 282 11013
> Step 2
> The next request is sent with Accept-Encoding: gzip, and
> returns an uncompressed object :(
> GET -H "Accept-encoding: gzip" -H "TE:" -sed ""
> 200 OK
> Cache-Control: max-age=20
> Connection: close
> Date: Mon, 16 Nov 2009 16:56:09 GMT
> Via: 1.1 varnish
> Age: 3
> Server: Apache
> Content-Length: 11013
> Content-Type: text/html; charset=iso-8859-15
> Last-Modified: Mon, 16 Nov 2009 16:56:06 GMT
> Client-Date: Mon, 16 Nov 2009 16:56:

Re: varnish 2.0.4 questions - no IMS, no persistence cache - please help

2009-11-10 Thread Michael S. Fischer
amd64 refers to the architecture (AKA x86_64), not the particular CPU  
vendor.   (As a matter of fact, I was unaware of this limitation;  
AFAIK it does not exist in FreeBSD.)

In any event, mmap()ing 340GB even on a 64GB box is a recipe for  
disaster; you will probably suffer death by paging if your working set  
is larger than RAM.   If it's smaller than RAM, then, well, there's no  
harm in making it just under the total RAM size.
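
A rough sketch of sizing the cache just under physical RAM, assuming a Linux host where os.sysconf exposes SC_PHYS_PAGES and SC_PAGE_SIZE. The function name and the 0.9 fraction are illustrative assumptions; the right headroom depends on what else runs on the box and on Varnish's own per-object overhead. The "-s malloc,<size>" form is the one quoted elsewhere in these threads.

```python
import os


def malloc_storage_arg(total_ram_bytes=None, fraction=0.9):
    """Build a varnishd storage argument sized to a fraction of physical RAM."""
    if total_ram_bytes is None:
        # Assumption: these sysconf names are available (true on Linux).
        total_ram_bytes = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    size_mb = int(total_ram_bytes * fraction) // (1024 * 1024)
    return "-s malloc,%dM" % size_mb


if __name__ == "__main__":
    print(malloc_storage_arg())
```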


On Nov 11, 2009, at 1:04 AM, GaneshKumar Natarajan wrote:

> Thanks.
> I checked /proc/cpuinfo and it shows an Intel processor.
> So even with Intel, we see this limitation of 340 GB. This is a
> serious limitation to me, since in Squid, we were using 1.5 TB of
> storage and I thought I could mmap and use all the space for Varnish.
> If there are any workarounds or a working Linux kernel version, please let me
> know.
> My Linux version: RH4
> 2.6.9-89.ELsmp #1 SMP Mon Apr 20 10:33:05 EDT 2009 x86_64 x86_64
> x86_64 GNU/Linux
> ulimit -a:
> core file size          (blocks, -c) 0
> data seg size           (kbytes, -d) unlimited
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 1024
> max locked memory       (kbytes, -l) 32
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 65535
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> stack size              (kbytes, -s) 10240
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 278528
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
> cat /proc/cpuinfo
> processor   : 0
> vendor_id   : GenuineIntel
> cpu family  : 6
> model   : 23
> model name  : Intel(R) Xeon(R) CPU   L5240  @ 3.00GHz
> stepping: 6
> cpu MHz : 2992.505
> cache size  : 6144 KB
> physical id : 0
> siblings: 2
> core id : 0
> cpu cores   : 2
> fpu : yes
> fpu_exception   : yes
> cpuid level : 10
> wp  : yes
> flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
> mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
> nx lm pni monitor ds_cpl est tm2 cx16 xtpr lahf_lm
> bogomips: 5989.00
> clflush size: 64
> cache_alignment : 64
> address sizes   : 38 bits physical, 48 bits virtual
> power management:
> processor   : 1
> vendor_id   : GenuineIntel
> cpu family  : 6
> model   : 23
> model name  : Intel(R) Xeon(R) CPU   L5240  @ 3.00GHz
> stepping: 6
> cpu MHz : 2992.505
> cache size  : 6144 KB
> physical id : 3
> siblings: 2
> core id : 6
> cpu cores   : 2
> fpu : yes
> fpu_exception   : yes
> cpuid level : 10
> wp  : yes
> flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
> mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
> nx lm pni monitor ds_cpl est tm2 cx16 xtpr lahf_lm
> bogomips: 5985.03
> clflush size: 64
> cache_alignment : 64
> address sizes   : 38 bits physical, 48 bits virtual
> power management:
> processor   : 2
> vendor_id   : GenuineIntel
> cpu family  : 6
> model   : 23
> model name  : Intel(R) Xeon(R) CPU   L5240  @ 3.00GHz
> stepping: 6
> cpu MHz : 2992.505
> cache size  : 6144 KB
> physical id : 0
> siblings: 2
> core id : 1
> cpu cores   : 2
> fpu : yes
> fpu_exception   : yes
> cpuid level : 10
> wp  : yes
> flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
> mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
> nx lm pni monitor ds_cpl est tm2 cx16 xtpr lahf_lm
> bogomips: 5984.96
> clflush size: 64
> cache_alignment : 64
> address sizes   : 38 bits physical, 48 bits virtual
> power management:
> processor   : 3
> vendor_id   : GenuineIntel
> cpu family  : 6
> model   : 23
> model name  : Intel(R) Xeon(R) CPU   L5240  @ 3.00GHz
> stepping: 6
> cpu MHz : 2992.505
> cache size  : 6144 KB
> physical id : 3
> siblings: 2
> core id : 7
> cpu cores   : 2
> fpu : yes
> fpu_exception   : yes
> cpuid level : 10
> wp  : yes
> flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
> mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
> nx lm pni monitor ds_cpl est tm2 cx16 xtpr lahf_lm
> bogomips: 5985.04
> clflush size: 64
> cache_alignment : 64
> address sizes   : 38 bits physical, 48 bits virtual
> power management:
> Ganesh
> On Tue, Nov 10, 2009 at 1:48 AM, cripy  wrote:
>> GaneshKumar Natarajan writes:
>> Tue, 20 Oct 2009 12:35:00 -0700
>> 3. mmap storage : max i can configure is 340 GB.
>> I was able to use only 340 GB of cache

Re: Yahoo! Traffic Server

2009-11-02 Thread Michael S. Fischer
If you'd like to examine the source, you can find it at:

(I'm a Yahoo! employee, though I'm not here to represent them in any  


On Nov 2, 2009, at 4:26 PM, Ask Bjørn Hansen wrote:

> I thought this might be of interest:
>  - ask

Re: Dropped connections with tcp_tw_recycle=1

2009-09-20 Thread Michael S. Fischer
On Sep 20, 2009, at 6:20 AM, Nils Goroll wrote:

>> tcp_tw_recycle is incompatible with NAT on the server side
> ... because it will enforce the verification of TCP time stamps. Unless all
> clients behind a NAT (actually PAD/masquerading) device use identical timestamps
> (within a certain range), most of them will send invalid TCP timestamps so SYNs
> will get dropped.

Since you seem pretty knowledgeable on the subject, can you please  
explain the difference between tcp_tw_reuse and tcp_tw_recycle?



Re: Memory spreading, then stop responding

2009-07-28 Thread Michael S. Fischer
On Jul 28, 2009, at 3:09 PM, Rob S wrote:

> Michael S. Fischer wrote:
>> On Jul 28, 2009, at 2:35 PM, Rob S wrote:
>>> Thanks Darryl.  However, I don't think this solution will work in  
>>> our
>>> usage.  We're running a blog.  Administrators get un-cached access,
>>> straight through varnish.  Then, when they publish, we issue a purge
>>> across the entire site.  We need to do this as there's various  
>>> bits of
>>> navigation that'd need to be updated.  I can't see that we can do  
>>> this
>>> if we set obj.ttl.
>>> Has anyone any recommendations as to how best to deal with purges  
>>> like this?
>> If you're issuing a PURGE across the entire site, why not simply  
>> restart Varnish with an empty cache?
>> --Michael
> Because Varnish is also working for other hosts which don't need  
> purging at the same time...

My company gets around this madness by versioning its URLs.  It works  
pretty well.
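
A minimal sketch of what URL versioning can look like, assuming the version tag is derived from the content itself. The function name and the path layout are hypothetical and not taken from this thread. Because every published change produces a new URL, stale objects simply stop being requested and no site-wide purge is needed.

```python
import hashlib
import posixpath


def versioned_url(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the file extension of a URL path."""
    digest = hashlib.sha1(content).hexdigest()[:digest_len]
    root, ext = posixpath.splitext(path)
    # "/css/site.css" becomes "/css/site.<digest>.css"
    return f"{root}.{digest}{ext}"


old = versioned_url("/css/site.css", b"body { color: black; }")
new = versioned_url("/css/site.css", b"body { color: navy; }")
print(old)
print(new)
```

The pages that reference the asset must of course be regenerated with the new URL, so this works best for static resources linked from short-lived or uncached HTML.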


Re: Memory spreading, then stop responding

2009-07-28 Thread Michael S. Fischer
On Jul 28, 2009, at 2:35 PM, Rob S wrote:
> Thanks Darryl.  However, I don't think this solution will work in our
> usage.  We're running a blog.  Administrators get un-cached access,
> straight through varnish.  Then, when they publish, we issue a purge
> across the entire site.  We need to do this as there's various bits of
> navigation that'd need to be updated.  I can't see that we can do this
> if we set obj.ttl.
> Has anyone any recommendations as to how best to deal with purges  
> like this?

If you're issuing a PURGE across the entire site, why not simply  
restart Varnish with an empty cache?



Re: 100% Transparent Reverse Proxy

2009-07-25 Thread Michael S. Fischer
What's the purpose of these requirements?  Just curious.


On Jul 25, 2009, at 9:10 PM, Ryan Chan wrote:

> Hello,
> I have several web sites running on Apache/PHP. I want to install a
> transparent reverse proxy (e.g. squid, varnish) to cache the static
> stuff (by looking at the Expires or Last-Modified response header).
> However, one of my requirements is that neither the client (browser) nor
> the server (Apache/PHP) is aware of the existence of that proxy.
> E.g.
> Client will not see headers such as Via, Age etc.
> Server will not see headers such as X-Forwarded-For
> I want to ask: Is it possible to do the above using varnish?
> Thanks.

Re: Time for a Varnish user meeting ?

2009-06-15 Thread Michael S. Fischer

I think you mean 1 week :)


On Jun 15, 2009, at 11:02 AM, Jauder Ho wrote:

Well, Velocity is in 2 weeks in San Jose if anyone wants to meet.  
It's short notice but probably an appropriate conference.


On Mon, Jun 15, 2009 at 3:07 AM, Poul-Henning Kamp  

Isn't it time we start to plan a Varnish User Meeting ?

We can either do it as a stand alone thing, a one day event somewhere
convenient (Oslo ?  Copenhagen ?  London ?) or we can try to piggyback
onto some related conference and hold our meeting before/after the

Anybody willing to try to organize something ?


Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20 | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by  


Re: Poor #requests/second performance

2009-06-01 Thread Michael S. Fischer
Ok, so your average latency is 16ms.  At a concurrency of 10, at most,  
you can obtain 625r/s.

(1 request/connection / 0.016s = 62.5 request/s/connection * 10  
connections = 625 request/s)

Try increasing your benchmark concurrency.
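
The arithmetic above can be sketched as a small function. It assumes a closed-loop benchmark in which each of the concurrent connections issues one request at a time, which is how `ab` behaves; the function name is made up for illustration.

```python
def max_requests_per_second(mean_latency_s: float, concurrency: int) -> float:
    """Upper bound on throughput when each connection has one request in flight."""
    return concurrency / mean_latency_s


# 16 ms per request at the concurrency level used in the quoted ab run
print(round(max_requests_per_second(0.016, 10)))

# Same latency, higher concurrency: the ceiling scales linearly
for c in (10, 50, 100):
    print(c, round(max_requests_per_second(0.016, c)))
```

Plugging in the 16.288 ms mean that `ab` reports gives a ceiling that agrees with its measured 613.95 requests per second, so the benchmark itself, not Varnish, is the limit here.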


On Jun 1, 2009, at 11:10 PM, Andreas Jung wrote:

> On 02.06.09 08:04, Poul-Henning Kamp wrote:
>> In message <>, Andreas Jung writes:
 Examining varnishstat to see what happens.

>>> At what in particular. Looking at varnishstat does not give me a  
>>> clue
>>> about a possible problem.
>> Dropped requests. A small number is OK, a continuous growth is not.
>> ("overflowed" requests are OK).
>> Threads not created should be zero.
> Output of varnishstat:
> 0+00:40:24
> diaweb04
> Hitrate ratio:   10   83   83
> Hitrate avg: 1.   0.9912   0.9912
>   50050 0.0020.65 Client connections accepted
>   50049 0.0020.65 Client requests received
>   50015 0.0020.63 Cache hits
>  35 0.00 0.01 Cache misses
>  35 0.00 0.01 Backend connections success
>  28 0.00 0.01 Backend connections reuses
>  35 0.00 0.01 Backend connections recycles
>   1  ..   N struct srcaddr
>   0  ..   N active struct srcaddr
>  93  ..   N struct sess_mem
>   0  ..   N struct sess
>  17  ..   N struct object
>  17  ..   N struct objecthead
>  28  ..   N struct smf
>   3  ..   N small free smf
>   3  ..   N large free smf
>   7  ..   N struct vbe_conn
>   6  ..   N struct bereq
>  12  ..   N worker threads
>  12 0.00 0.00 N worker threads created
>  62 0.00 0.03 N overflowed work requests
>   2  ..   N backends
>  30  ..   N expired objects
>  55  ..   N LRU moved objects
>   50048 0.0020.65 Objects sent with write
>   50050 0.0020.65 Total Sessions
>   50050 0.0020.65 Total Requests
>  35 0.00 0.01 Total fetch
>13213046 0.00  5450.93 Total header bytes
>   423243959 0.00174605.59 Total body bytes
>   50050 0.0020.65 Session herd
> and the result of 'ab':
> aj...@blackmoon:~> cat out.txt
> Server Software:Unknown
> Server
> Server Port:80
> Document Path:  /logo.jpg
> Document Length:8448 bytes
> Concurrency Level:  10
> Time taken for tests:   81.439 seconds
> Complete requests:  5
> Failed requests:0
> Write errors:   0
> Total transferred:  436644839 bytes
> HTML transferred:   42240 bytes
> Requests per second:613.95 [#/sec] (mean)
> Time per request:   16.288 [ms] (mean)
> Time per request:   1.629 [ms] (mean, across all concurrent  
> requests)
> Transfer rate:  5235.93 [Kbytes/sec] received
> Connection Times (ms)
>  min  mean[+/-sd] median   max
> Connect:02  35.5  12999
> Processing: 2   14  31.3 123009
> Waiting:0   11  19.4 102895
> Total:  3   16  47.3 143015
> Percentage of the requests served within a certain time (ms)
>  50% 14
>  66% 16
>  75% 18
>  80% 19
>  90% 23
>  95% 26
>  98% 31
>  99% 36
> 100%   3015 (longest request)
> Andreas

Re: Varnish restarts when all memory is allocated

2009-05-29 Thread Michael S. Fischer
I think the lesson of these cases is pretty clear:  make sure your
cacheable working set fits into the proxy server's available memory --
or, if you want to exceed your available memory, make sure your hit
ratio is sufficiently high that the cache server rarely resorts to
paging in the data.  Otherwise, your cache server will suffer I/O
starvation due to excessive paging.

This is a general working principle of cache configuration, and not  
specific to Varnish.  After all, the whole point of a cache is to  
serve cacheable objects quickly.  Disk is not fast, and Varnish  
doesn't make slow disks faster.

This goal will become much easier if you can put a "layer 7 switch" in
front of a pool of Varnish servers and route HTTP requests to them
based on some attribute of the request (e.g., the URI and/or Cookie:
header), thereby ensuring efficient use of your cache pool's memory.
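
As a rough illustration of the routing idea, here is a sketch that maps each URI to one member of a cache pool by hashing it. The server names are hypothetical, and a real layer 7 switch would do this in its own configuration language rather than in Python. The point is only that a given URI always lands on the same cache, so each object occupies memory on one server instead of on all of them.

```python
import hashlib
from typing import Sequence


def pick_cache(uri: str, caches: Sequence[str]) -> str:
    """Map a URI to one cache server using a stable hash of the URI."""
    digest = hashlib.md5(uri.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(caches)
    return caches[index]


pool = ["cache-a", "cache-b", "cache-c"]  # hypothetical pool members
for uri in ("/logo.jpg", "/index.html", "/img/42.png"):
    print(uri, "->", pick_cache(uri, pool))
```

With plain modulo hashing, changing the pool size remaps most URIs, so a scheme that tolerates membership changes is preferable in production.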

Best regards,


On May 29, 2009, at 3:42 AM, Marco Walraven wrote:

> On Tue, May 26, 2009 at 11:29:08PM +0200, Marco Walraven wrote:
>> Hi,
>> We are testing a Varnish Cache in our production environment with a  
>> 500Gb storage file and
>> 32Gb of RAM. Varnish performance is excellent when all of the 32Gb  
>> is not allocated yet.
>> The rates I am seeing here are around 40-60Mbit/s, with roughly  
>> 2.2M objects in cache and
>> hitting a ratio of ~0.65, even then Varnish can handle it easily.  
>> However it is still
>> warming up since we have a lot of objects that need to be cached.
>> The problem I am facing is that as soon as RAM is exhausted Varnish  
>> restarts itself.
> In the meantime I have been doing some tests tweaking the VM system  
> under Linux, especially
> vm.min_free_kbytes, leaving some memory for pdflush and kswapd. The  
> results are slightly better,
> but still varnishd starts to hog the CPU's and restarts.  
> Alternatively we disabled swap, running
> without it. But also tested with a swap file of 16Gb on a different  
> disk. Again slightly better
> results but still the same effect in the end.
> We also ran Varnish without the file storage type having just 8Gb  
> assigned with malloc, this ran
> longer than the other tests we did. Varnishd did not crash but got
> extremely high CPU usage (700%)
> and recovered from that after a minute or 2.
> The linux system run with the following sysctl config applied:
> Linux varnish001 2.6.18-6-amd64 #1 SMP Tue May 5 08:01:28 UTC 2009  
> x86_64 GNU/Linux
> /etc/sysctl.conf
> net.ipv4.ip_local_port_range = 1024 65536
> net.core.rmem_max=16777216
> net.core.wmem_max=16777216
> net.ipv4.tcp_rmem=4096 87380 16777216
> net.ipv4.tcp_wmem=4096 65536 16777216
> net.ipv4.tcp_fin_timeout = 3
> net.ipv4.tcp_tw_recycle = 1
> net.core.netdev_max_backlog = 3
> net.ipv4.tcp_no_metrics_save=1
> net.core.somaxconn = 262144
> net.ipv4.tcp_syncookies = 0
> net.ipv4.tcp_max_orphans = 262144
> net.ipv4.tcp_max_syn_backlog = 262144
> net.ipv4.tcp_synack_retries = 2
> net.ipv4.tcp_syn_retries = 2
> vm.swappiness = 0
> vm.min_free_kbytes = 4194304
> vm.dirty_background_ratio = 25
> vm.dirty_expire_centisecs = 1000
> vm.dirty_writeback_centisecs = 100
> So, yesterday I installed FreeBSD 7.2 STABLE with the latest CVSup on the
> second Varnish box and ran Varnish with the exact same config as on the
> Linux box. Exact same setup: 16 Gb swap file, same arguments for varnishd,
> same vcl, same amount of traffic, connections etc. I did apply performance
> tuning as described on the wiki. Both systems ran OK until the moment there
> was little RAM left. Linux showed the exact same behaviour as before: high
> CPU load, varnishd with high CPU usage, and in the end varnishd restarted
> with an empty cache. FreeBSD kept on going as I expected it to work, with a
> higher load but still serving images at 60Mbit/s. I did see that it
> sometimes needed to recover, meaning it accepted no connections for a few
> moments and then carried on again, but enough to notice.
> Both systems run varnishd as follows. I changed the number of
> buckets to 450011 as opposed to the previous tests I ran. Same for
> the lru_interval, which was 60 and maybe too low.
> /usr/sbin/varnishd -P /var/run/ -a :80 -f /etc/varnish/ 
> default.vcl -T -t 3600 -w 400,4000,60 -s file,500G -p  
> obj_workspace 8192 -p sess_workspace 262144 -p lru_interval 600 -h  
> classic,450011 -p sess_timeout 2 -p listen_depth 8192 -p  
> log_hashstring off -p shm_workspace 32768 -p ping_interval 10 -p  
> srcaddr_ttl 0 -p esi_syntax 1
> Below some output of Linux when it started to hog the CPU and output  
> of the FreeBSD system 15 minutes
> later when it was still going.
> So is this kind of setup actually possible ? And if so how to get it  
> running smoothly ? So far FreeBSD
> comes pretty close but not yet there.
> Thanks for the help,
> Marco
> Linux:
> Hitrate ratio:   10  

Re: Theoretical connections/second limit using Varnish

2009-04-29 Thread Michael S. Fischer
On Apr 29, 2009, at 9:30 AM, Nick Loman wrote:

> Michael S. Fischer wrote:
>> On Apr 29, 2009, at 9:22 AM, Poul-Henning Kamp wrote:
>>> In message <>, Nick Loman writes:
>>>> Has Varnish got a solution to this problem which does not involve
>>>> time-wait recycling? One thing I've thought of is perhaps  
>>>> is used or could be used when Varnish makes connections to the  
>>>> backend?
>>> Varnish tries as hard as reasonable to reuse backend connections,
>>> so you should be able to get multiple requests per backend  
>>> connection.
>>> If this is not the case for you, you should find out why backend  
>>> connections
>>> are not reused.
> Hi Poul-Henning, Michael,
> I've configured Apache with KeepAlive off, so I expect the TCP  
> connection to be closed after each request and Varnish won't be able  
> to use it.
> I've done that for a specific reason relating to backend PHP  
> processes.

I don't dispute your reasoning; my employer does this as well.   
KeepAlive with Apache/PHP can be a recipe for resource starvation on  
your origin servers.

> I typically have thousands of connections in TIME_WAIT mode as a  
> result, which is expected, but I wonder what the solution could be  
> if I ever hit more connections than local ports available.

I think SO_REUSEADDR is the answer - I'm somewhat surprised that  
Varnish doesn't set it by default for the backend connections.
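
To put a rough number on the port-exhaustion worry, here is a back-of-the-envelope sketch. It assumes that every backend request opens a fresh connection, that the resulting TIME_WAIT entries accumulate on the Varnish side, and that TIME_WAIT lasts 60 seconds; the port range is an example, not a value from this thread. The limit applies per combination of source address, backend address and backend port.

```python
def max_new_connections_per_second(port_low: int, port_high: int,
                                   time_wait_s: float) -> float:
    """Sustained connection rate before all local ports sit in TIME_WAIT."""
    usable_ports = port_high - port_low + 1
    return usable_ports / time_wait_s


# Assumed values: local port range 1024-65535, TIME_WAIT of 60 seconds
print(max_new_connections_per_second(1024, 65535, 60))
```

Under these assumptions the ceiling is on the order of a thousand new backend connections per second per backend, which is why connection reuse matters at high miss rates.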


Re: Theoretical connections/second limit using Varnish

2009-04-29 Thread Michael S. Fischer
On Apr 29, 2009, at 9:22 AM, Poul-Henning Kamp wrote:

> In message <>, Nick Loman writes:
>> Has Varnish got a solution to this problem which does not involve
>> time-wait recycling? One thing I've thought of is perhaps  
>> is used or could be used when Varnish makes connections to the  
>> backend?
> Varnish tries as hard as reasonable to reuse backend connections,
> so you should be able to get multiple requests per backend connection.
> If this is not the case for you, you should find out why backend  
> connections
> are not reused.

The OP said he turned backend Keep-Alive off.  That's his problem.


Re: Varnish 2.0.3 consuming excessive memory

2009-04-07 Thread Michael S. Fischer
Not that I have an answer, but I'd be curious to see the differences  
in 'pmap -x ' output for the different children.


On Apr 7, 2009, at 6:27 PM, Darryl Dixon - Winterhouse Consulting wrote:

>> Hi All,
>> I have an odd problem that I have only noticed happening since  
>> moving from
>> 1.1.2 to 2.0.3 - excessive memory consumption of the varnish child
>> process. For example, I have a varnish instance with a 256MB cache
>> allocated, that is currently consuming 4.9GB of resident memory  
>> (6.5GB
>> virtual). The instance has only been running for 4 days and has  
>> only got
>> 25MB of objects in the cache.
>> This is clearly excessive and is causing us some serious problems  
>> in terms
>> of memory pressure on the machine. Our VCL is largely unchanged  
>> from our
>> 1.1.2 setup to the 2.0.3 except for the obvious vcl.syntax changes,  
>> and
>> the introduction of request restarts under certain limited  
>> scenarios. Can
>> anyone shed some light?
> One further footnote to this. I have a second varnish instance  
> running on
> the same machine which talks to different backend servers (still  
> primarily
> the same sort of content though), with the VCL only fractionally  
> different
> - it does not seem to suffer from the runaway memory consumption of  
> the
> first instance. The only difference in the VCL between the two is  
> that in
> the one with runaway memory this is present in vcl_recv():
> +if (req.http.Pragma ~ ".*no-cache.*" || req.http.Cache-Control ~
> ".*no-cache.*") {
> +purge_url(regsub(req.url, "[?].*$", ".*$"));
> +}
> +
> Is there possibly something in the regsub engine being triggered  
> that is
> very expensive and would cause it to consume and hold on to large  
> amounts
> of memory?
> regards,
> Darryl Dixon
> Winterhouse Consulting Ltd
> ___
> varnish-misc mailing list

varnish-misc mailing list
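
For reference, the regsub() call in the VCL quoted above can be mimicked in Python to see which purge expressions it generates. This is only an emulation under the assumption that regsub() replaces the first match; it says nothing about how Varnish stores purges internally. One possible explanation, not confirmed in this thread, is that every no-cache request with a different URL adds another distinct purge expression that the child has to keep around.

```python
import re


def purge_pattern(url: str) -> str:
    """Emulate VCL regsub(req.url, "[?].*$", ".*$") for a single URL."""
    return re.sub(r"[?].*$", ".*$", url, count=1)


urls = ["/blog/post?id=7", "/blog/post?id=8", "/feed?page=2", "/about"]
patterns = {purge_pattern(u) for u in urls}
for p in sorted(patterns):
    print(p)
```

URLs without a query string pass through unchanged, so each distinct path that is requested with no-cache yields its own expression.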

Re: Default behaviour with regards to Cache-Control

2009-02-12 Thread Michael S. Fischer
On Feb 12, 2009, at 3:34 AM, Poul-Henning Kamp wrote:
> Well, if people in general think our defaults should be that way, we
> can change them, our defaults are whatever the consensus can agree on.

I'm with the OP.  Regardless of the finer details of the RFC, if I'm a  
web developer and I set the "Cache-Control:" header to "private" or  
"no-cache," I would expect it not to be cached by any midstream proxy,  
regardless of who controls it.  This would be especially true if I  
worked for a larger organization, where some folks in another country,  
despite receiving a paycheck signed by the same person as me, may  
control a proxy layer I'm not even aware of.
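
The expectation described here can be written down as a small check. This is a sketch of the behaviour argued for in this message, not of what Varnish does by default, and the function name is invented for illustration.

```python
def shared_cache_may_store(cache_control: str) -> bool:
    """Return False when the origin marked the response private or no-cache."""
    directives = {
        part.split("=", 1)[0].strip().lower()
        for part in cache_control.split(",")
        if part.strip()
    }
    return not (directives & {"private", "no-cache"})


for header in ("private", "max-age=60, no-cache", "public, max-age=300", ""):
    print(repr(header), shared_cache_may_store(header))
```

Directive names are case-insensitive, so the check lowercases them before comparing.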



Re: Cookie Expiring Date

2009-02-03 Thread Michael S. Fischer
On Feb 3, 2009, at 6:25 AM, Tollef Fog Heen wrote:

> If it has expired, the client just won't send it, so just check
> req.http.cookie for the relevant cookie and you'll be fine.

I strongly advise against this, as it could subject you to replay  

That said, the client does not include an expiration date with the  
Cookie: header in an HTTP request.  You'll have to check the validity  
of the header on the backend, or modify Varnish to do it for you.
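
One way a backend can check validity is to embed the expiry time in the cookie value and sign it, so that the server does not depend on the client discarding the cookie. The sketch below assumes a shared secret and a simple `user|expiry|signature` layout; both are hypothetical and not something from this thread.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"example-secret"  # hypothetical key, keep the real one private


def _sign(payload: str, secret: bytes) -> str:
    return hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def make_cookie_value(user: str, expires_at: int, secret: bytes = SECRET) -> str:
    """Build 'user|expiry|signature'; user must not contain '|'."""
    payload = f"{user}|{expires_at}"
    return f"{payload}|{_sign(payload, secret)}"


def cookie_is_valid(value: str, now: Optional[float] = None,
                    secret: bytes = SECRET) -> bool:
    """Accept the cookie only if the signature matches and it has not expired."""
    parts = value.split("|")
    if len(parts) != 3 or not parts[1].isdigit():
        return False
    user, expires_text, signature = parts
    expected = _sign(f"{user}|{expires_text}", secret)
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return False
    current = time.time() if now is None else now
    return current < int(expires_text)


value = make_cookie_value("alice", int(time.time()) + 3600)
print(cookie_is_valid(value))
```

A captured cookie can still be replayed until it expires, so this bounds the exposure window rather than eliminating it.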


Re: [+] Re: Breaking Varnish

2009-01-28 Thread Michael S. Fischer
On Jan 28, 2009, at 10:04 AM, Niall O'Higgins wrote:
>> This is a typical indication of raw overload, what levels of traffic
>> are you hitting it with ?
> This kind of thing:
> Transaction rate:3776.65 trans/sec
> Throughput: 1.68 MB/sec
> Concurrency:   28.28

That doesn't seem that high.  What OS/# CPUs are you using?


Re: [varnish] renaming varnish concepts...

2009-01-28 Thread Michael S. Fischer
On Jan 28, 2009, at 4:30 AM, Poul-Henning Kamp wrote:
> Your question is -exactly- why I want the rename:  purge sounds like
> something happens to the object right now, and that is not possible
> from the CLI context.

How about 'qpurge' ?


Re: 2.1 plans

2009-01-09 Thread Michael S. Fischer
On Jan 9, 2009, at 1:59 AM, Tollef Fog Heen wrote:

> | What about CARP-like cache routing (i.e., where multiple cache  
> servers
> | themselves are hash buckets)?  This would go a LONG way towards
> | scalability.
> second item
> sounds like what you want?

Yup, sounds like what I proposed about a year ago :-)


Re: 2.1 plans

2009-01-08 Thread Michael S. Fischer
+1.  This is a very good idea for optimizing RAM utilization.


On Jan 8, 2009, at 11:25 AM, Jeff Anderson wrote:

> Thanks for the response.
> I think inline page compression would be great too.  Store gzipped
> objects in the persistent cache and unzip if uncompressed objects are
> requested.
> On Jan 8, 2009, at 10:54 AM, Per Buer wrote:
>> Jeff Anderson wrote:
>>> I'd like to see individual object request statistics and a method to
>>> prefetch objects from the backend that are most frequently  
>>> requested.
>>> Perhaps also a way to prioritize objects into cache tiers based on
>>> frequency of requests.  So, for example, highly requested objects  
>>> are
>>> maintained in RAM and less frequently requested objects are cached  
>>> to
>>> disk.
>> Your operating system already does this today with Varnish. Squid
>> tries
>> to maintain a two tier cache hierarchy without success.
>>> If persistent storage is on its way maybe a method to assign
>>> priority to large disk cache volumes versus memory regions.
>> Noted.
>>> It might  be nice to have a distributed and/or tiered cache model
>>> where a single
>>> master has a very large cache and potentially very long grace  
>>> ability
>>> where objects can exist even if stale.  That master in turn could
>>> host
>>> frontend caches that communicate  efficiently to the master cache  
>>> and
>>> also have a tiered internal object priority.
>> I believe most of this can be achieved today. Stale objects will
>> hopefully reach the 2.0 series before the 2.1 revolutions - at least
>> as
>> a patch, I hope.
>>> Thanks,
>>> --Jeff
>>> On Jan 8, 2009, at 2:29 AM, Tollef Fog Heen wrote:

>>>> a short while before Christmas, I wrote up a small document pointing to
>>>> what I would like to get into 2.1 and when I'd like milestones to
>>>> happen.  This is a suggestion, I'm open to ideas and comments on both
>>>> feature set as well as if my guesstimates for dates is completely off:
>>>> Varnish 2.1 release plan
>>>> The theme for Varnish 2.1 is "scalability", particularly trying to
>>>> address the needs of sites like which has a lot of objects and
>>>> where priming the cache takes a long time, leading to long periods of
>>>> higher load on the backend servers.
>>>> The main feature is persistent storage, see
>>>> for design notes. Another important scalability feature is a new
>>>> lockless hash algorithm which scales much better than the current
>>>> one.  Poul-Henning already has an implementation of this in the tree,
>>>> but it's still fresh.
>>>> Minor features which would be nice to get in are:
>>>> * Web UI, showing pretty graphs as well as allowing easy configuration
>>>>  of a cluster of Varnish machines.
>>>> * Expiry randomisation.  This reduces the "lemmings" effect where you
>>>>  end up with a many objects with almost the same TTL (typically on
>>>>  startup) which then expire at the same time.  The feature will allow
>>>>  you to set the TTL to plus/minus X %.
>>>> * Dynamic, user-defined counters that can be read and written from VCL
>>>> * Forced purges, where a thread walks the list of purged objects and
>>>>  removes them.
>>>> The schedule
>>>> Alphas:
>>>> - 2009-01-15: New hash algorithm working
>>>> - 2009-02-15: Web UI
>>>> - 2009-03-15: Persistent storage
>>>> Beta:
>>>> - 2009-04-01: Feature complete
>>>> Release
>>>> - 2009-05-20: Release candidate
>>>> - 2009-05-01: No release critical bugs left
>>>> - 2009-05-10: Release
>>>> --
>>>> Tollef Fog Heen
>>>> Redpill Linpro -- Changing the game!
>>>> t: +47 21 54 41 73
>>>> varnish-misc mailing list
>>> --Jeff
>>> ___
>>> varnish-misc mailing list
>> -- 
>> Per Buer - Leder Infrastruktur og Drift - Redpill Linpro
>> Telefon: 21 54 41 21 - Mobil: 958 39 117
>> |
>> ___
>> varnish-misc mailing list
> --Jeff

Re: 2.1 plans

2009-01-08 Thread Michael S. Fischer
What about CARP-like cache routing (i.e., where multiple cache servers
themselves are hash buckets)?  This would go a LONG way towards
scalability.

On Jan 8, 2009, at 2:29 AM, Tollef Fog Heen wrote:

> Hi,
> a short while before Christmas, I wrote up a small document pointing  
> to
> what I would like to get into 2.1 and when I'd like milestones to
> happen.  This is a suggestion, I'm open to ideas and comments on both
> feature set as well as if my guesstimates for dates is completely off:
> Varnish 2.1 release plan
> The theme for Varnish 2.1 is "scalability", particularly trying to
> address the needs of sites like which has a lot of objects and
> where priming the cache takes a long time, leading to long periods of
> higher load on the backend servers.
> The main feature is persistent storage, see
> for design notes. Another important scalability feature is a new
> lockless hash algorithm which scales much better than the current
> one.  Poul-Henning already has an implementation of this in the tree,
> but it's still fresh.
> Minor features which would be nice to get in are:
> * Web UI, showing pretty graphs as well as allowing easy configuration
>  of a cluster of Varnish machines.
> * Expiry randomisation.  This reduces the "lemmings" effect where you
>  end up with a many objects with almost the same TTL (typically on
>  startup) which then expire at the same time.  The feature will allow
>  you to set the TTL to plus/minus X %.
> * Dynamic, user-defined counters that can be read and written from VCL
> * Forced purges, where a thread walks the list of purged objects and
>  removes them.
> The schedule
> Alphas:
> - 2009-01-15: New hash algorithm working
> - 2009-02-15: Web UI
> - 2009-03-15: Persistent storage
> Beta:
> - 2009-04-01: Feature complete
> Release
> - 2009-05-20: Release candidate
> - 2009-05-01: No release critical bugs left
> - 2009-05-10: Release
> -- 
> Tollef Fog Heen
> Redpill Linpro -- Changing the game!
> t: +47 21 54 41 73

Re: question about configure warning

2009-01-06 Thread Michael S. Fischer
On Jan 6, 2009, at 7:42 AM, Marcus Smith wrote:
> "The build system will automatically detect the availability of  
> epoll()
> and build the corresponding cache_acceptor. It will also automatically
> detect the availability of sendfile(), though its use is discouraged
> (and disabled by default) due to a variety of non-Linux-specific  
> issues."
> (

What are these "non-Linux-specific issues" to which the document refers?


Re: Varnish performance tests

2008-12-08 Thread Michael S. Fischer
On Dec 8, 2008, at 9:03 AM, Per Buer wrote:

> Rebert Luc wrote:
>> Hello,
>> In our studies we have a project which consists in testing the
>> performance of Varnish in order to make a comparative with and  
>> without
>> the proxy cache.
>> Does anyone know which utilities to employ ? (knowing that the aim  
>> is to
>> justify our tests)
> You could have a look at varnishreplay to "replay" earlier recorded
> varnishlogs. If you also want to synthesize traffic you could also  
> check
> out siege and curl-loader.
> If you have a high hitrate in your cache you'll most likely end up
> benchmarking the tools rather than varnish. I think I measured siege
> to
> be ~3x slower than varnish.

I used httperf with great success.  However, it's only single-threaded,
so you may want to put a wrapper around it to test request
concurrencies > 1.
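
A wrapper of the kind mentioned could look roughly like this: it starts several httperf processes in parallel and collects their reports. The host name and numbers are placeholders, and the sketch assumes httperf is installed and accepts the `--server`, `--port`, `--uri`, `--num-conns` and `--rate` options.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import List


def httperf_command(server: str, uri: str, num_conns: int, rate: int,
                    port: int = 80) -> List[str]:
    """Build the argument list for one httperf process."""
    return [
        "httperf",
        "--server", server,
        "--port", str(port),
        "--uri", uri,
        "--num-conns", str(num_conns),
        "--rate", str(rate),
    ]


def run_parallel(commands: List[List[str]]) -> List[str]:
    """Run all commands at the same time and return each process's stdout."""
    def run(cmd: List[str]) -> str:
        return subprocess.run(cmd, capture_output=True, text=True).stdout

    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        return list(pool.map(run, commands))
```

For example, `run_parallel([httperf_command("cache.example", "/logo.jpg", 10000, 200)] * 4)` would launch four generators against a hypothetical host; the per-process reports then have to be summed by hand or by a further script.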


Re: Overflowed work requests

2008-11-23 Thread Michael S. Fischer
How many CPUs (including all cores) are in your systems?


On Nov 20, 2008, at 12:06 PM, Michael wrote:

> Hi,
> PF> What does "overflowed work requests" in varnishstat signify ? If  
> this
> PF> number is large is it a bad sign ?
> I have similar problem. "overflowed work requests" and "dropped work
> requests" is too large.
> varnish-2.0.2 from ports
>> varnishstat -1
> uptime385  .   Child uptime
> client_conn115120   299.01 Client connections accepted
> client_req 113731   295.41 Client requests received
> cache_hit   39565   102.77 Cache hits
> cache_hitpass833821.66 Cache hits for pass
> cache_miss  65744   170.76 Cache misses
> backend_conn74104   192.48 Backend connections success
> backend_unhealthy0 0.00 Backend connections not  
> attempted
> backend_busy0 0.00 Backend connections too  
> many
> backend_fail0 0.00 Backend connections  
> failures
> backend_reuse   73414   190.69 Backend connections reuses
> backend_recycle         73469       190.83 Backend connections recycles
> backend_unused              0         0.00 Backend connections unused
> n_srcaddr                3207            . N struct srcaddr
> n_srcaddr_act             456            . N active struct srcaddr
> n_sess_mem               1910            . N struct sess_mem
> n_sess                   1780            . N struct sess
> n_object                63603            . N struct object
> n_objecthead            63603            . N struct objecthead
> n_smf                  126931            . N struct smf
> n_smf_frag                  1            . N small free smf
> n_smf_large      18446744073709551614            . N large free smf
> n_vbe_conn                239            . N struct vbe_conn
> n_bereq                   391            . N struct bereq
> n_wrk                     496            . N worker threads
> n_wrk_create              496         1.29 N worker threads created
> n_wrk_failed                0         0.00 N worker threads not created
> n_wrk_max               47907       124.43 N worker threads limited
> n_wrk_queue               455         1.18 N queued work requests
> n_wrk_overflow         111098       288.57 N overflowed work requests
> n_wrk_drop              47232       122.68 N dropped work requests
> n_backend                   1            . N backends
> n_expired                1960            . N expired objects
> n_lru_nuked                 0            . N LRU nuked objects
> n_lru_saved                 0            . N LRU saved objects
> n_lru_moved             32435            . N LRU moved objects
> n_deathrow                  0            . N objects on deathrow
> losthdr                    22         0.06 HTTP header overflows
> n_objsendfile               0         0.00 Objects sent with sendfile
> n_objwrite              85336       221.65 Objects sent with write
> n_objoverflow               0         0.00 Objects overflowing workspace
> s_sess                  77004       200.01 Total Sessions
> s_req                  113233       294.11 Total Requests
> s_pipe                      0         0.00 Total pipe
> s_pass                   8638        22.44 Total pass
> s_fetch                 73696       191.42 Total fetch
> s_hdrbytes           33793720     87775.90 Total header bytes
> s_bodybytes        3821523829   9926035.92 Total body bytes
> sess_closed              6915        17.96 Session Closed
> sess_pipeline            3056         7.94 Session Pipeline
> sess_readahead            330         0.86 Session Read Ahead
> sess_linger                 0         0.00 Session Linger
> sess_herd              104807       272.23 Session herd
> shm_records           7238597     18801.55 SHM records
> shm_writes             606387      1575.03 SHM writes
> shm_flushes                44         0.11 SHM flushes due to overflow
> shm_cont                 2188         5.68 SHM MTX contention
> shm_cycles                  3         0.01 SHM cycles through buffer
> sm_nreq                148189       384.91 allocator requests
> sm_nobj                126908            . outstanding allocations
> sm_balloc          4091076608            . bytes allocated
> sm_bfree           5572595712            . bytes free
> sma_nreq                    0         0.00 SMA allocator requests
> sma_nobj                    0            . SMA outstanding allocations
> sma_nbytes                  0            . SMA outstanding bytes
> sma_balloc                  0            . SMA bytes allocated
> sma_bfree                   0            . SMA bytes free
> sms_nreq                    1         0.00 SMS allocator requests
> sms_nobj                    0            . SMS outstanding allocations
> sms_nbytes                  0            . SMS outst

Re: varnish2.0.2 on Suse 10.3

2008-11-20 Thread Michael S. Fischer
Where did you get your Varnish package?  Or did you build it from source?

Is there a working C compiler environment on both systems?


On Thu, Nov 20, 2008 at 7:46 AM, Paras Fadte <[EMAIL PROTECTED]> wrote:
> I installed the same version on openSUSE 10.1 (X86-64) and it runs
> fine .What could be the issue?
> On Thu, Nov 20, 2008 at 4:12 PM, Michael S. Fischer
> <[EMAIL PROTECTED]> wrote:
>> Smells like an architecture mismatch.  Any chance you're running a
>> 32-bit Varnish build?
>> --Michael
>> On Thu, Nov 20, 2008 at 1:34 AM, Paras Fadte <[EMAIL PROTECTED]> wrote:
>>> Hi,
>>> I have installed varnish 2.0.2 on openSUSE 10.3 (X86-64) , but it
>>> doesn't seem to start and I get "VCL compilation failed" message. What
>>> could be the issue ?
>>> Thanks in advance.
>>> -Paras
varnish-misc mailing list

Re: varnish2.0.2 on Suse 10.3

2008-11-20 Thread Michael S. Fischer
Smells like an architecture mismatch.  Any chance you're running a
32-bit Varnish build?


On Thu, Nov 20, 2008 at 1:34 AM, Paras Fadte <[EMAIL PROTECTED]> wrote:
> Hi,
> I have installed varnish 2.0.2 on openSUSE 10.3 (X86-64) , but it
> doesn't seem to start and I get "VCL compilation failed" message. What
> could be the issue ?
> Thanks in advance.
> -Paras
varnish-misc mailing list

Re: varnishncsa and liblogging

2008-10-07 Thread Michael S. Fischer
I assume this is for logging daemon metadata/error conditions and not
actual traffic?

If this is for request/response logging, consider implementing a
bridge daemon that reads from the SHM like varnishlog or varnishncsa
does, and which then sends the output via liblogging.  This will
provide the fastest possible performance without bogging down Varnish,
and makes it such that it can be updated or modified entirely
independent of varnish itself.
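
A bridge of the kind suggested above could look roughly like this.  It is
only a sketch: it reads already-formatted lines from the standard output of
varnishncsa (rather than attaching to the shared memory log itself) and hands
each line to a `send` callable.  `forward_lines` and `make_syslog_sender` are
made-up names, and the stdlib UDP syslog handler is merely a stand-in for a
real liblogging transport, which is not shown here.

```python
import logging
import logging.handlers
from typing import Callable, Iterable


def forward_lines(lines: Iterable[str], send: Callable[[str], None]) -> int:
    """Pass every non-empty log line to `send`; return how many were sent."""
    sent = 0
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # skip blank lines instead of forwarding them
        send(line)
        sent += 1
    return sent


def make_syslog_sender(address=("localhost", 514)) -> Callable[[str], None]:
    """Stand-in transport (assumption): plain UDP syslog, not liblogging/BEEP."""
    logger = logging.getLogger("varnish-bridge")
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.handlers.SysLogHandler(address=address))
    return logger.info


# Intended use (assumption), with the bridge as its own process:
#   varnishncsa | python bridge.py
# where bridge.py calls: forward_lines(sys.stdin, make_syslog_sender())
```

Because the bridge is a separate process, it can be restarted or replaced
without touching varnishd.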


On Tue, Oct 7, 2008 at 2:46 PM, Skye Poier Nott <[EMAIL PROTECTED]> wrote:
> Hello again,
> I'm currently working on adding liblogging (reliable syslog over BEEP)
> support to varnishncsa.
> Is this something that the project would be interested in adding to
> trunk when it's done?  I presume it would need to be wrapped in a
> configure --with-liblogging option (default off)
> Thanks,
> Skye
varnish-misc mailing list

Re: want to allow or deny an URL based on a timestamp

2008-08-12 Thread Michael S. Fischer
Nearly every modern webserver has optimized file transfers using
sendfile(2).  You're not going to get any better performance by shifting the
burden of this task to your caching proxies.


On Tue, Aug 12, 2008 at 12:53 AM, Sascha Ottolski <[EMAIL PROTECTED]> wrote:

> Hi all,
> I'm certain that it's possible, but am not sure how to do it: I want to
> let my application create "encrypted" URLs, that are valid only for a
> specific amount of time. I guess with the help of embedded C and
> manually constructed hash keys this should be doable. Now I'm wondering
> if maybe someone has already done something like this, or has other
> ideas to achieve this?
> My idea is basically inspired by a lighttpd module:
> The workflow would be something like
> - "decrypt" incoming URL
> - rewrite URL, extract timestamp
> - if not in range, send 404 (or what seems appropriate)
> - if timestamp is ok, set hash key
> - deliver object from cache or pull from backend
> Thanks for any pointer,
> Sascha
varnish-misc mailing list
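
For what it's worth, the workflow quoted above (decode the URL, extract the
timestamp, reject it if out of range, otherwise serve the real path) can be
sketched outside of Varnish as follows.  Note the swap: this signs the URL
with an HMAC rather than encrypting it.  The URL layout `/<expires>/<mac>/<path>`,
the function names and the secret are all assumptions made for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"change-me"  # hypothetical secret shared by the app and the checker


def _mac(path: str, expires: int, secret: bytes) -> str:
    return hmac.new(secret, f"{path}:{expires}".encode(), hashlib.sha256).hexdigest()


def sign_url(path: str, expires: int, secret: bytes = SECRET) -> str:
    """Build /<expires>/<mac><path>; `path` must start with '/'."""
    return f"/{expires}/{_mac(path, expires, secret)}{path}"


def check_url(url: str, now: Optional[float] = None,
              secret: bytes = SECRET) -> Optional[str]:
    """Return the real path if the URL is authentic and unexpired, else None."""
    now = time.time() if now is None else now
    parts = url.split("/", 3)  # ['', expires, mac, 'rest/of/path']
    if len(parts) != 4 or parts[0] != "":
        return None
    try:
        expires = int(parts[1])
    except ValueError:
        return None
    path = "/" + parts[3]
    expected = _mac(path, expires, secret)
    # Constant-time comparison, done on bytes so odd input cannot raise.
    if not hmac.compare_digest(expected.encode(), parts[2].encode()):
        return None
    if now > expires:
        return None  # caller answers 404, or whatever seems appropriate
    return path
```

The returned path is what would be used as the cache key, so every validly
signed URL for the same object maps to a single cached copy.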

Re: varnish will just restart itself as the virtual memory goes to 3G

2008-06-20 Thread Michael S. Fischer
This sounds an awful lot like "no PAE kernel" -- i.e., 32 bits and a really
old OS.


On Fri, Jun 20, 2008 at 2:42 AM, kuku li <[EMAIL PROTECTED]> wrote:

> Hello,
> we have been running varnish for a while but noticed that varnish will just
> restart itself as the virtual memory goes to 3G(from the linux top command)
> and the cache hit rate consequently drop to almost 0%.  Is it a known bug or
> we just missed something important?
> Thanks.
varnish-misc mailing list

Re: Question about threads

2008-06-19 Thread Michael S. Fischer
On Thu, Jun 19, 2008 at 5:37 AM, Rafael Umann <[EMAIL PROTECTED]>

> > What is your request:connection ratio?
> Unfortunately now i dont have servers doing 2 hits/second, and
> thats why i dont have stats for you.

Actually, it's right there in your varnishstat output:

   36189916   476.66   250.71 Client connections accepted
  404804204  5494.13  2804.30 Client requests received

Your request:connection ratio is > 10:1!  This is a very good situation to
be in.  varnish doesn't have to spawn nearly as many threads as it would if
the ratio were lower, as is common at many other sites.
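
The arithmetic behind that ratio, using the two counters quoted above (the
function name is just for illustration):

```python
def request_connection_ratio(requests: int, connections: int) -> float:
    """Average number of requests served per accepted client connection."""
    if connections <= 0:
        raise ValueError("connections must be positive")
    return requests / connections


# "Client requests received" / "Client connections accepted" from varnishstat
ratio = request_connection_ratio(404_804_204, 36_189_916)
print(f"{ratio:.1f} requests per connection")  # about 11.2
```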

> I have 6 servers runing this
> service now, each one doing 5500 hits/sec with 10% CPU usage, and this
> infrastructure scales to 2 hits/sec each server.

It's probably inaccurate to assume that things will scale linearly :)

Best regards,

varnish-misc mailing list

Re: Question about threads

2008-06-18 Thread Michael S. Fischer
On Wed, Jun 18, 2008 at 4:51 AM, Rafael Umann <[EMAIL PROTECTED]>

> If it is a 32bits system, probably the problem is that your stack size
> is 10Mb. So 238 * 10mb = ~2gb
> I decreased my stack size to 512Kb. Using 1gb storage files i can now
> open almost 1900 threads using all the 2gb that 32bits can alloc. So, my
> Varnish makes 2 hits/second serving clients!
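
The stack arithmetic in the quoted text can be checked with a few lines.
This is a rough sketch under stated assumptions: roughly 2 GB of usable
address space in a 32-bit process, 1 GB of it taken by the storage file, and
everything else (heap, libraries) ignored, so the result is an upper bound.
The function names are made up.

```python
MIB = 1024 * 1024


def stack_address_space(threads: int, stack_bytes: int) -> int:
    """Address space consumed by thread stacks alone."""
    return threads * stack_bytes


def max_threads(address_space: int, reserved: int, stack_bytes: int) -> int:
    """Upper bound on threads once `reserved` bytes are set aside."""
    return max(0, (address_space - reserved) // stack_bytes)


# 238 threads with 10 MB stacks already need about 2.3 GB of address space.
print(stack_address_space(238, 10 * MIB) // MIB, "MiB")

# With 512 KB stacks and 1 GB reserved for the storage file, the bound is
# 2048 threads, in line with the "almost 1900" reported above.
print(max_threads(2048 * MIB, 1024 * MIB, 512 * 1024), "threads")
```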

What is your request:connection ratio?

Best regards,

varnish-misc mailing list

Re: Question about threads

2008-06-17 Thread Michael S. Fischer
Raising the number of threads will not significantly improve Varnish
concurrency in most cases.  I did a test a few months ago using 4 CPUs on
RHEL 4.6 with very high request concurrency and a very low
request-per-connection ratio (i.e., 1:1, no keepalives) and found that the
magic number is about 16 threads/CPU.  You should raise overflow_max to a
very high value (1% worked just fine for us).

Under optimal operating conditions you should not see the "threads not
created" value increasing like this.
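
As a starting point only, the 16 threads/CPU figure translates into a total
like this.  The constant comes from the test described above and may not
carry over to other workloads; how the total maps onto thread_pools and
thread_pool_max depends on the Varnish version, so that part is left out.
`suggested_thread_total` is a hypothetical helper.

```python
import os
from typing import Optional

THREADS_PER_CPU = 16  # figure from the test described above, not a universal rule


def suggested_thread_total(cpus: Optional[int] = None,
                           per_cpu: int = THREADS_PER_CPU) -> int:
    """Total worker threads to start experimenting with."""
    if cpus is None:
        cpus = os.cpu_count() or 1
    if cpus < 1 or per_cpu < 1:
        raise ValueError("cpus and per_cpu must be positive")
    return cpus * per_cpu


print(suggested_thread_total(4))  # the 4-CPU test box: 64
```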

Best regards,


On Tue, Jun 17, 2008 at 3:37 AM, <[EMAIL PROTECTED]> wrote:

> I recently made a loadtest against through varnish.
> First I received a very high response time and found out that varnish was
> maxing the maximum nr of threads.
> I updated thread_min = 5 and thread_max = 300 and received much better
> resp. times.
> Then I increased the nr of concurrent users and made another loadtest. The
> strange thing here was that I received high resp. times but the threads
> stopped at 238.
> The "N worker threads not created" increased rapidly.
> I increased the threads again and changed listen_depth to 2048.
> Here is all the numbers:
> 238 0.00 0.22 N worker threads created
>1318 4.98 1.21 N worker threads not created
>0 Debug- "Create worker thread failed 12 Cannot allocate memory"
>0 Debug- "Create worker thread failed 12 Cannot allocate memory"
>0 Debug- "Create worker thread failed 12 Cannot allocate memory"
>0 Debug- "Create worker thread failed 12 Cannot allocate memory"
>0 Debug- "Create worker thread failed 12 Cannot allocate memory"
>0 Debug- "Create worker thread failed 12 Cannot allocate memory"
>0 Debug- "Create worker thread failed 12 Cannot allocate memory"
>0 Debug- "Create worker thread failed 12 Cannot allocate memory"
> default_ttl                120 [seconds]
> thread_pools               2 [pools]
> thread_pool_max            400 [threads]
> thread_pool_min            10 [threads]
> thread_pool_timeout        120 [seconds]
> thread_pool_purge_delay    1000 [milliseconds]
> thread_pool_add_threshold  2 [requests]
> thread_pool_add_delay      10 [milliseconds]
> thread_pool_fail_delay     200 [milliseconds]
> overflow_max               100 [%]
> rush_exponent              3 [requests per request]
> sess_workspace             8192 [bytes]
> obj_workspace              8192 [bytes]
> sess_timeout               5 [seconds]
> pipe_timeout               60 [seconds]
> send_timeout               600 [seconds]
> auto_restart               on [bool]
> fetch_chunksize            128 [kilobytes]
> vcl_trace                  off [bool]
> listen_address
> listen_depth               2048 [connections]
> srcaddr_hash               1049 [buckets]
> srcaddr_ttl                30 [seconds]
> backend_http11             off [bool]
> client_http11              off [bool]
> cli_timeout                5 [seconds]
> ping_interval              3 [seconds]
> lru_interval               2 [seconds]
> cc_command                 exec cc -fpic -shared -Wl,-x -o %o %s
> max_restarts               4 [restarts]
> max_esi_includes           5 [restarts]
> cache_vbe_conns            off [bool]
> connect_timeout            400 [ms]
> cli_buffer                 8192 [bytes]
> diag_bitmap                0x0 [bitmap]
> Why do I get "Create worker thread failed 12 Cannot allocate memory" when I
> had 1900MB free RAM and 65GB free Disk on the server? Any ideas?
> If "N worker threads not created" is increasing, is that a bad sign?
> Thanks
> Duja
varnish-misc mailing list

Re: Blew away .../var/varnish/HOSTNAME/ -- broke varnishstat, how to recover?

2008-06-02 Thread Michael S. Fischer
On Mon, Jun 2, 2008 at 7:57 AM, Chris Shenton <[EMAIL PROTECTED]>

> We have to fill out pounds of paperwork in order to take any outage on
> a public server, no matter how small.  Is there a way to restart
> Varnish without any downtime -- to continue accepting but holding
> connections until restarted, rather like Apache's "apachectl graceful"
> does?  Other ideas?

Can you avoid the problem by putting your Varnish servers behind a load
balancer?  That way, you can preemptively disable the server from taking
traffic on the LB side prior to restarting Varnish, thereby eliminating any
perceivable customer effect.

Also, be careful about using "apachectl graceful" (especially combined with
log rotation), as connections that are currently idle but which may never
receive traffic again will not be terminated.  I consider it too leaky to

Best regards,

varnish-misc mailing list

Re: Multiple varnish instances per server?

2008-06-01 Thread Michael S. Fischer
Why are you using Varnish to serve primarily images?  Modern webservers
serve static files very efficiently off the filesystem.
Best regards,


On Sun, Jun 1, 2008 at 8:58 AM, Barry Abrahamson <[EMAIL PROTECTED]>

> Hi,
> Is anyone running multiple varnish instances per server (one per disk
> or similar?)
> We are currently running a single varnish instance per server using
> the file backend.  Machines are Dual Opteron 2218, 4GB RAM, and 2
> 250GB SATA drives.  We have the cache file on a software RAID 0
> array.  Our cache size is set to 300GB, but once we get to 100GB or
> so, IO starts to get very spiky, causing loads to spike into the 100
> range.  Our expires are rather long (1-2 weeks).  My initial thoughts
> were that this was caused by cache file fragmentation, but we are
> seeing similar issues when using the malloc backend.  We were thinking
> that running 2 instances per server with smaller cache files (one per
> physical disk), may improve our IO problems.  Is there any performance
> benefit/detriment to running multiple varnish instances per server?
> Is there a performance hit for having a large cache?
> Request rates aren't that high (50-150/sec), but the cached files are
> all images, some of which can be rather big (3MB).
> Also, is anyone else seeing similar issues under similar workloads?
> --
> Barry Abrahamson | Systems Wrangler | Automattic
> Blog:
varnish-misc mailing list

Re: varnish and logging

2008-04-20 Thread Michael S. Fischer
On Sun, Apr 20, 2008 at 10:25 AM, Timothy Ball <[EMAIL PROTECTED]> wrote:

> Does anyone have a script that takes varnishlog output and munges it into
> something that looks combinedlog-ish? Queries to google-tube have not been
> useful.

varnishncsa(1) comes in the box.

varnish-misc mailing list

Re: Unprivileged user?

2008-04-16 Thread Michael S. Fischer
On Tue, Apr 15, 2008 at 11:53 PM, Poul-Henning Kamp <[EMAIL PROTECTED]> wrote:
> In message <[EMAIL PROTECTED]>, "Michael S. Fischer" writes:
>  >>  Varnish for instance assumes that the administrator is not a total
>  >>  madman, who would do something as patently stupid as you propose
>  >>  above, under the general assumption that if he were, varnish would
>  >>  be the least of his troubles.
>  >
>  >I'm not saying that they would; I'm just saying that you can't count
>  >on user 'nobody' having the precise role that a security-conscious
>  >sysadmin would want.
>  Which is why there is a -u argument, for people who muck up the
>  configuration that has been standard on all decent UNIX'es for
>  the last 15 years.

Thus answering OP's question.  QED.  :-)

varnish-misc mailing list

Re: Unprivileged user?

2008-04-15 Thread Michael S. Fischer
On Tue, Apr 15, 2008 at 1:16 AM, Poul-Henning Kamp <[EMAIL PROTECTED]> wrote:
>  >Well-engineered software doesn't make potentially false assumptions
>  >about the environment in which it runs.
>  And they don't.
>  Varnish for instance assumes that the administrator is not a total
>  madman, who would do something as patently stupid as you propose
>  above, under the general assumption that if he were, varnish would
>  be the least of his troubles.

I'm not saying that they would; I'm just saying that you can't count
on user 'nobody' having the precise role that a security-conscious
sysadmin would want.  Perhaps the sysadmin might create a 'varnishd'
user instead that also has limited access, and, hence, the -u option
is quite useful.  Assuming that the nonprivileged user is named
'nobody' could well be false.  I was simply providing the most extreme
example to demonstrate a point.

Best regards,

varnish-misc mailing list

Re: Unprivileged user?

2008-04-15 Thread Michael S. Fischer
On Tue, Apr 15, 2008 at 12:25 AM, Ricardo Newbery
>  Assuming that "nobody" is an available user on your system, then is
>  the "-u user" option for varnishd superfluous?

Who's to say that "nobody" is an unprivileged user?


nobody:*:0:0:alias for root:...

Well-engineered software doesn't make potentially false assumptions
about the environment in which it runs.
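
To make the point concrete, here is a small check that treats any account
with UID 0 as root-equivalent, whatever it is called.  It only parses a
passwd-style line; the helper name and the two sample entries are made up
for illustration.

```python
def is_root_equivalent(passwd_line: str) -> bool:
    """True if a passwd(5)-style entry has UID 0, regardless of its name."""
    fields = passwd_line.strip().split(":")
    if len(fields) < 4:
        raise ValueError("not a passwd-style entry")
    return int(fields[2]) == 0  # third field is the numeric UID


# Hypothetical entries: the name "nobody" says nothing about privileges.
print(is_root_equivalent("nobody:*:0:0:alias for root:/:/sbin/nologin"))
print(is_root_equivalent("nobody:*:65534:65534:Unprivileged user:/nonexistent:/usr/sbin/nologin"))
```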

varnish-misc mailing list

Re: Two New HTTP Caching Extensions

2008-04-09 Thread Michael S. Fischer
On Tue, Apr 8, 2008 at 4:34 PM, Ricardo Newbery <[EMAIL PROTECTED]> wrote:
> > I should add a qualifier to my vote, that stale-while-revalidate
> > generally is used to mask suboptimal backend performance and so I
> > discourage it in favor of fixing the backend.
>  Of course the main premise of a reverse-proxy cache is to mask suboptimal
> backend performance.  :-)

Except, in this case, you are presumably already relieving your
backend of a significant burden with your cache.  If your backend is
*still* unable to fulfill requests from the caching layer within a
reasonable time, you're in serious trouble.
varnish-misc mailing list

Re: Two New HTTP Caching Extensions

2008-04-08 Thread Michael S. Fischer
On Tue, Apr 8, 2008 at 4:25 PM, Michael S. Fischer <[EMAIL PROTECTED]> wrote:
> On Tue, Apr 8, 2008 at 4:18 PM, Ricardo Newbery <[EMAIL PROTECTED]> wrote:
>  >  +1 on stale-while-revalidate.  I found this one to be real handy.
>  Another +1

I should add a qualifier to my vote, that stale-while-revalidate
generally is used to mask suboptimal backend performance and so I
discourage it in favor of fixing the backend.

varnish-misc mailing list

Re: Two New HTTP Caching Extensions

2008-04-08 Thread Michael S. Fischer
On Tue, Apr 8, 2008 at 4:18 PM, Ricardo Newbery <[EMAIL PROTECTED]> wrote:
>  +1 on stale-while-revalidate.  I found this one to be real handy.

Another +1

varnish-misc mailing list

Re: cache empties itself?

2008-04-07 Thread Michael S. Fischer
On Fri, Apr 4, 2008 at 3:31 PM, Ricardo Newbery <[EMAIL PROTECTED]> wrote:

> > > Again, "static" content isn't only the stuff that is served from
> > > filesystems in the classic static web server scenario.  There are plenty
> > > of "dynamic" applications that process content from database -- applying
> > > skins and compositing multiple elements into a single page while filtering
> > > every element or otherwise applying special processing based on a user's
> > > access privileges.  An example of this is a dynamic content management
> > > system like Plone or Drupal.  In many cases, these "dynamic" responses are
> > > fairly "static" for some period of time but there is still a definite
> > > performance hit, especially under load

>  In any case, both of these examples, Plone and Drupal, can indeed cache the
> output "locally" but that is still not as fast as placing a dedicated cache
> server in front.  It's almost always faster to have a dedicated
> single-purpose process do something instead of cranking up the hefty
> machinery for requests that can be adequately served by the lighter process.

Sure, but this is also the sort of content that can be cached back
upstream using ordinary HTTP headers.

Still waiting for that compelling case that requires independent cache

varnish-misc mailing list

Re: recommendation for swap space?

2008-04-07 Thread Michael S. Fischer
On Mon, Apr 7, 2008 at 2:14 PM, Simon Lyall <[EMAIL PROTECTED]> wrote:
> On Mon, 7 Apr 2008, Michael S. Fischer wrote:
>  > That said, it wouldn't make sense to entirely deallocate your swap
>  > space, since the kernel may decide to page or swap out processes other
>  > than Varnish.
>  and what is wrong with that?  Surely your RAM is better being used by the
>  main applications on the server ( Varnish ) rather than "sitting around
>  and waiting" copies of sshd, cron and getty?

Huh?  Nothing I said contradicts that.

varnish-misc mailing list

Re: recommendation for swap space?

2008-04-07 Thread Michael S. Fischer
On Mon, Apr 7, 2008 at 9:00 AM, Dag-Erling Smørgrav <[EMAIL PROTECTED]> wrote:
> Sascha Ottolski <[EMAIL PROTECTED]> writes:
>  > now that my varnish processes start to reach the RAM size, I'm wondering
>  > what a dimension of swap would be wise? I currently have about 30 GB
>  > swap space for 32 GB RAM, but am wondering if it could even make sense
>  > to have no swap at all? My cache file is 517 GB in size.
>  Varnish does not use swap.

That said, it wouldn't make sense to entirely deallocate your swap
space, since the kernel may decide to page or swap out processes other
than Varnish.

varnish-misc mailing list

Re: cache empties itself?

2008-04-04 Thread Michael S. Fischer
On Fri, Apr 4, 2008 at 11:05 AM, Ricardo Newbery <[EMAIL PROTECTED]> wrote:

>  Again, "static" content isn't only the stuff that is served from
> filesystems in the classic static web server scenario.  There are plenty of
> "dynamic" applications that process content from database -- applying skins
> and compositing multiple elements into a single page while filtering every
> element or otherwise applying special processing based on a user's access
> privileges.  An example of this is a dynamic content management system like
> Plone or Drupal.  In many cases, these "dynamic" responses are fairly
> "static" for some period of time but there is still a definite performance
> hit, especially under load.

If that's truly the case, then your CMS should be caching the output locally.

varnish-misc mailing list

Re: cache empties itself?

2008-04-04 Thread Michael S. Fischer
On Fri, Apr 4, 2008 at 3:20 AM, Sascha Ottolski <[EMAIL PROTECTED]> wrote:
>  you are right, _if_ the working set is small. in my case, we're talking
>  20+ mio. small images (5-50 KB each), 400+ GB in total size, and it's
>  growing every day. access is very random, but there still is a good
>  amount of "hot" objects. and to be ready for a larger set it cannot
>  reside on the webserver, but lives on a central storage. access
>  performance to the (network) storage is relatively slow, and our
>  experiences with mod_cache from apache were bad, that's why I started
>  testing varnish.

Ah, I see.

The problem is that you're basically trying to compensate for a
congenital defect in your design: the network storage (I assume NFS)
backend.  NFS read requests are not cacheable by the kernel because
another client may have altered the file since the last read took place.

If your working set is as large as you say it is, eventually you will
end up with a low cache hit ratio on your Varnish server(s) and you'll
be back to square one again.

The way to fix this problem in the long term is to split your file
library into shards and put them on local storage.

Didn't we discuss this a couple of weeks ago?

Best regards,

varnish-misc mailing list

Re: cache empties itself?

2008-04-04 Thread Michael S. Fischer
On Thu, Apr 3, 2008 at 8:59 PM, Ricardo Newbery <[EMAIL PROTECTED]> wrote:

>  Well, first of all you're setting up a false dichotomy.  Not everything
> fits neatly into your apparent definitions of dynamic versus static.  Your
> definitions appear to exclude the use case where you have cacheable content
> that is subject to change at unpredictable intervals but which is otherwise
> fairly "static" for some length of time.

In my experience, you almost never need a caching proxy for this
purpose.  Most modern web servers are perfectly capable of serving
static content at wire speed.  Moreover, if your origin servers have a
reasonable amount of RAM and the working set size is relatively small,
the static objects are already likely to be in the buffer cache.  In a
scenario such as this, having caching proxies upstream for these sorts
of objects can actually be *worse* in terms of performance -- consider
the wasted time processing a cache miss for content that's already
cached downstream.

Best regards,

varnish-misc mailing list

Re: cache empties itself?

2008-04-03 Thread Michael S. Fischer
On Thu, Apr 3, 2008 at 7:37 PM, Ricardo Newbery <[EMAIL PROTECTED]> wrote:

>  URL versioning is usually not appropriate for html
> pages or other primary resources that are intended to be reached directly by
> the end user and whose URLs must not change.

Back to square one.  Are these latter resources dynamic, or are they static?

- If they are dynamic, neither your own proxies nor upstream proxies
should be caching the content.
- If they are static, then they should be cacheable for the same
amount of time all the way upstream (modulo protected URLs).

I haven't yet seen a defensible need for varying cache lifetimes,
depending on the proximity of the proxy to the origin server, as this
request seems to be.  Of course, I'm open to being convinced otherwise

varnish-misc mailing list

Re: cache empties itself?

2008-04-03 Thread Michael S. Fischer
On Thu, Apr 3, 2008 at 11:53 AM, Ricardo Newbery <[EMAIL PROTECTED]> wrote:
 >  On Apr 3, 2008, at 11:04 AM, Michael S. Fischer wrote:
> > On Thu, Apr 3, 2008 at 10:58 AM, Sascha Ottolski <[EMAIL PROTECTED]> wrote:
> >
> > > and I don't want upstream caches or browsers to cache that long, only
> > > varnish, so setting headers doesn't seem to fit.
> > >
> >
> > Why not?  Just curious.   If it's truly cachable content, it seems as
> > though it would make sense (both for your performance and your
> > bandwidth outlays) to let browsers cache.
>  Can't speak for the OP but a common use case is where you want an
> aggressive cache but still need to retain the ability to purge the cache
> when content changes.  As far as I know, there are only two ways to do this
> without contaminating downstream caches with potentially stale content...
> via special treatment in the varnish config (which is what the OP is trying
> to do) or using a special header that only your varnish instance will
> recognize (like Surrogate-Control, which as far as I know Varnish does not
> support out-of-the-box but Squid3 does).

Seems to me that this is rather brittle and error-prone.

- If a particular resource is truly dynamic, then it should not be
cachable at all.
- If a particular resource can be considered static (i.e. cachable),
yet updateable, then it is *far* safer to version your URLs, as you
have zero control over intermediate proxies.
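
One common way to version URLs is to embed a digest of the content in the
file name, so a changed object automatically gets a new URL and stale copies
in intermediate proxies are simply never requested again.  A minimal sketch;
the naming scheme and the `versioned_url` helper are assumptions, not
something any particular CMS provides.

```python
import hashlib
import posixpath


def versioned_url(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content digest before the file extension."""
    digest = hashlib.sha1(content).hexdigest()[:digest_len]
    stem, ext = posixpath.splitext(path)
    return f"{stem}.{digest}{ext}"


# /img/logo.png becomes /img/logo.<digest>.png; new content, new URL.
print(versioned_url("/img/logo.png", b"old image bytes"))
print(versioned_url("/img/logo.png", b"new image bytes"))
```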

varnish-misc mailing list

Re: cache empties itself?

2008-04-03 Thread Michael S. Fischer
On Thu, Apr 3, 2008 at 10:58 AM, Sascha Ottolski <[EMAIL PROTECTED]> wrote:
>  and I don't want upstream caches or browsers to cache that long, only
>  varnish, so setting headers doesn't seem to fit.

Why not?  Just curious.   If it's truly cachable content, it seems as
though it would make sense (both for your performance and your
bandwidth outlays) to let browsers cache.

varnish-misc mailing list

Re: cache empties itself?

2008-04-03 Thread Michael S. Fischer
On Thu, Apr 3, 2008 at 10:26 AM, Sascha Ottolski <[EMAIL PROTECTED]> wrote:
>  All this with 1.1.2. It's vital to my setup to cache as many objects as
>  possible, for a long time, and that they really stay in the cache. Is
>  there anything I could do to prevent the cache being emptied? May be
>  I've been bitten by a bug and should give the trunk a shot?

Just set the Expires: headers on the origin (backend) server responses
to now + 10 years or something.
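
For illustration, generating such a header value on the backend might look
like this (a sketch; `far_future_expires` is a made-up helper and "10 years"
is approximated as 3650 days):

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from typing import Optional


def far_future_expires(now: Optional[datetime] = None, years: int = 10) -> str:
    """Return an HTTP-date roughly `years` years from `now` (UTC)."""
    now = now or datetime.now(timezone.utc)
    return format_datetime(now + timedelta(days=365 * years), usegmt=True)


print("Expires:", far_future_expires())
```

Keep in mind that RFC 2616 asks servers not to send Expires dates more than
one year in the future, so downstream caches may treat such values as they
see fit.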

varnish-misc mailing list

Re: Miscellaneous questions

2008-04-01 Thread Michael S. Fischer
On Mon, Mar 31, 2008 at 11:08 AM, Sascha Ottolski <[EMAIL PROTECTED]> wrote:

> probably not exactly the same, but maybe someone finds it useful: I've
> just started to dive a bit into HAProxy ( the
> development version has the ability to calculate the loadbalancing
> based on the hash of the URI to decide which backend should receive a
> request. I guess this could be a nice companion to put in front of
> several reverse proxies to increase the hit rate of each one.

One major shortcoming of HAProxy is that it does not support HTTP Keep-Alive
connections.  This can be an issue if your origin servers are far away from
your proxies.

varnish-misc mailing list

Re: production ready devel snapshot?

2008-03-31 Thread Michael S. Fischer
On Mon, Mar 31, 2008 at 10:34 PM, Stig Sandbeck Mathisen <[EMAIL PROTECTED]>

> On Mon, 31 Mar 2008 20:10:06 +0200, Sascha Ottolski <[EMAIL PROTECTED]>
> said:
> > is there anything like a snapshot release that is worth giving it a
> > try, especially if my configuration will hopefully stay simple for a
> > while?
> You could try using trunk.  It seems fairly stable.

If it's so stable, why not cut a release?  The nice thing about releases is
that they're easy to revert to when analyzing bug reports.

varnish-misc mailing list

Re: Directors user sessions

2008-03-28 Thread Michael S. Fischer
On Fri, Mar 28, 2008 at 4:58 AM, Florian Engelhardt <[EMAIL PROTECTED]>

> You could store the sessions on a separate server, for instance on a
> memcache or in a database

Good idea.  (Though if you use memcached, you'd probably want to
periodically copy the backing store to a file to survive system failure.)

> or mount the the filesystem where the
> session is stored via nfs on every backend server.

Bad idea.  NFS file locking is unreliable at best.

varnish-misc mailing list

Re: Varnish vs. X-JSON header

2008-03-27 Thread Michael S. Fischer
The Transfer-Encoding: header is missing from the Varnish response as well.


On Thu, Mar 27, 2008 at 7:55 AM, Florian Engelhardt <[EMAIL PROTECTED]>

> Hello,
> i've got a problem with the X-JSON HTTP-Header not being delivered by
> varnish in pipe and pass mode.
> My application runs on PHP with lighttpd, when querying the lighty
> direct (via port :81), the header is present in the request. PHP Script
> is as follows:
> <?php
>  header('X-JSON: foobar');
> echo 'foobar';
> ?>
> Requesting with curl shows the following:
> $ curl -D -
> HTTP/1.1 200 OK
> Expires: Fri, 28 Mar 2008 14:49:29 GMT
> Cache-Control: max-age=86400
> Content-type: text/html
> Server: lighttpd
> Content-Length: 6
> Date: Thu, 27 Mar 2008 14:49:29 GMT
> Age: 0
> Via: 1.1 varnish
> Connection: keep-alive
> foobar
> Requesting on port 81 (where lighty listens on):
> $ curl -
> HTTP/1.1 200 OK
> Transfer-Encoding: chunked
> Expires: Fri, 28 Mar 2008 14:51:45 GMT
> Cache-Control: max-age=86400
> X-JSON: foobar
> Content-type: text/html
> Date: Thu, 27 Mar 2008 14:51:45 GMT
> Server: lighttpd
> foobar
> Why is this X-JSON header missing when requested via varnish?
> Kind Regards
> Flo
> PS: my vcl file:
> backend default {
>  .host = "";
>  .port = "81";
> }
> sub vcl_recv {
>  if (req.url ~ "^/media\.php.*" || req.url == "/status/") {
>  }
>  if (req.url ~ "^/ADMIN.*") {
>  }
>  if (req.url == "/test.php") {
>  }
> }
> # test.php entry just for testing purpose
varnish-misc mailing list

Re: what if a header I'm testing is missing?

2008-03-21 Thread Michael S. Fischer
On Fri, Mar 21, 2008 at 3:36 AM, Ricardo Newbery <[EMAIL PROTECTED]> wrote:

>  and I'm wondering if the first part of this is unnecessary.  For
>  example, what happens if I have this...
>  if (req.http.Cookie ~ "(__ac=|_ZopeId=)") {
>  pass;
>  }
>  but no Cookie header is present in the request.  Is Varnish flexible
>  enough to realize that the test fails without throwing an error?

Why don't you try it and report your findings back to us?

varnish-misc mailing list

Re: Miscellaneous questions

2008-03-17 Thread Michael S. Fischer
On Mon, Mar 17, 2008 at 3:32 PM, DHF <[EMAIL PROTECTED]> wrote:
>  This is called CARP/"Cache Array Routing Protocol" in squid land.
>  Here's a link to some info on it:
>  It works quite well for reducing the number of globally duplicated
>  objects in an multilayer accelerator setup, as you can add additional
>  machines in the interstitial space between the frontline caches and the
>  origin as a cheap and easy way to increase the overall ram available to
>  hot objects without having to use some front end load balancer like
>  perlbal, big ip or whatever to direct the individual clients to specific
>  frontlines to accomplish the same thing ( though you usually still have
>  a load balancer for fault tolerance ).  Though in squid there are some
>  bugs with their implementation ...

Thanks for the reminder.  I'll file RFEs for both the static and CARP
implementations.  I presume the static configuration will be done
first (if at all), as it's probably significantly easier to implement.
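
To illustrate the difference between the two, here is a rough sketch of
URL-based backend selection.  `pick_static` is plain hash-modulo; the second
function uses rendezvous (highest random weight) hashing, which is in the
spirit of CARP but is not CARP's actual hash function.  All names and the
backend list are hypothetical.

```python
import hashlib
from typing import Sequence


def _h(s: str) -> int:
    """Stable 64-bit hash of a string (md5 used only for spreading keys)."""
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")


def pick_static(url: str, backends: Sequence[str]) -> str:
    """Static scheme: hash(URL) modulo the number of backends."""
    return backends[_h(url) % len(backends)]


def pick_rendezvous(url: str, backends: Sequence[str]) -> str:
    """Score every (backend, URL) pair and take the highest score."""
    return max(backends, key=lambda b: _h(b + "|" + url))


backends = ["cache1", "cache2", "cache3"]  # hypothetical cache names
print(pick_static("/img/42.jpg", backends))
print(pick_rendezvous("/img/42.jpg", backends))
```

With the static scheme, adding or removing a backend remaps most URLs.  With
the rendezvous variant, only the URLs owned by the affected backend move,
which is presumably why the static version is the easier one to implement
first.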

varnish-misc mailing list

Re: Miscellaneous questions

2008-03-17 Thread Michael S. Fischer
On Mon, Mar 17, 2008 at 8:57 AM, Poul-Henning Kamp <[EMAIL PROTECTED]> wrote:

>  >No, we were talking about how long an idle backend connection is kept
>  >open (or at least I was).
>  Yes I know :-)
>  And we don't do anything to close those before the backend closes on
>  us, we have no reason to, the longer we keep it, the more connection
>  setups we avoid.

So does Varnish close HTTP Keep-Alive backend connections after an
idle period?  or not?

Best regards,

varnish-misc mailing list

Re: Miscellaneous questions

2008-03-17 Thread Michael S. Fischer
On Mon, Mar 17, 2008 at 12:42 AM, Dag-Erling Smørgrav <[EMAIL PROTECTED]> wrote:
> "Michael S. Fischer" <[EMAIL PROTECTED]> writes:
> > Dag-Erling Smørgrav <[EMAIL PROTECTED]> writes:
>  > > I think the default timeout on backends connection may be a little
>  > > short, though.
>  > I assume this is the thread_pool_timeout parameter?
>  No, that's how long an idle worker thread is kept alive.  I don't think
>  the backend timeout is configurable, I think it's hardcoded to five
>  seconds.

What does the timeout pertain to?  Connect time?  Response time?

varnish-misc mailing list

Re: Miscellaneous questions

2008-03-16 Thread Michael S. Fischer
On Feb 13, 2008 7:41 AM, Dag-Erling Smørgrav <[EMAIL PROTECTED]> wrote:

> I believe varnishlog -w /var/log/varnish.log is enabled by default if
> you install from packages on !FreeBSD.  We may want to change this.

This was true for my RHEL 4 installation.  I was only able to achieve
16,000 connections/second after completely disabling logging to disk
(and fine tuning the thread pool size), which is why I asked my
question about further turning down the verbosity of logging to

> I think the default timeout on backends connection may be a little
> short, though.

I assume this is the thread_pool_timeout parameter?

> > > (3) Feature request: Request hashing.  It would be really cool if
> > > Varnish were able to select the origin server (in reality another
> > > Varnish proxy) by hashing the Request URI.  Having this ability would
> > > improve the cache hit ratio overall where a pool of caching proxies is
> > > used.
> > We have sort of given up on the peer-to-peer cache fetches using
> > dedicated protocols, but if you are able to tell that another
> > varnish is a better place to pick up something, nothing prevents
> > you from making that a backend of this varnish and doing
> > a pass on the request.
> No, I think what he means is selecting the backend based on client-ip
> modulo number-of-backends so each client always gets the same backend
> (which makes session tracking much easier)

That's a good idea, too, and deserves implementation, but I was
referring to something else.  I think phk understood what I was
getting at.

I'm dealing with a situation where the working set of cacheable
responses is larger than the RAM size of a particular Varnish
instance.  (I don't want to go to disk because it will incur at least
a 10ms penalty.)  I also want to maximize the hit ratio.

One good way to do this is to put a pass-only Varnish instance (i.e.,
a content switch) in front of a set of intermediate backends (Varnish
caching proxies), each of which is assigned to cache a subset of the
possible URI namespace.

However, in order to do this, the content switch must make consistent
decisions about which cache to direct the incoming requests to.  One
good way of doing that is implementing a hash function H(U) -> V,
where  U is the request URI, and V is the intermediate-level proxy.
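
A minimal sketch of the H(U) -> V idea, written in Python rather than as
anything Varnish provides.  The proxy names, the use of MD5, and the modulo
reduction are all assumptions made for illustration:

```python
import hashlib

# Hypothetical pool of intermediate caching proxies (names are made up).
PROXIES = ["cache-a:80", "cache-b:80", "cache-c:80"]

def pick_proxy(request_uri, proxies=PROXIES):
    """Map a request URI to one proxy, deterministically.

    A stable hash (MD5) is used instead of Python's built-in hash(),
    which is randomized per process for strings.
    """
    digest = hashlib.md5(request_uri.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce
    # modulo the pool size to get an index into the proxy list.
    index = int.from_bytes(digest[:8], "big") % len(proxies)
    return proxies[index]

if __name__ == "__main__":
    for uri in ("/images/1.jpg", "/images/2.jpg", "/index.html"):
        print(uri, "->", pick_proxy(uri))
```

The same URI always maps to the same proxy, so each proxy caches only its
share of the URI namespace.  With plain modulo, changing the pool size remaps
most URIs; schemes such as CARP are meant to limit that churn.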

I'd appreciate it if you'd consider adding this as a feature.

Best regards,

varnish-misc mailing list

Re: how to...accelarate randon access to millions of images?

2008-03-16 Thread Michael S. Fischer
On Sun, Mar 16, 2008 at 10:02 AM, Michael S. Fischer

I don't know why I'm having such a problem with this.  Sigh!  I think
I got it right this time.

>  >  If I were designing such a service, my choices would be:
>  Corrections:
>  >  (1) 4 machines, each with 4-disk RAID 0 (fast, but dangerous)
>  >  (2) 4 machines, each with 5-disk RAID 5 (safe, fast reads, but slow
>  >  writes for your file size - also, RAID 5 should be battery backed,
>  >  which adds cost)
>  >  (3) 4 machines, each with 4-disk RAID 10 (will meet workload
>  >  requirement, but won't handle peak load in degraded mode)
>  >  (4) 5 machines, each with 4-disk RAID 10
>  >  (5) 9 machines, each with 2-disk RAID 1

varnish-misc mailing list

Re: how to...accelarate randon access to millions of images?

2008-03-16 Thread Michael S. Fischer
On Sun, Mar 16, 2008 at 10:00 AM, Michael S. Fischer

>  If I were designing such a service, my choices would be:


>  (1) 4 machines, each with 4-disk RAID 1 (fast, but dangerous)
>  (2) 4 machines, each with 5-disk RAID 5 (safe, fast reads, but slow
>  writes for your file size - also, RAID 5 should be battery backed,
>  which adds cost)
>  (3) 4 machines, each with 4-disk RAID 10 (will meet workload
>  requirement, but won't handle peak load in degraded mode)
>  (4) 5 machines, each with 4-disk RAID 10
>  (5) 9 machines, each with 2-disk RAID 0
>  --Michael
varnish-misc mailing list

Re: how to...accelarate randon access to millions of images?

2008-03-16 Thread Michael S. Fischer
On Fri, Mar 14, 2008 at 1:37 PM, Sascha Ottolski <[EMAIL PROTECTED]> wrote:
>  The challenge is to serve 20+ million image files, I guess with up to
>  1500 req/sec at peak.

A modern disk drive can service 100 random IOPS (@ 10ms/seek, that's
reasonable).  Without any caching, you'd need 15 disks to service your
peak load, with a bit over 10ms I/O latency (seek + read).
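
The arithmetic behind that estimate, as a small Python sketch.  It assumes
one random seek per request, no cache hits at all, and load spread evenly
across the disks; the function name is made up for illustration:

```python
import math

def disks_needed(peak_requests_per_sec, seek_time_ms=10.0):
    """Disks required if every request costs exactly one random seek."""
    iops_per_disk = 1000.0 / seek_time_ms  # 10 ms per seek -> 100 IOPS
    return math.ceil(peak_requests_per_sec / iops_per_disk)

if __name__ == "__main__":
    print(disks_needed(1500))  # 15
```

Any cache hit ratio above zero lowers the number of requests that reach the
disks, and with it the disk count.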

> The files tend to be small, most of them in a
>  range of 5-50 k. Currently the image store is about 400 GB in size (and
>  growing every day). The access pattern is very random, so it will be
>  very unlikely that any size of RAM will be big enough...

Are you saying that the hit ratio is likely to be zero?  If so,
consider whether you want to have caching turned on in the first place.
There's little sense buying extra RAM if it's useless to you.

>  Now my question is: what kind of hardware would I need? Lots of RAM
>  seems to be obvious, whatever "a lot" may be...What about the disk
>  subsystem? Should I look into something like RAID-0 with many disks to
>  push the IO-performance?

You didn't say what your failure tolerance requirements were.  Do you
care if you lose data?   Do you care if you're unable to serve some
requests while a machine is down?

Consider dividing up your image store onto multiple machines.  Not
only would you get better performance, but you would be able to
survive hardware failures with fewer catastrophic effects (i.e., you'd
lose only 1/n of service).

If I were designing such a service, my choices would be:

(1) 4 machines, each with 4-disk RAID 1 (fast, but dangerous)
(2) 4 machines, each with 5-disk RAID 5 (safe, fast reads, but slow
writes for your file size - also, RAID 5 should be battery backed,
which adds cost)
(3) 4 machines, each with 4-disk RAID 10 (will meet workload
requirement, but won't handle peak load in degraded mode)
(4) 5 machines, each with 4-disk RAID 10
(5) 9 machines, each with 2-disk RAID 0

Multiply each of these machine counts by 2 if you want to be resilient
to failures other than disk failures.

You can then put a Varnish proxy layer in front of your image storage
servers, and direct incoming requests to the appropriate backend

varnish-misc mailing list

Re: URL rewrite

2008-03-10 Thread Michael S. Fischer
On Mon, Mar 10, 2008 at 7:41 AM, Michael S. Fischer
> On Mon, Mar 10, 2008 at 3:57 AM, Gsm Lock <[EMAIL PROTECTED]> wrote:
>  > I have a few backend servers. Static documents on the servers have ugly
>  > addresses such as http://my-next-back.end/111../785643../blabla/.../my.doc
>  > (mostly unstructured).
>  > Some of them do not have unique names.
>  > I need them to be accessible from the frontend as
>  > http://myfront.end/something/my-new-named.doc
>  > There are a few thousand documents... How can I (when I do) configure
>  > Varnish for this?
>  This is easily done with Varnish.  See the gsub() method in vcl(7).
>  You'll need a working knowledge of regular expressions.

I should add: For what I gather are performance reasons, Varnish does
not have database lookup support or external map support.  You will
probably want to generate your vcl file programmatically using a map
that contains the mapped url as the key and the backend url as the
value (or vice versa).   Varnish can switch among VCL files at runtime
using the admin console.
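
A rough sketch of such a generator in Python.  The map contents are made-up
placeholders, and the emitted VCL (an if/elsif chain in vcl_recv that sets
req.url) is an assumption about what the installed Varnish accepts; check it
against vcl(7) for your version before loading it:

```python
# Hypothetical map: public URL path -> backend URL path.
URL_MAP = {
    "/something/report.doc": "/store/0001/a/report.doc",
    "/something/summary.doc": "/store/0002/b/summary.doc",
}

def generate_vcl(url_map):
    """Return VCL text that rewrites req.url for each mapped path."""
    lines = ["sub vcl_recv {"]
    for public, backend in sorted(url_map.items()):
        for path in (public, backend):
            # Refuse characters that would break out of a quoted string.
            if '"' in path or "\n" in path:
                raise ValueError("unsafe character in path: %r" % path)
        keyword = "if" if len(lines) == 1 else "} elsif"
        lines.append('    %s (req.url == "%s") {' % (keyword, public))
        lines.append('        set req.url = "%s";' % backend)
    if len(lines) > 1:
        lines.append("    }")  # close the last branch of the chain
    lines.append("}")
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    print(generate_vcl(URL_MAP), end="")
```

The output can be written to a file and loaded from the admin console (for
example with vcl.load followed by vcl.use).  With a few thousand entries the
chain is compared linearly on every request, so it is worth measuring before
relying on it.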

Best regards,

varnish-misc mailing list

Re: URL rewrite

2008-03-10 Thread Michael S. Fischer
On Mon, Mar 10, 2008 at 3:57 AM, Gsm Lock <[EMAIL PROTECTED]> wrote:
> I have a few backend servers. Static documents on the servers have ugly
> addresses such as http://my-next-back.end/111../785643../blabla/.../my.doc
> (mostly unstructured).
> Some of them do not have unique names.
> I need them to be accessible from the frontend as
> http://myfront.end/something/my-new-named.doc
> There are a few thousand documents... How can I (when I do) configure
> Varnish for this?

This is easily done with Varnish.  See the regsub() function in vcl(7).
You'll need a working knowledge of regular expressions.

Best regards,

varnish-misc mailing list

Re: Tuning varnish for high load

2008-03-04 Thread Michael S. Fischer
On Tue, Mar 4, 2008 at 1:53 AM, Henning Stener <[EMAIL PROTECTED]> wrote:
>  Are you sending one request per connection and closing it, or are you
>  serving a number of requests to 10K different connections? In the last
>  case how many requests/sec are you seeing?

In our test, we sent about 200 requests per connection, and achieved
around 16,000-18,000 requests/sec.  Trying to issue one request per
connection will quickly exhaust the number of available open TCP

varnish-misc mailing list

Re: Tuning varnish for high load

2008-02-29 Thread Michael S. Fischer
On Thu, Feb 28, 2008 at 9:52 PM, Mark Smallcombe <[EMAIL PROTECTED]> wrote:

>  What tuning recommendations do you have for varnish to help it handle high 
> load?

Funny you should ask, I've been spending a lot of time with Varnish in
the lab.  Here are a few observations I've made:

(N.B.  We're using 4-CPU Xeon hardware running RHEL 4.5, which runs
the 2.6.9 Linux kernel.  All machines have at least 4GB RAM and run
the 64-bit Varnish build, but our results are equally applicable to
32-bit builds)

- When the cache hit ratio is very high (i.e. 100%), we discovered
that Varnish's default configuration of thread_pool_max is too high.
When there are too many worker threads, Varnish spends an inordinate
amount of time in system call space.  We're not sure whether this is
due to some flaw in Varnish, our ancient Linux kernel (we were unable
to test with a modern 2.6.22 or later kernel that apparently has a
better scheduler), or is just a fundamental problem when a threaded
daemon like Varnish tries to service thousands of concurrent
connections.  After much tweaking we determined that, on our hardware,
the optimal ratio of threads per CPU is about 16, or around 48-50
threads on a 4-CPU box.  To eliminate dropping work requests, it is
also advisable to raise overflow_max to a significantly higher ratio
than the default (e.g. 1%).  This will cause Varnish to consume
somewhat more RAM, but will provide outstanding performance.  With
these tweaks, we were able to get Varnish to serve 10,000 concurrent
connections, flooding a Gigabit Ethernet channel with 5 KB cached

- Conversely, when the cache hit ratio is 0, the default of 100
threads is too low. (To create this scenario, we used 2 Varnish boxes:
 the front-end proxy was configured to "pass" all requests to an
optimized backend Varnish instance that served all requests from
cache.)  On the same 4-CPU hardware, we found that the optimal
thread_pool_max value in this situation is about 750.  Again, we were
able to serve 10,000 concurrent connections after optimizing the

I find this interesting, because one would think that Varnish would be
making the system spend much more time in the scheduler in the second
scenario because it is doing significantly less work (no lookups, just
handing off connections to the appropriate backend).  I suspect that
there may be some thread-scalability issues with the cache lookup
process.   If someone with a suitably powerful lab setup (i.e. Gigabit
Ethernet, big hardware) can test with a more modern Linux kernel, I'd
be very interested in the results.  Feel free to contact me if you
need assistance with setup/analysis.

Finally: Varnish performance is absolutely atrocious on an 8-CPU RHEL
4.5 system -- so bad that I have to turn down thread_pool_max to 4 or
restrict it to run only on 4 CPUs via taskset(1).  I've heard that
MySQL has similar problems, so I suspect that this is a Linux kernel

Best regards,

varnish-misc mailing list

Re: Child dying with "Too many open files"

2008-02-28 Thread Michael S. Fischer
I can't help but wonder if you'd set it too high.  What happens when
you set NFILES and fs.file-max both to 131072?  I've tested that as a
known good value.
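
To compare the two limits quickly, something like the following Python
sketch can help.  It reports the limits of the process it runs in, so it
should be started from the same environment as varnishd (e.g. from the init
script); the /proc path is Linux-specific and the function names are made up:

```python
import resource

def fd_limits():
    """Return (soft, hard) per-process open-file limits."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

def system_file_max(path="/proc/sys/fs/file-max"):
    """Return the Linux system-wide file handle limit, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None

if __name__ == "__main__":
    soft, hard = fd_limits()
    print("per-process soft limit:", soft)
    print("per-process hard limit:", hard)
    print("fs.file-max:", system_file_max())
```

fs.file-max should be at least as large as the per-process limit.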


On Thu, Feb 28, 2008 at 2:58 PM, Andrew Knapp <[EMAIL PROTECTED]> wrote:
> Yup, it is. Here's some output:
>  $ ps auxwww | grep varnish
>  root 12036  0.0  0.0  27704   648 ?Ss   14:54   0:00
>  /usr/sbin/varnishd -a :80 -f /etc/varnish/photo.vcl -T :6082
> -t 120 -w 10,700,30 -s file,/c01/varnish/varnish_storage.bin,12G -u
>  varnish -g varnish -P /var/run/
>  varnish  12037  1.2  0.4 13119108 39936 ?  Sl   14:54   0:00
>  /usr/sbin/varnishd -a :80 -f /etc/varnish/photo.vcl -T :6082
> -t 120 -w 10,700,30 -s file,/c01/varnish/varnish_storage.bin,12G -u
>  varnish -g varnish -P /var/run/
>  -Andy
>  > -Original Message-----
>  > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
>  > Of Michael S. Fischer
> > Sent: Thursday, February 28, 2008 1:57 PM
>  > To: Andrew Knapp
>  > Cc:
>  > Subject: Re: Child dying with "Too many open files"
>  >
>  > Is varnishd being started as root?  (even if it drops privileges
>  > later) Only root can have > 1024 file descriptors open, to my
>  > knowledge.
>  >
>  > --Michael
>  >
>  > On Thu, Feb 28, 2008 at 11:48 AM, Andrew Knapp <[EMAIL PROTECTED]>
>  > wrote:
>  > > Didn't really get a answer to this, so I'm trying again.
>  > >
>  > >  I've done some testing with the NFILES variable, and I keep getting
>  > the
>  > >  same error as before ("Too many open files"). I've also verified
>  > that
>  > >  the limit is actually being applied by putting a ulimit -a in the
>  > >  /etc/init.d/varnish script.
>  > >
>  > >  Anyone have any ideas? I'm running the 1.1.2-5 rpms from on
>  > >  Centos 5.1.
>  > >
>  > >  Thanks,
>  > >  Andy
>  > >
>  > >
>  > >  > -Original Message-
>  > >  > From: [EMAIL PROTECTED] [mailto:varnish-
>  > misc-
>  > >  > [EMAIL PROTECTED] On Behalf Of Andrew Knapp
>  > >
>  > > > Sent: Wednesday, February 20, 2008 5:52 PM
>  > >  > To: Michael S. Fischer
>  > >  > Cc:
>  > >
>  > >
>  > > > Subject: RE: Child dying with "Too many open files"
>  > >  >
>  > >  > Here's the output:
>  > >  >
>  > >  > $ sysctl fs.file-max
>  > >  > fs.file-max = 767606
>  > >  >
>  > >  > > -Original Message-
>  > >  > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
>  > >  > Behalf
>  > >  > > Of Michael S. Fischer
>  > >  > > Sent: Wednesday, February 20, 2008 5:48 PM
>  > >  > > To: Andrew Knapp
>  > >  > > Cc:
>  > >  > > Subject: Re: Child dying with "Too many open files"
>  > >  > >
>  > >  > > Does 'sysctl fs.file-max' say?  It should be >= the ulimit.
>  > >  > >
>  > >  > > --Michael
>  > >  > >
>  > >  > > On Wed, Feb 20, 2008 at 4:04 PM, Andrew Knapp
>  > >  > wrote:
>  > >  > > >
>  > >  > > >
>  > >  > > >
>  > >  > > >
>  > >  > > > Hello,
>  > >  > > >
>  > >  > > >
>  > >  > > >
>  > >  > > > I'm getting this error when running varnishd:
>  > >  > > >
>  > >  > > >
>  > >  > > >
>  > >  > > > >>
>  > >  > > >
>  > >  > > > Child said (2, 15369): <  > >  cache_pool.c
>  > >  > > line
>  > >  > > > 217:
>  > >  > > >
>  > >  > > >   Condition((pipe(w->pipe)) == 0) not true.
>  > >  > > >
>  > >  > > >   errno = 24 (Too many open files)
>  > >  > > >
>  > >  > > > >>
>  > >  > > >
>  > >  > > > Cache child died pid=15369 status=0x6
>  > >  > > >
>  > >  > > >
>  > >  > > >
>  > >  > > > uname -a:
>  > >  > > 

Re: Child dying with "Too many open files"

2008-02-28 Thread Michael S. Fischer
Is varnishd being started as root?  (even if it drops privileges
later) Only root can have > 1024 file descriptors open, to my
knowledge.


On Thu, Feb 28, 2008 at 11:48 AM, Andrew Knapp <[EMAIL PROTECTED]> wrote:
> Didn't really get a answer to this, so I'm trying again.
>  I've done some testing with the NFILES variable, and I keep getting the
>  same error as before ("Too many open files"). I've also verified that
>  the limit is actually being applied by putting a ulimit -a in the
>  /etc/init.d/varnish script.
>  Anyone have any ideas? I'm running the 1.1.2-5 rpms from on
>  Centos 5.1.
>  Thanks,
>  Andy
>  > -Original Message-
>  > From: [EMAIL PROTECTED] [mailto:varnish-misc-
>  > [EMAIL PROTECTED] On Behalf Of Andrew Knapp
> > Sent: Wednesday, February 20, 2008 5:52 PM
>  > To: Michael S. Fischer
>  > Cc:
> > Subject: RE: Child dying with "Too many open files"
>  >
>  > Here's the output:
>  >
>  > $ sysctl fs.file-max
>  > fs.file-max = 767606
>  >
>  > > -Original Message-
>  > Behalf
>  > > Of Michael S. Fischer
>  > > Sent: Wednesday, February 20, 2008 5:48 PM
>  > > To: Andrew Knapp
>  > > Cc:
>  > > Subject: Re: Child dying with "Too many open files"
>  > >
>  > > Does 'sysctl fs.file-max' say?  It should be >= the ulimit.
>  > >
>  > > --Michael
>  > >
>  > > On Wed, Feb 20, 2008 at 4:04 PM, Andrew Knapp <[EMAIL PROTECTED]>
>  > wrote:
>  > > >
>  > > >
>  > > >
>  > > >
>  > > > Hello,
>  > > >
>  > > >
>  > > >
>  > > > I'm getting this error when running varnishd:
>  > > >
>  > > >
>  > > >
>  > > > >>
>  > > >
>  > > > Child said (2, 15369): <  cache_pool.c
>  > > line
>  > > > 217:
>  > > >
>  > > >   Condition((pipe(w->pipe)) == 0) not true.
>  > > >
>  > > >   errno = 24 (Too many open files)
>  > > >
>  > > > >>
>  > > >
>  > > > Cache child died pid=15369 status=0x6
>  > > >
>  > > >
>  > > >
>  > > > uname -a:
>  > > >
>  > > > Linux  2.6.18-53.1.4.el5 #1 SMP Fri Nov 30 00:45:55 EST
>  > > 2007
>  > > > x86_64 x86_64 x86_64 GNU/Linux
>  > > >
>  > > >
>  > > >
>  > > > command used to start varnish:
>  > > >
>  > > > /usr/sbin/varnishd -d -d -a :80 -f /etc/varnish/photo.vcl -T
>  > > > :6082 -t 120 -w 10,700,30 -s
>  > > > file,/c01/varnish/varnish_storage.bin,12G -u varnish -g varnish -P
>  > > > /var/run/
>  > > >
>  > > >
>  > > >
>  > > > I have NFILES=27 set in /etc/sysconfig/varnish. Do I just need
>  > to
>  > > up
>  > > > that value?
>  > > >
>  > > >
>  > > >
>  > > > Thanks,
>  > > >
>  > > > Andy
>  > > > ___
>  > > >  varnish-misc mailing list
>  > > >
>  > > >
>  > > >
>  > > >
>  > ___
>  > varnish-misc mailing list
>  >
>  >
>  ___
>  varnish-misc mailing list
varnish-misc mailing list

Re: Child dying with "Too many open files"

2008-02-20 Thread Michael S. Fischer
What does 'sysctl fs.file-max' say?  It should be >= the ulimit.


On Wed, Feb 20, 2008 at 4:04 PM, Andrew Knapp <[EMAIL PROTECTED]> wrote:
> Hello,
> I'm getting this error when running varnishd:
> >>
> Child said (2, 15369): < 217:
>   Condition((pipe(w->pipe)) == 0) not true.
>   errno = 24 (Too many open files)
> >>
> Cache child died pid=15369 status=0x6
> uname –a:
> Linux  2.6.18-53.1.4.el5 #1 SMP Fri Nov 30 00:45:55 EST 2007
> x86_64 x86_64 x86_64 GNU/Linux
> command used to start varnish:
> /usr/sbin/varnishd -d -d -a :80 -f /etc/varnish/photo.vcl -T
> :6082 -t 120 -w 10,700,30 -s
> file,/c01/varnish/varnish_storage.bin,12G -u varnish -g varnish -P
> /var/run/
> I have NFILES=27 set in /etc/sysconfig/varnish. Do I just need to up
> that value?
> Thanks,
> Andy
> ___
>  varnish-misc mailing list
varnish-misc mailing list

Miscellaneous questions

2008-02-11 Thread Michael S. Fischer
(1) Feature request: Can a knob be added to turn down the verbosity of
Varnish logging?  Right now on a quad-core Xeon we can service about
14k conn/s, which is good, but I wonder whether we could eke out even
more performance by quelling information that we don't need to log.

(2) HTTP/1.1 keep-alive connection reuse:  Does Varnish have the
ability to reuse origin server connections (assuming they are HTTP/1.1
Keep-Alive connections)?  Or, is there a strict 1:1 mapping between
client-proxy connections and proxy-origin server connections?

(3) Feature request: Request hashing.  It would be really cool if
Varnish were able to select the origin server (in reality another
Varnish proxy) by hashing the Request URI.  Having this ability would
improve the cache hit ratio overall where a pool of caching proxies is
used.

Best regards,

varnish-misc mailing list