Re: Is anyone using ESI with a lot of traffic?

John Adams Fri, 27 Feb 2009 14:24:52 -0800

cc'ing the varnish dev list for comments...

On Feb 27, 2009, at 1:33 PM, Cloude Porteus wrote:

John,
Good to hear from you. You must be slammed at Twitter. I'm happy to
hear that ESI is holding up for you. It's been in my backlog since you
mentioned it to me pre-Twitter.

Any performance info would be great.

Any comments on our setup are welcome. You may also choose to call us crazypants. Many, many thanks to Artur Bergman of Wikia for helping us get this configuration straightened out.

Right now, we're running varnish (on search) in a bit of a non-standard way. We plan to use it in the normal fashion (varnish to Internet, nothing in between) on our API at some point. We're running version 2.0.2, no patches. Cache hit rates range from 10% to 30%, or higher when a real-time event is flooding search.

2.0.2 is quite stable for us, with the occasional child death here and there when we get massive headers coming in that flood sess_workspace. I hear this is fixed in 2.0.3, but haven't had time to try it yet.

We have a number of search boxes, and each search box has an apache instance on it, and a varnish instance. We plan to merge the varnish instances at some point, but we use very low TTLs (Twitter is the real time web!) and don't see much of a savings by running fewer of them.


We do:
        Apache --> Varnish --> Apache -> Mongrels

Apaches are using mod_proxy_balancer. The front end apache is there because we've long had a fear that Varnish would crash on us, which it did many times prior to our figuring out the proper parameters for startup. We have two entries in that balancer. Either the request goes to varnish, or, if varnish bombs out, it goes directly to the mongrel.

We do this because we need a load balancing algorithm that varnish doesn't support, called bybusyness. Without bybusyness, varnish tries to direct requests to Mongrels that are busy, and requests end up in the listen queue. That adds ~100-150 ms to load times, and that's no good for our desired service times of 200-250 ms (or less).

We'd be so happy if someone put bybusyness into Varnish's backend load balancing, but it's not there yet.
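
For anyone unfamiliar with the idea, here is a minimal shell sketch of least-busy selection: send each request to the backend with the fewest requests currently in flight. The pick_least_busy function and the name:inflight input format are made up for illustration; this is not how Apache's lbmethod=bybusyness (the method referred to above) or Varnish is actually implemented.

```shell
#!/bin/sh
# Hypothetical helper: pick the backend with the fewest in-flight requests.
# Each argument is "name:inflight"; on a tie the first one listed wins.
pick_least_busy() {
    best_name=""
    best_load=""
    for entry in "$@"; do
        name=${entry%%:*}    # text before the first colon
        load=${entry##*:}    # text after the last colon
        if [ -z "$best_load" ] || [ "$load" -lt "$best_load" ]; then
            best_name=$name
            best_load=$load
        fi
    done
    echo "$best_name"
}

pick_least_busy mongrel1:3 mongrel2:0 mongrel3:1   # prints mongrel2
```

A plain round-robin would happily hand the next request to mongrel1 here, which is where the listen-queue wait comes from.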

We also know that taking the extra hop through localhost costs us next to nothing in service time, so it's good to have Apache there in case we need to yank out Varnish. In the future, we might get rid of Apache and use HAProxy (its load balancing and backend monitoring are much richer than Apache's, and it has a beautiful HTTP interface to look at).


Some variables and our decisions:

              -p obj_workspace=4096 \
              -p sess_workspace=262144 \

Absolutely vital! Varnish does not allocate enough space by default for headers, regexps on cookies, and otherwise. It was increased in 2.0.3, but really, not increased enough. Without this we were panicking every 20-30 requests and overflowing the sess hash.


              -p listen_depth=8192 \

8192 is probably excessive for now. If we're queuing 8k conns, something is really broken!


              -p log_hashstring=off \

Who cares about this - we don't need it.

              -p lru_interval=60 \

We have many small objects in the search cache. Run LRU more often.

              -p sess_timeout=10 \

If you keep session data around for too long, you waste memory.

              -p shm_workspace=32768 \

Give us a bit more room in shm

              -p ping_interval=1 \

Frequent pings in case the child dies on us.

              -p thread_pools=4 \
              -p thread_pool_min=100 \

This must match up with VARNISH_MIN_THREADS. We use four pools, so thread_pools * thread_pool_min == VARNISH_MIN_THREADS (4 * 100 == 400).
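
A small sanity check along these lines can catch a mismatch before a restart. This is only a sketch: check_min_threads is a hypothetical helper, and the three values are copied by hand from the settings in this mail rather than read from a live varnishd.

```shell
#!/bin/sh
# Values copied from the settings discussed in this mail.
thread_pools=4
thread_pool_min=100
VARNISH_MIN_THREADS=400

# Hypothetical helper: check_min_threads POOLS PER_POOL_MIN EXPECTED_TOTAL
# Prints a one-line verdict and returns non-zero on a mismatch.
check_min_threads() {
    total=$(( $1 * $2 ))
    if [ "$total" -eq "$3" ]; then
        echo "ok: $1 pools x $2 threads = $total"
    else
        echo "mismatch: $1 pools x $2 threads = $total, expected $3"
        return 1
    fi
}

check_min_threads "$thread_pools" "$thread_pool_min" "$VARNISH_MIN_THREADS"
```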


              -p srcaddr_ttl=0 \

Disable the (effectively unused) per source-IP statistics

              -p esi_syntax=1

Disable ESI syntax verification so we can use it to process JSON requests.


If you have more than 2.1M objects, you should also add:
# -h classic,250007 = recommended value for 2.1M objects
#     number should be 1/10 expected working set.
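
As a rough illustration of the 1/10 rule of thumb, the arithmetic looks like this. suggest_buckets is a hypothetical helper, not a Varnish tool, and it only applies the strict tenth; the 250007 quoted above sits a little over that figure for 2.1M objects.

```shell
#!/bin/sh
# Hypothetical helper: hash bucket count = expected working set / 10.
suggest_buckets() {
    echo $(( $1 / 10 ))
}

suggest_buckets 2100000   # prints 210000
```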

In our VCL, we have a few fancy tricks that we use. We label the cache server and cache hit/miss rate in vcl_deliver with this code:


Top of VCL:
C{
#include <stdio.h>
#include <unistd.h>

char myhostname[255] = "";

}C

vcl_deliver:
C{

VRT_SetHdr(sp, HDR_RESP, "\014X-Cache-Svr:", myhostname, vrt_magic_string_end);

}C
     /* mark hit/miss on the request */
     if (obj.hits > 0) {
       set resp.http.X-Cache = "HIT";
       set resp.http.X-Cache-Hits = obj.hits;
     } else {
       set resp.http.X-Cache = "MISS";
     }


vcl_recv:
C{
    if (myhostname[0] == '\0') {
      /* only get hostname once - restart required if hostname changes */
      gethostname(myhostname, 255);
    }
}C
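
One gotcha in the VRT_SetHdr call above: the leading \014 is the length of the header name, colon included, written as an octal escape ("X-Cache-Svr:" is 12 characters, which is 014 in octal). If you rename the header you have to recompute it. A small sketch for generating the prefix follows; hdr_prefix is a hypothetical helper, not part of Varnish.

```shell
#!/bin/sh
# Hypothetical helper: print a header name in the length-prefixed form
# used in the VRT_SetHdr call, i.e. \NNN (octal length) followed by "Name:".
hdr_prefix() {
    name="$1:"
    printf '\\%03o%s\n' "${#name}" "$name"
}

hdr_prefix X-Cache-Svr   # prints \014X-Cache-Svr:
```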


Portions of /etc/sysconfig/varnish follow...

# The minimum number of worker threads to start
VARNISH_MIN_THREADS=400

# The Maximum number of worker threads to start
VARNISH_MAX_THREADS=1000

# Idle timeout for worker threads
VARNISH_THREAD_TIMEOUT=60

# Cache file location
VARNISH_STORAGE_FILE=/var/lib/varnish/varnish_storage.bin

# Cache file size: in bytes, optionally using k / M / G / T suffix,
# or in percentage of available disk space using the % suffix.
VARNISH_STORAGE_SIZE="8G"
#
# Backend storage specification
VARNISH_STORAGE="malloc,${VARNISH_STORAGE_SIZE}"

# Default TTL used when the backend does not specify one
VARNISH_TTL=5

# the working directory

DAEMON_OPTS="-a ${VARNISH_LISTEN_ADDRESS}:${VARNISH_LISTEN_PORT} \
              -f ${VARNISH_VCL_CONF} \
              -T ${VARNISH_ADMIN_LISTEN_ADDRESS}:${VARNISH_ADMIN_LISTEN_PORT} \
              -t ${VARNISH_TTL} \
              -n ${VARNISH_WORKDIR} \
              -w ${VARNISH_MIN_THREADS},${VARNISH_MAX_THREADS},${VARNISH_THREAD_TIMEOUT} \
              -u varnish -g varnish \
              -p obj_workspace=4096 \
              -p sess_workspace=262144 \
              -p listen_depth=8192 \
              -p log_hashstring=off \
              -p lru_interval=60 \
              -p sess_timeout=10 \
              -p shm_workspace=32768 \
              -p ping_interval=1 \
              -p thread_pools=4 \
              -p thread_pool_min=100 \
              -p srcaddr_ttl=0 \
              -p esi_syntax=1 \
              -s ${VARNISH_STORAGE}"


---
John Adams
Twitter Operations
j...@twitter.com
http://twitter.com/netik

_______________________________________________
varnish-dev mailing list
varnish-dev@projects.linpro.no
http://projects.linpro.no/mailman/listinfo/varnish-dev
