Re: mod_cache thundering herd bug

2014-04-21 Thread Graham Leggett
On 19 Apr 2014, at 10:26 PM, Eric Covener cove...@gmail.com wrote:

 Graham -- related subject brought up either in Denver or in the bug.
 It seems that when we serve a stale file while the cache is locked,
 the age headers are small instead of large. I got totally lost trying
 to track down the issue, maybe it makes sense to you?  It's almost as
 if the time of the revalidation is somehow updated early and the
 delta in the stale cache hits is based off of that.

All thundering herd protection does is this: after letting the first
conditional request through, it serves stale data (RFC willing) until that
conditional request comes back or a specific maximum time is reached,
whichever comes first.
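
A minimal sketch of that decision in C, for illustration only. This is not the mod_cache code: the struct, enum, and function names are hypothetical, and the lock is modelled as a timestamp in memory rather than a lock file. It also assumes that once the lock is older than the maximum age, the next request is let through to the backend.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical outcome of one request for one cache entry. */
typedef enum {
    SERVE_FRESH,  /* entry is still fresh, nothing to revalidate */
    REVALIDATE,   /* this request goes through to the backend */
    SERVE_STALE   /* someone else is revalidating; serve the stale copy */
} herd_action;

/* Hypothetical per-entry state; all times are in seconds. */
typedef struct {
    bool   has_entry;     /* is there any cached copy at all? */
    time_t expires;       /* when the cached copy becomes stale */
    bool   stale_allowed; /* false if the RFC forbids serving it stale */
    bool   locked;        /* is a revalidation already in flight? */
    time_t lock_time;     /* when that revalidation started */
} herd_entry;

herd_action herd_decide(herd_entry *e, time_t now, time_t lock_max_age)
{
    assert(lock_max_age >= 0);

    if (e->has_entry && now < e->expires)
        return SERVE_FRESH;

    /* A lock only counts while it is younger than the maximum age. */
    bool lock_live = e->locked && (now - e->lock_time) < lock_max_age;

    /* Stale content plus a live lock: the herd is held back. */
    if (lock_live && e->has_entry && e->stale_allowed)
        return SERVE_STALE;

    /* No live lock: this request takes it and revalidates. With a live
     * lock but nothing stale to serve, the request still goes through,
     * which is the "no stale content, no protection" case. */
    if (!lock_live) {
        e->locked = true;
        e->lock_time = now;
    }
    return REVALIDATE;
}
```

With a maximum lock age of 5 seconds, a request arriving 4 seconds after the lock was taken gets the stale copy, while one arriving 5 seconds after is treated as the new revalidating request.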

The most valuable piece of information in this process is the reason
variable, which describes why something wasn't eligible for caching.
In httpd v2.4 the X-Cache-Detail header will give this to you; in httpd v2.2
you'll need to log at DEBUG level to get this:

ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r,
              "cache: %s not cached. Reason: %s", r->unparsed_uri,
              reason);

The questions to answer are:

- Is there stale content to serve? No stale content, no thundering herd 
protection.
- If stale content is being deleted, identify why that is. This is likely to be 
unrelated to thundering herd, but rather in other parts of mod_cache.

Regards,
Graham
--



Re: mod_cache thundering herd bug

2014-04-21 Thread Jim Riggs
On 21 Apr 2014, at 06:38, Graham Leggett minf...@sharp.fm wrote:

 On 19 Apr 2014, at 10:26 PM, Eric Covener cove...@gmail.com wrote:
 
 Graham -- related subject brought up either in Denver or in the bug.
 It seems that when we serve a stale file while the cache is locked,
 the age headers are small instead of large. I got totally lost trying
 to track down the issue, maybe it makes sense to you?  It's almost as
 if the time of the revalidation is somehow updated early and the
 delta in the stale cache hits is based off of that.
 
 All thundering herd protection does is this: after letting the first
 conditional request through, it serves stale data (RFC willing) until that
 conditional request comes back or a specific maximum time is reached,
 whichever comes first.
 
 The most valuable piece of information in this process is the reason
 variable, which describes why something wasn't eligible for caching.
 In httpd v2.4 the X-Cache-Detail header will give this to you; in
 httpd v2.2 you'll need to log at DEBUG level to get this:
 
 ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r,
               "cache: %s not cached. Reason: %s", r->unparsed_uri,
               reason);
 
 The questions to answer are:
 
 - Is there stale content to serve? No stale content, no thundering herd 
 protection.
 - If stale content is being deleted, identify why that is. This is likely to 
 be unrelated to thundering herd, but rather in other parts of mod_cache.



Covener - Are you talking about my comments in #16 on the ticket? 
(https://issues.apache.org/bugzilla/show_bug.cgi?id=50317#c16)

If so, do either you or Graham have thoughts on the Age header getting returned
with stale content? In my testing, when stale content is returned, no Age
header is set, which appears to be a violation of HTTP/1.1.



Re: mod_cache thundering herd bug

2014-04-21 Thread Eric Covener
 Covener - Are you talking about my comments in #16 on the ticket? 
 (https://issues.apache.org/bugzilla/show_bug.cgi?id=50317#c16)

 If so, do either you or Graham have thoughts on the Age header getting
 returned with stale content? In my testing, when stale content is
 returned, no Age header is set, which appears to be a violation of HTTP/1.1.


Yes. I think it's not that it's unset, but that the calculation
somehow uses the revalidation-in-progress check time as the basis.
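
For reference, a sketch of the age calculation from RFC 2616 section 13.2.3, to show which stored timestamps the Age value is anchored to. This is a standalone illustration, not the mod_cache code: the struct and function names are hypothetical, and all times are in seconds.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical record of the timestamps kept with a cached response. */
typedef struct {
    time_t date_value;    /* Date header of the stored response */
    time_t age_value;     /* Age header of the stored response, 0 if none */
    time_t request_time;  /* when the cache sent the request upstream */
    time_t response_time; /* when the cache received the response */
} stored_times;

static time_t max_time(time_t a, time_t b)
{
    return a > b ? a : b;
}

/* Current age of a stored response at time `now`, following the
 * RFC 2616 section 13.2.3 formulas step by step. */
time_t current_age(const stored_times *s, time_t now)
{
    assert(s->response_time >= s->request_time);
    assert(now >= s->response_time);

    time_t apparent_age = max_time(0, s->response_time - s->date_value);
    time_t corrected_received_age = max_time(apparent_age, s->age_value);
    time_t response_delay = s->response_time - s->request_time;
    time_t corrected_initial_age = corrected_received_age + response_delay;
    time_t resident_time = now - s->response_time;

    return corrected_initial_age + resident_time;
}
```

An entry stored about a minute ago yields an age of about a minute. If the stored timestamps were all reset when the revalidation started (one reading of the hypothesis above, not a confirmed diagnosis), the same formula would start again near zero, which would match the small Age values seen on stale hits.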

-- 
Eric Covener
cove...@gmail.com


Re: mod_cache thundering herd bug

2014-04-19 Thread Eric Covener
On Tue, Apr 8, 2014 at 4:11 PM, Jim Riggs apache-li...@riggs.me wrote:
 https://issues.apache.org/bugzilla/show_bug.cgi?id=50317

 While we are at ApacheCon, I would love to address this nasty bug with 
 someone familiar with 2.2's mod_cache. Our sites were brought down a few 
 times last year before we finally tracked it down to being this particular 
 bug. I am using a crude backport of the 2.3 patch (r1023398) in 2.2. It 
 works, but I don't know if it is correct.

 Can someone look at this one with me? We really need to get this fixed in 
 2.2, because there is NO thundering herd protection at all as things stand 
 right now.



Graham -- related subject brought up either in Denver or in the bug.
It seems that when we serve a stale file while the cache is locked,
the age headers are small instead of large. I got totally lost trying
to track down the issue, maybe it makes sense to you?  It's almost as
if the time of the revalidation is somehow updated early and the
delta in the stale cache hits is based off of that.

-- 
Eric Covener
cove...@gmail.com


Re: mod_cache thundering herd bug

2014-04-14 Thread Maciej Bogucki



r1023398 for 2.2:

http://people.apache.org/~covener/patches/httpd-2.2.x-thunder.diff

The remove_url() prevents other threads from serving a stale cached
file during refresh of a slow response, but it's unnecessary to have a
separate path because the refresh has to deal with 200s already.  When
the remove_url was added, there was no thundering herd lock / no
ability to serve stale content while one guy was reloading.


covener, mrumph, and I looked at this today at ApacheCon. I updated the bug
with some comments and attached this patch.

https://issues.apache.org/bugzilla/show_bug.cgi?id=50317


Hello,

Thank you very much for the patch, but *it doesn't work*. When I run an ab
test (/usr/bin/ab -k -c 5 -n 10 http://host/url), the application gets more
than one request:

1.1.1.1 - - [14/Apr/2014:14:01:58 +0200] GET /url HTTP/1.0 200 42398 
9A68DBA96CED90DC517F7D6302F5A748.gpi-app1 1163 1163
1.1.1.1 - - [14/Apr/2014:14:02:05 +0200] GET /url HTTP/1.0 200 42398 
D378685BBD4FB87C63A3A867ABFAFB3E.gpi-app1 2931 2930
1.1.1.1 - - [14/Apr/2014:14:02:05 +0200] GET /url HTTP/1.0 200 42398 
8B77A0C68FC6F16E0BA3A89C7A614E1A.gpi-app1 2992 2991
1.1.1.1 - - [14/Apr/2014:14:02:05 +0200] GET /url HTTP/1.0 200 42398 
57A48B49FB6C52E28F1FA97DDFCDC0C8.gpi-app1 3007 3006
1.1.1.1 - - [14/Apr/2014:14:02:05 +0200] GET /url HTTP/1.0 200 42398 
71573080388181B3C55E88CB4BFAB890.gpi-app1 3051 3051
1.1.1.1 - - [14/Apr/2014:14:02:06 +0200] GET /url HTTP/1.0 200 42398 
38DA8533D4F9B4046A2F607071652E94.gpi-app1 1412 1412


Here is more information on how to reproduce it.

*Compilation*

cd /tmp
svn co http://svn.apache.org/repos/asf/httpd/httpd/branches/2.2.x
cd 2.2.x/
svn co http://svn.apache.org/repos/asf/apr/apr/branches/1.4.x srclib/apr
svn co http://svn.apache.org/repos/asf/apr/apr-util/branches/1.4.x \
    srclib/apr-util
./buildconf
./configure --prefix=/etc/httpd --exec-prefix=/usr --bindir=/usr/bin \
    --sbindir=/usr/sbin --mandir=/usr/share/man --libdir=/usr/lib64 \
    --sysconfdir=/etc/httpd/conf --includedir=/usr/include/httpd \
    --libexecdir=/usr/lib64/httpd/modules --datadir=/var/www \
    --with-installbuilddir=/usr/lib64/httpd/build --with-mpm=prefork \
    --with-apr=/usr --with-apr-util=/usr --enable-suexec --with-suexec \
    --with-suexec-caller=apache --with-suexec-docroot=/var/www \
    --with-suexec-logfile=/var/log/httpd/suexec.log \
    --with-suexec-bin=/usr/sbin/suexec --with-suexec-uidmin=500 \
    --with-suexec-gidmin=100 --enable-pie --with-pcre \
    --enable-mods-shared=all --enable-ssl --with-ssl --enable-proxy \
    --enable-cache --enable-disk-cache --enable-ldap --enable-authnz-ldap \
    --enable-cgid --enable-authn-anon --enable-authn-alias \
    --disable-imagemap
patch -p0 < /root/rpmbuild/SOURCES/httpd-2.2.x-thunder.patch
make
make install

*Configuration*

<VirtualHost host:80>
...
...
## Cache
CacheRoot /tmp/cache
CacheEnable disk /
CacheDisable /static/
CacheMinFileSize 0
CacheMaxFileSize 1048576
CacheDirLevels 2
CacheDirLength 2
CacheLock on
CacheLockPath /tmp/mod_cache-lock
CacheLockMaxAge 5
CacheIgnoreHeaders ETag Set-Cookie
Header unset Expires
Header unset Cache-Control
Header always set Cache-Control max-age=30,stale-while-revalidate=15
</VirtualHost>

Best Regards
Maciej Bogucki


Re: mod_cache thundering herd bug

2014-04-09 Thread Eric Covener
r1023398 for 2.2:

  http://people.apache.org/~covener/patches/httpd-2.2.x-thunder.diff

The remove_url() prevents other threads from serving a stale cached
file during refresh of a slow response, but it's unnecessary to have a
separate path because the refresh has to deal with 200s already.  When
the remove_url was added, there was no thundering herd lock / no
ability to serve stale content while one guy was reloading.

On Tue, Apr 8, 2014 at 2:11 PM, Jim Riggs apache-li...@riggs.me wrote:
 https://issues.apache.org/bugzilla/show_bug.cgi?id=50317

 While we are at ApacheCon, I would love to address this nasty bug with 
 someone familiar with 2.2's mod_cache. Our sites were brought down a few 
 times last year before we finally tracked it down to being this particular 
 bug. I am using a crude backport of the 2.3 patch (r1023398) in 2.2. It 
 works, but I don't know if it is correct.

 Can someone look at this one with me? We really need to get this fixed in 
 2.2, because there is NO thundering herd protection at all as things stand 
 right now.

 - Jim




-- 
Eric Covener
cove...@gmail.com


Re: mod_cache thundering herd bug

2014-04-09 Thread Jim Riggs
On 9 Apr 2014, at 14:46, Eric Covener cove...@gmail.com wrote:

 r1023398 for 2.2:
 
  http://people.apache.org/~covener/patches/httpd-2.2.x-thunder.diff
 
 The remove_url() prevents other threads from serving a stale cached
 file during refresh of a slow response, but it's unnecessary to have a
 separate path because the refresh has to deal with 200s already.  When
 the remove_url was added, there was no thundering herd lock / no
 ability to serve stale content while one guy was reloading.


covener, mrumph, and I looked at this today at ApacheCon. I updated the bug 
with some comments and attached this patch.

https://issues.apache.org/bugzilla/show_bug.cgi?id=50317



mod_cache thundering herd bug

2014-04-08 Thread Jim Riggs
https://issues.apache.org/bugzilla/show_bug.cgi?id=50317

While we are at ApacheCon, I would love to address this nasty bug with someone 
familiar with 2.2's mod_cache. Our sites were brought down a few times last 
year before we finally tracked it down to being this particular bug. I am using 
a crude backport of the 2.3 patch (r1023398) in 2.2. It works, but I don't know 
if it is correct.

Can someone look at this one with me? We really need to get this fixed in 2.2, 
because there is NO thundering herd protection at all as things stand right now.

- Jim