Here's the result: http://pastie.org/387928

This box used to run everything (much of which has now been moved to other
clusters). If I can't get it to behave it'll be doing nothing soon :)

/var/log/messages isn't large enough to trigger the misbehavior, but hopefully
it'll show something... I can't really run the test against the nginx log
(which is massive) because I always have to kill the grep before enough
backend checks fail to cause a site outage.







On Thu, Feb 12, 2009 at 6:44 PM, John Lauro <[email protected]> wrote:

> > I stopped logging so much in haproxy, but I get the same thing if I
> > grep the nginx logs on this server: haproxy's mongrel backend checks
> > start failing. I've noticed it only happens when using httpchk (or at
> > least it happens much, much more quickly).
> >
> > Here's an iostat I ran -- the first two are during the grep on the
> > nginx logs; the last one is after I finished:
>
> The iostat looks ok.
>
> Cut-n-paste the following (or run it from a script) so we can get a better
> idea of the box's general load and to see if they turn up anything:
>
> cat /proc/interrupts
> free
> netstat --inet -n | awk '{ print $6 }' | sort | uniq -c
> ulimit -a
> vmstat 1 10  & ( sleep 5 ;  grep whatever /var/log/messages >/dev/null )
> cat /proc/interrupts
> echo lsof count `lsof | wc -l`
>
> What type of disk subsystem do you have?  Given how it chokes when doing a
> grep, it almost sounds like you might have a faulty driver.  You do realize
> 8 cores is overkill for this, unless you are running other stuff on the
> box.
> The two checks of /proc/interrupts are there to see if something (especially
> disk I/O) is generating too many interrupts; what we need to look at is the
> difference between the two.
>
>
>
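A rough sketch of the before/after interrupt comparison described in the quoted message, assuming the two `cat /proc/interrupts` outputs were saved to files. The function name `irq_delta` and the /tmp paths in the comments are made up for illustration and are not part of the original commands; the parsing assumes the usual layout of a label ending in a colon followed by one count per CPU.

```shell
# irq_delta BEFORE AFTER
# Sum the per-CPU counters on each row of two saved /proc/interrupts
# snapshots and print "delta label" for every row whose total increased,
# largest increase first.
irq_delta() {
    awk '
        # Add up the leading numeric fields (one per CPU) after the label.
        function rowtotal(   i, t) {
            t = 0
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) t += $i
            return t
        }
        $1 !~ /:$/ { next }                        # skip the CPU header line
        NR == FNR  { before[$1] = rowtotal(); next }  # first file: remember totals
        {
            d = rowtotal() - before[$1]
            if (d > 0) print d, $1
        }
    ' "$1" "$2" | sort -rn
}

# Hypothetical usage (paths are assumptions):
#   cat /proc/interrupts > /tmp/irq.before
#   grep whatever /var/log/messages > /dev/null
#   cat /proc/interrupts > /tmp/irq.after
#   irq_delta /tmp/irq.before /tmp/irq.after
```

A row that jumps by far more than the rest during the grep (for example the disk controller's line) would be the one to look at first.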
