Re: inode state

2003-12-09 Thread Robert Watson
On Tue, 9 Dec 2003, Fabian Thylmann wrote:

 I have a heavily used threaded server program running on one of my Dell
 PowerEdge 1750 servers. It's a statistical analysis package for websites.
 Currently it analyses over 60 million requests a day, which (for various
 reasons) causes it to handle around 120 million HTTP requests a day. At
 peak it handles around 1500 requests a second.
 
 The system stores many statistics in memory, which are flushed to disk
 in cycles by a worker thread.
 
 Another big part is stored in an on-disk database which is mmap()'d into
 memory. Because we do not have enough memory to keep everything in
 memory at one time the mmap() system of course pages data in and out. 
 
 When I look in systat -v I see that dirtybuf climbs to about 1700 and
 then the buffers get flushed to disk, causing high disk usage of around
 300-400 tps, which renders the disks useless for anything else.
 
 When those flushes occur, my app's state as displayed by top(1) changes
 to "inode", PRI is set to -14 and CPU usage drops rapidly. The program
 and ALL of its threads are stalled at that time. Those inode states last
 around 2 or 3 seconds and happen every 30 seconds or so.
 
 In those 3 seconds we lose around 1500 hits at peak times because the
 app cannot process them fast enough. This results in around 2 million
 hits lost over the day.
 
 I am now wondering if anyone can explain to me why ALL threads and not
 just the threads that actually do I/O work get blocked when dirty
 buffers are flushed and what to do to fix this problem. 
 
 I would be very happy if someone could reply and point me in the right
 direction!
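
As a rough consistency check on the figures quoted above, the arithmetic can be written out. This is only a sketch: the 30-second interval, the 1500 hits lost per stall at peak and the 2 million daily total are taken from the message, and the variable names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
STALL_INTERVAL_S = 30            # "every 30 seconds or so"
HITS_LOST_AT_PEAK = 1500         # hits lost per stall at peak times
REPORTED_DAILY_LOSS = 2_000_000  # "around 2 million" hits lost per day

# Number of flush stalls in one day at the reported interval.
stalls_per_day = SECONDS_PER_DAY // STALL_INTERVAL_S

# Loss if every single stall cost as much as a stall at peak load.
upper_bound = stalls_per_day * HITS_LOST_AT_PEAK

# Average loss per stall implied by the reported daily total.
avg_loss_per_stall = REPORTED_DAILY_LOSS / stalls_per_day

print(stalls_per_day)             # 2880
print(upper_bound)                # 4320000
print(round(avg_loss_per_stall))  # 694
```

The reported daily loss sits below the all-peak bound, which fits a load that is under peak for much of the day.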

You don't mention which version of FreeBSD you're running -- if 4.x, you
probably want to relink your application against the linuxthreads port.
This is because libc_r implements threads inside a single process without
the support of the kernel, which means that if the process is blocked in
the kernel, all threads are blocked with it.  The linuxthreads package
uses a model similar to Linux's threading implementation (hence the name)
to allow the threads to be scheduled using lightweight versions of
processes (shared file descriptors, etc).  This isn't quite
POSIX-compliant, but it works quite well for disk-bound applications such
as databases.
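
The effect of a user-level threads package can be illustrated with a toy model. The sketch below is not how libc_r is actually implemented; it is a minimal Python analogy in which generators stand in for threads and a small round-robin loop stands in for the in-process scheduler. All names are hypothetical, and time.sleep() stands in for a call that blocks in the kernel, such as waiting on disk I/O.

```python
import time

def io_task(log, block_s):
    # Stands in for a thread that ends up blocked in the kernel.
    log.append("io: start blocking call")
    time.sleep(block_s)  # the whole process waits here, scheduler included
    log.append("io: blocking call returned")
    yield

def worker_task(log, steps):
    # Stands in for a thread that only does in-memory work.
    for i in range(steps):
        log.append(f"worker: step {i}")
        yield  # hand control back to the user-level scheduler

def run_round_robin(tasks):
    """Toy user-level scheduler: one process, several 'threads'."""
    queue = list(tasks)
    while queue:
        task = queue.pop(0)
        try:
            next(task)          # run the task until it yields
            queue.append(task)  # still alive, schedule it again
        except StopIteration:
            pass                # task finished, drop it

log = []
run_round_robin([worker_task(log, 3), io_task(log, 0.05)])
for line in log:
    print(line)
```

In the printed log no worker step appears between the start and the return of the blocking call: because the scheduler lives inside the blocked process, it cannot run anything else. With kernel-scheduled threads, only the thread that made the call would wait.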

If you're running on 5.x, especially recent 5.1 or 5.2 prereleases, you
probably want to give libkse a try.  It's the new m:n threading
implementation that will become the default in 5.3, and also permits
parallelism (only in a more POSIX-compliant way, and in theory offering
much greater scalability for large numbers of threads).  I stick the
following lines in my /etc/libmap.conf on 5.x boxes to force all
applications linked against libc_r to use libkse instead:

  libc_r.so.5 libkse.so.1
  libc_r.so   libkse.so

One particularly nice thing about the m:n thread support is that you can
run-time plug the thread library between several options (libc_r, libthr,
libkse) to pick the one that performs best for your application.  Another
benefit of running with a non-libc_r threads package is that if you have
an SMP box, you'll see real parallelism. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Senior Research Scientist, McAfee Research


___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: inode state

2003-12-09 Thread Fabian Thylmann
Hi Robert,

Thanks for the reply, and yeah, I totally forgot the version. It's 4.9.

Now, the problem is:
1) I cannot use linuxthreads since my server is also multiplexing via
kqueues and I cannot find any version of linuxthreads which implements a
thread-safe version of kqueue.
2) I cannot use FreeBSD 5.2 because it fails to boot on a Dell PowerEdge
1750 with two hard disks. The LSI Logic SCSI controller the server uses (mpt
driver) does not seem to find any disks and gives up with an error.
If I remove one of the two disks, 5.2 boots but the kernel traps as soon as
it tries to write to the disk.

Is there some way to keep the number of kernel-level locks as low as
possible? This all seems to be associated with flushing dirty buffers, and I
wonder whether there is a way to make it flush in much smaller bursts, or why
exactly it has to lock the process while doing so.

Fabian



Re: inode state

2003-12-09 Thread Robert Watson

On Tue, 9 Dec 2003, Fabian Thylmann wrote:

 Thanks for the reply, and yeah, I totally forgot the version. It's 4.9.
 
 Now, the problem is:
 1) I cannot use linuxthreads since my server is also multiplexing via
 kqueues and I cannot find any version of linuxthreads which implements a
 thread-safe version of kqueue.

Hmm.  Kqueue should be thread-safe in that it's a system call, but I can't
speak to the safety of various arguments/parameters.  I don't know if
linuxthreads tries to provide locking around file descriptors and might
have reference problems if kqueue were held over a call to close(), but it
could be kqueue will just work with linuxthreads.  Do calls like
select()/poll() require thread-safe versions in linuxthreads? 

 2) I cannot use FreeBSD 5.2 because it fails to boot on a Dell
 PowerEdge 1750 with two hard disks. The LSI Logic SCSI controller the
 server uses (mpt driver) does not seem to find any disks and gives up
 with an error.  If I remove one of the two disks, 5.2 boots but the
 kernel traps as soon as it tries to write to the disk.

Do you have an outstanding PR on the LSI problem, and/or a stack trace for
the trap?  In the past, our LSI drivers have been fairly well maintained
on the LSI side.  I can certainly try shaking some branches and see if
anything falls down, if there's a detailed bug report I can point at.

 Is there some way to keep the number of kernel-level locks as low as
 possible? This all seems to be associated with flushing dirty buffers,
 and I wonder whether there is a way to make it flush in much smaller
 bursts, or why exactly it has to lock the process while doing so.

I know some work has been done relating to this problem at Yahoo,
especially relating to disk fragmentation resulting from allocation using
mmap on sparse files.  You might want to try posting about this problem on
freebsd-fs or freebsd-current and see if you manage to hook someone who's
been looking at this.
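
One application-side way to get smaller write bursts is to have the worker thread sync the mapped database in small ranges, instead of leaving all dirty pages to the periodic flush. This is not something proposed in this thread and it is untested on 4.9. Below is a minimal sketch in Python, where mmap.flush() wraps msync(); the scratch file, the chunk size and the function names are assumptions for illustration only.

```python
import mmap
import tempfile

# Offsets passed to flush() must be multiples of this value.
UNIT = mmap.ALLOCATIONGRANULARITY

def flush_in_chunks(mm, chunk_units=64):
    """Sync a mapping to disk one fixed-size range at a time.

    Returns the number of flush (msync) calls issued.
    """
    chunk = chunk_units * UNIT
    calls = 0
    for offset in range(0, len(mm), chunk):
        size = min(chunk, len(mm) - offset)
        mm.flush(offset, size)  # sync only this range
        calls += 1
    return calls

def demo():
    # Hypothetical stand-in for the on-disk database: a scratch file
    # of 256 allocation units, mapped shared and writable.
    with tempfile.TemporaryFile() as f:
        f.truncate(256 * UNIT)
        with mmap.mmap(f.fileno(), 0) as mm:
            mm[0:5] = b"hello"
            mm[-5:] = b"world"
            calls = flush_in_chunks(mm, chunk_units=64)
        f.seek(0)
        head = f.read(5)
    return calls, head

calls, head = demo()
print(calls, head)  # 4 b'hello'
```

Under a user-level threads package each sync call would presumably still block the whole process, only for a shorter time per call, so the chunk size and any pause between chunks would need tuning against the real workload.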

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Senior Research Scientist, McAfee Research


 

Re: inode state

2003-12-09 Thread Fabian Thylmann
 Hmm.  Kqueue should be thread-safe in that it's a system call, but I can't
 speak to the safety of various arguments/parameters.  I don't know if
 linuxthreads tries to provide locking around file descriptors and might
 have reference problems if kqueue were held over a call to close(), but it
 could be kqueue will just work with linuxthreads.  Do calls like
 select()/poll() require thread-safe versions in linuxthreads?

Yeah, select/poll is not needed, but the thing is that my app, which I just
changed from pthreads to linuxthreads, started fine and launched its
threads, but then crashed when calling kevent(), even though all the
references it passed to kevent() were valid.

 Do you have an outstanding PR on the LSI problem, and/or a stack trace for
 the trap?  In the past, our LSI drivers have been fairly well maintained
 on the LSI side.  I can certainly try shaking some branches and see if
 anything falls down, if there's a detailed bug report I can point at.

I have neither a PR nor a stack trace for the trap right now. How exactly
would I get a stack trace of the trap? It happens when booting from the
5.2-beta CD-ROM. Let me know what you need from me and I'll get all that.
The box is at a hosting provider so I'll have to get the info from them.

 I know some work has been done relating to this problem at Yahoo,
 especially relating to disk fragmentation resulting from allocation using
 mmap on sparse files.  You might want to try posting about this problem on
 freebsd-fs or freebsd-current and see if you manage to hook someone who's
 been looking at this.

Thanks, I'll give that a try.

Fabian
