On Tue, May 24, 2005 at 07:10:25PM -0700, Doug Hardie said:
> 
> On May 24, 2005, at 13:21, Stephen Gran wrote:
> 
> >On Tue, May 24, 2005 at 12:54:47PM -0700, Doug Hardie said:
> >
> >>ktrace is effectively the same thing as truss so I used it.  There
> >>are two files available:
> >>
> >>http://www.lafn.org/clamav/ktrace.html
> >>http://www.lafn.org/clamav/clamd.html
> >>
> >>ktrace.html is the output of ktrace - it's about 14 MB.  clamd.html is
> >>the clamd.log file entries - very small and probably of no value.
> >>
> >
> >It is difficult to say from the provided ktrace file what is
> >happening, as there are no timestamps and all lines have the same
> >pid.  One thing that seems odd is that the milter appears to continue
> >accepting and processing input after a reload event has happened.
> >Not for the body, but for all other milter events (header, connect,
> >etc).  That is a start at least.
> >
> >Is there a way to log separately by pid or something with ktrace?  I
> >don't know it well, so I am not sure what arguments to tell you to
> >pass it.  Also, I am not sure that will even work - in a proper
> >thread implementation, all threads share a pid (but have different
> >lwp ids) so this may not be possible.
> 
> clamav-milter is only one process.  It has multiple threads but those
> are not visible to the kernel.  

I don't know how the BSD implementation of threads works, as I said.  On
Linux, the separate threads share a pid but have different lwp ids, and
are separable to the kernel and to strace.  It will make things a little
harder if the same is not true on BSD.
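
A small illustration of the shared-pid point, as a minimal Python sketch
rather than anything taken from clamav-milter.  It assumes Python 3.8 or
later for threading.get_native_id(), and collect_thread_ids is a name made
up for the example.  On Linux the native id it reports should correspond to
the per-thread id that strace can separate on; what it reports on BSD
depends on the thread library in use.

```python
import os
import threading

def collect_thread_ids(n_threads=4):
    """Return a list of (pid, native_thread_id) pairs, one per thread."""
    results = []
    results_lock = threading.Lock()
    # The barrier keeps all threads alive at the same time, so the OS
    # cannot hand a finished thread's id to a later one.
    barrier = threading.Barrier(n_threads)

    def worker():
        ids = (os.getpid(), threading.get_native_id())
        barrier.wait()
        with results_lock:
            results.append(ids)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    for pid, tid in collect_thread_ids():
        print(f"pid={pid} native thread id={tid}")
```

Every line printed should carry the same pid and a different thread id.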

> The problem does not occur immediately with a database reload.  It
> takes 10 or so minutes before it hangs/quits.  I suspect that the
> problem occurs when there are active messages that do not complete
> before some timeout value.  clamav-milter is waiting for everything
> to go quiet, but on my receive mail server that never happens.  There
> are always 30-40 active sendmail children.  As a result it never goes
> quiet.  I suspect that clamav-milter eventually gives up and that's
> when the problem occurs.  On my outgoing mail server, which handles
> considerably less mail, most of the database updates do not cause a
> problem.  On my test server, which handles 3 emails daily, it never
> causes a problem.

This is the generally observed pattern, so it's good to know we're
chasing the same problem, at least.
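
The "wait for quiet, then give up" behaviour suspected above can be modelled
roughly as follows.  This is a toy Python sketch under stated assumptions,
not clamav-milter's actual code: SessionTracker, begin, end and
wait_until_quiet are all invented names, and the real milter may wait in a
quite different way.

```python
import threading

class SessionTracker:
    """Counts active sessions and lets a reloader wait for zero (hypothetical)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active = 0

    def begin(self):
        # A new message starts being processed.
        with self._cond:
            self._active += 1

    def end(self):
        # A message finishes; wake any waiter once nothing is active.
        with self._cond:
            self._active -= 1
            if self._active == 0:
                self._cond.notify_all()

    def wait_until_quiet(self, timeout):
        """Return True if no sessions are active within `timeout` seconds."""
        with self._cond:
            return self._cond.wait_for(lambda: self._active == 0, timeout)
```

On a server where at least one session is always open, wait_until_quiet()
returns False after the timeout; on a nearly idle server it returns True at
once.  That would match the difference seen between the busy and quiet
machines, if the milter really does wait in this manner.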

> kdump will provide the timestamps if that would be helpful, but the
> entries are pretty much evenly spaced out over about a 5 minute period
> between when I touched the daily file and when it hung.

Well, that's helpful - looking at the file at first, I had no way of
telling that.

What I can glean from the output you have provided is that there is a
point reached where some threads begin doing a write (not accepting
inputs), which I would expect from the source.  But puzzlingly, some
(other?  No way to know without being able to separate the threads) are
still accepting and processing messages after that point.

I also see no mutex-related calls, which I would have expected to see a
lot of.  Since I suspect the problem is that one thread is prematurely
altering or locking a mutex, stalling the others, this makes it harder
to debug the sequence of events :)  This is presumably a problem of
ktrace or the invocation, rather than an absence of events, though.  It
appears from what I can find of their respective man pages that truss
may be better at this sort of thing than ktrace (it certainly seems to do
a better job following forks and threads in the Solaris page I see).  Do
you mind giving it a go?
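
A toy sketch of the kind of stall suspected above, written in Python and
not derived from the clamav-milter source: try_work and demo are
hypothetical names, and the "reload" is simply the main thread holding a
lock.  It shows only that a worker needing a mutex which another thread
took early cannot proceed until that mutex is released.

```python
import threading

def try_work(lock, timeout):
    """Return True if the worker got the lock within `timeout` seconds."""
    got_it = lock.acquire(timeout=timeout)
    if got_it:
        lock.release()
    return got_it

def demo(timeout=0.05):
    lock = threading.Lock()
    outcome = {}

    def worker(name):
        outcome[name] = try_work(lock, timeout)

    # The "reload" takes the lock early and keeps holding it.
    lock.acquire()
    stalled = threading.Thread(target=worker, args=("during_reload",))
    stalled.start()
    stalled.join()
    lock.release()

    # Once the lock is released, the same work goes through.
    free = threading.Thread(target=worker, args=("after_reload",))
    free.start()
    free.join()
    return outcome

if __name__ == "__main__":
    print(demo())
```

In a trace that showed the mutex calls, the stalled worker would appear as a
lock attempt that does not return until the holder lets go.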

Thanks again,
-- 
 --------------------------------------------------------------------------
|  Stephen Gran                  | The best portion of a good man's life,  |
|  [EMAIL PROTECTED]             | his little, nameless, unremembered acts |
|  http://www.lobefin.net/~steve | of kindness and love.   -- Wordsworth   |
 --------------------------------------------------------------------------
