If you speak to the Netegrity engineers, they will tell you that the
problem seems to be an interaction between Solaris' FCNTL locking and a
feature that SiteMinder uses.  You can change the accept/select locking
to not use APR, but then APR and Apache will use two different locking
mechanisms, and you won't have solved the underlying problem, namely that
Solaris' FCNTL doesn't work with some modules.  A better solution, IMHO,
is to make APR choose the right kind of lock, depending on the features
requested.

APR should always be choosing the best locking mechanism possible for any
given system.  If APR chooses a different locking mechanism, there is a
reason, and IMHO Apache would be stupid to ignore the choice APR makes.
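
To make the "choose by requested features" idea concrete, here is a
minimal sketch in C.  Everything in it is hypothetical: the feature
flags, the lock_mech enum, the table, and choose_lock_mech() are
illustrative names, not APR's actual API, and the capability bits given
to each mechanism are assumptions, not a statement about what FCNTL or
SysV semaphores really guarantee on Solaris.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature flags a caller might request (not APR's API). */
enum {
    LOCK_FEAT_CROSS_PROCESS = 1u << 0,  /* usable between processes */
    LOCK_FEAT_NO_FILE_LOCKS = 1u << 1   /* must not rely on file locks */
};

typedef enum { MECH_NONE, MECH_FCNTL, MECH_SYSVSEM } lock_mech;

struct mech_info {
    lock_mech mech;
    unsigned  features;   /* features this mechanism is assumed to offer */
};

/* Preference order is illustrative: the platform default comes first. */
static const struct mech_info mech_table[] = {
    { MECH_FCNTL,   LOCK_FEAT_CROSS_PROCESS },
    { MECH_SYSVSEM, LOCK_FEAT_CROSS_PROCESS | LOCK_FEAT_NO_FILE_LOCKS }
};

/* Return the first mechanism that offers every requested feature,
 * or MECH_NONE if no mechanism in the table qualifies. */
lock_mech choose_lock_mech(unsigned requested)
{
    size_t i;

    /* An empty request would match anything; treat it as a caller bug. */
    assert(requested != 0);

    for (i = 0; i < sizeof(mech_table) / sizeof(mech_table[0]); i++) {
        if ((mech_table[i].features & requested) == requested)
            return mech_table[i].mech;
    }
    return MECH_NONE;
}
```

With this table, a caller that asks only for LOCK_FEAT_CROSS_PROCESS gets
MECH_FCNTL, while one that also asks for LOCK_FEAT_NO_FILE_LOCKS gets
MECH_SYSVSEM, so only the callers that need a different mechanism are
moved off the default.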

As for the problem that Jeff is describing, this is a problem with trying
to make the old Apache 1.3 code fit the MPM model without paying enough
attention to the flow of the code.  Take a look at the 1.3 code: it uses
longjmp to make sure it is always executing the correct code.  The 2.0
code tries to use return codes from APR functions.  But, the basic code
still looks like the 1.3 code.  A few months ago, child processes were
morphing into parent processes when we tried to kill them off.  To fix
this, I modified some of the code to exit at the right time.  I believe
about a month later, Paul Reder made a similar change.  IMNSHO, this can
be solved only by actually taking the time to trace through the prefork
MPM, figure out what is happening, and fix the bugs.  BTW, the
threaded and perchild MPMs are incredibly similar to the prefork MPM in
this respect.
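
For reference, the return-code style combined with the EINTR option Jeff
lists below could look roughly like this.  It is a sketch under stated
assumptions, not the prefork code: it assumes a plain fcntl() lock, a
signal handler installed without SA_RESTART so that a blocked fcntl()
fails with EINTR, and the names die_now, install_handler(), and
accept_mutex_lock() are all made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by the signal handler; the main flow polls it (hypothetical name). */
static volatile sig_atomic_t die_now = 0;

static void graceful_handler(int sig)
{
    (void)sig;
    die_now = 1;          /* do as little as possible in the handler */
}

/* Install the handler WITHOUT SA_RESTART so blocking calls see EINTR. */
int install_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = graceful_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, NULL);
}

/* Returns 0 with the lock held, 1 if we were told to go away while
 * waiting, and -1 on any other error (errno is left set by fcntl). */
int accept_mutex_lock(int fd)
{
    struct flock fl;

    assert(fd >= 0);
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;      /* exclusive lock on the whole file */
    fl.l_whence = SEEK_SET;

    for (;;) {
        if (die_now)
            return 1;                     /* report it via return code */
        if (fcntl(fd, F_SETLKW, &fl) == 0)
            return 0;
        if (errno != EINTR)
            return -1;
        /* EINTR: a signal arrived; loop and re-check die_now. */
    }
}
```

A child loop would call accept_mutex_lock() at the top of each iteration
and leave the loop when it returns 1.  Note the window between the flag
check and the fcntl() call: a signal that lands there is not noticed
until the next wakeup, so this narrows the problem but does not close it.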

Ryan

On Wed, 30 May 2001, Bill Stoddard wrote:

> I cannot believe the lock calls don't already return EINTR. That seems to be an okay
> solution... Here is another solution I was an advocate for at least 1 1/2 years ago...
>
> Don't APRize the accept/select locking.
>
> And here is one real life scenario to demonstrate... Many (most?) Apache users running
> Apache on Solaris with Netegrity Siteminder have problems with Apache going belly up (not
> serving requests) because of a failure in the default FCNTL locking in Apache 1.3.
> Netegrity's advice to their customers is to use SYSV_SEM locking rather than FCNTL
> locking. If we use APR'ized locks, then all Apache mutexes are changed from FCNTL to
> SYSV_SEM with this change.  This is not good. I believe the accept/select locking is
> specialized enough to merit handling outside of APR.
>
> Bill
>
> > This is easy to see with prefork and a config or platform which
> > requires serialization around select/accept.
> >
> > Pound on apache to create some extra server processes then stop
> > pounding and take a look at the server-status page (or the output of
> > ps).  A bunch of processes will stay in 'I' (SERVER_IDLE_DIE) state
> > indefinitely.  They know they need to go away because the signal
> > handler ran but they are stuck waiting on the accept mutex.
> >
> > To get rid of them you need to pound on the server some more.
> >
> > We ran into problems before because we were doing too much in the
> > signal handler.  Now perhaps we're not doing enough?
> >
> > possibilities
> >
> > . do more stuff in the signal handler or call [sig]longjmp() to get
> >   back to the main flow; unfortunately, this may not be better
> >   than the old logic to do a bunch of stuff in the signal handler
> >   because we don't know what mutexes we're holding or what other sorts
> >   of cleanups are needed; I'm not sure we can trust pool cleanups to
> >   solve all problems here
> >
> > . make the mutex lock call return EINTR and then go back to the top of
> >   the loop and find out that we're supposed to go away
> >
> >   we'd assume that if it is somewhere besides the mutex lock then it
> >   is about to exit anyway
> >
> >   (this seems to be the least risky, but is ugly from the APR point of
> >   view since EINTR isn't a common concept)
> >
> > . ???
> >
> > --
> > Jeff Trawick | [EMAIL PROTECTED] | PGP public key at web site:
> >        http://www.geocities.com/SiliconValley/Park/9289/
> >              Born in Roswell... married an alien...
> >
>
>


_______________________________________________________________________________
Ryan Bloom                              [EMAIL PROTECTED]
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------