Hello Blackdown developers,

        Abstract of this rather long message:

        What exactly is allowed and what is not with respect to signals in
a multithreaded application that uses the Blackdown 1.2.2-RC4 JDK on
Linux? I am experiencing random thread hangs inside the JVM. When that
happens the entire JVM hangs, no other thread can attach to it, but the
rest of the threads (that don't use the JVM) continue to run just fine.
Why do I suspect that this is related to signals? Well, you'll have to
read the long version below:

        I am developing a plug-in for AOLServer to run Tomcat in-process.
For those of you who are not familiar with these two, AOLServer is a
multithreaded, open source web server written in C and Tomcat is a servlet
engine (written in Java).

        I am developing this on Linux 2.2.14 with glibc 2.1.3 and the
Blackdown JDK 1.2.2-RC4. On Linux, AOLServer uses kernel threads, like
the Blackdown JDK (with native threads). So far this should be safe.

        The plug-in instantiates a JVM inside the AOLServer process.
Threads that handle requests for servlets attach themselves to the JVM,
and call a method in the servlet engine. The request is then processed on
the same thread, but inside the JVM, which calls back into the web server to
write out the response. A request is handled this way on a single thread.
ASCII drawing (WS = web server, SE = servlet engine):

        request --> [ C function in WS ] --> [ Java method in SE ] -->
        --> [ C callback in WS ] --> response

        All this works fine, except that in some instances the threads
that process requests hang *at random points* inside the JVM. Usually a
thread hangs when it calls a method (not JNI, a Java to Java call). The
methods called are *not* synchronized. I know for a fact that it happens at
truly random points. I have done extensive logging on the Java side using
synchronous writes. There are no locks involved (synchronized blocks or
methods). Also, the servlet engine (Tomcat) is just fine when it runs
outside the web server.

        Other people who are working on a module for Apache 2.0a have
encountered the same problem (but they work in a three-letter company and
won't ask here for help).

        Why do I ask about signals? Well, the really weird thing is that
when I send a SIGINT to the server to shut it down, *all* threads that are
blocked in the JVM suddenly awake and complete normally, sending every bit
of the response to the client. AOLServer has a nice shutdown procedure
that waits a while for threads to finish their work. It does however
kill them if they don't finish in a given interval.

        Thus, I came to the conclusion that it must be related to signals.
So I started to dig into this issue and I found a stern warning in the
Blackdown FAQ:

* Native code using JNI should NOT modify the signal processing state.          
  The VM uses signals and any change to the signal handling may result
  in VM failures.                                                                      
 

        Now, any decent web server (or daemon) has to intercept some
signals. On the AOLServer side the source is nicely commented. The main
thread does the following:

    /*
     * Block SIGHUP, SIGPIPE, SIGTERM, and SIGINT. This mask is                 
     * inherited by all subsequent threads so that only this                    
     * thread will catch the signals in the sigwait() loop below.               
     * Unfortunately this makes it impossible to kill the                       
     * server with a signal other than SIGKILL until startup                    
     * is complete.                                                             
     */                                                                         

        I hope that you guys who wrote the Linux-specific JVM can tell me
if this is the reason why threads hang. If so, what should I do?

        Less vague questions:

        1) What *exactly* is permitted and what not with respect to
signals, threads and masks? You cannot reasonably expect a web server not
to deal with any signals...

        2) What exactly does the -Xrs flag do? How does it "reduce the use
of OS signals"? Can this help given the list of signals AOLServer uses?

        3) What if I pass the request to a different Java-only thread in
the JVM (let's call this a proxy thread) and make the C-Java thread that
handles the connection wait on the C side until the proxy thread calls back
into the C side with the results? This would require some work, so I'd like to
hear your opinion on it. Could it make any difference?
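
        To make the idea in question 3 concrete, here is a minimal sketch
of the C-side handoff only. It assumes a mutex and a condition variable
guard the result. The handoff_t type, the function names, and the status
value 200 are all hypothetical, and a plain pthread stands in for the
Java-only proxy thread, which in the real design would report back
through a native callback.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical record shared by the connection thread and the proxy. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  done_cond;
    int             done;    /* becomes 1 once the proxy has finished */
    int             status;  /* result reported by the proxy */
} handoff_t;

static void handoff_init(handoff_t *h)
{
    pthread_mutex_init(&h->lock, NULL);
    pthread_cond_init(&h->done_cond, NULL);
    h->done = 0;
    h->status = 0;
}

/* Proxy side: publish the result and wake the waiting thread. */
static void handoff_complete(handoff_t *h, int status)
{
    pthread_mutex_lock(&h->lock);
    h->status = status;
    h->done = 1;
    pthread_cond_signal(&h->done_cond);
    pthread_mutex_unlock(&h->lock);
}

/* Connection thread: sleep on the C side until the proxy reports back. */
static int handoff_wait(handoff_t *h)
{
    int status;
    pthread_mutex_lock(&h->lock);
    while (!h->done)  /* the loop guards against spurious wakeups */
        pthread_cond_wait(&h->done_cond, &h->lock);
    status = h->status;
    pthread_mutex_unlock(&h->lock);
    return status;
}

/* Stand-in for the Java-only proxy thread: pretend to serve a request. */
static void *proxy_thread(void *arg)
{
    assert(arg != NULL);
    handoff_complete((handoff_t *)arg, 200);
    return NULL;
}

/* Run one request through the handoff; returns the status or -1. */
static int serve_via_proxy(void)
{
    handoff_t h;
    pthread_t tid;
    int status;
    handoff_init(&h);
    if (pthread_create(&tid, NULL, proxy_thread, &h) != 0)
        return -1;
    status = handoff_wait(&h);
    pthread_join(tid, NULL);
    pthread_cond_destroy(&h.done_cond);
    pthread_mutex_destroy(&h.lock);
    return status;
}
```

        With this arrangement the connection thread never enters the JVM
itself. Whether that avoids the hangs is exactly what I don't know.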

        The AOLServer is open source, so I can hack it any way I like, but
that's not true for the JVM. Clearly a web server that wouldn't handle
any signals is unlikely to be popular, so I can't just hack them out. A
finer solution is needed.

        I have also tried the IBM 1.1.8 JDK and found the same problem. So
the issue might be inherited from Sun's signaling code. But you guys seem
aware of the problem; you're the only ones who mention it in your FAQ.

        Finally, if you think I am asking for too much, please note that I
am making the plug-in available for free, and I don't get paid to write
it, so I can't spend any money on Java consulting :-) You can get your
own JVMs to hang on it at http://www.ss.pub.ro/~gaburici/nstomcat/


        TIA,
        Vasile


