Hi Tal,
The timeout patch will certainly go into tomcat3.3, and I think Marc will
add it to 3.2 also - it's a good solution and I don't think it can break
anything.
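For reference, the kind of read timeout being discussed can be sketched
roughly as below. This is not the actual patch from the bug report: the
names ReadTimeoutSketch, readTimesOut and demo, and the 200 ms value, are
made up for illustration. The only real API relied on is
Socket.setSoTimeout, which makes a blocked read() throw
SocketTimeoutException instead of holding the worker thread forever.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class ReadTimeoutSketch {

    /**
     * Accepts one connection and tries to read a single byte from it.
     * Returns true if the read timed out because the client stayed silent.
     */
    static boolean readTimesOut(ServerSocket server, int timeoutMillis) throws IOException {
        try (Socket socket = server.accept()) {
            // The core of the idea: bound how long read() may block.
            socket.setSoTimeout(timeoutMillis);
            InputStream in = socket.getInputStream();
            try {
                in.read();       // blocks until data, end of stream, or timeout
                return false;    // the client sent something or closed the connection
            } catch (SocketTimeoutException e) {
                return true;     // idle client: the worker thread is released
            }
        }
    }

    /** Demo: a loopback client connects and then never sends anything. */
    static boolean demo(int timeoutMillis) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket silentClient = new Socket(server.getInetAddress(), server.getLocalPort())) {
            return readTimesOut(server, timeoutMillis);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + demo(200));
    }
}
```

A real connector would pick a much longer value and would have to decide
what to do with connections that are legitimately idle for a long time.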
I was thinking about what we can do for a more general solution - and
looking at the code I think there are a few easy things that need to be
done related to thread pools:
- use a common thread pool for all endpoints ( right now ajp12 is creating
a full thread pool for itself, etc).
- document the thread pool implementation, remove some old code
- add an admin page to monitor the thread pools
- use the thread pool to run session expiry and verify the expiry bugs
- add a ThreadPoolListener/Event to allow a (future) module to monitor and
manage the thread pool
- add a field to store the current "owner" or "user" of each thread.
- add some log messages for the case when the thread pool is at the
maximum capacity
- maybe provide a spare "admin" thread that can be used to "un-hang" the
server without restarting tomcat ( i.e. if the thread pool is at max
capacity, and if the connector detects a localhost connection allow it to
create an extra thread - so an admin application can kill threads ).
- API change in ThreadPool - allow it to run a normal Runnable ( the current
ThreadPoolRunnable has some nice performance tricks, but it should be
usable for normal tasks that don't want to take advantage of overhead-free
thread data )
None of those would resolve the DOS problem, but I think it would be nice
to have them (and very easy to implement - without affecting the current
functionality )
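A rough sketch of three of the items above (plain Runnable support, an
"owner" per thread, and a listener that fires when the pool is at maximum
capacity). It is built on java.util.concurrent.ThreadPoolExecutor rather
than on tomcat's own ThreadPool, and PoolListener, OwnedTask and
MonitoredPool are hypothetical names, so treat it as an illustration of the
idea and not as the proposed API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical listener that a monitoring module could implement. */
interface PoolListener {
    void poolExhausted(int maxThreads, Map<String, String> owners);
}

/** A task that names its "owner", e.g. the connector it is serving. */
final class OwnedTask implements Runnable {
    final String owner;
    private final Runnable body;
    OwnedTask(String owner, Runnable body) { this.owner = owner; this.body = body; }
    @Override public void run() { body.run(); }
}

class MonitoredPool extends ThreadPoolExecutor {
    // thread name -> owner of the task that thread is currently running
    private final Map<String, String> owners = new ConcurrentHashMap<>();

    MonitoredPool(int maxThreads, PoolListener listener) {
        // SynchronousQueue: no queueing, so a full pool rejects immediately.
        super(maxThreads, maxThreads, 60, TimeUnit.SECONDS, new SynchronousQueue<>());
        // In this sketch the rejected task is dropped; the listener is only told about it.
        setRejectedExecutionHandler((task, pool) ->
                listener.poolExhausted(maxThreads, new HashMap<>(owners)));
    }

    @Override protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        owners.put(t.getName(), r instanceof OwnedTask ? ((OwnedTask) r).owner : "unknown");
    }

    @Override protected void afterExecute(Runnable r, Throwable failure) {
        owners.remove(Thread.currentThread().getName());
        super.afterExecute(r, failure);
    }

    /** Demo: hang both threads, submit a third task, return what the listener saw. */
    static Map<String, String> demo() {
        Map<String, String> seen = new ConcurrentHashMap<>();
        MonitoredPool pool = new MonitoredPool(2, (max, snapshot) -> seen.putAll(snapshot));
        CountDownLatch started = new CountDownLatch(2);
        CountDownLatch release = new CountDownLatch(1);
        Runnable hang = () -> {
            started.countDown();
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            pool.execute(new OwnedTask("http", hang));
            pool.execute(new OwnedTask("ajp12", hang));
            started.await();           // both pool threads are now busy
            pool.execute(() -> { });   // plain Runnable; pool is full, listener fires
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("owners when the pool was exhausted: " + demo());
    }
}
```

An admin page or a log message at maximum capacity could both be driven
from the same listener callback, since it receives the list of current
owners.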
Costin
On Tue, 20 Mar 2001, Tal Dayan wrote:
> Hi,
>
> Our first priority is to make Tomcat work under normal conditions with
> well-intentioned users. We will
> worry later about DOS (as long as we don't introduce new vulnerabilities).
>
> Yes, we tested the timeout patch all day yesterday with a production system
> with real users and normal load and
> all the hanging threads and connections were cleaned up perfectly (we are
> using 'netstat' to
> get the number of HTTP connections and 'ps' to get the number of threads and
> all is graphed around the clock by MRTG). We are running with a relatively
> long timeout of 5 minutes (50*60*1000) just to be on the safe side but a shorter
> one can be used.
>
> Note that I am in no way an expert in Tomcat nor do I claim to understand
> all the implications of the patch so we need some qualified person to
> understand the implication and make sure it does not break anything. For
> example, if another service like Ajp uses this connection pool for long term
> connections that need to wait long periods for data from some client (e.g.
> a front end web server, client side applets, etc), the patch will break it.
> The patch assumes that the connections should never be idle for a long
> period of time.
>
> As for Apache, it supports a request timeout (see
> http://httpd.apache.org/docs/mod/core.html#timeout) and this will
> eventually clean up hanging connections. The timeout in this case is
> longer because it is for the entire request and the cleanup will be slower.
>
> Implementing a similar query timeout in Tomcat may require things like
> asynchronous thread kill (yuck) or some synchronous termination of the
> worker threads, for example by closing the sockets they are waiting on (I
> think this will release the socketRead() but I am not sure). But this is of
> course a more involved change than simply adding the timeout statement.
> Having an asynchronous I/O may help here (see Bug Parade
> http://developer.java.sun.com/developer/bugParade/bugs/4075058.html).
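On the "closing the sockets they are waiting on" idea: the sketch below
tries it on a loopback connection. It relies on the documented behaviour of
java.net.Socket.close(), where a thread blocked in an I/O operation on that
socket gets a SocketException. CloseToUnblockSketch and demo are made-up
names, and the 200 ms sleep is only there to let the worker reach read();
this is not how a real watchdog would be scheduled.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;
import java.util.concurrent.atomic.AtomicBoolean;

class CloseToUnblockSketch {

    /** Returns true if closing the socket released a worker blocked in read(). */
    static boolean demo() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket silentClient = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {

            AtomicBoolean released = new AtomicBoolean(false);
            Thread worker = new Thread(() -> {
                try {
                    accepted.getInputStream().read();   // blocks: the client sends nothing
                } catch (SocketException e) {
                    released.set(true);                 // woken up by close()
                } catch (IOException e) {
                    // some other I/O failure; not the case being demonstrated
                }
            });
            worker.start();

            Thread.sleep(200);    // give the worker time to block in read()
            accepted.close();     // the "watchdog" closes the socket from outside
            worker.join(2000);    // the worker should finish promptly
            return released.get() && !worker.isAlive();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("worker released by close(): " + demo());
    }
}
```

This only helps with threads stuck in socket I/O; a servlet that hangs in
its own code is not released this way.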
>
> BTW, does Tomcat 4.X have the same problem ?
>
> Tal
>
>
>
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> > Sent: Tuesday, March 20, 2001 7:03 AM
> > To: [EMAIL PROTECTED]
> > Subject: Re: Bug 1006, what's next ?
> >
> >
> > Hi,
> >
> > I had a (long) weekend without computers. But I still found one and read
> > the mail once - and your report is very serious and important ( and not
> > easy to fix ). You have (at least ) my full attention. The read
> > timeout will be checked in soon - but the general problem ( with a servlet
> > hanging a thread ) is very hard to resolve (or I don't know any good
> > solution ).
> >
> > We could stop setting an upper limit on the thread count ( we still have
> > the OS upper limit ), and we could also use the (dangerous,
> > deprecated) suspend/terminate on the thread that is taking too much time.
> >
> > Have you tried any fix ? The timeout will not resolve the "bursts" ( and
> > high-loaded servers ) - unless it is very short.
> >
> > BTW, this is not a tomcat-specific problem ( I would guess Apache does
> > have the same issue - and we need to find how they deal with that ).
> >
> > Costin
> >
> >
> > On Tue, 20 Mar 2001, Tal Dayan wrote:
> >
> > >
> > > Two days ago I filed a bug report regarding a server hanging problem in
> > > PoolTcpEndpoint of Tomcat 3.x. The bug is
> > > at http://nagoya.apache.org/bugzilla/show_bug.cgi?id=1006 and
> > also includes a
> > > suggestion for a patch.
> > >
> > > Since then I did not notice any change in the status or
> > resolution of the
> > > bug report nor any
> > > indication that it got anywhere.
> > >
> > > 1. Is this the right place to file the bug ?
> > >
> > > 2. Is the bug filed correctly ?
> > >
> > > 3. Should I do anything else to make sure the bug gets the
> > attention of the
> > > relevant maintainers ?
> > >
> > > Thanks,
> > >
> > > Tal
> > >
> > >
> >
> >
>