Re: mod_jk does not detect a hung Tomcat

2003-09-25 Thread David Rees
Henri Gomez said:

 This won't work with the pre-fork MPM, since each Apache child will have
 its own idea of the timing.  The only way that it could tell that a Tomcat
 failed is to try the request and fail :).


The idea needs to be fleshed out some more. But we should be able to track
enough data about how a worker is performing to make some simple

 Well, the code is ready: native (jk) and Java (tc 3.3 + jtc), and I've
 updated the documentation.

 Added PING/PONG support in ajp13, and made use of select() to
 determine if there is something to read.

 I attached a copy of jk_ajp_common.c for review, but I'd like to commit
 the code before HEAD diverges. Since the new functionality is off
 by default, it shouldn't hurt ;)

You work fast, Henri.  ;-)

If you want, I can test it out tomorrow.  If you can give me patches for
Tomcat 4.1.27 and mod_jk, I'll see if I can get it compiled and run some
quick tests.

-Dave

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: mod_jk does not detect a hung Tomcat

2003-09-25 Thread Henri Gomez
Bill Barker wrote:


BTW, is the select() call supported on WIN32?



Strangely enough, it seems that select(int, fd_set *, fd_set *, fd_set *,
const timeval *) is actually supported by MS.  However, it seems that you
need to use MS's weird error codes to handle errors.

OK, I'll take a look at APR, which is the definitive guide
for OS portability.
Which makes me think that we should one day discuss the use of APR
in jk/jk2.
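
As a rough illustration of the error-code difference mentioned above, the sketch below hides it behind two small helpers. It assumes Winsock's WSAGetLastError() and WSAEINTR on Windows and plain errno elsewhere. The helper names last_socket_error() and socket_error_is_interrupt() are made up for this example and are not part of jk or APR, which has its own portability layer.

```c
#include <errno.h>
#ifdef _WIN32
#include <winsock2.h>
#endif

/* Hypothetical helper: error code of the most recent failed socket call. */
static int last_socket_error(void)
{
#ifdef _WIN32
    return WSAGetLastError();   /* Winsock does not report through errno */
#else
    return errno;
#endif
}

/* Hypothetical helper: was the call merely interrupted, so a retry is safe? */
static int socket_error_is_interrupt(int err)
{
#ifdef _WIN32
    return err == WSAEINTR;
#else
    return err == EINTR;
#endif
}
```

With helpers like these, the code that calls select() can stay the same on both platforms and only the error lookup differs.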




Re: mod_jk does not detect a hung Tomcat

2003-09-25 Thread Henri Gomez
David Rees wrote:

Henri Gomez said:

This won't work with the pre-fork MPM, since each Apache child will have
its own idea of the timing.  The only way that it could tell that a Tomcat
failed is to try the request and fail :).


The idea needs to be fleshed out some more. But we should be able to track
enough data about how a worker is performing to make some simple

Well, the code is ready: native (jk) and Java (tc 3.3 + jtc), and I've
updated the documentation.
Added PING/PONG support in ajp13, and made use of select() to
determine if there is something to read.
I attached a copy of jk_ajp_common.c for review, but I'd like to commit
the code before HEAD diverges. Since the new functionality is off
by default, it shouldn't hurt ;)


You work fast, Henri.  ;-)

The blocking read() has needed fixing for years :)

If you want, I can test it out tomorrow.  If you can give me patches for
Tomcat 4.1.27 and mod_jk, I'll see if I can get it compiled and run some
quick tests.

If nobody objects, I'll commit it today; I'm just waiting for the jtc
committers to confirm, since jk 1.2.5 should be released first.
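
For readers following the thread, here is a minimal C sketch of the kind of check discussed here: send a ping, then use select() with a timeout so that a hung peer cannot block the caller forever. It is not the code from jk_ajp_common.c. The one-byte PROBE_PING and PROBE_PONG messages and the function names wait_readable() and probe_backend() are assumptions made for illustration, and the real AJP13 packets are framed differently.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical one-byte probe messages; NOT the real AJP13 wire format. */
#define PROBE_PING 'P'
#define PROBE_PONG 'O'

/* Wait until fd has data to read.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 * A retry after EINTR restarts the full timeout (kept simple on purpose). */
static int wait_readable(int fd, int timeout_ms)
{
    fd_set rset;
    struct timeval tv;
    int rc;

    do {
        FD_ZERO(&rset);
        FD_SET(fd, &rset);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        rc = select(fd + 1, &rset, NULL, NULL, &tv);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}

/* Send a ping and wait for the pong.
 * Returns 1 if the peer answered, 0 if it stayed silent for timeout_ms
 * (treat it as hung), -1 on I/O error or an unexpected reply. */
static int probe_backend(int fd, int timeout_ms)
{
    char ping = PROBE_PING;
    char reply;
    int ready;

    if (write(fd, &ping, 1) != 1)
        return -1;

    ready = wait_readable(fd, timeout_ms);
    if (ready <= 0)
        return ready;

    if (read(fd, &reply, 1) != 1)
        return -1;              /* EOF or read error */
    return reply == PROBE_PONG ? 1 : -1;
}
```

A caller would run probe_backend() on a connection before sending the real request, and move on to the next worker when it returns 0 or -1. Because only the probe has a timeout, long-running servlets are not affected once the request itself has been sent.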




jk 1.2.5 release : Was: mod_jk does not detect a hung Tomcat

2003-09-25 Thread Henri Gomez
Has jk 1.2.5 been tagged, so that I can commit my modifications
for PING/PONG in jtc?




Re: jk 1.2.5 release : Was: mod_jk does not detect a hung Tomcat

2003-09-25 Thread Bill Barker
For me, doing a cvs log doesn't mention a 1.2.5 tag.

- Original Message - 
From: Henri Gomez [EMAIL PROTECTED]
To: Tomcat Developers List [EMAIL PROTECTED]
Sent: Thursday, September 25, 2003 12:25 AM
Subject: jk 1.2.5 release : Was: mod_jk does not detect a hung Tomcat


 Has jk 1.2.5 been tagged, so that I can commit my modifications
 for PING/PONG in jtc?
 
 
 
 
 


[Fwd: Re: mod_jk does not detect a hung Tomcat]

2003-09-25 Thread Glenn Nielsen
Bill Barker wrote:
- Original Message -
From: Glenn Nielsen [EMAIL PROTECTED]
To: Tomcat Developers List [EMAIL PROTECTED]
Sent: Wednesday, September 24, 2003 12:28 PM
Subject: Re: mod_jk does not detect a hung Tomcat


Henri Gomez wrote:

David Rees wrote:


Henri Gomez said:


Henri Gomez wrote:


Nope, since you have to test not just at the protocol level but also at a
higher level, for instance checking the full chain, up to servlet
handling.


It's easy to simulate this behavior by sending a STOP signal to
Tomcat.
I've also attached a log from mod_jk showing the problem.  I marked
the point at which processing in mod_jk stopped until I sent a CONT
signal to tomcat.
Does mod_jk2 have this same problem?  Is there any interest in
fixing this? Does anyone have a workaround for this issue?


Well, if you have a hung Tomcat, you're probably already in serious
trouble.

No, actually in my case I wasn't.  I had two Tomcats running, as one
was prone to locking up due to a JVM or application bug.  With a 50-50 load
distribution between two Tomcats, this left me with 1/2 of the requests
getting stuck and clients waiting forever and tying up Apache
processes. Eventually, a DOS will be the result if action is not taken
in time.  If mod_jk noticed it wasn't really alive, this wouldn't be an
issue at all.


Anyway, if we add something like a timeout on ajp requests, you could be
stuck with long-running servlets. Also, jk reads requests in blocking
mode for performance, and adding a timeout here is not an option.

Agreed that we wouldn't want a timeout normally to handle normal long
running servlet processes, but if there was a PING/PONG added to the
protocol there should be a timeout to prevent the above situation.


When I worked on the ajp13++ (ajp14) protocol, I added a more secure auth
mechanism at connection time.
Since there is bidirectional communication, jk could detect that
even if the connection is open, the remote didn't respond, and so fall
back to the next one in the cluster configuration.
But on already established connections, the problem persists.

Or we should add a PING/PONG before sending any request to Tomcat.

It could be done as an option, but I'd work on it only if many users make
such requirements


if many users ask for such a feature ;)


Well, you've got one so far.  ;-)  Adding a configurable option to have
mod_jk verify (PING/PONG) that Tomcat is actually responding before
using the connection would solve the problem and I can't imagine that it
would add a lot of complexity to the code as well.  If I wasn't so rusty
with my C programming and had some spare time, I would offer to help code it
up. ;-)  In any case, I'll be more than happy to help test.


Well, if you could find more users or at least one Tomcat committer
(Glenn, Remy, Costin, JFC...) who needs it, I'll add the necessary code
in the Java and C areas ;)


There may be a simple way to achieve what David is asking for without
setting a request timeout or implementing a PING/PONG between mod_jk
and Tomcat.
What if each worker tracked the number of requests which were handled
by the worker since the last successful completion of a request?
i.e. add the following to a worker:

worker->last_completed  // Time in seconds since last successfully
                        // completed request

worker->requests_since_last_completed  // Number of requests sent to worker
                                       // since last successful completion.

Then logic could be added to try and detect an instance of Tomcat which
has failed.  Perhaps even allow several additional worker properties to
determine when mod_jk should consider the worker failed.


This won't work  with the pre-fork MPM, since each Apache child will have
its own idea of the timing.  The only way that it could tell that a Tomcat
failed is to try the request and fail :).

Argh, you are right, this goes back to the age-old problem of not being able
to write a global worker connection pool or shared memory with the current code.
The only way to move forward would be to rewrite mod_jk 1.2 to use APR.

Glenn

--
Glenn Nielsen [EMAIL PROTECTED] | /* Spelin donut madder|
MOREnet System Programming   |  * if iz ina coment.  |
Missouri Research and Education Network  |  */   |
--
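
The per-worker tracking idea quoted in this message can be sketched in a few lines of C. This is only an illustration, not existing mod_jk code: the struct and function names are hypothetical, the two fields are named after the ones proposed above, and the thresholds max_pending and max_idle_secs stand in for the "additional worker properties" that were left unspecified. The sketch stores a timestamp of the last completion rather than an elapsed-seconds counter. As noted in the reply, with the pre-fork MPM each Apache child would hold its own private copy of this state.

```c
#include <time.h>

/* Hypothetical per-worker bookkeeping; not an existing mod_jk structure. */
typedef struct {
    time_t   last_completed;                 /* when a request last completed OK */
    unsigned requests_since_last_completed;  /* requests sent since then */
} worker_health;

/* Call when a request is handed to the worker. */
static void worker_request_sent(worker_health *w)
{
    w->requests_since_last_completed++;
}

/* Call when the worker has returned a complete response. */
static void worker_request_completed(worker_health *w, time_t now)
{
    w->last_completed = now;
    w->requests_since_last_completed = 0;
}

/* Heuristic: the worker is suspect when at least max_pending requests were
 * sent and none has completed for at least max_idle_secs seconds.
 * Both thresholds are assumed tunables, not real worker properties. */
static int worker_looks_failed(const worker_health *w, time_t now,
                               unsigned max_pending, double max_idle_secs)
{
    return w->requests_since_last_completed >= max_pending
        && difftime(now, w->last_completed) >= max_idle_secs;
}
```

Requiring both conditions keeps a single long-running servlet from marking a worker as failed, since one slow request alone never reaches max_pending.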




Re: mod_jk does not detect a hung Tomcat

2003-09-23 Thread Henri Gomez
David Rees wrote:

I posted this to tomcat-users last week, but didn't get a reply...  I'm
hoping to get some feedback from the connectors developers on this issue I
occasionally run into...
I've got a setup where I've got two load balanced Tomcats running off of
Apache and mod_jk.
I've got a problem where one of the Tomcats will occasionally hang, but
not die.  This means that it will accept new connections, but will not
actually process anything.  This renders all clients using the hung Tomcat
completely stuck as they are not switched over to the other Tomcat.
mod_jk seems to assume that if it can connect to Tomcat, it must be ready
to respond to requests.

Yes, if jk can connect to Tomcat, it assumes Tomcat can handle the requests.

It seems that some sort of connection test (with a short socket timeout)
would be appropriate to validate that the connection is actually
responding.  While this would increase the latency of each request a bit,
it would improve the reliability.  Is there any provision in the AJP13
protocol to allow for testing of connections before sending a request over
it?

Nope, since you have to test not just at the protocol level but also at a
higher level, for instance checking the full chain, up to servlet handling.

It's easy to simulate this behavior by sending a STOP signal to Tomcat.

I've also attached a log from mod_jk showing the problem.  I marked the
point at which processing in mod_jk stopped until I sent a CONT signal to
tomcat.
Does mod_jk2 have this same problem?  Is there any interest in fixing
this? Does anyone have a workaround for this issue?

Well, if you have a hung Tomcat, you're probably already in serious
trouble.

Anyway, if we add something like a timeout on ajp requests, you could be stuck
with long-running servlets. Also, jk reads requests in blocking mode for
performance, and adding a timeout here is not an option.





Re: mod_jk does not detect a hung Tomcat

2003-09-23 Thread David Rees
Henri Gomez said:
 Henri Gomez wrote:
 Nope, since you have to test not just at the protocol level but also at a
 higher level, for instance checking the full chain, up to servlet
 handling.

 It's easy to simulate this behavior by sending a STOP signal to
 Tomcat.

 I've also attached a log from mod_jk showing the problem.  I marked
 the point at which processing in mod_jk stopped until I sent a CONT
 signal to tomcat.

 Does mod_jk2 have this same problem?  Is there any interest in fixing
 this? Does anyone have a workaround for this issue?

 Well, if you have a hung Tomcat, you're probably already in serious
 trouble.

No, actually in my case I wasn't.  I had two Tomcats running, as one was
prone to locking up due to a JVM or application bug.  With a 50-50 load
distribution between two Tomcats, this left me with 1/2 of the requests
getting stuck and clients waiting forever and tying up Apache processes. 
Eventually, a DOS will be the result if action is not taken in time.  If
mod_jk noticed it wasn't really alive, this wouldn't be an issue at all.

 Anyway, if we add something like a timeout on ajp requests, you could be
 stuck with long-running servlets. Also, jk reads requests in blocking
 mode for performance, and adding a timeout here is not an option.

Agreed that we wouldn't want a timeout normally to handle normal long
running servlet processes, but if there was a PING/PONG added to the
protocol there should be a timeout to prevent the above situation.

 When I worked on the ajp13++ (ajp14) protocol, I added a more secure auth
 mechanism at connection time.

 Since there is bidirectional communication, jk could detect that
 even if the connection is open, the remote didn't respond, and so fall
 back to the next one in the cluster configuration.

 But on already established connections, the problem persists.

 Or we should add a PING/PONG before sending any request to Tomcat.

 It could be done as an option, but I'd work on it only if many users make
 such requirements

 if many users ask for such a feature ;)

Well, you've got one so far.  ;-)  Adding a configurable option to have
mod_jk verify (PING/PONG) that Tomcat is actually responding before using
the connection would solve the problem and I can't imagine that it would
add a lot of complexity to the code as well.  If I wasn't so rusty with my
C programming and had some spare time, I would offer to help code it up. 
;-)  In any case, I'll be more than happy to help test.

Thanks,
Dave



-Dave




Re: mod_jk does not detect a hung Tomcat

2003-09-23 Thread Henri Gomez
David Rees wrote:
Henri Gomez said:

Henri Gomez wrote:

Nope, since you have to test not just at the protocol level but also at a
higher level, for instance checking the full chain, up to servlet
handling.

It's easy to simulate this behavior by sending a STOP signal to
Tomcat.
I've also attached a log from mod_jk showing the problem.  I marked
the point at which processing in mod_jk stopped until I sent a CONT
signal to tomcat.
Does mod_jk2 have this same problem?  Is there any interest in fixing
this? Does anyone have a workaround for this issue?
Well, if you have a hung Tomcat, you're probably already in serious
trouble.


No, actually in my case I wasn't.  I had two Tomcats running, as one was
prone to locking up due to a JVM or application bug.  With a 50-50 load
distribution between two Tomcats, this left me with 1/2 of the requests
getting stuck and clients waiting forever and tying up Apache processes. 
Eventually, a DOS will be the result if action is not taken in time.  If
mod_jk noticed it wasn't really alive, this wouldn't be an issue at all.


Anyway, if we add something like a timeout on ajp requests, you could be
stuck with long-running servlets. Also, jk reads requests in blocking
mode for performance, and adding a timeout here is not an option.


Agreed that we wouldn't want a timeout normally to handle normal long
running servlet processes, but if there was a PING/PONG added to the
protocol there should be a timeout to prevent the above situation.

When I worked on the ajp13++ (ajp14) protocol, I added a more secure auth
mechanism at connection time.
Since there is bidirectional communication, jk could detect that
even if the connection is open, the remote didn't respond, and so fall
back to the next one in the cluster configuration.
But on already established connections, the problem persists.

Or we should add a PING/PONG before sending any request to Tomcat.

It could be done as an option, but I'd work on it only if many users make
such requirements
if many users ask for such a feature ;)


Well, you've got one so far.  ;-)  Adding a configurable option to have
mod_jk verify (PING/PONG) that Tomcat is actually responding before using
the connection would solve the problem and I can't imagine that it would
add a lot of complexity to the code as well.  If I wasn't so rusty with my
C programming and had some spare time, I would offer to help code it up. 
;-)  In any case, I'll be more than happy to help test.

Well, if you could find more users or at least one Tomcat committer
(Glenn, Remy, Costin, JFC...) who needs it, I'll add the necessary code
in the Java and C areas ;)




mod_jk does not detect a hung Tomcat

2003-09-22 Thread David Rees
I posted this to tomcat-users last week, but didn't get a reply...  I'm
hoping to get some feedback from the connectors developers on this issue I
occasionally run into...

I've got a setup where I've got two load balanced Tomcats running off of
Apache and mod_jk.

I've got a problem where one of the Tomcats will occasionally hang, but
not die.  This means that it will accept new connections, but will not
actually process anything.  This renders all clients using the hung Tomcat
completely stuck as they are not switched over to the other Tomcat.

mod_jk seems to assume that if it can connect to Tomcat, it must be ready
to respond to requests.

It seems that some sort of connection test (with a short socket timeout)
would be appropriate to validate that the connection is actually
responding.  While this would increase the latency of each request a bit,
it would improve the reliability.  Is there any provision in the AJP13
protocol to allow for testing of connections before sending a request over
it?

It's easy to simulate this behavior by sending a STOP signal to Tomcat.

I've also attached a log from mod_jk showing the problem.  I marked the
point at which processing in mod_jk stopped until I sent a CONT signal to
tomcat.

Does mod_jk2 have this same problem?  Is there any interest in fixing
this? Does anyone have a workaround for this issue?

Thanks,
Dave
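
The STOP/CONT reproduction described in this message can also be driven from a small C test harness. The sketch below uses the POSIX kill() and waitpid() calls. The helper names freeze_process(), thaw_process() and wait_until_stopped() are hypothetical, and wait_until_stopped() only works when the target is a child of the calling process.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Suspend a process. It keeps its listening sockets, so the kernel can still
 * complete new TCP connections (up to the listen backlog), but the process
 * itself never reads a request or sends a reply. */
static int freeze_process(pid_t pid)
{
    return kill(pid, SIGSTOP);
}

/* Resume a process previously suspended with freeze_process(). */
static int thaw_process(pid_t pid)
{
    return kill(pid, SIGCONT);
}

/* For a child of the caller: block until its next state change is reported.
 * Returns 1 if it is stopped, 0 if it changed state some other way,
 * -1 on error. */
static int wait_until_stopped(pid_t child)
{
    int status;

    if (waitpid(child, &status, WUNTRACED) < 0)
        return -1;
    return WIFSTOPPED(status) ? 1 : 0;
}
```

Calling freeze_process() with the Tomcat JVM's process ID puts it in the "accepts connections but processes nothing" state, and thaw_process() lets the stalled requests continue.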
