On 09/07/2008 12:43 PM, Rainer Jung wrote:
Ruediger Pluem wrote:

On 09/06/2008 10:54 PM, Rainer Jung wrote:
Rüdiger Plüm wrote:


But in case the user of the connection knows that it's broken and
closes it, it would make more sense to put it under the stack, since it
is no longer connected.
Why? IMHO this only defers the effort that needs to be done anyway to
create a new TCP connection. And keep in mind that even if I put the
faulty connection back *under* the stack, the next connection on top of the
stack has been idle even longer than the one that was faulty. So it is likely
to be faulty as well. It might be the case, though, that this faultiness is
detected earlier (in the TCP connection check) and thus our next CPING/CPONG
in the loop happens on a fresh, working TCP connection.

Yes, I think that's the question: if one CPING fails, what do you want
to do with the remaining connections in the pool?

- do you assume they are broken as well and close them directly (and
afterwards try to open a new one)
- do you want to test them immediately (and if one of them is still OK
maybe end the testing and use it)
- you don't care and simply try to open a new one.

Concerning the argument that the next connection on the stack would have
been idle even longer: yes, unless another thread returned one to the pool
in the meantime.

Of course. This could have happened, but as this connection needs to be fixed
sooner or later anyway, I think I should do it now.
I guess this is a matter of assumptions and probability:
If you assume that a healthy connection was put back in the reslist by another
thread and that the broken connection won't be used in the near future
anyway, then the approach of getting another one from the reslist makes sense.
If you assume that you get the broken one back anyway and that it will be needed
in the near future, the other approach makes sense.
As you notice, I am leaning towards the second assumption :-).


My personal opinion (based on mod_jk's connection handling code): CPING
failure is very rare. In most cases it indicates that a firewall has dropped
the connection; in most remaining cases the backend is in a very

I recently had a situation on one of my systems (JBoss 4.0.x with Tomcat 5.5,
classic connector) where this wasn't true, AFAICT. Both httpd
and JBoss are on the same box, so there was definitely no firewall / network
issue that caused the problem. CPINGs failed with a timeout, but new
connections worked fine, I couldn't find any blocked processor threads
in the thread dump, and neither load nor GC activity was high. That was the
starting point for my patch, as I thought that a failed CPING should not pass
a final verdict on the request, but should trigger one more try. I fixed this
temporarily with a somewhat tricky LB configuration over the one backend with
one retry attempt. But to be honest, I do not know what caused this strange
situation. As the JBoss version, and thus the Tomcat version, is quite aged,
it is possible that there is a bug in the classic connector that has been
fixed in the meantime. But I am drifting off the subject of the thread here.

badly broken state. Both situations would result in nearly all remaining
connections being broken as well, but not necessarily all (in the
firewall case, there might be fewer idle connections coming back from
other threads). So a good reaction to a CPING failure would be a pool-wide
connection check and using a new connection.

In general I agree, but doing this in the scope of the request is IMHO too
time-consuming and expensive.


If you are afraid that the check of all connections in the pool takes
too long (maybe running into TCP timeouts), you could directly try a

That is what the patch does.

fresh connection, and set an indicator in the pool that the maintenance
task, which usually only looks for idle connections to close,
should additionally do a function check on all connections. That would
be non-critical concerning latency for requests, once maintenance runs
decoupled from requests (as Mladen suggested, either in a
separate thread or using the monitor hook).

This seems like a nice idea once we have some kind of maintenance "thread".
But I am not sure how this can be done with the current reslist implementation
because of its stack character. Keep in mind that we cannot extend the API of
the reslist until APR-UTIL 1.4.0 and cannot change it until APR-UTIL 2.0.
So this could prove to be tricky.

One question, as you are more familiar with the AJP server code on the
Tomcat side:
If a connector closes down a connection due to its idleness, does it send
any kind of AJP shutdown packet over the TCP connection, or does it just
close the socket as in the HTTP keepalive case?

Unfortunately the AJP13 protocol is a little weak on connection
handling. There is no message indicating the backend shuts down its
connection. So it just closes the socket.

Thanks for pointing this out. This is all I wanted to know. If there were
unread data (an AJP connection-close packet), the detection of whether the
remote side closed the socket wouldn't work.

Regards

Rüdiger
