[ 
https://issues.apache.org/activemq/browse/AMQ-771?page=comments#action_36794 ] 
            
Kevin Yaussy commented on AMQ-771:
----------------------------------

Rob,

The fixes I have for the 4.0.1 release for this issue rely upon the fix I made 
to DemandForwardingBridgeSupport for issue [ 
https://issues.apache.org/activemq/browse/AMQ-776?page=all ].

As I reported in AMQ-776, though, the 4.0.1 fix does not resolve the problem 
on 4.0.2.  I will have to track down what is wrong there and get AMQ-776 fixed 
for 4.0.2 first.

At any rate, I should describe this issue better:

TransportConnection::stop attempting to send something on the socket before 
closing is only part of the issue.  More generally, there are problems with 
the fact that FailoverTransport is decorated by MutexTransport.  When 
publishing to a consumer across two brokers, if the consumer-side broker is 
frozen and the socket fills up, the FailoverTransport (InactivityMonitor) 
attempts to close down the connection.  This fails, because everything is 
blocked on the MutexTransport lock.  See the scenario below for how to 
recreate the problem.
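
To illustrate the blocking pattern, here is a minimal Java sketch.  It is not 
ActiveMQ code and not the patch: MutexStopSketch, blockedSend, stop and demo 
are hypothetical names, the latch stands in for a socket write that never 
returns, and the bounded tryLock is only one assumed way to keep stop from 
hanging behind a stuck sender.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a mutex-decorated transport; not ActiveMQ code. */
class MutexStopSketch {
    private final ReentrantLock writeLock = new ReentrantLock();
    private volatile boolean closed = false;
    private volatile boolean shutdownSent = false;

    /** Simulates a send that blocks while holding the lock (full socket buffer). */
    void blockedSend(CountDownLatch lockHeld, CountDownLatch release)
            throws InterruptedException {
        writeLock.lock();
        try {
            lockHeld.countDown();
            release.await(); // stands in for a socket write that never returns
        } finally {
            writeLock.unlock();
        }
    }

    /**
     * Waits a bounded time for the lock before the "shutdown" send, then
     * closes regardless. Returns true only if the shutdown step was reached.
     */
    boolean stop(long waitMillis) throws InterruptedException {
        boolean gotLock = false;
        try {
            gotLock = writeLock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
            if (gotLock) {
                shutdownSent = true; // a real transport would send "shutdown" here
            }
        } finally {
            if (gotLock) {
                writeLock.unlock();
            }
            closed = true; // close without needing the lock
        }
        return gotLock;
    }

    boolean isClosed() { return closed; }
    boolean wasShutdownSent() { return shutdownSent; }

    /** Runs the stuck-sender case; returns {shutdownSent, closed}. */
    static boolean[] demo() {
        MutexStopSketch t = new MutexStopSketch();
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread sender = new Thread(() -> {
            try {
                t.blockedSend(lockHeld, release);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        sender.setDaemon(true);
        sender.start();
        try {
            lockHeld.await();       // sender now holds the lock and is "stuck"
            t.stop(100);            // times out instead of blocking forever
            release.countDown();    // let the simulated sender finish
            sender.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new boolean[] { t.wasShutdownSent(), t.isClosed() };
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("shutdownSent=" + r[0] + " closed=" + r[1]);
    }
}
```

With an unconditional lock() in stop, the calling thread would wait behind the 
stuck sender for as long as the write stays blocked, which is the hang 
described above.  In this sketch the close step never depends on the lock.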

The changes I made are rather surgical, just enough to make it work.  I am not 
particularly happy with them, but maybe they are acceptable.  I will attach 
patches as soon as I get AMQ-776 working.  The changed source files are:
org.apache.activemq.transport.MutexTransport
org.apache.activemq.transport.failover.FailoverTransport
org.apache.activemq.transport.tcp.TcpTransport
org.apache.activemq.broker.TransportConnection

I have not run the changes against all unit tests, so similar changes may be 
required in other files (e.g. transports other than TcpTransport).


Scenario (using ConsumerTool and ProducerTool from examples):
-Start broker A
-Start broker B
-Start consumer, on FOO, attaching to broker B (failover transport, only broker 
B)
-Start publisher, on FOO, publishing large messages, such as 10K bytes, 
attaching to broker A (failover transport, only broker A)
-On Solaris, pstop broker B

Wait for the socket to fill up.  When broker A reports the dead connection, 
notice that it does not close off the connection properly.  Do a kill -3 on 
broker A and note that it is waiting on the MutexTransport lock and 
FailoverTransport can't close off the connection.


> org.apache.activemq.broker.TransportConnection::stop should not attempt to 
> send a message over the connection.
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-771
>                 URL: https://issues.apache.org/activemq/browse/AMQ-771
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Connector
>    Affects Versions: 4.0.1, 4.0
>            Reporter: Kevin Yaussy
>         Assigned To: Rob Davies
>
> Especially when using "failover", there can be a problem with respect to 
> TransportConnection::stop attempting to send a "shutdown" message over the 
> connection.  If another thread is sending messages to the connection, and it 
> gets stuck for some reason, such as a network freeze, the target machine 
> panics, or the target process freezes for some reason, the 
> TransportConnection::dispatch will eventually block, locking the 
> MutexTransport object.  When the InactivityMonitor wakes up and detects that 
> the connection is dead, it will go through the process of stopping the 
> connection.  This goes back into TransportConnection, and calls stop, which 
> attempts to lock the MutexTransport so it can send the "shutdown" command.  
> Now, both threads are stuck, potentially for a long time, as a box panic will 
> not cleanly close the tcp connection.
> I'm not sure of the rationale for wanting to send a shutdown command to the 
> other side of the connection, since the target has to handle the connection 
> going down hard anyway.  Seems to me, if you are intending on closing the 
> connection, just close it - don't try to be nice to the other side.  
> Especially in this code path, there is something wrong with the other side 
> anyway.


        
