JIRA: https://issues.apache.org/jira/browse/AMQ-6666

I'm running now with a patched version without the "command.isMessage()" 
condition and fail-over behaviour has substantially improved.
Can't really judge whether it is the correct fix, however.

- Martin


On 30.04.2017 18:50, Tim Bain wrote:
It does seem like there needs to be another path out of that code block,
though I wonder if the expectation was that nothing that wasn't a message
would ever make it to that block. Either way, it seems to not be working
the way it should, so please submit a bug in JIRA for the problem.

Tim

On Apr 29, 2017 4:24 AM, "Martin Lichtin" <lich...@yahoo.com.invalid> wrote:

I'm running into a situation with the Failover Transport not respecting
the timeout that's been set.
The symptom is endless messages of this kind:

2017-04-29 09:48:26,128 | TRACE | .engine.cfgengine.in]-11 |
FailoverTransport                | sport.failover.FailoverTransport 615 |
81 - org.apache.activemq.activemq-osgi - 5.14.0 | Waiting for transport
to reconnect..: TransactionInfo {commandId = 127798, responseRequired =
true, type = 7, connectionId = ID:inucdev4-57330-1493370444659-3:3,
transactionId = XID:[1096044365,globalId=63747
26c6366672d656e67696e653130333530323030303034,branchId=63747
26c6366672d656e67696e6531313036383134]}
2017-04-29 09:48:26,228 | TRACE | .engine.cfgengine.in]-11 |
FailoverTransport                | sport.failover.FailoverTransport 615 |
81 - org.apache.activemq.activemq-osgi - 5.14.0 | Waiting for transport
to reconnect..: TransactionInfo {commandId = 127798, responseRequired =
true, type = 7, connectionId = ID:inucdev4-57330-1493370444659-3:3,
transactionId = XID:[1096044365,globalId=63747
26c6366672d656e67696e653130333530323030303034,branchId=63747
26c6366672d656e67696e6531313036383134]}
2017-04-29 09:48:26,329 | TRACE | .engine.cfgengine.in]-11 |
FailoverTransport                | sport.failover.FailoverTransport 615 |
81 - org.apache.activemq.activemq-osgi - 5.14.0 | Waiting for transport
to reconnect..: TransactionInfo {commandId = 127798, responseRequired =
true, type = 7, connectionId = ID:inucdev4-57330-1493370444659-3:3,
transactionId = XID:[1096044365,globalId=63747
26c6366672d656e67696e653130333530323030303034,branchId=63747
26c6366672d656e67696e6531313036383134]}
2017-04-29 09:48:26,429 | TRACE | .engine.cfgengine.in]-11 |
FailoverTransport                | sport.failover.FailoverTransport 615 |
81 - org.apache.activemq.activemq-osgi - 5.14.0 | Waiting for transport
to reconnect..: TransactionInfo {commandId = 127798, responseRequired =
true, type = 7, connectionId = ID:inucdev4-57330-1493370444659-3:3,
transactionId = XID:[1096044365,globalId=63747
26c6366672d656e67696e653130333530323030303034,branchId=63747
26c6366672d656e67696e6531313036383134]}
2017-04-29 09:48:26,530 | TRACE | .engine.cfgengine.in]-11 |
FailoverTransport                | sport.failover.FailoverTransport 615 |
81 - org.apache.activemq.activemq-osgi - 5.14.0 | Waiting for transport
to reconnect..: TransactionInfo {commandId = 127798, responseRequired =
true, type = 7, connectionId = ID:inucdev4-57330-1493370444659-3:3,
transactionId = XID:[1096044365,globalId=63747
26c6366672d656e67696e653130333530323030303034,branchId=63747
26c6366672d656e67696e6531313036383134]}
...
2017-04-29 09:48:33,270 | TRACE | .engine.cfgengine.in]-11 |
FailoverTransport                | sport.failover.FailoverTransport 615 |
81 - org.apache.activemq.activemq-osgi - 5.14.0 | Waiting for transport
to reconnect..: TransactionInfo {commandId = 127798, responseRequired =
true, type = 7, connectionId = ID:inucdev4-57330-1493370444659-3:3,
transactionId = XID:[1096044365,globalId=63747
26c6366672d656e67696e653130333530323030303034,branchId=63747
26c6366672d656e67696e6531313036383134]}
2017-04-29 09:48:33,371 | TRACE | .engine.cfgengine.in]-11 |
FailoverTransport                | sport.failover.FailoverTransport 615 |
81 - org.apache.activemq.activemq-osgi - 5.14.0 | Waiting for transport
to reconnect..: TransactionInfo {commandId = 127798, responseRequired =
true, type = 7, connectionId = ID:inucdev4-57330-1493370444659-3:3,
transactionId = XID:[1096044365,globalId=63747
26c6366672d656e67696e653130333530323030303034,branchId=63747
26c6366672d656e67696e6531313036383134]}


It seems to never get out of this loop:

     while (transport == null && !disposed && connectionFailure == null
             && !Thread.currentThread().isInterrupted() &&
willReconnect()) {

         LOG.trace("Waiting for transport to reconnect..: {}", command);
         long end = System.currentTimeMillis();
         if (command.isMessage() && timeout > 0 && (end - start > timeout))
{
             timedout = true;
             LOG.info("Failover timed out after {} ms", (end - start));
             break;
         }
         try {
             reconnectMutex.wait(100);
         } catch (InterruptedException e) {
             Thread.currentThread().interrupt();
             LOG.debug("Interupted:", e);
         }
         transport = connectedTransport.get();
     }

The timeout is set to 5000ms and should have hit a long time ago, but I
guess "command.isMessage()" returns false in this situation?
Any reason this condition is needed? It seems to me anything that can't be
sent within the timeout should cause the throw.

- Martin



Reply via email to