[ 
https://issues.apache.org/jira/browse/DISPATCH-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16540187#comment-16540187
 ] 

Chuck Rolke commented on DISPATCH-1008:
---------------------------------------

In diagnosing commit cc3ab655 I found that the issue was not the number of 
attempts to find the correct failover list. The failover list from Router B 
includes a *bogus host 'third-host'* whose name never resolves. Because the 
host name is a simple name with no domain part, a test system such as my 
laptop might spend a long time querying DNS (internet provider DNS, corporate 
VPN DNS, etc.) trying to resolve it. In my case the trace shows that Router A 
disconnects from Router B and reconnects to Router C about 4.5 seconds later. 
That is why the test takes so long on some systems.
 
3.233983  Frame 309  127.0.0.1:45172  -> 127.0.0.1:20902  ->  *flow* [0,2] (1,250)
3.381848  Frame 356  127.0.0.1:45172  -> 127.0.0.1:20902  ->  *close* [0]    <-- connection to B lost
7.936659  Frame 479  127.0.0.1:50038  -> 127.0.0.1:20904  ->  *init* SASL (3): (1.0.0)    <-- connection to C started
7.939116  Frame 481  127.0.0.1:50038 <-  127.0.0.1:20904 <-  *init* SASL (3): (1.0.0), *method* Method:
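
The resolver delay described above can be checked on a given test system with 
a short script. This is a minimal sketch, not part of the router or its tests: 
the function name time_resolution, the default port 5672 and the injectable 
resolver parameter are all assumptions made for illustration. It times one 
lookup and reports whether the name resolved.

```python
import socket
import time


def time_resolution(host, port=5672, resolver=socket.getaddrinfo):
    """Return (seconds_elapsed, resolved) for a single lookup of `host`.

    `resolver` defaults to socket.getaddrinfo; it is a parameter only so
    the function can be exercised without touching real DNS.
    """
    start = time.monotonic()
    try:
        resolver(host, port)
        resolved = True
    except socket.gaierror:
        # The name did not resolve. The elapsed time is how long the
        # resolver spent trying before it gave up.
        resolved = False
    return time.monotonic() - start, resolved


if __name__ == "__main__":
    # 'third-host' is the bogus name from the failover list.
    for name in ("localhost", "third-host"):
        elapsed, ok = time_resolution(name)
        print(f"{name}: resolved={ok} after {elapsed:.3f}s")
```

On a system with several DNS servers configured, the second lookup may account 
for much of the gap between the close at 3.381848 and the init at 7.936659 
(about 4.55 seconds).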
 

> Router should preserve original connection information when attempting to 
> make failover connections
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DISPATCH-1008
>                 URL: https://issues.apache.org/jira/browse/DISPATCH-1008
>             Project: Qpid Dispatch
>          Issue Type: Bug
>            Reporter: Ganesh Murthy
>            Assignee: Ganesh Murthy
>            Priority: Major
>             Fix For: 1.2.0
>
>         Attachments: broker-slave.xml, broker.xml, qdrouterd-failover.conf
>
>
> # Start artemis master and slave brokers and the router with the attached 
> config files.
>  # Notice that the router receives an open frame from the master broker with 
> the following failover information:
> {noformat}
> 2018-05-22 22:11:11.830106 -0230 SERVER (trace) [1]:0 <- @open(16) 
> [container-id="localhost", max-frame-size=4294967295, channel-max=65535, 
> idle-time-out=30000, 
> offered-capabilities=@PN_SYMBOL[:"sole-connection-for-container", 
> :"DELAYED_DELIVERY", :"SHARED-SUBS", :"ANONYMOUS-RELAY"], 
> properties={:product="apache-activemq-artemis", 
> :"failover-server-list"=[{:hostname="0.0.0.8", :scheme="amqp", :port=61617, 
> :"network-host"="0.0.0.0"}]"}]{noformat}
>  
>  # Now, kill the master broker and notice that the router correctly fails 
> over to the slave broker. But the slave broker does not provide any failover 
> information in its open frame and hence the router erases its original master 
> broker connection information
>  # When the master broker is now restarted and the slave broker is killed, 
> the router attempts to repeatedly connect only to the slave broker but never 
> attempts a connection to the master broker.
>  # If the router did not erase its failover list but preserved the original 
> master connection information, it would have connected to the master broker.
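
The behaviour the issue asks for can be sketched as follows. This is a 
hypothetical model only and not the router's actual implementation: the class 
FailoverTargets, its methods and the host names in the usage lines are 
assumptions made for illustration. The idea is that the originally configured 
connection is stored separately from the entries advertised in an open frame, 
so an open frame with no failover information cannot erase it.

```python
class FailoverTargets:
    """Hypothetical model of a connector's failover candidates."""

    def __init__(self, original):
        # `original` is the (host, port) taken from the configuration.
        self.original = original
        self.advertised = []

    def on_open(self, failover_server_list):
        # Replace only the advertised entries. An open frame without
        # failover information clears them but never the original.
        self.advertised = [
            (entry["hostname"], entry["port"])
            for entry in (failover_server_list or [])
        ]

    def candidates(self):
        # Original first, then advertised entries not already listed.
        result = [self.original]
        for target in self.advertised:
            if target not in result:
                result.append(target)
        return result


if __name__ == "__main__":
    targets = FailoverTargets(("master-host", 61616))
    # Master advertises the slave in its open frame.
    targets.on_open([{"hostname": "slave-host", "port": 61617}])
    print(targets.candidates())
    # Slave sends no failover information; the master entry survives.
    targets.on_open(None)
    print(targets.candidates())
```

With this shape, killing the slave after the master has been restarted still 
leaves the master in the candidate list.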



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

