[ https://issues.apache.org/jira/browse/DISPATCH-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16437039#comment-16437039 ]
Marcel Meulemans commented on DISPATCH-966:
-------------------------------------------

Added to the configuration and ran the test again several times. However, now I see some things I did not expect: the network seems to come up correctly, but after a while it seems to fail in a weird way. The inter-router connections do not seem to drop anymore, but routing via the network does not seem to work (i.e. ROUTER_LS (info) Computed next hops: {} and qdstat -n shows only a single router). Maybe this is the issue unmasked by allowing unsettled multicasts?

I attached two more files:
* the logs of router-0 (from router start until slightly after the network fails) at info level
* a tcpdump of the inter-router communication to and from router-0 (tcpdump -i eth0 tcp port 55672 -s 65535)

I hope this helps (the dump is fairly large, so I hope you can find any hidden needles).

--
Marcel

> Qpid dispatch unstable inter-router connections
> -----------------------------------------------
>
>                 Key: DISPATCH-966
>                 URL: https://issues.apache.org/jira/browse/DISPATCH-966
>             Project: Qpid Dispatch
>          Issue Type: Bug
>          Components: Routing Engine
>    Affects Versions: 1.0.1
>            Reporter: Marcel Meulemans
>            Assignee: Ted Ross
>            Priority: Major
>         Attachments: qdrouterd-unsettled-true.log, qdrouterd.conf, qdrouterd.log, router-unsettled-true.dump, router.dump
>
>
> I am running a three-node fully connected mesh of dispatch routers with 10000
> attached clients and I am seeing some unstable inter-router connections (I am
> sending around 1000 small, less than 1K, messages per second through the
> network).
> The inter-router connections fail every so many seconds with the message:
> {{Connection to router-2:55672 failed: amqp:session:invalid-field sequencing
> error, expected delivery-id 7, got 6}}
> (the numbers 7 and 6 differ per connection loss)
> In Wireshark, using the attached tcpdump capture, I can see that every time
> before the inter-router connection is dropped, there is a rejected
> disposition with the message:
> {{Condition: qd:forbidden}}
> {{Description: Deliveries to a multicast address must be pre-settled}}
> The routers are connected as follows:
> * router-0 -> router-1
> * router-0 -> router-2
> * router-1 -> router-2
> The routers are running as Docker containers (Debian stretch) on Google
> Compute Engine machines (every router on a separate node).
> Attached are:
> * my qdrouterd.conf (from one of the routers)
> * a log snippet from router-0 at debug level from connection drop to
> connection re-established to connection drop again.
> * a tcpdump capture of the inter-router connection between router-0 and
> router-1 during which several of the failures occur
> Versions:
> * qpid-dispatch@1.0.1-rc1
> * qpid-proton@0.20.0
>
> [^qdrouterd.log]
> [^qdrouterd.conf]
> [^router.dump]
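
The two failure signatures quoted in this thread (the delivery-id sequencing error on an inter-router connection, and the empty "Computed next hops" line) can be counted in a router log with a short script. This is only a minimal sketch: the patterns are copied from the messages quoted above, while the function name `scan` and the assumption that each message sits on a single log line are illustrative and not taken from the attached logs.

```python
import re

# Patterns taken from the messages quoted in this report. The rest of the
# log line layout (timestamps, module prefixes) is an assumption, so the
# patterns are searched for anywhere in the line.
SEQ_ERROR = re.compile(
    r"Connection to (?P<peer>\S+) failed: amqp:session:invalid-field\s+"
    r"sequencing\s+error, expected delivery-id (?P<expected>\d+), got (?P<got>\d+)"
)
EMPTY_HOPS = re.compile(r"ROUTER_LS \(info\) Computed next hops: \{\}")


def scan(lines):
    """Return (sequencing_errors, empty_next_hop_count) for an iterable of log lines.

    sequencing_errors is a list of (peer, expected_id, got_id) tuples.
    """
    errors = []
    empty_hops = 0
    for line in lines:
        match = SEQ_ERROR.search(line)
        if match:
            errors.append(
                (match["peer"], int(match["expected"]), int(match["got"]))
            )
        elif EMPTY_HOPS.search(line):
            empty_hops += 1
    return errors, empty_hops


if __name__ == "__main__":
    import sys

    # Usage: python scan_log.py qdrouterd.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        seq_errors, empty = scan(log)
    print(f"sequencing errors: {len(seq_errors)}")
    for peer, expected, got in seq_errors:
        print(f"  {peer}: expected delivery-id {expected}, got {got}")
    print(f"empty next-hop computations: {empty}")
```

Run against the first log, a non-zero sequencing-error count would match the original symptom; run against the second, a growing count of empty next-hop lines with no sequencing errors would match the behaviour described in the comment.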