On Dec 4, 2012, at 11:52 AM, Bela Ban <[email protected]> wrote:

> On 12/4/12 11:30 AM, Dan Berindei wrote:
>> BTW, I also got an exception yesterday in MarshallExternalPojosTest and
>> I investigated it, but in my case the error was much weirder: two nodes
>> both opened a TCP connection to each other, yet none of them received
>> the forwarded command. I've asked Bela to investigate as well, but he
>> didn't find anything suspicious in JGroups.
>
> If a node A connects to B and B connects to A at the exact same time
> (and there wasn't any existing connection between the 2 nodes), then one
> of the 2 will 'win' and the other one will close its connection. The
> message to be sent is then lost.
>
> This is corrected by one of the upper layers, e.g. UNICAST retransmits
> the message until it gets an ack. Re-sending a message will then create
> a new connection, if the existing one was closed / removed.
>
> However, with UNICAST2, if a given message was the last message and no
> further messages are sent, then only UNICAST2's stability messages will
> detect that the other node is missing the last message sent. Stability
> is triggered every 60 seconds by default, so unless that property was
> changed, or stability was triggered programmatically, that last (lost)
> message won't get retransmitted for 60 seconds.
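
For a rough sense of the timing difference described above, here is a toy model in Python. It is not JGroups code: the function names, the 0.5 s retransmit interval, and the assumption that stability rounds fire on a fixed grid (t = 60, 120, ...) are all illustrative. Only the 60 s stability default comes from this thread.

```python
import math

STABILITY_INTERVAL_S = 60.0   # default quoted in this thread
RETRANSMIT_INTERVAL_S = 0.5   # hypothetical value, for illustration only


def ack_based_delay(retransmit_interval_s: float = RETRANSMIT_INTERVAL_S) -> float:
    """Delay until the first resend when the sender retransmits until acked."""
    return retransmit_interval_s


def stability_based_delay(sent_at_s: float,
                          stability_interval_s: float = STABILITY_INTERVAL_S) -> float:
    """Delay until the next periodic stability round after the message was sent.

    Assumes rounds fire at t = interval, 2 * interval, ... and that no later
    message is sent that would let the receiver notice the gap sooner.
    """
    rounds_elapsed = math.floor(sent_at_s / stability_interval_s)
    next_round_s = (rounds_elapsed + 1) * stability_interval_s
    return next_round_s - sent_at_s


if __name__ == "__main__":
    # Time (in seconds since the stability timer started) at which the
    # last message was lost in the connection race.
    for sent_at in (0.0, 1.0, 30.0, 59.0):
        print(f"lost at t={sent_at}s: "
              f"ack-based resend after {ack_based_delay()}s, "
              f"stability-based detection after {stability_based_delay(sent_at)}s")
```

Under these assumptions, a last message lost shortly after a stability round waits almost the full interval before the gap is noticed, while the ack-based scheme resends after one retransmit interval regardless of when the loss happened.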

Isn't 60 seconds too high a default? It seems to me the scenario explained could happen relatively easily if two nodes are started simultaneously. Do we no longer ask users to stagger their startups?

> --
> Bela Ban, JGroups lead (http://www.jgroups.org)
> _______________________________________________
> infinispan-dev mailing list
> [email protected]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Galder Zamarreño
[email protected]
twitter.com/galderz

Project Lead, Escalante
http://escalante.io

Engineer, Infinispan
http://infinispan.org
