Hello Bela,
Yes, but Scott found two interesting things:
- it seems that even if two nodes share the same view, the names of the
members of the view may appear different (one has the name, the other the IP
address)
- there is an unstable condition somewhere that makes the singleton service
flip-flop when it shouldn't be necessary (+ race condition but that's at
another level)
I will most probably play with the additional code you added at the
beginning of the year to attach extra information to the IpAddress, but for
this I will have to change TCP as well (you only changed UDP) so that this
additional information is always taken into account when generating a new
IpAddress at the TCP JG protocol layer.
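
To make the idea concrete, here is a rough sketch of what I have in mind.
MemberAddress is a made-up class for illustration only, not the JavaGroups
IpAddress, and I am assuming the extra information is an opaque byte array.
Identity is the raw IP bytes, the port and the extra data; the host name is
kept as a cosmetic label and never takes part in comparisons, so a node that
only knows the IP and a node that knows the host name still agree.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;

/** Hypothetical member address, NOT the JavaGroups IpAddress class. */
final class MemberAddress {
    private final String label;          // cosmetic only (host name or null)
    private final byte[] rawIp;          // 4 bytes (IPv4) or 16 bytes (IPv6)
    private final int port;
    private final byte[] additionalData; // assumed opaque extra data, may be null

    MemberAddress(String label, byte[] rawIp, int port, byte[] additionalData) {
        if (rawIp.length != 4 && rawIp.length != 16) {
            throw new IllegalArgumentException("raw IP must be 4 or 16 bytes");
        }
        this.label = label;
        this.rawIp = rawIp.clone();
        this.port = port;
        this.additionalData = additionalData == null ? null : additionalData.clone();
    }

    String label() { return label; }

    /** The label is deliberately excluded from equality. */
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof MemberAddress)) return false;
        MemberAddress other = (MemberAddress) o;
        return port == other.port
            && Arrays.equals(rawIp, other.rawIp)
            && Arrays.equals(additionalData, other.additionalData);
    }

    @Override
    public int hashCode() {
        return 31 * (31 * Arrays.hashCode(rawIp) + port) + Arrays.hashCode(additionalData);
    }

    /** Always numeric, so every node prints a given member the same way. */
    @Override
    public String toString() {
        try {
            return InetAddress.getByAddress(rawIp).getHostAddress() + ":" + port;
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e); // length was checked in the constructor
        }
    }
}
```

Whatever the transport (UDP or TCP), the point would be that addresses are
always built and compared the same way.
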
Cheers,
Sacha
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On
> Behalf Of Bela Ban
> Sent: Monday, 4 August 2003 23:55
> To: [EMAIL PROTECTED]
> Subject: Re: [JBoss-dev]
> DistributedReplicantManager.isMasterReplica(String) false positives?
>
>
> The entire logic to determine when to become singleton is handled in
> the view callback. Since this can potentially be lengthy, and also
> uses remote group calls, I suggest running it in a separate thread, so
> you won't run into a deadlock. By default, HA Clustering does *not*
> use deadlock detection.
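
A minimal sketch of that hand-off, assuming a plain single-thread executor.
AsyncViewHandler and Election are hypothetical names, not the JBoss HA or
JavaGroups API. The callback only copies the membership and queues the work,
so the group-communication thread is never blocked by remote calls made
during the election.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical view listener that defers election work to its own thread. */
final class AsyncViewHandler {
    /** Hypothetical hook holding the (possibly slow) singleton election logic. */
    interface Election { void run(List<String> members); }

    // One worker thread, so views are processed in the order they arrived.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Election election;

    AsyncViewHandler(Election election) { this.election = election; }

    /** Called on the group-communication thread: must return quickly. */
    void viewAccepted(List<String> members) {
        final List<String> snapshot = new ArrayList<>(members); // defensive copy
        worker.execute(() -> election.run(snapshot));
    }

    /** Stops accepting views and waits for queued elections to finish. */
    boolean shutdownAndWait(long timeoutMillis) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The single worker is a deliberate choice in this sketch: a pool would let a
later view overtake an earlier one.
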
>
>
> Scott M Stark wrote:
>
> > There is a race condition in
> > DistributedReplicantManager.isMasterReplica(String) that shows up
> > when this method is called from within notifyKeyListeners, as shown
> > by this stack trace:
> >
> > Thread "main"@65 status: RUNNING
> > - isMasterReplica():437, org.jboss.ha.framework.server.DistributedReplicantManagerImpl
> > - isDRMMasterReplica():234, org.jboss.ha.jmx.HAServiceMBeanSupport
> > - partitionTopologyChanged():103, org.jboss.ha.singleton.HASingletonSupport
> > - replicantsChanged():197, org.jboss.ha.jmx.HAServiceMBeanSupport$1
> > - notifyKeyListeners():675, org.jboss.ha.framework.server.DistributedReplicantManagerImpl
> > - add():326, org.jboss.ha.framework.server.DistributedReplicantManagerImpl
> > - registerDRMListener():204, org.jboss.ha.jmx.HAServiceMBeanSupport
> > - startService():144, org.jboss.ha.jmx.HAServiceMBeanSupport
> >
> > This is due to the choice to return true when the key in question is
> > in the localReplicants table, but not the replicants table:
> >
> > public boolean isMasterReplica (String key)
> > {
> >    if (!localReplicants.containsKey (key))
> >       return false;
> >
> >    Vector allNodes = this.partition.getCurrentView ();
> >    HashMap repForKey = (HashMap)replicants.get(key);
> >    if (repForKey==null)
> >       return true; // ????
> >
> > This seems to be an ambiguous condition, as it exists for a node that
> > calls add when the state has not synched or has failed to synch.
> > Another problem I'm seeing, at least in the context of the singleton
> > service, is that the notion of the master node is unstable. Here is
> > the output from one of 3 nodes running the singleton service,
> > starting with the addition of the final node shown as view 2.
> >
> > 15:35:44,637 INFO [Server] JBoss (MX MicroKernel) [3.2.2RC3 (build: CVSTag=Branch_3_2 date=200307312219)] Started in 5s:948ms
> > 15:36:27,719 INFO [DefaultPartition] New cluster view: 2 ([lamia:32947, 172.17.66.54:2821, ironmaiden:51770] delta: 1)
> > 15:36:27,749 INFO [DefaultPartition:ReplicantManager] Dead members: 0
> > 15:37:13,555 INFO [DefaultPartition] New cluster view (id: 3, delta: -1) : [172.17.66.54:2821, ironmaiden:51770]
> > 15:37:13,575 INFO [DefaultPartition:ReplicantManager] Dead members: 1
> > 15:38:13,321 INFO [HASingletonMBeanExample] Notified to start as singleton
> > 15:38:13,321 INFO [DefaultPartition] New cluster view (id: 4, delta: 1) : [172.17.66.54:2821, ironmaiden:51770, lamia:32949]
> > 15:38:13,331 INFO [DefaultPartition:ReplicantManager] Dead members: 0
> > 15:38:13,361 INFO [HASingletonMBeanExample] Notified to stop as singleton
> > 15:39:13,447 INFO [HASingletonMBeanExample] Notified to start as singleton
> > 15:39:13,457 INFO [HASingletonMBeanExample] Notified to stop as singleton
> >
> > With view 3 the original node and singleton is killed, and the node
> > to which this console output corresponds (172.17.66.54) is selected
> > as the singleton. When the third node is started again there is some
> > thrashing due to the existing 2 nodes both selecting themselves as
> > the singleton and telling the other to stop, and it appears that no
> > singleton is chosen. The problem seems to be inconsistent matching of
> > member names. One node only knows its IP while the other node knows
> > the hostnames. Here is the console view of the second node showing
> > the hostnames and its thrashing:
> >
> > 15:25:21,023 INFO [Server] JBoss (MX MicroKernel) [3.2.2RC3 (build: CVSTag=Branch_3_2 date=200307312219)] Started in 13s:597ms
> > 15:26:05,562 INFO [DefaultPartition] New cluster view: 3 ([succubus:2821, ironmaiden:51770] delta: -1)
> > 15:26:05,573 INFO [DefaultPartition:ReplicantManager] Dead members: 1
> > 15:27:05,506 INFO [HASingletonMBeanExample] Notified to start as singleton
> > 15:27:05,509 INFO [DefaultPartition] New cluster view: 4 ([succubus:2821, ironmaiden:51770, lamia:32949] delta: 1)
> > 15:27:05,513 INFO [DefaultPartition:ReplicantManager] Dead members: 0
> > 15:27:05,531 INFO [HASingletonMBeanExample] Notified to stop as singleton
> > 15:28:05,520 INFO [HASingletonMBeanExample] Notified to start as singleton
> > 15:28:05,526 INFO [HASingletonMBeanExample] Notified to stop as singleton
> >
> > It's not clear that DistributedReplicantManager.isMasterReplica was
> > designed to be used for the selection of a singleton node, but if it
> > is, the logic needs to be firmed up. If not, the singleton service
> > needs to be built on something else.
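
One possible way to firm this up, as a sketch only. MasterCheck and its
parameters are hypothetical and this is not the DRM code. It assumes the
manager can track whether the replicated state has been synched, which the
current code does not expose. With that flag, "not synched yet" becomes an
explicit UNKNOWN instead of being mistaken for "nobody else holds the key",
and the master is simply the first member of the current view known to hold
a replica, so nodes with the same view and the same replica table agree.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, stricter variant of the master-replica check. */
final class MasterCheck {
    enum Result { MASTER, NOT_MASTER, UNKNOWN }

    /**
     * @param self             this node's member id, as it appears in the view
     * @param hasLocalReplica  true if this node registered the key locally
     * @param stateSynched     assumed flag: replicated state has been received
     * @param currentView      members in view order (identical on every node)
     * @param remoteReplicants key to the set of remote members holding it
     */
    static Result isMasterReplica(String key, String self, boolean hasLocalReplica,
                                  boolean stateSynched, List<String> currentView,
                                  Map<String, Set<String>> remoteReplicants) {
        if (!hasLocalReplica) return Result.NOT_MASTER;
        if (!stateSynched) return Result.UNKNOWN; // cannot tell yet who else holds the key
        Set<String> remote = remoteReplicants.get(key);
        for (String member : currentView) {
            if (member.equals(self)) return Result.MASTER; // first holder in view order
            if (remote != null && remote.contains(member)) return Result.NOT_MASTER;
        }
        return Result.NOT_MASTER; // this node is not in the view at all
    }
}
```

A caller would start the singleton only on MASTER and re-evaluate on the
next topology change when it gets UNKNOWN. Note that this still relies on
every node using the same member ids in the view.
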
> >
>
> --
> Bela Ban
> http://www.javagroups.com
> Cell: (408) 316-4459
>
>
>
>
> -------------------------------------------------------
> This SF.Net email sponsored by: Free pre-built ASP.NET sites including
> Data Reports, E-commerce, Portals, and Forums are available now.
> Download today and enter to win an XBOX or Visual Studio .NET.
> http://aspnet.click-url.com/go/psa00100003ave/direct;at.aspnet_072303_01/01
> _______________________________________________
> JBoss-Development mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/jboss-development
>
>