[ http://jira.jboss.com/jira/browse/JBAS-754?page=comments#action_12312673 ]

Sacha Labourey commented on JBAS-754:
-------------------------------------
Logged In: YES
user_id=95900

JBossHA node naming is now dissociated from JavaGroups naming. The node name can be set explicitly but, by default, JBoss creates a name at startup (localIP:JNDI_PORT as a first strategy). This should fix some of the singleton issues seen.

Furthermore, the DRM.add method now makes synchronous calls over the cluster, so that DRM.isMasterReplica no longer considers the local node to be the master merely because state has not yet been synched with the other nodes. This should solve the remaining singleton issues.

As part of these changes, a farming bug has been fixed that caused already-deployed apps to be redeployed on running nodes when a new node was started (most frequent with at least 3 nodes).

Please TEST this new version and provide feedback if something is broken by this change.

> DistributedReplicantManager.isMasterReplica(String) false +
> -----------------------------------------------------------
>
>          Key: JBAS-754
>          URL: http://jira.jboss.com/jira/browse/JBAS-754
>      Project: JBoss Application Server
>         Type: Bug
>   Components: Clustering
>     Versions: JBossAS-3.2.6 Final
>     Reporter: SourceForge User
>     Assignee: Sacha Labourey
>
>
> SourceForge Submitter: slaboure .
> There is a race condition in
> DistributedReplicantManager.isMasterReplica(String) that shows up when this
> method is called from within notifyKeyListeners, as shown by this stack trace:
>
> Thread "main"@65 status: RUNNING
> - isMasterReplica():437, org.jboss.ha.framework.server.DistributedReplicantManagerImpl
> - isDRMMasterReplica():234, org.jboss.ha.jmx.HAServiceMBeanSupport
> - partitionTopologyChanged():103, org.jboss.ha.singleton.HASingletonSupport
> - replicantsChanged():197, org.jboss.ha.jmx.HAServiceMBeanSupport$1
> - notifyKeyListeners():675, org.jboss.ha.framework.server.DistributedReplicantManagerImpl
> - add():326, org.jboss.ha.framework.server.DistributedReplicantManagerImpl
> - registerDRMListener():204, org.jboss.ha.jmx.HAServiceMBeanSupport
> - startService():144, org.jboss.ha.jmx.HAServiceMBeanSupport
>
> This is due to the choice to return true when the key in question is in the
> localReplicants table, but not the replicants table:
>
> public boolean isMasterReplica (String key)
> {
>    if (!localReplicants.containsKey (key))
>       return false;
>    Vector allNodes = this.partition.getCurrentView ();
>    HashMap repForKey = (HashMap)replicants.get (key);
>    if (repForKey==null)
>       return true; // ????
>
> This seems to be an ambiguous condition, as it holds for a node that
> calls add when the state has not synched or has failed to synch.
>
> Another problem I'm seeing, at least in the context of the singleton service,
> is that the notion of the master node is unstable. Here is the output from one
> of 3 nodes running the singleton service, starting with the addition of the
> final node shown as view 2.
>
> 15:35:44,637 INFO [Server] JBoss (MX MicroKernel) [3.2.2RC3 (build: CVSTag=Branch_3_2 date=200307312219)] Started in 5s:948ms
> 15:36:27,719 INFO [DefaultPartition] New cluster view: 2 ([lamia:32947, 172.17.66.54:2821, ironmaiden:51770] delta: 1)
> 15:36:27,749 INFO [DefaultPartition:ReplicantManager] Dead members: 0
> 15:37:13,555 INFO [DefaultPartition] New cluster view (id: 3, delta: -1) : [172.17.66.54:2821, ironmaiden:51770]
> 15:37:13,575 INFO [DefaultPartition:ReplicantManager] Dead members: 1
> 15:38:13,321 INFO [HASingletonMBeanExample] Notified to start as singleton
> 15:38:13,321 INFO [DefaultPartition] New cluster view (id: 4, delta: 1) : [172.17.66.54:2821, ironmaiden:51770, lamia:32949]
> 15:38:13,331 INFO [DefaultPartition:ReplicantManager] Dead members: 0
> 15:38:13,361 INFO [HASingletonMBeanExample] Notified to stop as singleton
> 15:39:13,447 INFO [HASingletonMBeanExample] Notified to start as singleton
> 15:39:13,457 INFO [HASingletonMBeanExample] Notified to stop as singleton
>
> With view 3 the original node and singleton is killed, and the node to which
> this console output corresponds (172.17.66.54) is selected as the singleton.
> When the third node is started again there is some thrashing, due to the
> existing 2 nodes both selecting themselves as the singleton and telling the
> other to stop, and it appears that no singleton is chosen. The problem seems
> to be inconsistent matching of member names. One node only knows its IP while
> the other node knows the hostnames.
> Here is the console view of the second node, showing the hostnames and
> its thrashing:
>
> 15:25:21,023 INFO [Server] JBoss (MX MicroKernel) [3.2.2RC3 (build: CVSTag=Branch_3_2 date=200307312219)] Started in 13s:597ms
> 15:26:05,562 INFO [DefaultPartition] New cluster view: 3 ([succubus:2821, ironmaiden:51770] delta: -1)
> 15:26:05,573 INFO [DefaultPartition:ReplicantManager] Dead members: 1
> 15:27:05,506 INFO [HASingletonMBeanExample] Notified to start as singleton
> 15:27:05,509 INFO [DefaultPartition] New cluster view: 4 ([succubus:2821, ironmaiden:51770, lamia:32949] delta: 1)
> 15:27:05,513 INFO [DefaultPartition:ReplicantManager] Dead members: 0
> 15:27:05,531 INFO [HASingletonMBeanExample] Notified to stop as singleton
> 15:28:05,520 INFO [HASingletonMBeanExample] Notified to start as singleton
> 15:28:05,526 INFO [HASingletonMBeanExample] Notified to stop as singleton
>
> It's not clear that DistributedReplicantManager.isMasterReplica was designed
> to be used for the selection of a singleton node, but if it is, the logic needs
> to be firmed up. If not, the singleton service needs to be built on something
> else.
> --
> xxxxxxxxxxxxxxxxxxxxxxxx
> Scott Stark
> Chief Technology Officer
> JBoss Group, LLC
> xxxxxxxxxxxxxxxxxxxxxxxx

_______________________________________________
JBoss-Development mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/jboss-development
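
For readers following the thread, here is a minimal sketch of the ambiguity discussed in the report. It is not the JBoss implementation: the class MasterElectionSketch, the Role enum and the roleFor method are hypothetical names, and the election rule (the first member of the current view that holds a replicant for the key is the master) is an assumption made for illustration. The point it shows is that "no replicant information for this key yet" and "local node name not found in the view" can be reported as UNKNOWN instead of being guessed as "master".

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the DistributedReplicantManagerImpl code.
final class MasterElectionSketch {

    enum Role { MASTER, NOT_MASTER, UNKNOWN }

    /**
     * Assumed rule: the first member of the view holding a replicant for
     * the key is the master.
     *
     * @param registeredLocally whether the local node added a replicant for key
     * @param view              member names, in cluster view order
     * @param remoteReplicants  key -> names of the OTHER nodes holding a replicant
     */
    static Role roleFor(String key, String localNode, boolean registeredLocally,
                        List<String> view, Map<String, Set<String>> remoteReplicants) {
        if (!registeredLocally) {
            return Role.NOT_MASTER;
        }
        Set<String> remoteHolders = remoteReplicants.get(key);
        if (remoteHolders == null) {
            // No state for this key yet. Only a node that is alone in the
            // view can be sure; otherwise state may simply not be synched.
            boolean alone = view.size() == 1 && view.contains(localNode);
            return alone ? Role.MASTER : Role.UNKNOWN;
        }
        for (String member : view) {
            if (member.equals(localNode)) {
                return Role.MASTER;      // reached ourselves before any other holder
            }
            if (remoteHolders.contains(member)) {
                return Role.NOT_MASTER;  // an earlier member holds the key
            }
        }
        // The local name never matched a view member, e.g. an IP-based name
        // compared against hostname-based view entries.
        return Role.UNKNOWN;
    }
}
```

With a view of [succubus:2821, ironmaiden:51770] and a local name of 172.17.66.54:2821, this sketch returns UNKNOWN when no other holder is known, which makes the name mismatch visible instead of letting two nodes both conclude that they are the singleton.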