Re: [JBoss-dev] DistributedReplicantManager.isMasterReplica(String) false positives?

2003-08-14 Thread Bela Ban
There is no name service. UDP and TCP use the logical names when generating
an Address (IpAddress). It works like this (pseudo code):

JChannel ch=new JChannel(props);
Event cfg=new Event(Event.CONFIG);
Map m=new HashMap();
m.put(additional_data, MyChannelName);
cfg.setObject(m);
// hand cfg to the channel at this point (step not shown in this pseudo code)
ch.connect(); // will get a local address with additional_data attached

So now you can associate MyChannelName with the IpAddress. Your app
would use only ipAddr.getAdditionalData() to identify the address,
rather than host:port.

Sacha, try this out and let me know whether you have additional reqs.

Scott M Stark wrote:

But what is the nameservice for these 'logical' names, and what layer 
uses them?

--
Bela Ban
www.javagroups.com
(408) 316-4459




---
This SF.Net email sponsored by: Free pre-built ASP.NET sites including
Data Reports, E-commerce, Portals, and Forums are available now.
Download today and enter to win an XBOX or Visual Studio .NET.
http://aspnet.click-url.com/go/psa0013ave/direct;at.aspnet_072303_01/01
___
JBoss-Development mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/jboss-development


RE: [JBoss-dev] DistributedReplicantManager.isMasterReplica(String) false positives?

2003-08-14 Thread Sacha Labourey
Right now there is no such naming service, but it may be interesting in
the future for features that need to name the nodes of the cluster. In the
meantime, I will use the IP-ified name as this logical name and never change
it, even after a shun.

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Scott M Stark
 Sent: Wednesday, 6 August 2003 04:05
 To: [EMAIL PROTECTED]
 Subject: Re: [JBoss-dev]
 DistributedReplicantManager.isMasterReplica(String) false positives?

 But what is the nameservice for these 'logical' names, and
 what layer uses them?

 --
 Scott Stark
 Chief Technology Officer
 JBoss Group, LLC

 Bela Ban wrote:

  Yes I agree. What Sacha referred to, however, was the fact that you can
  have 'logical' names rather than host:port as member addresses. This is
  useful if a member is shunned, leaves and rejoins the group under a
  different host:port address. The logical name would remain the same in
  this case.
  
 
 
 


Re: [JBoss-dev] DistributedReplicantManager.isMasterReplica(String) false positives?

2003-08-14 Thread Bela Ban
Yes I agree. What Sacha referred to, however, was the fact that you can 
have 'logical' names rather than host:port as member addresses. This is 
useful if a member is shunned, leaves and rejoins the group under a 
different host:port address. The logical name would remain the same in 
this case.
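
To illustrate the point about logical names, here is a minimal, self-contained sketch in plain Java. It does not use the JavaGroups API: MemberId and its fields are hypothetical names invented for this example. It only shows why identifying a member by a logical name survives a rejoin under a new host:port, whereas identifying it by host:port does not. The host and ports are borrowed from the logs in this thread.

```java
// Hypothetical stand-in for a cluster member address; not a JavaGroups class.
final class MemberId {
    final String logicalName; // stable name chosen by the application
    final String host;        // transport-level host, may change on rejoin
    final int port;           // transport-level port, may change on rejoin

    MemberId(String logicalName, String host, int port) {
        this.logicalName = logicalName;
        this.host = host;
        this.port = port;
    }

    // Identity is based on the logical name only.
    @Override
    public boolean equals(Object o) {
        return o instanceof MemberId && ((MemberId) o).logicalName.equals(logicalName);
    }

    @Override
    public int hashCode() {
        return logicalName.hashCode();
    }

    // The transport-level view of the same member.
    String hostPort() {
        return host + ":" + port;
    }
}

class LogicalNameDemo {
    public static void main(String[] args) {
        MemberId before = new MemberId("node-A", "lamia", 32947);
        MemberId after = new MemberId("node-A", "lamia", 32949); // rejoined on a new port

        // The host:port strings differ after the rejoin ...
        System.out.println(before.hostPort().equals(after.hostPort()));
        // ... but the logical identity is unchanged.
        System.out.println(before.equals(after));
    }
}
```

Anything keyed on MemberId (a map of replicants, for instance) would keep working across the rejoin, which is the property described above.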

Scott M Stark wrote:

The ip addresses are consistent or else they would not be able to 
communicate. The base representation should be the ip address, not the 
name.

--
Bela Ban
http://www.javagroups.com
Cell: (408) 316-4459




RE: [JBoss-dev] DistributedReplicantManager.isMasterReplica(String) false positives?

2003-08-14 Thread Bill Burke
I think this service will be very useful for clustered durable queues and
topics.  Keep it up.

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] Behalf Of Sacha
 Labourey
 Sent: Tuesday, August 05, 2003 2:21 AM
 To: [EMAIL PROTECTED]
 Subject: RE: [JBoss-dev]
 DistributedReplicantManager.isMasterReplica(String) false positives?


 Hello Bela,

 Yes, but Scott found two interesting things:
  - it seems that even if two nodes share the same view, the names of the
    members of the view may appear different (one has the name, the other
    the IP address)
  - there is an unstable condition somewhere that makes the singleton
    service flip-flop when it shouldn't be necessary (plus a race
    condition, but that's at another level)

 I will most probably play with the additional code you added at the
 beginning of the year to attach extra information to the IpAddress, but
 for this I will have to change TCP as well (you only changed UDP) so that
 this additional information is always taken into account when generating
 a new IpAddress at the TCP JG protocol layer.

 Cheers,


   Sacha


Re: [JBoss-dev] DistributedReplicantManager.isMasterReplica(String) false positives?

2003-08-14 Thread Bela Ban
Sacha Labourey wrote:

Hello Bela,

Yes, but Scott found two interesting things:
- it seems that even if two nodes share the same view, the names of the
members of the view may appear different (one has the name, the other the IP
address)

This is probably caused by different /etc/hosts or NIS/NIS+
configurations. DNS has to be set up correctly on all nodes.
If the names appear inconsistently, you will get into trouble; better
avoid it.

- there is an unstable condition somewhere that makes the singleton service
flip-flop when it shouldn't be necessary (plus a race condition, but that's at
another level)

This has nothing to do with JavaGroups, I assume?

I will most probably play with the additional code you added at the
beginning of the year to attach extra information to the IpAddress, but for
this I will have to change TCP as well (you only changed UDP)

Done.

so that this additional information is always taken into account when
generating a new IpAddress at the TCP JG protocol layer.

Check out TCP from CVS and let me know whether this works and covers
what you want.

--
Bela Ban
http://www.javagroups.com
Cell: (408) 316-4459




Re: [JBoss-dev] DistributedReplicantManager.isMasterReplica(String) false positives?

2003-08-06 Thread Scott M Stark
The ip addresses are consistent or else they would not be able to communicate. 
The base representation should be the ip address, not the name.
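
A small sketch of this point, using only java.net from the JDK (no JavaGroups classes): two socket addresses built from the same raw IP bytes and port compare equal whether or not a host name is attached, while their printed forms differ. The host name, IP and port are taken from the logs in this thread; that succubus is the host name of 172.17.66.54 is inferred from those logs, and the class and method names are made up for the example.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

class AddressCompareDemo {
    // Builds a socket address from raw IP bytes; hostName may be null.
    // InetAddress.getByAddress performs no name lookup.
    static InetSocketAddress member(String hostName, byte[] ip, int port)
            throws UnknownHostException {
        InetAddress addr = (hostName == null)
                ? InetAddress.getByAddress(ip)
                : InetAddress.getByAddress(hostName, ip);
        return new InetSocketAddress(addr, port);
    }

    public static void main(String[] args) throws UnknownHostException {
        byte[] ip = {(byte) 172, 17, 66, 54};
        InetSocketAddress seenByName = member("succubus", ip, 2821);
        InetSocketAddress seenByIp = member(null, ip, 2821);

        // Comparing the address objects uses the IP bytes and the port.
        System.out.println(seenByName.equals(seenByIp));
        // Comparing the printed forms depends on whether a host name was attached.
        System.out.println(seenByName.toString().equals(seenByIp.toString()));
    }
}
```

Code that decides "is this member me?" or "who is first in the view?" by comparing display strings can therefore disagree between nodes, while a comparison on the IP-based representation cannot.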

--

Scott Stark
Chief Technology Officer
JBoss Group, LLC

Bela Ban wrote:

This is probably caused by different /etc/hosts or NIS/NIS+
configurations. DNS has to be set up correctly on all nodes.
If the names appear inconsistently, you will get into trouble; better
avoid it.





RE: [JBoss-dev] DistributedReplicantManager.isMasterReplica(String) false positives?

2003-08-05 Thread Sacha Labourey
Hello Bela,

Yes, but Scott found two interesting things:
 - it seems that even if two nodes share the same view, the names of the
   members of the view may appear different (one has the name, the other the
   IP address)
 - there is an unstable condition somewhere that makes the singleton service
   flip-flop when it shouldn't be necessary (plus a race condition, but
   that's at another level)

I will most probably play with the additional code you added at the
beginning of the year to attach extra information to the IpAddress, but for
this I will have to change TCP as well (you only changed UDP) so that this
additional information is always taken into account when generating a new
IpAddress at the TCP JG protocol layer.

Cheers,


Sacha

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Bela Ban
 Sent: Monday, 4 August 2003 23:55
 To: [EMAIL PROTECTED]
 Subject: Re: [JBoss-dev]
 DistributedReplicantManager.isMasterReplica(String) false positives?

 The entire logic to determine when to become singleton is handled in the
 view callback. Since this can potentially be lengthy, and also uses
 remote group calls, I suggest running it in a separate thread, so you
 won't run into a deadlock. By default, HA Clustering does *not* use
 deadlock detection.
 
 

Re: [JBoss-dev] DistributedReplicantManager.isMasterReplica(String) false positives?

2003-08-05 Thread Scott M Stark
But what is the nameservice for these 'logical' names, and what layer uses them?

--

Scott Stark
Chief Technology Officer
JBoss Group, LLC

Bela Ban wrote:

Yes I agree. What Sacha referred to, however, was the fact that you can 
have 'logical' names rather than host:port as member addresses. This is 
useful if a member is shunned, leaves and rejoins the group under a 
different host:port address. The logical name would remain the same in 
this case.





RE: [JBoss-dev] DistributedReplicantManager.isMasterReplica(String) false positives?

2003-08-04 Thread Sacha Labourey
I will work on that, thanks for the complete report.

Cheers,

Sacha


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Scott M Stark
 Sent: Monday, 4 August 2003 00:59
 To: [EMAIL PROTECTED]
 Subject: [JBoss-dev]
 DistributedReplicantManager.isMasterReplica(String) false positives?

 There is a race condition in
 DistributedReplicantManager.isMasterReplica(String) that shows up when this
 method is called from within notifyKeyListeners, as shown by this stack trace:
 
 Thread main@65 status: RUNNING
 - isMasterReplica():437, org.jboss.ha.framework.server.DistributedReplicantManagerImpl
 - isDRMMasterReplica():234, org.jboss.ha.jmx.HAServiceMBeanSupport
 - partitionTopologyChanged():103, org.jboss.ha.singleton.HASingletonSupport
 - replicantsChanged():197, org.jboss.ha.jmx.HAServiceMBeanSupport$1
 - notifyKeyListeners():675, org.jboss.ha.framework.server.DistributedReplicantManagerImpl
 - add():326, org.jboss.ha.framework.server.DistributedReplicantManagerImpl
 - registerDRMListener():204, org.jboss.ha.jmx.HAServiceMBeanSupport
 - startService():144, org.jboss.ha.jmx.HAServiceMBeanSupport
 
 This is due to the choice to return true when the key in question is in the
 localReplicants table, but not in the replicants table:

 public boolean isMasterReplica (String key)
 {
    // no local replicant for this key: this node cannot be the master
    if (!localReplicants.containsKey (key))
       return false;

    Vector allNodes = this.partition.getCurrentView ();
    HashMap repForKey = (HashMap)replicants.get(key);
    // key is known locally but missing from the replicants table:
    // the node is reported as master
    if (repForKey==null)
       return true;
    ...
 
 This seems to be an ambiguous condition, as it holds both for a node that
 has just called add and for a node whose state has not synched or has
 failed to synch. Another problem I'm seeing, at least in the context of the
 singleton service, is that the notion of the master node is unstable. Here
 is the output from one of 3 nodes running the singleton service, starting
 with the addition of the final node, shown as view 2.
 
 15:35:44,637 INFO  [Server] JBoss (MX MicroKernel) [3.2.2RC3 (build: CVSTag=Branch_3_2 date=200307312219)] Started in 5s:948ms
 15:36:27,719 INFO  [DefaultPartition] New cluster view: 2 ([lamia:32947, 172.17.66.54:2821, ironmaiden:51770] delta: 1)
 15:36:27,749 INFO  [DefaultPartition:ReplicantManager] Dead members: 0
 15:37:13,555 INFO  [DefaultPartition] New cluster view (id: 3, delta: -1) : [172.17.66.54:2821, ironmaiden:51770]
 15:37:13,575 INFO  [DefaultPartition:ReplicantManager] Dead members: 1
 15:38:13,321 INFO  [HASingletonMBeanExample] Notified to start as singleton
 15:38:13,321 INFO  [DefaultPartition] New cluster view (id: 4, delta: 1) : [172.17.66.54:2821, ironmaiden:51770, lamia:32949]
 15:38:13,331 INFO  [DefaultPartition:ReplicantManager] Dead members: 0
 15:38:13,361 INFO  [HASingletonMBeanExample] Notified to stop as singleton
 15:39:13,447 INFO  [HASingletonMBeanExample] Notified to start as singleton
 15:39:13,457 INFO  [HASingletonMBeanExample] Notified to stop as singleton
 
 With view 3 the original node and singleton is killed, and the node to
 which this console output corresponds (172.17.66.54) is selected as the
 singleton. When the third node is started again there is some thrashing,
 because the two existing nodes both select themselves as the singleton and
 tell the other to stop, and it appears that no singleton is chosen. The
 problem seems to be inconsistent matching of member names: one node only
 knows its IP while the other node knows the hostnames. Here is the console
 view of the second node, showing the hostnames and its thrashing:
 
 15:25:21,023 INFO  [Server] JBoss (MX MicroKernel) [3.2.2RC3 (build: CVSTag=Branch_3_2 date=200307312219)] Started in 13s:597ms
 15:26:05,562 INFO  [DefaultPartition] New cluster view: 3 ([succubus:2821, ironmaiden:51770] delta: -1)
 15:26:05,573 INFO  [DefaultPartition:ReplicantManager] Dead members: 1
 15:27:05,506 INFO  [HASingletonMBeanExample] Notified to start as singleton
 15:27:05,509 INFO  [DefaultPartition] New cluster view: 4 ([succubus:2821, ironmaiden:51770, lamia:32949] delta: 1)
 15:27:05,513 INFO  [DefaultPartition:ReplicantManager] Dead members: 0
 15:27:05,531 INFO  [HASingletonMBeanExample] Notified to stop as singleton
 15:28:05,520 INFO  [HASingletonMBeanExample] Notified to start as singleton
 15:28:05,526 INFO  [HASingletonMBeanExample] Notified to stop as singleton
 
 It's not clear that DistributedReplicantManager.isMasterReplica was
 designed to be used for the selection of a singleton node, but if it is,
 the logic needs to be firmed up. If not, the singleton service needs to be
 built on something else.
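
One way to remove the ambiguity described in this report is to answer "not yet known" separately from "master". The sketch below is a self-contained illustration in plain Java, not the DistributedReplicantManagerImpl code: ReplicaRole, ReplicantTable, the synched method and the rule "the first node in the view that holds the key is the master" are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical three-valued answer; the method in the report returns a boolean.
enum ReplicaRole { MASTER, NOT_MASTER, UNKNOWN }

// Hypothetical, simplified stand-in for the replicant bookkeeping.
class ReplicantTable {
    private final String self;
    private final Set<String> localReplicants = new HashSet<>();
    // key -> nodes known cluster-wide to hold it; filled in once state has synched
    private final Map<String, Set<String>> replicants = new HashMap<>();

    ReplicantTable(String self) { this.self = self; }

    void addLocal(String key) { localReplicants.add(key); }

    void synched(String key, Collection<String> holders) {
        replicants.put(key, new HashSet<>(holders));
    }

    ReplicaRole role(String key, List<String> currentView) {
        if (!localReplicants.contains(key)) return ReplicaRole.NOT_MASTER;
        Set<String> holders = replicants.get(key);
        // Local add seen but cluster-wide state missing: do not claim mastership.
        if (holders == null) return ReplicaRole.UNKNOWN;
        for (String node : currentView) {
            if (node.equals(self)) return ReplicaRole.MASTER;          // no earlier holder found
            if (holders.contains(node)) return ReplicaRole.NOT_MASTER; // an earlier holder exists
        }
        return ReplicaRole.NOT_MASTER; // this node is not in the view
    }
}

class ReplicaRoleDemo {
    public static void main(String[] args) {
        List<String> view = Arrays.asList("172.17.66.54:2821", "ironmaiden:51770", "lamia:32949");
        ReplicantTable table = new ReplicantTable("ironmaiden:51770");
        table.addLocal("HASingleton");
        System.out.println(table.role("HASingleton", view)); // UNKNOWN: nothing synched yet
        table.synched("HASingleton", Arrays.asList("172.17.66.54:2821", "ironmaiden:51770"));
        System.out.println(table.role("HASingleton", view)); // NOT_MASTER: an earlier holder exists
    }
}
```

A caller such as a singleton service could then treat UNKNOWN as "wait for the next notification" instead of starting itself.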
 
 -- 
 
 Scott Stark
 Chief Technology Officer
 JBoss Group, LLC
 
 
 
 
 

Re: [JBoss-dev] DistributedReplicantManager.isMasterReplica(String) false positives?

2003-08-04 Thread Bela Ban
The entire logic to determine when to become singleton is handled in the
view callback. Since this can potentially be lengthy, and also uses
remote group calls, I suggest running it in a separate thread, so you
won't run into a deadlock. By default, HA Clustering does *not* use
deadlock detection.
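
A minimal sketch of that suggestion in plain Java: the view callback only hands the election work to a dedicated worker thread and returns immediately. ViewHandler, viewAccepted and electSingleton are hypothetical names for this example, not the JBoss or JavaGroups API, and ExecutorService comes from java.util.concurrent, which is newer than the JDK this thread was written against.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical listener; stands in for whatever receives view changes.
class ViewHandler {
    // A single worker thread processes view changes in arrival order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    volatile String lastElectionThread; // recorded for demonstration only

    // Called on the group's callback thread: only queues the work and returns.
    Future<?> viewAccepted(final List<String> members) {
        return worker.submit(new Runnable() {
            public void run() {
                electSingleton(members);
            }
        });
    }

    // Potentially lengthy logic (remote group calls in a real system)
    // runs on the worker thread, not on the callback thread.
    private void electSingleton(List<String> members) {
        lastElectionThread = Thread.currentThread().getName();
    }

    void shutdown() {
        worker.shutdown();
    }
}

class ViewHandlerDemo {
    public static void main(String[] args) throws Exception {
        ViewHandler handler = new ViewHandler();
        // get() is used here only so the demo can inspect the result.
        handler.viewAccepted(Arrays.asList("172.17.66.54:2821", "ironmaiden:51770")).get();
        System.out.println("callback thread: " + Thread.currentThread().getName());
        System.out.println("election thread: " + handler.lastElectionThread);
        handler.shutdown();
    }
}
```

In a real callback you would not wait on the returned Future, since blocking there would bring back the deadlock risk the separate thread is meant to avoid.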

--
Bela Ban
http://www.javagroups.com
Cell: (408) 316-4459

