Hello Nadia,

The line has been added to 1.8/HEAD. If this fix should also go into the stable release (1.7), then an issue in the bugtracker would be nice to document the change. It is very likely that we won't release a new version of the current stable branch, because we are working on the release of 1.8. In other words, it is probably not worth the effort.

Regards,

Nico



Nadia Poulou wrote:
Hello,

We came across a potential problem in the clustering module. According
to the JGroups documentation:

"When a network error occurs, the cluster might be partitioned into
several different partitions. JGroups has a MERGE service that allows
the coordinators in partitions to communicate with each other and form a
single cluster back again."

In our environment this indeed resulted in a problem in the case of two
coordinators sharing one channel. Costyn van Dongen of IC&S found a way
to prevent it. With the change described here, the two coordinators will be
merged and one coordinator will be selected.
It would be nice to have this change in CVS. What is the procedure
to have this change accepted? (I suppose it should be a VOTE CALL, but
this should then be initiated by an MMC member, am I right?)

Anyway, here is a description of the change:

The JGroups protocol stack is defined in multicast.xml (in version 1.7.1)
and jgroup.xml (in the current CVS/1.8).
In the current CVS, this is the file I am talking about:
http://cvs.mmbase.org/viewcvs/*checkout*/applications/clustering/config/utils/jgroup.xml?content-type=text%2Fplain

By adding the following line between the PING and FD lines, the
problem should be prevented: MERGE2(min_interval=5000;max_interval=10000).

The file then becomes:

<property name="channelproperties">
     UDP(mcast_addr=224.0.0.1;mcast_port=16080;ip_ttl=1):
     PING(timeout=3000;num_initial_members=6):
     MERGE2(min_interval=5000;max_interval=10000):
     FD(timeout=3000):
     VERIFY_SUSPECT(timeout=1500):
     pbcast.NAKACK(gc_lag=10;retransmit_timeout=600,1200,2400,4800):
     UNICAST(timeout=600,1200,2400,4800):
     pbcast.STABLE(desired_avg_gossip=10000):
     FRAG:
     pbcast.GMS(join_timeout=5000;join_retry_timeout=2000;shun=true;print_local_addr=true)
   </property>
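
For anyone who wants to apply the same change to a local copy of the stack, the edit above can be sketched as a small script. This is only an illustration: the helper name insert_protocol is made up for this example and is not part of MMBase or JGroups, and the sketch assumes that protocols are separated by ":" and that no parameter value itself contains a colon (which holds for the stack shown here). It also collapses the stack onto a single line.

```python
def insert_protocol(stack: str, new_protocol: str, after: str) -> str:
    """Return the stack with new_protocol placed directly after the protocol named `after`.

    Hypothetical helper for illustration only; not an MMBase or JGroups API.
    """
    # Split the colon-separated stack and drop surrounding whitespace/newlines.
    protocols = [p.strip() for p in stack.split(":") if p.strip()]
    # The protocol name is everything before the first "(" (or the whole entry, e.g. FRAG).
    names = [p.split("(", 1)[0] for p in protocols]
    new_name = new_protocol.split("(", 1)[0]
    if new_name in names:
        # Already present: leave the stack as it is (apart from whitespace).
        return ":".join(protocols)
    index = names.index(after)  # raises ValueError if `after` is not in the stack
    protocols.insert(index + 1, new_protocol)
    return ":".join(protocols)


# Shortened version of the stack from this mail, before the change.
stack = """
    UDP(mcast_addr=224.0.0.1;mcast_port=16080;ip_ttl=1):
    PING(timeout=3000;num_initial_members=6):
    FD(timeout=3000):
    VERIFY_SUSPECT(timeout=1500)
"""

patched = insert_protocol(
    stack, "MERGE2(min_interval=5000;max_interval=10000)", after="PING"
)
print(patched)
```

Running the helper a second time on its own output leaves the stack unchanged, so it is safe to apply to a file that may already contain MERGE2.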

Greetings,

Nadia Poulou
Stichting Kennisnet


_______________________________________________
Developers mailing list
[email protected]
http://lists.mmbase.org/mailman/listinfo/developers




