Olivier Fambon wrote:
Gérard BUNEL [17/10/07 12:01]:
Hello,

Hello Gérard,


We have 2 controllers, each with 2 backends. The VDB is in an autoload configuration.
We stop one controller, then restart it. Using the console, we can see that the VDB is not mounted, and the log is as shown below. As the controller has just been shut down and then restarted, it was effectively the last man down. So why is Sequoia complaining about not being the last man down?

If I understand you correctly, you keep the other controller up, so a down-up cycle on one controller does not make it the last man down: there remains one man up! Maybe the term "last man down" is misleading. It is meant to identify the last vdb part that went down in a full (all parts) vdb shutdown.
So if only one controller went down while the second one stayed up, this check should not occur.


The log states that there is an error while trying to join the group. So maybe an Appia problem?

First, a side note: the error shows up at group-join time, but that is really a by-product. It actually arises at vdb.xml load time, which is when we check the last-man-down condition, IF we are the first vdb part in the group.
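To make the rule above concrete, here is a minimal sketch of the decision as described in this thread. It is not Sequoia's actual code: the function name and both flags are hypothetical and only model the explanation given here.

```python
def may_load_vdb_part(first_in_group: bool, was_last_man_down: bool) -> bool:
    """Model of the load-time check described in this thread (hypothetical names).

    first_in_group    -- True if no other vdb part was found in the group.
    was_last_man_down -- True if this part was the last one to go down in a
                         full (all parts) vdb shutdown.
    """
    if not first_in_group:
        # Another vdb part is still up: we simply join it,
        # and the last-man-down check does not apply.
        return True
    # We are alone in the group: only the part that went down last
    # may be loaded.
    return was_last_man_down
```

Under this model, a controller restarted while its peer is still up passes only if it actually sees that peer (`first_in_group` is false). If it believes it is first in the group, the last-man-down check applies and fails.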

Now for your case: according to your logs, the vdb part you are attempting to load is actually the first member in the group:
That is because it did not find the other controller. It should have, as that controller remains up. We use a static view in our Appia configuration, so multicast is not used to discover the other members of the group:
<channel name="TCP SEQ Channel" template="tcp_sequencer" initialized="yes">
       <memorymanagement size="40000000" up_threshold="15000000" down_threshold="7000000" />
       <chsession name="hederalayer">
               <parameter name="base_view">ptrimd01:27752,btrimd01:27752</parameter>
               <parameter name="base_endpoints">ptrimd01,btrimd01</parameter>
               <parameter name="initial_endpoints">ptrimd01</parameter>
               <parameter name="local_address">ptrimd01:27752</parameter>
               <parameter name="local_endpoint">ptrimd01</parameter>
       </chsession>
       <chsession name="suspectl">
               <parameter name="suspect_sweep">10000</parameter>
               <parameter name="suspect_time">30000</parameter>
       </chsession>
</channel>

But currently we do not fully understand this part of the configuration. In particular, the 2 parameters suspect_sweep and suspect_time are a bit obscure to us (these configuration elements were provided by another Sequoia user who encountered the same problem as us when using multicast discovery). These 2 parameters seem to matter, as I have also seen the controllers lose sync after multiplying these values by 2.
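For what it is worth, here is a small sketch of how a sweep-based failure detector with these two settings would typically behave. This is an assumption based on the parameter names, not a statement of Appia's documented semantics: it supposes that a sweep runs every suspect_sweep milliseconds and that a member is suspected once it has been silent for more than suspect_time milliseconds. The function name is hypothetical.

```python
def first_suspicion_tick(last_heard_ms: int,
                         suspect_sweep_ms: int,
                         suspect_time_ms: int) -> int:
    """Time (ms) of the first sweep at which a silent member is suspected.

    Assumed model: sweeps fire at 0, sweep, 2*sweep, ... and a member is
    suspected at a sweep when (now - last_heard) > suspect_time.
    """
    # Smallest multiple of the sweep period strictly greater than
    # last_heard + suspect_time.
    k = (last_heard_ms + suspect_time_ms) // suspect_sweep_ms + 1
    return k * suspect_sweep_ms


def worst_case_detection_ms(suspect_sweep_ms: int, suspect_time_ms: int) -> int:
    """Upper bound on the delay between last contact and suspicion."""
    return suspect_time_ms + suspect_sweep_ms
```

Under that assumed model, the values above (10000 and 30000) give a detection delay of at most 40 seconds after the last contact, and doubling both values doubles the bound to 80 seconds.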

When we restart the whole platform after first recreating the recoveryLog on each controller, we do not have any problem. Each VDB is mounted correctly and each controller is added to the group.




2007-10-17 11:56:06,201 INFO controller.virtualdatabase.MATISSEDB First controller in group MATISSEDB


So either the other controller is down - and we are not in the case you describe - or the controller you are restarting cannot see the other one - and you have a connectivity issue.
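A quick way to rule out the basic connectivity case is to try a plain TCP connection from the restarted controller to each address listed in base_view. The sketch below uses only the Python standard library; the helper names are hypothetical, and a successful connection only shows that the port is reachable, not that Appia group membership is working.

```python
import socket


def parse_base_view(value: str) -> list:
    """Split a 'host:port,host:port' string into [(host, port), ...]."""
    peers = []
    for entry in value.split(","):
        host, port = entry.strip().rsplit(":", 1)
        peers.append((host, int(port)))
    return peers


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, calling `can_connect(host, port)` for each pair returned by `parse_base_view("ptrimd01:27752,btrimd01:27752")`, run on the controller being restarted, shows whether the other controller's group communication port can be reached at all.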

Hope this helps.

A+O.
_______________________________________________
Sequoia mailing list
[email protected]
https://forge.continuent.org/mailman/listinfo/sequoia



--

Gérard BUNEL
Project Manager
____________________________________________________________________


Technopôle Brest Iroise
Site du Vernis – CS 23866
29238 Brest Cedex 3
Tél : + 33 2 98 05 43 21
Fax : + 33 2 98 05 20 34
www.altran.com <http://www.altran.com>
