Hello all,
 While running some tests with SC 3.2, I ran into a few situations that are, at 
the least, interesting.

 1- First I shut down both nodes, and after that I tried to boot just "one" 
node. To my surprise, the node did not boot. The message was something about 
"other node is unreachable through this path". I waited more than an hour to 
see whether it was a "timeout" or something, and then I booted the other node 
too. After that, the cluster came online again.
 
 2- In another test, I just cut the power to one node, and to my surprise 
(again), the other node crashed too (it rebooted). After that, I was thinking 
"Now what? The node will not boot because of case (1) above"... but I was 
wrong; this time the node booted OK.

 The environment is: a two-node Sun Cluster 3.2, with just "one" cluster 
interconnect interface.

 Testing "evacuate" or "switch" just works. The problem is when I try to 
simulate "real" failures.

 So, the questions are:
  a) Is the behavior in case (1) expected? How can I fix that in a real-world 
scenario?
  b) And case (2)?
  c) In the above configuration, what can I expect and what can I not expect 
in failover/switchback scenarios? I mean, which failures are covered by such 
a configuration? How many servers can crash, and is there an order to respect 
(for shutdown)... ?
  
 I know all of this is probably obvious to you, and I think there is an 
explanation for all of it... but I just want to understand it so I know what 
to be aware of.

 Thanks for your time!

 Leal.
--

This message posted from opensolaris.org

