Corrado,

The error message implies that the Start method of resource rs1 failed: either it exited non-zero, or it timed out (failed to complete within the Start_timeout interval). You say there are no errors in /var/adm/messages, but that is unexpected. Did you check /var/adm/messages on every node? You need to look on the node where the RG was online to see the start-failure messages.
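For example (assuming, just for illustration, that the resource group is named rg1), something like this on each node would show the relevant log entries and the current states:

    # look for method-failure and state-change messages mentioning rs1
    grep rs1 /var/adm/messages | tail -50

    # check the current state of the resource and its group
    clresource status rs1
    clresourcegroup status rg1

A Start_timeout expiry is logged in /var/adm/messages on the node that actually ran the method, which is why every node needs to be checked.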
[Note: the method that failed could be either the Start or Prenet_start method of rs1, depending on its resource type.]

The recovery action that is taken depends on the Failover_mode property of rs1. If Failover_mode is set to Soft or Hard, the whole resource group will attempt to fail over to a different node; if no other node is available, the RG may try to restart on the same node. If it succeeded in restarting, that might give the false appearance that the resource started successfully on the first try. If Failover_mode has any value besides Soft or Hard, the resource group remains online on the same node, but the resource moves to the Start_failed state and the RG moves to the Online_faulted state. From your description, it does not sound like that is what happened. (There is a sketch of how to inspect the Failover_mode setting after the quoted message below.)

You can examine the details of what happened by looking at /var/adm/messages on the node where the RG was online. Also check for messages of the form "resource rs1 state on node xxx changed to yyy", which might appear on a different cluster node (the current RGM president node).

On 02/05/10 10:04 AM, Corrado Romano wrote:
> Hello,
>
> In a Sun Cluster 3.2 I have the following behavior:
>
> - if I enable the whole resource group (sudo scswitch -Z -g <group name>),
>   everything is correct
>
> - if I enable the resources one by one (e.g. resource rs1) with the
>   following two commands in sequence:
>
>   scswitch -e -j rs1
>   the resource starts correctly (normal messages in /var/adm/messages and no
>   errors) but I get this output:
>   "scswitch: resource group failed to start on chosen node; it may end up
>   failing over to other node(s)". No errors in /var/adm/messages.
>
>   clrs monitor rs1
>   the monitor starts correctly (normal messages in /var/adm/messages and no
>   errors) but I get this output:
>   "resource group failed to start on chosen node; it may end up failing over
>   to other node(s)"
>
> Does somebody know what the reason for this could be? I am seeing this
> behavior for the first time, after working with five other 3.2 clusters
> which behaved normally.
>
> Thanks,
> Corrado
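As mentioned above, here is a rough sketch of how you could inspect, and if needed change, the Failover_mode property with the Sun Cluster 3.2 object-oriented commands. The resource name rs1 is taken from your post; the SOFT value is only an example:

    # show the current Failover_mode setting of rs1
    clresource show -p Failover_mode rs1

    # example only: set it to SOFT so a start failure triggers a failover
    clresource set -p Failover_mode=SOFT rs1

Valid values are NONE, SOFT, HARD, RESTART_ONLY, and LOG_ONLY; the same property can also be read from inside a method script via scha_resource_get(1HA).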