Hi Lalith et al,

you also need to look at /var/adm/messages on all nodes to see what specifically is going wrong with your resource.
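For example (grepping for sccron is just an assumption based on your resource name below - use whatever string your resource type or resource name shows up as):

  # grep -i sccron /var/adm/messages | tail -40

or keep a second terminal open with

  # tail -f /var/adm/messages

while you run 'clrg online test'. The Sun Cluster (RGM and pmf) entries logged around the moment the group drops offline are usually the interesting ones.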
You can increase the level of debug information by changing the following line in /etc/syslog.conf on all nodes:

*.err;kern.debug;daemon.notice;mail.crit        /var/adm/messages

to

*.err;kern.debug;daemon.debug;mail.crit         /var/adm/messages

Afterwards restart syslog with:

# svcadm restart system-log

Further, in your <agent directory>/etc/config you can change DEBUG= to DEBUG=ALL.

Since you use GDS, check that the start command leaves a process running for PMF to monitor. You might want to have a look at the Child_mon_level property if the pids left behind are only children. If the start method does not leave a process behind, you need to disable the pmf action script, see:

http://src.opensolaris.org/source/xref/ha-utilities/GDS-template/SUNCscxxx/bin/functions#222
http://src.opensolaris.org/source/xref/ha-utilities/GDS-template/SUNCscxxx/bin/functions#326

If you see "Failed to stay up" messages from pmf, that would be an indication of this. A quick way to check what pmf is monitoring is sketched in the P.S. at the end of this mail.

Of course it is hard to tell without seeing your code and knowing what you want to achieve. The success messages in the logfiles you are looking at only indicate that the methods returned 0 (= success) - they do not mean the methods did what you intended.

Greets
Thorsten

Lalith Suresh wrote:
> Hi all,
>
> I'm almost done with the coding part as far as HA-Cron is concerned and
> I'd tested the start, stop, probe and validate scripts individually.
> They seemed to work fine. I then went on to run Make_Package to get the
> pkg, and then installed it on my single node cluster. But whenever I
> start the RG containing HA-Cron, it goes online for an instant, then
> goes offline. I have no clue what's going on. Please help. Here are some
> messages that'll help.
>
> The first time I run this, I don't get any errors:
>
> bash-3.2# clrg online test
>
> In about 3 seconds:
>
> bash-3.2# cluster status
>
> === Cluster Nodes ===
>
> --- Node Status ---
>
> Node Name          Status
> ---------          ------
> irule              Online
>
>
> === Cluster Transport Paths ===
>
> Endpoint1          Endpoint2          Status
> ---------          ---------          ------
>
>
> === Cluster Quorum ===
>
> --- Quorum Votes Summary ---
>
> Needed    Present    Possible
> ------    -------    --------
> 1         1          1
>
>
> --- Quorum Votes by Node ---
>
> Node Name       Present       Possible       Status
> ---------       -------       --------       ------
> irule           1             1              Online
>
>
> === Cluster Device Groups ===
>
> --- Device Group Status ---
>
> Device Group Name     Primary     Secondary     Status
> -----------------     -------     ---------     ------
>
>
> --- Spare, Inactive, and In Transition Nodes ---
>
> Device Group Name   Spare Nodes   Inactive Nodes   In Transistion Nodes
> -----------------   -----------   --------------   --------------------
>
>
> --- Multi-owner Device Group Status ---
>
> Device Group Name           Node Name           Status
> -----------------           ---------           ------
>
>
> === Cluster Resource Groups ===
>
> Group Name       Node Name       Suspended       State
> ----------       ---------       ---------       -----
> test             irule           No              Offline
>
>
> === Cluster Resources ===
>
> Resource Name      Node Name      State        Status Message
> -------------      ---------      -----        --------------
> node               irule          Offline      Offline - LogicalHostname offline.
> SUNCsccron         irule          Offline      Offline
>
>
> === Cluster DID Devices ===
>
> Device Instance             Node           Status
> ---------------             ----           ------
> /dev/did/rdsk/d1            irule          Ok
>
>
> === Zone Clusters ===
>
> --- Zone Cluster Status ---
>
> Name   Node Name   Zone HostName   Status   Zone Status
>
>
> Here's the log file for that run:
>
> /var/cluster/logs/DS/test/SUNCsccron/start_stop_log.txt
>
> 01/31/2009 16:47:38 irule STOP-INFO> Stop succeeded
> [/opt/SUNCsccron/bin/control_cron -R SUNCsccron -G test -C
> /var/spool/cron/crontabs/xyz stop].
> 01/31/2009 16:47:38 irule STOP-INFO> Successfully stopped the application
> 01/31/2009 16:47:38 irule --INFO> Validate has been executed
> [/opt/SUNCsccron/bin/control_cron -R SUNCsccron -G test -C
> /var/spool/cron/crontabs/xyz validate exited with status 0]
> 01/31/2009 16:47:38 irule START-INFO> Start succeeded.
> [/opt/SUNCsccron/bin/control_cron -R SUNCsccron -G test -C
> /var/spool/cron/crontabs/xyz start]
> 01/31/2009 16:47:38 irule STOP-INFO> Stop succeeded
> [/opt/SUNCsccron/bin/control_cron -R SUNCsccron -G test -C
> /var/spool/cron/crontabs/xyz stop].
> 01/31/2009 16:47:38 irule STOP-INFO> Successfully stopped the application
> 01/31/2009 16:47:39 irule --INFO> Validate has been executed
> [/opt/SUNCsccron/bin/control_cron -R SUNCsccron -G test -C
> /var/spool/cron/crontabs/xyz validate exited with status 0]
> 01/31/2009 16:47:39 irule START-INFO> Start succeeded.
> [/opt/SUNCsccron/bin/control_cron -R SUNCsccron -G test -C
> /var/spool/cron/crontabs/xyz start]
> 01/31/2009 16:47:39 irule STOP-INFO> Stop succeeded
> [/opt/SUNCsccron/bin/control_cron -R SUNCsccron -G test -C
> /var/spool/cron/crontabs/xyz stop].
> 01/31/2009 16:47:39 irule STOP-INFO> Successfully stopped the application
>
> But when I run it again, I get this:
>
> bash-3.2# clrg online test
> clrg: (C748634) Resource group test failed to start on chosen node and
> might fail over to other node(s)
> clrg: (C135343) No primary node could be found for resource group test;
> it remains offline
>
>
> And another weird thing is that even though the log file says it
> successfully stopped the application, it isn't producing the expected
> results either. (Once HA-Cron stops, it's supposed to return the root
> crontab file back to the way it was before the RG was started, by making
> use of a backup. Although the backup is removed, the original crontab
> isn't being restored.)
>
>
> --
> Lalith Suresh
> Department of Computer Engineering
> Malaviya National Institute of Technology, Jaipur
> +91-9982190365 , lalithsuresh.wordpress.com
> <http://lalithsuresh.wordpress.com>
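P.S. Here is the quick sketch I mentioned above for checking whether your start command actually left something behind for pmf to monitor (pmfadm and ps are standard commands; grepping for cron assumes your start method launches the cron daemon):

  # pmfadm -L                  # list all tags pmf is currently monitoring
  # ps -ef | grep cron         # is the process your start command launched still alive?

Run pmfadm -L once before and once right after 'clrg online test' and compare the output: if no new tag shows up for your resource, the start command exited without leaving a process behind, and you are in the "disable the pmf action script" case described above.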