This one is easy.... I always script everything for the sake of repeatability....
I shall make the assumption that you have already set up two domUs with build 101a, as that is the build required to run the last and final version of Cluster Express. The image for 101a and the matching version of Cluster Express are here:

    http://opensolaris.org/os/community/ha-clusters/ohac/Documentation/SCXdocs/SCX/

Your domUs will need at least 3 interfaces on 3 different networks:

  1. public interface
  2. heartbeat interface 1
  3. heartbeat interface 2

To demonstrate this issue you will NOT need any shared storage or a quorum server, as the issue manifests itself before that requirement comes in.

I installed my domUs with a ZFS root filesystem. This may be important, as it forces me to use lofi for the globaldevices filesystem.

Once your systems are up and running, you need to install Cluster Express. RPC bind needs to listen on more than just the loopback:

    svccfg -f - << EOI
    select network/rpc/bind
    setprop config/local_only=false
    quit
    EOI
    svcadm refresh network/rpc/bind:default

For the actual cluster software there is a graphical installer; no command-line version, I'm afraid. You need to install the core cluster only. When it asks whether to configure it now, say yes. Any upgrades to Java packages etc. should be accepted, even when they don't look like they're required. Leave locale support enabled.

If you're struggling with the graphical installer, you can alternatively run this (in ${clusterexpress}/Solaris_x86, where ${clusterexpress} is the directory with the uncompressed software):

    ./installer -noconsole -nodisplay -state /path/to/cluster-install-sol11-101a.state

I have attached my state file.

Reboot both nodes.

I'm now assuming that ${link1} is the interface name of the first cluster heartbeat interface and ${link2} that of the second. I am also assuming that ${node1} is the name of the first node of the cluster (the sponsoring node) and ${node2} is that of the second node.

On the first node (${node1}) run this:

    PATH=/usr/cluster/bin:/usr/cluster/sbin:${PATH}
    export PATH
    scinstall -ik \
        -C sol \
        -F \
        -G lofi \
        -T node=${node1},node=${node2},authtype=sys \
        -A trtype=dlpi,name=${link1} -A trtype=dlpi,name=${link2} \
        -B type=switch,name=switch1 -B type=switch,name=switch2 \
        -m endpoint=:${link1},endpoint=switch1 \
        -m endpoint=:${link2},endpoint=switch2

Wait for it to complete, then reboot the node. After this reboot it will take a long time for the node to come back to the point where svcs -x shows only one cluster service not running. You will note that it runs scgdevs for a very long time (I suspect another bug); I think it took somewhere between 15 and 30 minutes. Either way, have a lot of patience: scgdevs will eventually complete and just one of the cluster services will show as not running in svcs -x.

Once that is the case, on ${node2} execute this:

    PATH=/usr/cluster/bin:/usr/cluster/sbin:${PATH}
    export PATH
    scinstall -ik \
        -C sol \
        -G lofi \
        -N ${node1} \
        -A trtype=dlpi,name=${link1} -A trtype=dlpi,name=${link2} \
        -B type=switch,name=switch1 -B type=switch,name=switch2 \
        -m endpoint=:${link1},endpoint=switch1 \
        -m endpoint=:${link2},endpoint=switch2

This will eventually complete. At this point, if you run scstat on ${node1}, you should see two configured and failed interconnects. This is OK, because you now need to reboot ${node2}. ${node2} will not complete its reboot but will just sit there saying something like "waiting for node1 to become available, cluster has not yet reached quorum". You can't log on to ${node2} at this point at all, as you won't get a console login and ssh has not yet started.
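For what it's worth, these are the sanity checks I find useful along the way. This is only a sketch; the dladm and svcprop invocations are my own additions rather than anything the procedure above requires:

    # Confirm the three interfaces (public plus two heartbeat links) are
    # visible in the domU; the actual link names depend on your Xen setup.
    dladm show-link

    # Confirm rpcbind is no longer restricted to loopback
    # (should print "false" after the svccfg step above).
    svcprop -p config/local_only network/rpc/bind

    # After each scinstall and reboot, check cluster service and node state.
    PATH=/usr/cluster/bin:/usr/cluster/sbin:${PATH}
    export PATH
    svcs -x
    scstat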
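While ${node2} is hung waiting for quorum, the interesting traffic is on the private interconnects. A minimal snoop sketch from ${node1}, assuming ${link1} and ${link2} are the heartbeat link names as above (the capture file paths are just examples):

    # Capture on each private interconnect from ${node1}; -d selects the
    # datalink and -o writes a capture file that can be replayed later
    # with "snoop -i <file>".
    snoop -d ${link1} -o /var/tmp/${link1}.snoop &
    snoop -d ${link2} -o /var/tmp/${link2}.snoop &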
You can also add another node onto the interconnects to do some snooping, and you can obviously snoop from ${node1} itself (as sketched above).

Hope the above makes sense. If you need any assistance on this, I'm more than willing. After all, you're helping me... ;-)

Thanks again.

Ciao,
Eric

David Edmondson wrote:
> Could someone please describe how to set up the cluster software in a
> guest so that I could debug this? (Though it won't be for a few days.)
>
> Presume that I know nothing about cluster (which is pretty
> accurate...).
>
> dme.

-- 
Eric A. Bautsch
BT Annuity Service
Service Support Manager
Email: eric.bautsch at sun.com
Telephone: 07710 495920
Team Mail: ann-service-support at sun.com
Team Web: http://btannuity.uk/servicesupport/