Re: [xcat-user] xCAT HA documentation
Further to this thread from last year: I am in the process of doing some rhels6.5 installs, and Red Hat have changed things again. They now officially support pcs, but there are docs suggesting that they will be moving away from corosync, so we need to use ccs and cman.

I have spent a couple of days on this, and have now updated the documentation mentioned below a bit, adding an Appendix B with the relevant configs for cman and the config changes for pcs.

FYI, I have already done HA with pcs successfully at at least 3 customer sites (rhel6.4), and at one with crm (rhel6.3). The customer even installed the Primary MN with the secondary, and it was successful.

So thanks again to the xCAT team for all the original docs.

regards,
Arif

--
Arif Ali
IRC: arif-ali at freenode
LinkedIn: http://uk.linkedin.com/in/arifali

On 21 August 2013 02:29, Guang Cheng Li ligua...@cn.ibm.com wrote:

Hi Arif,

Thanks for the configuration listed below. I updated the xCAT doc http://sourceforge.net/apps/mediawiki/xcat/index.php?title=Setup_HA_Mgmt_Node_With_DRBD_Pacemaker_Corosync to reflect this. If you have any further updates to the configuration or procedure, please let me know and I can update the doc.

Thanks.

- Li, Guang Cheng (李光成)
IBM China System Technology Laboratory
Email: ligua...@cn.ibm.com
Address: Building 28, ZhongGuanCun Software Park, No.8, Dong Bei Wang West Road, Haidian District, Beijing 100193, PRC
From: Arif Ali m...@arif-ali.co.uk
To: xCAT Users Mailing list xcat-user@lists.sourceforge.net
Date: 2013-08-21 05:18
Subject: Re: [xcat-user] xCAT HA documentation

Hi Lindsay,

Thanks for the info, that is good to know. But after a day of diagnosing, I have found what is needed to get it all working.

So with respect to the documentation at http://sourceforge.net/apps/mediawiki/xcat/index.php?title=Setup_HA_Mgmt_Node_With_DRBD_Pacemaker_Corosync, the following are the current changes. This is now working successfully for me. I will be doing more testing tomorrow. If the devs are happy, I will make the relevant changes to reflect the change in rhels6.4.

pcs property set stonith-enabled=false
pcs property set no-quorum-policy=ignore
pcs resource op defaults timeout=120s

pcs resource create ip_xCAT ocf:heartbeat:IPaddr2 ip=10.1.0.1 \
    iflabel=xCAT cidr_netmask=24 \
    op monitor interval=37s
pcs resource create NFS_xCAT lsb:nfs \
    op monitor interval=41s
pcs resource create NFSlock_xCAT lsb:nfslock \
    op monitor interval=43s
pcs resource create apache_xCAT ocf:heartbeat:apache configfile=/etc/httpd/conf/httpd.conf \
    statusurl=http://localhost:80/icons/README.html testregex=/html \
    op monitor interval=57s
pcs resource create db_xCAT ocf:heartbeat:mysql config=/xCATdrbd/etc/my.cnf test_user=mysql \
    binary=/usr/bin/mysqld_safe pid=/var/run/mysqld/mysqld.pid socket=/var/lib/mysql/mysql.sock \
    op monitor interval=57s
pcs resource create dhcpd lsb:dhcpd op monitor interval=37s
pcs resource create drbd_xCAT ocf:linbit:drbd drbd_resource=xCAT
pcs resource master ms_drbd_xCAT drbd_xCAT master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
pcs resource create dummy ocf:heartbeat:Dummy
pcs resource create fs_xCAT ocf:heartbeat:Filesystem device=/dev/drbd/by-res/xCAT directory=/xCATdrbd fstype=ext4 \
    op monitor interval=57s
pcs resource create named lsb:named \
    op monitor interval=37s
pcs resource create symlinks_xCAT ocf:tummy:drbdlinks configfile=/xCATdrbd/etc/drbdlinks.xCAT.conf \
    op monitor interval=31s
pcs resource create xCAT lsb:xcatd \
    op monitor interval=42s
pcs resource clone clone_named named clone-max=2 clone-node-max=1 notify=false

pcs resource group add fs_xCAT symlinks_xCAT

pcs constraint colocation add NFS_xCAT grp_xCAT
pcs constraint colocation add NFSlock_xCAT grp_xCAT
pcs constraint colocation add apache_xCAT grp_xCAT
pcs constraint colocation add dhcpd grp_xCAT
pcs constraint colocation add db_xCAT grp_xCAT
pcs constraint colocation add dummy grp_xCAT
pcs constraint colocation add xCAT grp_xCAT
pcs constraint colocation add grp_xCAT ms_drbd_xCAT INFINITY with-rsc-role=Master
pcs constraint colocation add ip_xCAT ms_drbd_xCAT INFINITY with-rsc-role=Master

pcs constraint order list xCAT dummy
pcs constraint order list
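
The seven colocation constraints that tie individual resources to grp_xCAT all follow one pattern, so they could be generated from a list instead of being typed out. A minimal sketch, assuming the resource names from the message above; `gen_colocations` is a hypothetical helper, not part of pcs, and it only prints the commands (a dry run) so they can be reviewed before being run on a cluster node.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) one colocation constraint per resource.
# Usage: gen_colocations TARGET RESOURCE...
gen_colocations() {
    target=$1
    shift
    for rsc in "$@"; do
        printf 'pcs constraint colocation add %s %s\n' "$rsc" "$target"
    done
}

# Resource names as listed in the message above.
gen_colocations grp_xCAT NFS_xCAT NFSlock_xCAT apache_xCAT dhcpd db_xCAT dummy xCAT
```

Because the helper only writes to stdout, the list can be inspected or diffed against an existing configuration first; piping the output to `sh` on a cluster node would then apply it.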
Re: [xcat-user] How to create and deploy an xCAT Service Node
Okay, I guess I need to revive this again now that I have the SNs deployed and am trying to snmove some nodes onto them. The Hierarchical Cluster wiki page is oriented toward those setting up a brand new cluster, not migrating an established cluster to include SNs, so it does not include clear instructions on which commands to run after you have created groups of CNs for SNs to manage. I am assuming that to get nodes to initially look away from the MN and put them on an SN for the first time, you must execute snmove with -d and -D to point to the SN.

My config follows. I am testing on just two of the nodes in my cluster for now. So first I did this:

mkdef -t group -o serv1_compute members=node0001,node0002

Then, following the documentation for creating service pools, I did this:

chdef -t group serv1_compute servicenode=xcat-serv1,xcat-serv2

# lsdef -t group serv1_compute
Object name: serv1_compute
    grouptype=static
    members=node0001,node0002
    servicenode=xcat-serv1,xcat-serv2

And noderes looks like this now:

#node,servicenode,netboot,tftpserver,tftpdir,nfsserver,monserver,nfsdir,installnic,primarynic,discoverynics,cmdinterface,xcatmaster,current_osimage,next_osimage,nimserver,routenames,nameservers,comments,disable
user,,xnba,MN_IP,,MN_IP,,,eth0,eth0,,,MN_IP,,,
service,,xnba,MN_IP,,MN_IP,,,mac,mac,,,MN_IP,,,
storage,,xnba,MN_IP,,MN_IP,,,eth1,eth1,,,MN_IP,,,
compute,,xnba,MN_IP,,MN_IP,,,eth0,eth0,,,MN_IP,,,
login,,xnba,MN_IP,,MN_IP,,,eth0,eth0,,,MN_IP,,,
node0059,,xnba,
hinode01,,xnba,
serv1_compute,xcat-serv1,xcat-serv2,,
node0001,xcat-serv1,xcat-serv2,,xcat-serv1,,xcat-serv1,,,xcat-serv1,,,
node0002,xcat-serv1,xcat-serv2,,xcat-serv1,,xcat-serv1,,,xcat-serv1,,,

I may have a conflict, though, in that the established compute group which node0001 and node0002 are in points to MN_IP (the MN's IP address) while serv1_compute points to xcat-serv1.
I was hoping that since noderes FURTHER defined the servicenode and xcatmaster for them, it would override the settings for compute. Will that work, or do I have to remove node0001 and node0002 from compute altogether? Their nodelist entries look like this:

node0001,compute,compute-profile,ipmi,dx360m2,rack01,all,serv1_compute,booting,11-24-2013 13:55:00,synced,02-05-2014 08:59:57,,
node0002,compute,compute-profile,ipmi,dx360m2,rack01,all,serv1_compute,booting,11-24-2013 13:55:00

Then after all the configuration, I tried an snmove on just node0001:

# snmove serv1_compute -d xcat-serv1 -D xcat-serv1
Moving nodes to their backup service nodes.
Setting new values in the xCAT database.
node0001: install centos6.4-x86_64-compute
node0002: install centos6.4-x86_64-compute
node0001: install centos6.4-x86_64-compute
node0002: install centos6.4-x86_64-compute
Running postscripts on the nodes.
If you specify the -s flag you must not specify either the -S or -k or -P flags

In /var/log/messages I saw:

Allowing nodeset to node0001,node0002 install for x3650-head01.haib.org from x3650-head01

Firstly, why was a nodeset done when I typed snmove? The nodes are already installed, and I don't want to reinstall them.

Secondly, according to the wiki documentation:

If the CNs are up at the time the *snmove* command is run then snmove will run postscripts on the CNs to reconfigure them for the new SN.

However, I checked files on node0001 such as /etc/ntp.conf and their timestamps had not changed (therefore I deduce the postscripts did not run). So I ran the postscripts manually with updatenode node0001 syslog,setupntp. I checked /etc/ntp.conf again and this time the timestamp was updated, but the file's contents were identical to before: it pointed to the MN_IP and not to xcat-serv1, as it should based on the xcatmaster setting in the noderes table.

What am I doing wrong here?
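
The /etc/ntp.conf check described above can be made repeatable by testing the file's contents directly rather than its timestamp. A minimal sketch, assuming the setupntp postscript writes ordinary `server <host>` lines; `ntp_points_to` is a hypothetical helper, not an xCAT command.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FILE contains a "server HOST" line.
# Usage: ntp_points_to FILE HOST
ntp_points_to() {
    awk -v host="$2" '
        $1 == "server" && $2 == host { found = 1 }
        END { exit !found }
    ' "$1" 2>/dev/null
}

# Assumed expectation: the node should reference the SN named in noderes.xcatmaster.
if ntp_points_to /etc/ntp.conf xcat-serv1; then
    echo "ntp.conf points at xcat-serv1"
else
    echo "ntp.conf does not reference xcat-serv1"
fi
```

Run on the compute node itself (for example through xdsh from the MN), this reports whether the postscript actually rewrote the server line, independent of whether the file was merely touched.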
Thanks,
Josh

On Fri, Jan 10, 2014 at 1:48 PM, Josh Nielsen jniel...@hudsonalpha.org wrote:

Thank you Lissa, that is helpful.

-Josh

On Fri, Jan 10, 2014 at 1:25 PM, Lissa Valletta lis...@us.ibm.com wrote:

DNS and DHCP will still work from the Service Node, if set up correctly. In other words, if you have configured the service node as the DNS server and/or DHCP server for the nodes, there is no requirement on the Management Node for DNS or DHCP.

You will not be able to run any xCAT commands on the service node if the Management Node is down. xCAT requires access to the database configured on the MN for the xCAT cluster (mysql, postgresql) to run most xCAT commands, even to recognize that the node is in the xCAT cluster.

Lissa K. Valletta
8-3/B10
Poughkeepsie, NY 12601
(tie 293) 433-3102