Re: [xcat-user] confignetwork and localhost hostname

2019-05-23 Thread Yuan Y Bai
Hi Roosen, Nicolas
 
I am glad that `confignetwork` can work well in your postbootscripts.
 
In 2.14.6, `confignetwork` has some restrictions when it runs from postscripts, but it works well in postbootscripts. I have enhanced it to run in postscripts in the 2.15 branch. If you want to run it in postscripts, you can download the latest `confignetwork`, `configeth` and `nicutils.sh` from the 2.15 master branch.
 
 
Best Regards
--
Yuan Bai (白媛)
CSTL HPC System Management Development
Tel: 86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28, ZhongGuanCun Software Park, No.8 Dong Bei Wang West Road, Haidian District, Beijing P.R.China 100193
 
 
----- Original message -----
From: "Roosen, Nicolas"
To: "xcat-user@lists.sourceforge.net"
Cc:
Subject: [EXTERNAL] Re: [xcat-user] confignetwork and localhost hostname
Date: Thu, May 23, 2019 6:46 PM

Hi,

On 5/23/19 12:28 PM, Yuan Y Bai wrote:
> Hi Nicolas
>
> Based on your log, you use `confignics`. `confignics` cannot configure bond.
>
> Could you try to use like the following command to configure bond?
>
>     chdef cn1  postbootscripts="otherpkgs,confignetwork -s" postscripts="syslog,remoteshell,syncfiles"

yes you're right, I was mixing the two methods.
It still doesn't work when I use the "confignetwork" in the postscripts.
But postbootscript is fine for me.

So I did some tests recently, and it almost works (see below why) when
provisioning the node using the "postbootscript" method.

Now I have an issue with the Ethernet driver (i40e / i40iw on an Intel
X722 card) which does a core dump, but that's another story :-/

Thanks for your help.

Nicolas

Re: [xcat-user] confignetwork and localhost hostname

2019-05-23 Thread Yuan Y Bai
Hi Nicolas
 
Based on your log, you are using `confignics`, which cannot configure bonds.

Could you try a command like the following to configure the bond?
 
    chdef cn1  postbootscripts="otherpkgs,confignetwork -s" postscripts="syslog,remoteshell,syncfiles"
 
 
 
 
 
----- Original message -----
From: "Roosen, Nicolas"
To: "xcat-user@lists.sourceforge.net"
Cc:
Subject: [EXTERNAL] Re: [xcat-user] confignetwork and localhost hostname
Date: Wed, May 22, 2019 3:51 PM

On 5/22/19 8:52 AM, Yuan Y Bai wrote:
> Hi Roosen,
>
> Could you try to use `confignetwork -s` instead of `confignetwork` in
> your postscripts?
>
> You can use this command to change your postscripts:  chdef node1
> postscripts="syslog,remoteshell,syncfiles,confignetwork -s"
>
> I think you use install NIC as one of bond slaves. `confignetwork -s`
> can configure hostname during configure install NIC, after that, it
> start to create bond.

Thanks for the suggestion. I added the "-s" switch, still the issue is
the same.

In the logs I see that the "bonding" module fails to load when
provisioning, maybe I have to add this module somewhere (initramfs ?).

rt..: confignics
confignics on node1: config install nic:0, remove: 0, iba ports: 1
ib0!10.148.251.11
bond0!1.2.3.4
bond1!6.7.8.9
confignics on node1: unknown nic type for bond0: 1.2.3.4 .
confignics on node1: unknown nic type for bond1: 6.7.8.9 .
confignics on node1: executed script: configib for nics: ib0, ports: 1
bond0!BONDING_OPTS=mode=2
bond1!MTU=9000
...: confignics return with 1
[...]
configure nic and its device : bond0 enp195s0f0@enp195s0f1
type=ethernet
ond0".
[E]:Error: Fail to load kernel module "bonding"
[I]: >>
./nicutils.sh: line 1391: /sys/class/net/bonding_masters: Permission denied
[E]:Error: stage 0: Fail to create bond device "bond0"

Thanks.

Nicolas

Re: [xcat-user] confignetwork and localhost hostname

2019-05-22 Thread Yuan Y Bai
Hi Roosen,
 
Could you try using `confignetwork -s` instead of `confignetwork` in your postscripts?

You can use this command to change your postscripts:

    chdef node1 postscripts="syslog,remoteshell,syncfiles,confignetwork -s"

I think you are using the install NIC as one of the bond slaves. `confignetwork -s` configures the hostname while it configures the install NIC; after that, it starts to create the bond.
 
 
 
 
----- Original message -----
From: "Roosen, Nicolas"
To: "xcat-user@lists.sourceforge.net"
Cc:
Subject: [EXTERNAL] [xcat-user] confignetwork and localhost hostname
Date: Tue, May 21, 2019 8:34 PM

Hello, on xCAT 2.14.6 w/ RHEL 7.6, I have this weird issue: after
provisioning the hostname is not correctly set, it stays to "localhost".

This happened since I added the "confignetwork" script to the node
definition (to setup bonded interfaces).

Here are some details:

lsdef -t node -o node1

Object name: node1
    arch=x86_64
    currchain=boot
    currstate=install rhels7.6-x86_64-compute
    groups=all
    installnic=mac
    ip=1.2.3.4
    mac=08:00:00:00:00:00
    mgt=none
    netboot=xnba
    nicdevices.bond0=enp195s0f0|enp195s0f1
    nicdevices.bond1=enP2p193s0|enP3p65s0
    nicextraparams.bond0=BONDING_OPTS=mode=2
    nicextraparams.bond1=MTU=9000
    nicips.bond0=1.2.3.4
    nicips.bond1=6.7.8.9
    nicnetworks.bond0=1_1_3_4-255_255_0_0
    nicnetworks.bond1=6_7_8_9-255_255_255_0
    nictypes.enP3p65s0=ethernet
    nictypes.bond0=bond
    nictypes.bond1=bond
    nictypes.enp195s0f1=ethernet
    nictypes.enP2p193s0=ethernet
    nictypes.enp195s0f0=ethernet
    os=rhels7.6
    postbootscripts=otherpkgs
    postscripts=syslog,remoteshell,syncfiles,confignetwork
    primarynic=mac
    profile="">
    provmethod=rhels7.6-x86_64-install-node
    routenames=defaultroute
    status=booted
    statustime=05-20-2019 17:42:15
    updatestatus=synced
    updatestatustime=05-20-2019 14:31:18

lsdef -t osimage -o rhels7.6-x86_64-install-node

Object name: rhels7.6-x86_64-install-node
    addkcmdline=earlyprintk=ttyS0,115200 console=tty0 console=ttyS0,115200
    imagetype=linux
    osarch=x86_64
    osdistroname=rhels7.6-x86_64
    osname=Linux
    osvers=rhels7.6
    otherpkgdir=/install/post/otherpkgs/rhels7.6/x86_64
    otherpkglist=/install/custom/install/rh/sdflex.rhels7.otherpkgs.pkglist
    partitionfile=/install/custom/install/rh/sdflexparitions
    pkgdir=/install/rhels7.6/x86_64,/install/post/otherpkgs/rhels7.6/x86_64
    pkglist=/opt/xcat/share/xcat/install/rh/sdflex.rhels7.pkglist
    profile="">
    provmethod=install
    template=/install/custom/install/rh/sdflex.rhels7.tmpl

--
Nicolas
 


___
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user


Re: [xcat-user] Ethernet bonding mode (BONDING_OPTS)

2019-05-20 Thread Yuan Y Bai
Hi Roosen,
 
Thanks for the information.

Do you see `bond0!BONDING_OPTS=mode=2` in the output of updatenode?
Do you find `BONDING_OPTS="mode=2"` in ifcfg-bond0 after running the updatenode command?
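
A minimal sketch of the second check, assuming a RHEL-style ifcfg file in which the value is written as `BONDING_OPTS="..."` on a single line. The helper name `bonding_opts_of` is hypothetical and is not part of xCAT.

```shell
#!/bin/sh
# Hypothetical helper (not an xCAT command): print the BONDING_OPTS value
# stored in an ifcfg file, without the surrounding double quotes.
bonding_opts_of() {
    # $1: path to the ifcfg file, e.g. a copy of ifcfg-bond0
    sed -n 's/^BONDING_OPTS="\{0,1\}\([^"]*\)"\{0,1\}$/\1/p' "$1"
}
```

On the node this might be called as `bonding_opts_of /etc/sysconfig/network-scripts/ifcfg-bond0`, and the result compared with the "Bonding Mode" line in /proc/net/bonding/bond0.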
 
I cannot reproduce this issue in my RH7.6 environment. I ran `chdef node1 nicextraparams.bond0="BONDING_OPTS=mode=2"` and then `updatenode node1 confignetwork`, and it worked.

"802.3ad" is the default value for BONDING_OPTS in our code, /install/postscripts/nicutils.sh. You can replace the line _bonding_opts="mode=802.3ad miimon=100" with _bonding_opts="mode=2" in that file, then run "updatenode node1 confignetwork" to see whether it works.
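
The replacement suggested above could be scripted roughly as follows. This is only a sketch: it assumes the file contains the exact default line quoted in this message, and the function name `patch_bonding_default` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: swap the hard-coded default bonding options in
# nicutils.sh (or a copy of it). Assumes the default line reads exactly
# _bonding_opts="mode=802.3ad miimon=100", as quoted in the message.
patch_bonding_default() {
    file=$1        # path to nicutils.sh or a copy of it
    new_opts=$2    # e.g. "mode=2"; must not contain '/' or '&'
    # -i.bak edits in place and keeps a backup so the change is easy to revert.
    sed -i.bak \
        "s/_bonding_opts=\"mode=802.3ad miimon=100\"/_bonding_opts=\"${new_opts}\"/" \
        "$file"
}
```

For example, `patch_bonding_default /install/postscripts/nicutils.sh "mode=2"` followed by `updatenode node1 confignetwork`. Trying it on a copy first is the safer choice, since the file is shared by all nodes.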
 
 
 
 
----- Original message -----
From: "Roosen, Nicolas"
To: "xcat-user@lists.sourceforge.net"
Cc:
Subject: [EXTERNAL] Re: [xcat-user] Ethernet bonding mode (BONDING_OPTS)
Date: Fri, May 17, 2019 9:54 PM

Answering to myself:

On 5/17/19 12:43 PM, Roosen, Nicolas wrote:
> Hello, I am trying to setup a bonded Ethernet interface on a node (xCAT
> 2.14.6 / RHEL 7.6).
>
> It works almost fine, except for the bonding options which I fail to change.
> The default seems to be "802.3ad".
> I'd like to set it to "mode=2".
>
> I tried this:
>
> chdef node1 nicextraparams.bond0="BONDING_OPTS=mode=2"
>
> Which seems to be taken into account:
>
> tabdump nics
> #node,nicips,nichostnamesuffixes,nichostnameprefixes,nictypes,niccustomscripts,nicnetworks,nicaliases,nicextraparams,nicdevices,nicsadapter,comments,disable
> "node1","bond0!172.31.0.11",,,"bond0!bond,enp195s0f1!ethernet,enp195s0f0!ethernet",,"bond0!172_31_0_0-255_255_0_0",,"bond0!BONDING_OPTS=mode=2","bond0!enp195s0f0|enp195s0f1",,,
>
> But when I update the network configuration on the node, the settings is
> not applied.
>
> cat /proc/net/bonding/bond0 | grep -i mode
> Bonding Mode: IEEE 802.3ad Dynamic link aggregation
>
> What else can I try?
>

It works fine when I re-install the node.

[root@localhost network-scripts]# cat ifcfg-bond0
ONBOOT="yes"
USERCTL="no"
TYPE="Bond"
BONDING_MASTER="yes"
BONDING_OPTS="mode=2"
BOOTPROTO="static"
DHCLIENTARGS="-timeout 200"
MTU="1500"
DEVICE="bond0"

cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: load balancing (xor)
Transmit Hash Policy: layer2 (0)

--
Nicolas
 




[xcat-user] Collect xCAT users logo and successful client case

2019-05-06 Thread Yuan Y Bai
Hi xCAT Users!
We thank you for your continued support and interest in our product. As you may have noticed we recently deployed a new logo and revamped the project homepage. We added a section for "Who's using xCAT" (http://xcat.org/index.html#who-is-using-xcat) to help back success stories of the product.
If you and/or your organization are willing to opt-in and allow us to use your logo on our homepage, please open an issue here: https://github.com/xcat2/xcat2.github.io/issues/new?template=logo.md
We appreciate your continued support!
 
If you have any problems, please feel free to contact us.
 


Re: [xcat-user] confignics -s -r

2019-03-25 Thread Yuan Y Bai
Hi Christopher,
 
When a node is installed successfully, its status is "booted"; you can check it with the command "lsdef <node> -i status". If "confignics -s -r" fails in postscripts or postbootscripts, the status will not be "booted".

You mentioned that the 17 failed nodes still had a DHCP address and that you can use updatenode as a workaround, so I think your OS provisioning finished. You can find all xCAT-related logs on the management node under the /var/log/xcat directory; the provisioning log is named computes.log*.
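
A small sketch for scanning those logs. It assumes that error lines carry the `[E]` marker seen in the log excerpt elsewhere in this thread; whether every xCAT script tags its errors this way is not confirmed here, and the function name `xcat_log_errors` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the lines tagged [E] from the provisioning logs.
# The [E] marker is an assumption taken from the log excerpt in this thread.
xcat_log_errors() {
    logdir=${1:-/var/log/xcat}   # default: log directory on the management node
    # -h suppresses file names, so only the matching log lines are printed.
    grep -h '\[E\]' "$logdir"/computes.log* 2>/dev/null
}
```

Run without arguments on the management node, it reads /var/log/xcat/computes.log*; a different directory can be passed as the first argument.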
 
You can also use xcatdebugmode to debug problems, especially during OS provisioning. Run "chdef -t site xcatdebugmode=1" to enable basic debug mode; after you repeat the failing actions, you can find the logs under /var/log/xcat.
Here is the related doc: https://xcat-docs.readthedocs.io/en/stable/troubleshooting/index.html

"confignics -s -r" configures the installnic and clears the configuration of the other NICs, so any scripts that run after "confignics -s -r" in postscripts or postbootscripts and that depend on NICs or the network may be affected. Please look at the logs.

Do you have other problems during installation? If you can provide more information, we can give you a better diagnosis.
 
Thanks.
 
 
 
 
 
----- Original message -----
From: Christopher Walker
To: xCAT Users Mailing list, Yuan Y Bai
Cc:
Subject: Re: [xcat-user] confignics -s -r
Date: Mon, Mar 25, 2019 6:22 PM

On 21/03/2019 02:37, Yuan Y Bai wrote:
> Hi Christopher,
> Thanks your answers.
> I think you can put "confignics -s -r" in postbootscripts, not
> postscripts in your failed nodes definition.
> I think you'd better upgrade  xCAT.

We plan to do this soon.

> Since in xCAT 2.12.4, "confignics
> -s"  did the actions "ifdown , generate configure files,
> then ifup ", these actions made unstable to configure
> installnic in the postscripts stage.

Thanks. Is the instability something that just affects installnic, or
does it affect other parts of the install too?

Chris

Re: [xcat-user] confignics -s -r

2019-03-21 Thread Yuan Y Bai
Hi Christopher,
 
Thanks for sharing how you use "confignics -s -r".

In the latest xCAT 2.14.x, we use "confignetwork" instead of "confignics", but "confignetwork" has no "-r" action.

Your use of "confignics -s -r" is a reasonable requirement for "confignetwork". We plan to implement it in "confignetwork" and are using this issue to track the work: https://github.com/xcat2/xcat2-task-management/issues/649
 
 

Re: [xcat-user] confignics -s -r

2019-03-20 Thread Yuan Y Bai
Hi Christopher,
 
Thanks for your answers.

I think you can put "confignics -s -r" in postbootscripts instead of postscripts in the definitions of your failed nodes.

I also recommend upgrading xCAT. In xCAT 2.12.4, "confignics -s" performs the sequence "ifdown <nic>, generate configuration files, then ifup <nic>", and these actions make configuring the installnic unstable in the postscripts stage.
 
 
 
 
----- Original message -----
From: Christopher Walker
To: "xcat-user@lists.sourceforge.net"
Cc:
Subject: Re: [xcat-user] confignics -s -r
Date: Thu, Mar 21, 2019 7:24 AM

On 20/03/2019 02:52, Yuan Y Bai wrote:
> Hi Christopher,
> Could you try to use "confignics -s -r" in postbootscripts?

We could, yes.

> In postscripts stage,   "-r" is to shut down the NIC if it is on, and
> remove interface configuration at the same time, when it ifdown install
> NIC, it may cause unreliable.

It sounds like this may well be the issue.

Are you saying there's a potential race between the "-s" and the "-r"
options?

> In order to help us know what happened in your failed nodes, could you
> share the following information?
> You have 10 nodes successfully, and 17 failed, are all these nodes
> installing the same OS?

Yes.

Furthermore, they were all of the same hardware type plugged into the
same switches.

> Which OS do you use?

Centos 7.4

> We have different code
> logic for different OS.
> I think you want to use "-r" to "deconfigure other network cards", you
> mentioned there was only one network, so I think other network cards
> were not configured in postscripts stage,

Correct, though they get the default config from Centos - which is to
DHCP. We'd prefer that the config were removed - otherwise we
potentially end up with two IPs on the same network (though it's
probably sensible to disable the network ports too).

> is "confignics -s" enough
> here?

No, we wish to remove the config for the other nics.

We can, I guess put confignics -s in postscripts and confignics -r in
postbootscripts (or vice versa). Is that what you'd suggest?

>  Do you have different comments here? Please feel freely to
> contact us, thanks.
>      10 ran it successfully
>      17 failed, so nodes still had a dhcp address

Yes indeed.

Thanks,

Chris

--
Dr Christopher J. Walker
ITS Research
Queen Mary University of London, E1 4NS
+44 20 7882 5969
 




Re: [xcat-user] confignics -s -r

2019-03-19 Thread Yuan Y Bai
Hi Christopher,
 
Could you try using "confignics -s -r" in postbootscripts?

In the postscripts stage, "-r" shuts down a NIC if it is up and removes its interface configuration at the same time. When it runs ifdown on the install NIC, the result may be unreliable.

To help us understand what happened on your failed nodes, could you share the following information?
You had 10 nodes succeed and 17 fail. Are all these nodes installing the same OS? Which OS do you use? We have different code logic for different OSes.
I think you want to use "-r" to "deconfigure other network cards". You mentioned there was only one network, so I think the other network cards were not configured in the postscripts stage. Is "confignics -s" enough here? If you see it differently, please feel free to contact us, thanks.

    10 ran it successfully
    17 failed, so nodes still had a dhcp address
 
 
 
 
- Original message -
From: Christopher Walker
To: "xcat-user@lists.sourceforge.net"
Cc:
Subject: [xcat-user] confignics -s -r
Date: Tue, Mar 19, 2019 7:25 PM

We have a problem with "confignics -s -r" not running reliably in a
postscript.

While we have some infiniband nodes, the majority use only one network
for install and as the single network for the nodes.

On node install, we wish to assign a static IP address on the install
nic, and deconfigure other network cards.

updatenode  confignics -s -r

Does this just fine.

However, it seems unreliable when run as a postscript. On a recent
reinstall of 30 nodes:

    10 ran it successfully
    17 failed, so nodes still had a dhcp address
    3 failed for other reasons (telling the bios which image to boot).

I've no idea what causes this - could it be a race condition somewhere?
If so, is there a timer I could increase to make it less likely to happen?

The workaround is to run
    updatenode  confignics -s -r
by hand afterwards.

We are running a relatively old version of xCAT - 2.12.4 - and do plan
to upgrade soon.

Chris

--
Dr Christopher J. Walker
ITS Research
Queen Mary University of London, E1 4NS
+44 20 7882 5969
 




Re: [xcat-user] noderes,nics,confignics

2019-02-20 Thread Yuan Y Bai
> Well actually, it's when, after my first scenario, I change the nics
> table to be
> "tars-113","192.168.130.170,eth2!192.168.128.115",
[bai] I think the nicips format is wrong here. Is the IP "192.168.130.170" for the install NIC? If so, it should not be configured in the nics table. The nicips format is "!,!". The install NIC IP should be configured in the hosts table. You can use the command `chdef  ip=xx.xx.xx.xx`; you will then find it saved in the `hosts` table.
> and then run makedns tars-113
[bai] After tars-113 is configured correctly in the `hosts` table, `/etc/hosts`, or the nics table, `makedns tars-113` will give correct results.
> Can you help me understand what's happening here ?
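The point about the `nicips` format can be illustrated with a small check. This is only a sketch: it assumes that each comma-separated entry should have the shape `NIC!IP`, as the `eth2!192.168.128.115` entry in the quoted value does. The function name `check_nicips` is invented for this example and is not an xCAT command.

```shell
#!/usr/bin/env bash
# Sketch only: flag nicips entries that lack the assumed NIC!IP shape.
check_nicips() {
    local value=$1 entry rc=0
    local -a entries
    # Split the attribute value on commas
    IFS=',' read -r -a entries <<< "$value"
    for entry in "${entries[@]}"; do
        case $entry in
            # At least one character on each side of the "!" separator
            ?*'!'?*) printf 'ok: %s\n' "$entry" ;;
            *)       printf 'bad: %s (expected NIC!IP)\n' "$entry"; rc=1 ;;
        esac
    done
    return "$rc"
}
```

Called as `check_nicips '192.168.130.170,eth2!192.168.128.115'` (single quotes keep an interactive shell from treating `!` specially), it reports the bare `192.168.130.170` as bad and the `eth2` entry as ok, which matches the diagnosis in the reply.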
 
We have some best practices for DNS, hostnames, and aliases here: https://xcat-docs.readthedocs.io/en/latest/QA/makehosts.html
 
The confignetwork related doc: https://xcat-docs.readthedocs.io/en/latest/guides/admin-guides/manage_clusters/ppc64le/diskful/customize_image/network/cfg_network_adapter.html?highlight=confignetwork
 
 
Best Regards
--
Yuan Bai (白媛)
 
 
- Original message -
From: Thomas HUMMEL
To: xcat-user@lists.sourceforge.net
Cc:
Subject: Re: [xcat-user] noderes,nics,confignics
Date: Wed, Feb 20, 2019 10:46 PM

On 2/20/19 2:51 PM, Thomas HUMMEL wrote:
> Note : Before those 2 scenarii I had, probably coming from a previous
> test I don't quite remember, a CNAME for tars-113-eth2 pointing to
> tars-113 indeed. I ran the 2 scenarii above after I cleaned the zone
> (i.e. remove all tars-113 related records, dans add the node again)

Well actually, it's when, after my first scenario, I change the nics
table to be

"tars-113","192.168.130.170,eth2!192.168.128.115",

and then run makedns tars-113

Can you help me understand what's happening here ?

Thanks
 




Re: [xcat-user] noderes,nics,confignics

2019-02-19 Thread Yuan Y Bai
Hi Thomas
 
To set the install NIC to a static IP, you can follow these steps:
1. chdef  ip=
 
2. Use `confignetwork -s`; for details, refer to: https://xcat-docs.readthedocs.io/en/latest/guides/admin-guides/manage_clusters/common/deployment/network/cfg_network_ethernet_nic.html#configure-adapters-with-static-ips
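The two steps above can be sketched as a small helper. This is an illustration under assumptions, not an official procedure: the function name, node name, and address are placeholders, and whether `confignetwork -s` belongs in `postscripts` or `postbootscripts` may depend on the xCAT version in use.

```shell
#!/usr/bin/env bash
# Sketch only: give the install NIC a static IP in two steps.
# set_static_install_nic is a hypothetical wrapper around xCAT's chdef.
set_static_install_nic() {
    local node=$1 ip=$2
    # Step 1: record the install NIC address on the node definition
    chdef "$node" ip="$ip" || return
    # Step 2: append "confignetwork -s" so the install NIC is configured statically
    chdef "$node" -p postscripts="confignetwork -s"
}
```

For example, `set_static_install_nic cn1 10.0.0.11` would run both commands for a hypothetical node cn1; the second command is skipped if the first one fails.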
 
 
BTW:
 
nics.nicips is mainly used for secondary NICs; it contains a comma-separated list of IP addresses per NIC. For usage, see the output of: `tabdump -d nics|grep nicips`
 
noderes.installnic: the network adapter on the node that will be used for OS deployment. It can be set to the network adapter name, to the MAC address, or to the keyword "mac", which means that the network interface specified by the MAC address in the mac table will be used. For usage, see the output of: `tabdump -d noderes|grep installnic`.
 
 
 
Best Regards
--
Yuan Bai (白媛)
 
 
- Original message -
From: Thomas HUMMEL
To: xcat-user@lists.sourceforge.net
Cc:
Subject: Re: [xcat-user] noderes,nics,confignics
Date: Tue, Feb 19, 2019 5:18 PM

On 2/19/19 2:34 AM, Bin XA Xu wrote:
> To set static, you can use `hardeths` or `confignetwork -s`

Ok. But does it have something to do with either noderes.installnic or
nics.nicips (I still cannot figure out how to use those attributes).

Thanks.

--
TH
 




[xcat-user] Vote for new xCAT logo

2019-01-23 Thread Yuan Y Bai
xCAT Users!
 
We are considering a refresh of the xCAT logo and would appreciate your valued feedback.  
 
Could you take a look at the proposals and participate in the discussion and voting at: https://github.com/xcat2/xcat2.github.io/issues/7
 
Best Regards
--
Yuan Bai (白媛)




[xcat-user] Vote for new xCAT logo

2019-01-21 Thread Yuan Y Bai
xCAT Users!
 
We are considering a refresh of the xCAT logo and would appreciate your valued feedback.  
 
Could you take a look at the proposals and participate in the discussion and voting at: https://github.com/xcat2/xcat2.github.io/issues/7
 
 
Best Regards
--
Yuan Bai (白媛)




Re: [xcat-user] xcat-dep repository for SLES15

2019-01-10 Thread Yuan Y Bai
Hello Winfried,
 
We do not have an xcat-dep repository for SLES15 yet. According to its official release notes, SLES15 has been renamed SLE15. Currently, we support using a SLES12.3 xCAT management node to provision SLE15 diskful compute nodes.
 
Could you share more information about how you use xCAT on SLES, and the name of your organization?
 
We weigh the priority of supporting different OSes based on customer requirements.
Thanks.
 
 
Best Regards
--
Yuan Bai (白媛)
 
 
- Original message -
From: abang
To: xcat-user@lists.sourceforge.net
Cc:
Subject: [xcat-user] xcat-dep repository for SLES15
Date: Thu, Jan 10, 2019 11:30 PM

Hello,

There is no repository for SLES15 in
http://xcat.org/files/xcat/repos/yum/xcat-dep/

Did I miss something? Or doesn't SLES15 need a xcat-dep repository? Or
is it just too early?

Many thanks!
Winfried
 




Re: [xcat-user] Staging a new management node but keeping it inactive?

2018-12-11 Thread Yuan Y Bai
Hi Kevin,
 
We have a script, xcatha.py, that can set up a standby xCAT management node.
It is used in one of the xCAT HA solutions. We can use it to set up two inactive xCAT management nodes and then use it to activate one of them. Your scenario is similar to our solution.
You can find the related code and doc here: https://github.com/xcat2/xcat-extensions/tree/master/HA
I hope it helps.
 
 
Best Regards
--
Yuan Bai (白媛)
 
 
- Original message -
From: Rich Sudlow
To: xCAT Users Mailing list , David Johnson
Cc:
Subject: Re: [xcat-user] Staging a new management node but keeping it inactive?
Date: Sat, Dec 8, 2018 5:21 AM

We do something slightly different by not using a DHCP pool but only allow
DHCP to answer with MACs that are known.

On 12/7/18 1:30 PM, David Johnson wrote:
> Yes, only one can have a dynamic range.  In my case neither of them do,
> since I manually paste the MAC addresses into the mac table.
>
> My issue with deleting first is when I deleted 16 mac addresses and then got
> sidetracked and went home, those nodes later lost their lease and ended up getting
> evicted from GPFS.  Not an issue if the nodes were otherwise idle, but they had
> been marked to drain, jobs were still running on them.
>
>   -- ddj
>
>> On Dec 7, 2018, at 1:25 PM, Kevin Keane wrote:
>>
>> Thank you, Dave. That is an interesting alternative approach; I might actually
>> consider that.
>>
>> So you are saying that the old and new DHCP servers can run in parallel? I
>> assume that they just can't both have dynamic ranges?
>>
>> I'm not sure I understand what the problem is with deleting a node first, and
>> then adding it on the new system. Even if the lease expires, wouldn't it just
>> reacquire the new one once you create it?
>>
>> ___
>> Kevin Keane | Systems Architect | University of San Diego ITS |
>> kke...@sandiego.edu
>> Maher Hall, 192 |5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859
>>
>> REMEMBER! No one from IT at USD will ever ask to confirm or supply your
>> password.
>> These messages are an attempt to steal your username and password. Please do
>> not reply to, click the links within, or open the attachments of these
>> messages. Delete them!
>>
>> On Thu, Dec 6, 2018 at 4:43 PM wrote:
>>
>>     We’ve kept parallel clusters on the same network for nearly a year now
>>     while transitioning to RH7 from CentOS 6.
>>     Initially copied the hosts and nodelist and MAC tables into the new xcat
>>     database.  Carefully controlled use of makedhcp so that nodes moving to
>>     the new cluster were first added to new dhcp server and then deleted from
>>     the old. (  Didn’t want a repeat of what happened when I left some deleted
>>     from the old but not added to the new cluster and they lost their lease.
>>     The postscript hardeths also helped.  ). Make new images and use nodeset
>>     to point to them. Reboot and test.
>>
>>     Drawback is having to make parallel changes on both management servers all
>>     the time, but we needed both clusters to access gpfs so it was a necessary
>>     evil.
>>
>>       -- ddj
>>     Dave Johnson
>>
>>     On Dec 6, 2018, at 4:23 PM, Kevin Keane wrote:
>>
>>>     I'm in the middle of upgrading our existing HPC (from RHEL 6 to RHEL 7).
>>>     I'm doing most of my testing on a separate "sandbox" test bed, but now
>>>     I'm close to going live. I'm trying to figure out how to do this with
>>>     minimal disruption.
>>>
>>>     My question: how can I install the new management node and keep it
>>>     *almost* completely operational, without interfering with the existing
>>>     cluster? Is it enough to disable DHCP, or do I need to do anything else?
>>>
>>>     How do I prevent DHCP from accidentally getting enabled before I'm ready?
>>>     Is makedhcp responsible for that?
>>>
>>>     Step-by-step, here is what I plan to do:
>>>
>>>     - Set up the new management node, but keep it inactive.
>>>     - Test
>>>     - Bring down all compute nodes.
>>>     - Via IPMI, reset all the compute nodes' BMC controllers to DHCP
>>>     - Other migration steps (home directories, modifications on the storage
>>>     node, etc.)
>>>     - De-activate the old management node (but keep it running)
>>>     - Activate the new management node.
>>>     - Discover and boot compute nodes
>>>
>>>     Is there anything glaringly obvious that I overlooked?
>>>
>>>     Thanks!
>>>
>>>     ___
>>>     Kevin Keane | Systems Architect | University of San Diego ITS |
>>>

Re: [xcat-user] choosing kernel for genimage (xCAT 2.13.4)

2018-12-03 Thread Yuan Y Bai
Hi David,
 
Thanks for pointing out that the doc is not clear about "kerneldir", which left you confused about "kerneldir" and "pkgdir".
 
Here you should use "pkgdir" with `-p` to append the kernel repo directory to pkgdir. Here is my example; could you try it like this? If you have problems, feel free to contact me.
 
]# ls /install/kernels/3.10.0-860.el7.x86_64/
dracut-033-535.el7.x86_64.rpm
dracut-network-033-535.el7.x86_64.rpm
kernel-3.10.0-860.el7.x86_64.rpm
linux-firmware-20180220-62.git6d51311.el7.noarch.rpm
repodata
 
]# chdef -t osimage rhels7.3-x86_64-netboot-compute -p pkgdir=/install/kernels/3.10.0-860.el7.x86_64
 
]# lsdef -t osimage rhels7.3-x86_64-netboot-compute -i pkgdir
Object name: rhels7.3-x86_64-netboot-compute
    pkgdir=/install/rhels7.3/x86_64,/install/kernels/3.10.0-860.el7.x86_64
 
]# genimage rhels7.3-x86_64-netboot-compute -k 3.10.0-860.el7.x86_64
 
]# packimage rhels7.3-x86_64-netboot-compute
 
 
 
 
Best Regards
--
Yuan Bai (白媛)
 
 
- Original message -
From: David Johnson
To: xCAT Users Mailing list
Cc:
Subject: Re: [xcat-user] choosing kernel for genimage (xCAT 2.13.4)
Date: Tue, Dec 4, 2018 12:16 AM

The document seems to have a problem with the examples:
the description talks about “kerneldir” attribute, but the example uses “pkgdir”. 
When I use pkgdir it clobbers the existing value (/install/rhels7.3/x86_64) 
with new value (/install/kernels/3.10.0-862.14.14.el7.x86_64)
 
I think it should say chdef -t osimage <> -p kerneldir=<> 
I got further using that format. However, I’m still having trouble with the packimage
command: “Error: Cannot find rhels7.3-x86_64-neboot-comp_kup from the linuximage table.”
But it’s there:
[root@mgt5 ~]# tabdump linuximage | grep rhels7.3-x86_64-netboot-comp_kup
"rhels7.3-x86_64-netboot-comp_kup","/install/kernels/3.10.0-862.14.4.el7.x86_64,/install/rhels7.3/x86_64",,"/install/kernels/3.10.0-862.14.14.el7.x86_64",,,,,,,,
 
Thanks for the pointers,
 
 — ddj
 
On Dec 3, 2018, at 4:08 AM, Yuan Y Bai <by...@cn.ibm.com> wrote: 

 
Hi David,
 
Do you want to create an osimage that uses the RHEL 7.5 kernel RPMs on top of the RHEL 7.3 .iso?
 
I think you can refer to the examples in this doc. If you have problems, feel free to contact us.
 
https://xcat-docs.readthedocs.io/en/latest/guides/admin-guides/manage_clusters/ppc64le/diskless/customize_image/install_new_kernel.html?highlight=kernel
 
 
Best Regards
--
Yuan Bai (白媛)
 
 
- Original message -
From: David Johnson <david_john...@brown.edu>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: [xcat-user] choosing kernel for genimage (xCAT 2.13.4)
Date: Sat, Dec 1, 2018 3:47 AM

I am trying to build new netboot images based on the RHEL 7.3 .iso,
while updating only the kernel (to the one released with RHEL 7.5).
I would like to retain the ability to rebuild the original images.

It seems to me that “otherpkgs” is probably too late, as the kernel
and initrd are populated  first.  If I drop the updated RPMS in /install/rhels7.3/x86_64/Packages
and rebuild the repo, it should choose the newer ones. Then I’d
need the -g and/or -k options to genimage to use the originals.

Does this make sense or is there an easier way?

Thanks,
 -- ddj

Dave Johnson
 




Re: [xcat-user] choosing kernel for genimage (xCAT 2.13.4)

2018-12-03 Thread Yuan Y Bai
 
Hi David,
 
Do you want to create an osimage that uses the RHEL 7.5 kernel RPMs on top of the RHEL 7.3 .iso?
 
I think you can refer to the examples in this doc. If you have problems, feel free to contact us.
 
https://xcat-docs.readthedocs.io/en/latest/guides/admin-guides/manage_clusters/ppc64le/diskless/customize_image/install_new_kernel.html?highlight=kernel
 
 
Best Regards
--
Yuan Bai (白媛)
 
 
- Original message -
From: David Johnson
To: xCAT Users Mailing list
Cc:
Subject: [xcat-user] choosing kernel for genimage (xCAT 2.13.4)
Date: Sat, Dec 1, 2018 3:47 AM

I am trying to build new netboot images based on the RHEL 7.3 .iso,
while updating only the kernel (to the one released with RHEL 7.5).
I would like to retain the ability to rebuild the original images.

It seems to me that “otherpkgs” is probably too late, as the kernel
and initrd are populated  first.  If I drop the updated RPMS in /install/rhels7.3/x86_64/Packages
and rebuild the repo, it should choose the newer ones. Then I’d
need the -g and/or -k options to genimage to use the originals.

Does this make sense or is there an easier way?

Thanks,
 -- ddj

Dave Johnson
 




Re: [xcat-user] [External] How to restrict xCAT's NFS shares?

2018-11-29 Thread Yuan Y Bai
Hi Kevin,
 
The NFS-based statelite / directory is mounted read-only, so "/install *(ro,no_root_squash,sync,no_subtree_check)" will not break statelite.
We are aware of the NFS share security issue; this part should be enhanced.
For your scenario, you can also use /etc/hosts.deny and /etc/hosts.allow to control which IPs are allowed.
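A different way to limit who can mount the share, separate from the hosts.allow and hosts.deny approach mentioned above, is to narrow the client field of the export itself. The filter below is a minimal sketch under that assumption: it rewrites the `*` wildcard of one exported path to a given client specification. The function name and the subnet in the usage note are hypothetical, and, as noted elsewhere in this thread, xCAT may add its entries again on installation, upgrade, or `xcatconfig -i/-f`.

```shell
#!/usr/bin/env bash
# Sketch only: narrow a wildcard NFS export to a specific client spec.
# Reads exports(5)-style lines on stdin and writes the result to stdout.
restrict_export() {
    # $1 = exported path (plain path such as /install, no regex characters)
    # $2 = client spec (subnet or host) that replaces the "*" wildcard
    local path=$1 clients=$2
    sed -E "s|^(${path}[[:space:]]+)\*\(|\1${clients}(|"
}
```

For instance, `restrict_export /install 10.0.0.0/24 < /etc/exports` prints a version of the file in which only the `/install` line is narrowed to that (made-up) subnet. After reviewing and installing the result, `exportfs -ra` re-exports the shares.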
 
 
Best Regards
--
Yuan Bai (白媛)
 
 
- Original message -
From: Kevin Keane
To: xCAT Users Mailing list
Cc:
Subject: Re: [xcat-user] [External] How to restrict xCAT's NFS shares?
Date: Fri, Nov 30, 2018 1:58 AM
Yes, I was also concerned about security.
 
These nfs directories might only be *used* in those two scenarios, but their existence also affects stateless and stateful nodes. An attacker can simply mount those two directories to any machine on campus, install xCAT, and then manipulate the images to her heart's content, such as inject bitcoin miners. She can then also run genimage/packimage and make the images available to PXE boot.
 
It might help to change the shares from rw to ro, but that might break statelite nodes?
___
Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu
Maher Hall, 192 |5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859
  

On Thu, Nov 29, 2018 at 1:38 AM Song BJ Yang  wrote:
The 2 shared NFS entries are added on xCAT installation, on upgrade, or by `xcatconfig -i/-f`.
 
These 2 NFS shared directories are only used in 2 scenarios:
1) NFS based statelite
2) hierarchy cluster when site.sharedtftp and site.sharedinstall
Is any scenario missing?
 
We have received some complaints that this is a security issue. We are considering not exporting these 2 directories by default, and instead providing a command or steps to export them only when needed for one of the scenarios above. Any comments or suggestions?
 
thanks
--
YANG Song (杨嵩)
IBM China System Technology Laboratory
Tel: 86-10-82452903
Email: yang...@cn.ibm.com
Address: Building 28, ZhongGuanCun Software Park,No.8, Dong Bei Wang West Road, Haidian District Beijing 100193, PRC
北京市海淀区东北旺西路8号中关村软件园28号楼
邮编: 100193
 
 
- Original message -
From: Kevin Keane
To: xCAT Users Mailing list
Cc:
Subject: Re: [xcat-user] [External] How to restrict xCAT's NFS shares?
Date: Thu, Nov 29, 2018 7:17 AM
Yes, you appear to be correct. I just, for testing, uninstalled all of xCAT. Then I manually removed the entries, and re-installed the xCAT RPMs. Lo and behold - it did in fact re-create the entries (but did not remove them when uninstalling xCAT).
 
Thanks for the help!
___
Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu
Maher Hall, 192 |5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859
  

On Wed, Nov 28, 2018 at 2:51 PM Christian Caruthers  wrote:
I believe that is created when xCAT is installed. Not sure which RPM does it, though. Possibly the main xCAT or xCAT-server package. I don’t see the file in any of the packages, so I’m guessing it’s created by a script.
 
Regards,
Christian Caruthers
Lenovo Professional Services
Mobile: 757-289-9872
 
From: Kevin Keane
Sent: Wednesday, November 28, 2018 17:26
To: xCAT Users Mailing list
Subject: Re: [xcat-user] [External] How to restrict xCAT's NFS shares?
 
My question is actually, how does the /etc/exports get generated, and how do I get xCAT to generate the exports file without the world-writable permissions?
 
Thanks,
___
Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu
Maher Hall, 192 |5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859
 
 
 
On 

Re: [xcat-user] nvidia driver on stateless cluster

2018-11-18 Thread Yuan Y Bai
Hi Huette,
 
If you mean a node equipped with NVIDIA GPU(s),
 
could you have a look at this doc  https://xcat-docs.readthedocs.io/en/latest/advanced/gpu/index.html 
 
Hope these steps can help you.
 
 
Best Regards
--
Yuan Bai (白媛)
 
 
- Original message -
From: "Heckes Frank (CI/OSB4)"
To: xCAT Users Mailing list
Cc:
Subject: Re: [xcat-user] nvidia driver on stateless cluster
Date: Fri, Nov 16, 2018 9:28 PM
Hello,
 
I suppose you mean a node equipped with NVIDIA GPU(s).
 
There’s one option I currently use to install the driver in the image of a rhel/centos node.
 
On a node with the kernel-devel RPM of the target node installed (might be the MN or a build host of sorts), run the downloaded driver:
 
./NVIDIA-Linux-x86_64-390.87.run --add-this-kernel
 
The node doesn’t have to be the target node. This will create a self-extracting file customized for the kernel running on your target node: ./NVIDIA-Linux-x86_64-390.87-custom.run. In case the kernel isn’t running on the ‘build’ node, you can specify the kernel version and source directory via command-line options (see the --advanced-options output). You can now start this version from a postscript. The file might be in a network FS share or inside the image and deleted afterwards by running:
 
NVIDIA-Linux-x86_64-390.87-custom.run -x; ./nvidia-install -s
You need to blacklist the nouveau driver in the diskless boot beforehand.
There’s another possibility: using dkms with the nvidia installer. You’d need to chroot (and bind /dev/, /proc/, /sys) manually and run the installer with the --dkms option.
 
Best regards
Frank Heckes
CI Operations - Server Services Sun Solaris, Linux (CI/OSB4)
frank.hec...@de.bosch.com

From: Huette, Antoine
Sent: Friday, 16 November 2018 12:36
To: xCAT Users Mailing list
Subject: [xcat-user] nvidia driver on stateless cluster
 
Hello,
 
On a stateless CentOS 7.5 cluster with Quadro GPUs, I need to install the Nvidia driver. I’m using the runfile downloaded from the Nvidia website.
What is the suggested procedure ? Is it better to install the driver in the osimage, or should I make the installer run when the nodes start ?
 
The problem I see with the first option is the fact that the driver checks if a GPU is present in the system, so I’m not sure if this method can work.
 
The problem with the second method is that, after trying it, it’s very difficult to have a working X server with a Gnome desktop. The driver installer needs the node to be in runlevel 3 (multi-user.target) but once it is installed, I need to switch to runlevel 5 (graphical.target) which almost never works. So far the only way I’ve found is by installing the driver manually on a freshly booted node, run nvidia-xconfig to fill the Xorg.conf file, and then restarting the gnome services.
 
Any help on this subject would be much appreciated ! 
 
 
Best regards,
 
Antoine Huette HPC Engineer
antoine.hue...@bechtle.com | 03.67.07.97.37/07.72.31.82.12 |  bechtle.fr |
 
             
 
 



Re: [xcat-user] unexpected hostname

2018-11-01 Thread Yuan Y Bai
Hi Sandra,
 
I defined the mgt0 node on my MN, deleted `mgt0-pub:172.16.13.99... ...` from `otherinterfaces` in the hosts table, and added `nicaliases.eth0=mgt0-mgt` to the node definition. After running `makehosts mgt0`, `mgt0-pub` and `mgt0-data` are generated in the /etc/hosts file; the domain `cluster.com` comes from my `site` table. I am curious why you need to define mgt0-pub and mgt0-data in otherinterfaces in the `hosts` table.
 
BTW: is there a DHCP server on eth1 before the node is provisioned?
 
My example here:
 
[root@bybc0607 ~]# lsdef mgt0
Object name: mgt0
    arch=x86_64
    authdomain=mcri.edu.au
    chain=standby
    conserver=xcat
    currchain=boot
    currstate=boot
    domaintype=activedirectory
    groups=mgt,vm
    hostnames=mgt0
    ip=10.40.113.99
    mac=
    mgt=esx
    netboot=pxe
    nfsdir=/install
    nfsserver=xcat
    nicaliases.eth0=mgt0-mgt
    nichostnamesuffixes.eth2=-data
    nichostnamesuffixes.eth1=-pub
    nicips.eth2=10.50.113.99
    nicips.eth1=172.16.13.99
    nicips.eth0=10.40.113.99
    nicnetworks.eth2=Data
    nicnetworks.eth1=Public
    nicnetworks.eth0=Management
    nictypes.eth2=Ethernet
    nictypes.eth1=Ethernet
    nictypes.eth0=Ethernet
    os=centos7.5
    ou=
    postbootscripts=otherpkgs,
    postscripts=syslog,remoteshell,syncfiles,setupntp,confignics,
    profile="">
    provmethod=centos7-mgt
    routenames=14NetRoute,MySQLUCSCRoute
    servicenode=xcat
    status=failed
    statustime=10-29-2018 14:14:01
    updatestatus=failed
    updatestatustime=10-29-2018 13:53:40
[root@bybc0607 ~]# makehosts mgt0
 
[root@bybc0607 ~]# grep mgt0 /etc/hosts
10.40.113.99 mgt0 mgt0.cluster.com mgt0-mgt
10.50.113.99 mgt0-data mgt0-data.cluster.com
172.16.13.99 mgt0-pub mgt0-pub.cluster.com
[root@bybc0607 ~]# lsxcatd -v
Version 2.14.4 (git commit 51bd7fea2746d1812aa0eba3d655d63e16b718e2, built Wed Oct 17 06:15:55 EDT 2018)
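To make the mapping in the example above easier to follow, here is a small sketch of how a secondary NIC's entry appears to be derived from the node name, the `nichostnamesuffixes` value, the `nicips` value, and the domain. It only mimics the `/etc/hosts` lines shown above; it is not how `makehosts` is implemented, it ignores `nicaliases`, and the function name is invented for this example.

```shell
#!/usr/bin/env bash
# Sketch only: build a hosts line for a NIC that has a hostname suffix.
hosts_line() {
    # $1 = node name, $2 = NIC hostname suffix, $3 = NIC IP, $4 = domain
    local node=$1 suffix=$2 ip=$3 domain=$4
    # Format: IP, short name with suffix, fully qualified name
    printf '%s %s%s %s%s.%s\n' "$ip" "$node" "$suffix" "$node" "$suffix" "$domain"
}
```

With the values from the node definition, `hosts_line mgt0 -pub 172.16.13.99 cluster.com` prints the same `mgt0-pub` line as the `grep` output, which is why the extra `otherinterfaces` entry should not be needed.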
 
 
Best Regards
--
Yuan Bai (白媛)
 
 
- Original message -
From: Sandra Maksimovic
To: 'xCAT Users Mailing list'
Cc:
Subject: Re: [xcat-user] unexpected hostname
Date: Thu, Nov 1, 2018 1:29 PM
Btw I managed to work around this issue by setting eth1 to use DHCP and eth0 to send DHCP_HOSTNAME using a postscript.
 
Cheers,
Sandra
 
From: Sandra Maksimovic
Sent: Tuesday, 30 October 2018 6:30 PM
To: 'xCAT Users Mailing list'
Subject: RE: [xcat-user] unexpected hostname
 
Hi Yuan,
 
Just to let you know, it seems that when I remove otherinterfaces=”mgt0-pub:172.16.13.99,mgt0-data:10.50.113.99” from the mgt0 definition, the /etc/hosts file does not regenerate with mgt0-pub or mgt0-data entries, only mgt0 and its fqdn is listed.
 
The xcat servicenode should be managing nodes over the 10.40.0.0/24 network, however, I don’t think this has been setup properly because the servicenode table is blank. A lot of this new cluster’s configuration has been carried over from our current prod iteration so I’m not sure whether some of these definitions are still relevant.
 
The /var/lib/dhclient directory is missing the dhclient.leases file but contains the following:
 
# cat chrony.servers.eth0
10.40.115.100 iburst
 
# cat ntp.conf.predhclient.eth0

 
The IP 10.40.115.100 is the management NIC on my xCAT server, which seems to indicate the correct provisioning network… 
 
I’ve just noticed that when I run ‘dhclient’ manually on the ‘mgt0-pub’ node the leases file appears along with some others… 
 
dhcp-server-identifier on eth0 (which is the mgt/provisioning NIC on the 10.40.0.0 net) is 10.40.115.100 
host-name is “mgt0”
 
I’m now wondering what would have stopped this information from being generated during deployment? And would this have managed to impact the hostname?
 
Many thanks,
Sandra
 
From: Yuan Y Bai <by...@cn.ibm.com>
Sent: Monday, 29 October 2018 4:40 PM
To: xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.sourceforge.net
Subject: Re: [xcat-user] unexpected hostname
 
Hi Sandra
 
From your node definition, `nichostnamesuffixes.eth1=-pub nicips.eth1=172.16.13.99` will generate a `172.16.13.99 mgt0-pub ..` entry in the /etc/hosts file. There is no need for `mgt0-pub:172.16.13.99` in otherinterfaces.
 
You also use a service node, `servicenode=xcat`. Which network does the service node use?
 
Could you log in to `mgt0-pub` and check the lease file under the directory `/var/lib/dhclient` to see what `dhcp-server-identifier` and `host-name` are?
It seems the `mgt0` node gets the hostname `mgt0-pub` from the 172.xx.xx.xx DHCP server. The provisioning network should be the 10.xx.xx.xx network.
 
 
Best Regards
--
Yuan Bai (白媛)

Re: [xcat-user] unexpected hostname

2018-10-28 Thread Yuan Y Bai
Hi Sandra
 
From your node definition, `nichostnamesuffixes.eth1=-pub nicips.eth1=172.16.13.99` will generate a `172.16.13.99 mgt0-pub ..` entry in the /etc/hosts file. There is no need to also set `mgt0-pub:172.16.13.99` in otherinterfaces.
 
You also use a service node, `servicenode=xcat`. Which network does the service node use?
 
Could you log in to `mgt0-pub` and check the lease file under the directory `/var/lib/dhclient` to see what `dhcp-server-identifier` and `host-name` are?
It seems the `mgt0` node gets the hostname `mgt0-pub` from the 172.xx.xx.xx DHCP server. The provisioning network should be the 10.xx.xx.xx network.
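 
To pull those two values out of a lease file quickly, something like the sketch below may help. It assumes the usual dhclient lease syntax (`option <name> <value>;`). The helper name `lease_option` and the lease file name in the usage comment are only illustrations; use whatever file `ls /var/lib/dhclient` actually shows on the node.

```shell
#!/bin/sh
# lease_option <option-name> <lease-file>
# Print every value recorded for one DHCP option in a dhclient lease file.
# Lease lines are assumed to look like:   option host-name "mgt0";
lease_option() {
    awk -v opt="$1" '$1 == "option" && $2 == opt {
        val = $3
        gsub(/[";]/, "", val)   # drop quotes and the trailing semicolon
        print val
    }' "$2"
}

# Usage (file name is an assumption, adjust to what exists on the node):
#   lease_option dhcp-server-identifier /var/lib/dhclient/dhclient.leases
#   lease_option host-name /var/lib/dhclient/dhclient.leases
```

If the file holds several leases, one value is printed per lease, so a host-name handed out by an unexpected DHCP server should stand out.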
 
 
 
 
- Original message -
From: Sandra Maksimovic
To: "'xcat-user@lists.sourceforge.net'"
Cc:
Subject: Re: [xcat-user] unexpected hostname
Date: Mon, Oct 29, 2018 12:15 PM
Hi Bin,
 
Thanks for your response.
 
mgt0 and mgt0-pub do not point to the same IP address nor are they in the same subnet. Please see the output below:
 
Object name: mgt0
    arch=x86_64
    authdomain=mcri.edu.au
    chain=standby
    conserver=xcat
    currchain=boot
    currstate=boot
    domaintype=activedirectory
    groups=mgt,vm
    hostnames=mgt0
    ip=10.40.113.99
    mac=
    mgt=esx
    netboot=pxe
    nfsdir=/install
    nfsserver=xcat
    nichostnamesuffixes.eth0=-mgmt
    nichostnamesuffixes.eth1=-pub
    nichostnamesuffixes.eth2=-data
    nicips.eth0=10.40.113.99
    nicips.eth1=172.16.13.99
    nicips.eth2=10.50.113.99
    nicnetworks.eth0=Management
    nicnetworks.eth1=Public
    nicnetworks.eth2=Data
    nictypes.eth0=Ethernet
    nictypes.eth1=Ethernet
    nictypes.eth2=Ethernet
    os=centos7.5
    otherinterfaces=mgt0-pub:172.16.13.99,mgt0-data:10.50.113.99
    ou=
    postbootscripts=otherpkgs,
    postscripts=syslog,remoteshell,syncfiles,setupntp,confignics,
    profile="">
    provmethod=centos7-mgt
    routenames=14NetRoute,MySQLUCSCRoute
    servicenode=xcat
    status=failed
    statustime=10-29-2018 14:14:01
    updatestatus=failed
    updatestatustime=10-29-2018 13:53:40
 
FYI some of our postscripts are failing during deployment which is why the updatestatus=failed.
 
Also, thanks Brian for your suggestion, I shall look into this further regarding the NIC setup. I did a quick test and this doesn’t appear to be what I’m after at this stage since the deployed node’s hostname is unaffected when specifying the nicaliases.
 
Thanks,
Sandra
 
 
‐‐‐ Original Message ‐‐‐
On Friday, October 26, 2018 5:17 PM, Bin XA Xu  wrote:
 
Hi Sandra,
 
    Is the mgt0 and mgt0-pub pointing to the same IP address, or in the same subnet?  And what's your `mgt01` definition, you can use `lsdef mgt01` to get the information and hide the sensitive attributes.
 
    And Yuan, do you have more suggestions?
 
Bin Xu
HPC Software DevelopmentSoftware Defined Infrastructure, IBM Systems
Phone: 86-010-82454067
E-mail: bx...@cn.ibm.com
 
 
- Original message -
From: Sandra Maksimovic via xCAT-user 
To: "xcat-user@lists.sourceforge.net" 
Cc: Sandra Maksimovic 
Subject: [xcat-user] unexpected hostname
Date: Thu, Oct 25, 2018 11:35 PM
  
Hi all,
 
xCAT/HPC/list newbie here!
 
I have recently configured an xCAT node and am attempting to provision a separate management node, but for some reason xCAT is sort of not applying the expected hostname.
 
I'd like the resulting hostname on the node to just be "mgt0", but instead it's tacking on the public NIC suffix as well as the FQDN, i.e. mgt0-pub.meerkat.mcri.edu.au
 
The cluster is entirely CentOS7 based and will be eventually utilising MOAB and PBS/Torque for scheduling and resource management. The version of xCAT for this particular build is v2.14.4.
 
I've trawled through the debug enabled build logs and stepped through post.rh.common and from what I can tell the node should just be named "mgt0" (sans all suffixes).
 
Also, the DNS on the xCAT node contains entries for "mgt0", "mgt0-data", "mgt0-pub", but (if this is indeed the issue) I'm not sure why xCAT would have selected "mgt0-pub" to hand out when the node is being provisioned via its management IP which is actually associated with "mgt0" (as opposed to its public one which is associated with "mgt0-pub").
 
Any ideas on other avenues that might be worth investigating?
 
Also, please feel free recommend some useful resources for learning xCAT and/or HPC in general! I'm already heavily utilising the official xCAT docs and the Sourceforge Wiki/mailing list search...
 
Cheers,
Sandra
 

Re: [xcat-user] [External] Diskless install leads to emergency mode

2018-10-24 Thread Yuan Y Bai
Hi Huette
 
When you want to add packages from the OS distribution, append the package short names to the pkglist. You can find the pkglist in the osimage definition and then edit it to append packages, as in the following commands:
 
~]# lsdef -t osimage rhels7.4-x86_64-install-compute -i pkglist
Object name: rhels7.4-x86_64-install-compute
    pkglist=/opt/xcat/share/xcat/install/rh/compute.rhels7.pkglist
 
~]# cat /opt/xcat/share/xcat/install/rh/compute.rhels7.pkglist
#Please make sure there is a space between @ and group name
wget
ntp
nfs-utils
net-snmp
rsync
yp-tools
openssh-server
util-linux
net-tools
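 
As a rough sketch of the "append packages" step, the helper below adds a package name to a pkglist only if it is not already listed. The function name `add_pkg` and the custom path in the usage comment are hypothetical; copying the shipped pkglist to your own location before editing is an assumption about how you may want to organize things, not something stated above.

```shell
#!/bin/sh
# add_pkg <pkglist-file> <package>
# Append a package short name to a pkglist unless the exact line already exists.
add_pkg() {
    # -x: match the whole line, -F: fixed string, -q: no output
    if ! grep -qxF -- "$2" "$1"; then
        printf '%s\n' "$2" >> "$1"
    fi
}

# Usage (paths below are illustrative only):
#   cp /opt/xcat/share/xcat/install/rh/compute.rhels7.pkglist /install/custom/compute.rhels7.pkglist
#   add_pkg /install/custom/compute.rhels7.pkglist yum
#   chdef -t osimage -o rhels7.4-x86_64-install-compute pkglist=/install/custom/compute.rhels7.pkglist
```

Running it twice with the same package leaves the file unchanged, so it is safe to call from a setup script.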
 
 
 
 
 
- Original message -
From: "Huette, Antoine"
To: xCAT Users Mailing list
Cc:
Subject: Re: [xcat-user] [External] Diskless install leads to emergency mode
Date: Thu, Oct 25, 2018 3:42 AM
 
Yes, I already know how to install a package with the installroot option :)
 
What I found weird is that some packages, including yum, weren't installed even though they are specified in the pkglist file. 
 
Whatever, I just added some packages like a desktop environment, gcc and python in the rootimg, and after a reboot yum was also working. I guess it was installed with a package I added.
 
Best Regards
 
 
 Original message 
From: Jarrod Johnson 
Date: 24/10/2018 19:22 (GMT+01:00)
To: xCAT Users Mailing list 
Subject: Re: [xcat-user] [External] Diskless install leads to emergency mode
 
The general principle is to leave things like yum out during packimage stage.
 
However, prior to packimage, you can, for example:
# chroot /install/netboot/rhels7.5/x86_64/compute/rootimg/
[root@mn10 /]#
 
To manage the image in place, and using yum externally:
yum --installroot=/install/netboot/rhels7.5/x86_64/compute/rootimg/
 
In either case, /install/netboot/rhels7.5/x86_64/compute/rootimg/etc/yum.repos.d is what yum will complain about…
 
 
From: Huette, Antoine
Sent: Wednesday, October 24, 2018 1:07 PM
To: xCAT Users Mailing list
Subject: Re: [xcat-user] [External] Diskless install leads to emergency mode
 
Finally I've found out that the nfs exports weren't done correctly !
After fixing it the node booted correctly :)
 
However...some basic packages like yum are missing...
I'm quite sure I don't need to put such basic packages in the pkglist file, right ?
 
Best Regards
 


___
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user


Re: [xcat-user] Problem deploying compute node, vmlinuz not being downloaded

2018-09-06 Thread Yuan Y Bai
Hi,
 
Could you check whether the firewall and SELinux are enabled on your xCAT management node?
 
xCAT uses HTTP heavily during booting, and if the firewall or SELinux gets in the way, the HTTP server cannot serve files from /tftpboot.
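 
A quick way to check both on the management node might look like the sketch below. It assumes a CentOS/RHEL 7 style system where `getenforce`, `systemctl` and `curl` are the relevant tools, and it prints "unknown" when a tool is missing. The function names are made up for this example.

```shell
#!/bin/sh
# Report the SELinux mode (Enforcing / Permissive / Disabled), or "unknown".
selinux_state() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce
    else
        echo unknown
    fi
}

# Report whether the firewalld service is active, or "unknown".
firewalld_state() {
    state=""
    if command -v systemctl >/dev/null 2>&1; then
        state=$(systemctl is-active firewalld 2>/dev/null)
    fi
    echo "${state:-unknown}"
}

# tftpboot_http_status <mn-ip> <path-below-/tftpboot>
# Print the HTTP status code the management node returns for a boot file.
tftpboot_http_status() {
    curl -s -o /dev/null -w '%{http_code}\n' "http://$1/tftpboot/$2"
}

echo "SELinux:   $(selinux_state)"
echo "firewalld: $(firewalld_state)"
```

For example, `tftpboot_http_status <mn-ip> xcat/genesis.kernel.x86_64`, run from another machine on the provisioning network, should print 200 if the file is reachable over HTTP.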
 
 
 
- Original message -
From: wodel youchi
To: xcat-user@lists.sourceforge.net
Cc:
Subject: Re: [xcat-user] Problem deploying compute node, vmlinuz not being downloaded
Date: Fri, Sep 7, 2018 7:39 AM
Hi again,
 
Here are some details about my lab test.
 
My physical machine runs Fedora 28. On it I have created two VMs, a manager and a compute node, both using CentOS 7.5.
 
The problem is as described previously: the vmlinuz does not get downloaded using either xnba or pxe.
 
I made another test: I used nested KVM to create a CentOS 7.5 hypervisor, then on top of it I created one compute VM.
 
This time, when using pxe, the vmlinuz download reaches 66% and then freezes; when using xnba, the VM freezes completely and I get this error in the libvirt log:
KVM internal error. Suberror: 1
emulation failure
EAX=8d9604fe EBX=0001 ECX=bdeae265 EDX=2500b04c
ESI=d889c6d2 EDI=000104ca EBP=620e2060 ESP=0001bfb0
EIP=41dcc3ea EFL=00010082 [--S] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 7fef5000  00c09300 DPL=0 DS   [-WA]
CS =0008 7fef5000  00c09f00 DPL=0 CS32 [CRA]
SS =0010 7fef5000  00c09300 DPL=0 DS   [-WA]
DS =0010 7fef5000  00c09300 DPL=0 DS   [-WA]
FS =0010 7fef5000  00c09300 DPL=0 DS   [-WA]
GS =0010 7fef5000  00c09300 DPL=0 DS   [-WA]
LDT=   8200 DPL=0 LDT
TR =   8b00 DPL=0 TSS32-busy
GDT= 0009cf70 0047
IDT= 7ff107f0 07ff
CR0=0011 CR2= CR3= CR4=
DR0= DR1= DR2= DR3=
DR6=0ff0 DR7=0400
EFER=
Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 
 
 
Regards. 

On Thu, 6 Sep 2018 at 19:03, wodel youchi wrote:
Hi,
 
Thanks for the help.
 
For now I didn't configure any hardware management for the compute nodes.
 
Initially the netboot option was xnba and, as explained, I start the node VM with PXE boot. The VM gets its IP, then downloads the /tftpboot/xcat/xnba/nodes/cnode001 file, but when it tries to download the vmlinuz nothing happens. This is what I get from the httpd access_log:
10.10.2.61 - - [04/Sep/2018:13:17:28 +0100] "GET /tftpboot/xcat/xnba/nodes/cnode001 HTTP/1.1" 200 447 "-" "iPXE/1.0.3-131028 (d603e)"
10.10.2.61 - - [04/Sep/2018:13:17:28 +0100] "GET /tftpboot/xcat/osimage/centos7.5-x86_64-install-compute/vmlinuz HTTP/1.1" 200 6224704 "-" "iPXE/1.0.3-131028 (d603e)"
 
I switched to pxe and got the same problem:
10.10.2.31 - - [05/Sep/2018:23:06:44 +0100] "GET /tftpboot/xcat/xnba/nets/10.10.2.0_24 HTTP/1.1" 200 241 "-" "iPXE/1.0.3-131028 (d603e)"
10.10.2.31 - - [05/Sep/2018:23:06:44 +0100] "GET /tftpboot/xcat/genesis.kernel.x86_64 HTTP/1.1" 200 5877760 "-" "iPXE/1.0.3-131028 (d603e)"
 
The provisioning never happened, the process stopped here, the vmlinuz never gets downloaded.
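 
For anyone comparing runs, the boot-file requests can be condensed from the access log with a small filter like the one below. This is only a sketch: it assumes the combined log format shown above, and the default log path and the function name `boot_requests` are assumptions.

```shell
#!/bin/sh
# boot_requests [access-log]
# Print client IP, requested path, HTTP status and bytes sent for every
# request under /tftpboot/ (combined log format: the path is field 7,
# the status field 9, the byte count field 10).
boot_requests() {
    awk '$7 ~ /^\/tftpboot\// { print $1, $7, $9, $10 }' "${1:-/var/log/httpd/access_log}"
}

# Usage:
#   boot_requests /var/log/httpd/access_log
```

A status of 200 with the full byte count only says that httpd logged the transfer as served, so the stall may need to be looked for on the client or network side.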
 
What I also don't understand is why there is a difference in the booted image between xnba and pxe.
When using xnba, the kernel from the osimage centos7.5-x86_64-install-compute was used, and when using pxe the genesis kernel was used.
 
[root@manager01 ~]# lsdef -t node cnode001
Object name: cnode001
    arch=x86_64
    currchain=boot
    currstate=install centos7.5-x86_64-compute
    groups=mynodes,compute,all
    ip=10.10.2.61
    mac=52:54:00:19:13:aa
    netboot=pxe
    os=centos7.5
    postbootscripts=otherpkgs
    postscripts=syslog,remoteshell,syncfiles
    profile="">
    provmethod=centos7.5-x86_64-install-compute
 
 
 
 
Regards. 

On Wed, 5 Sep 2018 at 03:43, Song BJ Yang wrote:
> I made abstraction of BMC configuration and I didn't use xcat with KVM, the idea is to simulate physical deployment.
I do not quite understand this. What is the "mgt" attribute of the node?
 
> The problem : the PXE boot works fine until the download of the vmlinuz image and it hangs, and nothing after.
 
What is the "netboot" attribute of the node? If "pxe" does not work, please try "xnba".
 
Can you see the console of the node during provisioning?
 
 
--
YANG Song (杨嵩)
IBM China System Technology Laboratory
Tel: 86-10-82452903
Email: yang...@cn.ibm.com
Address: Building 28, ZhongGuanCun Software Park, No.8, Dong Bei Wang West Road, Haidian District Beijing 100193, PRC
北京市海淀区东北旺西路8号中关村软件园28号楼
邮编: 

Re: [xcat-user] nfs mounting root image from different server

2018-08-22 Thread Yuan Y Bai
Hi Jeff,
 
It is possible to do the nfs mount from a non-XCAT server. 
 
You need to sync the rootimg from the xCAT management node to the new NFS server, and change the nfsserver and nfsdir attributes in the node definitions on the current xCAT management node. The kernel and initrd-statelite.gz should be in a directory on the current xCAT management node such as "/install/netboot/rhels7.4/x86_64/compute/".
 
Here is my example. You can make the following changes on top of my previous example; I hope this helps.
 
10.5.106.1 is the new NFS server.
xcatmn1 is the old NFS server and an xCAT MN; the source statelite rootimg directory is on xcatmn1.
 
1. Log in to the new NFS server 10.5.106.1, then create the NFS directory /HAtest and a directory structure like the following:
/]# showmount -e
Export list for x.cluster.com:
/HAtest *
 
/]# mkdir -p /HAtest/netboot/rhels7.4/x86_64/compute/
 
2. rsync the rootimg from xcatmn1 to the new NFS server by running this command on the new NFS server:
/]# rsync -avz xcatmn1:/install/netboot/rhels7.4/x86_64/compute/rootimg /HAtest/netboot/rhels7.4/x86_64/compute/
/]# ls /HAtest/netboot/rhels7.4/x86_64/compute/rootimg
 
3. Log in to the current xCAT MN xcatmn2, change nfsserver and nfsdir for the compute node, make sure the kernel and initrd-statelite.gz are there, make sure the "litefile" table is correct as before, then provision statelite on it:
/]# tabdump litefile
/]# ls /install/netboot/rhels7.4/x86_64/compute/initrd-statelite.gz  kernel
/]# chdef bybc0609 nfsserver=10.5.106.1 nfsdir=/HAtest
/]# rinstall bybc0609 osimage=rhels7.4-x86_64-statelite-compute
 
 
 
 
- Original message -
From: Jeff Berry
To: xCAT Users Mailing list
Cc:
Subject: Re: [xcat-user] nfs mounting root image from different server
Date: Wed, Aug 22, 2018 6:46 PM
Hi Yuan,
 
that’s helpful.   A further question – is it possible to do the nfs mount from a non-XCAT server?  IE From a NAS device of some sort?
 
Really what we’re hoping for is some redundancy and possible throughput increase by using our storage system.
 
Regards,
 
Jeff Berry, MRC CBSU
 

Re: [xcat-user] nfs mounting root image from different server

2018-08-22 Thread Yuan Y Bai
Hi Jeff,
 
Here is my example of NFS-mounting the root image from a different server; I hope it helps.
 
My example: the remote nfsserver is 10.5.106.2, which is the first xCAT management node, xcatmn1. The current xCAT management node is xcatmn2. I use xcatmn2 to provision the statelite compute node bybc0609. The OS is Red Hat 7.4 and the system arch is x86_64.
 
1. Create the statelite osimage on the first node, xcatmn1, as usual:
copycds 
tabedit litefile
genimage rhels7.4-x86_64-statelite-compute
liteimg rhels7.4-x86_64-statelite-compute
 
2. Add nfsserver to the compute node definition on the second node, xcatmn2:
chdef bybc0609 nfsserver=10.5.106.2
 
3. rsync initrd-statelite.gz and kernel from xcatmn1 to the current xCAT MN xcatmn2 by running the following on xcatmn1:
rsync /install/netboot/rhels7.4/x86_64/compute/initrd-statelite.gz xcatmn2:/install/netboot/rhels7.4/x86_64/compute/
rsync /install/netboot/rhels7.4/x86_64/compute/kernel xcatmn2:/install/netboot/rhels7.4/x86_64/compute/
 
4. Sync the litefile table so that it is the same on xcatmn1 and xcatmn2.
 
5. Provision the statelite compute node on xcatmn2:
rinstall bybc0609 osimage=rhels7.4-x86_64-statelite-compute
 
6. You can use "rcons bybc0609" to watch the process, then log in to the CN and check the result using:
df -h
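 
If the `df -h` output is long, a filter over the mount table can make the NFS source easier to spot. The sketch below just lists NFS mounts with their server and export; the function name `nfs_mounts` is made up here, and which mount point carries the root image on a given statelite node is something to confirm on your own system.

```shell
#!/bin/sh
# nfs_mounts [mounts-file]
# Print "<server:/export> <mount-point>" for every NFS mount.
# Reads /proc/mounts by default, where field 1 is the source,
# field 2 the mount point and field 3 the filesystem type.
nfs_mounts() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $1, $2 }' "${1:-/proc/mounts}"
}

# Usage on the compute node:
#   nfs_mounts
```

With the example above, the root image line would be expected to start with 10.5.106.2:/install/netboot/rhels7.4/x86_64/compute/rootimg.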
 
 
 
 
- Original message -
From: Jeff Berry
To: xCAT Users Mailing list
Cc:
Subject: [xcat-user] nfs mounting root image from different server
Date: Tue, Aug 21, 2018 7:42 PM
 
Good afternoon all,
 
we'd like to mount our statelite root images from a different machine than our XCAT master. Are there any XCAT specific pitfalls I should be aware of and/or best practices I should try to follow?
 
I had hoped it would be as simple as adding an nfsserver entry to the noderes table and adjusting the rootimg dir in the osimage object. However, that didn't work out very smoothly. When I change the rootimgdir, I can no longer run liteimg since it looks for that image directory. When I try just changing the root image mount in the tftp file as a test, my node hangs when attempting to mount the root image.
 
That latter problem may be nfs related and have nothing to do with xcat per se, but it seems like there must be a way to build/modify the image on the server, copy the image to another server, and then do the mount directly from there. However, I could be wrong on this ...
 
lsxcatd -v
Version 2.14.2 (git commit f2090565b1d1b8efa7558d034de0478456c38e4c, built Wed Jul 11 07:15:53 EDT 2018)
 
Anyone tried to do something like this before?
 
Jeff Berry, MRC CBSU
 




Re: [xcat-user] Genesis boot for ppc64le and RAID options for IBM AC922?

2018-08-21 Thread Yuan Y Bai
Hi Keith,
 
1. For ppc64le on POWER9 nodes: mknb ppc64
 
2. We do not have P9 nodes with hardware RAID; we used iprconfig, diskdiscover and configraid on P8 before. You can use iprconfig to check whether there are RAID devices, either in its interactive mode to show devices or with "iprconfig -c show-config". The xCAT hardware RAID solution is here: https://xcat-docs.readthedocs.io/en/latest/advanced/raid/hardware_raid.html?highlight=configraid
 
3. Software RAID in xCAT provisioning: https://xcat-docs.readthedocs.io/en/latest/guides/admin-guides/manage_clusters/ppc64le/diskful/customize_image/raid_cfg.html?highlight=raid
 
I think all of this should apply to both P8 and P9. If you have a problem, you can paste it here and we can have a look at it together.
 
 
 
 
- Original message -
From: Keith Ball
To: xcat-user@lists.sourceforge.net
Cc:
Subject: Re: [xcat-user] Genesis boot for ppc64le and RAID options for IBM AC922?
Date: Wed, Aug 22, 2018 6:38 AM
Hi All,
 
Two questions regarding the new POWER9 servers (in this case, AC922) and provisioning ppc64le:
 
1.) From documentation and mknb, it appears that creating a working ppc64le for POWER9 nodes is not yet supported. Does anyone know a workaround, or know when this feature might be released? 
 
2.) Also, has anyone used iprconfig, diskdiscover and configraid to discover/configure hardware RAID on Power9? From what I can tell, no RAID solution was configured for the nodes, so the 2 disks appear to be un-RAIDed (show up as 2 distinct devices). 
 
There does not seem to be anything in Petitboot, or similar at boot time, that looks like a RAID configurator (like the Intel RAID one would see on e.g. a Supermicro x86 machine). Have folks used hardware RAID on POWER9 machines, or have they used some software RAID solution, in conjunction with xCAT deployment?
 
Many Thanks,
   Keith--

Keith D. Ball, PhD
RedLine Performance Solutions, LLC
web:  http://www.redlineperf.com/
email: kb...@redlineperf.com
cell: 540-557-7851
 




Re: [xcat-user] statelite crashes with nfs server timeout

2018-07-15 Thread Yuan Y Bai
 
kdump doc link: https://xcat-docs.readthedocs.io/en/latest/guides/admin-guides/manage_clusters/ppc64le/diskless/customize_image/enable_kdump.html?highlight=kdump
 
 
 
- Original message -
From: "Song BJ Yang"
To: xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.sourceforge.net
Subject: Re: [xcat-user] statelite crashes with nfs server timeout
Date: Mon, Jul 16, 2018 10:59 AM
Hi Jeff,
 
Did you enable kdump? The core dump file might help to find the problem.
 
 
- Original message -
From: David Johnson
To: xCAT Users Mailing list
Cc:
Subject: Re: [xcat-user] statelite crashes with nfs server timeout
Date: Fri, Jul 13, 2018 10:58 PM
 
I’m not sure, but this seems to have a whiff of a problem we had a while back, where we forgot to make appropriate holes in the firewall. GPFS died after the node had been up successfully for a while. It worked at first because the connection was initiated from the host to the GPFS servers, but traffic back to the node was blocked after the session was dropped from the firewall's table of active connections.
 
On Jul 13, 2018, at 5:06 AM, Javier Ron  wrote: 

Hello,
 
It sounds like an NFS thing, you could try different options for the mount
 
https://www.centos.org/forums/viewtopic.php?t=8787
 

 
From: Jeff Berry
Sent: 12 July 2018 14:35:53
To: xCAT Users Mailing list
Subject: [xcat-user] statelite crashes with nfs server timeout
 
Good afternoon all,
 
I've got some Centos 7.5 statelite nodes which seem to be booting properly, but after being up for less than a day, they crash with what look like nfs timeouts. The server is up, and if I rpower reset the nodes, they come back up with no problem, but then they crash again overnight.
 
This may not be an xcat problem at all, it may be an nfs issue, but I thought I'd toss it out here and see if it rang any bells for anyone,
 
Jeff Berry, MRC CBSU
 




Re: [xcat-user] resolv.conf multiple search domains

2018-07-03 Thread Yuan Y Bai
Hi Jeff,
 
I think "/etc/resolv.conf" on your node is overwritten by dhclient.
You can try the following to get the correct result.
 
If "/etc/resolv.conf" is "ro" in the litefile table, you can customize rootimg/etc/resolv.conf. After the node is booted, it gets the customized /etc/resolv.conf, which is read-only.
 
If "/etc/resolv.conf" is "tmpfs" or "rw" in the litefile table, "/etc/resolv.conf" will be generated by dhclient. In this situation, the nameservers and domain from the "networks" table take priority over those in the "site" table, so you can configure the correct nameservers and domain for the specific network entry in the networks table. If nameservers and domain are empty in the networks table, you can correct nameservers and domain in the site table.
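 
To see which of the two cases applies, the litefile entry for the file can be looked up from the `tabdump` output, roughly as sketched below. It assumes the litefile columns are image,file,options,comments,disable; the helper name `litefile_options` and the network name and values in the usage comment are placeholders.

```shell
#!/bin/sh
# litefile_options <file>
# Read `tabdump litefile` CSV on stdin and print "<image> <options>"
# for every row whose file column matches the given path.
litefile_options() {
    awk -F, -v f="$1" '{
        img = $1; file = $2; opt = $3
        gsub(/"/, "", img); gsub(/"/, "", file); gsub(/"/, "", opt)
        if (file == f) print img, opt
    }'
}

# Usage on the management node:
#   tabdump litefile | litefile_options /etc/resolv.conf
# If the option is tmpfs or rw, set the values on the network object,
# for example (object name and values are placeholders):
#   chdef -t network -o <netname> nameservers=<ip> domain=<domain>
```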
 
 
 
 
- Original message -
From: Jeff Berry
To: xCAT Users Mailing list
Cc:
Subject: [xcat-user] resolv.conf multiple search domains
Date: Fri, Jun 29, 2018 9:54 PM
 
Good afternoon,
 
configuring the cluster is proceeding apace, and I find myself unclear on how best to deal with some dns issues.
 
xcat 2.14.1, Master and statelite nodes all running CentOS7.5
 
When I boot, a resolv.conf file is being generated and installed, and although it has the right nameservers, it does not have the search domains we want. After liteimg, the .defaults/etc/resolv.conf file contains just the dummy line. And on boot, I end up with a resolv.conf that looks like:
 
search  nameserver nameserver nameserver nameserver
 
That is, the same domain is duplicated on the search line. The nameservers themselves are correct, though.
 
I tried editing the .defaults/etc/resolv.conf file, but it had no effect.
 
Obviously I am unclear on how that resolv.conf file is being generated, and any pointers that anyone can provide will be gratefully followed up.
 
Jeff Berry, MRC CBU
 




Re: [xcat-user] resolv.conf multiple search domains

2018-07-03 Thread Yuan Y Bai
You can configure the litetree table; here is an example:
 
[root@bybc0602 ~]# tabdump litetree
#priority,image,directory,mntopts,comments,disable
"1","rhels7.3-custom-statelite","bybc0602:/statelite/install/",,,
 
 
 
 
- Original message -
From: Jeff Berry
To: xCAT Users Mailing list
Cc:
Subject: Re: [xcat-user] resolv.conf multiple search domains
Date: Mon, Jul 2, 2018 5:11 PM
Hi Yuan,
 
that’s what I thought should be happening – but it isn’t working properly.     When I log into the node and compare /etc/resolv.conf and /.default/etc/resolv.conf  they are not the same.  
 
I’m wondering if there’s a configuration setting that I’ve got wrong.
 
Best,
 
Jeff
 
 
 
 


Re: [xcat-user] resolv.conf multiple search domains

2018-07-01 Thread Yuan Y Bai
Hi Jeff,
 
You can also customize /etc/resolv.conf.
 
When a node boots up in statelite mode, it will by default copy all of its tmpfs files from the .default directory of the root image, for example /install/netboot/rhels7.3/x86_64/compute/rootimg/.default, so you are not required to set up a litetree table. If you decide that you want some of the files pulled from different locations that are different per node, you can use this table. The litetree table controls where the initial content of the files in the litefile table comes from, and the long-term content of the ro files.
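The lookup described above can be pictured roughly as follows. This is an illustration only, not xCAT code: it assumes that litetree directories are consulted in ascending priority order before falling back to the image's .default directory, and `pick_source` with its arguments is a hypothetical name.

```python
def pick_source(path, litetree_dirs, default_files):
    """Return where the initial content of `path` would come from.

    litetree_dirs: list of (priority, directory, files) tuples, where `files`
        is the set of paths that directory provides (made-up test data).
    default_files: set of paths present under the image's .default directory.
    """
    # Assumption: lower priority numbers are searched first.
    for _priority, directory, files in sorted(litetree_dirs, key=lambda e: e[0]):
        if path in files:
            return directory
    # No litetree directory provides the file: fall back to .default.
    if path in default_files:
        return ".default"
    return None


# With an empty litetree table, everything comes from .default.
print(pick_source("/etc/resolv.conf", [], {"/etc/resolv.conf"}))
```

A file that a litetree directory provides would win over the .default copy under this assumption, which is one thing to check when the two resolv.conf files differ.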
 
Best Regards,
Yuan Bai (白媛)
 
 
- Original message -
From: Jeff Berry
To: xCAT Users Mailing list
Cc:
Subject: [xcat-user] resolv.conf multiple search domains
Date: Fri, Jun 29, 2018 9:54 PM
Good afternoon,

configuring the cluster is proceeding apace, and I find myself unclear on how best to deal with some dns issues.

xcat 2.14.1, Master and statelite nodes all running CentOS7.5

When I boot, a resolv.conf file is being generated and installed, and although it has the right nameservers, it does not have the search domains we want. After liteimg, the .defaults/etc/resolv.conf file contains just the dummy line. And on boot, I end up with a resolv.conf that looks like:

search
nameserver
nameserver
nameserver
nameserver

That is, the same domain is duplicated on the search line. The nameservers themselves are correct, though.

I tried editing the .defaults/etc/resolv.conf file, but it had no effect.

Obviously I am unclear on how that resolv.conf file is being generated, and any pointers that anyone can provide will be gratefully followed up.

Jeff Berry, MRC CBU
 




Re: [xcat-user] xCAT node name invalid characters or hard code limitations ?

2018-06-21 Thread Yuan Y Bai
Hi Peter,
 
1: Is ‘_’ not allowed in the compute node name?

'_' is not allowed in the compute node name.

2: Is there any way to make it possible to use ‘_’ in the compute node name in xCAT?

I suggest that you do not use '_' in the node name, because '_' is invalid. If you change the makedns code so that `makedns -n` works, I am not sure whether other functions will still work. If you do want to change the code, you can modify line 440, "unless ($names =~ /^[a-z0-9\. \t\n-]+$/i) {", in /opt/xcat/lib/perl/xCAT_plugin/ddns.pm.
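To find the offending entries before running `makedns -n`, the same character class can be applied to /etc/hosts from a small script. This is only a sketch: the pattern is copied from the ddns.pm line quoted above, while `invalid_hosts_lines` is a hypothetical helper and may not match every detail of how makedns parses the file.

```python
import re

# Same character class as the check quoted from ddns.pm, case-insensitive.
VALID_NAMES = re.compile(r"^[a-z0-9\. \t\n-]+$", re.IGNORECASE)


def invalid_hosts_lines(hosts_text):
    """Return the hosts lines whose name fields contain invalid characters."""
    bad = []
    for line in hosts_text.splitlines():
        entry = line.split("#", 1)[0].strip()  # drop comments
        parts = entry.split(None, 1)           # address, then the names
        if len(parts) < 2:
            continue
        if not VALID_NAMES.match(parts[1]):
            bad.append(line)
    return bad


# The line reported in the makedns output from this thread.
print(invalid_hosts_lines("172.29.101.6 n05_xcc n05_xcc.cluster\n"))
```

Renaming the reported nodes (for example to use '-' instead of '_') avoids the need to patch ddns.pm at all.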
 
Best Regards,
Yuan Bai (白媛)
 
 
- Original message -
From: peter CZ1 Peng
To: xCAT Users Mailing list
Cc:
Subject: [xcat-user] xCAT node name invalid characters or hard code limitations ?
Date: Fri, Jun 22, 2018 1:31 PM
Hi all,
I set up xCAT and see the failure message below when I run makedns -n. Can anybody help take a look? Thanks.
 
Ignoring line 172.32.44.178 cfdv3_192g_012-bmc cfdv3_192g_012-bmc.cluster  in /etc/hosts, names  cfdv3_192g_012-bmc cfdv3_192g_012-bmc.cluster  contain invalid characters (valid characters include a through z, numbers and the '-', but not '_' 
makedns -n , 
 
 
[root@mgt ~]# makedns -n
Ignoring line 172.29.101.6 n05_xcc n05_xcc.cluster  in /etc/hosts, names  n05_xcc n05_xcc.cluster  contain invalid characters (valid characters include a through z, numbers and the '-', but not '_'
Handling n01-xcc in /etc/hosts.
 
So my questions are:
 
1: Is ‘_’ not allowed in the compute node name?
2: Is there any way to make it possible to use ‘_’ in the compute node name in xCAT?
 
 
 
 
Peter CZ Peng
Department: Complex Solution Rack TE
Address: ISH3 Shenzhen
Lenovo China
peng...@lenovo.com
 
 


Re: [xcat-user] SciLinux 7.4 statelite problems

2018-06-19 Thread Yuan Y Bai
Hi Jeff,
 
Could you try rd.break=cleanup as follows? Alternatively, you can set the break point with addkcmdline=rd.break=pre-pivot.
 
 chdef node-i01 addkcmdline=rd.break=cleanup
 rinstall node-i01 osimage
 rcons node-i01
 
Have you tried adding "/etc/systemd/" to litefile? At present we only add "/etc/systemd/system/multi-user.target.wants/".
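The difference between those two litefile entries can be checked with a few lines of script. This is a rough sketch under one assumption: an entry ending in '/' stands for a directory and covers everything beneath it, while any other entry covers only that file. `covering_entry` is a hypothetical helper, not part of xCAT.

```python
def covering_entry(path, litefile_entries):
    """Return the litefile entry that covers `path`, or None if none does."""
    for entry in litefile_entries:
        if entry.endswith("/"):
            # Directory entry: covers the directory itself and its contents.
            if path.startswith(entry) or path + "/" == entry:
                return entry
        elif path == entry:
            return entry
    return None


# A few entries taken from the litefile table shown in this thread.
entries = [
    "/etc/sysconfig/",
    "/etc/resolv.conf",
    "/etc/systemd/system/multi-user.target.wants/",
]
print(covering_entry("/etc/systemd/system.conf", entries))
print(covering_entry("/etc/systemd/system.conf", entries + ["/etc/systemd/"]))
```

With only the multi-user.target.wants entry, other files under /etc/systemd/ are not covered; adding "/etc/systemd/" covers them.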
 
 
 
Best Regards,
Yuan Bai (白媛)
 
 
- Original message -
From: Jeff Berry
To: xCAT Users Mailing list
Cc:
Subject: Re: [xcat-user] SciLinux 7.4 statelite problems
Date: Mon, Jun 18, 2018 5:36 PM
Hi everyone,
 
thanks for the pointers.   I decided to go back to the very beginning and did a clean reinstall of xcat:
Version 2.14.1 (git commit 70d6e7f93cc9714a127c22df2e7ca53d4996a34c, built Fri Jun  1 03:00:53 EDT 2018)
 
then I walked through the documentation - https://xcat-docs.readthedocs.io/en/stable - and it works slightly better now.  I’m no longer getting udev errors, but I’m still getting journald errors:
code killed, status 6/ABRT
on restart ‘/run/log/journal//system.journal corrupted or uncleanly shut down.
 
which looks like it might be a space/memory issue?
 
In any case, even just after boot, I have the same problem where I can’t ssh to the node or rcons, or even get a console prompt on the drac card (it’s a dell C6420).  It’s pingable at the correct ip address.
 
As per the email below, I checked the image for pkglist, exlist, and postinstall:
 
Object name: SL7.4-statelite-v1
    exlist=/opt/xcat/share/xcat/netboot/rh/compute.rhels7.x86_64.exlist
    imagetype=linux
    osarch=x86_64
    osdistroname=SL7.4-x86_64
    osname=Linux
    osvers=SL7.4
    otherpkgdir=/install/post/otherpkgs/SL7.4/x86_64
    permission=755
    pkgdir=/install/SL7.4/x86_64
    pkglist=/opt/xcat/share/xcat/netboot/rh/compute.rhels7.x86_64.pkglist
    postinstall=/opt/xcat/share/xcat/netboot/rh/compute.rhels7.x86_64.postinstall
    profile="">
    provmethod=statelite
    rootimgdir=/install/netboot/SL7.4/x86_64/compute
 
I had a brief moment where I thought it might be an selinux problem, but in the rootimg selinux is disabled in /etc/selinux/config ...
the litefile is standard, but I’m thinking that I might change /var and /run to persistent to see if I can get some extra insight into what’s happening on the node.
#image,file,options,comments,disable
"ALL","/etc/adjtime","tmpfs",,
"ALL","/etc/securetty","tmpfs",,
"ALL","/etc/lvm/","tmpfs",,
"ALL","/etc/ntp.conf","tmpfs",,
"ALL","/etc/rsyslog.conf","tmpfs",,
"ALL","/etc/rsyslog.conf.XCATORIG","tmpfs",,
"ALL","/etc/udev/","tmpfs",,
"ALL","/etc/ntp.conf.predhclient","tmpfs",,
"ALL","/etc/resolv.conf","tmpfs",,
"ALL","/etc/yp.conf","tmpfs",,
"ALL","/etc/resolv.conf.predhclient","tmpfs",,
"ALL","/etc/sysconfig/","tmpfs",,
"ALL","/etc/ssh/","tmpfs",,
"ALL","/etc/inittab","tmpfs",,
"ALL","/tmp/","tmpfs",,
"ALL","/var/","tmpfs",,
"ALL","/opt/xcat/","tmpfs",,
"ALL","/xcatpost/","tmpfs",,
"ALL","/etc/systemd/system/multi-user.target.wants/","tmpfs",,
"ALL","/root/.ssh/","tmpfs",,
"ALL","/etc/rc3.d/","tmpfs",,
"ALL","/etc/rc2.d/","tmpfs",,
"ALL","/etc/rc4.d/","tmpfs",,
"ALL","/etc/rc5.d/","tmpfs",,
 
I’m booting with rd.debug and rd.break=cleanup, but I don’t get a shell – I think because the root image *is* mounting.
 
As I said, thanks for the thoughts, and I just wanted to make sure that people know that I appreciate the input,
 
Best,
 
Jeff Berry
 
 
 
 

Re: [xcat-user] SciLinux 7.4 statelite problems

2018-06-12 Thread Yuan Y Bai
Hi Jeff,
 
Could you check the exlist, pkglist and postinstall attributes in your osimage definition?
We do not formally ship compute.SL7.pkglist; we use the same files as for rhels7. So could you try to use the rhels7-related files for your osimage?
 
Here is an example osimage; you can find the files for the right arch under /opt/xcat/share/xcat/netboot/rh/:
]# lsdef -t osimage rhels7.4-x86_64-statelite-compute -i exlist,pkglist,postinstall
Object name: rhels7.4-x86_64-statelite-compute
    exlist=/opt/xcat/share/xcat/netboot/rh/compute.rhels7.x86_64.exlist
    pkglist=/opt/xcat/share/xcat/netboot/rh/compute.rhels7.x86_64.pkglist
    postinstall=/opt/xcat/share/xcat/netboot/rh/compute.rhels7.x86_64.postinstall
 
 
As for "Failing to install mlx_en", I got the same message when there is no mlx in my system.
 
 
Best Regards,
Yuan Bai (白媛)
 
 
- Original message -
From: Jeff Berry
To: xCAT Users Mailing list
Cc:
Subject: [xcat-user] SciLinux 7.4 statelite problems
Date: Tue, Jun 12, 2018 4:25 PM
Good morning all,
 
I’m still wrestling with a SciLinux 7.4 statelite deployment with xcat 2.13.11.    The dracut hooks don’t seem to be working properly, which is both making it difficult to debug and also probably symptomatic of a larger problem.   Running genimage, a few things have caught my eye.
 
The package list is looking for busybox-anaconda, which doesn’t seem to exist for SciLinux 7.  A bit of poking seems to suggest that it is deprecated, but it’s not clear to me what a suitable replacement might be.  Is there a preferred solution/workaround?
 
The dracut install also is throwing a couple of errors.  Failing to install mlx_en is, I think, benign.  I am also getting this error: “dracut-install: ERROR: installing '/etc/udev/udev.conf'”  which seems like it might be more significant, especially in light of my dracut problems.  However, I don’t know what might be causing this problem, nor how to fix it.
 
Any insight will be latched upon to with unseemly haste,
 
Jeff Berry
MRC-CBSU, Cambridge


Re: [xcat-user] [External] Problem with statelite boot - no console or ssh

2018-05-20 Thread Yuan Y Bai
Hi Jeff:
 
You can check the following:
 
1, Check that every NFS server exports its directories correctly, including the NFS servers from the `statelite` table and the `litetree` table.
    "showmount -e <nfs-server>"
 
2, You can use "lsdef -t osimage <osimage> -i pkglist" to find the pkglist path, then add package names such as yum to this file. Then you should execute:
  genimage <osimage>
  liteimg <osimage>
  nodeset <node> osimage=<osimage>
  rsetboot <node> net    #if your node is not VM
  rpower <node> reset
 
3, I installed rh7.4, and the dracut packages are as follows, but I think you use a different OS:
dracut-033-502.el7.x86_64
dracut-network-033-502.el7.x86_64
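The rebuild sequence in step 2 can be written down once so that the order is not lost between attempts. This is a sketch only: the argument positions for the node and osimage names are an assumption, since the mail as archived shows the commands without their arguments, and `rebuild_commands` is a hypothetical helper that only builds the command lines without running them.

```python
def rebuild_commands(node, osimage, is_vm=False):
    """Return the step 2 commands as argument lists, in the order given."""
    commands = [
        ["genimage", osimage],
        ["liteimg", osimage],
        ["nodeset", node, f"osimage={osimage}"],
    ]
    if not is_vm:
        # The mail notes rsetboot is only needed when the node is not a VM.
        commands.append(["rsetboot", node, "net"])
    commands.append(["rpower", node, "reset"])
    return commands


# Node and image names as they appear elsewhere in this thread.
for cmd in rebuild_commands("node-i01", "SL7.4-statelite-v1"):
    print(" ".join(cmd))
```

Printing the commands first and running them by hand keeps each step's output visible, which helps when genimage reports package errors.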
 
Best Regards,
Yuan Bai (白媛)
 
 
- Original message -
From: Jeff Berry <jeff.be...@mrc-cbu.cam.ac.uk>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] [External] Problem with statelite boot - no console or ssh
Date: Fri, May 18, 2018 7:35 PM
Some more digging suggests that the image is missing a lot of useful (needed) packages, including yum, and the only installed dracut package is just dracut.x86_64.
 
It looks like maybe the image build didn’t get the needed packages? 
 
I haven’t been able to get rd.debug output or get any breakpoints to work – I tried them all.  
 
Thanks for everyone’s time, and sorry if I’m  making obvious rookie mistakes ...
 
Jeff Berry
jeff.be...@mrc-cbu.cam.ac.uk
 
 

Re: [xcat-user] [External] Problem with statelite boot - no console or ssh

2018-05-17 Thread Yuan Y Bai
 
Did you try all of the pre-mount, mount and pre-pivot break points, and did every one of them have the problem?
 
 
Best Regards,
Yuan Bai (白媛)
 
 
- Original message -
From: Jeff Berry <jeff.be...@mrc-cbu.cam.ac.uk>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] [External] Problem with statelite boot - no console or ssh
Date: Thu, May 17, 2018 7:22 PM
Good afternoon,
 
thanks for the pointers.
The xcat version is: 2.13.11
 
As per Gilad’s suggestion, I tried booting to shell and that worked just fine.
 
I then tried your suggestions below with no luck.   However, it looks like there is a more fundamental problem.  None of the rd.break breakpoints worked, and the node booted to the same point before hanging.  This suggests to me that the dracut hooks are not working properly.   I’m investigating that more thoroughly.
 
I did want to thank you both for the replies.
 
Jeff Berry
jeff.be...@mrc-cbu.cam.ac.uk
 
 
 
From: Yuan Y Bai [mailto:by...@cn.ibm.com]
Sent: 16 May 2018 06:37
To: xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.sourceforge.net
Subject: Re: [xcat-user] [External] Problem with statelite boot - no console or ssh
 
Hi Jeff Berry
 
I looked at cluster.log, and I guess your xCAT version is not the latest. What is your xCAT version? You can run "lsxcatd -v" to get it.
 
From the log, your node hangs during "Allowing litetree from node-i01". You can add a break point and enter node-i01 to debug and find more useful information.
Execute the following to enter node-i01 through the console:
 
 chdef node-i01 addkcmdline=rd.break=cleanup
 rinstall node-i01 osimage
 rcons node-i01
 
After you enter node-i01, you can find statelite.log under "/sysroot/.statelite", and you can manually check whether the mounts are OK, etc. After you have checked everything, execute "exit" (repeatedly, if needed); if the system is fine, it will continue to boot into the normal statelite system.
 
 
Best Regards,
Yuan Bai (白媛)
 
 
- Original message -
From: Gilad Berman <gber...@lenovo.com>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] [External] Problem with statelite boot - no console or ssh
Date: Wed, May 16, 2018 12:02 AM
Do you have any issues booting to shell or installing standard image?
 
Gilad Berman
HPC Architect
Lenovo EMEA
+972-52-2554262
gber...@lenovo.com

Re: [xcat-user] ask for support the hardware raid in genesis image

2018-05-10 Thread Yuan Y Bai
Hi Peter,
 
If you want to add a command to the genesis image, you can try the following; we do not have an x86_64 system to verify it:
 
1, Copy your command into /opt/xcat/share/xcat/netboot/genesis/x86_64/fs/usr/sbin
2, Execute the "mknb x86_64" command before the nodeset command to update the network boot root image.
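The two steps above could be scripted along these lines. This is a minimal sketch and, like the steps themselves, is not verified on an x86_64 system; `stage_tool` and `follow_up_commands` are hypothetical helper names, and the default path is the one given in step 1.

```python
import shutil
from pathlib import Path

# Genesis file tree from step 1 above.
GENESIS_FS = Path("/opt/xcat/share/xcat/netboot/genesis/x86_64/fs")


def stage_tool(tool, genesis_fs=GENESIS_FS):
    """Copy `tool` into usr/sbin of the genesis tree and return the new path."""
    dest_dir = Path(genesis_fs) / "usr" / "sbin"
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / Path(tool).name
    shutil.copy2(tool, dest)  # copy2 also copies the permission bits
    return dest


def follow_up_commands(arch="x86_64"):
    """Step 2: the command to run afterwards, before nodeset."""
    return [["mknb", arch]]


print(follow_up_commands())
```

Passing a scratch directory as `genesis_fs` lets you check the resulting layout before touching the real genesis tree.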
 
 
 
Best Regards,
Yuan Bai (白媛)
 
 
- Original message -
From: Jarrod Johnson
To: peter CZ1 Peng, xCAT Users Mailing list
Cc:
Subject: Re: [xcat-user] ask for support the hardware raid in genesis image
Date: Thu, May 10, 2018 11:17 PM
Putting storcli64 in the genesis image should function.  The driver in the genesis image is sufficient to enable storcli64, and it has the needed libraries.
 
Storcli64 is proprietary, so we don’t bundle it in the rpm.
 
From: peter CZ1 Peng
Sent: Thursday, May 10, 2018 1:48 AM
To: xCAT Users Mailing list
Cc: Jarrod Johnson
Subject: ask for support the hardware raid in genesis image
 
Hi all,
    I checked the xCAT documentation and saw that xCAT supports PPC64 with iprconfig to configure hardware RAID in the genesis image (https://github.com/xcat2/xcat-core/blob/1ad3a53108739f26531bdd52cd02da8ce68dff40/docs/source/advanced/raid/hardware_raid.rst). I would like to know whether the genesis image will also support x86_64 systems, or whether anyone can tell me how to integrate the MegaCli64 or storcli64 command into the genesis image so that we can set up hardware RAID in the node discovery stage. It would be even better if we could also integrate commands such as lspci and dmidecode into the genesis image. Thanks.
 
 
 
Peter CZ Peng
 
 


Re: [xcat-user] confignics -s -r not removing ib0

2018-05-01 Thread Yuan Y Bai
Hi Javier,
 
Let me first clarify what you are trying to do. Do you want to use `confignics` to remove configured nics, and then re-configure the nics defined in the nics table, including the IB nics?
 
If the answer is yes, I think you can order the scripts like "confignics -s -r; confignetwork --ibaports=2". "confignics -s -r" will configure the install nic and remove some other nics; "confignetwork --ibaports=2" will configure all nics defined in the nics table, including IB. When "confignetwork --ibaports=2" starts to configure IB (the --ibaports number depends on your own case), it removes the old IB configuration from the compute node and then configures IB based on the nics table.
 
So you do not need to use `confignics` to remove the IB adapter from the compute node before configuring IB; the old configuration is removed every time. I carefully checked `confignics`, and you are right: "confignics -s -r" does not remove ib0.
 
You may have noticed that `confignics` is deprecated in the doc; the new script is `confignetwork`. We noticed that `confignics -r` always confuses users about the scenarios in which it can remove nics, so we let users make such changes themselves: `confignetwork` does not remove nics. The old `confignics` is kept, but we suggest using `confignetwork`.
 
 
Best Regards,
Yuan Bai (白媛)
 
 
- Original message -
From: Javier Ron <j@qmul.ac.uk>
To: "xcat-user@lists.sourceforge.net" <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] confignics -s -r not removing ib0
Date: Fri, Apr 27, 2018 7:36 PM
 
Hi Yuan,
 
There are several other Ethernet devices that are undefined & do get removed (maybe removed is wrong, just not configured) whilst deploying & specifying in the postbootscripts :confignics -s -r
however ib0 gets a config file (but is not enabled) & that causes an error when the network service is loaded
 
tabdump shows:
#
"host1"  "enp6s0f0!MTU=9000"
"host2"  "enp6s0f0!MTU=9000"
"host3"  "enp6s0f0!MTU=9000"
"host4"  "enp6s0f0!MTU=9000"

 
 
Kind Regards

Re: [xcat-user] confignics -s -r not removing ib0

2018-04-26 Thread Yuan Y Bai
Hi Javier,
 
`confignics -r` only works in this scenario: if the compute node’s nics were configured by confignics and the nics configuration then changed in the nics table, use `confignics -r` to remove the undefined nic.
 
You can refer to the doc: http://xcat-docs.readthedocs.io/en/latest/guides/admin-guides/manage_clusters/ppc64le/diskless/customize_image/network/cfg_second_adapter.html?highlight=confignics#option-r-to-remove-the-undefined-nics
 
 
That means the IB interface must have been configured by `confignics` in the first place. When you then delete the IB interface from the nics table, `confignics -r` should work. Could you use `tabdump nics` to check your nics table? If it does not work, could you open an issue and describe your steps?
 
 
Best Regards,
Yuan Bai (白媛)
 
 
- Original message -
From: Javier Ron
To: "xcat-user@lists.sourceforge.net"
Cc:
Subject: [xcat-user] confignics -s -r not removing ib0
Date: Wed, Apr 25, 2018 11:07 PM
Hello,
 
I am hoping someone can help.
Deploying a host with xcat & the ib0 adapter config file is not being removed
 
 
 
running:
updatenode nodename -P 'confignics -s -r'
 
produces below output:
 
nodename: confignics on nodename: executed script: 'configib -r' to remove all ib nics and configuration files
nodename: nothing to do.
 
 
This is coming from the configib script, but does not remove the ib0 config file from network-scripts
 
 
 
#if $NIC_IBNICS is not defined, all ib nics' configuration files will be deleted.
if [ -z "$NIC_IBNICS" ]; then
    echo "nothing to do."
fi
 
 
 


Re: [xcat-user] Defining more than one NTP server.

2018-03-27 Thread Yuan Y Bai
Hi Hakan,
 
It is possible to define more than one NTP server in the networks table. You can use `tabedit networks` to edit the networks table directly, or use the chdef command, for example: chdef -t network mgtnetwork ntpservers=10.0.0.102,10.0.0.103
 
After you add ntpservers to the networks table, run the `makedhcp -n` command; the ntpservers will then be updated in the dhcp configuration files, for example:
]# cat /etc/dhcp/dhcpd.conf|grep ntp
    option ntp-servers 10.0.0.102, 10.0.0.103;
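The mapping between the comma-separated ntpservers value and the dhcpd.conf line above can be illustrated in a few lines. This is only a sketch of the format shown in the example, not how `makedhcp` is implemented; `dhcp_ntp_option` is a hypothetical helper name.

```python
def dhcp_ntp_option(ntpservers):
    """Render a comma-separated ntpservers value as a dhcpd.conf option line."""
    servers = [s.strip() for s in ntpservers.split(",") if s.strip()]
    return "option ntp-servers " + ", ".join(servers) + ";"


# The value used in the chdef example above.
print(dhcp_ntp_option("10.0.0.102,10.0.0.103"))
```

A single server works the same way, so the attribute syntax is just a comma-separated list with no extra quoting.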
 
BTW: What is your user scenario for ntpservers in networks table?
There are another ntp servers attributes in site table: ntpservers and extntpservers. These 2 attributes are used by `makentp` command and `setupntp` script. You can also look into these if needed.
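
The dhcpd.conf check above can be turned into a small helper that lists the servers one per line. This is only a sketch: `list_dhcp_ntp_servers` is a hypothetical name, it assumes the `option ntp-servers a, b;` layout shown in the example output, and it does not call xCAT itself.

```shell
# Print the NTP servers listed in a dhcpd.conf-style file ($1), one per line.
list_dhcp_ntp_servers() {
    grep -E '^[[:space:]]*option[[:space:]]+ntp-servers[[:space:]]' "$1" |
        sed -E 's/^[[:space:]]*option[[:space:]]+ntp-servers[[:space:]]+//; s/;[[:space:]]*$//' |
        tr ',' '\n' |
        sed -E 's/^[[:space:]]+//; s/[[:space:]]+$//'
}
```

Running `list_dhcp_ntp_servers /etc/dhcp/dhcpd.conf` after `makedhcp -n` should then show each address from the ntpservers attribute on its own line.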
 
 
 
 
- Original message -
From: "Hakan Bayındır"
To: xCAT User Mailing List
Cc:
Subject: [xcat-user] Defining more than one NTP server.
Date: Mon, Mar 26, 2018 2:36 PM

Hello all,

Is it possible to define more than one NTP server in the networks table? It's defined as NTP servers, but I'm unable to find any details about the definition syntax in relevant documentation (http://xcat-docs.readthedocs.io/en/stable/guides/admin-guides/references/man5/networks.5.html?highlight=ntp).

Any help will be appreciated,

Best regards,
Hakan
--
Hakan BAYINDIR
Chief Expert Researcher
Network Technologies Unit
TÜBİTAK ULAKBİM
T.C. Bilim, Sanayi ve Teknoloji Bakanlığı (Old Building)
Mustafa Kemal Mahallesi Dumlupınar Bulvarı (Eskişehir Yolu 7.Km) 2151. Cadde No:154
Opposite ODTÜ, 06510 Çankaya, ANKARA
T +90 312 298 9373
F +90 312 266 5181
www.ulakbim.gov.tr
hakan.bayin...@tubitak.gov.tr
 
 




Re: [xcat-user] Confignetworks and default route

2018-01-09 Thread Yuan Y Bai
 
I missed one line in my last mail:
 
confignetwork migrates routes to the new interface. You can also use nicextraparams in the nics table to customize settings for a specific interface; the nicextraparams content is added to the ifcfg-xxx file.
 
 
 
 
- Original message -
From: "Yuan Y Bai" <by...@cn.ibm.com>
To: xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.sourceforge.net
Subject: Re: [xcat-user] Confignetworks and default route
Date: Wed, Jan 10, 2018 10:12 AM
Hi Nathan,
 
Thanks Russ.
We cannot configure the gateway in the nics table. You can configure the gateway in the networks table for the specific network; confignetwork also uses the gateway from the networks table, but confignetwork
If you want to configure the default gateway as a static gateway, you can use the makeroutes command or the setroute script after running confignetwork. Here is my draft doc for this command and script: https://github.com/xcat2/xcat-core/pull/4580/commits/e9fd1c9e345997e409b54229be159bafadc3de73
 
 
 
 
- Original message -
From: Russ Auld <russa...@comcast.net>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] Confignetworks and default route
Date: Wed, Jan 10, 2018 8:28 AM
The gateway field should be used to set the default route. Make sure there's just one gateway set if you use multiple nics, otherwise the last one will win. 
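
The "just one gateway" advice can be checked mechanically. The sketch below is an assumption-laden illustration, not an xCAT tool: `networks_with_gateway` is a hypothetical helper that reads text in the `Object name:` / `attr=value` layout that lsdef prints and lists the objects whose gateway attribute is non-empty.

```shell
# Read lsdef-style text on stdin and print the names of objects whose
# gateway attribute is non-empty.
networks_with_gateway() {
    awk '
        /^Object name:/ { name = $3 }
        /^[[:space:]]*gateway=/ {
            sub(/^[[:space:]]*gateway=/, "")
            if ($0 != "") print name
        }
    '
}
```

Assuming `lsdef -t network -i gateway` produces that layout on your management node, piping it into `networks_with_gateway` and seeing more than one name printed would suggest that more than one network sets a gateway.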
 
On Jan 9, 2018 12:16 PM, Nathan Harper <nathan.har...@cfms.org.uk> wrote:
Hi,
 
We've been using confignetworks post OS install to take the installnic and bond it with another interface.
 
As the default gateway is set by DHCP, is there some config I'm missing in the nics table to get it to set the default gateway?
 --

Nathan Harper // IT Systems Lead
e: nathan.har...@cfms.org.uk   t: 0117 906 1104  m:  0787 551 0891  w: www.cfms.org.uk  
CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR 
 
CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP
 




Re: [xcat-user] Confignetworks and default route

2018-01-09 Thread Yuan Y Bai
Hi Nathan,
 
Thanks Russ.
We cannot configure the gateway in the nics table. You can configure the gateway in the networks table for the specific network; confignetwork also uses the gateway from the networks table, but confignetwork
If you want to configure the default gateway as a static gateway, you can use the makeroutes command or the setroute script after running confignetwork. Here is my draft doc for this command and script: https://github.com/xcat2/xcat-core/pull/4580/commits/e9fd1c9e345997e409b54229be159bafadc3de73
 
 
 
 
- Original message -
From: Russ Auld
To: xCAT Users Mailing list
Cc:
Subject: Re: [xcat-user] Confignetworks and default route
Date: Wed, Jan 10, 2018 8:28 AM
The gateway field should be used to set the default route. Make sure there's just one gateway set if you use multiple nics, otherwise the last one will win. 
 
On Jan 9, 2018 12:16 PM, Nathan Harper  wrote:
Hi,
 
We've been using confignetworks post OS install to take the installnic and bond it with another interface.
 
As the default gateway is set by DHCP, is there some config I'm missing in the nics table to get it to set the default gateway?
 --

Nathan Harper // IT Systems Lead
e: nathan.har...@cfms.org.uk   t: 0117 906 1104  m:  0787 551 0891  w: www.cfms.org.uk  
CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons Green // Bristol // BS16 7FR 
 
CFMS Services Ltd is registered in England and Wales No 05742022 - a subsidiary of CFMS Ltd CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1 4QP
 




Re: [xcat-user] Local scratch for stateless compute nodes

2017-11-28 Thread Yuan Y Bai
Hi Bin,
 
Could you help look into this issue?
 
 
 
- Original message -
From: "Yuan Y Bai" <by...@cn.ibm.com>
To: xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.sourceforge.net
Subject: Re: [xcat-user] Local scratch for stateless compute nodes
Date: Wed, Nov 29, 2017 10:10 AM
 
Hi Gilad,
Could you add an entry to the policy table to permit running the "getpartition" command from the node?
 
chtab priority=7.1 policy.commands=getpartition policy.rule=allow
 
 
 
- Original message -
From: Gilad Berman <gber...@lenovo.com>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] Local scratch for stateless compute nodes
Date: Tue, Nov 28, 2017 8:34 PM
Yuan Bai hello,
 
I did follow exactly those instructions (although I am pretty sure the litefile table part is not needed for the stateless image) but I still see nothing happening on the nodes. I suspect that either there is no piece of code for stateless that handles localdisk or I am missing something very basic.
 
Have you tried it with stateless? Where can I find the code and how can I debug it? I can’t find anything related to partition or local disk setup in the logs.
 
Here is my local disk file – 
enable=yes
enablepart=yes
 
[disk]
dev=/dev/sda
clear=yes
parts=1,19,80
 
[swapspace]
dev=/dev/sda1
 
[localspace]
dev=/dev/sda2
fstype=ext4
 
[localspace]
dev=/dev/sda3
fstype=ext4
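
Before debugging on the node, the parts= line of the file above can be sanity-checked locally. The sketch below assumes each entry is a plain percentage of the disk, as in `parts=1,19,80`; range-style entries are not handled, and `localdisk_parts_total` is a hypothetical helper, not an xCAT command.

```shell
# Sum the plain-percentage entries of the first "parts=" line in a localdisk
# file ($1) and print the total.
localdisk_parts_total() {
    grep -m1 '^parts=' "$1" | cut -d= -f2 | tr ',' '\n' |
        awk '{ total += $1 } END { print total + 0 }'
}
```

With the file shown above saved as /install/custom/netboot/localdisk, `localdisk_parts_total /install/custom/netboot/localdisk` adds up 1, 19 and 80; a total above 100 would point to a mistake in the file rather than in xCAT.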
 
 
 
# lsdef -t osimage -o rhels7.3-x86_64-netboot-compute
Object name: rhels7.3-x86_64-netboot-compute
    exlist=/opt/xcat/share/xcat/netboot/rh/compute.rhels7.x86_64.exlist
    imagetype=linux
    osarch=x86_64
    osdistroname=rhels7.3-x86_64
    osname=Linux
    osvers=rhels7.3
    otherpkgdir=/install/post/otherpkgs/rhels7.3/x86_64
    partitionfile=/install/custom/netboot/localdisk
    permission=755
    pkgdir=/install/rhels7.3/x86_64
    pkglist=/opt/xcat/share/xcat/netboot/rh/compute.rhels7.x86_64.pkglist
    postinstall=/opt/xcat/share/xcat/netboot/rh/compute.rhels7.x86_64.postinstall
    profile="">
    provmethod=netboot
    rootimgdir=/install/netboot/rhels7.3/x86_64/compute
    synclists=/install/custom/netboot/compute.synclist

THX!!
 
 
Gilad Berman
HPC Architect
Lenovo EMEA
+972-52-2554262
gber...@lenovo.com
 
 
From: Yuan Y Bai [mailto:by...@cn.ibm.com]
Sent: Tuesday, November 28, 2017 7:30 AM
To: xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.so

Re: [xcat-user] Local scratch for stateless compute nodes

2017-11-28 Thread Yuan Y Bai
 
Hi Gilad,
Could you add an entry to the policy table to permit running the "getpartition" command from the node?
 
chtab priority=7.1 policy.commands=getpartition policy.rule=allow
 
 
 
- Original message -
From: Gilad Berman <gber...@lenovo.com>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] Local scratch for stateless compute nodes
Date: Tue, Nov 28, 2017 8:34 PM
Yuan Bai hello,
 
I did follow exactly those instructions (although I am pretty sure the litefile table part is not needed for the stateless image) but I still see nothing happening on the nodes. I suspect that either there is no piece of code for stateless that handles localdisk or I am missing something very basic.
 
Have you tried it with stateless? Where can I find the code and how can I debug it? I can’t find anything related to partition or local disk setup in the logs.
 
Here is my local disk file – 
enable=yes
enablepart=yes
 
[disk]
dev=/dev/sda
clear=yes
parts=1,19,80
 
[swapspace]
dev=/dev/sda1
 
[localspace]
dev=/dev/sda2
fstype=ext4
 
[localspace]
dev=/dev/sda3
fstype=ext4
 
 
 
# lsdef -t osimage -o rhels7.3-x86_64-netboot-compute
Object name: rhels7.3-x86_64-netboot-compute
    exlist=/opt/xcat/share/xcat/netboot/rh/compute.rhels7.x86_64.exlist
    imagetype=linux
    osarch=x86_64
    osdistroname=rhels7.3-x86_64
    osname=Linux
    osvers=rhels7.3
    otherpkgdir=/install/post/otherpkgs/rhels7.3/x86_64
    partitionfile=/install/custom/netboot/localdisk
    permission=755
    pkgdir=/install/rhels7.3/x86_64
    pkglist=/opt/xcat/share/xcat/netboot/rh/compute.rhels7.x86_64.pkglist
    postinstall=/opt/xcat/share/xcat/netboot/rh/compute.rhels7.x86_64.postinstall
    profile="">
    provmethod=netboot
    rootimgdir=/install/netboot/rhels7.3/x86_64/compute
    synclists=/install/custom/netboot/compute.synclist

THX!!
 
 
Gilad Berman
HPC Architect
Lenovo EMEA
+972-52-2554262
gber...@lenovo.com
 
 
From: Yuan Y Bai [mailto:by...@cn.ibm.com]
Sent: Tuesday, November 28, 2017 7:30 AM
To: xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.sourceforge.net
Subject: Re: [xcat-user] Local scratch for stateless compute nodes
 
Hello,
 
Please refer to "Enabling the localdisk Option" section under "diskless installation" section :  http://xcat-docs.readthedocs.io/en/latest/guides/admin-guides/manage_clusters/ppc64le/diskless/customize_image/localdisk.html
 
The main diskless installation doc link is here:
http://xcat-docs.readthedocs.io/en/latest/guides/admin-guides/manage_clusters/ppc64le/diskless/index.html
 
 

Re: [xcat-user] Local scratch for stateless compute nodes

2017-11-27 Thread Yuan Y Bai
That is correct, diskless and statelite both support local disk for different purposes now.
 
 
 
- Original message -
From: Vinícius Ferrão
To: xCAT Users Mailing list
Cc:
Subject: Re: [xcat-user] Local scratch for stateless compute nodes
Date: Tue, Nov 28, 2017 12:14 AM
 
Hello,
 
According to the documentation, they differ in one specific way:
 
. Stateless: nodes boot from a RAMdisk OS image downloaded from the xCAT mgmt node or service node at boot time.
 
. Statelite: nodes boot from an NFS-root diskless OS image.
 
Both support local disk for different purposes. What we are targeting is local disks just for scratch and swap. They will not hold any state, and this is fully supported according to the documentation.
 
Here’s the documentation: https://sourceforge.net/p/xcat/wiki/XCAT_Overview,_Architecture,_and_Planning/#xcat-cluster-node-types
 
Thanks,
V. 
Sent from my iPhone
On 27 Nov 2017, at 12:50, Russ Auld  wrote: 
If you're using netboot and local disk, then isn't that "statelite"?
Do the statelite instructions not work?
 
On Nov 27, 2017 9:26 AM, Gilad Berman  wrote:

All, 
 
I would like to join this question – 
Does localdisk even work with stateless? From the docs it seems that it should be supported (because it is under stateless), however:
- The instructions are taken from statelite and refer to statelite code (litefile)
- The rc.localdisk code is under statelite
- In the linuximage man page: “Partitionfile - Only available for diskful osimages and statelite osimages(localdisk enabled)“
 
A very quick trial on my stateless nodes results in nothing :), it seems there is simply no reference to localdisk with stateless.
 
So, can someone please help clarify it?
 
** as always, there is a chance I missed something very basic and it should be working :)
 
THX in advance!
 
Gilad Berman
HPC Architect
Lenovo EMEA
+972-52-2554262
gber...@lenovo.com
 
 
From: Vinícius Ferrão [mailto:fer...@versatushpc.com.br]
Sent: Wednesday, November 22, 2017 4:09 AM
To: xcat-user@lists.sourceforge.net
Subject: [xcat-user] Local scratch for stateless compute nodes
 
Hello,
 
I would like to enable swap and local /tmp on my stateless nodes, but after following the documentation on the following link nothing appears to work:
http://xcat-docs.readthedocs.io/en/stable/advanced/hierarchy/provision/diskless_sn.html
 
I’m aware that the documentation is for service nodes and not for compute nodes, but I was thinking the procedure would be similar.
 
At this point I have these settings on the osimage:
[root@headnode xcat]# lsdef -t osimage centos7.4-x86_64-netboot-compute    

Re: [xcat-user] Managing user's SSH configuration with xCAT

2017-11-27 Thread Yuan Y Bai
Hi Kevin,
 
Please refer to http://xcat-docs.readthedocs.io/en/latest/advanced/security/security.html#commands-access-control
 
See the "Granting Users xCAT Privileges" topic there.
 
 
 
 
- Original message -
From: Kevin Keane
To: xCAT Users Mailing list
Cc:
Subject: [xcat-user] Managing user's SSH configuration with xCAT
Date: Tue, Nov 28, 2017 1:54 AM

xCAT seems to be very good at managing root's known_hosts file, and distributing root's SSH key to the nodes for passwordless logon. I believe this is done by updatenode. Can xCAT do the same for ordinary users as well? I have not found a way to do it yet. Thanks!

--
Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu
Maher Hall, 192 | 5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859
 




Re: [xcat-user] xCAT creates "impossible" node

2017-11-15 Thread Yuan Y Bai
Hi Kevin,
 
You can report issue here : https://github.com/xcat2/xcat-core/issues
 
 
 
 
- Original message -
From: Kevin Keane <kke...@sandiego.edu>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] xCAT creates "impossible" node
Date: Thu, Nov 16, 2017 2:15 AM

The second mystery is (mostly) solved as well. I believe this is a bug in xCAT - how do I report it?

updatenode will try to update, or create if it doesn't exist, a management node based on whatever is in site.master. updatenode is smart enough to strip a hostname suffix (mn-mgt becomes mn) but not smart enough to strip the domain from an FQDN.
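
The domain-stripping step that updatenode is reported to miss can be illustrated in a few lines of shell. This is a sketch of the idea only, not xCAT's code; `short_hostname` is a hypothetical helper.

```shell
# Print the short host name: everything before the first dot.
# A name without a dot is printed unchanged.
short_hostname() {
    printf '%s\n' "${1%%.*}"
}
```

For the management node in this thread, `short_hostname mn.dev.sabre2.sandiego.edu` yields the node name that was defined by hand, which is the behavior one would expect when site.master holds an FQDN.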
 
In any case, thanks for your patient help on this! 
 
On Tue, Nov 14, 2017 at 6:21 PM, Yuan Y Bai <by...@cn.ibm.com> wrote:

Hi Kevin,
 
1. Could you check the file "/proc/sys/kernel/hostname"? It affects the "hostname" result. I do not know whether you hit the same problem.
 
[root@bybc0602 ~]# echo bybc0602.cluster.com > /proc/sys/kernel/hostname
[root@bybc0602 ~]# hostname
bybc0602.cluster.com
[root@bybc0602 ~]# hostname -f
bybc0602
 
2. I have a question here: why are there 2 management nodes, mn and mn.dev.sabre2.sandiego.edu? You may want to use "makehosts" to generate the /etc/hosts file using node mn. But there is another node named mn.dev.sabre2.sandiego.edu.
 
 
 
 
- Original message -
From: Kevin Keane <kke...@sandiego.edu>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: [xcat-user] xCAT creates "impossible" node
Date: Wed, Nov 15, 2017 7:41 AM
I have my xCAT system mostly up and running, and am trying to resolve some of the last details. This is xCAT 2.13.8 on RedHat 7.4. 
Among those problems is that xCAT (specifically, updatenode) creates an "impossible" node. I created one node called "mn" to cover the management node (see below for the details), and one compute node cn-001. updatenode is adding a third node named with the FQDN of the management node. I am calling this an "impossible" node because node names aren't supposed to contain periods. 
Specifically, I see this "impossible" node when I use:
 
 
updatenode '/cn-.*' 
Why would updatenode even try to update the management node? 
I have a second, and potentially related, problem: 
After a fresh install, hostname and hostname -f both return the FQDN: mn.dev.sabre2.sandiego.edu
At some point after installing xCAT (I have not nailed down when), this changes. hostname will continue to return the FQDN, but hostname -f will return "mn". Of course, I would expect the reverse.
 
Thanks!
 
Here is the output of lsdef -l
[root@mn ~]# lsdef -l
Object name: cn-001
    arch=x86_64
    cons=ipmi
    currstate=netboot rhels7.4-x86_64-hpccn
    groups=ipmi,all,compute
    installnic=mac
    ip=192.168.101.3
    mac=52:54:00:f2:e6:39
    mgt=ipmi
    netboot=xnba
    nichostnamesuffixes.eth0=-comp
    nicips.eth0=192.168.100.3
    nicips.eth1=192.168.101.3
    os=rhels7.4
    postbootscripts=otherpkgs,confignics
    postscripts=syslog,remoteshell,syncfiles,setupntp
    primarynic=mac
    profile="">
    provmethod=rhels7.4-x86_64-netboot-hpccn
    updatestatus=failed
    updatestatustime=11-14-2017 14:37:59
Object name: mn
    groups=__mgmtnode
    hostnames=mn.dev.sabre2.sandiego.edu
    ip=192.168.20.2
    nichostnamesuffixes.eth0=-comp
    nichostnamesuffixes.eth2=-mgt
    nicips.eth1=192.168.20.2
    nicips.eth0=192.168.100.2
    nicips.eth2=192.168.101.2
    postbootscripts=otherpkgs
    postscripts=syslog,remoteshell,syncfiles
Object name: mn.dev.sabre2.sandiego.edu
    postbootscripts=otherpkgs
    postscripts=syslog,remoteshell,syncfiles
    updatestatus=synced
    updatestatustime=11-14-2017 14:37:59

--
Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu
Maher Hall, 192 | 5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859

Re: [xcat-user] xCAT creates "impossible" node

2017-11-14 Thread Yuan Y Bai
Hi Kevin,
 
1. Could you check the file "/proc/sys/kernel/hostname"? It affects the "hostname" result. I do not know whether you hit the same problem.
 
[root@bybc0602 ~]# echo bybc0602.cluster.com > /proc/sys/kernel/hostname
[root@bybc0602 ~]# hostname
bybc0602.cluster.com
[root@bybc0602 ~]# hostname -f
bybc0602
 
2. I have a question here: why are there 2 management nodes, mn and mn.dev.sabre2.sandiego.edu? You may want to use "makehosts" to generate the /etc/hosts file using node mn. But there is another node named mn.dev.sabre2.sandiego.edu.
 
 
 
 
- Original message -
From: Kevin Keane
To: xCAT Users Mailing list
Cc:
Subject: [xcat-user] xCAT creates "impossible" node
Date: Wed, Nov 15, 2017 7:41 AM
I have my xCAT system mostly up and running, and am trying to resolve some of the last details. This is xCAT 2.13.8 on RedHat 7.4. 
Among those problems is that xCAT (specifically, updatenode) creates an "impossible" node. I created one node called "mn" to cover the management node (see below for the details), and one compute node cn-001. updatenode is adding a third node named with the FQDN of the management node. I am calling this an "impossible" node because node names aren't supposed to contain periods. 
Specifically, I see this "impossible" node when I use:
 
 
updatenode '/cn-.*' 
Why would updatenode even try to update the management node? 
I have a second, and potentially related, problem: 
After a fresh install, hostname and hostname -f both return the FQDN: mn.dev.sabre2.sandiego.edu
At some point after installing xCAT (I have not nailed down when), this changes. hostname will continue to return the FQDN, but hostname -f will return "mn". Of course, I would expect the reverse.
 
Thanks!
 
Here is the output of lsdef -l
[root@mn ~]# lsdef -l
Object name: cn-001
    arch=x86_64
    cons=ipmi
    currstate=netboot rhels7.4-x86_64-hpccn
    groups=ipmi,all,compute
    installnic=mac
    ip=192.168.101.3
    mac=52:54:00:f2:e6:39
    mgt=ipmi
    netboot=xnba
    nichostnamesuffixes.eth0=-comp
    nicips.eth0=192.168.100.3
    nicips.eth1=192.168.101.3
    os=rhels7.4
    postbootscripts=otherpkgs,confignics
    postscripts=syslog,remoteshell,syncfiles,setupntp
    primarynic=mac
    profile="">
    provmethod=rhels7.4-x86_64-netboot-hpccn
    updatestatus=failed
    updatestatustime=11-14-2017 14:37:59
Object name: mn
    groups=__mgmtnode
    hostnames=mn.dev.sabre2.sandiego.edu
    ip=192.168.20.2
    nichostnamesuffixes.eth0=-comp
    nichostnamesuffixes.eth2=-mgt
    nicips.eth1=192.168.20.2
    nicips.eth0=192.168.100.2
    nicips.eth2=192.168.101.2
    postbootscripts=otherpkgs
    postscripts=syslog,remoteshell,syncfiles
Object name: mn.dev.sabre2.sandiego.edu
    postbootscripts=otherpkgs
    postscripts=syslog,remoteshell,syncfiles
    updatestatus=synced
    updatestatustime=11-14-2017 14:37:59

--
Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu
Maher Hall, 192 | 5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859
 




Re: [xcat-user] makedns-generated zones and their NS record

2017-11-06 Thread Yuan Y Bai
Thanks Kevin, I got it. Have fun with xCAT.
 
 
 
 
- Original message -
From: Kevin Keane <kke...@sandiego.edu>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] makedns-generated zones and their NS record
Date: Tue, Nov 7, 2017 3:45 AM
Thank you, Yuan. Yes, I know about the name mismatch on eth1. I just didn't get around to fixing that. And your configuration would work, but it has public and private reversed. As I said - I'll simply have the DNS server listen on my eth1 interface as well; that should take care of my issues, although it feels like a hack. Thanks for all your help! 
 
On Mon, Nov 6, 2017 at 2:15 AM, Yuan Y Bai <by...@cn.ibm.com> wrote:

 
Hi Kevin,
I noticed that the domain of your hpcmn-test eth1 hostname is "kkeane.sandiego.edu", while the domain attribute of "hpcpublic" in the networks table is "sabre.kkeane.sandiego.edu". Is that a sub-domain or a misspelling?
 
Although it works somewhat, I am still curious about "But makedns doesn’t use the name that corresponds to eth2, but rather the hostname (from hostname -f) in the NS records".
 
The following configuration works in my environment. Could you tell me how yours differs, so that I can compare them? Thanks.
 
In the host:
private eth1: hpcmn-test.kkeane.sandiego.edu
public eth2: hpcmn-test.imm.sabre.kkeane.sandiego.edu
"hostname -f" returns hpcmn-test.kkeane.sandiego.edu
 
1, Make sure /etc/resolv.conf is persistent, and that the private domain "imm.sabre.kkeane.sandiego.edu" comes before the public domain "sabre.kkeane.sandiego.edu".
In case /etc/resolv.conf is updated/synced automatically, add "PEERDNS=NO" to the ifcfg-eth* file.
 
2, In xCAT, "DNS server should only be listening on eth2", so eth2 is the primary interface. "define its primary interface (the one whose domain matches site.domain value) in hosts.node/hosts.ip (this should also be what was defined in nodelist above)". Assuming all nodes belong to the node group testgroup, the site table has the following configuration:
domain=imm.sabre.kkeane.sandiego.edu
nameservers=192.168.101.2
master=192.168.101.2
dnsinterfaces=testgroup|eth2 # if needed
forwarders=
 
3, In my previous example, we can use "makehosts -n" to get the following /etc/hosts:
192.168.101.2 hpcmn-test hpcmn-test.imm.sabre.kkeane.sandiego.edu
192.168.20.2 hpcmn-test-eth1 hpcmn-test.kkeane.sandiego.edu
 
4, Run "makedns -n". The xCAT DNS server should be 192.168.101.2, and all of the following nslookup commands should return the corresponding IP.
nslookup hpcmn-test
nslookup hpcmn-test.imm.sabre.kkeane.sandiego.edu
nslookup hpcmn-test.kkeane.sandiego.edu
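
Step 1 above asks for the private domain to come before the public one in /etc/resolv.conf. Assuming that ordering refers to the `search` line, it can be checked with a small helper. `search_domain_before` is a hypothetical name, and the sketch only looks at the first occurrence of each domain.

```shell
# Succeed if domain $2 appears before domain $3 on a "search" line of the
# resolv.conf-style file $1; fail otherwise.
search_domain_before() {
    awk -v first="$2" -v second="$3" '
        $1 == "search" {
            for (i = 2; i <= NF; i++) {
                if ($i == first && !a) a = i
                if ($i == second && !b) b = i
            }
        }
        END { exit !(a && b && a < b) }
    ' "$1"
}
```

For the domains in this thread the call would be `search_domain_before /etc/resolv.conf imm.sabre.kkeane.sandiego.edu sabre.kkeane.sandiego.edu`, and a non-zero exit status would suggest the order needs fixing before running makedns.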
 
 
 
 
- Original message -
From: "Kevin Keane (USD)" <kke...@sandiego.edu>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc: "xcat-user@lists.sourceforge.net" <xcat-user@lists.sourceforge.net>
Subject: Re: [xcat-user] makedns-generated zones and their NS record
Date: Mon, Nov 6, 2017 2:53 PM
Yes, I used the dnsinterfaces attribute in the site table. But makedns doesn’t use the name that corresponds to eth2, but rather the hostname (from hostname -f) in the NS records – which corresponds to eth1 in my case.
 
Maybe I’ll simply have the DNS server listen on eth1 as well. I was hoping to avoid that, but it may be my easiest solution here.
 
From: Yuan Y Bai
Sent: Sunday, November 5, 2017 10:35 PM
To: xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.sourceforge.net
Subject: Re: [xcat-user] makedns-generated zones and their NS record
 
 
Hi Kevin,
 
In your playground, "the DNS server should only be listening on eth2".

So the "dnsinterfaces" attribute in the site table can control which network interfaces DNS should listen on.

dnsinterfaces:  The network interfaces DNS should listen on.  If it is the same for all nodes, use a simple comma-separated list of NICs.  To specify different NICs for different nodes, use the format: "xcatmn|eth1,eth2;service|bond0", where xcatmn is the name of the management node, and DNS should listen on the eth1 and eth2 interfaces.  All the nodes in group 'service' should

Re: [xcat-user] makedns-generated zones and their NS record

2017-11-06 Thread Yuan Y Bai
 
Hi Kevin,
I noticed that the domain of your hpcmn-test eth1 hostname is "kkeane.sandiego.edu", while the domain attribute of "hpcpublic" in the networks table is "sabre.kkeane.sandiego.edu". Is that an intentional sub-domain or a misspelling?

Although it works somewhat, I am still curious about "But makedns doesn’t use the name that corresponds to eth2, but rather the hostname (from hostname -f) in the NS records".

The following configuration works in my environment. Could you tell me how yours differs, so that I can compare them? Thanks.

On the host:
private eth1: hpcmn-test.kkeane.sandiego.edu
public eth2: hpcmn-test.imm.sabre.kkeane.sandiego.edu
"hostname -f" returns hpcmn-test.kkeane.sandiego.edu
 
1, Make sure /etc/resolv.conf is persistent, and that the private domain "imm.sabre.kkeane.sandiego.edu" is listed before the public domain "sabre.kkeane.sandiego.edu".
If /etc/resolv.conf is updated or synced automatically, add "PEERDNS=NO" to the ifcfg-eth* file.

2, In xCAT, "DNS server should only be listening on eth2", so eth2 is the primary interface. Following "define its primary interface (the one whose domain matches site.domain value) in hosts.node/hosts.ip (this should also be what was defined in nodelist above)", and assuming all nodes belong to the node group testgroup, the site table has the following settings:
domain=imm.sabre.kkeane.sandiego.edu
nameservers=192.168.101.2
master=192.168.101.2
dnsinterfaces=testgroup|eth2 # if needed
forwarders=

3, As in my previous example, run "makehosts -n" to get the following /etc/hosts:
192.168.101.2 hpcmn-test hpcmn-test.imm.sabre.kkeane.sandiego.edu
192.168.20.2 hpcmn-test-eth1 hpcmn-test.kkeane.sandiego.edu

4, Run "makedns -n". The xCAT DNS server should be 192.168.101.2, and each of the following nslookup commands should return the corresponding IP.
nslookup hpcmn-test
nslookup hpcmn-test.imm.sabre.kkeane.sandiego.edu
nslookup hpcmn-test.kkeane.sandiego.edu
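
As a rough illustration of the check in steps 3 and 4, the sketch below parses the /etc/hosts text from step 3 and reports any name that does not map to the expected IP. It is only an offline sanity check written for this thread: parse_hosts, mismatches, and the EXPECTED table are hypothetical names, not xCAT tools, and the script does not query the DNS server the way nslookup does.

```python
# Hypothetical helper: offline check of /etc/hosts content, not part of xCAT.
HOSTS_TEXT = """\
192.168.101.2 hpcmn-test hpcmn-test.imm.sabre.kkeane.sandiego.edu
192.168.20.2 hpcmn-test-eth1 hpcmn-test.kkeane.sandiego.edu
"""

# Names from step 4 and the IP each one is expected to map to (assumed
# from the /etc/hosts lines in step 3).
EXPECTED = {
    "hpcmn-test": "192.168.101.2",
    "hpcmn-test.imm.sabre.kkeane.sandiego.edu": "192.168.101.2",
    "hpcmn-test.kkeane.sandiego.edu": "192.168.20.2",
}


def parse_hosts(text):
    """Map every name in hosts-file text to the IP on its line."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # first entry wins on duplicates
    return mapping


def mismatches(mapping, expected):
    """Return {name: actual_ip_or_None} for names that do not match."""
    return {
        name: mapping.get(name)
        for name, ip in expected.items()
        if mapping.get(name) != ip
    }


if __name__ == "__main__":
    print(mismatches(parse_hosts(HOSTS_TEXT), EXPECTED))
```

An empty result means every listed name maps to the intended interface; the nslookup commands above remain the real test of what the DNS server answers.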
 
 
 
 
- Original message -
From: "Kevin Keane (USD)" <kke...@sandiego.edu>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc: "xcat-user@lists.sourceforge.net" <xcat-user@lists.sourceforge.net>
Subject: Re: [xcat-user] makedns-generated zones and their NS record
Date: Mon, Nov 6, 2017 2:53 PM
Yes, I used the dnsinterfaces attribute in the site table. But makedns doesn’t use the name that corresponds to eth2, but rather the hostname (from hostname -f) in the NS records – which corresponds to eth1 in my case.
 
Maybe I’ll simply have the DNS server listen on eth1 as well. I was hoping to avoid that, but it may be my easiest solution here.
 
From: Yuan Y Bai
Sent: Sunday, November 5, 2017 10:35 PM
To: xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.sourceforge.net
Subject: Re: [xcat-user] makedns-generated zones and their NS record
 
Hi Kevin,
 
In your playground, "the DNS server should only be listening on eth2".

So the "dnsinterfaces" attribute in the site table can control which network interfaces DNS should listen on.

dnsinterfaces:  The network interfaces DNS should listen on.  If it is the same for all nodes, use a simple comma-separated list of NICs.  To specify different NICs for different nodes, use the format: "xcatmn|eth1,eth2;service|bond0", where xcatmn is the name of the management node, and DNS should listen on the eth1 and eth2 interfaces.  All the nodes in group 'service' should listen on the 'bond0' interface.
 NOTE: If using this attribute to block certain interfaces, make sure the IP that maps to the hostname of your xCAT MN is not blocked, since xCAT needs to use this IP to communicate with the local DNS server on the MN.
 
 
 
 
- Original message -
From: "Yuan Y Bai" <by...@cn.ibm.com>
To: xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.sourceforge.net
Subject: Re: [xcat-user] makedns-generated zones and their NS record
Date: Mon, Nov 6, 2017 10:47 AM
Hi Kevin,
 
Thanks for your summary.
 
After "xcatconfig -m", the xCAT MN node hpcmn-test exists. You then need to run "chdef hpcmn-test ip=...; makehosts -n"

Re: [xcat-user] makedns-generated zones and their NS record

2017-11-05 Thread Yuan Y Bai
Hi Kevin,
 
In your playground, "the DNS server should only be listening on eth2".

So the "dnsinterfaces" attribute in the site table can control which network interfaces DNS should listen on.

dnsinterfaces:  The network interfaces DNS should listen on.  If it is the same for all nodes, use a simple comma-separated list of NICs.  To specify different NICs for different nodes, use the format: "xcatmn|eth1,eth2;service|bond0", where xcatmn is the name of the management node, and DNS should listen on the eth1 and eth2 interfaces.  All the nodes in group 'service' should listen on the 'bond0' interface.
 NOTE: If using this attribute to block certain interfaces, make sure the IP that maps to the hostname of your xCAT MN is not blocked, since xCAT needs to use this IP to communicate with the local DNS server on the MN.
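
To make the two dnsinterfaces formats described above concrete, here is a minimal sketch that splits a value into per-node (or per-group) NIC lists. The function name parse_dnsinterfaces and the "*" key used for the plain comma-separated form are assumptions made for this illustration; this is not how xCAT itself parses the attribute.

```python
def parse_dnsinterfaces(value):
    """Split a site.dnsinterfaces string into {node_or_group: [nics]}.

    "eth1,eth2" applies to all nodes and is stored under the
    illustrative key "*". "xcatmn|eth1,eth2;service|bond0" gives one
    NIC list per node or group name.
    """
    result = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue
        if "|" in part:
            target, nic_list = part.split("|", 1)
            target = target.strip()
        else:
            target, nic_list = "*", part
        nics = [nic.strip() for nic in nic_list.split(",") if nic.strip()]
        result.setdefault(target, []).extend(nics)
    return result


if __name__ == "__main__":
    print(parse_dnsinterfaces("xcatmn|eth1,eth2;service|bond0"))
    print(parse_dnsinterfaces("testgroup|eth2"))
```

Writing the value out this way makes it easier to check that the interface carrying the MN hostname's IP is not left out by accident, which is the situation the NOTE warns about.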
 
 
 
 
- Original message -
From: "Yuan Y Bai" <by...@cn.ibm.com>
To: xcat-user@lists.sourceforge.net
Cc: xcat-user@lists.sourceforge.net
Subject: Re: [xcat-user] makedns-generated zones and their NS record
Date: Mon, Nov 6, 2017 10:47 AM
Hi Kevin,
 
Thanks for your summary.
 
After "xcatconfig -m", the xCAT MN node hpcmn-test exists. You then need to run "chdef hpcmn-test ip=...; makehosts -n" so that its name and IP are added to /etc/hosts.
 
 
 
 
- Original message -
From: Kevin Keane <kke...@sandiego.edu>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] makedns-generated zones and their NS record
Date: Sat, Nov 4, 2017 12:23 AM

Thanks for that command. I deleted my management node, and used xcatconfig -m to recreate it. That gave me this:
Object name: hpcmn-test
    groups=__mgmtnode
    postbootscripts=otherpkgs
    postscripts=syslog,remoteshell,syncfiles
    setuptftp=yes

makenode -n did not add hpcmn-test to /etc/hosts at all

Good question about the background. We have a (working) cluster running RedHat 6.8. I'm not touching that one for now, but sometimes use it for reference, and eventually want to both rebuild it with RH 7.4, and also fully automate the setup to make it reproducible (in part as disaster recovery). For that future setup, I haven't fully made up my mind yet about the architecture. I might separate the management node from the user login node, for instance, or I might switch to statelite or stateful rather than stateless nodes.

Currently, I'm building my own "playground" to be sure I fully understand what I'm doing, and to try out these various options.

My goal is to have a 100% automated setup (using Ansible). That's why I want to avoid any manual configuration, and also why I try to accomplish as much as I can with chdef and chtab (I wrote an Ansible module for those), and - whenever I can - stay away from commands such as nodedef (which are harder to manage using Ansible), or manually configuring /etc/hosts.

Right now, I'm using virtual machines for the playground - just one management node and one compute node. The MN has a public network to the outside world at eth1 (192.168.20.2), a (simulated) high-speed interconnect network on eth0 (192.168.100.2) and a (simulated) lower-speed management network on eth2 (192.168.101.2). From what I understand, a fairly standard setup, except my virtual machines don't have IPMI/BMC.
The hostname of the management node is hpcmn-test.kkeane.sandiego.edu. This will also be associated with the public IP on eth1, and would also be how the system is being reached from the outside world.
 
The DNS server should only be listening on eth2.
 
This setup basically works to my satisfaction. Except for one thing: makedns fails. I assume makedns uses nsupdate under the hood. nsupdate uses the NS record in the zone to find the authoritative name server. The NS record would point to eth1, but my DNS server only listens on eth2.
 
 
On Thu, Nov 2, 2017 at 11:05 PM, Yuan Y Bai <by...@cn.ibm.com> wrote:

Hi Kevin,
 
To add the management node to the DB, use the command: xcatconfig -m

I saw the questions in your mail and tried to suggest a way to make it work, but I realize I do not clearly understand your overall requirements for hosts/DNS. Could you sum

Re: [xcat-user] makedns-generated zones and their NS record

2017-11-05 Thread Yuan Y Bai
Hi Kevin,
 
Thanks for your summary.
 
After "xcatconfig -m", the xCAT MN node hpcmn-test exists. You then need to run "chdef hpcmn-test ip=...; makehosts -n" so that its name and IP are added to /etc/hosts.
 
 
 
 
- Original message -
From: Kevin Keane <kke...@sandiego.edu>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] makedns-generated zones and their NS record
Date: Sat, Nov 4, 2017 12:23 AM

Thanks for that command. I deleted my management node, and used xcatconfig -m to recreate it. That gave me this:
Object name: hpcmn-test
    groups=__mgmtnode
    postbootscripts=otherpkgs
    postscripts=syslog,remoteshell,syncfiles
    setuptftp=yes

makenode -n did not add hpcmn-test to /etc/hosts at all

Good question about the background. We have a (working) cluster running RedHat 6.8. I'm not touching that one for now, but sometimes use it for reference, and eventually want to both rebuild it with RH 7.4, and also fully automate the setup to make it reproducible (in part as disaster recovery). For that future setup, I haven't fully made up my mind yet about the architecture. I might separate the management node from the user login node, for instance, or I might switch to statelite or stateful rather than stateless nodes.

Currently, I'm building my own "playground" to be sure I fully understand what I'm doing, and to try out these various options.

My goal is to have a 100% automated setup (using Ansible). That's why I want to avoid any manual configuration, and also why I try to accomplish as much as I can with chdef and chtab (I wrote an Ansible module for those), and - whenever I can - stay away from commands such as nodedef (which are harder to manage using Ansible), or manually configuring /etc/hosts.

Right now, I'm using virtual machines for the playground - just one management node and one compute node. The MN has a public network to the outside world at eth1 (192.168.20.2), a (simulated) high-speed interconnect network on eth0 (192.168.100.2) and a (simulated) lower-speed management network on eth2 (192.168.101.2). From what I understand, a fairly standard setup, except my virtual machines don't have IPMI/BMC.
The hostname of the management node is hpcmn-test.kkeane.sandiego.edu. This will also be associated with the public IP on eth1, and would also be how the system is being reached from the outside world.
 
The DNS server should only be listening on eth2.
 
This setup basically works to my satisfaction. Except for one thing: makedns fails. I assume makedns uses nsupdate under the hood. nsupdate uses the NS record in the zone to find the authoritative name server. The NS record would point to eth1, but my DNS server only listens on eth2.
 
 
On Thu, Nov 2, 2017 at 11:05 PM, Yuan Y Bai <by...@cn.ibm.com> wrote:

Hi Kevin,
 
To add the management node to the DB, use the command: xcatconfig -m

I saw the questions in your mail and tried to suggest a way to make it work, but I realize I do not clearly understand your overall requirements for hosts/DNS. Could you summarize your original requirements?

Let us see if we can make some enhancements here.
 
 
 
 
- Original message -
From: Kevin Keane <kke...@sandiego.edu>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] makedns-generated zones and their NS record
Date: Fri, Nov 3, 2017 12:00 AM

Thank you, Yuan and Christian!

Yuan, I haven't tried it, but it seems that your suggestion would not actually solve my problem. Makehosts would generate, as you were saying, this line:
192.168.20.2 hpcmn-test-eth1 hpcmn-test.kkeane.sandiego.edu

But the DNS server should only listen on 192.168.101.2. That is why my preferred solution would be to change the zones to have an NS record that actually points to the correct NIC: hpcmn-test.imm.sabre.kkeane.sandiego.edu

Christian - you are right. Putting the names into the "hostnames" field was a bit of a hack. It actually works if there is only one name in that field, but if there are multiple names in that field, makehosts seems to only use the last one. And it also seems to *replace* rather than add to the names that xCAT would ordinarily use.
 
On Wed, Nov 1, 2017 at 7:04 PM, Yuan Y Bai <by...@cn.ibm.com> wrote: 

Re: [xcat-user] xcat integration to an existing cluster

2017-11-05 Thread Yuan Y Bai
Hi imam,
 
You are right, xCAT is more adaptable than other provisioning systems. I think you can start with the following basic steps:

  1, Install one system as the xCAT management node: http://xcat-docs.readthedocs.io/en/latest/guides/install-guides/index.html
  2, Define the nodes, adding the existing machines' IP addresses, MAC addresses, arch, serialport, serialspeed, cons, and hardware control attributes such as mgt, bmc, bmcusername, bmcpassword, etc.
  3, Add these node names and IPs to /etc/hosts using "makehosts -n"
  4, Configure DNS, DHCP, and the console, and set up the SSH keys so that you can use parallel commands or updatenode
  5, Make sure hardware management works well
  6, Since you do not want to re-provision the OS, remember not to run "rinstall " or "nodeset  osimage=..." or "nodeset  osimage" etc.

Here is the Manage Clusters doc:
http://xcat-docs.readthedocs.io/en/latest/guides/admin-guides/manage_clusters/index.html

Based on this doc, you can try these sections and have fun with them:
Configure xCAT
Hardware Discovery & Define Node  # if you already have the MAC addresses etc., you can define nodes manually
Hardware Management
Using Updatenode
Parallel Commands
 
 
 
- Original message -
From: Imam Toufique
To: xCAT Users Mailing list
Cc:
Subject: [xcat-user] xcat integration to an existing cluster
Date: Sat, Nov 4, 2017 12:53 AM
Hi, 
 
I am highly considering using Xcat in our current cluster, it seems to be better adaptable than other provisioning systems I have recently tried. 
 
Having said that, if I did that, what do you recommend for integrating Xcat into a running cluster?  We have over 300 nodes in the cluster, and I don't want Xcat to flag the current running systems to provision.  I was thinking about starting Xcat with a new subnet, but I am not sure if that will be enough.  Perhaps there is a way to just add the existing machine names, IP #'s, MAC addresses, BMC address/credentials in Xcat and set the state to 'boot' (for chain) and hopefully, Xcat will not flag them to 'image' during their next reboot or startup.  
 
Looking for suggestions here, Thanks again for the help!
 


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot___
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user


Re: [xcat-user] makedns-generated zones and their NS record

2017-11-03 Thread Yuan Y Bai
Hi Kevin,
 
To add the management node to the DB, use the command: xcatconfig -m

I saw the questions in your mail and tried to suggest a way to make it work, but I realize I do not clearly understand your overall requirements for hosts/DNS. Could you summarize your original requirements?

Let us see if we can make some enhancements here.
 
 
 
 
- Original message -
From: Kevin Keane <kke...@sandiego.edu>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] makedns-generated zones and their NS record
Date: Fri, Nov 3, 2017 12:00 AM

Thank you, Yuan and Christian!

Yuan, I haven't tried it, but it seems that your suggestion would not actually solve my problem. Makehosts would generate, as you were saying, this line:
192.168.20.2 hpcmn-test-eth1 hpcmn-test.kkeane.sandiego.edu

But the DNS server should only listen on 192.168.101.2. That is why my preferred solution would be to change the zones to have an NS record that actually points to the correct NIC: hpcmn-test.imm.sabre.kkeane.sandiego.edu

Christian - you are right. Putting the names into the "hostnames" field was a bit of a hack. It actually works if there is only one name in that field, but if there are multiple names in that field, makehosts seems to only use the last one. And it also seems to *replace* rather than add to the names that xCAT would ordinarily use.
 
On Wed, Nov 1, 2017 at 7:04 PM, Yuan Y Bai <by...@cn.ibm.com> wrote:

Hi Kevin,
 
I am glad that it does work.
 
Regarding your question about makehosts:
In your example, the short names are all 'hpcmn-test'. Since 'makehosts' generates both a short name and a long name for each NIC IP, you need to configure the nics table for eth1 and eth0 so that makehosts can generate a different short name for each NIC. You can also add a specific long name in nics.nicaliases. Take eth1 and eth2 as an example:
public nic: eth1: hpcmn-test.kkeane.sandiego.edu
management nic: eth2: hpcmn-test.imm.sabre.kkeane.sandiego.edu

1, Configure the hosts table for the management IP:
"hpcmn-test","192.168.101.2"

2, Configure the nics table for the secondary NICs:
"hpcmn-test","eth1!192.168.20.2",,"eth1!hpcmn-test.kkeane.sandiego.edu",

3, Use the same networks table as yours.

4, Use lsdef to see the hpcmn-test node definition:
]# lsdef hpcmn-test
Object name: hpcmn-test
    groups=all
    ip=192.168.101.2
    nicaliases.eth1=hpcmn-test.kkeane.sandiego.edu
    nicips.eth1=192.168.20.2
    postbootscripts=otherpkgs
    postscripts=syslog,remoteshell,syncfiles

5, Run 'makehosts hpcmn-test' and check the /etc/hosts file:
192.168.101.2 hpcmn-test hpcmn-test.imm.sabre.kkeane.sandiego.edu
192.168.20.2 hpcmn-test-eth1 hpcmn-test.kkeane.sandiego.edu

Here makehosts creates the short name hpcmn-test-eth1, which differs from the short name hpcmn-test, and the long name hpcmn-test.kkeane.sandiego.edu is the one you wanted. If you define otherinterfaces in the hosts table, its entries need a short name different from the node name, which is why you got errors.

I think Christian also gave you another tip for making this work. Thanks, Christian.
 
 
 
- Original message -
From: Kevin Keane <kke...@sandiego.edu>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] makedns-generated zones and their NS record
Date: Thu, Nov 2, 2017 12:00 AM

Thank you, Yuan and Christian!

I have actually pretty much done what both of you had suggested, and it does work - somewhat. I have the following networks:
[root@hpcmn-test ~]# tabdump networks
#netname,net,mask,mgtifname,gateway,dhcpserver,tftpserver,nameservers,ntpservers,logservers,dynamicrange,staticrange,staticrangeincrement,nodehostname,ddnsdomain,vlanid,domain,mtu,comments,disable
"hpcpublic","192.168.20.0","255.255.255.0","eth1","192.168.20.1""sabre.kkeane.sandiego.edu","1500","HPC Public Network",
"hpccompute","192.168.100.0","255.255.255.0","eth0",,"192.168.100.2","""192.168.100.200-192.168.100.229",,,"/z/-compute/",,,"compute.sabre.kkeane.sandiego.edu","1500","HPC Compute Network",
"hpcmanagement","192

Re: [xcat-user] makedns-generated zones and their NS record

2017-11-01 Thread Yuan Y Bai
Hi Kevin,
 
I am glad that it does work.
 
Regarding your question about makehosts:
In your example, the short names are all 'hpcmn-test'. Since 'makehosts' generates both a short name and a long name for each NIC IP, you need to configure the nics table for eth1 and eth0 so that makehosts can generate a different short name for each NIC. You can also add a specific long name in nics.nicaliases. Take eth1 and eth2 as an example:
public nic: eth1: hpcmn-test.kkeane.sandiego.edu
management nic: eth2: hpcmn-test.imm.sabre.kkeane.sandiego.edu

1, Configure the hosts table for the management IP:
"hpcmn-test","192.168.101.2"

2, Configure the nics table for the secondary NICs:
"hpcmn-test","eth1!192.168.20.2",,"eth1!hpcmn-test.kkeane.sandiego.edu",

3, Use the same networks table as yours.

4, Use lsdef to see the hpcmn-test node definition:
]# lsdef hpcmn-test
Object name: hpcmn-test
    groups=all
    ip=192.168.101.2
    nicaliases.eth1=hpcmn-test.kkeane.sandiego.edu
    nicips.eth1=192.168.20.2
    postbootscripts=otherpkgs
    postscripts=syslog,remoteshell,syncfiles

5, Run 'makehosts hpcmn-test' and check the /etc/hosts file:
192.168.101.2 hpcmn-test hpcmn-test.imm.sabre.kkeane.sandiego.edu
192.168.20.2 hpcmn-test-eth1 hpcmn-test.kkeane.sandiego.edu

Here makehosts creates the short name hpcmn-test-eth1, which differs from the short name hpcmn-test, and the long name hpcmn-test.kkeane.sandiego.edu is the one you wanted. If you define otherinterfaces in the hosts table, its entries need a short name different from the node name, which is why you got errors.
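
The naming pattern described above can be sketched as follows. This is only a model of the output shown in step 5, not the makehosts implementation: the function names are hypothetical, the "nic!value" entries are assumed to be comma-separated when there are several, and the domain argument is assumed to be the domain of the management network.

```python
def parse_nic_attr(value):
    """Parse 'eth1!192.168.20.2,eth0!...' style text into {nic: value}."""
    result = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        nic, _, attr = item.partition("!")
        result[nic] = attr
    return result


def build_hosts_lines(node, ip, domain, nicips, nicaliases):
    """Model the /etc/hosts lines from step 5 for one node.

    The primary line uses the node name as short name; each secondary
    NIC gets the short name '<node>-<nic>' plus its alias, if any.
    """
    lines = [f"{ip} {node} {node}.{domain}"]
    aliases = parse_nic_attr(nicaliases)
    for nic, nic_ip in parse_nic_attr(nicips).items():
        names = [f"{node}-{nic}"]
        if aliases.get(nic):
            names.append(aliases[nic])
        lines.append(" ".join([nic_ip, *names]))
    return lines


if __name__ == "__main__":
    for line in build_hosts_lines(
        "hpcmn-test",
        "192.168.101.2",
        "imm.sabre.kkeane.sandiego.edu",
        "eth1!192.168.20.2",
        "eth1!hpcmn-test.kkeane.sandiego.edu",
    ):
        print(line)
```

With the values from steps 1 and 2, the sketch yields the same two lines as the /etc/hosts excerpt in step 5, which shows why the secondary NIC never collides with the node's own short name.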
 
I think Christian also gave you another tip for making this work. Thanks, Christian.
 
 
 
- Original message -
From: Kevin Keane
To: xCAT Users Mailing list
Cc:
Subject: Re: [xcat-user] makedns-generated zones and their NS record
Date: Thu, Nov 2, 2017 12:00 AM

Thank you, Yuan and Christian!

I have actually pretty much done what both of you had suggested, and it does work - somewhat. I have the following networks:
[root@hpcmn-test ~]# tabdump networks
#netname,net,mask,mgtifname,gateway,dhcpserver,tftpserver,nameservers,ntpservers,logservers,dynamicrange,staticrange,staticrangeincrement,nodehostname,ddnsdomain,vlanid,domain,mtu,comments,disable
"hpcpublic","192.168.20.0","255.255.255.0","eth1","192.168.20.1""sabre.kkeane.sandiego.edu","1500","HPC Public Network",
"hpccompute","192.168.100.0","255.255.255.0","eth0",,"192.168.100.2","""192.168.100.200-192.168.100.229",,,"/z/-compute/",,,"compute.sabre.kkeane.sandiego.edu","1500","HPC Compute Network",
"hpcmanagement","192.168.101.0","255.255.255.0","eth2","","192.168.101.2","""192.168.101.200-192.168.101.229",,,"/z/-imm/",,,"imm.sabre.kkeane.sandiego.edu","1500","HPC Management Network",

I also have defined hpcmn-test as a node (thanks, Christian, for the tip about unmanaged!):
Object name: hpcmn-test
    groups=all
    hostnames=hpcmn-test.kkeane.sandiego.edu hpcmn-test.imm.sabre.kkeane.sandiego.edu
    ip=192.168.101.2
    postbootscripts=otherpkgs
    postscripts=syslog,remoteshell,syncfiles

When I manually edit /etc/hosts as Yuan suggested, everything does work:
127.0.0.1 localhost
192.168.101.2 hpcmn-test hpcmn-test.imm.sabre.kkeane.sandiego.edu hpcmn-test.kkeane.sandiego.edu.

But there are two problems with this:
- hpcmn-test.kkeane.sandiego.edu really should be associated with the public IP (192.168.20.2 in my example)
- makehosts does not honor this name, even though it is in the node's hostnames attribute. Here is what makehosts produces:
127.0.0.1 localhost
192.168.101.2 hpcmn-test hpcmn-test.imm.sabre.kkeane.sandiego.edu

 
 
On Wed, Nov 1, 2017 at 7:00 AM, Christian Caruthers  wrote:

I just had to deal with a similar issue. Try the following:
 
-  Fill out each network in the networks table, including domain name. Don’t worry about nameservers or gateway unless there are external resources that should be used. NOTE: defining an external nameserver in the networks table will cause makedns to ignore any IPs in that subnet.
-  Define hpcmn-test as an unmanaged node in the cluster (nodeadd hpcmn-test groups=__Unmanaged or something similar)
-  Define its primary interface (the one whose domain matches site.domain value) in hosts.node/hosts.ip (this should also be what was defined in nodelist above)
-  Define all other interfaces in hosts.otherinterfaces with fqdn. For example:
"hpcmn-test","1.2.3.4",,"hpctest.compute.sabre.kkeane.sandiego.edu:2.3.4.5,hpcmn-test.imm.sabre.kkeane.sandiego.edu:3.4.5.6"
The domain names listed for each IP in hosts should match the networks.domain entry for each respective 

Re: [xcat-user] makedns-generated zones and their NS record

2017-10-31 Thread Yuan Y Bai
Hi Kevin,
 
I see there are 3 NICs in your xCAT MN. Which one is on your xCAT management network?

Here is an example that may help with your problem.

In this example, there are 3 NICs in the xCAT MN:

10_0_0_0-255_0_0_0 is the xCAT management network;
eth0 is the xCAT management network NIC: 10.5.106.100 rhmn rhmn.cluster.com, domain is cluster.com
eth1: 30.5.106.9 bybc0609.private.cluster, domain is private.cluster
eth2: 40.5.106.9 bybc0609.imm, domain is imm

1, In /etc/hosts:
10.5.106.100 rhmn rhmn.cluster.com
30.5.106.9 bybc0609.private.cluster
40.5.106.9 bybc0609.imm

2, In /etc/resolv.conf:
search cluster.com.
nameserver 10.5.106.100
 
3, In the xCAT networks table ('tabdump networks'), you should specify a domain for every network.
"30_0_0_0-255_0_0_0","30.0.0.0","255.0.0.0","eth1","",,"",,"private.cluster","1500",,
"40_0_0_0-255_0_0_0","40.0.0.0","255.0.0.0","eth2","",,"",,"imm","1500",,
"10_0_0_0-255_0_0_0","10.0.0.0","255.0.0.0","eth0","10.5.106.2",,"",,"cluster.com","1500",,

Note: you can run "makenetworks" to generate all the network entries in the networks table, then use chdef or tabedit to modify the domain attribute.
For example: chdef -t network 10_0_0_0-255_0_0_0 domain=cluster.com

4, Run 'makedns -n'. If there is no error, use 'nslookup' to check the results:

]# nslookup byrhmn
Server:        10.5.106.100
Address:    10.5.106.100#53
Name:    byrhmn.cluster.com
Address: 10.5.106.100

]# nslookup bybc0609.private.cluster
Server:        10.5.106.100
Address:    10.5.106.100#53
Name:    bybc0609.private.cluster
Address: 30.5.106.9

]# nslookup bybc0609.imm
Server:        10.5.106.100
Address:    10.5.106.100#53
Name:    bybc0609.imm
Address: 40.5.106.9

5, Data for the 3 networks is generated here.
]# ls /var/named/
data            db.172.29.0  db.40.jnl           db.imm.jnl              named.empty
db.10           db.172.30.0  db.cluster.com      db.it.raboof.edu        named.localhost
db.10.0.0       db.172.40.0  db.cluster.com.jnl  db.private.cluster      named.loopback
db.10.jnl       db.30        db.cluster.local    db.private.cluster.jnl  slaves
db.148.143.201  db.30.jnl    db.dc.foo.edu       dynamic
db.172.20.0     db.40        db.imm              named.ca
 
 
 
- Original message -
From: Kevin Keane
To: xCAT Users Mailing list
Cc:
Subject: [xcat-user] makedns-generated zones and their NS record
Date: Wed, Nov 1, 2017 4:21 AM

I have a management node with three NICs, and want to use makedns to generate the DNS configuration. My management node has three names, corresponding to the three NICs:
eth0: hpcmn-test.compute.sabre.kkeane.sandiego.edu
eth1: hpcmn-test.kkeane.sandiego.edu
eth2: hpcmn-test.imm.sabre.kkeane.sandiego.edu

hostname -f returns hpcmn-test.kkeane.sandiego.edu (which is the name by which my management node will be known on our public network).

I have the DNS server listening only on eth2. Consequently, the zones in the DNS server should have the corresponding name server hpcmn-test.imm.sabre.kkeane.sandiego.edu. However, the zones generated by makedns -n instead use the hpcmn-test.kkeane.sandiego.edu name.
$TTL 86400
@ IN SOA hpcmn-test.kkeane.sandiego.edu. root.hpcmn-test.kkeane.sandiego.edu. ( 2017103100 10800 3600 604800 86400 )
  IN NS  hpcmn-test.kkeane.sandiego.edu.
This wreaks havoc with future calls to makedns; updates will time out because the DNS server is not listening at the IP address that corresponds to this name (and in fact, makehosts doesn't even put this name into /etc/hosts) 
How can I get makedns to generate zones with an NS record that points to hpcmn-test.imm.sabre.kkeane.sandiego.edu ?
Thanks!
 --

___
Kevin Keane | Systems Architect | University of San Diego ITS | kke...@sandiego.edu
Maher Hall, 192 | 5998 Alcalá Park | San Diego, CA 92110-2492 | 619.260.6859
 



Re: [xcat-user] Same short name causing makehosts problems

2017-10-25 Thread Yuan Y Bai
Hi Christian,
 
I found that bloom.geode.iu.edu and public.geode.iu.edu are not in our example 2. These 2 domains may have been created by other data or networks.
Could you check all your nodes, the /etc/hosts file, the networks table, etc. to confirm where these 2 are coming from?
 
Based only on example 2, "makedns -n" works well in my environment.
 
[root@bybc0605 ~]# makedns -n
Handling node1-ib in /etc/hosts.
Handling bybc0607 in /etc/hosts.
Handling rhmn in /etc/hosts.
Handling farnsworth in /etc/hosts.
Handling node1-imm in /etc/hosts.
Handling node1 in /etc/hosts.
Handling farnsworth in /etc/hosts.
Handling c910f05c01bc06 in /etc/hosts.
Handling localhost in /etc/hosts.
Getting reverse zones, this may take several minutes for a large cluster.
Completed getting reverse zones.
Updating zones.
Completed updating zones.
Restarting named
Restarting named complete
Updating DNS records, this may take several minutes for a large cluster.
Completed updating DNS records.
DNS setup is completed
 
[root@bybc0605 ~]# cat /etc/hosts
127.0.0.1 localhost
172.20.0.11 node1 node1.cluster.local
172.29.0.11 node1-imm node1-imm.cluster.local
10.0.0.38 farnsworth farnsworth.dc.foo.edu
148.143.201.25 farnsworth farnsworth.it.raboof.edu
10.5.106.1 c910f05c01bc06
10.5.106.100 rhmn
172.40.0.11 node1-ib node1-ib.cluster.local
 
[root@bybc0605 ~]# lsxcatd -v
Version 2.13.7 (git commit 9ec6d0c0cce9f61a078a4dcd044927f5a65a606f, built Fri Sep 22 02:16:45 EDT 2017)
 
[root@bybc0605 ~]# tabdump networks
#netname,net,mask,mgtifname,gateway,dhcpserver,tftpserver,nameservers,ntpservers,logservers,dynamicrange,staticrange,staticrangeincrement,nodehostname,ddnsdomain,vlanid,domain,mtu,comments,disable
"provision","172.20.0.0","255.255.255.0","ens3",,,"172.20.0.2",,"172.20.0.1",,"172.20.0.200-172.20.0.250",
"imm","172.29.0.0","255.255.255.0","ens3",,,"172.29.0.2",
"infra","172.30.0.0","255.255.255.0","ens3",,,"172.30.0.2",
"IPoIB","172.40.0.0","255.255.255.0","ens3",,,"172.40.0.2",
"intersite","10.0.0.0","255.255.255.0","ens8","dc.foo.edu",," ",
"public","148.143.201.0","255.255.255.224",,"149.165.232.1""it.raboof.edu",,,
"VNDC-NET-00116-IPv4","10.0.0.0","255.0.0.0","eth0","10.5.106.2",,"",,"vndc","1500",,
 
 
 
Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian District,
Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193
 
 
- Original message -
From: Christian Caruthers <ccaruth...@lenovo.com>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Cc:
Subject: Re: [xcat-user] Same short name causing makehosts problems
Date: Wed, Oct 25, 2017 10:01 PM
Oddly, when I set up example 2, I received the following error when running makedns -n:
 
Error: Failure encountered updating bloom.geode.iu.edu., error was NOTZONE. See more details in system log.
Error: Failure encountered updating public.geode.iu.edu., error was NOTZONE. See more details in system log.
 
Looking in the syslog, I see:
 
Oct 25 09:13:30 xcat-bloom named[26918]: client 172.20.0.2#52572/key xcat_key: updating zone 'bloom.geode.iu.edu/IN': update failed: update RR is outside zone (NOTZONE)
Oct 25 09:13:35 xcat-bloom named[26918]: client 172.20.0.2#60694/key xcat_key: updating zone 'public.geode.iu.edu/IN': update failed: update RR is outside zone (NOTZONE)
 
Regards,
Christian Caruthers
Lenovo Professional Services
Mobile: 757-289-9872
 

Re: [xcat-user] Same short name causing makehosts problems

2017-10-23 Thread Yuan Y Bai
Hi Christian,
 
If you only want to add the "external" networks into "/etc/hosts", you can simply add them to the hosts table, as in example 1).
 
But I noticed that you want to configure "bond0" and "ens2f1" in the nics table, so I modified your nics table and hosts table a little and got another result that is close to your requirements and keeps the `nics` table configuration correct. The difference is: the 2 "external" networks for "farnsworth.domain.name" are added into /etc/hosts, but each gets a short name that contains the interface name ("iface"), as in example 2).
 
example 1) add only the "external" networks into /etc/hosts:
 
]# cat /etc/hosts
172.20.0.11 node1 node1.cluster.local
172.29.0.11 node1-imm node1-imm.cluster.local
10.0.0.38 farnsworth farnsworth.dc.foo.edu
148.143.201.25 farnsworth farnsworth.it.raboof.edu
172.40.0.11 node1-ib node1-ib.cluster.local
 
]# tabdump hosts
#node,ip,hostnames,otherinterfaces,comments,disable
"node1","172.20.0.11",,"-imm:172.29.0.11,farnsworth.dc.foo.edu:10.0.0.38,farnsworth.it.raboof.edu:148.143.201.25",,
 
]# tabdump nics
#node,nicips,nichostnamesuffixes,nichostnameprefixes,nictypes,niccustomscripts,nicnetworks,nicaliases,nicextraparams,nicdevices,nicsadapter,comments,disable
"node1","ib0!172.40.0.11","ib0!-ib",,"ib0!Infiniband,bond0!Ethernet,ens2f1!Ethernet","bond0!configbond-mtu bond0 ens1@ens1d1 mode=4@xmit_hash_policy=layer3+4@miimon=100","ib0!IPoIB,bond0!intersite,ens2f1!public",,,"bond0!MTU=9000,ens2f1!MTU=9000",,,
 
example 2) configure bond0 in the nics table:
]# cat /etc/hosts
127.0.0.1 localhost
172.20.0.11 node1 node1.cluster.local
172.29.0.11 node1-imm node1-imm.cluster.local
148.143.201.25 node1-ens2f1 farnsworth.it.raboof.edu
172.40.0.11 node1-ib node1-ib.cluster.local
10.0.0.38 node1-bond0 farnsworth.dc.foo.edu
 
]# lsdef node1 |grep nic
    nicaliases.bond0=farnsworth.dc.foo.edu
    nicaliases.ens2f1=farnsworth.it.raboof.edu
    niccustomscripts.bond0=configbond-mtu bond0 ens1@ens1d1 mode=4@xmit_hash_policy=layer3+4@miimon=100
    nicdevices.bond0=MTU=9000
    nicdevices.ens2f1=MTU=9000
    nichostnamesuffixes.ib0=-ib
    nicips.ib0=172.40.0.11
    nicips.bond0=10.0.0.38
    nicips.ens2f1=148.143.201.25
    nicnetworks.ib0=IPoIB
    nicnetworks.bond0=intersite
    nicnetworks.ens2f1=public
    nictypes.ib0=Infiniband
    nictypes.bond0=Ethernet
    nictypes.ens2f1=Ethernet
 
]# tabdump hosts
#node,ip,hostnames,otherinterfaces,comments,disable
"node1","172.20.0.11",,"-imm:172.29.0.11",,
 
 
 
Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian District,
Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193
 
 
- Original message -
From: Christian Caruthers
To: "xCAT Users Mailing list (xcat-user@lists.sourceforge.net)"
Cc:
Subject: [xcat-user] Same short name causing makehosts problems
Date: Sat, Oct 21, 2017 2:46 AM
Running version 2.13.5.POST107.g8489de8
 
I have a handful of nodes that use the same short name on 2 networks. Here’s what it should look like:
 
node1.cluster.local 172.20.0.11
node1-imm.cluster.local 172.29.0.11
node1-ib.cluster.local 172.40.0.11
farnsworth.dc.foo.edu 10.0.0.38
farnsworth.it.raboof.edu 148.143.201.25
 
Networks setup is:
#netname,net,mask,mgtifname,gateway,dhcpserver,tftpserver,nameservers,ntpservers,logservers,dynamicrange,staticrange,staticrangeincrement,nodehostname,ddnsdomain,vlanid,domain,mtu,comments,disable
"provision","172.20.0.0","255.255.255.0","ens3",,,"172.20.0.2",,"172.20.0.1",,"172.20.0.200-172.20.0.250",
"imm","172.29.0.0","255.255.255.0","ens3",,,"172.29.0.2",
"infra","172.30.0.0","255.255.255.0","ens3",,,"172.30.0.2",
"IPoIB","172.40.0.0","255.255.255.0","ens3",,,"172.40.0.2",
"intersite","10.0.0.0","255.255.255.0","ens8","dc.foo.edu",, ,
"public","148.143.201.0","255.255.255.224",,"149.165.232.1""it.raboof.edu",,,
 
If I set up the provisioning and IMM IPs in the hosts table w/ everything else defined in nics, I see:
 
Node setup:
Nics table: "node1","ib0!172.40.0.11,bond0!10.0.0.38,ens2f1!148.143.201.25","ib0!-ib",,"ib0!Infiniband,bond0!Ethernet,ens2f1!Ethernet","bond0!configbond-mtu bond0 ens1@ens1d1 mode=4@xmit_hash_policy=layer3+4@miimon=100","ib0!IPoIB,bond0!intersite,ens2f1!public","bond0!farnsworth,ens2f1!farnsworth","bond0!MTU=9000,ens2f1!MTU=9000"
 
Hosts table: "node1","172.20.0.11",,"-imm:172.29.0.11",,
 
After running makehosts -n, I see:
 
172.20.0.17 node1 node1.cluster.local
172.29.0.17 node1-imm node1-imm.cluster.local
148.143.201.25 node1-ens2f1 node1-ens2f1.it.raboof.edu farnsworth
172.40.0.17 node1-ib node1-ib.cluster.local
10.0.0.38 node1-bond0 node1-bond0.dc.foo.edu
 
Note that it creates an alias for farnsworth (which won’t work for our purposes), but ignores the setting for the 10.0.0.0/bond0 
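
When comparing makehosts results like the ones above, it can help to list the names that end up attached to more than one address. The following is only a sketch under assumptions: `find_dup_names` is a made-up helper, not an xCAT command, and it treats every name column (short name, FQDN, or alias) the same way.

```shell
#!/bin/sh
# Sketch: print every name that is listed on more than one IP address
# in hosts-format input (files given as arguments, or stdin).
# find_dup_names is a hypothetical helper, not part of xCAT.
find_dup_names() {
    awk '
        /^[ \t]*(#|$)/ { next }              # skip comments and blank lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break          # ignore a trailing comment
                key = $i SUBSEP $1
                if (!(key in seen)) {         # count each (name, IP) pair once
                    seen[key] = 1
                    count[$i]++
                }
            }
        }
        END {
            for (name in count)
                if (count[name] > 1) print name
        }
    ' "$@" | sort
}
```

Running `find_dup_names /etc/hosts` on a file where farnsworth is listed on both 10.0.0.38 and 148.143.201.25 should print farnsworth, which is the situation makedns later has to cope with.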

Re: [xcat-user] Problem with syncfiles after install.

2017-10-12 Thread Yuan Y Bai
Hi Hakan,
 
Run updatenode to synchronize the files: "updatenode  -F -V".
Another tip: `remoteshell` runs before `syncfiles` in the postscripts, and `remoteshell` starts sshd so that the syncfiles postscript can do the sync work. You can check whether there is an error while running remoteshell with "updatenode  remoteshell".
 
If you get nothing useful from updatenode, enable xcatdebugmode, re-provision the node, and look in xcat.log; the output of the remoteshell and syncfiles postscripts there may help.
 
Internally, `syncfiles` calls `startsyncfiles.awk` using openssl.
 
 
Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian District,
Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193
 
 
- Original message -
From: Hakan Bayındır
To: xCAT Users Mailing list , Xiao Peng Wang
Cc:
Subject: Re: [xcat-user] Problem with syncfiles after install.
Date: Thu, Oct 12, 2017 5:32 PM
Hello Again,

After enabling debuglevel with the suggestion of Yuan, I noticed that
syncfiles also doesn't work after update, but with a different error.
The complete log is below.

server1: Thu Oct 12 12:29:27 +03 2017 Running postscript: syncfiles
server1: + '[' -d /.statelite ']'
server1: + '[' -f /etc/os-release ']'
server1: + cat /etc/os-release
server1: + grep -i -e '^NAME=[ "'\'']*Cumulus Linux[ "'\'']*$'
server1: + '[' -n 1 ']'
server1: + '[' 1 -eq 1 ']'
server1: + logger -t xcat -p local4.err './syncfiles: Did not sync any files. Use updatenode -F to sync the files.'
server1: + exit 0
server1: postscript: syncfiles exited with code 0
server1: Running of postscripts has completed.

Regards,

Hakan

On 10/12/2017 11:34 AM, Hakan Bayındır wrote:
> Hello Wang,
>
> When I run 'updatenode  -P syncfiles' after automated install and
> reboot, the process runs without any problems. The log is below.
>
> server1: xcatdsklspost: downloaded postscripts successfully
> server1: Thu Oct 12 11:30:16 +03 2017 Running postscript: syncfiles
> server1: postscript: syncfiles exited with code 0
> server1: Running of postscripts has completed.
>
> Regards,
>
> Hakan
>
> On 10/11/2017 01:34 PM, Xiao Peng Wang wrote:
>> xdcp -F calls the 'rsync' to sync files to CN directly from MN. But
>> during the OS deployment, a postscript named 'syncfiles' is executed in
>> compute node side.
>>
>> You may try 'updatenode  -P syncfiles'. And could you paste the log?
>>
>> Best Regards
>> --
>> Wang Xiaopeng (王晓朋)
>>
>> Manager for HPC SW Dev: xCAT, ESSL, SMI, Test
>> IBM China Systems Laboratory (CSL)
>> Tel: 86-10-82453455
>> Email: w...@cn.ibm.com
>>
>>     - Original message -
>>     From: Hakan Bayındır
>>     To: xcat-user@lists.sourceforge.net
>>     Cc:
>>     Subject: Re: [xcat-user] Problem with syncfiles after install.
>>     Date: Wed, Oct 11, 2017 2:13 PM
>>
>>     Hello Wang,
>>
>>     I guessed that they're different, but don't know the details abut the
>>     differences. I tested with xdcp anyway to be sure that my syncfile is
>>     correct.
>>
>>     Is there any way to debug syncfiles further?
>>
>>     Thanks in advance,
>>
>>     Hakan
>>
>>     On 10/10/2017 05:57 PM, Xiao Peng Wang wrote:
>>     > One thing to be aware that using xdcp -F is different with the syncfile
>>     > process during OS deployment.
>>     >
>>     > Best Regards
>>     > --
>>     > Wang Xiaopeng (王晓朋)
>>     >
>>     > Manager for HPC SW Dev: xCAT, ESSL, SMI, Test
>>     > IBM China Systems Laboratory (CSL)
>>     >
>>     > Tel: 86-10-82453455
>>     > Email: w...@cn.ibm.com
>>     >
>>     >     - Original message -
>>     >     From: Russ Auld
>>     >     To: xCAT Users Mailing list
>>     >     Cc:
>>     >     Subject: Re: [xcat-user] Problem with syncfiles after install.
>>     >     Date: Tue, Oct 10, 2017 7:17 PM
>>     >
>>     >     Typically this is indicative of a problem with ssh connectivity between
>>     >     the compute node and the master/service node.
>>     >     Ensure that DNS is also working correctly.
>>     >
>>     >     On Mon, 2017-10-09 at 19:00 +0300, Hakan Bayındır wrote:
>>     >     > Hello All,
>>     >     >
>>     >     > I'm having a problem with the default syncfiles script, which is
>>     >     > fired by default as a "postscript" after an installation. The
>>     >     > installation completes as it should be and the system reboots,
>>     >     > however I get an error in the xcat.log which states that the
>>     >     > syncfiles have returned 1, hence failed. The strange thing is, when I
>>     >     > fire the same syncfile with xdcp,

Re: [xcat-user] Problem with syncfiles after install.

2017-10-11 Thread Yuan Y Bai
Hi Hakan,
 
Could you enable 'xcatdebugmode' in the site table?
For example: chdef -t site xcatdebugmode=2
 
After enabling xcatdebugmode, you can get more information from xcat.log for all postscripts. xcatdebugmode works like turning on "set -x" for these postscripts.
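
Once the debug output is available, a small filter can pull out just the postscripts that returned a non-zero code. This is a sketch under assumptions: `failed_postscripts` is a made-up helper, and the line format ("postscript: NAME exited with code N") is the one seen in updatenode output in this thread; whether your xcat.log uses exactly the same wording should be checked first.

```shell
#!/bin/sh
# Sketch: list postscripts that exited with a non-zero code.
# Reads log text from the files given as arguments, or from stdin.
# failed_postscripts is a hypothetical helper, not part of xCAT.
failed_postscripts() {
    # Reduce each matching line to "NAME CODE", then keep non-zero codes.
    sed -n 's/.*postscript: \([^ ]*\) exited with code \([0-9][0-9]*\).*/\1 \2/p' "$@" |
        awk '$2 != 0 { print $1 " exited with code " $2 }'
}
```

For example, `updatenode ... | failed_postscripts` (with your own noderange and options) would print nothing when every postscript returned 0.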
 
What is your xCAT version? What is your OS?
 
 
Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian District,
Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193
 
 
- Original message -
From: Hakan Bayındır
To: xCAT Users Mailing list , Russ Auld
Cc:
Subject: Re: [xcat-user] Problem with syncfiles after install.
Date: Wed, Oct 11, 2017 2:45 PM
Hello Russ,

Thanks for your answer. I'm utilizing the xCAT's internal DNS, so
everything is managed automatically. When I checked with nslookup, the
internal DNS also returns the IPs correctly in both the MN and the
target machine.

Also, since the network is internal, the xCAT MN is transparent. It has
no firewall or other restrictions in connectivity.

Is there any way that I can debug syncfiles script?

Regards,

Hakan

On 10/10/2017 02:15 PM, Russ Auld wrote:
> Typically this is indicative of a problem with ssh connectivity between
> the compute node and the master/service node.
> Ensure that DNS is also working correctly.
>
> On Mon, 2017-10-09 at 19:00 +0300, Hakan Bayındır wrote:
>> Hello All,
>>
>> I'm having a problem with the default syncfiles script, which is
>> fired by default as a "postscript" after an installation. The
>> installation completes as it should be and the system reboots,
>> however I get an error in the xcat.log which states that the
>> syncfiles have returned 1, hence failed. The strange thing is, when I
>> fire the same syncfile with xdcp, everything works as it should. Did
>> anyone had this error and nudge me in the right direction?
>>
>> Best regards,
>>
>> Hakan
>>
>> Hakan BAYINDIR
>> Uzman Araştırmacı
>> Ağ Teknolojileri Birimi
>> YÖK Binası B-5 Blok Kat:4
>> 06539 Bilkent ANKARA
>> T +90 312 298 9373
>> F +90 312 266 5181
>> www.ulakbim.gov.tr
>> hakan.bayin...@tubitak.gov.tr

--
Hakan BAYINDIR
Başuzman Araştırmacı
Ağ Teknolojileri Birimi
TÜBİTAK ULAKBİM
T.C. Bilim, Sanayi ve Teknoloji Bakanlığı (Eski Bina)
Mustafa Kemal Mahallesi Dumlupınar Bulvarı
(Eskişehir Yolu 7.Km) 2151.Cadde No:154
ODTÜ Karşısı
06510 Çankaya, ANKARA
T +90 312 298 9373
F +90 312 266 5181
www.ulakbim.gov.tr
hakan.bayin...@tubitak.gov.tr
 


Re: [xcat-user] Setting user hostnames

2017-08-22 Thread Yuan Y Bai
Hi All,

We have the following design for supporting user hostnames in xCAT, which
we may implement in the future. Could you check whether this design would
resolve your user hostname problem?
Any comments are welcome.

Feature:
Add one new column to the `hosts` table, for example named `customhostname`.
Users can put the hostname they expect for each node into this column. If
the column is set, xCAT will configure the final system hostname to match
this value.


Take the following example to explain the design.
The node name is computenode1 in the xCAT DB, and the user wants its
hostname to be publichostname after the node is provisioned.

   On the xCAT MN, the hosts table would be:
   #tabdump hosts
   #node,ip,hostnames,customhostname,otherinterfaces,comments,disable
   "computenode1",,,"publichostname",,,


   After computenode1 is provisioned, checking the hostname of the node
   with xdsh returns publichostname:

   #xdsh computenode1 hostname
   publichostname

Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
 ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
 Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193



From:   Christian Caruthers 
To: xCAT Users Mailing list 
Date:   08/22/2017 09:15 AM
Subject:Re: [xcat-user] Setting user hostnames



I agree, I don't think it is supported. I think I can make it work using
the solution I outlined, but that doesn't appear to be working according to
the documentation.

Looking for verification of what I'm trying to do, confirmation that the
envlist files are still parsed (it appears they are), and/or suggestions.

Regards,
Christian Caruthers
Lenovo Professional Services
Mobile: 757-289-9872
Sent from my mobile device




On Mon, Aug 21, 2017 at 5:58 PM -0500, "Kevin Keane" 
wrote:

  The way I read it, this was a design suggestion for a future version of
  xCAT; it is described as a low-priority item. It may not yet be
  implemented.

  On Mon, Aug 21, 2017 at 3:42 PM, Christian Caruthers <
  ccaruth...@lenovo.com> wrote:
   Using this as a reference:

   https://sourceforge.net/p/xcat/wiki/Exporting_more_table_attributes_to_nodes/

   Currently, we have n1…n10 configured in xCAT. These are internal
   hostnames. All of these hosts have external "public" IP addresses, and
   the desire is to have the hostname match the public name.

   Short of setting up DNS for the public IPs, which I really don't want to
   do since there will be an external DNS server w/ that info, I thought I
   could use an unused table.field (prodkey.key) for a .envlist
   file. So I have:

   ls -la /install/custom/install/rh/x86_84/chum.envlist
   prodkey,key,,NEWHOST

   (NOTE: I have also tried this in /install/custom/install/rh/)

   lsdef n1 | egrep 'profile|post|productkey'
   postbootscripts=,chumhostname
   productkey=rabidgator1.floridaman.arrr.edu
   profile=chum

   cat /install/postscripts/chumhostname
   hostnamectl set-hostname $NEWHOST

   When I try this with updatenode, it doesn't appear to be grabbing the
   info from that envlist file. Is this feature still in use? If not,
   what's a recommended way to set a different hostname?
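
One thing that makes this kind of problem easier to spot is a postscript that refuses to run when the variable never arrives, instead of calling hostnamectl with an empty argument. A minimal sketch, assuming NEWHOST is exported to the postscript as described above; `set_hostname_from_env` and `HOSTNAME_CMD` are illustrative names, not xCAT features:

```shell
#!/bin/sh
# Hypothetical, more defensive variant of the chumhostname postscript.
# HOSTNAME_CMD is an illustrative override so the logic can be dry-run
# (for example HOSTNAME_CMD=echo); it is not an xCAT variable.
set_hostname_from_env() {
    if [ -z "${NEWHOST:-}" ]; then
        # Report the missing variable instead of setting an empty hostname.
        echo "chumhostname: NEWHOST is empty or unset; hostname left unchanged" >&2
        return 1
    fi
    "${HOSTNAME_CMD:-hostnamectl}" set-hostname "$NEWHOST"
}
```

In a real postscript the file would end with a call to `set_hostname_from_env`; a non-zero exit then shows up in the postscript log, which at least confirms whether the envlist value reached the node.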





   Regards,
   Christian Caruthers
   Lenovo Professional Services


   Mobile: 757-289-9872

   




  --


  ___
  Kevin Keane | Systems Architect | University of San Diego ITS |
  kke...@sandiego.edu
  Maher Hall, 192 |5998 Alcalá Park | San Diego, CA 92110-2492 |
  619.260.6859


Re: [xcat-user] Node name on secondary interface.

2017-08-09 Thread Yuan Y Bai

Hi  Ulf,

xCAT only configures the node name on the management/deploy interface of a
compute node.

If you want the node's real name on the secondary/public 10Gb interface, I
think you can define the node in the xCAT DB under a different naming
scheme, such as testnode1. Then use the name you expect for the deploy
interface, such as node1-adm, as the alias for that interface, and use the
node's real name, such as node1, as the alias for the secondary/public
interface.

Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
 ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
 Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193



From:   Ulf Johansson 
To: xCAT Users Mailing list 
Date:   08/09/2017 09:03 PM
Subject:[xcat-user] Node name on secondary interface.



Hi.

One question regarding network interfaces and host/node names.

Is it possible to have xCat configure the  node/host name on a
secondary/public interface instead of the management interface on a compute
node?
For example
Management/Deploy interface, name in /etc/hosts and DNS node1-adm
Secondary/Public 10Gb interface, name  in /etc/hosts and DNS node1

Regards,
-Ulf

--
Ulf Johansson

Go Virtual Nordic AB
Datavägen 21A
SE-436 32 Askim
Sweden

Phone: +46-31-748 88 79
Fax: +46-31-286205
E-mail: ulf.johans...@govirtual.se
Web: www.govirtual.se

Think Visual - Go Virtual


Re: [xcat-user] Error while executing makehosts command

2017-07-10 Thread Yuan Y Bai

Hi Ahmed,

I cannot reproduce the makehosts error. Could you provide more information
about makehosts?
1) Could you check whether /tmp/xcat is readable and writable? Are you
running as a non-root user?
2) While makehosts is executing, is there a file "/tmp/xcat/hostsfile.lock"?
3) Could you provide the output of "lsxcatd -v", "XCATBYPASS=1 makehosts",
and "rpm -qa|grep -i xcat"?
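
Checks 1) and 2) can be collected in one go. The snippet below is a sketch only: `check_lock_dir` is a made-up helper, and the default path /tmp/xcat and the file name hostsfile.lock are simply taken from the questions above.

```shell
#!/bin/sh
# Sketch: report whether the lock directory is usable by the current user
# and whether the lock file currently exists.
# check_lock_dir is a hypothetical helper, not part of xCAT.
check_lock_dir() {
    dir=${1:-/tmp/xcat}
    if [ ! -d "$dir" ]; then
        echo "$dir: missing"
    elif [ -r "$dir" ] && [ -w "$dir" ]; then
        echo "$dir: readable and writable"
    else
        echo "$dir: not readable and writable by $(id -un)"
    fi
    if [ -e "$dir/hostsfile.lock" ]; then
        echo "$dir/hostsfile.lock: present"
    else
        echo "$dir/hostsfile.lock: absent"
    fi
}
```

Run it as the same user that runs makehosts, since the read/write test depends on who is asking.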


Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
 ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
 Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193



From:   Ahmed Essam 
To: "xCAT-user@lists.sourceforge.net"

Date:   07/09/2017 06:31 PM
Subject:[xcat-user] Error while executing makehosts command



Hi all,
I am using xCAT 2.13.4 version and the OS on my management node is Centos 6

When I try to execute the "makehosts" command, the following error appears:


flock() on closed filehandle $lockh
at /opt/xcat/lib/perl/xCAT_plugin/hosts.pm line 451.
flock() on closed filehandle $lockh
at /opt/xcat/lib/perl/xCAT_plugin/hosts.pm line 565.

Other error appears when I try to use "switchdiscover"

switchdiscover --range 192.168.1.0/24 -w
Command failed: lsdef. Error message: Could not find an object named
'switch-192-168-1-50' of type 'node'..

Given that there isn't a problem in our switch configuration since we've
implemented an xCAT cluster before using the same switch with Centos7
management node.

Hope you can help.

Sincerely,
Ahmed Essam





Re: [xcat-user] Bcast address 0.0.0.0 Statelite deployment

2017-05-23 Thread Yuan Y Bai



From: Yuan Y Bai [mailto:by...@cn.ibm.com]
Sent: Monday, May 22, 2017 9:21 AM
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Subject: Re: [xcat-user] Bcast address 0.0.0.0 Statelite deployment



Sorry for the eth1 mistake in my previous mail; it should be eth0.
/sbin/ip addr add 10.10.101.51/24 dev eth0
/sbin/ip link set eth0 up

You should use:
/sbin/ip addr add 10.10.101.51/24 broadcast 10.10.101.255 dev eth0
Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian District,
Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193


From: "Yuan Y Bai" <by...@cn.ibm.com>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Date: 05/22/2017 02:18 PM
Subject: Re: [xcat-user] Bcast address 0.0.0.0 Statelite deployment




Hi Gilad,

I provisioned sles12.2 statelite using xCAT, and I cannot reproduce this
Bcast address 0.0.0.0 on the install NIC through provisioning.

I have the following questions:

1, Is eth0 your install NIC? The install NIC is configured by DHCP in xCAT.
If eth0 is the install NIC, the bcast is not configured by xCAT directly;
it should be calculated.

2, Is eth0 configured by scripts from xCAT? For example, do you use the
confignics or confignetwork scripts, or others, to configure eth0?
If you are using xCAT scripts, could you tell me which ones? I can look
into it.

3, If you are using other scripts to configure eth0, the following commands
may leave the bcast address as 0.0.0.0:
/sbin/ip link set eth0 down
/sbin/ip addr add 10.10.101.51/24 dev eth0
/sbin/ip link set eth0 up

You should use:
/sbin/ip addr add 10.10.101.51/24 broadcast 10.10.101.255 dev eth0

Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian District,
Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193


From: "Yuan Y Bai" <by...@cn.ibm.com>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Date: 05/22/2017 10:55 AM
Subject: Re: [xcat-user] Bcast address 0.0.0.0 Statelite deployment




Hi Gilad,

Thanks for your information. Let me try to reproduce it in my environment.


Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian District,
Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193


From: Gilad Berman <gber...@lenovo.com>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Date: 05/20/2017 04:06 AM
Subject: [xcat-user] Bcast address 0.0.0.0 Statelite deployment




All,

I am assuming many linux experts on this list, so since I am not sure this
is purely xCAT issue, any hints are more than welcome 

Re: [xcat-user] Bcast address 0.0.0.0 Statelite deployment

2017-05-22 Thread Yuan Y Bai
Sorry for the eth1 mistake in my previous mail; it should be eth0.

  /sbin/ip addr add 10.10.101.51/24 dev eth0
  /sbin/ip link set eth0 up

  You should use:
  /sbin/ip addr add 10.10.101.51/24 broadcast 10.10.101.255 dev eth0
Best Regards
--
Yuan Bai (白媛)


From: Gilad Berman <gber...@lenovo.com>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Date: 05/20/2017 04:06 AM
Subject: [xcat-user] Bcast address 0.0.0.0 Statelite deployment



All,

I am assuming there are many Linux experts on this list, so since I am not
sure this is purely an xCAT issue, any hints are more than welcome.

In our statelite deployment the boot interface comes up with a broadcast
address of 0.0.0.0:

# ifconfig eth0
eth0  Link encap:Ethernet  HWaddr 6C:AE:8B:08:8E:03
      inet addr:10.10.101.51  Bcast:0.0.0.0  Mask:255.255.255.0
      ...

The broadcast address can be set easily with ifconfig eth0 broadcast.
eth1 boots with the right bcast address.

I wonder what could cause this issue, when it seems there is no limitation
on setting it.

OS is SLES 12.2 and the deployment method is statelite (stateful works as
expected).

THX in advance!



  
 

Re: [xcat-user] Bcast address 0.0.0.0 Statelite deployment

2017-05-22 Thread Yuan Y Bai
Hi Gilad,

I provisioned SLES 12.2 statelite using xCAT, but I cannot reproduce the
Bcast address 0.0.0.0 on the install NIC through provisioning.

I have the following questions:

1. Is eth0 your install NIC? The install NIC is configured by DHCP in xCAT.
If eth0 is the install NIC, the bcast is not configured by xCAT directly;
it should be calculated.

2. Is eth0 configured by scripts from xCAT? For example, do you use the
confignics or confignetwork scripts, or others, to configure eth0?
   If you are using scripts in xCAT, could you tell me which ones? I can
look into it.

3. If you are using other scripts to configure eth0, the following commands
may leave the bcast address as 0.0.0.0:

   /sbin/ip link set eth0 down
   /sbin/ip addr add 10.10.101.51/24 dev eth1
   /sbin/ip link set eth0 up

   You should use:

   /sbin/ip addr add 10.10.101.51/24 broadcast 10.10.101.255 dev eth1
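
To see which interfaces are affected, the output of "ip -o -4 addr show"
can be scanned for inet lines that carry no "brd" field. This is a minimal
sketch that assumes the usual one-line-per-address output of iproute2;
no_brd_ifaces is a hypothetical helper name. Point-to-point and /32
addresses legitimately have no broadcast address and would be listed too.

```shell
# Reads `ip -o -4 addr show` output on stdin and prints the name of every
# non-loopback interface whose inet line has no "brd" field.
# Hypothetical helper for illustration only.
no_brd_ifaces() {
    awk '$2 != "lo" && $3 == "inet" {
        has = 0
        for (i = 4; i <= NF; i++) if ($i == "brd") has = 1
        if (!has) print $2
    }'
}

# Example use on a running node:
# ip -o -4 addr show | no_brd_ifaces
```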

Best Regards
--
Yuan Bai (白媛)


Re: [xcat-user] Bcast address 0.0.0.0 Statelite deployment

2017-05-21 Thread Yuan Y Bai

Hi Gilad,

Thanks for the information. Let me try to reproduce it in my environment.


Best Regards
--
Yuan Bai (白媛)





--

Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user




Re: [xcat-user] configbond script

2017-05-03 Thread Yuan Y Bai
Christian,

You can also insert "MTU=9000" before line 212 in configbond to configure
the MTU of the slaves.

Best Regards
--
Yuan Bai (白媛)





From:   Christian Caruthers <ccaruth...@lenovo.com>
To: "xCAT Users Mailing list (xcat-user@lists.sourceforge.net)"
<xcat-user@lists.sourceforge.net>
Date:   05/03/2017 12:25 PM
Subject:[xcat-user] configbond script



I have the following setup:

nodels gss1 nics
gss1: nics.nicips: bond0!10.100.10.20
gss1: nics.nicextraparams: bond0!MTU=9000
gss1: nics.nicsadapter: bond0!MTU=9000
gss1: nics.nichostnamesuffixes: bond0!-10g
gss1: nics.nicdevices: bond0!enp134s0,enp134s0d1,enp27s0,enp27s0d1,enp32s0,enp32s0d1
gss1: nics.node: gss1
gss1: nics.niccustomscripts: bond0!configbond bond0 enp134s0@enp134s0d1@enp27s0@enp27s0d1@enp32s0@enp32s0d1 miimon=100@mode=4@xmit_hash_policy=1
gss1: nics.nicnetworks: bond0!10G
gss1: nics.nictypes: bond0!Ethernet

Despite this, the MTU setting isn't being placed in the ifcfg-bond script.

Looking at the nics manpage, it's not really clear which setting does
which:

    nicextraparams
        Comma-separated list of extra parameters that will be used
        for each NIC configuration.
    nicsadapter
        Comma-separated list of extra parameters that will be used
        for each NIC configuration.

Regards,
Christian Caruthers
Lenovo Professional Services
Mobile: 757-289-9872



Re: [xcat-user] configbond script

2017-05-03 Thread Yuan Y Bai
Hi Christian,

Currently, configbond does not support nicextraparams. It is on our
backlog, but may not be high priority.
If you want to change the MTU value, you can add "MTU=9000" before line 200
in configbond as a workaround, like the following example:

   BONDING_OPTS="${array_bond_opts[*]}"
   MTU=9000
   EOF
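
For context, the three lines above are the tail of a here-document that
writes the ifcfg file for the bond. The following is only a sketch of that
pattern, not the actual configbond code: write_bond_cfg, its arguments, and
the DEVICE/BOOTPROTO/ONBOOT lines are assumptions made for illustration.
Only the BONDING_OPTS and MTU lines come from the example above.

```shell
# write_bond_cfg FILE BOND "OPTIONS"
# Writes a minimal ifcfg-style file for a bond with MTU=9000 placed just
# before the end of the here-document. Hypothetical helper for illustration.
write_bond_cfg() {
    cfg=$1      # target file, e.g. an ifcfg-bond0 path
    bond=$2     # bond device name
    opts=$3     # bonding options, space separated
    cat > "$cfg" <<EOF
DEVICE=$bond
BOOTPROTO=none
ONBOOT=yes
BONDING_OPTS="$opts"
MTU=9000
EOF
}
```

Because MTU=9000 sits inside the here-document, it ends up in the generated
file each time the script runs, which is why the edit has to go before the
closing EOF.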

Best Regards
--
Yuan Bai (白媛)



Re: [xcat-user] Making nodes IPs static

2017-04-06 Thread Yuan Y Bai
The related doc is here:
http://xcat-docs.readthedocs.io/en/stable/guides/admin-guides/manage_clusters/ppc64le/diskful/customize_image/cfg_second_adapter.html


Best Regards
--
Yuan Bai (白媛)




From:   Christian Caruthers 
To: xCAT Users Mailing list 
Date:   04/04/2017 01:15 AM
Subject:Re: [xcat-user] Making nodes IPs static



If your hosts have a single interface, hardeths will do the trick w/o any
additional configuration changes. If your nodes have multiple interfaces,
you can use "confignics -s" (man confignics) and omit hardeths. To use
confignics, you'll need to configure the nics table (man nics)
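
The two options above could be set with chdef roughly as follows. This is a
sketch under assumptions: run and the two set_static_* functions are
hypothetical wrapper names, "cn1" in the usage lines is a placeholder node
name, and DRY_RUN=1 only prints the commands instead of running them. The
-p flag of chdef appends to the existing attribute value rather than
replacing it.

```shell
# run CMD...: execute the command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Single interface: append hardeths to the node's postscripts.
set_static_single() {
    run chdef -t node -o "$1" -p postscripts=hardeths
}

# Several interfaces: use "confignics -s" instead of hardeths.
# The nics table must already be filled in for the node.
set_static_multi() {
    run chdef -t node -o "$1" -p postscripts="confignics -s"
}

# Example use on the management node:
# DRY_RUN=1 set_static_single cn1
# set_static_multi cn1
```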

Regards,
Christian Caruthers
Lenovo xESS IT Consultant
Mobile: 757-289-9872



-Original Message-
From: David Rajendra [mailto:drajen...@lenovo.com]
Sent: Monday, April 3, 2017 8:02 AM
To: xCAT Users Mailing list
Subject: Re: [xcat-user] Making nodes IPs static

Hello Hakan,

I can answer your first question:

You can use the xCAT "hardeths" postscript to achieve this.
(It is located in /install/postscripts).

This will make the DHCP-assigned IP address static. Add this postscript to
the xCAT postscripts table for your node/nodegroup definition.

Regards,

David



-Original Message-
From: Hakan Bayındır [mailto:hakan.bayin...@tubitak.gov.tr]
Sent: Monday, April 3, 2017 9:27 AM
To: xCAT User Mailing List
Subject: [xcat-user] Making nodes IPs static

Hello all,

I'm taking the last steps to bring our xCAT installation into production.
I'm working on making the IP addresses static on the hosts.

The DHCP server is assigning IPs as it should; however, our management
policy requires that the nodes have static IPs once installation is
complete.

Does xCAT have a feature for this, or should I write a small script for
this?

Also, as far as I can tell, xCAT allows nodes to talk to the master's DB
about their details. Is there any documentation for this?

Best regards,

Hakan Bayindir

--
*Hakan BAYINDIR*
Expert Researcher
Network Technologies Unit
YÖK Binası B-5 Blok Kat:4
06539 Bilkent Ankara
T +90 312 298 9373
F +90 312 266 5181
www.ulakbim.gov.tr 
hakan.bayin...@tubitak.gov.tr 







Re: [xcat-user] Setting MTU

2017-02-21 Thread Yuan Y Bai
Hi Christopher,

In my example, "confignics -s" configures MTU=9000 correctly:

postscripts]# lsdef bybc0607 -i ip,nicextraparams.eth0,nicnetworks.eth0,nictypes.eth0
Object name: bybc0607
    ip=10.5.106.7
    nicextraparams.eth0=MTU=9000
    nicnetworks.eth0=10_0_0_0-255_0_0_0
    nictypes.eth0=Ethernet

Currently, confignics uses nicextraparams to configure the MTU value.

The MTU in the networks table is used when the IP comes from DHCP.
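
Where the NIC name differs between hardware types, the nicextraparams
attributes can be generated per NIC name instead of typed by hand. A
minimal sketch: mtu_attrs is a hypothetical helper name (not an xCAT
command), and "cn1" in the usage line is a placeholder node name.

```shell
# mtu_attrs MTU NIC...
# Prints one nicextraparams.<nic>=MTU=<mtu> attribute per NIC name,
# separated by spaces. Hypothetical helper for illustration only.
mtu_attrs() (
    mtu=$1
    shift
    out=
    for nic in "$@"; do
        out="$out${out:+ }nicextraparams.$nic=MTU=$mtu"
    done
    echo "$out"
)

# Example use on the management node (word splitting is intended here):
# chdef -t node -o cn1 $(mtu_attrs 9000 ens1f0)
```

The node still needs "confignics -s" in its postscripts and the other nics
attributes shown in the lsdef output above for the setting to be applied.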


Best Regards
--
Yuan Bai (白媛)


From: "Christopher J. Walker" <c.j.wal...@qmul.ac.uk>
To: "xcat-user@lists.sourceforge.net" <xcat-user@lists.sourceforge.net>
Date: 02/21/2017 07:30 PM
Subject: [xcat-user] Setting MTU



I'm trying to configure the MTU to be 9000 for CentOS 7 hosts on our
internal network. I'm using xCAT 2.12.4.


A way that works is to set the MTU explicitly on a network card basis -
so I do:

nicextraparams.ens1f0=MTU=9000
and add
  "confignics -s"
to the postscripts for the node.

However we have a number of different hardware types - and the internal
network card name therefore changes.

It seems cleaner to set the MTU in the MTU column of the networks table,
instead of explicitly listing each network card type. For testing, I've
removed the "nicextraparams.ens1f0=MTU=9000" and set the MTU of 9000 in
the networks table. DHCP now correctly sets the MTU to be 9000, but this
is then removed by "confignics -s" - resulting in the default of 1500
after install.

Is this an oversight on my part, or a bug in confignics or configeth?

Chris
--
Dr Christopher J. Walker
ITS Research
Queen Mary University of London, E1 4NS



Re: [xcat-user] Setting MTU

2017-02-21 Thread Yuan Y Bai
Hi Christopher,

When you removed "nicextraparams.ens1f0=MTU=9000" and set an MTU of 9000 in
the networks table, DHCP correctly set the MTU to 9000: the MTU in the
networks table configures the network through DHCP.

If you want to set the MTU statically, you should use
nicextraparams.ens1f0=MTU=9000 in the nics table.
Currently, confignics uses nicextraparams to configure static attributes in
ifcfg-*.
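
To confirm after install that the static MTU actually landed in the ifcfg
file, the value can be read back. A minimal sketch: ifcfg_mtu is a
hypothetical helper name, and the path in the usage line assumes the
CentOS 7 network-scripts layout.

```shell
# ifcfg_mtu FILE
# Prints the MTU value set in an ifcfg-style file, with any quotes removed,
# or prints nothing if the file has no MTU= line.
# Hypothetical helper for illustration only.
ifcfg_mtu() {
    sed -n 's/^MTU=//p' "$1" | tr -d "\"'"
}

# Example use on a node:
# ifcfg_mtu /etc/sysconfig/network-scripts/ifcfg-ens1f0
```

An empty result after "confignics -s" has run would match the behaviour
described in this thread, where only nicextraparams carries the MTU into
ifcfg-*.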


Best Regards
--
Yuan Bai (白媛)



Re: [xcat-user] packimage users and groups

2017-02-07 Thread Yuan Y Bai
Hi Geert,

Were you able to resolve this problem?

packimage removes some directories based on the exclude list (exlist). If
directories you need are on that list, this may cause problems.
Could you check whether the exlist in your osimage definition contains
directories you need?

The exlist is set in the osimage definition. For example, if the osimage
name is rhels7.2-x86_64-netboot-compute, you can get the exlist like this:

]# lsdef -t osimage rhels7.2-x86_64-netboot-compute -i exlist
Object name: rhels7.2-x86_64-netboot-compute
    exlist=/opt/xcat/share/xcat/netboot/rh/compute.rhels7.x86_64.exlist
]# cat /opt/xcat/share/xcat/netboot/rh/compute.rhels7.x86_64.exlist
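
Rather than reading the whole exlist by eye, it can be searched for the
paths in question. A minimal sketch: exlist_hits is a hypothetical helper
name, and it assumes the exlist is a plain text file with one path pattern
per line.

```shell
# exlist_hits FILE FRAGMENT...
# Prints every exlist line (with its line number) that contains one of the
# given path fragments. Hypothetical helper for illustration only.
exlist_hits() (
    file=$1
    shift
    for frag in "$@"; do
        grep -n -F -- "$frag" "$file"
    done
)

# Example use on the management node:
# exlist_hits /opt/xcat/share/xcat/netboot/rh/compute.rhels7.x86_64.exlist etc/passwd etc/group
```

No output means none of the fragments appear in the list, in which case
the exlist is probably not what removes the files.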



Best Regards
--
Yuan Bai (白媛)




From:   Geert Geurts 
To: 
Date:   01/26/2017 07:09 PM
Subject:[xcat-user] packimage users and groups



Hello all,

I've built a CentOS 7.2 osimage using genimage, but I noticed that a few
services didn't come up correctly.

chronyd.service, sm-client.service and systemd-tmpfiles-setup.service
all failed because of missing users/groups.

I've added the needed users to the image, but these users get removed by
packimage.

Why does packimage care about the users defined in an image?

What am I doing wrong here to get these users defined?


Best regards,

Geert





Re: [xcat-user] makedhcp issues

2017-02-05 Thread Yuan Y Bai
Hi Will Robinson,

Weihua has fixed this issue in pull request 2428, linked below. The fix is
merged in 2.13.2; you can also take it directly from 2428.

https://github.com/xcat2/xcat-core/pull/2428
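
To check whether an installed release already includes the fix, the version
can be compared against 2.13.2. A minimal sketch: version_ge is a
hypothetical helper name, and it relies on "sort -V" from GNU coreutils.
The installed version has to be supplied by hand here, for example from
the output of "lsxcatd -v".

```shell
# version_ge A B
# Succeeds (exit status 0) when version A is greater than or equal to B.
# Hypothetical helper for illustration; uses GNU version sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Example use:
# if version_ge 2.13.1 2.13.2; then echo "fix included"; else echo "upgrade or patch"; fi
```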



Best Regards
--
Yuan Bai (白媛)


From: Will Robinson <w...@clemson.edu>
To: xcat-user@lists.sourceforge.net
Date: 02/05/2017 12:41 AM
Subject: [xcat-user] makedhcp issues



Hi,

"makedhcp" is failing to properly configure DHCP on our new management
node (fresh install, running xCAT 2.13.1).  It may be that something did
not configure properly during install. Networks table appears to be
okay.  Capturing output reveals the following info below (running 'xcatd
-f').  Any suggestions would be appreciated.  Thanks.


MN:~# makedhcp -n
MN:~#

xCAT: Allowing makedhcp -n for root from localhost.localdomain
Use of uninitialized value $n in concatenation (.) or string at
/sbin/xcatd line 2108.
Do nothing

MN:~# makedhcp all
MN:~#

Use of uninitialized value $n in concatenation (.) or string at
/sbin/xcatd line 2108.
Do nothing





--


Will




Re: [xcat-user] makedhcp issues

2017-02-05 Thread Yuan Y Bai
Hi Wei hua,

Line 2108 of /sbin/xcatd is used for tracing. Could you help look at it?


Best Regards
--
Yuan Bai (白媛)



Re: [xcat-user] Software Kits

2016-12-20 Thread Yuan Y Bai
Yes, do you have any questions?


Best Regards
--
Yuan Bai (白媛)




From:   Russell Auld 
To: xCAT Users Mailing list 
Date:   12/21/2016 08:15 AM
Subject:[xcat-user] Software Kits



Do people still use software kits?


--

Developer Access Program for Intel Xeon Phi Processors
Access to Intel Xeon Phi processor-based developer platforms.
With one year of Intel Parallel Studio XE.
Training and support from Colfax.
Order your platform today.http://sdm.link/intel
___
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user




Re: [xcat-user] /etc/named.conf

2016-11-28 Thread Yuan Y Bai
Hi Chris,

Thanks for your advice.

I opened an issue: https://github.com/xcat2/xcat-core/issues/2206

I will add the comments in 2.13.1.


Best Regards
--
Yuan Bai (白媛)




From:   "Christopher J. Walker" 
To: "xcat-user@lists.sourceforge.net"

Date:   11/25/2016 08:29 PM
Subject:[xcat-user] /etc/named.conf



One of my colleagues recently changed the nameserver in /etc/named.conf.

This change was then overwritten when we ran makedns.

Could comments be added to this file indicating that it is generated by
xCAT, please, to help prevent this sort of thing in future?
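Until the generated file carries such a comment, one way to notice an overwrite is to snapshot the file before regenerating it. This is a minimal sketch, not an xCAT feature: the helper names and the `.pre-regen` suffix are made up for illustration.

```shell
# Hypothetical helpers, not part of xCAT: snapshot a config file before a
# tool regenerates it, then report whether the tool changed it.
snapshot_file() {
    # $1: path of the file to protect, e.g. /etc/named.conf
    cp -p "$1" "$1.pre-regen"
}

changed_since_snapshot() {
    # Exit status 0 if the file differs from its snapshot, 1 if identical.
    if cmp -s "$1" "$1.pre-regen"; then
        return 1
    fi
    return 0
}

# Example (run as root on the management node):
#   snapshot_file /etc/named.conf
#   makedns -n
#   changed_since_snapshot /etc/named.conf && \
#       diff -u /etc/named.conf.pre-regen /etc/named.conf
```

The diff shows which hand edits were lost, so they can be moved into the xCAT tables that makedns reads rather than into the generated file.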

Thanks,

Chris

--
Dr Christopher J. Walker
ITS Research
Queen Mary University of London, E1 4NS



Re: [xcat-user] Problem in reverse dns checking during openssl handshakes(getpostscripts.awk)

2016-11-15 Thread Yuan Y Bai
Hi 범희대,

Could you paste the error messages from the output?

I do not yet clearly understand the exact issue.


Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
 ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
 Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193



From:   범희대 
To: xCAT Users Mailing list 
Date:   11/16/2016 01:27 PM
Subject:[xcat-user] Problem in reverse dns checking during openssl
handshakes(getpostscripts.awk)



Hello, guys.
Has anyone run into a timeout while running getpostscripts.awk?

The xCAT OS provisioning system uses getpostscripts.awk to issue the
'getpostscripts' request, and in that script openssl connects to the xCAT
master IP address.

When I tested this function, I saw an error caused by a timeout in the openssl
connection (you will understand what I mean if you have read the script).

I debugged it and solved it by appending a record for the xCAT master IP
to the /etc/hosts file.
I then googled it and found a known issue: the timeout results from
reverse DNS checking during the openssl handshake.

So here is my question. Has anyone else experienced the same problem?
If you have, and solved it another way (without editing /etc/hosts),
could you recommend that to me?
Or is there a patch that resolves it in versions newer than 2.9.1?

Thx.
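
The /etc/hosts workaround described above can be sketched as a small idempotent helper. The function name, IP address and host name are illustrative assumptions, not anything shipped with xCAT.

```shell
# Hypothetical helper: add "IP NAME" to a hosts file unless the IP is
# already listed there.
ensure_hosts_entry() {
    # $1: IP address, $2: host name, $3: hosts file (defaults to /etc/hosts)
    ip=$1 name=$2 file=${3:-/etc/hosts}
    # -F: fixed string, -w: whole-word match so 10.0.0.1 does not match 10.0.0.10
    if grep -qwF -- "$ip" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$file"
}

# Example on the node being provisioned (placeholder values):
#   ensure_hosts_entry 10.0.0.1 xcatmaster
```

With the entry in place the reverse lookup is answered locally, so it no longer has to wait for an unreachable or slow DNS server.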


Re: [xcat-user] makedhcp creates dhcpd.conf with syntax errors

2016-09-19 Thread Yuan Y Bai
Hi Roger,

Could you check whether a domain is configured in the site table on the xCAT MN?

# tabdump site | grep domain
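
A small sketch of the same check in script form. It assumes the usual `tabdump` CSV layout (`"key","value",comments,disable`); the function name and the example domain are placeholders.

```shell
# Sketch: read `tabdump site` style CSV on stdin and print the value of the
# "domain" key; exit status 1 if no domain row is present.
site_domain_from_tabdump() {
    awk -F, '$1 == "\"domain\"" { gsub(/"/, "", $2); print $2; found = 1 }
             END { exit found ? 0 : 1 }'
}

# Example on the management node (placeholder domain):
#   tabdump site | site_domain_from_tabdump || \
#       chdef -t site domain=cluster.example.com
```

If no domain is printed, setting one in the site table and rerunning makedhcp is the first thing to try.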

Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
 ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
 Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193



From:   "Roger Cline" <rcl...@us.ibm.com>
To: xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Date:   09/14/2016 11:58 PM
Subject:Re: [xcat-user] makedhcp creates dhcpd.conf with syntax errors



Hello Yuan,
  Here is the output when I run makedhcp

[root@c01xsn ~]# makedhcp -n
Warning: No 172.17.0.0 specific entry for domain, and no domain defined in
site table.
Warning: No dynamic range specified for 172.17.0.0. If hardware discovery
is being used, a dynamic range is required.
Warning: No 172.31.4.0 specific entry for domain, and no domain defined in
site table.
Warning: No 172.31.4.0 specific entry for nameservers, and no nameservers
defined in site table.
Warning: No dynamic range specified for 172.31.4.0. If hardware discovery
is being used, a dynamic range is required.
Warning: No 172.17.0.0 specific entry for domain, and no domain defined in
site table.
Warning: No dynamic range specified for 172.17.0.0. If hardware discovery
is being used, a dynamic range is required.
Warning: No 172.31.4.0 specific entry for domain, and no domain defined in
site table.
Warning: No 172.31.4.0 specific entry for nameservers, and no nameservers
defined in site table.
Warning: No dynamic range specified for 172.31.4.0. If hardware discovery
is being used, a dynamic range is required.
Warning: No 172.17.16.0 specific entry for domain, and no domain defined in
site table.
Warning: No dynamic range specified for 172.17.16.0. If hardware discovery
is being used, a dynamic range is required.
Warning: No 172.31.0.0 specific entry for domain, and no domain defined in
site table.
Warning: No 172.31.0.0 specific entry for nameservers, and no nameservers
defined in site table.
Warning: No dynamic range specified for 172.31.0.0. If hardware discovery
is being used, a dynamic range is required.

The result is a /etc/dhcp/dhcpd.conf which lacks zone names or labels for
172.17.0.0
172.31.4.0

The other two networks are not local to the service node, c01xsn.
172.17.16.0
172.31.0.0

 I get similar results on the other service node, c05xsn but with
172.17.16.0 and 172.31.0.0 included but missing zone names.

It seems like it's a problem with site definition. I have one site defined
and I've tried both of these attributes with the same result.
dhcpinterfaces=eth1
dhcpinterfaces=c02xmn|eth1;c01xsn|eth1;c05xsn|eth1

Thanks


Roger Cline
802.769.1409
LINUX/UNIX/AIX Systems Support
IBM Systems






From:"Yuan Y Bai" <by...@cn.ibm.com>
To:xCAT Users Mailing list <xcat-user@lists.sourceforge.net>
Date:09/07/2016 10:21 PM
Subject:Re: [xcat-user] makedhcp creates dhcpd.conf with syntax
errors



Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian District,
Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193


Re: [xcat-user] Feature Request active-active xcat configuration - makedhcp failover configuration and ensure all tables have primary key

2016-09-13 Thread Yuan Y Bai
Hi Daniel,

Could you give more details about this feature request?

Is this DHCP HA for an active-active xCAT management node HA solution?


Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
 ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
 Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193



From:   Daniel Letai 
To: xcat-user@lists.sourceforge.net
Date:   09/11/2016 11:33 PM
Subject:[xcat-user] Feature Request active-active xcat configuration -
makedhcp failover configuration and ensure all tables have
primary key



With the advent of galera based my/maria-db clustering and replication, the
last major obstacle to a true active-active xcat configuration is IMHO
removed.

The only other issue is dhcp configuration making use of dhcp failover
protocol:
https://tools.ietf.org/html/draft-ietf-dhc-failover-12
particularly section 5.3
http://manpages.ubuntu.com/manpages/precise/en/man5/dhcpd.conf.5.html#contenttoc6

and https://tools.ietf.org/html/rfc3074

I'd like to see xcat supporting an active-active management node via
additional flags to makedhcp to keep it consistent over upgrades, and
making sure all db tables have PK as per galera requirements.

I'd be willing to help in modifying makedhcp, if required; however, my
knowledge of perl is minimal, so any pointers would be helpful.
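
For reference, this is roughly what the requested output would need to contain. The sketch prints an ISC dhcpd `failover peer` declaration for one side of a pair; it is not something makedhcp generates today, and the peer name, port and timing values are illustrative assumptions.

```shell
# Sketch only: print an ISC dhcpd "failover peer" declaration for one side of
# a pair. The peer name, port and timing values are illustrative assumptions.
dhcp_failover_stanza() {
    # $1: primary|secondary, $2: local address, $3: peer address
    role=$1 local_addr=$2 peer_addr=$3
    printf 'failover peer "xcat-failover" {\n'
    printf '  %s;\n' "$role"
    printf '  address %s;\n' "$local_addr"
    printf '  port 647;\n'
    printf '  peer address %s;\n' "$peer_addr"
    printf '  peer port 647;\n'
    printf '  max-response-delay 60;\n'
    printf '  max-unacked-updates 10;\n'
    if [ "$role" = primary ]; then
        # mclt and split are only set on the primary
        printf '  mclt 3600;\n'
        printf '  split 128;\n'
    fi
    printf '  load balance max seconds 3;\n'
    printf '}\n'
}

# Example (placeholder addresses of the two management nodes):
#   dhcp_failover_stanza primary 10.0.0.1 10.0.0.2
```

Each dynamic pool would additionally need a `failover peer "xcat-failover";` line, which is the part that has to survive every makedhcp run.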


Re: [xcat-user] makedhcp creates dhcpd.conf with syntax errors

2016-09-07 Thread Yuan Y Bai

Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
 ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
 Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193


Re: [xcat-user] makedhcp creates dhcpd.conf with syntax errors

2016-09-07 Thread Yuan Y Bai

Hi Roger,

Could you paste the steps to reproduce and the syntax errors?


Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
 ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
 Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193



From:   "Roger Cline" 
To: xcat-user@lists.sourceforge.net
Date:   09/07/2016 02:46 AM
Subject:[xcat-user] makedhcp creates dhcpd.conf with syntax errors



When I run makedhcp on my service nodes in a hierarchical cluster, it
generates a dhcpd.conf file with syntax errors.
It does not add a name to the first zone definition for each subnet. The
management node puts the domain name there.
I also have a subnet that no longer has any interfaces attached in the
management node's generated dhcpd.conf and can't figure out where makedhcp
is finding it. Can anyone help me get clean makedhcp output?
Thanks

Roger Cline
802.769.1409
LINUX/UNIX/AIX Systems Support
IBM Systems




Re: [xcat-user] Change breaking backword compatibility

2015-08-12 Thread Yuan Y Bai
Christopher,

Thanks for your comments.

I have added the old flag '-m' back to avoid breaking backwards compatibility.

Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
 ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
 Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193


From:   Christopher Samuel sam...@unimelb.edu.au
To:     xCAT Users Mailing list xcat-user@lists.sourceforge.net
Date:   08/12/2015 01:10 PM
Subject: Re: [xcat-user] Change breaking backword compatibility





Hiya,

On 12/08/15 14:51, Yuan Y Bai wrote:

> It is ugly to support both the old flag '-m' and the new flag '-s';
>
> Do you have an important justification for keeping the old flag '-m'?

Not breaking existing installations & setups.

It's good practice to mark options scheduled for removal as deprecated
in documentation first, then provide warnings about impending removal
when they are used, and only after that's been in place for a release or
two commit the change to remove them.
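
A minimal sketch of the "accept but warn" stage of that deprecation path. The thread does not name the command, so the function and the meaning given to the two flags here are hypothetical.

```shell
# Sketch of a deprecated-but-still-accepted option. The function, its options
# and the meaning given to them here are hypothetical.
parse_mode() {
    OPTIND=1
    mode=""
    while getopts "ms" opt; do
        case $opt in
            s) mode=enabled ;;
            m) # old spelling: still honoured, but warn on stderr
               echo "warning: -m is deprecated and will be removed; use -s" >&2
               mode=enabled ;;
            *) return 2 ;;
        esac
    done
    echo "${mode:-disabled}"
}

# Example:
#   parse_mode -m    # warns on stderr, then behaves like -s
```

Because the warning goes to stderr, existing scripts that parse stdout keep working during the deprecation window.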

All the best,
Chris
--
 Christopher SamuelSenior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/  http://twitter.com/vlsci




Re: [xcat-user] otherpkgs

2015-07-31 Thread Yuan Y Bai
Hi Indi,

Could you send us the contents of the otherpkgdir and otherpkglist set in the
osimage definition?

You can also refer to this doc to check your steps:
http://sourceforge.net/p/xcat/wiki/Setting_up_ESSL_and_PESSL_in_a_Stateful_Cluster/

Or: http://sourceforge.net/p/xcat/wiki/IBM_HPC_Stack_in_an_xCAT_Cluster/




Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
 ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
 Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193


From:   Xiao Peng Wang/China/IBM@IBMCN
To:     xCAT Users Mailing list xcat-user@lists.sourceforge.net
Date:   07/31/2015 10:27 AM
Subject: Re: [xcat-user] otherpkgs





It looks like you did not set the correct path for the repository. The repo
path for otherpkgs should be set in the otherpkgdir attribute of the osimage.
If you did set otherpkgdir, check that the package data (repodata) for the
repository has been created successfully.
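
A quick way to check the second point is to look for the metadata file that createrepo writes. A minimal sketch; the helper name and the paths in the example are placeholders.

```shell
# Sketch: succeed only if DIR looks like a yum repository, i.e. createrepo
# has written repodata/repomd.xml into it.
is_yum_repo() {
    [ -f "$1/repodata/repomd.xml" ]
}

# Example on the management node (osimage name and path are placeholders):
#   lsdef -t osimage <osimage_name> -i otherpkgdir,otherpkglist
#   is_yum_repo /path/from/otherpkgdir || createrepo /path/from/otherpkgdir
```

If the otherpkglist entries name a subdirectory, the check probably needs to be run against that subdirectory as well.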

Thanks
Best Regards
--
Wang Xiaopeng (王晓朋)
IBM China System Technology Laboratory
Tel: 86-10-82453455
Email: w...@cn.ibm.com
Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road,
Haidian District Beijing P.R.China 100193


From: indi windsearc...@gmail.com
To: xcat-user@lists.sourceforge.net xcat-user@lists.sourceforge.net
Date: 2015/07/31 04:35
Subject: [xcat-user] otherpkgs



Hi ALL,

xcat 2.8.3
I have a problem installing optional packages using the profile.otherpkgs.pkgs file.
The repository was created with createrepo, but it looks like xcat doesn't see it.
During node installation or update (updatenode) I get the following messages:


IBM_PPE_RTE_LICENSE_ACCEPT=yes IBM_ESSL_LICENSE_ACCEPT=yes
IBM_LOADL_LICENSE_ACCEPT=yes yum -y upgrade
hpcdev: Loaded plugins: product-id, security, subscription-manager
hpcdev: This system is not registered to Red Hat Subscription Management.
You can use subscription-manager to register.
hpcdev: Setting up Upgrade Process
hpcdev: No Packages marked for Update
hpcdev: Warning: the packages  xlmass** xlsmp** xlf** vac** xlc** libxl**
src* ppe_rte_license** pperteman** pperterh6p** ppertesamples** essl**
LoadL-full-license** LoadL-resmgr-full** could not be found in the
repository, falling back to rpm command, did you forget to run createrepo?
hpcdev: IBM_PPE_RTE_LICENSE_ACCEPT=yes IBM_ESSL_LICENSE_ACCEPT=yes
IBM_LOADL_LICENSE_ACCEPT=yes rpm -Uvh --replacepkgs  xlmass** xlsmp** xlf**
vac** xlc** libxl** src* ppe_rte_license** pperteman** pperterh6p**
ppertesamples** essl** LoadL-full-license** LoadL-resmgr-full**
hpcdev: error: File not found by glob: xlmass**
hpcdev: error: File not found by glob: pperterh6p**
hpcdev: Postscript: otherpkgs exited with code 2
hpcdev: Running of postscripts has completed.

On node hpcdev I can install all of those packages using yum without any
problems.

Regards,
Igor

Re: [xcat-user] build kit for essl 5.3.1

2015-03-19 Thread Yuan Y Bai
Hi Bruno,

You can use the essl5.2.0 partial kit to generate a 5.3.1 complete kit on
Red Hat on Power; we assume the partial kit is generic across product versions.
The command looks like:
buildkit [-V|--verbose] addpkgs partial_kit_tar_ball [-p|--pkgdir product_package_directory_list] [-k|--kitversion version] [-r|--kitrelease release]

If there is a problem, untar the partial kit and find buildkit.conf in its
directory; correct buildkit.conf and use it to build the new kit.

The ESSL team has not created an essl5.3.X partial kit for Red Hat, since no
HPC customer uses essl5.3.1 on Red Hat.
They do have an essl5.3.1 partial kit for ubuntu14.04 P8LE.
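
As a concrete illustration of the synopsis above, this sketch only prints the command line rather than running it. The tarball name, package directory and version are placeholders; the options are the ones listed in the synopsis.

```shell
# Sketch: print (not run) a buildkit addpkgs command line. The tarball name,
# package directory and version are placeholders, not real file names.
buildkit_addpkgs_cmd() {
    # $1: partial kit tarball, $2: directory holding the product rpms,
    # $3: kit version to stamp on the complete kit
    printf 'buildkit addpkgs %s --pkgdir %s --kitversion %s\n' "$1" "$2" "$3"
}

buildkit_addpkgs_cmd essl-partial-kit.tar.bz2 /install/essl-5.3.1-rpms 5.3.1
```

Passing the kit version explicitly is what lets a 5.2.0 partial kit be reused for the 5.3.1 packages.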

Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
 ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
 Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193


From:   Bruno Brossoit bru...@ca.ibm.com
To:     xCAT Users Mailing list xcat-user@lists.sourceforge.net
Date:   2015/03/20 04:05
Subject: [xcat-user] build kit for essl 5.3.1







Hi, I got essl 5.3.1 and I see no buildkit available on Fix Central, only
5.2.0.

I am on Linux on Power RHEL 6, but I see this essl 5.3.1 is for RHEL 7. Is this
normal? Can I use it on RHEL 6?

thanks

Bruno




Re: [xcat-user] looking for xcat-ibmhpc rpm

2015-03-15 Thread Yuan Y Bai
Hi Bruno,

The IBM HPC team now uses the new xCAT buildkit method; we have partial/complete
kits for PE, GPFS, the compilers, ESSL and PESSL.
xCAT kits support stateful and stateless nodes; they do not support statelite.
We suggest using stateless instead of statelite.
The latest PE, GPFS, ESSL and PESSL partial/complete kits are owned by each
product itself; the compiler partial kit is kept on SourceForge.
Doc is here: http://sourceforge.net/p/xcat/wiki/IBM_HPC_Software_Kits

If you want to use statelite, you need to use the xcat-ibmhpc 2.8 rpm; if you
can use stateless instead of statelite, you can use a kit.
A kit is easier than xCAT-IBMhpc if you can get a partial or complete kit from
each product team.

Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
 ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
 Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193


From:   Bruno Brossoit bru...@ca.ibm.com
To:
Cc:     xCAT Users Mailing list xcat-user@lists.sourceforge.net
Date:   2015/03/13 20:38
Subject: Re: [xcat-user] looking for xcat-ibmhpc rpm





thanks Yuan,

Ling Gao talked about the new buildkit way, which is better and easier to
use ? buildkit or xcat-ibmhpc ?

Bruno




From:   Yuan Y Bai by...@cn.ibm.com
To:     xCAT Users Mailing list xcat-user@lists.sourceforge.net
Cc:     xCAT Users Mailing list xcat-user@lists.sourceforge.net
Date:   03/13/2015 12:03 AM
Subject: Re: [xcat-user] looking for xcat-ibmhpc rpm
Please respond to xCAT Users Mailing list xcat-user@lists.sourceforge.net








Hi Bruno,

If you want to use xcat-ibmhpc rpm, you can find it from xcat 2.8_linux;
You can install xCAT-IBMhpc-2.8.5-snap201409010229.noarch.rpm with xcat
2.9;


Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193


Re: [xcat-user] install PE on rhel 6.x statelite on power

2015-03-12 Thread Yuan Y Bai
Hi Bruno,

Are you aware of the following docs for statelite?
You can refer to them:
http://sourceforge.net/p/xcat/wiki/Setting_up_all_IBM_HPC_products_in_a_Statelite_or_Stateless_Cluster/
http://sourceforge.net/p/xcat/wiki/Setting_Up_IBM_HPC_Products_on_a_Statelite_or_Stateless_Login_Node/
http://sourceforge.net/p/xcat/wiki/Setting_up_ESSL_and_PESSL_in_a_Statelite_or_Stateless_Cluster/
http://sourceforge.net/p/xcat/wiki/Setting_up_PE_in_a_Statelite_or_Stateless_Cluster/
http://sourceforge.net/p/xcat/wiki/Setting_up_GPFS_in_a_Statelite_or_Stateless_Cluster/


We also have xCAT HPC kits for stateful and stateless clusters, but we have not
tested them in a statelite cluster:
http://sourceforge.net/p/xcat/wiki/IBM_HPC_Software_Kits/

Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
 ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
 Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193


From:   Bruno Brossoit bru...@ca.ibm.com
To:     xCAT Users Mailing list xcat-user@lists.sourceforge.net
Date:   2015/03/12 22:52
Subject: [xcat-user] install PE on rhel 6.x statelite on power







Hi,

I was wondering if you have a recipe or best practice for installing PE (and also
gpfs, pessl, essl) on rhel 6.5 statelite for Linux on Power?

thanks

Bruno




Re: [xcat-user] looking for xcat-ibmhpc rpm

2015-03-12 Thread Yuan Y Bai
Hi Bruno,

If you want to use the xcat-ibmhpc rpm, you can find it in xcat 2.8_linux.
You can install xCAT-IBMhpc-2.8.5-snap201409010229.noarch.rpm with xcat
2.9.


Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
 ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
 Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193


From:   Bruno Brossoit bru...@ca.ibm.com
To:     xCAT Users Mailing list xcat-user@lists.sourceforge.net
Date:   2015/03/12 22:11
Subject: [xcat-user] looking for xcat-ibmhpc rpm







Hi, I have xcat 2.9 installed and I do not see the xcat-ibmhpc rpm.

Has it been replaced?

thanks

Bruno




Re: [xcat-user] xCAT 2.8.4 and HP IPMI

2014-08-03 Thread Yuan Y Bai
Hi Lanae,

Your rsetboot returns "Invalid role", which means the BMC user's privilege
level is insufficient (it is not administrator).
Check the user's privilege first, then change it to
administrator.
To check:
# ipmitool user list 2
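
The check and the change can be sketched as the pair of commands below. The user id and channel are placeholders; the sketch only prints the commands, which would be run in-band on the node (or prefixed with `-I lanplus -H ... -U ... -P ...` for a remote BMC).

```shell
# Sketch: print the ipmitool commands to inspect and raise a BMC user's
# privilege. User id 3 and channel 2 below are placeholders.
bmc_admin_cmds() {
    # $1: BMC user id, $2: LAN channel number
    echo "ipmitool user list $2"
    # privilege level 4 is ADMINISTRATOR in the IPMI specification
    echo "ipmitool user priv $1 4 $2"
}

bmc_admin_cmds 3 2
```

After the change, `ipmitool user list 2` should show ADMINISTRATOR in the privilege column for that user.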




[root@master xCAT]# rsetboot node1903 net
node1903: Error: Invalid role
node1903: Error: Invalid role


Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
 ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
 Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193


From:   Lanae Neild lne...@clemson.edu
To:     xCAT Users Mailing list xcat-user@lists.sourceforge.net
Date:   2014/08/02 02:58
Subject: Re: [xcat-user] xCAT 2.8.4 and HP IPMI





Thanks Jarrod, I agree they shouldn't balk at that.



Lanae Neild
Systems Programmer I
HPC, CCIT, Clemson University
(864)505-4293
lne...@clemson.edu





On Fri, Aug 1, 2014 at 11:53 AM, Jarrod B Johnson jbjoh...@us.ibm.com
wrote:
  Regrettably, I think you have to take up the issue with HP support at this
  point.  ipmitool is a pretty widely known tool, so they shouldn't balk at
  that method of reproducing it like they might balk at xCAT.





  From: Lanae Neild lne...@clemson.edu
  To: xCAT Users Mailing list xcat-user@lists.sourceforge.net
  Date: 08/01/2014 11:31 AM
  Subject: Re: [xcat-user] xCAT 2.8.4 and HP IPMI



  Weird. we have a couple different IPMI firmware versions in the sl250's
  and both behave the same way. What could be  different in xcat 2.7.8 that
  it would work when this command doesn't?


  sl250 from remote:


  [root@master xCAT]# ipmitool -I lanplus -U  -P  -H node1903-man0
  power status
  Error: Unable to establish IPMI v2 / RMCP+ session
  Unable to get Chassis Power Status


  dl165 from remote:


  [root@master xCAT]# ipmitool -I lanplus -U  -P  -H node1572-man0
  power status
  Chassis Power is on



  Lanae Neild
  Systems Programmer I
  HPC, CCIT, Clemson University
  (864)505-4293
  lne...@clemson.edu





  On Fri, Aug 1, 2014 at 11:05 AM, Jarrod B Johnson jbjoh...@us.ibm.com
  wrote:


Hmm, that looks bizarre.  If you try to do 'ipmitool -I lanplus -U
username -P password -H ilo power status' from a remote node,
does ipmitool also fail?  This might be some oddity with your DL160
systems.  Sorry, I don't know offhand; I only know our System x
products directly.


Re: [xcat-user] Error: the SoftLayer::API::SOAP perl module is not installed : xCat on Softlayer

2014-05-24 Thread Yuan Y Bai
Hi, Rachit,

You should install the dependencies for the SoftLayer API.

1) Download the Perl library dependencies for the SoftLayer API. I used the
following; if you find these are not enough, download the others from the web:

wget http://search.cpan.org/CPAN/authors/id/P/PH/PHRED/SOAP-Lite-1.11.tar.gz
wget http://search.cpan.org/CPAN/authors/id/A/AD/ADAMK/Class-Inspector-1.28.tar.gz
wget http://search.cpan.org/CPAN/authors/id/L/LE/LEONT/Test-Harness-3.30.tar.gz
wget http://search.cpan.org/CPAN/authors/id/B/BI/BINGOS/ExtUtils-MakeMaker-6.96.tar.gz
wget http://search.cpan.org/CPAN/authors/id/M/MO/MONS/XML-Hash-LX-0.0603.tar.gz
wget http://search.cpan.org/CPAN/authors/id/A/AN/ANDK/CPAN-2.05.tar.gz
wget http://search.cpan.org/CPAN/authors/id/D/DA/DAGOLDEN/CPAN-Meta-Requirements-2.125.tar.gz

2) Install the Perl library dependencies. They may need to be installed in this order:
ExtUtils-MakeMaker, Test-Harness, CPAN-Meta-Requirements, CPAN, Class-Inspector, SOAP-Lite, XML-Hash-LX

For every library, do the following:

# gunzip CPAN-Meta-Requirements-2.125.tar.gz
# tar xvf CPAN-Meta-Requirements-2.125.tar
# cd CPAN-Meta-Requirements-2.125
# perl Makefile.PL
# make
# make install
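
The per-library steps can be looped over the tarballs in the suggested order. This is a sketch that only prints the steps so they can be reviewed first; the function name is made up, and it assumes each tarball unpacks into a directory named after the file.

```shell
# Sketch: print the build steps for each CPAN tarball, in the order given.
# Pipe the output to sh (as root) to actually run the steps.
perl_lib_build_steps() {
    for tarball in "$@"; do
        dir=${tarball%.tar.gz}          # e.g. SOAP-Lite-1.11
        echo "tar xzf $tarball"
        echo "(cd $dir && perl Makefile.PL && make && make install)"
    done
}

perl_lib_build_steps \
    ExtUtils-MakeMaker-6.96.tar.gz \
    Test-Harness-3.30.tar.gz \
    CPAN-Meta-Requirements-2.125.tar.gz \
    CPAN-2.05.tar.gz \
    Class-Inspector-1.28.tar.gz \
    SOAP-Lite-1.11.tar.gz \
    XML-Hash-LX-0.0603.tar.gz
```

Each build runs in a subshell, so a failed `cd` cannot leave the following steps running in the wrong directory.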


Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
 ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
 Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193


From:   Rachit Arora2 rachi...@in.ibm.com
To:     xcat-user@lists.sourceforge.net
Cc:     Dharmesh K Jain dharj...@in.ibm.com, Shrinivas S Kulkarni shrinivas.kulka...@in.ibm.com
Date:   2014/05/23 20:28
Subject: [xcat-user] Error: the SoftLayer::API::SOAP perl module is not installed : xCat on Softlayer





Hi All

We are trying xCat on Softlayer to provision nodes on Softlayer in order to
deploy our product on SoftLayer.

I tried the steps to install and configure xCat on SoftLayer

Here is the Summary of steps i did

1. Get a VM Server from Softlayer
2. Set up xCat-repo and xcat-dep-repo and use yum install to

install xcat managment node on this server
3. Install xCat-Softlayer rpm on this node
4. git clone for softlayer client api at folder location

/usr/local/lib
5. Create .slconfig in /root directory with details required

sl urser is , api key and apidir

but when i try to do getslnodes command i get following error

Error: the SoftLayer::API::SOAP perl module is not installed. Download it
using 'git clone https://github.com/softlayer/softlayer-api-perl-client';
and put the directory in ~/.slconfig .


But softlayer-api-perl-client is there in the expected folder.

Please let me know what I missed and how I can resolve this.


Re: [xcat-user] Error: the SoftLayer::API::SOAP perl module is not installed : xCat on Softlayer

2014-05-24 Thread Yuan Y Bai
Hi Lissa,

The issue is caused by missing Perl library dependencies for
softlayer-api-perl-client.
The Perl library dependencies of softlayer-api-perl-client are not shipped
with the SoftLayer API, and we did not describe this in our doc. I think
we should mention the SoftLayer API Perl library dependencies in our doc;
although they are not xCAT dependencies, they need to be present for
getslnodes to work with softlayer-api-perl-client.
I will discuss this next week with the China team.
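
A quick way to see which of those Perl libraries are missing on the management node. This is a sketch: the function name and the `PERL_BIN` override are made up, and the module list in the example is an assumption based on the tarballs named elsewhere in this thread.

```shell
# Sketch: print every named Perl module that fails to load.
# PERL_BIN is a hypothetical override for the interpreter path.
missing_perl_modules() {
    for mod in "$@"; do
        # "perl -MModule -e 1" exits non-zero when the module cannot be loaded
        "${PERL_BIN:-perl}" "-M$mod" -e 1 2>/dev/null || echo "$mod"
    done
}

# Example:
#   missing_perl_modules SOAP::Lite Class::Inspector XML::Hash::LX
```

An empty result means all of the listed modules load, so the getslnodes error would then have to come from somewhere else, such as the apidir setting.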


Best Regards
--
Yuan Bai (白媛)

CSTL HPC System Management Development
Tel:86-10-82451401
E-mail: by...@cn.ibm.com
Address: IBM ZGC Campus. Ring Building 28,
 ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road, Haidian
District,
 Beijing P.R.China 100193

IBM环宇大厦
北京市海淀区东北旺西路8号,中关村软件园28号楼
邮编:100193


From:   Lissa Valletta lis...@us.ibm.com
To:     xCAT Users Mailing list xcat-user@lists.sourceforge.net
Cc:     Dharmesh K Jain dharj...@in.ibm.com, Shrinivas S Kulkarni shrinivas.kulka...@in.ibm.com, xcat-user@lists.sourceforge.net
Date:   2014/05/23 20:46
Subject: Re: [xcat-user] Error: the SoftLayer::API::SOAP perl module is not installed : xCat on Softlayer





xCAT in Softlayer is still under development.   It really will not be
supported until our 2.9 release later this year.   I will email you and we
can discuss what you can do.

Lissa K. Valletta
8-3/B10
Poughkeepsie, NY 12601
(tie 293) 433-3102




From: Rachit Arora2 rachi...@in.ibm.com
To: xcat-user@lists.sourceforge.net,
Cc: Dharmesh K Jain dharj...@in.ibm.com, Shrinivas S Kulkarni
shrinivas.kulka...@in.ibm.com
Date: 05/23/2014 08:36 AM
Subject: [xcat-user] Error: the SoftLayer::API::SOAP perl module is not
installed : xCat on Softlayer



Hi All

We are trying xCat on Softlayer to provision nodes on Softlayer in order to
deploy our product on SoftLayer.

I tried the steps to install and configure xCAT on SoftLayer.

Here is a summary of the steps I did:

1. Get a VM server from SoftLayer
2. Set up xCat-repo and xcat-dep-repo and use yum install to install the xCAT management node on this server
3. Install the xCat-Softlayer rpm on this node
4. git clone the softlayer client api into the folder /usr/local/lib
5. Create .slconfig in the /root directory with the required details: SL user id, api key and apidir

But when I try to run the getslnodes command I get the following error:

Error: the SoftLayer::API::SOAP perl module is not installed. Download it
using 'git clone https://github.com/softlayer/softlayer-api-perl-client';
and put the directory in ~/.slconfig .


But softlayer-api-perl-client is there in the