Re: [xcat-user] re-discovering node after motherboard replacement
Should be ready to be nodeset to do something else. 'standby' in this case means 'completed everything supposed to happen, awaiting instructions'.

If you put in hard drives with the OS still working:
    nodeset node boot
If the hard drive needs a reinstall:
    nodeset node osimage
If stateless:
    nodeset node netboot

From: Damir Krstic damir.krs...@gmail.com
To: xCAT Users Mailing list xcat-user@lists.sourceforge.net
Date: 10/28/2013 10:42 AM
Subject: [xcat-user] re-discovering node after motherboard replacement

One of our GPU nodes had a bad motherboard and we had it replaced a few days ago. After the motherboard was replaced we ran the rmnodecfg script and the node was re-discovered:

    mgt xCAT node discovery: qgpu0020 has been discovered

I can see the new MAC address in the mac table. However, we are running into issues reprogramming the BMC. It never finishes. The console screen displays:

    Received request to retry in a bit, will call xCAT back in amount seconds.

lsdef on this node shows that the node is in standby mode (not sure what that means):

    chain=runcmd=bmcsetup,standby
    currchain=standby
    currstate=standby

Here is the content of the pxelinux file for this node:

    #standby
    DEFAULT xCAT
    LABEL xCAT
    KERNEL xcat/genesis.kernel.x86_64
    APPEND initrd=xcat/genesis.fs.x86_64.gz quiet console=tty0 console=ttyS0,115200 xcatd=172.20.0.1:3001 destiny=standby nouveau.modeset=0
    IPAPPEND 2

I hope you can help.
Damir

___
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user
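A minimal sketch of the decision described in the reply above, written as a dry run: it reads the destiny= value from the APPEND line of the pxelinux entry quoted in the original report and, if the node is in standby, prints (does not execute) the matching nodeset command. The function names destiny_of and nodeset_cmd_for and the state labels os-intact, needs-reinstall and stateless are hypothetical, chosen only for illustration; the three nodeset commands are the ones given in the thread.

```shell
#!/bin/sh
# Hypothetical helper: read a pxelinux entry on stdin and print the value
# of its destiny= kernel parameter (first match only).
destiny_of() {
    sed -n 's/.*[[:space:]]destiny=\([^[:space:]]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: print the nodeset command that matches the state of
# the node's storage. It only prints the command; nothing is executed.
#   $1 = node name
#   $2 = os-intact | needs-reinstall | stateless  (labels are assumptions)
nodeset_cmd_for() {
    node=$1
    state=$2
    case "$state" in
        os-intact)       echo "nodeset $node boot" ;;
        needs-reinstall) echo "nodeset $node osimage" ;;
        stateless)       echo "nodeset $node netboot" ;;
        *) echo "unknown state: $state" >&2; return 1 ;;
    esac
}

# APPEND line copied from the pxelinux file in the original report.
entry='APPEND initrd=xcat/genesis.fs.x86_64.gz quiet console=tty0 console=ttyS0,115200 xcatd=172.20.0.1:3001 destiny=standby nouveau.modeset=0'

# A node in standby has finished its chain and is waiting for instructions.
if [ "$(printf '%s\n' "$entry" | destiny_of)" = "standby" ]; then
    nodeset_cmd_for qgpu0020 os-intact
fi
```

Printing the command instead of running it keeps the sketch usable on a machine without xCAT installed; on a management node the printed line would be run by hand after checking it.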
Re: [xcat-user] re-discovering node after motherboard replacement
OK - I'll try booting from the hard drive and see if that works, but... the BMC never got programmed. I can't reach this node with any of the rcons/rpower commands, and if I try to telnet to its BMC port it fails. I'll keep poking to see if there are any other errors related to programming of the BMC.

Thanks,
Damir

On Mon, Oct 28, 2013 at 9:46 AM, Jarrod B Johnson jbjoh...@us.ibm.com wrote:

Should be ready to be nodeset to do something else. [...]
Re: [xcat-user] re-discovering node after motherboard replacement
Perhaps the CMOS settings need to be adjusted. If you have a shared NIC (eth0/IMM), make sure the appropriate settings in CMOS indicate a shared NIC. What is the server machine type?

Thomas Alandt
WW Test Engineer
Complex Solutions
IBM-ISC

From: Damir Krstic damir.krs...@gmail.com
To: xCAT Users Mailing list xcat-user@lists.sourceforge.net
Date: 10/28/2013 11:01 AM
Subject: Re: [xcat-user] re-discovering node after motherboard replacement

OK - I'll try booting from the hard drive and see if that works, but... [...]
Re: [xcat-user] re-discovering node after motherboard replacement
I'll bet that ipmi.bmcport is not set. If that is the case, set:

    nodech noderange ipmi.bmcport=0

From now on, the IMM will move to the right port automatically during bmcsetup.

From: Damir Krstic damir.krs...@gmail.com
To: xCAT Users Mailing list xcat-user@lists.sourceforge.net
Date: 10/28/2013 11:18 AM
Subject: Re: [xcat-user] re-discovering node after motherboard replacement

Machine type is 7912AC1. Shared NIC... I can ssh to it while genesis is running but can't telnet to the shared BMC port. I'll go and look at the CMOS settings to see if anything out of the ordinary is set in the IMM section.

Damir

On Mon, Oct 28, 2013 at 10:05 AM, Jarrod B Johnson jbjoh...@us.ibm.com wrote:

oh, how is it wired? Can you ssh to the node? Is the BMC dedicated or shared?

    nodels node ipmi.bmcport
    ssh node ipmitool lan print 1
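The check-then-set step suggested in this message can be sketched as a dry run. The function name bmcport_fix_plan is hypothetical, and the caller is assumed to have already looked up the current ipmi.bmcport value (for example with the nodels command quoted in the thread) and to pass it in as a plain string, empty if unset. The nodech line is the one given in the thread; the follow-up nodeset ... runcmd=bmcsetup line is an assumption about how bmcsetup would be re-run and is not stated by any poster.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the commands to apply when
# ipmi.bmcport is unset for a node.
#   $1 = node name
#   $2 = current ipmi.bmcport value, "" if unset (looked up by the caller)
bmcport_fix_plan() {
    node=$1
    bmcport=$2
    if [ -z "$bmcport" ]; then
        # Fix suggested in the thread.
        echo "nodech $node ipmi.bmcport=0"
        # Assumption, not from the thread: re-run bmcsetup so the new
        # setting is applied to the IMM.
        echo "nodeset $node runcmd=bmcsetup"
    else
        echo "# ipmi.bmcport is already set to $bmcport; check wiring and IMM settings instead"
    fi
}

# Example: plan for the node from the thread with bmcport unset.
bmcport_fix_plan qgpu0020 ""
```

Keeping the lookup outside the function avoids guessing at the exact output format of nodels, which the thread does not show.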