Re: [xcat-user] Getdestiny failing - can't open '/tmp/dhcpserver'

2012-08-07 Thread Josh Nielsen
Okay, it turns out that the xnba-undi package was outdated: for some
reason it was not updated from xcat-dep when we installed xCAT 2.7.3. I
guess it is not a required dependency, so it was not pulled in
automatically with the new xCAT. I can boot Genesis now after updating
that package.

Also, it looks like the source rpm for that xnba-undi package just adds two
files:

/tftpboot/xcat/xnba.efi
/tftpboot/xcat/xnba.kpxe

Both files are now timestamped Feb 6 2012. Previously my xnba.kpxe was
dated Aug 24 2009, and I did not have xnba.efi at all. Why are these
files needed with genesis? For my own edification, does genesis run *on
top* of xnba, or does it boot an entirely new image once xnba fetches the
genesis kernel images?
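
A stale or missing boot file like the ones described above can be spotted by comparing modification times against a cutoff. The following is only a rough sketch: the two paths come from the message, while the function name `report_boot_files` and the cutoff date are assumptions chosen for illustration, not anything xCAT provides.

```python
import os
from datetime import datetime


def report_boot_files(paths, newer_than):
    """Return (path, status) pairs; status is 'missing', 'stale' or 'ok'.

    newer_than is a Unix timestamp. Files last modified before it are
    reported as stale.
    """
    results = []
    for path in paths:
        if not os.path.exists(path):
            results.append((path, "missing"))
        elif os.path.getmtime(path) < newer_than:
            results.append((path, "stale"))
        else:
            results.append((path, "ok"))
    return results


if __name__ == "__main__":
    # Paths are taken from the message; the cutoff date is an assumption.
    cutoff = datetime(2012, 1, 1).timestamp()
    boot_files = ["/tftpboot/xcat/xnba.efi", "/tftpboot/xcat/xnba.kpxe"]
    for path, status in report_boot_files(boot_files, cutoff):
        print(f"{status:8} {path}")
```

Checking the installed package version with the system package manager is the more direct route; a timestamp check like this is only a quick hint that a file was left behind by an upgrade.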

Thanks,
Josh

On Tue, Aug 7, 2012 at 5:24 PM, Josh Nielsen jniel...@hudsonalpha.com wrote:

 Hi Jarrod,

 Okay, I upgraded to xCAT 2.7.3 and installed the xCAT-genesis-x86_64 and
 elilo-xCAT RPMs and I reran mknb x86_64 to recreate the
 /tftpboot/xcat/xnba/nets files and it also reported Creating
 genesis.fs.x86_64.gz in /tftpboot/xcat. But I still get a missing NIC
 driver error when I PXE boot (this time explicit - confirming this is the
 problem). Upon PXE boot the clients get yaboot first, then load xnba.kpxe,
 which according to my /etc/dhcpd.conf file is set up to query the
 appropriate nets file:

 if option user-class-identifier = "xNBA" and option client-architecture = 00:00 { #x86, xCAT Network Boot Agent
    always-broadcast on;
    filename = "http://10.20.0.1/tftpboot/xcat/xnba/nets/10.20.0.0_16";
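
The nets file name in the `filename` line looks like the network address and the prefix length joined by an underscore. That reading is inferred only from the path shown here, not from xCAT documentation, so treat it as an assumption. Under that assumption, a small sketch (the helper name `nets_file_name` is made up) shows which file a given client address would map to:

```python
import ipaddress


def nets_file_name(address, prefix_len):
    """Build '<network address>_<prefix length>' for the network holding address."""
    # strict=False lets us pass a host address and have it masked down
    # to the enclosing network.
    network = ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)
    return f"{network.network_address}_{network.prefixlen}"


# The client address from the Apache log, on a /16 network.
print(nets_file_name("10.20.253.236", 16))  # 10.20.0.0_16
```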

 Inside that file is:

 [root@x3650-head01 etc]# cat /tftpboot/xcat/xnba/nets/10.20.0.0_16
 #!gpxe
 imgfetch -n kernel http://${next-server}/tftpboot/xcat/genesis.kernel.x86_64 xcatd=10.20.0.1:3001 BOOTIF=01-${netX/machyp}
 imgfetch -n nbfs http://${next-server}/tftpboot/xcat/genesis.fs.x86_64.gz
 imgload kernel
 imgexec kernel
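
The `BOOTIF=01-${netX/machyp}` argument appears to follow the PXELINUX convention: `01` (the ARP hardware type for Ethernet) followed by the MAC address of the boot NIC in lowercase hex, separated by hyphens. That `machyp` expands to the hyphenated MAC is an assumption based on the name. The sketch below builds the value one would expect to see on the kernel command line; the function name and the sample MAC are hypothetical.

```python
import string


def bootif_value(mac):
    """Format a MAC as '01-' plus hyphen-separated lowercase hex octets."""
    octets = mac.replace("-", ":").split(":")
    hexdigits = set(string.hexdigits)
    if len(octets) != 6 or any(
        len(octet) != 2 or not set(octet) <= hexdigits for octet in octets
    ):
        raise ValueError(f"not a MAC address: {mac!r}")
    return "01-" + "-".join(octet.lower() for octet in octets)


# Hypothetical MAC address, used only for illustration.
print(bootif_value("00:1A:64:AB:CD:EF"))  # 01-00-1a-64-ab-cd-ef
```

Comparing this against the BOOTIF value in /proc/cmdline on the booted node is one way to check whether the argument itself is at fault.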

 In my apache log I see:

 10.20.253.236 - - [07/Aug/2012:13:36:23 -0500] GET /tftpboot/xcat/xnba/nets/10.20.0.0_16 HTTP/1.0 200 235 - gPXE/0.9.7
 10.20.253.236 - - [07/Aug/2012:13:36:23 -0500] GET /tftpboot/xcat/genesis.kernel.x86_64 HTTP/1.0 200 3942032 - gPXE/0.9.7
 10.20.253.236 - - [07/Aug/2012:13:36:23 -0500] GET /tftpboot/xcat/genesis.fs.x86_64.gz HTTP/1.0 200 20210204 - gPXE/0.9.7
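
To confirm from the Apache log that a node fetched the script, the kernel and the initial filesystem in turn, the entries can be filtered by client address. This is a minimal sketch: the regular expression assumes the common log layout seen in the three entries above (with or without quotes around the request), and `fetched_files` is a made-up helper name.

```python
import re

# Matches: client ident user [time] METHOD path HTTP/x.y status size
# Quotes around the request are optional because the archive dropped them.
LOG_LINE = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"?(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+"? '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def fetched_files(log_text, client):
    """Return (path, status, size) for every request made by client."""
    hits = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group("client") == client:
            size = match.group("size")
            hits.append((
                match.group("path"),
                int(match.group("status")),
                0 if size == "-" else int(size),
            ))
    return hits


sample = """\
10.20.253.236 - - [07/Aug/2012:13:36:23 -0500] GET /tftpboot/xcat/xnba/nets/10.20.0.0_16 HTTP/1.0 200 235 - gPXE/0.9.7
10.20.253.236 - - [07/Aug/2012:13:36:23 -0500] GET /tftpboot/xcat/genesis.kernel.x86_64 HTTP/1.0 200 3942032 - gPXE/0.9.7
10.20.253.236 - - [07/Aug/2012:13:36:23 -0500] GET /tftpboot/xcat/genesis.fs.x86_64.gz HTTP/1.0 200 20210204 - gPXE/0.9.7
"""

for path, status, size in fetched_files(sample, "10.20.253.236"):
    print(status, size, path)
```

Three 200 responses in this order suggest that the HTTP side of the boot is working and that the failure happens after the kernel starts.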

 Then genesis boots and (after I removed 'quiet' from the kernel arguments)
 does some initial boot checks, and then it goes into a loop of dumping the
 help/syntax screen for grep to the screen (which indicates to me that what
 it is grepping for is failing - possibly /tmp/dhcpserver like before).
 Eventually it gives up and prints this to the screen:

 ERROR Unable to find boot device (maybe the nbroot is missing the driver for your nic?)

 At that point it just sits there, and does not try anything else.

 What have I done wrong here? Is the BOOTIF argument to the genesis
 kernel perhaps wrong? Also, why is it looking for nbroot? There was a
 previous bootloader in /opt/xcat/share/xcat/netboot/x86_64/nbroot/ but it
 should be looking in /opt/xcat/share/xcat/netboot/genesis/x86_64/ since it
 is using genesis, correct?

 Any ideas?

 -Josh


 On Wed, Jul 25, 2012 at 1:37 PM, Jarrod B Johnson jbjoh...@us.ibm.com wrote:

 Hmm, with xcat 2.7.3 you should be pulling in the 'xCAT-genesis' packages
 that replace the environment with something newer that has the appropriate
 nic drivers...

[xcat-user] Getdestiny failing - can't open '/tmp/dhcpserver'

2012-07-25 Thread Josh Nielsen
Hello,

I have some new IBM System X DX360M4 nodes (all our previous ones were
DX360M3s) that I am trying to autodiscover with xCAT and I am running into
the same problem as in this mail thread:
http://www.mail-archive.com/xcat-user@lists.sourceforge.net/msg01267.html.
Essentially the node boots up, does a dhcpdiscover, and grabs a generic
bootloader (in my case yaboot but it also works with pxelinux.0) which then
reinitiates the dhcpdiscover and queries again for the appropriate
bootloader from the file in the xcat/xnba/nets/ folder and is served and
boots xnba.kpxe. At this point I believe the xnba image is supposed to load
and execute the autodiscovery process which includes the getdestiny script,
but all I am seeing are the messages:

cat: can't open '/tmp/dhcpserver': No such file or directory
grep: /tmp/destiny: No such file or directory
grep: /tmp/destiny: No such file or directory

One reply by Jarrod Johnson to that email thread above mentioned a possible
network driver issue and suggested using Genesis from xCAT v2.7. According
to 'xcatconfig -v' I am running Version 2.3.1 of xCAT and I am running
on Centos 5 (2.6.18-128.el5). Is there a way to get this working with my
current version of xCAT and OS (inject drivers somehow?), or do I need to
update to v2.7 of xCAT (does that require Centos 6)? In any case, is a
network driver the most likely explanation for what I am seeing? I have
tried everything that I can think of from the switch side to make sure SNMP
is enabled and the port definitions are correct in the switch table for
autodiscovery. Any ideas?

Thanks,
Josh Nielsen
--
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
___
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user


Re: [xcat-user] Getdestiny failing - can't open '/tmp/dhcpserver'

2012-07-25 Thread Josh Nielsen
Okay, we will be doing an upgrade of xCAT soon. I just wanted to double
check.

P.S. Can you take a crack at answering that gPXE/UNDI question?

On Wed, Jul 25, 2012 at 1:37 PM, Jarrod B Johnson jbjoh...@us.ibm.com wrote:

 Hmm, with xcat 2.7.3 you should be pulling in the 'xCAT-genesis' packages
 that replace the environment with something newer that has the appropriate
 nic drivers...


Re: [xcat-user] Getdestiny failing - can't open '/tmp/dhcpserver'

2012-07-25 Thread Jarrod B Johnson
So the issue is that the xnba can work fine. The problem is the linux
image that subsequently loads lacked the driver update. In an ideal world,
we make an efi executable that does everything in UEFI that is moderately
future proof. In a practical world we work with the much richer linux
toolset for lack of time to develop EFI resources.

-Josh Nielsen jniel...@hudsonalpha.com wrote: -

To: xCAT Users Mailing list xcat-user@lists.sourceforge.net
From: Josh Nielsen jniel...@hudsonalpha.com
Date: 07/25/2012 02:58PM
Subject: Re: [xcat-user] Getdestiny failing - can't open '/tmp/dhcpserver'

Thanks for the information Lissa.

I do have another more general question as well though. It is regarding
the xnba boot image itself, which appears to be based off of gPXE. The idea
of gPXE is that it is UNDI-capable and does not have to use TFTP to serve
the images correct? And looking at the xnba.kpxe image/bootloader extension
of '.kpxe' it looks like that is reserved specifically for images that load
UNDI but offload PXE. According to this page
(http://etherboot.org/wiki/gpxe_imagetypes) the extensions for images
break down like this:

- .pxe is an image designed to be chainloaded, unloading both the
underlying PXE and UNDI code sections.
- .kpxe is a PXE image that keeps UNDI loaded and unloads PXE
- .kkpxe is a PXE image that keeps PXE+UNDI loaded and return to PXE
(instead of int 18h).

So does xnba.kpxe try to interact with the NIC card via UNDI once it
loads? If so since UNDI is an abstracted API, and unless the API has been
updated on the newer NIC cards, shouldn't it work with just about any
network card regardless? From this PXE chainloading page
(http://etherboot.org/wiki/pxechaining) it says:

"When chainloading gPXE from PXE, gPXE can use this API (instead of
loading an hardware driver). This way, you're getting support for network
controllers that are not natively supported by gPXE. Some network
controllers have improved performance when using the UNDI driver over the
vendor specific gPXE driver."

I'm just curious about some of the theory behind this because I'm only
used to good ol' legacy PXE. All this gPXE and UNDI stuff is new to me.

On Wed, Jul 25, 2012 at 12:57 PM, Lissa Valletta lis...@us.ibm.com wrote:

Version 2.3.x has not been supported for a long time. You need to be on
the latest level of 2.6 for support, but you might as well go to the
latest release which is 2.7.3. You can upgrade xCAT and stay at your
current Centos level.


Lissa K. Valletta
2-3/T12
Poughkeepsie, NY 12601
(tie 293) 433-3102


Re: [xcat-user] Getdestiny failing - can't open '/tmp/dhcpserver'

2012-07-25 Thread Josh Nielsen
Okay, thanks for the explanation.

Cheers,
Josh

On Wed, Jul 25, 2012 at 2:28 PM, Jarrod B Johnson jbjoh...@us.ibm.com wrote:

 So the issue is that the xnba can work fine.  The problem is the linux
 image that subsequently loads lacked the driver update.  In an ideal world,
 we make an efi executable that does everything in UEFI that is moderately
 future proof.  In a practical world we work with the much richer linux
 toolset for lack of time to develop EFI resources.

 -Josh Nielsen jniel...@hudsonalpha.com wrote: -

 To: xCAT Users Mailing list xcat-user@lists.sourceforge.net
 From: Josh Nielsen jniel...@hudsonalpha.com
 Date: 07/25/2012 02:58PM
 Subject: Re: [xcat-user] Getdestiny failing - can't open '/tmp/dhcpserver'

 Thanks for the information Lissa.

 I do have another more general question as well though. It is regarding
 the xnba boot image itself, which appears to be based off of gPXE. The idea
 of gPXE is that it is UNDI-capable and does not have to use TFTP to serve
 the images correct? And looking at the xnba.kpxe image/bootloader extension
 of '.kpxe' it looks like that is reserved specifically for images that load
 UNDI but offload PXE. According to this page (
 http://etherboot.org/wiki/gpxe_imagetypes) the extensions for images
 break down like this:

 - .pxe is an image designed to be chainloaded, unloading both the
 underlying PXE and UNDI code sections.
 - .kpxe is a PXE image that keeps UNDI loaded and unloads PXE
 - .kkpxe is a PXE image that keeps PXE+UNDI loaded and return to PXE
 (instead of int 18h).
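
The three extensions listed above can be written down as a small lookup table, which makes the difference between them easier to see. This sketch only restates the summary quoted from the etherboot.org page; the table and the helper name `describe_image` are illustrative and not part of gPXE or xCAT.

```python
import os

# Which firmware layers stay loaded, per extension, as summarized from
# the etherboot.org image types page quoted above.
GPXE_IMAGE_TYPES = {
    ".pxe": {"keeps_undi": False, "keeps_pxe": False},
    ".kpxe": {"keeps_undi": True, "keeps_pxe": False},
    ".kkpxe": {"keeps_undi": True, "keeps_pxe": True},
}


def describe_image(filename):
    """Look up what a gPXE image keeps loaded, based on its extension."""
    ext = os.path.splitext(filename)[1].lower()
    try:
        return GPXE_IMAGE_TYPES[ext]
    except KeyError:
        raise ValueError(f"unknown gPXE image extension: {ext!r}") from None


print(describe_image("xnba.kpxe"))  # {'keeps_undi': True, 'keeps_pxe': False}
```

Read this way, xnba.kpxe leaves the firmware UNDI driver in place, which fits the point made elsewhere in the thread that the boot agent itself can work on a new NIC while the Linux image loaded afterwards still needs its own driver.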

 So does xnba.kpxe try to interact with the NIC card via UNDI once it
 loads? If so since UNDI is an abstracted API, and unless the API has been
 updated on the newer NIC cards, shouldn't it work with just about any
 network card regardless? From this PXE chainloading page (
 http://etherboot.org/wiki/pxechaining) it says:

 "When chainloading gPXE from PXE, gPXE can use this API (instead of
 loading an hardware driver). This way, you're getting support for network
 controllers that are not natively supported by gPXE. Some network
 controllers have improved performance when using the UNDI driver over the
 vendor specific gPXE driver."

 I'm just curious about some of the theory behind this because I'm only
 used to good ol' legacy PXE. All this gPXE and UNDI stuff is new to me.


 On Wed, Jul 25, 2012 at 12:57 PM, Lissa Valletta lis...@us.ibm.com wrote:

 Version 2.3.x has not been supported for a long time. You need to be on
 the latest level of 2.6 for support, but you might as well go to the
 latest release which is 2.7.3. You can upgrade xCAT and stay at your
 current Centos level.

 Lissa K. Valletta
 2-3/T12
 Poughkeepsie, NY 12601
 (tie 293) 433-3102


