Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-22 Thread Lissa Valletta
xCAT-nbroot-core* was replaced by xCAT-genesis-* in xCAT 2.8.

Lissa K. Valletta
8-3/B10
Poughkeepsie, NY 12601
(tie 293) 433-3102





From:   Xiao Peng Wang w...@cn.ibm.com
To: xCAT Users Mailing list xcat-user@lists.sourceforge.net,
Cc: xCAT Users Mailing list xcat-user@lists.sourceforge.net
Date:   01/22/2014 02:58 AM
Subject:Re: [xcat-user] Frustrating time with sequential node discovery



Why do you say that you need nbk.x86_64? Is this file listed in the
/tftpboot/xcat/xnba/nets/?

With the latest xCAT build, it needs /tftpboot/xcat/genesis.kernel.x86_64
instead of nbk.*

Thanks
Best Regards
--
Wang Xiaopeng (王晓朋)
IBM China System Technology Laboratory
Tel: 86-10-82453455
Email: w...@cn.ibm.com
Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road,
Haidian District Beijing P.R.China 100193


From: Jonathan Mills jonmi...@renci.org
To: xCAT Users Mailing list xcat-user@lists.sourceforge.net,
Date: 2014/01/22 14:10
Subject: Re: [xcat-user] Frustrating time with sequential node discovery



It would seem to me that what I am missing is the whole of the
xCAT-nbroot infrastructure...because it isn't part of xcat-core, nor
xcat-dep.  So I didn't grab it.  But it just so happens...you need it.

The file

/tftpboot/xcat/nbk.x86_64

is provided by the RPM xCAT-nbkernel-x86_64, which is missing from my
yum repo mirror and from my hosts.


Anything else I'm missing?  Hopefully if I grab correct copies of
xCAT-nbkernel and xCAT-nbroot (or xCAT-nbroot2?) then node discovery
will actually work.

On 1/22/14, 12:08 AM, Xiao Peng Wang wrote:
 Both Josh and Russell are correct.

 xNBA is a customized PXE and genesis is an xCAT-customized diskless Linux
 system that runs discovery and other tasks like 'bmcsetup'. It does not
 need the /tftpboot/pxelinux.cfg/.* to load genesis.

 For discovery, if a node is not defined in xCAT, the DHCP configuration
 in /etc/dhcp/dhcpd.conf or /etc/dhcpd.conf is used to reply to the DHCP
 request from the undiscovered node.

 Your dhcpd.conf should have the following section for your deployment
 network. If not, run 'makedhcp -n' to recreate your dhcpd.conf.

   if option user-class-identifier = "xNBA" and option client-architecture = 00:00 { #x86, xCAT Network Boot Agent
     always-broadcast on;
     filename = "http://10.1.0.207/tftpboot/xcat/xnba/nets/10.1.0.0_16";
   } else if option user-class-identifier = "xNBA" and option client-architecture = 00:09 { #x86, xCAT Network Boot Agent
     filename = "http://10.1.0.207/tftpboot/xcat/xnba/nets/10.1.0.0_16.uefi";
   } else if option client-architecture = 00:00 { #x86
     filename "xcat/xnba.kpxe";
   } else if option vendor-class-identifier = "Etherboot-5.4" { #x86
     filename "xcat/xnba.kpxe";
   } else if option client-architecture = 00:07 { #x86_64 uefi
     filename "xcat/xnba.efi";
   } else if option client-architecture = 00:09 { #x86_64 uefi alternative id
     filename "xcat/xnba.efi";
   }

 During the boot process of an undiscovered node, dhcpd tells the
 node to load xcat/xnba.kpxe first and then the configuration file
 http://<xcat mn>/tftpboot/xcat/xnba/nets/10.1.0.0_16. xNBA then
 loads genesis.

 Take a look at the syslog to see whether the xnba was downloaded
 successfully from tftp server. And look into the httpd log to see
 whether the genesis has been downloaded successfully.
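
 Before digging through logs, it may help to confirm that the files named
 in this message exist at all. The following is only a sketch: the
 function name check_discovery_files is made up for illustration, and it
 assumes the default /tftpboot layout described above (xcat/xnba.kpxe,
 xcat/xnba.efi, xcat/genesis.kernel.x86_64 and at least one per-network
 file under xcat/xnba/nets/).

```shell
# check_discovery_files [TFTPROOT]
# Hypothetical helper: reports whether the boot files mentioned in the
# thread are present and non-empty. Returns 0 if all were found, 1 otherwise.
check_discovery_files() {
    root=${1:-/tftpboot}
    missing=0
    for f in xcat/xnba.kpxe xcat/xnba.efi xcat/genesis.kernel.x86_64; do
        if [ -s "$root/$f" ]; then
            echo "ok       $root/$f"
        else
            echo "MISSING  $root/$f"
            missing=1
        fi
    done
    # At least one per-network xNBA configuration file is expected here.
    if ls "$root"/xcat/xnba/nets/* >/dev/null 2>&1; then
        echo "ok       $root/xcat/xnba/nets/"
    else
        echo "MISSING  $root/xcat/xnba/nets/ (no per-network files)"
        missing=1
    fi
    return "$missing"
}
```

 Calling check_discovery_files with no argument inspects /tftpboot. If
 everything is present, the syslog and httpd log checks described above
 are the next step; where those logs live varies by distribution.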


 Thanks
 Best Regards
 --
 Wang Xiaopeng (王晓朋)
 IBM China System Technology Laboratory
 Tel: 86-10-82453455
 Email: w...@cn.ibm.com
 Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road,
 Haidian District Beijing P.R.China 100193


 From: Josh Nielsen jniel...@hudsonalpha.org
 To: xCAT Users Mailing list xcat-user@lists.sourceforge.net,
 Date: 2014/01/22 05:56
 Subject: Re: [xcat-user] Frustrating time with sequential node discovery

 



 Ah, I see what you are saying now. Well, I hope the thread I stumbled
 on that Jarrod replied to helps figure out why his configuration is
 looking to the outdated (according to what Jarrod said) configuration
 files in /tftpboot/pxelinux.cfg/. Looks like it is either
 /etc/dhcpd.conf or /var/lib/dhcpd/dhcpd.leases related in that case.

 On Tue, Jan 21, 2014 at 3

Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-22 Thread David D Johnson
I've been lurking on this discussion, and just checked to see what we've got -- 
nbroot or genesis -- and we have both of them.

I had given up on node discovery years ago; we originally used the switch port 
numbers and forwarding tables to assign node names.  Now I use ASU to collect 
the MACs, and populate the mac table with a bit of grep and awk.  But last week 
we powered on a rack of 20 non-IBM nodes, and I was wishing we had something 
easier since ASU didn't work for them.

So I have two questions -- 
1) Can I safely delete the xCAT-nbroot-core* RPMs?
2) What is the current best practice method?  What about for non-IBM hardware?

No, three questions
3) How do you get IBM manufacturing to use a specific different 172.29.X for 
each rack they build for you?
We've had three racks arrive in different months all with 172.29.101 addresses 
for the IMM, and I have to spend 5-10 minutes reprogramming each one.  I can't 
put them on the same network until the conflicts are gone.

What's hard for me is that we get new nodes only a couple times a year, and I 
forget everything in between.

Thanks,
 -- ddj


Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-22 Thread David D Johnson

On Jan 22, 2014, at 8:30 AM, Jonathan Mills jonmi...@renci.org wrote:

 Comments inline...
 
 On 1/22/14, 8:08 AM, David D Johnson wrote:
 I've been lurking on this discussion, and just checked to see what we've
 got -- nbroot or genesis -- and we have both of them.
 
 I had given up on node discovery years ago, we originally used the
 switch port numbers and forwarding tables to assign node names.  Now I
 use ASU to collect the macs, and populate the mac table with a bit of
 grep and awk.  But last week we powered on a rack of 20 non-ibm nodes,
 and I was wishing we had something easier since ASU didn't work for them.
 
 Precisely!  That's what I've been doing for IBM and Dell gear for a long 
 time.  In fact, here's what I do:
 
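 # For each node, read the first MAC address reported by rinv and store it
 # in the xCAT mac table.
 for i in $(seq 1 100)
 do
   MAC=$(rinv "node${i}" mac | grep 'MAC Address 1' | cut -d ' ' -f 5)
   chtab node="node${i}" mac.mac="$MAC" mac.interface=eth0
 done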
 
 or similar.
 

Nice, rinv mac seems to take much less time than asu show --group PXE, and 
it can run in parallel on a node range. Unfortunately neither works for these 
(Supermicro) nodes. 

 
 So I have two questions --
 1) Can I safely delete the xCAT-nbroot-core* RPMS ?
 
 I still don't know!  Because if using the chain-loading, I don't see how 
 the first stage is installed by the xCAT-genesis-* RPMS.
 
 2) What is the current best practice method?  What about for non-ibm
 hardware?
 
 This is also what I'm trying to establish.  I had been attempting to use 
 sequential node discovery for Cisco UCS-B series equipment, since the 
 switch method would be hard to use (since UCS's Fabric Interconnect is 
 kinda like a switch but not totally).
 
 I am also familiar with the ROCKS Clusters method of sequential node 
 discovery (insert-ethers), which literally scrapes MAC addresses out of 
 the dhcpd lines which appear in /var/log/messages -- IMHO that would 
 have worked far better for xCAT than by the method of PXE booting (or 
 chain-loading) the Genesis kernel -- which may or may not have kernel 
 modules for your NIC hardware -- to send messages back to xcatmaster:3001.
 

Our clusters from 2006-2009 were all based on Rocks, and that was the one 
feature I miss the most. 
This way would work for any hardware type.  
The Rocks web-GUI database was OK for the small clusters, but they moved to the 
rocks cli for making changes just about the time we went to xcat-2.  We in 
fact used xcat-1 for all the hardware management, rcons rpower, etc.
I miss the ability to hack on the python scripts.
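
The log-scraping approach described above can be sketched in a few lines of
shell. This is not the Rocks insert-ethers code, only an illustration of the
idea: it assumes ISC dhcpd log lines of the form "DHCPDISCOVER from <mac> via
<interface>", and the function name macs_from_dhcpd_log is made up.

```shell
# macs_from_dhcpd_log FILE...
# Hypothetical helper: prints each MAC address seen in a DHCPDISCOVER
# line once, lower-cased, in first-seen order.
macs_from_dhcpd_log() {
    awk '
        /DHCPDISCOVER from/ {
            for (i = 1; i + 2 <= NF; i++) {
                if ($i == "DHCPDISCOVER" && $(i + 1) == "from") {
                    mac = tolower($(i + 2))
                    if (!(mac in seen)) {
                        seen[mac] = 1
                        print mac
                    }
                }
            }
        }
    ' "$@"
}
```

For example, macs_from_dhcpd_log /var/log/messages lists candidate MACs in the
order the nodes were powered on. It cannot tell a compute node from any other
device asking for a lease, so the list still needs a human look before it goes
into the mac table.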

 
 No, three questions
 3) How do you get IBM manufacturing to use a specific different 172.29.X
 for each rack they build for you?
 We've had three racks arrive in different months all with 172.29.101
 addresses for the IMM, and I have to spend 5-10 minutes reprogramming
 each one.  I can't put them on the same network until the conflicts are
 gone.
 
 For a price, IBM has an integration center with technicians who can make 
 such things happen.  For another price, they even offer a kind of DMZ 
 they call the yellowzone where you can SSH into their lab and 
 pre-configure your gear before it ships.  But it isn't worth the effort 
 unless you're going to be buying a lot of things with some frequency.
 

We get the racks prebuilt and shipped from Hong Kong, and they do program the 
IMM addresses, but they never ask us which rack number to use, it's always A1.  
If they're going to do it at all, they should do it right.

 What's hard for me is that we get new nodes only a couple times a year,
 and I forget everything in between.
 
 Thanks,
  -- ddj
 

Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-22 Thread Thomas Alandt
Answering your Item 3:
For IBM nodes we use the following scheme to determine the address for the IMM:
172.29.1xx.y
xx = rack number (rack A1, node 1 = 172.29.101.1, which follows the node IP
we would assign, 172.20.101.1)
y = node number in the rack, with the first node being in the lowest U
location; if the rack is two wide, numbering goes up the lower left side
and then up the lower right side

Since your racks came at different times, they were seen as different orders,
and we always start at the first rack and go contiguously from there.

We assign the BMC to a group, for example 84bmcperrack (see the hosts table):
#node,ip,hostnames,otherinterfaces,comments,disable
84bmcperrack,|\D+(\d+).*$|172.29.(101+(($1-1)/84)).(($1-1)%84+1)|
idataplex-bmc,|\D+(\d+).*$|172.29.(101+(($1-1)/84)).(($1-1)%84+1)|
40bmcperrack,|\D+(\d+).*$|172.29.(101+(($1-1)/40)).(($1-1)%40+1)|
41bmcperrack,|\D+(\d+).*$|172.29.(101+(($1-1)/41)).(($1-1)%41+1)|
42bmcperrack,|\D+(\d+).*$|172.29.(101+(($1-1)/42)).(($1-1)%42+1)|
20bmcperrack,|\D+(\d+).*$|172.29.(101+(($1-1)/20)).(($1-1)%20+1)|
21bmcperrack,|\D+(\d+).*$|172.29.(101+(($1-1)/21)).(($1-1)%21+1)|

You can set up your scheme when you bring up the cluster and this will all
get done during the discovery/bmcsetup.
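
To see which address a given node would get under one of the rules above,
the arithmetic can be reproduced in plain shell. This is a sketch only: the
function name bmc_ip is made up, it assumes node names of the form letters
followed by a number (node85), and it assumes the table expression uses
integer division, which is what the form of the rule suggests.

```shell
# bmc_ip NODENAME PER_RACK
# Hypothetical helper mirroring 172.29.(101+((N-1)/PER_RACK)).((N-1)%PER_RACK+1)
# where N is the first number in the node name.
bmc_ip() {
    name=$1
    per_rack=$2
    # Take the first run of digits after the leading non-digits, then drop
    # leading zeros so that names like node008 are not read as octal.
    num=$(printf '%s\n' "$name" \
        | sed -n 's/^[^0-9][^0-9]*\([0-9][0-9]*\).*$/\1/p' \
        | sed 's/^0*\([0-9]\)/\1/')
    [ -n "$num" ] || return 1
    idx=$((num - 1))
    printf '172.29.%d.%d\n' $((101 + idx / per_rack)) $((idx % per_rack + 1))
}
```

With 84 BMCs per rack, bmc_ip node84 84 stays in the first rack at
172.29.101.84, and bmc_ip node85 84 rolls over to 172.29.102.1.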

Regards,

Tom

Thomas Alandt
WW  Test Engineer Complex Solutions
IBM-ISC
Phone: 919-543-7581 (t/l 441-7581)




Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-22 Thread Russell Jones
I can answer that point from a personal viewpoint - it's just a pain. A 
real bad pain, especially when you do not have homogeneous switch 
models/vendors in the environment. By the time you've finally gotten it 
to work you could have just gone node to node, written down the MACs by 
hand, and populated the tables yourself :-)


Typically, when doing deployments where xCAT is used, I now require 
the vendor to provide me a list or spreadsheet of node-to-MAC mappings, 
and just populate the tables myself with a for loop.


On 1/22/2014 10:03 AM, Jarrod B Johnson wrote:


I am interested in issues with switch based discovery that would cause 
it to be given up on.  Sequential or semi-automatic discovery is ok 
for smallish setups, but scaling it up causes a lot of ambiguity to 
trudge through.




Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-22 Thread Jonathan Mills
xCAT-cisco works to an extent.  It is fabulous for fetching MAC 
addresses via UCS Manager.  However, for me at least, the rsetboot 
command fails flat out.  But worst of all is that rpower commands do not 
shoot the node in the head like an IPMI command.  Instead, UCS Manager 
tries to be cute and gracefully shut down the OS.  Getting the expected 
result of an rpower command via xCAT-cisco often means waiting 60 
seconds, and occasionally it never works at all.

That's why I found attractive the idea of a traditional xCAT setup that 
uses IPMI to control UCS nodes, and node discovery to pull in their MAC 
addresses.

Using IPMI with UCS hardware means that commands like 'rinv' don't work, 
just as with Supermicro gear.  Some aspects of the hardware aren't exposed 
through IPMI registers.

On 01/22/2014 11:03 AM, Jarrod B Johnson wrote:
 Sorry I haven't been following the thread and will hit a few points to
 the list in general.

 For rinv macs, sadly that's not part of standards, so we can only pull
 it off one vendor at a time, hence why rinv mac works for some, but not
 others.

 For the questions about UCS, I assume
 https://github.com/vallard/xCAT-cisco was looked at.  I'm not
 personally familiar with their scheme, but for other blade-oriented
 solutions, we have used the chassis managers as a topology cue
 alternative to switch.

 For scraping dhcpd.leases, that should be a doable script to include.
 There are cases where a more thorough investigation than can be
 achieved in that manner is warranted, but it's better than a non-starter
 for cases where it doesn't work.  We strive to include modern network
 drivers and perhaps we should be more aggressive about that.

 One thing I've been hoping to do is implement a proxydhcp server.  That
 could glean much of the pertinent details for common configuration cases
 and provide a nonambiguous set of candidates for automatic (sequential,
 switch, chassis based) or semi-automatic (scriptable set of candidates
 to do whatever with) discovery (one challenge we've had with dhcp lease
 scraping is ambiguity of whether something is a node or piece of other
 equipment).

 I need to see about extending lsslp --flexdiscover to cover rackmount
 case for service processor based reconfiguration.  The good thing about
 that scheme is that duplicate IPs are fine and get fixed automatically
 so long as the IMMs are on the same subnet as a management node.

 I am interested in issues with switch based discovery that would cause
 it to be given up on.  Sequential or semi-automatic discovery is ok for
 smallish setups, but scaling it up causes a lot of ambiguity to trudge
 through.


Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-22 Thread David D Johnson
I no longer remember clearly what gave us the most grief 2-3 years ago 
when the first add-on racks were added to the cluster.  There were some 
xcat update growing pains, some bugs that weren't fixed soon enough for us to 
move on, and the original design/rollout was done in 2009 by an IBM contractor, 
so we needed to learn what he had done.  Some of this was a moving-target 
problem -- I learned it one way, and by the time I needed to do it again the 
ground rules had changed, and I'd forgotten a lot of stuff in the meantime.  
We did not originally buy xcat support, but we have it now.  We also did a 
CentOS 5 to CentOS 6 upgrade by moving all the xcat stuff to a new mgt server, 
turning off the old dhcp server and having the diskless nodes reboot on the 
new one.  I never did get around to copying the tabdump switch information.  
Also, because of partial rack orders, the actual node arrangements got screwy, 
like node201-240 on the bottom left and right of one idataplex rack, and 
241-274 in the upper half.
Finally, it's a matter of how long the learning curve is to do it the right 
way vs. knowing exactly how long it takes to do it with asu/rinv and tabedit mac.


Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-22 Thread Josh Nielsen
Whoops, I meant to write Jarrod not Jarros. I went a little Koine Greek
on your name there. Sorry about that. :-)


On Wed, Jan 22, 2014 at 10:22 AM, Josh Nielsen jniel...@hudsonalpha.org wrote:

 Jarros, I know you haven't been following the whole thread but Jonathan's
 problem (which this discussion originated from) is that somehow his
 installation is still using the /tftpboot/pxelinux.cfg/ files despite
 being up to date with genesis. I saw a response on the list from you to
 someone else about a similar problem in November 2013:

 http://sourceforge.net/mailarchive/message.php?msg_id=31683484
 http://sourceforge.net/mailarchive/message.php?msg_id=31686689

 Although I am not experiencing the same problem I too would be interested
 in the solution. It has always been challenging for me to understand the
 xCAT PXE deployment process, which is why I have 5-10 pages of self-written
 documentation and debugging info in a Google Doc about it. He posted his
 dhcpd.conf in his latest email. Any thoughts?

 -Josh


 On Wed, Jan 22, 2014 at 10:11 AM, Jarrod B Johnson jbjoh...@us.ibm.com wrote:

 1) If running 2.8, go ahead and delete nbroot-core.  genesis is far more
 maintainable and easier to muck with on the end point (e.g. having full
 fledged glibc)
 2) The greatest common denominator remains switch based.  It's the only
 frequently instrumented topology cue that is pretty universal.  For bladed
 solutions there is potential for the chassis manager to be a good topology
 cue.  I am curious what goes awry with switch-based discovery.  Without a
 topology cue, the choices are sequential discovery (which I frankly
 haven't used myself) or using/scripting nodediscoverls/nodediscoverdef.  I
 personally think the latter is actually better and can be trivially made
 into a 'sequential' discovery using straightforward scripting.

 3) Tom Alandt is the best person to discuss what can/can't be done by IBM
 mfg.  On the other hand, we *could* take some measures to make you
 impervious to the conflict.  The 'lsslp --flexdiscover' for its respective
 bits is impervious to IP conflict issues and will automatically fix it.
  It's not a huge stretch to make that pan out for rackmount systems (though
 currently it's hard to pull off without *some* topology cue).
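
The scripted approach from point 2 could look roughly like the sketch below. It is a dry run that only prints commands: it pairs UUIDs read from standard input with node names in the order given, which is what makes it "sequential". The function name pair_nodes and the placeholder UUIDs are hypothetical, and the -u (UUID) and -n (node name) options to nodediscoverdef are an assumption to verify against the man page for your xCAT version.

```shell
#!/bin/sh
# pair_nodes NODE...: read one UUID per line on stdin and print a
# nodediscoverdef command for each, handing out node names in order.
# Dry run only: nothing is executed against xCAT.
pair_nodes() {
    while IFS= read -r uuid; do
        [ -n "$uuid" ] || continue      # skip blank lines
        [ "$#" -gt 0 ] || break         # stop when node names run out
        printf 'nodediscoverdef -u %s -n %s\n' "$uuid" "$1"
        shift
    done
}

# Placeholder UUIDs; in practice they would be extracted from the
# output of nodediscoverls (its column layout is not assumed here).
printf '%s\n' UUID-0001 UUID-0002 | pair_nodes node1 node2 node3
```

Printing the commands first makes it easy to review the pairing before anything is written to the xCAT tables.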

 From: David D Johnson david_john...@brown.edu
 To: xCAT Users Mailing list xcat-user@lists.sourceforge.net
 Date: 01/22/2014 08:11 AM
 Subject: Re: [xcat-user] Frustrating time with sequential node discovery
 --



 I've been lurking on this discussion, and just checked to see what we've
 got -- nbroot or genesis -- and we have both of them.

 I had given up on node discovery years ago, we originally used the switch
 port numbers and forwarding tables to assign node names.  Now I use ASU to
 collect the macs, and populate the mac table with a bit of grep and awk.
  But last week we powered on a rack of 20 non-ibm nodes, and I was wishing
 we had something easier since ASU didn't work for them.
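
The "grep and awk" step might look something like the sketch below. It assumes a hypothetical input of "nodename MAC" pairs, one per line, however they were collected, and it only prints nodech commands rather than running them. The input format and the function name macs_to_nodech are illustrative assumptions; the table.column=value form is how nodech addresses a table column.

```shell
#!/bin/sh
# macs_to_nodech: read "nodename mac" pairs on stdin and print one
# nodech command per valid pair. Lines whose second field does not
# look like a colon-separated MAC address are skipped.
macs_to_nodech() {
    awk 'NF >= 2 && $2 ~ /^([0-9A-Fa-f][0-9A-Fa-f]:)+[0-9A-Fa-f][0-9A-Fa-f]$/ {
        printf "nodech %s mac.mac=%s\n", $1, tolower($2)
    }'
}

# Example with made-up data: the second line is rejected.
printf '%s\n' 'node01 00:1A:64:AA:BB:01' 'node02 not-a-mac' | macs_to_nodech
```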

 So I have two questions --
 1) Can I safely delete the xCAT-nbroot-core* RPMS ?
 2) What is the current best practice method?  What about for non-ibm
 hardware?

 No, three questions
 3) How do you get IBM manufacturing to use a specific different 172.29.X
 for each rack they build for you?
 We've had three racks arrive in different months all with 172.29.101
 addresses for the IMM, and I have to spend 5-10 minutes reprogramming each
 one.  I can't put them on the same network until the conflicts are gone.

 What's hard for me is that we get new nodes only a couple times a year,
 and I forget everything in between.

 Thanks,
  -- ddj

 On Jan 22, 2014, at 7:12 AM, Lissa Valletta lis...@us.ibm.com wrote:


xCAT-nbroot-core*   was replaced by  xCAT-genesis-* in xCAT 2.8.

Lissa K. Valletta
8-3/B10
Poughkeepsie, NY 12601
(tie 293) 433-3102




From: Xiao Peng Wang w...@cn.ibm.com
To: xCAT Users Mailing list xcat-user@lists.sourceforge.net
Cc: xCAT Users Mailing list xcat-user@lists.sourceforge.net

Date: 01/22/2014 02:58 AM
Subject: Re: [xcat-user] Frustrating time with sequential node
discovery

--



Why do you say that you need nbk.x86_64? Is this file listed in the
/tftpboot/xcat/xnba/nets/?

With the latest xCAT build, it needs
/tftpboot/xcat/genesis.kernel.x86_64 instead of nbk.*

Thanks

Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-21 Thread Russell Jones
xNBA is a customized gpxe image that xCAT uses.

NBFS is the older maintenance image that was used when you set your
node to boot to shell, or booted a runimage script. NBFS is deprecated,
and Genesis replaced NBFS as the maintenance image for these tasks.

In a standard 2.8 install, there should no longer be any nbk/nbfs RPMs 
installed - Genesis replaced them.

perl-xCAT-2.8.3-snap201311122316.noarch
xCAT-2.8.3-snap201311122318.x86_64
xCAT-client-2.8.3-snap201311122316.noarch
xCAT-genesis-base-x86_64-2.8-snap201308090229.noarch
elilo-xcat-3.14-4.noarch
xCAT-server-2.8.3-snap201311122316.noarch
xCAT-genesis-scripts-x86_64-2.8.3-snap201311122318.noarch
ipmitool-xcat-1.8.11-3.x86_64
conserver-xcat-8.1.16-10.x86_64
xCAT-buildkit-2.8.3-snap201311122318.noarch
syslinux-xcat-3.86-2.noarch



On 1/21/2014 2:38 PM, Josh Nielsen wrote:
 Hi Jonathan,

 Yes, I definitely think that would cause a problem. This is jogging my
 memory because I think that when the new Genesis boot loader was
 rolled out in the first version of xCAT that supported it that I faced
 a similar problem. I had assumed that only Genesis was needed but xNBA
 is still used as an intermediate image even if it is no longer the
 final image. I will check my yum repos as soon as I can - but by some
 unfortunate coincidence I just discovered that YUM is not working
 since our RHEL license expired three days ago (unbeknownst to me until
 10 minutes ago). Do you have xCAT-genesis-x86_64 and elilo-xCAT? You
 may even have to pull xNBA images from an older install(?) and then
 run mknb to build the images.

 I remember downloading the tarred files with the RPM manually and
 creating a local repo for xCAT. Whenever I get YUM back I'll give you
 more specifics if I can.

 -Josh

 On Tue, Jan 21, 2014 at 1:54 PM, Jonathan Mills jonmi...@renci.org wrote:
 Josh,

 I don't doubt that you're on to something.  But if this is the case, it
 means my systems are missing some files, namely:

 /tftpboot/xcat/nbk.x86_64
 /tftpboot/xcat/nbfs.x86_64.gz

 Can you tell me what RPM installed those files on your system?  They
 don't exist on mine, and even a 'yum provides' doesn't find them.


 On 01/21/2014 11:51 AM, Josh Nielsen wrote:
 Hi Jonathan,

 It is my understanding, from extensive debugging and notes that I have
 taken about the xCAT netbooting process in the past, that xCAT uses a
 two-stage image deployment method. It will first come up with a more
 generic boot image (normally xnba or sometimes yaboot) which - when it
 contacts the xCAT headnode (or the node handling DHCP requests) - the
 headnode will then recognize the current image on the client that is
 sending requests to DHCP for further boot instructions, and will tell
 the client to then load another image based on the subnet and image type
 it is currently using. For example my headnode's /etc/dhcpd.conf file
 has an entry that looks like this:

 shared-network eth0 {
   subnet 10.20.0.0 netmask 255.255.0.0 {
     max-lease-time 43200;
     min-lease-time 43200;
     default-lease-time 43200;
     next-server 10.20.0.1;
     option log-servers 10.20.0.1;
     option ntp-servers 10.20.0.1;
     option domain-name "x";
     option domain-name-servers 10.20.0.1;
     if option user-class-identifier = "xNBA" and option client-architecture = 00:00 { #x86, xCAT Network Boot Agent
       always-broadcast on;
       filename = "http://10.20.0.1/tftpboot/xcat/xnba/nets/10.20.0.0_16";
     } else if option user-class-identifier = "xNBA" and option client-architecture = 00:09 { #x86, xCAT Network Boot Agent
       filename = "http://10.20.0.1/tftpboot/xcat/xnba/nets/10.20.0.0_16.uefi";
     } else if option client-architecture = 00:00 { #x86
       filename "xcat/xnba.kpxe";
     } else if option vendor-class-identifier = "Etherboot-5.4" { #x86
       filename "xcat/xnba.kpxe";
     } else if option client-architecture = 00:07 { #x86_64 uefi
       filename "xcat/xnba.efi";
     } else if option client-architecture = 00:09 { #x86_64 uefi alternative id
       filename "xcat/xnba.efi";
     } else if option client-architecture = 00:02 { #ia64
       filename "elilo.efi";
     } else if substring(filename,0,1) = null { #otherwise, provide yaboot if the client isn't specific
       filename "/yaboot";
     }
     range dynamic-bootp 10.20.200.254 10.20.254.254;
   } # 10.20.0.0/255.255.0.0 subnet_end

 So if it boots with the xNBA image it then directs it to the
 http://10.20.0.1/tftpboot/xcat/xnba/nets/10.20.0.0_16 which has the
 genesis boot instructions in it:

 #!gpxe
 imgfetch -n kernel http://${next-server}/tftpboot/xcat/genesis.kernel.x86_64 quiet xcatd=10.20.0.1:3001 BOOTIF=01-${netX/machyp}
 imgfetch -n nbfs http://${next-server}/tftpboot/xcat/genesis.fs.x86_64.gz
 imgload kernel
 imgexec kernel
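
The two-stage selection described above can be modelled with a small sketch. It mirrors the if/else chain from the dhcpd.conf excerpt as a shell function, so the two DHCP rounds can be seen side by side. The function name boot_file is hypothetical, and this is a simplification: the Etherboot vendor-class branch is left out and the final substring test is treated as a plain default.

```shell
#!/bin/sh
# boot_file USER_CLASS ARCH: print the filename dhcpd would hand out
# for a client with the given user class and client architecture,
# following the order of the branches in the excerpt.
boot_file() {
    case "$1/$2" in
        xNBA/00:00)      echo "http://10.20.0.1/tftpboot/xcat/xnba/nets/10.20.0.0_16" ;;
        xNBA/00:09)      echo "http://10.20.0.1/tftpboot/xcat/xnba/nets/10.20.0.0_16.uefi" ;;
        */00:00)         echo "xcat/xnba.kpxe" ;;
        */00:07|*/00:09) echo "xcat/xnba.efi" ;;
        */00:02)         echo "elilo.efi" ;;
        *)               echo "/yaboot" ;;
    esac
}

# Stage 1: plain PXE firmware on a BIOS x86 node (no user class yet).
boot_file "" 00:00
# Stage 2: the same node asks again, now identifying itself as xNBA.
boot_file xNBA 00:00
```

The first call yields the xNBA loader and the second yields the per-network script that fetches the genesis kernel, which is the chain the thread is describing.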

 So first it boots with xnba (first stage of boot), it contacts the DHCP
 server which gives it 

Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-21 Thread Jonathan Mills
Russell,

That's what I had been thinking.

# rpm -qa | grep -i xcat | sort
conserver-xcat-8.1.16-10.x86_64
elilo-xcat-3.14-4.noarch
ipmitool-xcat-1.8.11-3.x86_64
perl-xCAT-2.8.3-snap201311122316.noarch
syslinux-xcat-3.86-2.noarch
xCAT-2.8.3-snap201311122318.x86_64
xCAT-buildkit-2.8.3-snap201311122318.noarch
xCAT-client-2.8.3-snap201311122316.noarch
xCAT-genesis-base-x86_64-2.8-snap201308090229.noarch
xCAT-genesis-scripts-x86_64-2.8.3-snap201311122318.noarch
xCAT-server-2.8.3-snap201311122316.noarch


If that is the case, I am troubled by the incorrect pxelinux.cfg
configuration generated by 'mknb x86_64'.  And this is to say nothing of
having successful node discovery, which still eludes me.

On 01/21/2014 03:58 PM, Russell Jones wrote:
 xNBA is a customized gpxe image that xCAT uses.

 NBFS is the older maintenance image that was used when you set your
 node to boot to shell, or booted a runimage script. NBFS is deprecated,
 and Genesis replaced NBFS as the maintenance image for these tasks.

 In a standard 2.8 install, there should no longer be any nbk/nbfs RPMs
 installed - Genesis replaced them.

 perl-xCAT-2.8.3-snap201311122316.noarch
 xCAT-2.8.3-snap201311122318.x86_64
 xCAT-client-2.8.3-snap201311122316.noarch
 xCAT-genesis-base-x86_64-2.8-snap201308090229.noarch
 elilo-xcat-3.14-4.noarch
 xCAT-server-2.8.3-snap201311122316.noarch
 xCAT-genesis-scripts-x86_64-2.8.3-snap201311122318.noarch
 ipmitool-xcat-1.8.11-3.x86_64
 conserver-xcat-8.1.16-10.x86_64
 xCAT-buildkit-2.8.3-snap201311122318.noarch
 syslinux-xcat-3.86-2.noarch



 On 1/21/2014 2:38 PM, Josh Nielsen wrote:
 Hi Jonathan,

 Yes, I definitely think that would cause a problem. This is jogging my
 memory because I think that when the new Genesis boot loader was
 rolled out in the first version of xCAT that supported it that I faced
 a similar problem. I had assumed that only Genesis was needed but xNBA
 is still used as an intermediate image even if it is no longer the
 final image. I will check my yum repos as soon as I can - but by some
 unfortunate coincidence I just discovered that YUM is not working
 since our RHEL license expired three days ago (unbeknownst to me until
 10 minutes ago). Do you have xCAT-genesis-x86_64 and elilo-xCAT? You
 may even have to pull xNBA images from an older install(?) and then
 run mknb to build the images.

 I remember downloading the tarred files with the RPM manually and
 creating a local repo for xCAT. Whenever I get YUM back I'll give you
 more specifics if I can.

 -Josh

 On Tue, Jan 21, 2014 at 1:54 PM, Jonathan Mills jonmi...@renci.org wrote:
 Josh,

 I don't doubt that you're on to something.  But if this is the case, it
 means my systems are missing some files, namely:

 /tftpboot/xcat/nbk.x86_64
 /tftpboot/xcat/nbfs.x86_64.gz

 Can you tell me what RPM installed those files on your system?  They
 don't exist on mine, and even a 'yum provides' doesn't find them.


 On 01/21/2014 11:51 AM, Josh Nielsen wrote:
 Hi Jonathan,

 It is my understanding, from extensive debugging and notes that I have
 taken about the xCAT netbooting process in the past, that xCAT uses a
 two-stage image deployment method. It will first come up with a more
 generic boot image (normally xnba or sometimes yaboot) which - when it
 contacts the xCAT headnode (or the node handling DHCP requests) - the
 headnode will then recognize the current image on the client that is
 sending requests to DHCP for further boot instructions, and will tell
 the client to then load another image based on the subnet and image type
 it is currently using. For example my headnode's /etc/dhcpd.conf file
 has an entry that looks like this:

 shared-network eth0 {
  subnet 10.20.0.0 netmask 255.255.0.0 {
max-lease-time 43200;
min-lease-time 43200;
default-lease-time 43200;
next-server  10.20.0.1;
option log-servers 10.20.0.1;
option ntp-servers 10.20.0.1;
option domain-name x;
option domain-name-servers  10.20.0.1;
if option user-class-identifier = xNBA and option
 client-architecture = 00:00 { #x86, xCAT Network Boot Agent
   always-broadcast on;
   filename = 
 http://10.20.0.1/tftpboot/xcat/xnba/nets/10.20.0.0_16;;
} else if option user-class-identifier = xNBA and option
 client-architecture = 00:09 { #x86, xCAT Network Boot Agent
   filename =
 http://10.20.0.1/tftpboot/xcat/xnba/nets/10.20.0.0_16.uefi;;
} else if option client-architecture = 00:00  { #x86
  filename xcat/xnba.kpxe;
} else if option vendor-class-identifier = Etherboot-5.4  { #x86
  filename xcat/xnba.kpxe;
} else if option client-architecture = 00:07 { #x86_64 uefi
   filename xcat/xnba.efi;
} else if option client-architecture = 00:09 { #x86_64 uefi
 alternative id
   filename xcat/xnba.efi;
} else if option client-architecture = 00:02 { #ia64
   filename elilo.efi;
} else 

Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-21 Thread Josh Nielsen
Evidently, though, something in his xCAT setup is creating the files in
/tftpboot/pxelinux.cfg/ with reference to xnba just like my
installation. Where does xCAT grab the configuration for that? Maybe
it was because I didn't do a completely clean install and did an
in-place upgrade, but my cluster actually works perfectly with both
xnba and genesis installed because it uses xnba first to bootstrap and
then requests the Genesis image. xCAT must support that scenario, else
I haven't the slightest idea by what miracle my installation is
running with such a configuration. :-)

-Josh

On Tue, Jan 21, 2014 at 2:58 PM, Russell Jones
russell-l...@jonesmail.me wrote:
 xNBA is a customized gpxe image that xCAT uses.

 NBFS is the older maintenance image that was used when you set your
 node to boot to shell, or booted a runimage script. NBFS is deprecated,
 and Genesis replaced NBFS as the maintenance image for these tasks.

 In a standard 2.8 install, there should no longer be any nbk/nbfs RPMs
 installed - Genesis replaced them.

 perl-xCAT-2.8.3-snap201311122316.noarch
 xCAT-2.8.3-snap201311122318.x86_64
 xCAT-client-2.8.3-snap201311122316.noarch
 xCAT-genesis-base-x86_64-2.8-snap201308090229.noarch
 elilo-xcat-3.14-4.noarch
 xCAT-server-2.8.3-snap201311122316.noarch
 xCAT-genesis-scripts-x86_64-2.8.3-snap201311122318.noarch
 ipmitool-xcat-1.8.11-3.x86_64
 conserver-xcat-8.1.16-10.x86_64
 xCAT-buildkit-2.8.3-snap201311122318.noarch
 syslinux-xcat-3.86-2.noarch



 On 1/21/2014 2:38 PM, Josh Nielsen wrote:
 Hi Jonathan,

 Yes, I definitely think that would cause a problem. This is jogging my
 memory because I think that when the new Genesis boot loader was
 rolled out in the first version of xCAT that supported it that I faced
 a similar problem. I had assumed that only Genesis was needed but xNBA
 is still used as an intermediate image even if it is no longer the
 final image. I will check my yum repos as soon as I can - but by some
 unfortunate coincidence I just discovered that YUM is not working
 since our RHEL license expired three days ago (unbeknownst to me until
 10 minutes ago). Do you have xCAT-genesis-x86_64 and elilo-xCAT? You
 may even have to pull xNBA images from an older install(?) and then
 run mknb to build the images.

 I remember downloading the tarred files with the RPM manually and
 creating a local repo for xCAT. Whenever I get YUM back I'll give you
 more specifics if I can.

 -Josh

 On Tue, Jan 21, 2014 at 1:54 PM, Jonathan Mills jonmi...@renci.org wrote:
 Josh,

 I don't doubt that you're on to something.  But if this is the case, it
 means my systems are missing some files, namely:

 /tftpboot/xcat/nbk.x86_64
 /tftpboot/xcat/nbfs.x86_64.gz

 Can you tell me what RPM installed those files on your system?  They
 don't exist on mine, and even a 'yum provides' doesn't find them.


 On 01/21/2014 11:51 AM, Josh Nielsen wrote:
 Hi Jonathan,

 It is my understanding, from extensive debugging and notes that I have
 taken about the xCAT netbooting process in the past, that xCAT uses a
 two-stage image deployment method. It will first come up with a more
 generic boot image (normally xnba or sometimes yaboot) which - when it
 contacts the xCAT headnode (or the node handling DHCP requests) - the
 headnode will then recognize the current image on the client that is
 sending requests to DHCP for further boot instructions, and will tell
 the client to then load another image based on the subnet and image type
 it is currently using. For example my headnode's /etc/dhcpd.conf file
 has an entry that looks like this:

 shared-network eth0 {
 subnet 10.20.0.0 netmask 255.255.0.0 {
   max-lease-time 43200;
   min-lease-time 43200;
   default-lease-time 43200;
   next-server  10.20.0.1;
   option log-servers 10.20.0.1;
   option ntp-servers 10.20.0.1;
   option domain-name x;
   option domain-name-servers  10.20.0.1;
   if option user-class-identifier = xNBA and option
 client-architecture = 00:00 { #x86, xCAT Network Boot Agent
  always-broadcast on;
  filename = 
 http://10.20.0.1/tftpboot/xcat/xnba/nets/10.20.0.0_16;;
   } else if option user-class-identifier = xNBA and option
 client-architecture = 00:09 { #x86, xCAT Network Boot Agent
  filename =
 http://10.20.0.1/tftpboot/xcat/xnba/nets/10.20.0.0_16.uefi;;
   } else if option client-architecture = 00:00  { #x86
 filename xcat/xnba.kpxe;
   } else if option vendor-class-identifier = Etherboot-5.4  { #x86
 filename xcat/xnba.kpxe;
   } else if option client-architecture = 00:07 { #x86_64 uefi
  filename xcat/xnba.efi;
   } else if option client-architecture = 00:09 { #x86_64 uefi
 alternative id
  filename xcat/xnba.efi;
   } else if option client-architecture = 00:02 { #ia64
  filename elilo.efi;
   } else if substring(filename,0,1) = null { #otherwise, provide
 yaboot if the client isn't specific
  filename 

Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-21 Thread Russell Jones
It *should* work with xNBA and Genesis - xNBA is the PXE image that 
loads Genesis. :-)

Genesis is the utility image that handles shell commands, runimages, etc.

Don't confuse NBFS with xNBA - NBFS was deprecated in favor of Genesis. xNBA is
the gpxe image that loads Genesis or your normal OS image depending on
what you set via nodeset. Genesis would not be able to load without
xNBA (or standard PXE), and neither would any netboot images.

On 1/21/2014 3:33 PM, Josh Nielsen wrote:
   my case it still works with
 both xnba and genesis because of the nature of PXE chainloading. It
 probably adds deployment time, but it actually works in such a mixed
 configuration.

 -Josh


--
___
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user


Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-21 Thread Jonathan Mills
It would seem to me that what I am missing is the whole of the 
xCAT-nbroot infrastructure...because it isn't part of xcat-core, nor 
xcat-dep.  So I didn't grab it.  But it just so happens...you need it.

The file

/tftpboot/xcat/nbk.x86_64

is provided by the RPM xCAT-nbkernel-x86_64, which is missing from my
yum repo mirror, and from my hosts.


Anything else I'm missing?  Hopefully if I grab correct copies of 
xCAT-nbkernel and xCAT-nbroot (or xCAT-nbroot2?) then node discovery 
will actually work.

On 1/22/14, 12:08 AM, Xiao Peng Wang wrote:
 Both Josh and Russell are correct.

 xNBA is a customized pxe and genesis is a xCAT customized diskless linux
 system to run discovery and other tasks like 'bmcsetup'. It does not
 need the /tftpboot/pxelinux.cfg/.* to load the genesis.

 For discovery, if a node is not defined in xCAT, the DHCP configuration
 in /etc/dhcp/dhcpd.conf or /etc/dhcpd.conf is used to answer the DHCP
 request from the undiscovered node.

 In your dhcpd.conf, it should have the following part for your
 deployment network. If not, run 'makedhcp -n' to recreate your dhcpd.conf.
  if option user-class-identifier = "xNBA" and option client-architecture = 00:00 { #x86, xCAT Network Boot Agent
    always-broadcast on;
    filename = "http://10.1.0.207/tftpboot/xcat/xnba/nets/10.1.0.0_16";
  } else if option user-class-identifier = "xNBA" and option client-architecture = 00:09 { #x86, xCAT Network Boot Agent
    filename = "http://10.1.0.207/tftpboot/xcat/xnba/nets/10.1.0.0_16.uefi";
  } else if option client-architecture = 00:00 { #x86
    filename "xcat/xnba.kpxe";
  } else if option vendor-class-identifier = "Etherboot-5.4" { #x86
    filename "xcat/xnba.kpxe";
  } else if option client-architecture = 00:07 { #x86_64 uefi
    filename "xcat/xnba.efi";
  } else if option client-architecture = 00:09 { #x86_64 uefi alternative id
    filename "xcat/xnba.efi";
  }

 During the boot process of an undiscovered node, dhcpd will tell the
 node to load xcat/xnba.kpxe first and then the configuration file
 http://<xcat mn>/tftpboot/xcat/xnba/nets/10.1.0.0_16. Then xnba will
 load genesis.

 Take a look at the syslog to see whether xnba was downloaded
 successfully from the tftp server. And look into the httpd log to see
 whether genesis has been downloaded successfully.
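
The two log checks suggested above could be sketched as follows. The log locations are assumptions (typical CentOS 6 defaults) and can be overridden through the SYSLOG and HTTPD_LOG variables; the helper name report_fetches is hypothetical. Whether the tftp server logs file names to syslog at all depends on how verbosely it is configured, so a count of zero is not proof that the transfer failed.

```shell
#!/bin/sh
# Assumed log locations; adjust for your site.
SYSLOG=${SYSLOG:-/var/log/messages}
HTTPD_LOG=${HTTPD_LOG:-/var/log/httpd/access_log}

# report_fetches LABEL PATTERN FILE: print how many lines of FILE
# contain PATTERN (matched as a fixed string), or say that FILE
# cannot be read.
report_fetches() {
    if [ -r "$3" ]; then
        printf '%s: %s\n' "$1" "$(grep -cF -- "$2" "$3")"
    else
        printf '%s: cannot read %s\n' "$1" "$3"
    fi
}

# Stage 1: was the xNBA loader requested over tftp?
report_fetches "tftp xnba.kpxe" "xnba.kpxe" "$SYSLOG"
# Stage 2: was the genesis kernel requested over http?
report_fetches "http genesis kernel" "genesis.kernel.x86_64" "$HTTPD_LOG"
```

A non-zero count for stage 1 with zero for stage 2 would suggest the node loads xNBA but never fetches the per-network script or the genesis kernel.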


 Thanks
 Best Regards
 --
 Wang Xiaopeng (王晓朋)
 IBM China System Technology Laboratory
 Tel: 86-10-82453455
 Email: w...@cn.ibm.com
 Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road,
 Haidian District Beijing P.R.China 100193


 From: Josh Nielsen jniel...@hudsonalpha.org
 To: xCAT Users Mailing list xcat-user@lists.sourceforge.net,
 Date: 2014/01/22 05:56
 Subject: Re: [xcat-user] Frustrating time with sequential node discovery

 



 Ah, I see what you are saying now. Well, I hope the thread I stumbled
 on that Jarrod replied to helps figure out why his configuration is
 looking to the outdated (according to what Jarrod said) configuration
 files in /tftpboot/pxelinux.cfg/. Looks like it is either
 /etc/dhcpd.conf or /var/lib/dhcpd/dhcpd.leases related in that case.

 On Tue, Jan 21, 2014 at 3:51 PM, Russell Jones
 russell-l...@jonesmail.me wrote:
   It *should* work with xNBA and Genesis - xNBA is the PXE image that
   loads Genesis. :-)
  
   Genesis is the utility image that handles shell commands, runimages, etc.
  
   Don't confuse NBFS with xNBA - NBFS was deprecated in favor of Genesis. xNBA is
   the gpxe image that loads Genesis or your normal OS image depending on
   what you set via nodeset. Genesis would not be able to load without
   xNBA (or standard PXE), and neither would any netboot images.
  
   On 1/21/2014 3:33 PM, Josh Nielsen wrote:
 my case it still works with
   both xnba and genesis because of the nature of PXE chainloading. It
   probably adds deployment time, but it actually works in such a mixed
   configuration.
  
   -Josh
  
  
  

Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-21 Thread Xiao Peng Wang
Why do you say that you need nbk.x86_64? Is this file listed in the
/tftpboot/xcat/xnba/nets/?

With the latest xCAT build, it needs /tftpboot/xcat/genesis.kernel.x86_64
instead of nbk.*

Thanks
Best Regards
--
 Wang Xiaopeng (王晓朋)
 IBM China System Technology Laboratory
 Tel: 86-10-82453455
 Email: w...@cn.ibm.com
 Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road,
Haidian District Beijing P.R.China 100193



From:   Jonathan Mills jonmi...@renci.org
To: xCAT Users Mailing list xcat-user@lists.sourceforge.net,
Date:   2014/01/22 14:10
Subject:Re: [xcat-user] Frustrating time with sequential node discovery



It would seem to me that what I am missing is the whole of the
xCAT-nbroot infrastructure...because it isn't part of xcat-core, nor
xcat-dep.  So I didn't grab it.  But it just so happens...you need it.

The file

/tftpboot/xcat/nbk.x86_64

is provided by the RPM xCAT-nbkernel-x86_64, which is missing from my
yum repo mirror, and from my hosts.


Anything else I'm missing?  Hopefully if I grab correct copies of
xCAT-nbkernel and xCAT-nbroot (or xCAT-nbroot2?) then node discovery
will actually work.

On 1/22/14, 12:08 AM, Xiao Peng Wang wrote:
 Both Josh and Russell are correct.

 xNBA is a customized pxe and genesis is a xCAT customized diskless linux
 system to run discovery and other tasks like 'bmcsetup'. It does not
 need the /tftpboot/pxelinux.cfg/.* to load the genesis.

 For discovery, if a node is not defined in xCAT, the DHCP configuration
 in /etc/dhcp/dhcpd.conf or /etc/dhcpd.conf is used to answer the DHCP
 request from the undiscovered node.

 In your dhcpd.conf, it should have the following part for your
 deployment network. If not, run 'makedhcp -n' to recreate your
dhcpd.conf.
  if option user-class-identifier = "xNBA" and option client-architecture = 00:00 { #x86, xCAT Network Boot Agent
    always-broadcast on;
    filename = "http://10.1.0.207/tftpboot/xcat/xnba/nets/10.1.0.0_16";
  } else if option user-class-identifier = "xNBA" and option client-architecture = 00:09 { #x86, xCAT Network Boot Agent
    filename = "http://10.1.0.207/tftpboot/xcat/xnba/nets/10.1.0.0_16.uefi";
  } else if option client-architecture = 00:00 { #x86
    filename "xcat/xnba.kpxe";
  } else if option vendor-class-identifier = "Etherboot-5.4" { #x86
    filename "xcat/xnba.kpxe";
  } else if option client-architecture = 00:07 { #x86_64 uefi
    filename "xcat/xnba.efi";
  } else if option client-architecture = 00:09 { #x86_64 uefi alternative id
    filename "xcat/xnba.efi";
  }

 During the boot process of an undiscovered node, dhcpd will tell the
 node to load xcat/xnba.kpxe first and then the configuration file
 http://<xcat mn>/tftpboot/xcat/xnba/nets/10.1.0.0_16. Then xnba will
 load genesis.

 Take a look at the syslog to see whether xnba was downloaded
 successfully from the tftp server. And look into the httpd log to see
 whether genesis has been downloaded successfully.


 Thanks
 Best Regards
 --
 Wang Xiaopeng (王晓朋)
 IBM China System Technology Laboratory
 Tel: 86-10-82453455
 Email: w...@cn.ibm.com
 Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road,
 Haidian District Beijing P.R.China 100193


 From: Josh Nielsen jniel...@hudsonalpha.org
 To: xCAT Users Mailing list xcat-user@lists.sourceforge.net,
 Date: 2014/01/22 05:56
 Subject: Re: [xcat-user] Frustrating time with sequential node discovery

 



 Ah, I see what you are saying now. Well, I hope the thread I stumbled
 on that Jarrod replied to helps figure out why his configuration is
 looking to the outdated (according to what Jarrod said) configuration
 files in /tftpboot/pxelinux.cfg/. Looks like it is either
 /etc/dhcpd.conf or /var/lib/dhcpd/dhcpd.leases related in that case.

 On Tue, Jan 21, 2014 at 3:51 PM, Russell Jones
 russell-l...@jonesmail.me wrote:
   It *should* work with xNBA and Genesis - xNBA is the PXE image that
   loads Genesis. :-)
  
   Genesis is the utility image that handles shell commands, runimages,
etc.
  
   Don't confuse NBFS with xNBA - NBFS was deprecated in favor of Genesis. xNBA is
   the gpxe image that loads Genesis or your normal OS image depending on
   what you set via nodeset. Genesis would not be able to load without
   xNBA (or standard PXE), and neither would any netboot images.
  
   On 1/21/2014 3:33 PM, Josh Nielsen wrote:
 my case it still works with
   both xnba and genesis because of the nature of PXE chainloading

Re: [xcat-user] Frustrating time with sequential node discovery

2014-01-20 Thread Xiao Peng Wang
xCAT uses genesis (an xCAT customized pxe tool) to perform the
discovery process. The configuration for genesis is put
in /tftpboot/xcat/xnba/nets/ for a specific network. Could you check that the
xnba configuration file for your deployment network has been put
in /tftpboot/xcat/xnba/nets/?

The prerequisite for booting genesis is that the node gets a dynamic
IP address. Did you configure the dynamic IP range for your deployment
network? Could you take a look at your syslog to see whether the node has
sent out a DHCP request and what your DHCP server replied?

Thanks
Best Regards
--
 Wang Xiaopeng (王晓朋)
 IBM China System Technology Laboratory
 Tel: 86-10-82453455
 Email: w...@cn.ibm.com
 Address: 28,ZhongGuanCun Software Park,No.8 Dong Bei Wang West Road,
Haidian District Beijing P.R.China 100193



From:   Jonathan Mills jonmi...@renci.org
To: xCAT Users Mailing list xcat-user@lists.sourceforge.net,
Date:   2014/01/19 06:24
Subject:[xcat-user] Frustrating time with sequential node discovery



I'm running xCAT 2.8.3 and CentOS 6.4 atop of Cisco UCS-C hardware.  I'm
attempting to do a sequential nodediscovery.  I've pre-populated the
nodelist table with the nodenames, so I shouldn't need to do anything
more than

nodediscoverystart noderange=node[1-15]

However, none of the nodes ever gets discovered.

Digging deeper, it seems that none of them ever successfully PXE boot at
all.  They should be PXE booting off of the genesis netboot image and
speaking back to the xcatmaster, correct?

When I run 'mknb x86_64', it populates /tftpboot/pxelinux.cfg with
entries to non-existent netboot images.  Watch:

[root@ncsu-hn ~]# rpm -qf /opt/xcat/sbin/mknb
xCAT-client-2.8.3-snap201311122316.noarch
[root@ncsu-hn ~]# mknb x86_64
Creating genesis.fs.x86_64.lzma in /tftpboot/xcat
[root@ncsu-hn ~]# cd /tftpboot/pxelinux.cfg/
[root@ncsu-hn pxelinux.cfg]# ls
0A6400  0A6500  0A6600  7F  98300D  98300DE6  98300DE7  C0A86B
[root@ncsu-hn pxelinux.cfg]# cat *
DEFAULT xCAT
   LABEL xCAT
   KERNEL xcat/nbk.x86_64
   APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=10.100.0.1:3001
DEFAULT xCAT
   LABEL xCAT
   KERNEL xcat/nbk.x86_64
   APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=10.101.0.1:3001
DEFAULT xCAT
   LABEL xCAT
   KERNEL xcat/nbk.x86_64
   APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=10.102.0.1:3001
DEFAULT xCAT
   LABEL xCAT
   KERNEL xcat/nbk.x86_64
   APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=127.0.0.1:3001
DEFAULT xCAT
   LABEL xCAT
   KERNEL xcat/nbk.x86_64
   APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=152.48.13.3:3001
DEFAULT xCAT
   LABEL xCAT
   KERNEL xcat/nbk.x86_64
   APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=152.48.13.230:3001
DEFAULT xCAT
   LABEL xCAT
   KERNEL xcat/nbk.x86_64
   APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=152.48.13.231:3001
DEFAULT xCAT
   LABEL xCAT
   KERNEL xcat/nbk.x86_64
   APPEND initrd=xcat/nbfs.x86_64.gz quiet xcatd=192.168.107.10:3001
[root@ncsu-hn pxelinux.cfg]# cd ../xcat/
[root@ncsu-hn xcat]# ls -la
total 21528
drwxr-xr-x  4 root root     4096 Jan 17 13:06 .
drwxr-xr-x. 7 root root     4096 Jan 18 22:02 ..
-rwxr-xr-x  1 root root   242929 Jan 15  2012 elilo-x64.efi
-rw-r--r--  1 root root 17573621 Jan 18 22:03 genesis.fs.x86_64.lzma
-rwxr-xr-x  1 root root  3986608 Aug  9 06:29 genesis.kernel.x86_64
drwxr-xr-x  3 root root     4096 Jan 17 13:06 osimage
drwxr-xr-x  3 root root     4096 Dec 23 07:42 xnba
-rw-r--r--  1 root root   139200 Oct 28 16:16 xnba.efi
-rw-r--r--  1 root root    74792 Oct 28 16:16 xnba.kpxe



As you can see, it ought to be netbooting the genesis kernel, but
instead all my pxelinux.cfg/* files are instructing clients to boot the
non-existent nbk.x86_64 image.
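
For anyone puzzling over the file names in that pxelinux.cfg listing: PXELINUX looks up its config file by the client's IP address written in uppercase hex, dropping one trailing digit at a time until a file matches, so each name covers an address prefix. A small sketch to decode them follows. The function name hex_to_prefix is hypothetical, and it assumes names with an even number of hex digits, as in the listing.

```shell
#!/bin/sh
# hex_to_prefix NAME: decode a PXELINUX-style hex file name such as
# 0A6400 into the dotted-decimal prefix it covers (10.100.0).
hex_to_prefix() {
    hex=$1
    out=""
    while [ -n "$hex" ]; do
        rest=${hex#??}                      # everything after the first two digits
        [ "$rest" != "$hex" ] || return 1   # odd digit count: not handled here
        pair=${hex%"$rest"}                 # the first two hex digits
        out="${out:+$out.}$(printf '%d' "0x$pair")"
        hex=$rest
    done
    printf '%s\n' "$out"
}

# The file names from the listing above.
for name in 0A6400 0A6500 0A6600 7F 98300D 98300DE6 98300DE7 C0A86B; do
    printf '%-9s -> %s\n' "$name" "$(hex_to_prefix "$name")"
done
```

Each decoded prefix lines up with the xcatd= address in the corresponding file, for example 0A6400 with 10.100.0.1 and C0A86B with 192.168.107.10.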

Your advice is appreciated.

--
Jonathan Mills
Systems Administrator
Renaissance Computing Institute
UNC-Chapel Hill
