Re: PXE boot is an infinite reinstall

2011-10-18 Thread Sergio Ballestrero
 What we (CERN ATLAS Online) do is to have a PXE default config that points 
back to localboot, and is changed for the specific PC when you want to 
reinstall. The %post section of the kickstart then has to contain a call to a CGI 
(e.g. using wget) that resets the PXE config to the default. The same kind of 
mechanism is used by Cobbler and by Quattor's PXE install system. This is 
faster than letting PXE time out.
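
A minimal sketch of the per-host switch described above, assuming the TFTP server runs PXELINUX, which looks for a file named 01-<mac-with-dashes> in pxelinux.cfg/ before falling back to "default". The function names, the install entry, and the kickstart URL are hypothetical placeholders, not the actual CERN scripts. The reset function is the part a CGI called from the kickstart would run.

```python
import re
from pathlib import Path

# Content for pxelinux.cfg/default: always boot from the local disk.
LOCALBOOT_DEFAULT = """\
DEFAULT local
LABEL local
    LOCALBOOT 0
"""

# Hypothetical per-host install entry; kernel, initrd and URL are placeholders.
INSTALL_CONFIG = """\
DEFAULT install
LABEL install
    KERNEL vmlinuz
    APPEND initrd=initrd.img ks=http://installserver.example/ks.cfg
"""


def pxe_config_name(mac: str) -> str:
    """Return the PXELINUX per-host file name: '01-' plus the lowercase MAC with dashes."""
    octets = re.split(r"[:-]", mac.strip().lower())
    if len(octets) != 6 or not all(re.fullmatch(r"[0-9a-f]{2}", o) for o in octets):
        raise ValueError(f"not a MAC address: {mac!r}")
    return "01-" + "-".join(octets)


def arm_reinstall(cfg_dir: Path, mac: str, install_config: str = INSTALL_CONFIG) -> Path:
    """Write a per-host config so that only this machine installs on its next PXE boot."""
    path = cfg_dir / pxe_config_name(mac)
    path.write_text(install_config)
    return path


def reset_to_default(cfg_dir: Path, mac: str) -> bool:
    """Remove the per-host config so the machine falls back to the localboot default."""
    path = cfg_dir / pxe_config_name(mac)
    if path.exists():
        path.unlink()
        return True
    return False
```

Because each host has its own file, hosts that boot at different speeds do not interfere with each other, which a single toggled default file cannot guarantee.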

The Rocks trick is interesting and possibly the safest way to do it. I see it 
may have the disadvantage that you can't remotely reinstall a PC that is 
unable to boot, unless your PCs have lights-out remote management that 
allows you to change the boot order.

Cheers,
  Sergio

On 18 Oct 2011, at 01:26, Steven Timm wrote:

 The trick that Rocks uses is to have a boot order of (hard disk, pxe)
 and then when you want to reinstall, change two bytes in the
 boot sector to make the hard disk unbootable and it will fall through
 to a PXE boot only at that time.
 
 What worker node installs at Fermilab do is to have a DHCP server that
 only answers the PXE request when you want to reinstall, and no other
 time, so the PXE request just times out and then you boot off the hard drive.
 
 Steve
 
 On Mon, 17 Oct 2011, ~Stack~ wrote:
 
 Hello All,
 
 I ran into another issue with my PXE build out. I searched the net and
 found many people with the same issue, but there was either no response
 or their solution would not work for my needs (requiring access to
 software I don't have). What I am after with this is a completely
 unmanaged automated install of a client on boot.
 
 I am using dnsmasq as my DNS, DHCP, and TFTP server.
 
 I have a server and a client. The client boots off the network card with
 PXE. It asks for and receives an IP from the DHCP server and proceeds
 with pulling the TFTP information. The TFTP server passes it a
 pxelinux.0 file along with the default configuration. The
 configuration has a kickstart file and the client continues with a
 flawless install of SL6.1. After the install, the client reboots...and
 the whole process starts over and over and over again. I know why it
 does this (the default boot option is to install), but I can't figure
 out how to control it.
 
 What I would like is a process where I boot the clients from an off
 state, have them do a fresh install, and then reboot into the new
 install. Nothing is stored on these nodes and a fresh install goes
 rather quickly so I don't mind this option.
 
 At first I tried scripting an option that just toggled the tftp default
 menu but it wasn't working very smoothly as not all my hosts boot at
 equal speeds.
 
 I attempted chainloading in the tftp but just made a mess and I didn't
 get any different results. Most likely due to me not understanding it
 properly. I am open to pointers.
 
 I thought I could do it inside of dnsmasq, but I couldn't find a good
 example and my attempts didn't work.
 
 I looked online and found projects like systemimager.org but I am
 already doing most of what they provide. I attempted to reverse their
 Perl scripts but that is a bigger project than I initially thought. What
 I did like about this project was the ability to tell it to allow a
 single host or a group of hosts to reinstall or to boot off the hard disk.
 
 I have gotten some great pointers from this list so far and I am really
 hoping someone might have another for me. Any ideas?
 
 Thanks!
 
 ~Stack~
 
 
 -- 
 --
 Steven C. Timm, Ph.D  (630) 840-8525
 t...@fnal.gov  http://home.fnal.gov/~timm/
 Fermilab Computing Division, Scientific Computing Facilities,
 Grid Facilities Department, FermiGrid Services Group, Group Leader.
 Lead of FermiCloud project.

-- 
 Sergio Ballestrero  - http://physics.uj.ac.za/psiwiki/Ballestrero
 University of Johannesburg, Physics Department
 ATLAS TDAQ sysadmin group - Office:75282 OnCall:164851


Re: PXE boot is an infinite reinstall

2011-10-18 Thread Yannick Perret

Steven Timm a écrit :

The trick that Rocks uses is to have a boot order of (hard disk, pxe)
and then when you want to reinstall, change two bytes in the
boot sector to make the hard disk unbootable and it will fall through
to a PXE boot only at that time.

What worker node installs at Fermilab do is to have a DHCP server that
only answers the PXE request when you want to reinstall, and no other
time, so the PXE request just times out and then you boot off the hard 
drive.



Here (at CC-IN2P3) we do mostly the same: boot sequence HDD;PXE.

Destroying the partition table works. We also use IPMI. Using IPMI commands 
(if your nodes have an IPMI-compatible card) you can use chassis bootdev 
pxe, which tells the node to boot from PXE on the next boot only.
So reinstalling a node (with IPMI configured) consists of chassis 
bootdev pxe + chassis power [cycle|on].
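
The two IPMI steps above could be wrapped roughly as follows. This is only a sketch, assuming the ipmitool CLI is installed, the BMC speaks IPMI v2.0 over LAN (-I lanplus), and the password is supplied through the IPMI_PASSWORD environment variable (-E). The host name, user name, and function names are hypothetical. By default it only prints the commands instead of running them.

```python
import subprocess
from typing import List


def ipmi_command(bmc_host: str, user: str, *args: str) -> List[str]:
    """Build an ipmitool argv; -E makes ipmitool read the password from IPMI_PASSWORD."""
    return ["ipmitool", "-I", "lanplus", "-H", bmc_host, "-U", user, "-E", *args]


def reinstall_commands(bmc_host: str, user: str, power_action: str = "cycle") -> List[List[str]]:
    """Return the two commands: one-time PXE boot, then power cycle (or power on)."""
    if power_action not in ("cycle", "on"):
        raise ValueError("power_action must be 'cycle' or 'on'")
    return [
        ipmi_command(bmc_host, user, "chassis", "bootdev", "pxe"),
        ipmi_command(bmc_host, user, "chassis", "power", power_action),
    ]


def reinstall(bmc_host: str, user: str, power_action: str = "cycle", dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them, stopping at the first failure."""
    for argv in reinstall_commands(bmc_host, user, power_action):
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical BMC address and user; nothing is executed in dry-run mode.
    reinstall("bmc-node01.example", "admin")
```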


Regards,
--
Y.



Re: PXE boot is an infinite reinstall

2011-10-18 Thread Felip Moll
If you have Dell servers, there is a Boot Once option that you can set through
the iDRAC interface. You select Boot Once with PXE and then reboot the
machine. It will boot from PXE only one time, so on the following reboot it
will boot from the HD.

Regards
2011/10/18 Yannick Perret yper...@in2p3.fr

 Steven Timm a écrit :

  The trick that Rocks uses is to have a boot order of (hard disk, pxe)
 and then when you want to reinstall, change two bytes in the
 boot sector to make the hard disk unbootable and it will fall through
 to a PXE boot only at that time.

 What worker node installs at Fermilab do is to have a DHCP server that
 only answers the PXE request when you want to reinstall, and no other
 time, so the PXE request just times out and then you boot off the hard
 drive.

  Here (at CC-IN2P3) we do mostly the same: boot sequence HDD;PXE.

 Destroying the partition table works. We also use IPMI. Using IPMI commands (if
 your nodes have an IPMI-compatible card) you can use chassis bootdev pxe,
 which tells the node to boot from PXE on the next boot only.
 So reinstalling a node (with IPMI configured) consists of chassis
 bootdev pxe + chassis power [cycle|on].

 Regards,
 --

 Y.



PXE boot is an infinite reinstall

2011-10-17 Thread ~Stack~
Hello All,

I ran into another issue with my PXE build out. I searched the net and
found many people with the same issue, but there was either no response
or their solution would not work for my needs (requiring access to
software I don't have). What I am after with this is a completely
unmanaged automated install of a client on boot.

I am using dnsmasq as my DNS, DHCP, and TFTP server.

I have a server and a client. The client boots off the network card with
PXE. It asks for and receives an IP from the DHCP server and proceeds
with pulling the TFTP information. The TFTP server passes it a
pxelinux.0 file along with the default configuration. The
configuration has a kickstart file and the client continues with a
flawless install of SL6.1. After the install, the client reboots...and
the whole process starts over and over and over again. I know why it
does this (the default boot option is to install), but I can't figure
out how to control it.

What I would like is a process where I boot the clients from an off
state, have them do a fresh install, and then reboot into the new
install. Nothing is stored on these nodes and a fresh install goes
rather quickly so I don't mind this option.

At first I tried scripting an option that just toggled the tftp default
menu but it wasn't working very smoothly as not all my hosts boot at
equal speeds.

I attempted chainloading in the tftp but just made a mess and I didn't
get any different results. Most likely due to me not understanding it
properly. I am open to pointers.

I thought I could do it inside of dnsmasq, but I couldn't find a good
example and my attempts didn't work.

I looked online and found projects like systemimager.org but I am
already doing most of what they provide. I attempted to reverse their
Perl scripts but that is a bigger project than I initially thought. What
I did like about this project was the ability to tell it to allow a
single host or a group of hosts to reinstall or to boot off the hard disk.

I have gotten some great pointers from this list so far and I am really
hoping someone might have another for me. Any ideas?

Thanks!

~Stack~


Re: PXE boot is an infinite reinstall

2011-10-17 Thread Steven Timm

The trick that Rocks uses is to have a boot order of (hard disk, pxe)
and then when you want to reinstall, change two bytes in the
boot sector to make the hard disk unbootable and it will fall through
to a PXE boot only at that time.
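
To illustrate the idea only: the sketch below toggles the two-byte 0x55 0xAA boot signature at offset 510 of an MBR. Which two bytes Rocks actually changes is not stated here, so the choice of the signature bytes is an assumption, as are the function names. It is meant to be tried on a disk image file; pointing it at a real disk makes that disk unbootable until the bytes are restored.

```python
BOOT_SIGNATURE = b"\x55\xaa"   # standard MBR boot signature
SIGNATURE_OFFSET = 510         # last two bytes of the 512-byte boot sector


def has_boot_signature(path: str) -> bool:
    """Return True if bytes 510-511 of the file hold the MBR boot signature."""
    with open(path, "rb") as f:
        f.seek(SIGNATURE_OFFSET)
        return f.read(2) == BOOT_SIGNATURE


def set_bootable(path: str, bootable: bool) -> None:
    """Write or zero the boot signature in place, leaving every other byte untouched."""
    with open(path, "r+b") as f:
        f.seek(SIGNATURE_OFFSET)
        f.write(BOOT_SIGNATURE if bootable else b"\x00\x00")
```

With a boot order of (hard disk, pxe), zeroing the signature should make the firmware skip the disk and try PXE; the fresh install then writes a new boot sector, so no explicit restore step is needed in that case.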

What worker node installs at Fermilab do is to have a DHCP server that
only answers the PXE request when you want to reinstall, and no other
time, so the PXE request just times out and then you boot off the hard 
drive.
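
For a dnsmasq setup, a rough variation on this could use tags. The sketch below generates dhcp-host and dhcp-boot lines so that only hosts marked for reinstall are handed a boot file. Note the differences from what is described above: the DHCP server still answers every host, it just withholds the boot file name from untagged ones, and whether a PXE ROM then falls through to the hard disk quickly depends on the firmware. The tag name, function name, and MAC addresses are hypothetical, and if a host already has a dhcp-host line, the set:<tag> part would have to be merged into it.

```python
from typing import Iterable, List


def dnsmasq_reinstall_lines(macs: Iterable[str],
                            boot_file: str = "pxelinux.0",
                            tag: str = "reinstall") -> List[str]:
    """Build dnsmasq config lines that offer boot_file only to the listed MACs."""
    # Tag each host that should reinstall on its next PXE boot.
    lines = [f"dhcp-host={mac.lower()},set:{tag}" for mac in sorted(macs)]
    # Send the boot file name only to hosts carrying that tag.
    lines.append(f"dhcp-boot=tag:{tag},{boot_file}")
    return lines


if __name__ == "__main__":
    # Hypothetical MAC addresses of the nodes to reinstall.
    for line in dnsmasq_reinstall_lines(["00:11:22:33:44:55", "00:11:22:33:44:66"]):
        print(line)
```

dnsmasq has to re-read its configuration after the list changes, and the host's entry has to be dropped again once its install has started.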


Steve



--
--
Steven C. Timm, Ph.D  (630) 840-8525
t...@fnal.gov  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Group Leader.
Lead of FermiCloud project.


Re: PXE boot is an infinite reinstall

2011-10-17 Thread ~Stack~
On 10/17/2011 06:26 PM, Steven Timm wrote:
 The trick that Rocks uses is to have a boot order of (hard disk, pxe)
 and then when you want to reinstall, change two bytes in the
 boot sector to make the hard disk unbootable and it will fall through
 to a PXE boot only at that time.

That is actually a really good idea now that I think about it. It has
been a few years since I used Rocks. I am really glad they are still
around. I just looked and saw they are still RHEL 5 based and not 6
(which I need). Oh well.

I have been thinking about this for a short while now, and I already
think I know how to script this to work the way I want it to.

Thanks for the suggestion!

~Stack~