Re: [ovirt-users] oVirt / ROM images / PXE

2015-03-06 Thread Paul Heinlein

On Fri, 6 Mar 2015, Chris Adams wrote:


> Once upon a time, Paul Heinlein said:
>> So it might be helpful to look at the DHCP options, but the server
>> is making OFFERs, so I'm not really sure what bits might be suspect.
>
> Do you see a difference between the DHCP options with the "bad" and
> "good" ROMs?


It might be a while before I can set things up to get a good capture, 
but I do appreciate your pointers. I'll reply to the list once I can 
free up the time for the tests.


--
Paul Heinlein
heinl...@madboa.com
45°38' N, 122°6' W
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] oVirt / ROM images / PXE

2015-03-06 Thread Chris Adams
Once upon a time, Paul Heinlein  said:
> So it might be helpful to look at the DHCP options, but the server
> is making OFFERs, so I'm not really sure what bits might be suspect.

Do you see a difference between the DHCP options with the "bad" and
"good" ROMs?
-- 
Chris Adams 


Re: [ovirt-users] oVirt / ROM images / PXE

2015-03-06 Thread Paul Heinlein

On Fri, 6 Mar 2015, Chris Adams wrote:

> If the DHCP requests are making it to the server, the next thing to
> see is if there is any difference in the DHCP options requested
> between the different ROM images (maybe your DHCP config isn't
> matching up correctly in some case that works on mine?).


I'd done a bunch of tcpdump-ing (which went unmentioned in my post to 
the list). The DHCP requests are indeed making it to the server, and 
the server makes an OFFER, which never makes it back to the VM ROM.


To reiterate, however, DHCP clients from working VMs are able to get 
leases, so the network itself is functioning. The issue is definitely 
tied to the ROM that's used.


So it might be helpful to look at the DHCP options, but the server is 
making OFFERs, so I'm not really sure what bits might be suspect.


--
Paul Heinlein
heinl...@madboa.com
45°38' N, 122°6' W


Re: [ovirt-users] oVirt / ROM images / PXE

2015-03-06 Thread Chris Adams
Once upon a time, Paul Heinlein  said:
> Good data point! Can you tell me the compatibility version of your
> data center and its cluster(s)? How about the cluster CPU type?

One DC, one cluster, version 3.5.  Intel Nehalem CPU.  I've PXE booted
CentOS 5, 6, and 7 VMs (64 bit for all and 32 bit for 5/6).

I'd suspect something in the network setup.  I have VLANs on an 802.1q
trunk on an LACP bond (with oVirt bridging the VLANs to VMs).  My DHCP
server (separate physical CentOS 6 box) is also running VLANs on 802.1q
on LACP bond, with dnsmasq listening on one VLAN.

I'd look at traffic coming out of the VM on the node, and coming into
the DHCP server, and see who sees what (are the requests coming out of
the VM, is the DHCP server seeing them, is it replying, does the VM get
the reply).
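
One way to do that is to capture DHCP traffic at each hop and compare the captures. A minimal sketch, assuming tcpdump is installed on both machines; the interface names (vnet0 for the VM's tap device on the node, bond0.100 for the DHCP server's VLAN interface) and the output paths are placeholders, not anything taken from either setup:

```shell
#!/bin/sh
# Print a tcpdump command that captures DHCP/BOOTP traffic on one interface.
# The interface and output file are supplied by the caller.
dhcp_capture_cmd() {
  iface=$1
  outfile=$2
  # -n: no name resolution, -i: interface to listen on,
  # -w: write raw packets to a file so the captures can be compared later.
  # UDP ports 67 (server) and 68 (client) carry DHCP.
  printf 'tcpdump -n -i %s -w %s port 67 or port 68\n' "$iface" "$outfile"
}

# Hypothetical interface names; substitute your own.
dhcp_capture_cmd vnet0 /tmp/dhcp-node.pcap
dhcp_capture_cmd bond0.100 /tmp/dhcp-server.pcap
```

Run the printed commands as root on the node and on the DHCP server, then start the PXE boot. If the OFFER shows up in the server capture but not in the node capture, the reply is being lost somewhere between the two.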

If the DHCP requests are making it to the server, the next thing to see
is if there is any difference in the DHCP options requested between the
different ROM images (maybe your DHCP config isn't matching up correctly
in some case that works on mine?).
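
A rough sketch of that comparison, assuming each capture has already been reduced to a text file with one requested DHCP option per line (for example by hand from "tcpdump -n -vv -r" output). The function name and the "only-good"/"only-bad" labels are made up for illustration:

```shell
#!/bin/sh
# Report options that appear in only one of two option lists.
# Each input file holds one DHCP option name per line.
diff_option_sets() {
  good=$1
  bad=$2
  tmp_good=$(mktemp) || return 1
  tmp_bad=$(mktemp) || return 1
  # comm needs sorted input; a fixed locale keeps sort and comm consistent.
  LC_ALL=C sort -u "$good" > "$tmp_good"
  LC_ALL=C sort -u "$bad" > "$tmp_bad"
  # comm -23: lines only in the first file; comm -13: only in the second.
  LC_ALL=C comm -23 "$tmp_good" "$tmp_bad" | sed 's/^/only-good: /'
  LC_ALL=C comm -13 "$tmp_good" "$tmp_bad" | sed 's/^/only-bad: /'
  rm -f "$tmp_good" "$tmp_bad"
}
```

Calling it as "diff_option_sets good.txt bad.txt" prints nothing when both ROMs request the same set of options.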

-- 
Chris Adams 


Re: [ovirt-users] oVirt / ROM images / PXE

2015-03-06 Thread Paul Heinlein

On Fri, 6 Mar 2015, Chris Adams wrote:


> Once upon a time, Paul Heinlein said:
>> Summary: To get oVirt-managed VMs to boot using PXE, I had to
>> replace the rhel6-*.rom files with their ipxe equivalents.
>
> I'm PXE booting oVirt VMs with no trouble.  I have CentOS 7 nodes,
> running oVirt 3.5.1 (hosted engine on CentOS 6).  Each node has a
> pair of NICs in a LACP bond to a switch stack, running 802.1q on top
> of that, with several VLANs (only one VLAN has a DHCP server and a
> local CentOS repo, so I put VMs on that VLAN for install).


Good data point! Can you tell me the compatibility version of your 
data center and its cluster(s)? How about the cluster CPU type?


I'm just trying to figure out what moving parts are at issue.

--
Paul Heinlein
heinl...@madboa.com
45°38' N, 122°6' W


Re: [ovirt-users] oVirt / ROM images / PXE

2015-03-06 Thread Chris Adams
Once upon a time, Paul Heinlein  said:
> Summary: To get oVirt-managed VMs to boot using PXE, I had to
> replace the rhel6-*.rom files with their ipxe equivalents.

I'm PXE booting oVirt VMs with no trouble.  I have CentOS 7 nodes,
running oVirt 3.5.1 (hosted engine on CentOS 6).  Each node has a pair
of NICs in a LACP bond to a switch stack, running 802.1q on top of that,
with several VLANs (only one VLAN has a DHCP server and a local CentOS
repo, so I put VMs on that VLAN for install).

-- 
Chris Adams 


[ovirt-users] oVirt / ROM images / PXE

2015-03-06 Thread Paul Heinlein
Summary: To get oVirt-managed VMs to boot using PXE, I had to replace 
the rhel6-*.rom files with their ipxe equivalents.


Setup: Nodes running CentOS 7, fully current. Engine 3.5.0.1 running 
on Fedora 19. Storage via NFS. 1 Gb Ethernet network. DHCP server on
separate subnet, relying on dhcp-relay switch settings.


PXE is our preferred method for installing VMs. We rarely use 
templates.


I recently upgraded our oVirt hypervisor nodes from Fedora 19 to 
CentOS 7 (where "upgrade" obviously implies a full OS 
re-installation). I failed to test creating VMs using PXE in this 
setup because -- and this is crucial -- it worked fine on our 
non-oVirt CentOS 7 hypervisors, which are managed using virsh and 
virt-install.


So PXE booting a new VM failed. The symptom was the dreaded "dhcp
connection timeout" at the seabios prompt. The timeout would persist
even when I pressed Ctrl-B and retried "dhcp net0" several times over
the course of the next few minutes.


Significantly, existing VMs experienced no DHCP troubles at all, in 
the sense that dhclient or its equivalent within the VM could get 
and renew leases as expected.


I poked around the forward-delay settings on the oVirt-managed
bridges, but they had the same characteristics as the bridges on our
non-oVirt hypervisors. Lots of other network-oriented troubleshooting
likewise failed.
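

For anyone wanting to repeat the forward-delay comparison, here is a minimal sketch that reads the value from sysfs for every bridge on a host. It assumes the usual Linux layout under /sys/class/net; the function name is made up, and the directory argument exists only so the function can be pointed at a test tree:

```shell
#!/bin/sh
# Print "<bridge> <forward_delay>" for every bridge under a sysfs net directory.
# Defaults to /sys/class/net when no directory is given.
list_forward_delay() {
  netdir=${1:-/sys/class/net}
  for f in "$netdir"/*/bridge/forward_delay; do
    [ -r "$f" ] || continue          # glob matched nothing, or file unreadable
    br=${f%/bridge/forward_delay}    # strip the trailing path components
    br=${br##*/}                     # keep only the interface name
    printf '%s %s\n' "$br" "$(cat "$f")"
  done
}

list_forward_delay
```

Running it on an oVirt node and on a plain hypervisor gives two short lists that can be compared side by side.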


It got to the point that I took one of my oVirt nodes, removed it from 
oVirt, and reinstalled it with plain CentOS 7. Its physical 
connections and IP addresses stayed exactly the same. Without oVirt, 
virt-install used PXE without a hitch. I re-added that machine to the 
oVirt cluster, and PXE failed once again.


I noticed that the BIOS PXE prompt was slightly different in the oVirt
and non-oVirt environments, which led me to poke around the ROM images
in /usr/share/qemu-kvm.


I think the rhel6-*.rom images are those used by oVirt, while the 
pxe-*.rom images (actually symlinks into ../ipxe/) are used by, e.g., 
the local virt-install utility.
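

A quick way to see which ROM names are regular files and which are symlinks is sketched below. The function name is invented for illustration, and the default directory is simply the one mentioned above; pass another directory as the first argument if your distribution keeps the images elsewhere:

```shell
#!/bin/sh
# For each *.rom entry in a directory, report whether it is a regular file
# or a symlink, and where the symlink points.
describe_roms() {
  romdir=${1:-/usr/share/qemu-kvm}
  for rom in "$romdir"/*.rom; do
    # Skip the unexpanded glob, but keep dangling symlinks.
    [ -e "$rom" ] || [ -L "$rom" ] || continue
    name=${rom##*/}
    if [ -L "$rom" ]; then
      printf '%s -> %s\n' "$name" "$(readlink "$rom")"
    else
      printf '%s (regular file)\n' "$name"
    fi
  done
}

describe_roms
```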


My workaround is to replace the rhel6-*.rom files with the same 
symlinks used by the pxe-*.rom files:


- %< -
# short version of script, minus error checking
cd /usr/share/qemu-kvm || exit 1
for NIC in e1000 ne2k_pci pcnet rtl8139 virtio; do
  # keep the distributed ROM, then point its name at the ipxe image
  mv "rhel6-${NIC}.rom" "rhel6-${NIC}.rom.dist"
  ln -s "$(readlink "pxe-${NIC}.rom")" "rhel6-${NIC}.rom"
done
- %< -

Now oVirt VMs can boot from PXE without any issue.
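
Should the workaround need to be undone, a minimal sketch of the reverse operation follows. It assumes the .rom.dist copies made by the script above are still in place; the function name is made up, and the directory argument defaults to the path used above:

```shell
#!/bin/sh
# Undo the workaround: move each rhel6-<nic>.rom.dist back over the symlink.
restore_rhel6_roms() {
  romdir=${1:-/usr/share/qemu-kvm}
  for nic in e1000 ne2k_pci pcnet rtl8139 virtio; do
    dist="$romdir/rhel6-${nic}.rom.dist"
    rom="$romdir/rhel6-${nic}.rom"
    [ -f "$dist" ] || continue       # nothing was saved for this NIC
    # Leave a regular file alone; only a symlink (or nothing) gets replaced.
    if [ -e "$rom" ] && [ ! -L "$rom" ]; then
      continue
    fi
    rm -f "$rom"
    mv "$dist" "$rom"
  done
}
```

Call "restore_rhel6_roms" as root on the node. A NIC whose rhel6 ROM is already a regular file is skipped, so running it twice is harmless.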

I'm wildly curious about what's going on here.

--
Paul Heinlein
heinl...@madboa.com
45°38' N, 122°6' W