Re: [CentOS] my notes on bond, bridge, network, kvm, host and virtual so far

2012-02-13 Thread Gordon Messmer
On 02/07/2012 07:26 PM, Devin Reade wrote:
> I had a lot of problems with the network stack on VMs, both under
> VMware ESXi and Xen where the network would just go numb.  After a
> lot of spelunking I determined that it seemed to be related to
> faulty TCP segment offload.

Yeah, wow.  You just jogged my memory.  Intel 82573(V/L/E) ethernet 
adapters had a serious bug that would cause TX hangs:

http://downloadmirror.intel.com/9180/eng/README.txt
  82573(V/L/E) TX Unit Hang Messages

Bob, what model cards did you have in your server?
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] my notes on bond, bridge, network, kvm, host and virtual so far

2012-02-13 Thread Bob Hoffman
Gordon Messmer wrote:

> On 02/07/2012 07:26 PM, Devin Reade wrote:
>> I had a lot of problems with the network stack on VMs, both under
>> VMware ESXi and Xen where the network would just go numb.  After a
>> lot of spelunking I determined that it seemed to be related to
>> faulty TCP segment offload.
>
> Yeah, wow.  You just jogged my memory.  Intel 82573(V/L/E) ethernet
> adapters had a serious bug that would cause TX hangs:
>
> http://downloadmirror.intel.com/9180/eng/README.txt
>    82573(V/L/E) TX Unit Hang Messages
>
> Bob, what model cards did you have in your server?

http://www.supermicro.com/products/system/2U/6026/SYS-6026T-NTR_.cfm

Intel® 82576 Dual-Port Gigabit Ethernet Controller (though I think this
is basically e1000)



Re: [CentOS] my notes on bond, bridge, network, kvm, host and virtual so far

2012-02-07 Thread Bob Hoffman
Well, I had to add something to this.

I found out I was having an issue with the add-on ethernet card (e1000):
it reported 'Link detected: no' and was not working. Took it out? Yes.
Did that fix it? No.

However, I did add a second VM, and something interesting is happening:
one VM stays up and the other crashes. The one that stays up does not die.

I am thinking that the vnet0 that comes up is messed up and I need to
reset it somehow. Or it is something else, but one staying up while the
other goes down is rather odd.

Very strange.


Re: [CentOS] my notes on bond, bridge, network, kvm, host and virtual so far

2012-02-07 Thread Devin Reade
I have no idea if this is the source of your problem (I wasn't using
bonded interfaces), but it's sufficiently similar that you might
want to try it.

I had a lot of problems with the network stack on VMs, both under
VMware ESXi and Xen where the network would just go numb.  After a
lot of spelunking I determined that it seemed to be related to
faulty TCP segment offload.  Generally speaking, between the VM,
the virtual NICs, the hypervisor/host, and the physical network card, 
some levels figured that they'd offload segmentation handling to 
a lower layer, the lower layer wasn't doing it, and the upper layer
thought that it was.

Under low network load everything seemed fine but as the network
got pushed things would blow up and go numb.

Turning off TSO in the VM seemed to do the trick, although I think
in the Xen case I turned it off in the host as well.

The basic command is:  /sbin/ethtool -K ethX tso off

While I had the above command in rc.local, I would also run the
attached script in /etc/cron.hourly as there were some circumstances 
where tso would get reenabled.
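
The attached script did not survive in this archive. As a rough idea only, an hourly job of that kind could look like the sketch below. It is not the force-tso script mentioned in this thread: the interface list (eth0 eth1) and the IFACES and ETHTOOL variables are assumptions for illustration. The match is written to cover both the older "tcp segmentation offload" and the newer "tcp-segmentation-offload" spellings printed by ethtool -k.

```shell
#!/bin/sh
# Sketch only: NOT the force-tso script referenced in this thread.
# Interface names are assumptions; override IFACES to match your system.
IFACES="${IFACES:-eth0 eth1}"
ETHTOOL="${ETHTOOL:-/sbin/ethtool}"

# Succeeds (exit 0) if the `ethtool -k` output on stdin shows TSO enabled.
tso_is_on() {
    grep -Eq '^[[:space:]]*tcp[- ]segmentation[- ]offload:[[:space:]]*on'
}

# Turn TSO off on every listed interface where it is currently on.
force_tso_off() {
    for ifc in $IFACES; do
        if "$ETHTOOL" -k "$ifc" 2>/dev/null | tso_is_on; then
            "$ETHTOOL" -K "$ifc" tso off
        fi
    done
}

force_tso_off
```

Dropped into /etc/cron.hourly and made executable, something like this would re-apply the setting only when it has drifted back on, rather than blindly every hour.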

Good luck

Devin
-- 
Some people are like Slinkies: Not really good for anything, but you can't
help but smile when you see one tumble down the stairs.
- Anonymous


Re: [CentOS] my notes on bond, bridge, network, kvm, host and virtual so far

2012-02-07 Thread Devin Reade
Devin Reade g...@gno.org wrote:

[...]
> While I had the above command in rc.local, I would also run the
> attached script in /etc/cron.hourly as there were some circumstances
> where tso would get reenabled.

And in case attachments get stripped on the mailing list, you
can also get the script here:

ftp://ftp.gno.org/pub/tools/force-tso

Devin


Re: [CentOS] my notes on bond, bridge, network, kvm, host and virtual so far

2012-02-07 Thread Ian Pilcher
On 02/06/2012 09:28 PM, Bob Hoffman wrote:
> I put this page together just so I won't spam the board anymore begging
> for help..lol
> http://bobhoffman.com/vmissue.html

You're using bonding mode 0 (balance-rr), which may not work when
attached to a bridge.  Try changing to mode 1 (active-backup) and
playing with the cables.  If everything works with mode 1, you've got
an idea of where to focus.

As far as active/active bonding modes go, I know that mode 4 (LACP) is
supposed to work, but that requires support on the switch(es).
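
As an illustration of the mode 1 suggestion, a bond definition might look like the sketch below. The file path follows the usual CentOS network-scripts layout, and miimon=100 and the br0 bridge name are assumptions (br0 is the bridge name used elsewhere in this thread), not values taken from Bob's page.

```shell
# /etc/sysconfig/network-scripts/ifcfg-bond0 (sketch; values are assumptions)
DEVICE=bond0
# mode=1 is active-backup; miimon is the link-check interval in milliseconds
BONDING_OPTS="mode=1 miimon=100"
BRIDGE=br0
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
```

The active mode and the currently active slave can then be read from /proc/net/bonding/bond0 while pulling cables.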

-- 

Ian Pilcher arequip...@gmail.com
If you're going to shift my paradigm ... at least buy me dinner first.




Re: [CentOS] my notes on bond, bridge, network, kvm, host and virtual so far

2012-02-07 Thread Devin Reade
Although it was written in the context of Xen, you might also want to have a
look at the netloop nloopbacks parameter as described in
http://www.novell.com/communities/node/4094/xen-network-bridges-explained-with-troubleshooting-notes.
On a Xen cluster with 3 physical interfaces per node I had to increase
that parameter to keep interfaces from going numb.

I don't know how this translates to the libvirt/kvm world.

Devin


[CentOS] my notes on bond, bridge, network, kvm, host and virtual so far

2012-02-06 Thread Bob Hoffman
I put this page together just so I won't spam the board anymore begging
for help... lol
http://bobhoffman.com/vmissue.html

It shows a working setup of bonded eths, bridged into a VM, and a few
other things. The only missing piece is whatever on the host ends up
putting the VM internet connection into some kind of limbo.

Whether it is hardware related, bug related, or libvirt NAT related, I
don't know. I will only post here on this issue again if it ever gets
solved. At this point the server is a no-go and is getting shelved until
I can find a tech who knows this stuff and can fix it.

Right now: unsolvable.

I may just put some websites on the host computer until I can find a
reliable way of keeping the virtual guest connection 100% up.

Hope this helps someone wanting to bridge or bond.

bob


Re: [CentOS] my notes on bond, bridge, network, kvm, host and virtual so far

2012-02-06 Thread Patrick Lists
On 07-02-12 04:28, Bob Hoffman wrote:
> I put this page together just so I won't spam the board anymore begging
> for help..lol
> http://bobhoffman.com/vmissue.html

According to http://wiki.centos.org/TipsAndTricks/BondingInterfaces 
there should not be a HWADDR=mac_address in ifcfg-eth0.
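
For illustration, a slave interface file without HWADDR might look like the sketch below. The values are assumptions based on the interface names used in this thread (eth0, bond0); they are not copied from the wiki page or from Bob's configuration.

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth0 (sketch; values are assumptions)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
# Note: no HWADDR= line in this file.
```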

Regards,
Patrick



Re: [CentOS] my notes on bond, bridge, network, kvm, host and virtual so far

2012-02-06 Thread Gordon Messmer
On 02/06/2012 07:28 PM, Bob Hoffman wrote:
> I put this page together just so I won't spam the board anymore begging
> for help..lol
> http://bobhoffman.com/vmissue.html

http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Virtualization/sect-Virtualization-Network_Configuration-Bridged_networking_with_libvirt.html

Your page doesn't include your sysctl.conf, so the information available 
makes it look like your guests are subject to firewall rules on the VM 
host, none of which allow access to them.

Have you tried disabling netfilter on the bridge device, as documented 
above?
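
For readers who cannot reach that page, the bridge netfilter switches are sysctl keys along the following lines. Treat this as a sketch and check it against the linked document before relying on it.

```text
# /etc/sysctl.conf additions (sketch): keep bridged guest traffic from
# being passed through the host's ip6tables/iptables/arptables rules.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
```

After editing the file, running "sysctl -p /etc/sysctl.conf" as root reloads the settings.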

If that doesn't help, I'm curious about the problem.  Contact me off list.


Re: [CentOS] my notes on bond, bridge, network, kvm, host and virtual so far

2012-02-06 Thread DV
On Mon, Feb 6, 2012 at 23:22, Patrick Lists
centos-l...@puzzled.xs4all.nl wrote:
> On 07-02-12 04:28, Bob Hoffman wrote:
>> I put this page together just so I won't spam the board anymore begging
>> for help..lol
>> http://bobhoffman.com/vmissue.html
>
> According to http://wiki.centos.org/TipsAndTricks/BondingInterfaces
> there should not be a HWADDR=mac_address in ifcfg-eth0.
>
> Regards,
> Patrick

I second that.

What may be happening is that the VM host, which is explicitly set to
use br0 as its main host interface, works fine when bond0
communicates using the eth0 interface, and maintains the connection
while eth0's MAC is active for br0's IP address, but at some point
outbound traffic from the VM host ceases and the MAC address for eth0
times out of br0.  At that point inbound traffic can go to whichever
interface answers first in bond0, and if it is eth1, traffic will time
out since the main host is probably only using eth0 and not br0 as
its own interface.  I would be curious to know what the main host's
routing table has in it, and whether it is using eth0 or br0 as its
communication interface.

Try these:

# route -n
# ip route show table all
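
To answer the eth0-versus-br0 question quickly, the device carrying the default route can be picked out of the "ip route show" output. A small sketch, where default_route_dev is a hypothetical helper name introduced here for illustration:

```shell
#!/bin/sh
# Print the device that carries the default route.
# Reads `ip route show` style output on stdin.
default_route_dev() {
    awk '$1 == "default" {
        # Find the "dev" keyword and print the field that follows it.
        for (i = 2; i < NF; i++) {
            if ($i == "dev") { print $(i + 1); exit }
        }
    }'
}

ip route show 2>/dev/null | default_route_dev
```

If this prints eth0 or bond0 rather than br0, the host is not using the bridge for its own traffic.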


Re: [CentOS] my notes on bond, bridge, network, kvm, host and virtual so far

2012-02-06 Thread Bob Hoffman
Hi all, thanks for taking an interest.
I've been populating the page with all the data I could.

Someone who sent me a mail noticed that there was still some
NetworkManager stuff in the log messages. I had disabled it before and
did so again, but after a fresh reboot there it was.

I am using startx to enter a mini desktop on the unit (via IPMI), and
that seemed to restart NetworkManager for some odd reason.

I cleaned it up some more and am waiting to see if it fails. (It can
take up to an hour to fail, so I try to do only 3 or 4 things at once
and wait until it goes... sigh.)

http://bobhoffman.com/vmissue.html

I tried with and without HWADDR in the files and found no difference.
I took them out again for this try and am using mode 6 (balance-alb),
which rewrites MAC addresses. A number of the modes require switch
support according to the docs; I tried testing them anyway, using
HWADDR to get around it (well, worth a shot).

The worst part is the amount of time it takes. Sometimes almost an hour
goes by and I say 'eureka!!' and continue working on the server, proud
of myself... then 'bam', no connection... lol

Screwing around on the VM got me to screw up the network on there
too: now it seems eth0 does not exist and it wants eth2... lol
sigh.