Re: [CentOS-virt] NIC Stability Problems Under Xen 4.4 / CentOS 6 / Linux 3.18

2017-01-31 Thread Jinesh Choksi
On 30 January 2017 at 22:17, Adi Pircalabu  wrote:

> May I chip in here? In our environment we're randomly seeing:
>
> Jan 17 23:40:14 xen01 kernel: ixgbe 0000:04:00.1 eth6: Detected Tx Unit
> Hang
>

Someone in this thread: https://sourceforge.net/p/e1000/bugs/530/#2855
reported that "With these kernels I was only able to work around the
issue by disabling tx-checksumming offload with ethtool."

However, that was reported for kernels 4.2.6 / 4.2.8 / 4.4.8 and 4.4.10. I
just thought it might be something worth ruling out, hence mentioning it:

ethtool --offload eth6 rx off tx off
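
If trying this, the before/after offload state can be checked with the
lowercase -k flag (a quick sketch, using eth6 from the log above):

```shell
# Current checksum offload state (lowercase -k queries, uppercase -K sets)
ethtool -k eth6 | grep checksumming

# Disable rx/tx checksum offload, then re-check that both report "off"
ethtool --offload eth6 rx off tx off
ethtool -k eth6 | grep checksumming
```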


Another thing to rule out in case it's a regression with Intel NICs and TSO:

# tso => tcp-segmentation-offload
# gso => generic-segmentation-offload
# gro => generic-receive-offload
# sg => scatter-gather
# ufo => udp-fragmentation-offload (Cannot change)
# lro => large-receive-offload (Cannot change)

ethtool -K eth6 tso off gso off gro off sg off
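
If one of these offloads does turn out to be the culprit, the setting can be
made persistent on CentOS 6 through the interface's ifcfg file instead of
re-running ethtool by hand (a sketch, assuming the stock network initscripts;
trim the option list to whichever offload actually matters):

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth6 (excerpt)
# The network initscript passes ETHTOOL_OPTS to ethtool each time
# the interface is brought up.
ETHTOOL_OPTS="-K eth6 tso off gso off gro off sg off"
```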
___
CentOS-virt mailing list
CentOS-virt@centos.org
https://lists.centos.org/mailman/listinfo/centos-virt


Re: [CentOS-virt] NIC Stability Problems Under Xen 4.4 / CentOS 6 / Linux 3.18

2017-01-30 Thread Jinesh Choksi
>Are there other kernel options that might be useful to try?

pci=nomsi

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1521173/comments/13
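
Before rebooting with pci=nomsi, it may be worth confirming that the NICs are
actually using MSI/MSI-X interrupts, so the change can be expected to have any
effect (a sketch; the 04:00.1 address comes from the ixgbe log earlier in the
thread):

```shell
# Does the device advertise MSI/MSI-X, and is either currently enabled?
lspci -vv -s 04:00.1 | grep -i msi

# ixgbe normally registers one MSI-X vector per queue; with pci=nomsi
# these would collapse to a single legacy (IO-APIC) interrupt line
grep -i eth6 /proc/interrupts
```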



On 27 January 2017 at 18:21, Kevin Stange  wrote:

> On 01/27/2017 06:08 AM, Karel Hendrych wrote:
> > Have you tried to eliminate all power management features all over?
>
> I've been trying to find and disable all power management features but
> having relatively little luck with that solving the problems.  Stabbing
> in the dark, I've tried different ACPI settings, including completely
> disabling it, disabling CPU frequency scaling, and setting pcie_aspm=off
> on the kernel command line.  Are there other kernel options that might
> be useful to try?
>
> > Are the devices connected to the same network infrastructure?
>
> There are two onboard NICs and two NICs on a dual-port card in each
> server.  All devices connect to a cisco switch pair in VSS and the links
> are paired in LACP.
>
> > There has to be something common.
>
> The NICs having issues are running a native VLAN, a tagged VLAN, iSCSI
> and NFS traffic, as well as some basic management stuff over SSH, and
> they are configured with an MTU of 9000 on the native VLAN.  It's a lot
> of features, but I can't really turn them off and then actually have
> enough load on the NICs to reproduce the issue.  Several of these
> servers were installed and being burned in for 3 months without ever
> having an issue, but suddenly collapsed when I tried to bring 20 or so
> real-world VMs up on them.
>
> The other NICs in the system that are connected don't exhibit issues and
> run only VM network interfaces.  They are also in LACP and running VLAN
> tags, but normal 1500 MTU.
>
> So far it seems to correlate with NICs on the expansion cards, but it's
> a coincidence that these cards are the ones with the storage and
> management traffic.  I'm trying to swap some of this load to the onboard
> NICs to see if the issues migrate over with it, or if they stay with the
> expansion cards.
>
> If the issue exists on both NIC types, then it rules out the specific
> NIC chipset as the culprit.  It could point to the driver, but upgrading
> it to a newer version did not help and actually appeared to make
> everything worse.  This issue might actually be more to do with the PCIe
> bridge than the NICs, but these are still different motherboards with
> different PCIe bridges (5520 vs C600) experiencing the same issues.
>
> > I've been using Intel NICs with Xen/CentOS for ages with no issues.
>
> I figured that must be so.  Everyone uses Intel NICs.  If this was a
> common issue, it would probably be causing a lot of people a lot of
> trouble.
>
> --
> Kevin Stange
> Chief Technology Officer
> Steadfast | Managed Infrastructure, Datacenter and Cloud Services
> 800 S Wells, Suite 190 | Chicago, IL 60607
> 312.602.2689 X203 | Fax: 312.602.2688
> ke...@steadfast.net | www.steadfast.net


Re: [CentOS-virt] CentOS-virt Digest, Vol 111, Issue 2

2016-11-07 Thread Jinesh Choksi


On 05/11/2016 12:00, centos-virt-requ...@centos.org wrote:

would making these changes not break the existing automation that folks
might have in place?


Hi

Yes, that was my concern as well. I asked the question because I wanted
to find out whether a refresh of the CentOS-7 x86_64 AMI would enable
the new functionality. I was not requesting that the functionality be
enabled.


Regards,
Jinesh


[CentOS-virt] CentOS-7 x86_64 AMIs and consistent network device naming

2016-11-04 Thread Jinesh Choksi
Hello,

Re:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7-Beta/html/7.3_Release_Notes/bug_fixes_general_updates.html

Are there any upcoming plans for turning off the use of legacy interface
names in the next official CentOS 7.x AMI?

Currently, both the official RHEL 7.3 GA AMI and the latest available
official CentOS-7 x86_64 AMI use legacy interface names via either:

- "net.ifnames=0" kernel boot parameter
or
- "ln -vs /dev/null /etc/udev/rules.d/80-net-setup-link.rules"

As I'm setting up some automation which will use the official CentOS 7 AMI,
I'm wondering whether to enable the new naming scheme manually, pre-empting
the switchover, or to wait until the official AMIs switch over.
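
For what it's worth, pre-empting the switch manually on a running instance
should amount to undoing both legacy mechanisms and rebuilding the initramfs
(an untested sketch; the ifcfg-* files would also need renaming to match the
new interface names before rebooting):

```shell
# Drop the kernel parameter that forces legacy ethN names
grubby --update-kernel=ALL --remove-args="net.ifnames=0"

# Remove the /dev/null symlink masking the systemd naming rule
rm -f /etc/udev/rules.d/80-net-setup-link.rules

# Regenerate the initramfs so early boot picks up the change
dracut -f
```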

Also, will the next CentOS 7 AMI start using NetworkManager instead of the
traditional ifcfg-* scripts?

Regards,
Jinesh