[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
This bug was fixed in the package linux - 3.13.0-144.193

---
linux (3.13.0-144.193) trusty; urgency=medium

  * linux: 3.13.0-144.193 -proposed tracker (LP: #1755227)

  * CVE-2017-12762
    - isdn/i4l: fix buffer overflow

  * CVE-2017-17807
    - KEYS: add missing permission check for request_key() destination

  * bnx2x_attn_int_deasserted3:4323 MC assert! (LP: #1715519) // CVE-2018-126
    - net: Add ndo_gso_check
    - net: create skb_gso_validate_mac_len()
    - bnx2x: disable GSO where gso_size is too big for hardware

  * CVE-2017-17448
    - netfilter: nfnetlink_cthelper: Add missing permission checks

  * CVE-2017-11089
    - cfg80211: Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODE

  * CVE-2018-5332
    - RDS: Heap OOB write in rds_message_alloc_sgs()

  * ppc64el: Do not call ibm,os-term on panic (LP: #1736954)
    - powerpc: Do not call ppc_md.panic in fadump panic notifier

  * CVE-2017-17805
    - crypto: salsa20 - fix blkcipher_walk API usage

  * [Hyper-V] storvsc: do not assume SG list is continuous when doing bounce buffers (LP: #1742480)
    - SAUCE: storvsc: do not assume SG list is continuous when doing bounce buffers

  * Shutdown hang on 16.04 with iscsi targets (LP: #1569925)
    - scsi: libiscsi: Allow sd_shutdown on bad transport

  * CVE-2017-17741
    - KVM: Fix stack-out-of-bounds read in write_mmio

  * CVE-2017-5715 (Spectre v2 Intel)
    - [Packaging] pull in retpoline files

 -- Stefan Bader  Thu, 15 Mar 2018 15:08:03 +0100

** Changed in: linux (Ubuntu Trusty)
       Status: Fix Committed => Fix Released

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2017-11089

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2017-12762

** Changed in: linux (Ubuntu Xenial)
       Status: Fix Committed => Fix Released

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2017-16995

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2017-17862

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2017-5753

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2018-8043

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1715519

Title:
  bnx2x_attn_int_deasserted3:4323 MC assert!

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1715519/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
This bug was fixed in the package linux - 4.4.0-119.143

---
linux (4.4.0-119.143) xenial; urgency=medium

  * linux: 4.4.0-119.143 -proposed tracker (LP: #1760327)

  * Dell XPS 13 9360 bluetooth scan can not detect any device (LP: #1759821)
    - Revert "Bluetooth: btusb: fix QCA Rome suspend/resume"

linux (4.4.0-118.142) xenial; urgency=medium

  * linux: 4.4.0-118.142 -proposed tracker (LP: #1759607)

  * Kernel panic with AWS 4.4.0-1053 / 4.4.0-1015 (Trusty) (LP: #1758869)
    - x86/microcode/AMD: Do not load when running on a hypervisor

  * CVE-2018-8043
    - net: phy: mdio-bcm-unimac: fix potential NULL dereference in unimac_mdio_probe()

linux (4.4.0-117.141) xenial; urgency=medium

  * linux: 4.4.0-117.141 -proposed tracker (LP: #1755208)

  * Xenial update to 4.4.114 stable release (LP: #1754592)
    - x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
    - usbip: prevent vhci_hcd driver from leaking a socket pointer address
    - usbip: Fix implicit fallthrough warning
    - usbip: Fix potential format overflow in userspace tools
    - x86/microcode/intel: Fix BDW late-loading revision check
    - x86/retpoline: Fill RSB on context switch for affected CPUs
    - sched/deadline: Use the revised wakeup rule for suspending constrained dl tasks
    - can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once
    - can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once
    - PM / sleep: declare __tracedata symbols as char[] rather than char
    - time: Avoid undefined behaviour in ktime_add_safe()
    - timers: Plug locking race vs. timer migration
    - Prevent timer value 0 for MWAITX
    - drivers: base: cacheinfo: fix x86 with CONFIG_OF enabled
    - drivers: base: cacheinfo: fix boot error message when acpi is enabled
    - PCI: layerscape: Add "fsl,ls2085a-pcie" compatible ID
    - PCI: layerscape: Fix MSG TLP drop setting
    - mmc: sdhci-of-esdhc: add/remove some quirks according to vendor version
    - fs/select: add vmalloc fallback for select(2)
    - hwpoison, memcg: forcibly uncharge LRU pages
    - cma: fix calculation of aligned offset
    - mm, page_alloc: fix potential false positive in __zone_watermark_ok
    - ipc: msg, make msgrcv work with LONG_MIN
    - x86/ioapic: Fix incorrect pointers in ioapic_setup_resources()
    - ACPI / processor: Avoid reserving IO regions too early
    - ACPI / scan: Prefer devices without _HID/_CID for _ADR matching
    - ACPICA: Namespace: fix operand cache leak
    - netfilter: x_tables: speed up jump target validation
    - netfilter: arp_tables: fix invoking 32bit "iptable -P INPUT ACCEPT" failed in 64bit kernel
    - netfilter: nf_dup_ipv6: set again FLOWI_FLAG_KNOWN_NH at flowi6_flags
    - netfilter: nf_ct_expect: remove the redundant slash when policy name is empty
    - netfilter: nfnetlink_queue: reject verdict request from different portid
    - netfilter: restart search if moved to other chain
    - netfilter: nf_conntrack_sip: extend request line validation
    - netfilter: use fwmark_reflect in nf_send_reset
    - ext2: Don't clear SGID when inheriting ACLs
    - reiserfs: fix race in prealloc discard
    - reiserfs: don't preallocate blocks for extended attributes
    - reiserfs: Don't clear SGID when inheriting ACLs
    - fs/fcntl: f_setown, avoid undefined behaviour
    - scsi: libiscsi: fix shifting of DID_REQUEUE host byte
    - Input: trackpoint - force 3 buttons if 0 button is reported
    - usb: usbip: Fix possible deadlocks reported by lockdep
    - usbip: fix stub_rx: get_pipe() to validate endpoint number
    - usbip: fix stub_rx: harden CMD_SUBMIT path to handle malicious input
    - usbip: prevent leaking socket pointer address in messages
    - um: link vmlinux with -no-pie
    - vsyscall: Fix permissions for emulate mode with KAISER/PTI
    - eventpoll.h: add missing epoll event masks
    - x86/microcode/intel: Extend BDW late-loading further with LLC size check
    - hrtimer: Reset hrtimer cpu base proper on CPU hotplug
    - dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state
    - ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL
    - ipv6: fix udpv6 sendmsg crash caused by too small MTU
    - ipv6: ip6_make_skb() needs to clear cork.base.dst
    - lan78xx: Fix failure in USB Full Speed
    - net: igmp: fix source address check for IGMPv3 reports
    - tcp: __tcp_hdrlen() helper
    - net: qdisc_pkt_len_init() should be more robust
    - pppoe: take ->needed_headroom of lower device into account on xmit
    - r8169: fix memory corruption on retrieval of hardware statistics.
    - sctp: do not allow the v4 socket to bind a v4mapped v6 address
    - sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf
    - vmxnet3: repair memory leak
    - net: Allow neigh contructor functions ability to modify the primary_key
    - ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
Hi,

We do ship an ISO for ppc64le for Trusty. I'm not sure whether it supports bare metal/PowerNV or only runs as an LPAR under PowerVM, but it's probably a bit moot at this point.

The good news is that, as you can see, the artful kernel was released with the fix. The Xenial kernel also contains the fix; I'm not sure why that wasn't auto-added to this bug, but the release notes contain this fix: https://launchpad.net/ubuntu/+source/linux/4.4.0-119.143

I am not sure what the final status of the patch in Trusty is; I will let you know when I find out.

Regards,
Daniel
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
This bug was fixed in the package linux - 4.13.0-38.43

---
linux (4.13.0-38.43) artful; urgency=medium

  * linux: 4.13.0-38.43 -proposed tracker (LP: #1755762)

  * Servers going OOM after updating kernel from 4.10 to 4.13 (LP: #1748408)
    - i40e: Fix memory leak related filter programming status
    - i40e: Add programming descriptors to cleaned_count

  * [SRU] Lenovo E41 Mic mute hotkey is not responding (LP: #1753347)
    - platform/x86: ideapad-laptop: Increase timeout to wait for EC answer

  * fails to dump with latest kpti fixes (LP: #1750021)
    - kdump: write correct address of mem_section into vmcoreinfo

  * headset mic can't be detected on two Dell machines (LP: #1748807)
    - ALSA: hda/realtek - Support headset mode for ALC215/ALC285/ALC289
    - ALSA: hda - Fix headset mic detection problem for two Dell machines
    - ALSA: hda - Fix a wrong FIXUP for alc289 on Dell machines

  * CIFS SMB2/SMB3 does not work for domain based DFS (LP: #1747572)
    - CIFS: make IPC a regular tcon
    - CIFS: use tcon_ipc instead of use_ipc parameter of SMB2_ioctl
    - CIFS: dump IPC tcon in debug proc file

  * i2c-thunderx: erroneous error message "unhandled state: 0" (LP: #1754076)
    - i2c: octeon: Prevent error message on bus error

  * hisi_sas: Add disk LED support (LP: #1752695)
    - scsi: hisi_sas: directly attached disk LED feature for v2 hw

  * EDAC, sb_edac: Backport 1 patch to Ubuntu 17.10 (Fix missing DIMM sysfs entries with KNL SNC2/SNC4 mode) (LP: #1743856)
    - EDAC, sb_edac: Fix missing DIMM sysfs entries with KNL SNC2/SNC4 mode

  * [regression] Colour banding and artefacts appear system-wide on an Asus Zenbook UX303LA with Intel HD 4400 graphics (LP: #1749420)
    - drm/edid: Add 6 bpc quirk for CPT panel in Asus UX303LA

  * DVB Card with SAA7146 chipset not working (LP: #1742316)
    - vmalloc: fix __GFP_HIGHMEM usage for vmalloc_32 on 32b systems

  * [Asus UX360UA] battery status in unity-panel is not changing when battery is being charged (LP: #1661876) // AC adapter status not detected on Asus ZenBook UX410UAK (LP: #1745032)
    - ACPI / battery: Add quirk for Asus UX360UA and UX410UAK

  * ASUS UX305LA - Battery state not detected correctly (LP: #1482390)
    - ACPI / battery: Add quirk for Asus GL502VSK and UX305LA

  * support thunderx2 vendor pmu events (LP: #1747523)
    - perf pmu: Extract function to get JSON alias map
    - perf pmu: Pass pmu as a parameter to get_cpuid_str()
    - perf tools arm64: Add support for get_cpuid_str function.
    - perf pmu: Add helper function is_pmu_core to detect PMU CORE devices
    - perf vendor events arm64: Add ThunderX2 implementation defined pmu core events
    - perf pmu: Add check for valid cpuid in perf_pmu__find_map()

  * lpfc.ko module doesn't work (LP: #1746970)
    - scsi: lpfc: Fix loop mode target discovery

  * Ubuntu 17.10 crashes on vmalloc.c (LP: #1739498)
    - powerpc/mm/book3s64: Make KERN_IO_START a variable
    - powerpc/mm/slb: Move comment next to the code it's referring to
    - powerpc/mm/hash64: Make vmalloc 56T on hash

  * ethtool -p fails to light NIC LED on HiSilicon D05 systems (LP: #1748567)
    - net: hns: add ACPI mode support for ethtool -p

  * CVE-2017-17807
    - KEYS: add missing permission check for request_key() destination

  * [Artful SRU] Fix capsule update regression (LP: #1746019)
    - efi/capsule-loader: Reinstate virtual capsule mapping

  * [Artful/Bionic] [Config] enable EDAC_GHES for ARM64 (LP: #1747746)
    - Ubuntu: [Config] enable EDAC_GHES for ARM64

  * linux-tools: perf incorrectly linking libbfd (LP: #1748922)
    - SAUCE: tools -- add ability to disable libbfd
    - [Packaging] correct disablement of libbfd

  * Cherry pick c96f5471ce7d for delayacct fix (LP: #1747769)
    - delayacct: Account blkio completion on the correct task

  * Error in CPU frequency reporting when nominal and min pstates are same (cpufreq) (LP: #1746174)
    - cpufreq: powernv: Dont assume distinct pstate values for nominal and pmin

  * retpoline abi files are empty on i386 (LP: #1751021)
    - [Packaging] retpoline-extract -- instantiate retpoline files for i386
    - [Packaging] final-checks -- sanity checking ABI contents
    - [Packaging] final-checks -- check for empty retpoline files

  * [P9,Power NV][WSP][Ubuntu 1804] : "Kernel access of bad area " when grouping different pmu events using perf fuzzer . (perf:) (LP: #1746225)
    - powerpc/perf: Fix oops when grouping different pmu events

  * bnx2x_attn_int_deasserted3:4323 MC assert! (LP: #1715519) // CVE-2018-126
    - net: create skb_gso_validate_mac_len()
    - bnx2x: disable GSO where gso_size is too big for hardware

  * Ubuntu16.04.03: ISAv3 initialize MMU registers before setting partition table (LP: #1736145)
    - powerpc/64s: Initialize ISAv3 MMU registers before setting partition table

  * powerpc/powernv: Flush console before platform error reboot (LP: #1735159)
    -
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
I didn't think Trusty would even run baremetal on ppc64el?
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
Fantastic! If I understand correctly, that is sufficient for verification-done-artful, so I am changing that over for you.

The one remaining kernel is Trusty 3.13. I am guessing your module doesn't compile for that? If it doesn't, there probably isn't much point in booting with just a virtual ethernet adaptor, as the change is specifically to the bnx2x code.

Thanks again for your prompt testing efforts!

Regards,
Daniel

** Tags removed: verification-needed-artful
** Tags added: verification-done-artful
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
Full test passes. No crash with proposed hwe-16.04 kernel and tso/gso enabled.
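For anyone repeating this kind of check, the sketch below shows one way to confirm from a script that TSO and GSO are actually on before a test run. It parses text in the `key: on|off` shape printed by `ethtool -k <iface>`. The sample text, the interface name `eth0`, and the helper names `parse_offloads` and `tso_gso_enabled` are illustrative assumptions, not part of the reporter's test setup.

```python
import re

# Illustrative sample in the style of `ethtool -k eth0` output.
# Real output contains many more feature lines; this is an assumption
# made only so the example is self-contained.
SAMPLE = """\
Features for eth0:
tcp-segmentation-offload: on
	tx-tcp-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: off [fixed]
"""


def parse_offloads(text):
    """Return a dict mapping each feature name to True (on) or False (off)."""
    features = {}
    for line in text.splitlines():
        # Matches lines such as "generic-segmentation-offload: on"
        match = re.match(r"\s*([\w-]+):\s+(on|off)\b", line)
        if match:
            features[match.group(1)] = match.group(2) == "on"
    return features


def tso_gso_enabled(text):
    """True only if both TSO and GSO are reported as on."""
    feats = parse_offloads(text)
    return (feats.get("tcp-segmentation-offload", False)
            and feats.get("generic-segmentation-offload", False))


if __name__ == "__main__":
    print("tso/gso enabled:", tso_gso_enabled(SAMPLE))
```

In practice the text would come from running `ethtool -k` on the bnx2x interface rather than from an embedded sample.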
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
After conferring with the IBM Novalink dev team, we found a way to exploit some undocumented preliminary artful support in the DKMS, to successfully install the hwe-16.04 kernel (4.13.0-38) from xenial-proposed. It boots and passes traffic. I hope to report back later today or possibly tomorrow with full test results.

Please let me know if this is sufficient testing for applying the verification-done-artful tag.
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
These Linux LPARs are running Novalink, in a mode that allows them to serve in place of the traditional PowerVM AIX VIO server. So they're providing virtual I/O for other LPARs (this is how we ended up sending 64k packets into the bnx2x NICs via Open vSwitch). The DKMS modules enable that function.

This also restricts what we can do with the OS in terms of trying other versions; the Novalink software is supported only on Xenial. Rebuilding a machine on artful without Novalink means none of the other LPARs on that machine will run, which would be very disruptive for us, even on a test machine.

We could probably find a way to test artful on a VM, but that would be using a virtual ethernet adapter and we wouldn't be testing the bnx2x code. I don't know if that would be valuable to you?
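To make the failure mode concrete, here is a simplified Python model of the idea behind the changelog entry "bnx2x: disable GSO where gso_size is too big for hardware". It is only a sketch: the `Packet` type, the function names, and the `HW_MAX_SEGMENT_LEN` limit are hypothetical, and the real driver logic and its thresholds may differ. The point is that when a requested segment plus its headers would exceed what the NIC can handle, hardware GSO is declined for that packet so the stack can segment it in software instead.

```python
from dataclasses import dataclass

# Hypothetical hardware limit, for illustration only. The actual value
# used by the bnx2x driver is not stated in this thread.
HW_MAX_SEGMENT_LEN = 9700


@dataclass
class Packet:
    gso_size: int    # payload bytes per segment; 0 means "not a GSO packet"
    header_len: int  # MAC + network + transport header bytes per segment


def segment_fits(pkt, limit):
    """Simplified stand-in for a MAC-length check such as
    skb_gso_validate_mac_len(): headers plus one segment must fit."""
    return pkt.header_len + pkt.gso_size <= limit


def must_disable_hw_gso(pkt, limit=HW_MAX_SEGMENT_LEN):
    """True if hardware GSO should be turned off for this packet."""
    if pkt.gso_size == 0:
        return False  # nothing to segment
    return not segment_fits(pkt, limit)


if __name__ == "__main__":
    for pkt in (Packet(1448, 66), Packet(65000, 66)):
        print(pkt, "-> software segmentation:", must_disable_hw_gso(pkt))
```

Under these assumed numbers, an ordinary 1448-byte segment passes through to the hardware, while an oversized one of the kind a virtual switch might forward is handled in software.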
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
Hi,

Thanks for the Xenial test!

The kernel team process is that patches will always be committed to the most recent kernel first and then back to older kernels, so that no one ends up with a regression if they upgrade to a more recent kernel. So if it is applied to Xenial it will be applied to Artful :) (FYI, it's already in the kernel that will be in Bionic next month.)

I don't know what you're compiling with DKMS; are you able to do any test at all without it? Just testing that the machine boots and that you can ping someone would make me more comfortable.

Regards,
Daniel
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
We're not set up to test the artful kernel, but I can envision a near-to-mid-term future where we will need this patch to be present in the artful kernel. What's the best way for me to help make sure that future obtains?
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
** Tags removed: verification-needed-xenial
** Tags added: verification-done-xenial
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
Our validation tests pass with tso/gso enabled and 4.4.0-117 (no crash). We'll do a couple more tests but it's looking good so far.
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
I was able to install the ppc64el 4.4.0-117 kernel from xenial-proposed. It boots and passes at least enough traffic to make an interactive ssh session function.

I'm having some DKMS issues, however, which are preventing me from validating that the patch prevents the crash as expected (one of the modules we need has BUILD_EXCLUSIVE for 4.8 and 4.10 only). I am working with the IBM development team that publishes these modules to get that resolved ASAP (they were surprised by this, so we are waiting for answers from individual engineers, probably Monday).
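As background on the DKMS problem described here: DKMS lets a module's `dkms.conf` set `BUILD_EXCLUSIVE_KERNEL` to a regular expression, and the module is only built for kernel releases that match it. The Python sketch below illustrates why a 4.4.0-117 kernel would be skipped under such a restriction. The exact pattern in the Novalink modules is not given in this thread, so the regex and the kernel release strings here are assumptions.

```python
import re

# Assumed pattern: the thread only says the module is restricted to
# 4.8 and 4.10 kernels, not how the restriction is written.
BUILD_EXCLUSIVE_KERNEL = r"^4\.(8|10)\."


def dkms_would_build(kernel_release, pattern=BUILD_EXCLUSIVE_KERNEL):
    """Mimic the exclusivity check: build only if the release matches."""
    return re.search(pattern, kernel_release) is not None


if __name__ == "__main__":
    # Release strings are illustrative examples.
    for release in ("4.4.0-117-generic", "4.8.0-58-generic",
                    "4.10.0-42-generic", "4.13.0-38-generic"):
        status = "build" if dkms_would_build(release) else "skip"
        print(f"{release}: {status}")
```

With a pattern like this, both the 4.4 kernel from xenial-proposed and the 4.13 hwe kernel fall outside the allowed set, which is consistent with the module needing a change from its publisher before full testing could proceed.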
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
Ok, that works. I should be able to provide basic validation much more easily.

We are having trouble reproducing the crash with the unpatched kernel, as we've put many remediating workarounds in place and it's been an adventure to remember where they all are and how exactly they all interacted (there are also some other bugs currently being worked on in our test environment that affect our ability to reproduce). We know it will crash, though, and as soon as we get the test setup in place to reproduce the crash I'm confident we can repeat it every time and get a good validation that the patch is solving our problem as well.

If we still can't reproduce the crash by COB Monday, I'll switch gears to provide just a basic boot/function validation and will make sure I get that to you by Tuesday at the latest.

We only have the environment to test xenial ppc64el. We have xenial and trusty x86_64, but not with bnx2x. We don't have artful anywhere that I can test with. Based on that, I was only planning to provide a report for xenial ppc64el. Does the smoke testing make sense to perform without bnx2x hardware present?
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
Hi,

As well as Po-Hsu's comment above, I also have this internal update from the kernel team:

  As this is also a security fix, don't stress too much. If things could be verified for at least one of the kernels until next week, that is better than nothing. We are rather unlikely to rip out fixes if they have a CVE (and do not appear to regress other things).

(FYI, this issue is covered by CVE-2018-126.)

So, just to confirm, in order of priority:

 1) First confirm there are no new regressions (just a quick 'smoke test') on the 3 kernels.
 2) Second, do a full test of 1 of the kernels.
 3) Test the remaining 2 kernels.

Please keep this bug updated as you go through each step.

Hopefully this helps reduce the pressure for you!

Regards,
Daniel
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
In addition to what Po-Hsu Lin and Daniel Axtens said: having each of the kernels test-booted and checked for obvious regressions would be great, and if in addition one of them can be tested to confirm the reported problem is fixed, then this is sufficient.

Mainly, this procedure is to ensure we do not break things while trying to fix problems. If there is a regression, it takes time to isolate the change and revert it. Hence the urgency to get the reporter's test results as soon as possible.
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
Hello r.stricklin,

The planned SRU schedule this cycle:

  * Current cycle: 09-Mar through 31-Mar
      09-Mar           Last day for kernel commits for this cycle.
      12-Mar - 17-Mar  Kernel prep week.
      18-Mar - 30-Mar  Bug verification & Regression testing.
      02-Apr           Release to -updates.

So yeah, don't worry about the deadline mentioned in #6.

However, it would be great to get this verified before the last day; that way, if there is any issue with this fix, we still have some time to pull it out.

Thanks
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
Hi,

I am the support engineer on the Canonical side who has been working on this with IBM Support on your behalf.

Apologies for the confusion. I will contact our kernel team now and get this clarified for you as soon as I can.

Now, I can't speak for the kernel team or make any commitments on their behalf. However, I know the full test process is time-consuming, so in the meantime, if you are able to boot with each of the kernels and just quickly verify that there are no obvious regressions (just that boot succeeds and that the network card can still send and receive data), I think that would be a very good first step.

Regards,
Daniel
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
I am the end user who originally reported this problem. We have been working with Canonical through IBM Support to get this problem resolved.

When I last spoke with IBM about this a few weeks ago, I got the impression we'd have a little extra time to perform verification (e.g. the test period would be between 18 and 31 March) and planned for our test resources accordingly. I now see that we'd need to perform verification by the 23rd, or possibly 26th?

I will do what I can to increase urgency, but even having a couple of extra days would go a long way to help us get the verification done. Please let me know ASAP if this is or is not possible.
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'. If the problem still exists, change the tag 'verification-needed-trusty' to 'verification-failed-trusty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

** Tags added: verification-needed-trusty
** Tags added: verification-needed-xenial
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-artful' to 'verification-done-artful'. If the problem still exists, change the tag 'verification-needed-artful' to 'verification-failed-artful'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

** Tags added: verification-needed-artful
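The tag handling requested in these verification notices amounts to a small state change per release series. As a rough sketch (the function name `update_verification_tags` and the list-of-strings representation are assumptions for illustration, not a Launchpad API), it could be modelled in Python like this:

```python
def update_verification_tags(tags, series, solved):
    """Swap 'verification-needed-<series>' for the matching outcome tag.

    tags   -- iterable of tag strings currently on the bug
    series -- release series name, e.g. "xenial"
    solved -- True if the -proposed kernel fixed the problem
    """
    needed = f"verification-needed-{series}"
    if needed not in tags:
        raise ValueError(f"bug is not awaiting verification for {series}")
    outcome = "done" if solved else "failed"
    # Remove the 'needed' tag and add the outcome tag; other tags are kept.
    updated = (set(tags) - {needed}) | {f"verification-{outcome}-{series}"}
    return sorted(updated)


if __name__ == "__main__":
    current = ["verification-needed-artful", "verification-needed-xenial"]
    print(update_verification_tags(current, "xenial", solved=True))
```

Each series is tracked independently, which is why the thread shows xenial and artful being marked done at different times while trusty remained open.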
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Xenial)
       Status: New => Fix Committed

** Also affects: linux (Ubuntu Trusty)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Trusty)
       Status: New => Fix Committed
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
** Also affects: linux (Ubuntu Artful)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Artful)
       Status: New => Fix Committed
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
This bug was fixed in the package linux - 4.15.0-10.11

---
linux (4.15.0-10.11) bionic; urgency=medium

  * linux: 4.15.0-10.11 -proposed tracker (LP: #1749250)

  * "swiotlb: coherent allocation failed" dmesg spam with linux 4.15.0-9.10
    (LP: #1749202)
    - swiotlb: suppress warning when __GFP_NOWARN is set
    - drm/ttm: specify DMA_ATTR_NO_WARN for huge page pools

  * linux-tools: perf incorrectly linking libbfd (LP: #1748922)
    - SAUCE: tools -- add ability to disable libbfd
    - [Packaging] correct disablement of libbfd

  * [Artful] Realtek ALC225: 2 secs noise when a headset plugged in
    (LP: #1744058)
    - ALSA: hda/realtek - update ALC225 depop optimize

  * [Artful] Support headset mode for DELL WYSE (LP: #1723913)
    - SAUCE: ALSA: hda/realtek - Add support headset mode for DELL WYSE

  * headset mic can't be detected on two Dell machines (LP: #1748807)
    - ALSA: hda/realtek - Support headset mode for ALC215/ALC285/ALC289
    - ALSA: hda - Fix headset mic detection problem for two Dell machines

  * Bionic update to v4.15.3 stable release (LP: #1749191)
    - ip6mr: fix stale iterator
    - net: igmp: add a missing rcu locking section
    - qlcnic: fix deadlock bug
    - qmi_wwan: Add support for Quectel EP06
    - r8169: fix RTL8168EP take too long to complete driver initialization.
    - tcp: release sk_frag.page in tcp_disconnect
    - vhost_net: stop device during reset owner
    - ipv6: addrconf: break critical section in addrconf_verify_rtnl()
    - ipv6: change route cache aging logic
    - Revert "defer call to mem_cgroup_sk_alloc()"
    - net: ipv6: send unsolicited NA after DAD
    - rocker: fix possible null pointer dereference in
      rocker_router_fib_event_work
    - tcp_bbr: fix pacing_gain to always be unity when using lt_bw
    - cls_u32: add missing RCU annotation.
    - ipv6: Fix SO_REUSEPORT UDP socket with implicit sk_ipv6only
    - soreuseport: fix mem leak in reuseport_add_sock()
    - net_sched: get rid of rcu_barrier() in tcf_block_put_ext()
    - net: sched: fix use-after-free in tcf_block_put_ext
    - media: mtk-vcodec: add missing MODULE_LICENSE/DESCRIPTION
    - media: soc_camera: soc_scale_crop: add missing
      MODULE_DESCRIPTION/AUTHOR/LICENSE
    - media: tegra-cec: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
    - gpio: uniphier: fix mismatch between license text and MODULE_LICENSE
    - crypto: tcrypt - fix S/G table for test_aead_speed()
    - Linux 4.15.3

  * bnx2x_attn_int_deasserted3:4323 MC assert! (LP: #1715519) //
    CVE-2018-1000026
    - net: create skb_gso_validate_mac_len()
    - bnx2x: disable GSO where gso_size is too big for hardware

  * ethtool -p fails to light NIC LED on HiSilicon D05 systems (LP: #1748567)
    - net: hns: add ACPI mode support for ethtool -p

  * CVE-2017-5715 (Spectre v2 Intel)
    - [Packaging] retpoline files must be sorted
    - [Packaging] pull in retpoline files

  * [Feature] PXE boot with Intel Omni-Path (LP: #1712031)
    - d-i: Add hfi1 to nic-modules

  * CVE-2017-5715 (Spectre v2 retpoline)
    - [Packaging] retpoline -- add call site validation
    - [Config] disable retpoline checks for first upload

  * Do not duplicate changelog entries assigned to more than one bug or CVE
    (LP: #1743383)
    - [Packaging] git-ubuntu-log -- handle multiple bugs/cves better

 -- Seth Forshee  Tue, 13 Feb 2018 11:33:58 -0600

** Changed in: linux (Ubuntu)
       Status: Confirmed => Fix Released

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2017-5715
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
** Description changed:

  SRU Justification
  =

  A ppc64le system runs as a guest under PowerVM. This guest has a bnx2x
  card attached, and uses openvswitch to bridge an ibmveth interface for
  traffic from other LPARs.

  We see the following crash sometimes when running netperf:
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_attn_int_deasserted3:4323(enP24p1s0f2)]MC assert!
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_mc_assert:720(enP24p1s0f2)]XSTORM_ASSERT_LIST_INDEX 0x2
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_mc_assert:736(enP24p1s0f2)]XSTORM_ASSERT_INDEX 0x0 = 0x 0x25e42a7e 0x00462a38 0x00010052
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_mc_assert:750(enP24p1s0f2)]Chip Revision: everest3, FW Version: 7_13_1
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_attn_int_deasserted3:4329(enP24p1s0f2)]driver assert
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_panic_dump:923(enP24p1s0f2)]begin crash dump -
  ... (dump of registers follows) ...

  Subsequent debugging reveals that the packets causing the issue come
  through the ibmveth interface - from the AIX LPAR. The veth protocol is
  'special' - communication between LPARs on the same chassis can use very
  large (64k) frames to reduce overhead. Normal networks cannot handle
  such large packets, so traditionally, the VIOS partition would signal to
  the AIX partitions that it was 'special', and AIX would send regular,
  ethernet-sized packets to VIOS, which VIOS would then send out.

  This signalling between VIOS and AIX is done in a way that is not
  standards-compliant, and so was never made part of Linux. Instead, the
  Linux driver has always understood large frames and passed them up the
  network stack.

  In some cases (e.g. with TCP), multiple TCP segments are coalesced into
  one large packet. In Linux, this goes through the generic receive
  offload code, using a similar mechanism to GSO. These segments can be
  very large which presents as a very large MSS (maximum segment size) or
  gso_size.

  Normally, the large packet is simply passed to whatever network
  application on Linux is going to consume it, and everything is OK.

  However, in this case, the packets go through Open vSwitch, and are then
  passed to the bnx2x driver. The bnx2x driver/hardware supports TSO and
  GSO, but with a restriction: the maximum segment size is limited to
  around 9700 bytes. Normally this is more than adequate. However, if a
  large packet with very large (>9700 byte) TCP segments arrives through
  ibmveth, and is passed to bnx2x, the hardware will panic.

  [Impact]

  bnx2x card panics, requiring power cycle to restore functionality.

  The workaround is turning off TSO, which prevents the crash as the
  kernel resegments *all* packets in software, not just ones that are too
  big. This has a performance cost.

  [Fix]

  Test packet size in bnx2x feature check path and disable GSO if it is
- too large.
+ too large. To do this we move a function from one file to another and
+ add another in the networking core.

  [Regression Potential]

- Limited to bnx2x card driver.
+ A/B/X: The changes to the network core are easily reviewed. The changes
+ to behaviour are limited to the bnx2x card driver.

  The most likely failure case is a false-positive on the size check,
  which would lead to a performance regression only.
+
+ T: This also involves a different change to the networking core to add
+ the old-style GSO checking, which is more invasive. However the changes
+ are simple and easily reviewed.
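For readers following the [Fix] section, here is a minimal userspace sketch of the idea: check the per-segment length in a feature-check step and clear the GSO feature bit when a segment would exceed what the hardware accepts. This is not the upstream patch. The struct, the feature bits and the function names (`fake_pkt`, `FEAT_GSO`, `fake_features_check`) are hypothetical stand-ins, and the 9700-byte limit is taken from the "around 9700 bytes" figure in the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits; the kernel uses its own feature flags. */
#define FEAT_CSUM 0x1u
#define FEAT_GSO  0x2u

/* Assumed hardware limit, from "around 9700 bytes" in the description. */
#define HW_MAX_SEG_LEN 9700u

/* Hypothetical stand-in for the packet metadata the driver inspects. */
struct fake_pkt {
    uint32_t hdr_len;  /* MAC + IP + TCP header bytes carried by each segment */
    uint32_t gso_size; /* payload bytes per segment; 0 means "not a GSO packet" */
};

/* True if one segment (headers plus payload) fits within max_len bytes. */
static bool fake_gso_fits(const struct fake_pkt *pkt, uint32_t max_len)
{
    assert(pkt != NULL);
    return pkt->hdr_len + pkt->gso_size <= max_len;
}

/*
 * Feature check: return the feature set the device may use for this packet.
 * If the segments are too large for the hardware, drop the GSO bit so the
 * stack segments this one packet in software instead of handing it down.
 */
static uint32_t fake_features_check(const struct fake_pkt *pkt, uint32_t features)
{
    assert(pkt != NULL);
    if (pkt->gso_size != 0 && !fake_gso_fits(pkt, HW_MAX_SEG_LEN))
        features &= ~FEAT_GSO;
    return features;
}
```

With this shape, only oversized packets lose offload; ordinary traffic keeps the GSO bit, which is why a false positive on the size check would cost performance rather than correctness.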
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
This has been assigned CVE-2018-1000026.

** CVE added: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2018-1000026

** Description changed:

  SRU Justification
  =

  A ppc64le system runs as a guest under PowerVM. This guest has a bnx2x
  card attached, and uses openvswitch to bridge an ibmveth interface for
  traffic from other LPARs.

  We see the following crash sometimes when running netperf:
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_attn_int_deasserted3:4323(enP24p1s0f2)]MC assert!
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_mc_assert:720(enP24p1s0f2)]XSTORM_ASSERT_LIST_INDEX 0x2
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_mc_assert:736(enP24p1s0f2)]XSTORM_ASSERT_INDEX 0x0 = 0x 0x25e42a7e 0x00462a38 0x00010052
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_mc_assert:750(enP24p1s0f2)]Chip Revision: everest3, FW Version: 7_13_1
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_attn_int_deasserted3:4329(enP24p1s0f2)]driver assert
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_panic_dump:923(enP24p1s0f2)]begin crash dump -
  ... (dump of registers follows) ...

  Subsequent debugging reveals that the packets causing the issue come
  through the ibmveth interface - from the AIX LPAR. The veth protocol is
  'special' - communication between LPARs on the same chassis can use very
  large (64k) frames to reduce overhead. Normal networks cannot handle
  such large packets, so traditionally, the VIOS partition would signal to
  the AIX partitions that it was 'special', and AIX would send regular,
  ethernet-sized packets to VIOS, which VIOS would then send out.

  This signalling between VIOS and AIX is done in a way that is not
  standards-compliant, and so was never made part of Linux. Instead, the
  Linux driver has always understood large frames and passed them up the
  network stack.

  In some cases (e.g. with TCP), multiple TCP segments are coalesced into
  one large packet. In Linux, this goes through the generic receive
  offload code, using a similar mechanism to GSO. These segments can be
  very large which presents as a very large MSS (maximum segment size) or
  gso_size.

  Normally, the large packet is simply passed to whatever network
  application on Linux is going to consume it, and everything is OK.

  However, in this case, the packets go through Open vSwitch, and are then
  passed to the bnx2x driver. The bnx2x driver/hardware supports TSO and
  GSO, but with a restriction: the maximum segment size is limited to
  around 9700 bytes. Normally this is more than adequate. However, if a
  large packet with very large (>9700 byte) TCP segments arrives through
  ibmveth, and is passed to bnx2x, the hardware will panic.

- Impact
- --
+ [Impact]

  bnx2x card panics, requiring power cycle to restore functionality.

  The workaround is turning off TSO, which prevents the crash as the
  kernel resegments *all* packets in software, not just ones that are too
  big. This has a performance cost.

- Fix
- ---
+ [Fix]

  Test packet size in bnx2x feature check path and disable GSO if it is
  too large.

- Regression Potential
-
+ [Regression Potential]

  Limited to bnx2x card driver.

  The most likely failure case is a false-positive on the size check,
  which would lead to a performance regression only.
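To put a rough number on the performance cost of the TSO-off workaround, the sketch below counts how many segments software has to produce for one large frame. It is only an illustration: `fake_segment_count` is a hypothetical helper, and the 1448-byte segment size used in the note after it is an assumed typical TCP payload on a 1500-byte MTU, not a figure from this bug.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of segments needed to carry payload_len bytes when each segment
 * holds at most seg_size payload bytes (ceiling division).
 */
static uint32_t fake_segment_count(uint32_t payload_len, uint32_t seg_size)
{
    assert(seg_size > 0);
    return (payload_len + seg_size - 1) / seg_size;
}
```

Under these assumptions, `fake_segment_count(65536, 1448)` gives 46, so with TSO off a single 64k frame costs the CPU 46 rounds of header construction and checksumming that the card would otherwise do, and that applies to every large packet rather than only the oversized ones.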
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
** Description changed:

  SRU Justification
  =

  A ppc64le system runs as a guest under PowerVM. This guest has a bnx2x
  card attached, and uses openvswitch to bridge an ibmveth interface for
  traffic from other LPARs.

  We see the following crash sometimes when running netperf:
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_attn_int_deasserted3:4323(enP24p1s0f2)]MC assert!
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_mc_assert:720(enP24p1s0f2)]XSTORM_ASSERT_LIST_INDEX 0x2
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_mc_assert:736(enP24p1s0f2)]XSTORM_ASSERT_INDEX 0x0 = 0x 0x25e42a7e 0x00462a38 0x00010052
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_mc_assert:750(enP24p1s0f2)]Chip Revision: everest3, FW Version: 7_13_1
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_attn_int_deasserted3:4329(enP24p1s0f2)]driver assert
  May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_panic_dump:923(enP24p1s0f2)]begin crash dump -
  ... (dump of registers follows) ...

  Subsequent debugging reveals that the packets causing the issue come
  through the ibmveth interface - from the AIX LPAR. The veth protocol is
  'special' - communication between LPARs on the same chassis can use very
  large (64k) frames to reduce overhead. Normal networks cannot handle
  such large packets, so traditionally, the VIOS partition would signal to
  the AIX partitions that it was 'special', and AIX would send regular,
  ethernet-sized packets to VIOS, which VIOS would then send out.

  This signalling between VIOS and AIX is done in a way that is not
  standards-compliant, and so was never made part of Linux. Instead, the
  Linux driver has always understood large frames and passed them up the
  network stack.

  In some cases (e.g. with TCP), multiple TCP segments are coalesced into
  one large packet. In Linux, this goes through the generic receive
  offload code, using a similar mechanism to GSO. These segments can be
  very large which presents as a very large MSS (maximum segment size) or
  gso_size.

  Normally, the large packet is simply passed to whatever network
  application on Linux is going to consume it, and everything is OK.

  However, in this case, the packets go through Open vSwitch, and are then
  passed to the bnx2x driver. The bnx2x driver/hardware supports TSO and
  GSO, but with a restriction: the maximum segment size is limited to
  around 9700 bytes. Normally this is more than adequate. However, if a
  large packet with very large (>9700 byte) TCP segments arrives through
  ibmveth, and is passed to bnx2x, the hardware will panic.

  Impact
  --

  bnx2x card panics, requiring power cycle to restore functionality.

  The workaround is turning off TSO, which prevents the crash as the
  kernel resegments *all* packets in software, not just ones that are too
  big. This has a performance cost.

-
  Fix
  ---

- Test packet size in bnx2x feature check path.
+ Test packet size in bnx2x feature check path and disable GSO if it is
+ too large.

  Regression Potential

  Limited to bnx2x card driver.

  The most likely failure case is a false-positive on the size check,
  which would lead to a performance regression only.
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
** Description changed:

- (This bug provides a place to track the progress of this issue upstream
- and then in to Ubuntu.)
+ SRU Justification
+ =

  A ppc64le system runs as a guest under PowerVM. This guest has a bnx2x
  card attached, and uses openvswitch to bridge an ibmveth interface for
  traffic from other LPARs.

  We see the following crash sometimes when running netperf:
- May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_attn_int_deasserted3:4323(enP24p1s0f2)]MC assert!
- May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_mc_assert:720(enP24p1s0f2)]XSTORM_ASSERT_LIST_INDEX 0x2
- May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_mc_assert:736(enP24p1s0f2)]XSTORM_ASSERT_INDEX 0x0 = 0x 0x25e42a7e 0x00462a38 0x00010052
- May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_mc_assert:750(enP24p1s0f2)]Chip Revision: everest3, FW Version: 7_13_1
- May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_attn_int_deasserted3:4329(enP24p1s0f2)]driver assert
- May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_panic_dump:923(enP24p1s0f2)]begin crash dump -
+ May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_attn_int_deasserted3:4323(enP24p1s0f2)]MC assert!
+ May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_mc_assert:720(enP24p1s0f2)]XSTORM_ASSERT_LIST_INDEX 0x2
+ May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_mc_assert:736(enP24p1s0f2)]XSTORM_ASSERT_INDEX 0x0 = 0x 0x25e42a7e 0x00462a38 0x00010052
+ May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_mc_assert:750(enP24p1s0f2)]Chip Revision: everest3, FW Version: 7_13_1
+ May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_attn_int_deasserted3:4329(enP24p1s0f2)]driver assert
+ May 10 17:16:32 tuk6r1phn2 kernel: bnx2x: [bnx2x_panic_dump:923(enP24p1s0f2)]begin crash dump -
  ... (dump of registers follows) ...

  Subsequent debugging reveals that the packets causing the issue come
  through the ibmveth interface - from the AIX LPAR. The veth protocol is
  'special' - communication between LPARs on the same chassis can use very
  large (64k) frames to reduce overhead. Normal networks cannot handle
  such large packets, so traditionally, the VIOS partition would signal to
  the AIX partitions that it was 'special', and AIX would send regular,
  ethernet-sized packets to VIOS, which VIOS would then send out.

  This signalling between VIOS and AIX is done in a way that is not
  standards-compliant, and so was never made part of Linux. Instead, the
  Linux driver has always understood large frames and passed them up the
  network stack.

  In some cases (e.g. with TCP), multiple TCP segments are coalesced into
  one large packet. In Linux, this goes through the generic receive
  offload code, using a similar mechanism to GSO. These segments can be
  very large which presents as a very large MSS (maximum segment size) or
  gso_size.

  Normally, the large packet is simply passed to whatever network
  application on Linux is going to consume it, and everything is OK.

  However, in this case, the packets go through Open vSwitch, and are then
  passed to the bnx2x driver. The bnx2x driver/hardware supports TSO and
  GSO, but with a restriction: the maximum segment size is limited to
- around 9700 bytes. Normally this is more than adequate as jumbo frames
- are limited to 9000 bytes. However, if a large packet with large (>9700
- byte) TCP segments arrives through ibmveth, and is passed to bnx2x, the
- hardware will panic.
+ around 9700 bytes. Normally this is more than adequate. However, if a
+ large packet with very large (>9700 byte) TCP segments arrives through
+ ibmveth, and is passed to bnx2x, the hardware will panic.

- Turning off TSO prevents the crash as the kernel resegments the data and
- assembles the packets in software. This has a performance cost.
+ Impact
+ --

- Clearly at the very least, bnx2x should not crash in this case.
+ bnx2x card panics, requiring power cycle to restore functionality.

- One patch to do this was sent upstream:
- https://www.spinics.net/lists/netdev/msg452932.html
+ The workaround is turning off TSO, which prevents the crash as the
+ kernel resegments *all* packets in software, not just ones that are too
+ big. This has a performance cost.
+
+
+ Fix
+ ---
+
+ Test packet size in bnx2x feature check path.
+
+ Regression Potential
+
+
+ Limited to bnx2x card driver.
+
+ The most likely failure case is a false-positive on the size check,
+ which would lead to a performance regression only.
[Bug 1715519] Re: bnx2x_attn_int_deasserted3:4323 MC assert!
A set of 2 patches to fix this was accepted upstream:
https://github.com/torvalds/linux/commit/2b16f048729bf35e6c28a40cbfad07239f9dcd90
https://github.com/torvalds/linux/commit/8914a595110a6eca69a5e275b323f5d09e18f4f9

I will send an SRU shortly.