[Kernel-packages] [Bug 1952185] Re: Embedded ARM64 crash trying to zero-fill an 8GB ramdisk
** Changed in: linux (Ubuntu) Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1952185 Title: Embedded ARM64 crash trying to zero-fill an 8GB ramdisk Status in linux package in Ubuntu: Invalid Bug description: ---Problem Description--- Kernel crash when setting up ramdisk on embedded ARM Contact Information = Chris Ward t...@uk.ibm.com Mohit Kapur moh...@us.ibm.com ---Additional Hardware Info--- Embedded ARM with FPGA ---uname output--- Linux cuttlefisharm1 5.4.0-xilinx-v2020.2 #1 SMP Thu Nov 18 18:44:45 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux Machine Type = IBM Research internal processor based on xilinx ---System Hang--- Unresponsive. Power cycle reclaims ---Debugger--- A debugger is not configured ---Steps to Reproduce--- Boot the system. Try dd if=/dev/zero of=/dev/ram0 bs=... count=... A small ramdisk works and gives the expected error when trying to write 8G of data into an 8M ramdisk. tjcw@cuttlefisharm1:~$ sudo dd if=/dev/zero of=/dev/ram0 bs=4096 count=1048576 [sudo] password for tjcw: dd: error writing '/dev/ram0': No space left on device 2049+0 records in 2048+0 records out 8388608 bytes (8.4 MB, 8.0 MiB) copied, 0.0531788 s, 158 MB/s tjcw@cuttlefisharm1:~$ A 2GB ramdisk works, a 4GB ramdisk causes a crash To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1952185/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
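As a quick sanity check on the sizes in the dd transcript above, the following minimal sketch multiplies out the arguments and the record count that dd reported. The constant and function names are illustrative only; the numbers are copied from the quoted command and its output.

```python
# Values copied from the dd invocation and output quoted in the report.
BLOCK_SIZE = 4096            # bs=4096
REQUESTED_BLOCKS = 1048576   # count=1048576
FULL_BLOCKS_WRITTEN = 2048   # "2048+0 records out"


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to MiB (1 MiB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 * 1024)


requested_bytes = BLOCK_SIZE * REQUESTED_BLOCKS
written_bytes = BLOCK_SIZE * FULL_BLOCKS_WRITTEN

print(f"requested: {requested_bytes} bytes ({to_mib(requested_bytes):.0f} MiB)")
print(f"written:   {written_bytes} bytes ({to_mib(written_bytes):.0f} MiB)")
```

The written total matches the 8388608 bytes dd reported for the 8M ramdisk, while the command as quoted asks for 4096 MiB (4 GiB) in total.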
[Kernel-packages] [Bug 1781038] Re: KVM guest hash page table failed to allocate contiguous memory (CMA)
Hello Canonical, I've subscribed Daniel to this bug for his help getting additional information on the situation he was working. I expect he will add more folks from his end if necessary. Lastly, if you feel the bug should be marked as private for this discussion, please feel free to do that. Thanks. -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1781038 Title: KVM guest hash page table failed to allocate contiguous memory (CMA) Status in linux package in Ubuntu: New Bug description: Per an email forwarded within IBM, we wish to use this Launchpad bug to work on the technical discussion with the Canonical development folks and the IBM KVM and kernel team surrounding the analysis made by Daniel Axtens of Canonical for the customer issue raised in Case #00177825. The only statement at the moment by the KVM team was that there were various issues associated with CMA fragmentation causing issues with KVM guests. However, as mentioned, this bug is to allow the dialog amongst all the developers to see what can be done to help alleviate the situation or understand the root cause further. Please also note that we should not be attaching customer data to this bug. If that is necessary then we expect Canonical to help provide a controlled environment for reviewing that data so we avoid any privacy issues (e.g. for GDPR compliance). Here is the email from Daniel: I have looked at the sosreport you uploaded. Here is my analysis so far. Virtualisation on powerpc has some special requirements. To start a guest on a powerpc host, you need to allocate a contiguous area of memory to hold the guest's hash page table (HPT, or HTAB, depending on which document you look at). The HPT is required to track and manage guest memory. Your error reports show qemu asking the kernel to allocate an HTAB, and the kernel reporting that it had insufficient memory to do so. 
The required memory for the HPT scales with the guest memory size - it should be about 1/128th of guest memory, so for a 16GB guest, that's 128MB. However, the HPT has to be allocated as a single contiguous memory region. (This is in contrast to regular guest memory, which is not required to be contiguous from the host point of view.)

The kernel keeps a special contiguous memory area (CMA) for these purposes, and keeps track of the total amounts in use and still available. These are shown in /proc/meminfo. From the system that ran the sosreport, we see:

CmaTotal: 26853376 kB
CmaFree: 4024448 kB

So there is a total of about 25GB of CMA, of which about 3.8GB remain. This is obviously more than 128MB:

- It's very possible that between the error and the sosreport, more contiguous memory became available. This would match the intermittent nature of the issue.

- It also might be that the failure was due to fragmentation of memory in the CMA pool. That is, there might be more than 128MB, but it might all be in chunks that are smaller than 128MB, or which don't have the required alignment for an HPT. Given that the system's uptime was 112 days when the sosreport was generated, it would be unsurprising if fragmentation had occurred! (Relatedly - you're running 4.4.0-109, which does not have the Spectre and Meltdown fixes.)

This issue has come up before - both in a public Canonical-IBM synchronised bug report[1], and with Red Hat[2]. It appears that there is some work within IBM to address this, but it seems to have stalled. I will get in touch with the IBM powerpc kernel team on their public mailing list and ask about the status. I will keep you updated.

In the meantime, I have a potential solution/workaround. By default, 5% of memory is reserved for CMA (kernel source: arch/powerpc/kvm/book3s_hv_builtin.c, kvm_cma_resv_ratio). You can increase this with a boot parameter, so for example to reserve 10%, you could boot with kvm_cma_resv_ratio=10. This can be set in petitboot.
This should significantly reduce the incidence of this issue - perhaps eliminating it entirely - at the cost of locking away more of the system's memory. You would need to experiment to determine the optimal value. Perhaps given that you are seeing the problem only intermittently, a ratio of 7% would be sufficient - that would give you ~35GB of CMA. Please let me know if testing this setting would be an option for you. Please also let me know if you require further information on setting boot parameters with Petitboot. Regards, Daniel [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1632045 [2] https://bugzilla.redhat.com/show_bug.cgi?id=1304300 Before we go any further, let's get the basic info here. Apparently there was a sosreport somewhere else, and a link would be good, but, here's what we need here -- at least -- to
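The figures in Daniel's email can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the helper names are hypothetical, the meminfo sample is the two lines quoted in the email, and the 7% projection assumes the pool was reserved at the default 5% ratio, as the email implies.

```python
# The two /proc/meminfo lines quoted in the email.
MEMINFO_SAMPLE = """\
CmaTotal: 26853376 kB
CmaFree: 4024448 kB
"""


def parse_meminfo_kb(text):
    """Return {field name: value in kB} for 'Name: value kB' lines."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[name.strip()] = int(fields[0])
    return values


def kb_to_gib(kb):
    return kb / (1024 * 1024)


def hpt_size_mib(guest_gib):
    """HPT size using the ~1/128th-of-guest-memory rule from the email."""
    return guest_gib * 1024 / 128


cma = parse_meminfo_kb(MEMINFO_SAMPLE)
cma_total_gib = kb_to_gib(cma["CmaTotal"])
cma_free_gib = kb_to_gib(cma["CmaFree"])

# Assumption: the current pool corresponds to kvm_cma_resv_ratio=5,
# so a 7% reservation scales the total by 7/5.
cma_at_7_percent_gib = cma_total_gib * 7 / 5

print(f"HPT for a 16GB guest: {hpt_size_mib(16):.0f} MB")
print(f"CMA total: {cma_total_gib:.1f} GB, free: {cma_free_gib:.1f} GB")
print(f"CMA at 7%: {cma_at_7_percent_gib:.1f} GB")
```

Note that CmaFree is only a total. It says nothing about whether a single contiguous, suitably aligned 128MB region is available, which is the fragmentation concern raised in the email.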
[Kernel-packages] [Bug 1588529] Re: Test bug 2. please ignore but do not close.
** Changed in: linux (Ubuntu)
       Status: New => Invalid

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1588529

Title:
  Test bug 2. please ignore but do not close.

Status in linux package in Ubuntu: Invalid

Bug description:
== Comment: #0 - Gary M. Gaydos - 2016-06-02 16:17:56 ==
---Problem Description---
This is not a real bug. This bug is opened to investigate a potential bugzilla issue.

---uname output---
blah

Machine Type = blah

---Debugger---
A debugger is not configured

Contact Information = gmgay...@us.ibm.com

Stack trace output: no
Oops output: no
System Dump Info: The system is not configured to capture a system dump.

*Additional Instructions for gmgay...@us.ibm.com:
-Attach sysctl -a output to the bug.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1588529/+subscriptions
[Kernel-packages] [Bug 1557751] Re: makedumpfile generates kernel version error
Hello,

Since we see this in xenial as well, will there be a track opened for delivery of a fix there as well? Thanks.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to makedumpfile in Ubuntu.
https://bugs.launchpad.net/bugs/1557751

Title:
  makedumpfile generates kernel version error

Status in makedumpfile package in Ubuntu: Confirmed
Status in makedumpfile source package in Trusty: Confirmed

Bug description:
== Comment: #7 - Vaishnavi Bhat - 2016-03-15 06:16:53 ==
The following warning is generated by makedumpfile:

* running makedumpfile --dump-dmesg /proc/vmcore /mnt/201602030129/dmesg.201602030129
The kernel version is not supported.
The created dumpfile may be incomplete.

The makedumpfile version on your machine is

# makedumpfile -v
makedumpfile: version 1.5.5 (released on 18 Dec 2013)

I ran sudo apt-get source makedumpfile and checked the sources. In makedumpfile.h, LATEST_VERSION is set to

#define OLDEST_VERSION KERNEL_VERSION(2, 6, 15)/* linux-2.6.15 */
#define LATEST_VERSION KERNEL_VERSION(4, 1, 0)/* linux-4.1.0 */

whereas the kernel version on the machine is

# uname -r
4.2.0-27-generic

Hence we see this mismatch and the message "The kernel version is not supported."

We need Canonical to change the value of LATEST_VERSION so that we do not see this message.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1557751/+subscriptions
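The mismatch described above can be illustrated with a small range check. This is a sketch under stated assumptions, not makedumpfile's actual code: the report does not show the KERNEL_VERSION macro or the comparison logic, so the packing below follows the conventional (major << 16) + (minor << 8) + patch encoding, and the names kernel_version, parse_release and is_supported are hypothetical.

```python
def kernel_version(major, minor, patch):
    # Assumption: conventional KERNEL_VERSION-style packing; the macro
    # body is not shown in the bug report.
    return (major << 16) + (minor << 8) + patch


# Bounds quoted from makedumpfile.h in the report.
OLDEST_VERSION = kernel_version(2, 6, 15)
LATEST_VERSION = kernel_version(4, 1, 0)


def parse_release(release):
    """Turn a 'uname -r' string such as '4.2.0-27-generic' into (4, 2, 0)."""
    numeric = release.split("-", 1)[0]
    major, minor, patch = (int(part) for part in numeric.split("."))
    return major, minor, patch


def is_supported(release):
    """True if the release falls inside [OLDEST_VERSION, LATEST_VERSION]."""
    version = kernel_version(*parse_release(release))
    return OLDEST_VERSION <= version <= LATEST_VERSION


print(is_supported("4.2.0-27-generic"))  # False: newer than 4.1.0
print(is_supported("4.1.0"))             # True: equal to the upper bound
```

Under these assumptions, any 4.2 kernel falls above LATEST_VERSION, which is consistent with the warning the reporter saw on 4.2.0-27-generic.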
[Kernel-packages] [Bug 1587295] Re: drmgr failed to remove i/o slot
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1587295 Title: drmgr failed to remove i/o slot Status in linux package in Ubuntu: New Bug description: == Comment: #0 - Minh Nguyen - 2015-12-04 10:01:38 == ---Problem Description--- While performing drmgr to remove an IO slot, we encounter a failure: >pvmctl IOSlot detach --drc-names U78C9.001.WZS005Z-P1-C3 -p id=1 [PVME0105FF05-0187] Command /usr/sbin/pvmdrmgr drmgr -c phb -s 'PHB 41' -r returned 255. Additional messages: /usr/sbin/pvmdrmgr drmgr -c phb -s 'PHB 41' -r Validating PHB DLPAR capability...yes. Isolation failed for 2029 with -9001 Valid outstanding translations exist. /var/log/syslog showed: Dec 3 15:07:22 yc00sp-neo kernel: [ 395.877784] rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1 Dec 3 15:07:22 yc00sp-neo kernel: [ 395.878122] rpaphp: Slot [U78C9.001.WZS005Z-P1-C3] registered Dec 3 15:07:23 yc00sp-neo kernel: [ 396.625406] iommu: Removing device 0001:01:00.0 from group 1 Dec 3 15:07:24 yc00sp-neo kernel: [ 397.293386] iommu: Removing device 0001:01:00.1 from group 1 Dec 3 15:07:34 yc00sp-neo kernel: [ 407.298765] pci_bus 0001:01: busn_res: [bus 01-ff] is released Dec 3 15:07:34 yc00sp-neo kernel: [ 407.298844] rpadlpar_io: slot PHB 41 removed /var/log/drmgr showed: retrieving hotplug nodes Could not find DRC property group in path: /proc/device-tree/pci@800201b. 
hp adapter status for U78C9.001.WZS005Z-P1-C3 is 1 setting hp adapter status to UNCONFIG adapter for U78C9.001.WZS005Z-P1-C3 hp adapter status for U78C9.001.WZS005Z-P1-C3 is 2 Removing device-tree node /proc/device-tree/pci@8002029/ethernet@0,1 Removing device-tree node /proc/device-tree/pci@8002029/ethernet@0 HPDEV: /sys/bus/pci/devices/:50:00.0 /pci@800201b/usb@0 performing kernel op for PHB 41, file is /sys/bus/pci/slots/control/remove_slot Removing device-tree node /proc/device-tree/pci@8002029 Removing device-tree node /proc/device-tree/interrupt-controller@8002529 Releasing drc index 0x2029 get-sensor for 2029: 0, 1 Setting isolation state to 'isolate' Isolation failed for 2029 with -9001 Valid outstanding translations exist. The slot has a 10 Gigabit Ethernet-SFP+ SR PCI-E adapter Contact Information = Minh Nguyen (mi...@us.ibm.com) Jeremy Arnold (arnol...@us.ibm.com) ---uname output--- Linux yc00sp-neo 4.2.0-16-generic #19-Ubuntu SMP Thu Oct 8 14:49:47 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux Machine Type = 8286-42A ---Debugger--- A debugger is not configured ---Steps to Reproduce--- Run the command: pvmctl IOSlot detach --drc-names U78C9.001.WZS005Z-P1-C3 -p id=1 Userspace tool common name: gdb The userspace tool has the following bit modes: 64bit Userspace rpm: powerpc-ibm-utils Userspace tool obtained from project website: na *Additional Instructions for Minh Nguyen (mi...@us.ibm.com) Jeremy Arnold (arnol...@us.ibm.com) : -Post a private note with access information to the machine that the bug is occurring on. -Attach ltrace and strace of userspace application. == Comment: #7 - Carol L. 
Soto - 2016-02-08 16:15:57 == I sniff in the /var/log/kern.log.4 I put in /tmp/kern.log.4 I see this Dec 3 15:00:51 yc00sp-neo kernel: [4.762738] ibmvmc: sethmcid: Set HMC ID: "neo 1" Dec 3 15:00:51 yc00sp-neo kernel: [4.817873] DCCP: Activated CCID 2 (TCP-like) Dec 3 15:07:22 yc00sp-neo kernel: [ 395.877784] rpaphp: RPA HOT Plug PCI Controller Driver version: 0.1 Dec 3 15:07:22 yc00sp-neo kernel: [ 395.878122] rpaphp: Slot [U78C9.001.WZS005Z-P1-C3] registered Dec 3 15:07:23 yc00sp-neo kernel: [ 396.625406] iommu: Removing device 0001:01:00.0 from group 1 Dec 3 15:07:24 yc00sp-neo kernel: [ 397.293386] iommu: Removing device 0001:01:00.1 from group 1 Dec 3 15:07:34 yc00sp-neo kernel: [ 407.298765] pci_bus 0001:01: busn_res: [bus 01-ff] is released Dec 3 15:07:34 yc00sp-neo kernel: [ 407.298844] rpadlpar_io: slot PHB 41 removed ~ but I do not see Mellanox traces I only see be2net traces. That is another device. == Comment: #15 - Douglas Miller - 2016-02-18 13:49:26 == Looking around the system, I notice that 'lspci' shows no (ethernet) device. I looked at the kernel and the module 'be2net' was still loaded, but had zero dependents. I ran "rmmod be2net" and the module was removed without error. I then ran the pvmctl remove command and it appeared to succeed: root@cs-tul6-neo:~# pvmctl IOSlot detach --drc-names U78CB.001.WZS00D0-P1-C6 -p id=1 [PVME0105FF05-0187] Command /usr/sbin/pvmdrmgr drmgr -c phb -s 'PHB 24' -r returned 3. Additional messages: /usr/sbin/pvmdrmgr drmgr -c phb -s 'PHB 24' -r
[Kernel-packages] [Bug 1578445] Re: Bridge test for package application, please ignore
** Changed in: linux (Ubuntu)
     Assignee: Taco Screen team (taco-screen-team) => Luciano Chavez (lnx1138)

** Changed in: linux (Ubuntu)
       Status: New => Invalid

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1578445

Title:
  Bridge test for package application, please ignore

Status in linux package in Ubuntu: Invalid

Bug description:
== Comment: #0 - bugproxy bugproxy - 2016-05-04 20:01:30 ==
---Problem Description---
Bridge test

Contact Information = test

---uname output---
test

Machine Type = test

---Debugger---
A debugger is not configured

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1578445/+subscriptions
[Kernel-packages] [Bug 1574697] Re: WARNING: at /build/linux-aWXT0l/linux-4.4.0/drivers/pci/pci.c:1595 [travis3EN]
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1574697 Title: WARNING: at /build/linux-aWXT0l/linux-4.4.0/drivers/pci/pci.c:1595 [travis3EN] Status in linux package in Ubuntu: New Bug description: ---Problem Description--- WARNING: at /build/linux-aWXT0l/linux-4.4.0/drivers/pci/pci.c:1595 [travis3EN] ---uname output--- Linux ltciofvtr-s822l2-lp3 4.4.0-4-generic #19-Ubuntu SMP Fri Feb 5 17:36:21 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux Machine Type = s822l ---Steps to Reproduce--- Triggering EEH causes the warning messages in syslog. Note: it's just the warning messages; the card recovers after EEH. 1. from peer: run some load linux-xqxs:~ # ping -f 22.22.22.22 2. from pKVM host run the EEH for the travis3EN card [root@ltciofvtr-s822l2-lp1 ~]# echo 0x8000 > /sys/kernel/debug/powerpc/PCI0003/err_injct_inboundA; sleep 1; echo 0x0 > /sys/kernel/debug/powerpc/PCI0003/err_injct_inboundA 3. 
on client's sysfs you can see the warning messages "WARNING: at /build/linux-aWXT0l/linux-4.4.0/drivers/pci/pci.c:1595" [ 940.382507] EEH: Frozen PHB#0-PE#1 detected [ 940.382594] EEH: PE location: N/A, PHB location: N/A [ 940.382828] mlx4_core :00:04.0: mlx4_pci_err_detected was called [ 940.382891] mlx4_core :00:04.0: device is going to be reset [ 940.382953] mlx4_core :00:04.0: device was reset successfully [ 940.383014] mlx4_en :00:04.0: Internal error detected, restarting device Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382507] EEH: Frozen PHB#0-PE#1 detected Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382594] EEH: PE location: N/A, PHB location: N/A Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382647] CPU: 1 PID: 176 Comm: kworker/u16:2 Not tainted 4.4.0-4-generic #19-Ubuntu Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382671] Workqueue: mlx4_en mlx4_en_do_get_stats [mlx4_en] Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382673] Call Trace: Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382714] [c487b7c0] [c0ad8aa0] dump_stack+0x90/0xbc (unreliable) Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382725] [c487b7f0] [c00378f4] eeh_dev_check_failure+0x534/0x580 Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382728] [c487b890] [c00379c4] eeh_check_failure+0x84/0xd0 Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382743] [c487b8d0] [d2112fc0] cmd_pending+0xb0/0xe0 [mlx4_core] Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382749] [c487b900] [d21130b0] mlx4_cmd_post+0xc0/0x250 [mlx4_core] Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382756] [c487b9b0] [d211592c] __mlx4_cmd+0x1dc/0x9b0 [mlx4_core] Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382766] [c487ba70] [d24eb030] mlx4_en_DUMP_ETH_STATS+0xc0/0x830 [mlx4_en] Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382770] [c487bb70] [d24ef150] mlx4_en_do_get_stats+0x160/0x340 [mlx4_en] Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382780] 
[c487bc50] [c00dc920] process_one_work+0x1e0/0x560 Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382783] [c487bce0] [c00dce34] worker_thread+0x194/0x680 Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382785] [c487bd80] [c00e58d0] kthread+0x110/0x130 Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382788] [c487be30] [c0009538] ret_from_kernel_thread+0x5c/0xa4 Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382814] mlx4_core :00:04.0: Could not post command 0x49: ret=-5, in_param=0x0, in_mod=0x1, op_mod=0x0 Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382821] EEH: Detected PCI bus error on PHB#0-PE#1 Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382823] EEH: This PCI device has failed 1 times in the last hour Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382824] EEH: Notify device drivers to shutdown Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382828] mlx4_core :00:04.0: mlx4_pci_err_detected was called Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382891] mlx4_core :00:04.0: device is going to be reset Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.382953] mlx4_core :00:04.0: device was reset successfully Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.383014] mlx4_en :00:04.0: Internal error detected, restarting device Feb 19 02:18:23 ltciofvtr-s822l2-lp3 kernel: [ 940.383320] mlx4_en: enp0s4: Close port called Feb 19 02:18:23 ltciofvtr-s822l2-lp3 systemd[1]: Starting Cleanup of Temporary Directories... Feb 19 02:18:23 ltciofvtr-s822l2-lp3 systemd-tmpfiles[2473
[Kernel-packages] [Bug 1572291] Re: s390/pci: add extra padding to function measurement block
** Package changed: ubuntu => linux (Ubuntu)

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1572291

Title:
  s390/pci: add extra padding to function measurement block

Status in linux package in Ubuntu: New

Bug description:
Please backport upstream commit:

commit 9d89d9e61d361f3adb75e1aebe4bb367faf16cfa
Author: Sebastian Ott
Date: Thu Mar 31 11:48:31 2016 +0200

    s390/pci: add extra padding to function measurement block

    Newer machines might use a different (larger) format for function measurement blocks. To ensure that we comply with the alignment requirement on these machines and prevent memory corruption (when firmware writes more data than we expect) add 16 padding bytes at the end of the fmb.

    Cc: sta...@vger.kernel.org # v4.1+
    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

 arch/s390/include/asm/pci.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1572291/+subscriptions
[Kernel-packages] [Bug 1544438] Re: ISST:LTE: LPAR roselp1 kexec_core from reboot command
*** This bug is a duplicate of bug 1546260 ***
    https://bugs.launchpad.net/bugs/1546260

Marking as a dup of bug 1546260 as the kexec fix also took care of this issue.

** Changed in: linux (Ubuntu)
       Status: New => Fix Released

** Package changed: linux (Ubuntu) => kexec-tools (Ubuntu)

** This bug has been marked a duplicate of bug 1546260
   kexec/kdump not working in ubuntu 16.04

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1544438

Title:
  ISST:LTE: LPAR roselp1 kexec_core from reboot command

Status in kexec-tools package in Ubuntu: Fix Released

Bug description:
== Comment: #0 2016-02-08 18:18:35 ==
---Problem Description---
reboot from command line. kernel cores on booting.

---uname output---
4.4.0-2-generic

Machine Type = 8286-42A

---System Hang---
system hung. need to reboot via hmc

---Debugger---
A debugger is not configured

---Steps to Reproduce---
install machine with ubuntu setup general post install scripts reboot command line fails on boot up

Stack trace output: no
Oops output: no
System Dump Info: The system was configured to capture a dump, however a dump was not produced.

== Comment: #1 -2016-02-08 18:19:27 ==
root@roselp1:/kte/tools# reboot
[ 878.305097] kdump-tools[63102]: Stopping kdump-tools: * unloaded kdump kernel
Ubuntu 16.04. . . . [the "Ubuntu 16.04. . . ." console banner repeats many more times]
[ 939.516426] kexec_core: Starting new kernel

== Comment: #3 - 2016-02-08 18:28:40 ==
> I am going to try to dump via hmc to capture kernel logs. hmc dump not working. I am going to change the kernel command line from crash kernel to xmon and reproduce. I can't even get it to dump to xmon.

== Comment: #6 - 2016-02-09 05:25:54 ==
(In reply to comment #5)
> Please share kernel cmdline params.

root@roselp1:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinux-4.4.0-2-generic root=UUID=b0dee2d0-a2c9-43e2-a43b-70fec2cf6180 ro splash quiet xmon=on

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/kexec-tools/+bug/1544438/+subscriptions
[Kernel-packages] [Bug 1483170] Re: NVidia: Ubuntu: OS crashed into xmon Prompt; scsi_report_bus_reset
** Changed in: linux (Ubuntu) Status: New => Fix Released -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1483170 Title: NVidia: Ubuntu: OS crashed into xmon Prompt; scsi_report_bus_reset Status in linux package in Ubuntu: Fix Released Bug description: Problem Description: This system is running non-virtualized ubuntu with one nvidia k80 GPU. During a hardbootme run the OS crashed. Here are the details from xmon: 0:mon> e cpu 0x0: Vector: 300 (Data Access) at [c038f3b0] pc: c069ba80: scsi_report_bus_reset+0x60/0xb0 lr: d0001cae524c: ipr_erp_start+0x3bc/0x644 [ipr] sp: c038f630 msr: 90009033 dar: 100178 dsisr: 4000 current = 0xc1359b10 paca= 0xcfb8 softe: 0irq_happened: 0x01 pid = 0, comm = swapper/0 0:mon> r R00 = d0001cae524c R16 = 0020 R01 = c038f630 R17 = R02 = c13d8028 R18 = fffefa58 R03 = c00fdcb0 R19 = c0e4a000 R04 = R20 = c1412180 R05 = 0002 R21 = 0001 R06 = 0067 R22 = 0002 R07 = 0629 R23 = 01f0 R08 = 0001 R24 = c0001010ea00 R09 = 001000f0 R25 = c00fdcb00730 R10 = 00ff R26 = 0001 R11 = d0001cae6518 R27 = 0629 R12 = c069ba20 R28 = c00fdce40cf0 R13 = cfb8 R29 = c00fa4c50300 R14 = c135a120 R30 = R15 = R31 = c00fdcb0 pc = c069ba80 scsi_report_bus_reset+0x60/0xb0 cfar= c0009368 slb_miss_realmode+0x50/0x78 lr = d0001cae524c ipr_erp_start+0x3bc/0x644 [ipr] msr = 90009033 cr = 2804 ctr = c069ba20 xer = trap = 300 dar = 00100178 dsisr = 4000 0:mon> t [c038f660] d0001cae524c ipr_erp_start+0x3bc/0x644 [ipr] [c038f6c0] d0001caddb20 ipr_scsi_done+0x100/0x120 [ipr] [c038f700] d0001cadc5bc ipr_isr_mhrrq+0x10c/0x250 [ipr] [c038f760] c012ff90 handle_irq_event_percpu+0x90/0x2b0 [c038f820] c0130218 handle_irq_event+0x68/0xd0 [c038f850] c0135380 handle_fasteoi_irq+0xe0/0x250 [c038f880] c012f188 generic_handle_irq+0x58/0x90 [c038f8b0] c00119d0 __do_irq+0x80/0x190 [c038f8e0] c0011bec do_IRQ+0x10c/0x120 [c038f940] c0002794 hardware_interrupt_common+0x114/0x180 --- Exception: 
501 (Hardware Interrupt) at c06a45b4 scsi_io_completion+0x1e4/0x800 [c038fd00] c069662c scsi_finish_command+0x15c/0x1b0 [c038fd80] c06a41d8 scsi_softirq_done+0x198/0x200 [c038fe00] c04cbbd4 blk_done_softirq+0xb4/0xe0 [c038fe40] c00b5244 __do_softirq+0x174/0x3e0 [c038ff30] c00b5888 irq_exit+0xf8/0x140 [c038ff60] c00119dc __do_irq+0x8c/0x190 [c038ff90] c0025320 call_do_irq+0x14/0x24 [c13d7840] c0011b80 do_IRQ+0xa0/0x120 [c13d78a0] c0002794 hardware_interrupt_common+0x114/0x180 --- Exception: 501 (Hardware Interrupt) at c00110d4 arch_local_irq_restore+0x74/0x90 [c13d7b90] c00162f8 __switch_to+0x208/0x350 (unreliable) [c13d7bb0] c00ef70c finish_task_switch+0x7c/0x1e0 [c13d7bf0] c09d6c40 __schedule+0x370/0x910 [c13d7e10] c09d7880 schedule_preempt_disabled+0x20/0x30 [c13d7e30] c01121e4 cpu_startup_entry+0x1c4/0x500 [c13d7ee0] c000ccd4 rest_init+0xa4/0xc0 [c13d7f00] c0d53e4c start_kernel+0x520/0x53c [c13d7f90] c0009b6c start_here_common+0x20/0xa8 0:mon>

== Comment: #1 - Brian J. King - 2015-05-28 17:08:13 ==
Make sure we have the host lock held when calling scsi_report_bus_reset. Fixes a crash seen as the __devices list in the scsi host was changing as we were iterating through it.

== Comment: #8 - Wen Xiong - 2015-08-06 11:09:25 ==
Release of bug changed to Ubuntu14.04. He has tested the patch and reported "yes the patch worked". We upstreamed the patch last month. Here is the commit link:
https://git.kernel.org/cgit/linux/kernel/git/jejb/scsi.git/commit/drivers/scsi/ipr.c?h=misc&id=36b8e180e1e929e00b351c3b72aab3147fc14116

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1483170/+subscriptions
[Kernel-packages] [Bug 1495863] Re: ISST-KVM: R3-0: Firestone: PowerNV : Call traces w.r.t filesystem while running stress test
*** This bug is a duplicate of bug 1469829 *** https://bugs.launchpad.net/bugs/1469829 ** Changed in: linux (Ubuntu) Status: New => Invalid ** This bug has been marked a duplicate of bug 1469829 ppc64el should use 'deadline' as default io scheduler -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1495863 Title: ISST-KVM: R3-0: Firestone: PowerNV : Call traces w.r.t filesystem while running stress test Status in linux package in Ubuntu: Invalid Bug description: == Comment: #0 - Krishnaja Balachandran - 2015-09-02 05:01:44 == ---Problem Description--- While running stress tests( IO BASE TCP NFS) on Firestone system "amp" I see the following call traces in "dmesg". Also, few commands are hanging in amp. Contact Information = kriba...@in.ibm.com ---uname output--- Linux amp 3.19.0-26-generic #28~14.04.1-Ubuntu SMP Wed Aug 12 14:10:52 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux Machine Type = PowerNV 8335-GTA ---Debugger--- "xmon" was configured, however the system did not enter into the debugger Stack trace output: [69633.095738] INFO: rcu_sched detected stalls on CPUs/tasks: { 11 59} (detected by 171, t=10356287 jiffies, g=2131264, c=2131263, q=51507717) [69633.096031] Task dump for CPU 11: [69633.096058] kswapd8 R running task0 989 2 0x0804 [69633.096110] Call Trace: [69633.096138] [c03c9f93f070] [c03c9f93f0b0] 0xc03c9f93f0b0 (unreliable) [69633.096347] [c03c9f93f240] [c03c9f93f370] 0xc03c9f93f370 [69633.096399] Task dump for CPU 59: [69633.096426] kworker/u386:14 D 0 77488 2 0x0804 [69633.096484] Workqueue: writeback bdi_writeback_workfn (flush-8:32) [69633.096536] Call Trace: [69633.096556] [c03c9f936f00] [c01ff4ee76f0] 0xc01ff4ee76f0 (unreliable) [69633.096618] [c03c9f9370d0] [c03c9f937130] 0xc03c9f937130 [69633.096671] [c03c9f937130] [c0a11370] __schedule+0x370/0x8d0 [69633.096726] [c03c9f937350] [c0a153e8] rwsem_down_write_failed+0x288/0x400 [69633.096788] 
[c03c9f9373e0] [c0a147f8] down_write+0x88/0x90 [69633.096864] [c03c9f937410] [d00029bc5254] xfs_ilock+0xf4/0x160 [xfs] [69633.096933] [c03c9f937450] [d00029bc1bc8] xfs_iomap_write_allocate+0x238/0x3f0 [xfs] [69633.097011] [c03c9f937580] [d00029ba60bc] xfs_map_blocks+0x1cc/0x2f0 [xfs] [69633.097087] [c03c9f9375f0] [d00029ba7b24] xfs_vm_writepage+0x194/0x630 [xfs] [69633.097148] [c03c9f9376d0] [c021c66c] __writepage+0x4c/0xb0 [69633.097373] [c03c9f937710] [c021cda4] write_cache_pages+0x1e4/0x4c0 [69633.097437] [c03c9f937850] [c021d0e4] generic_writepages+0x64/0x90 [69633.097513] [c03c9f9378b0] [d00029ba5e70] xfs_vm_writepages+0x70/0xa0 [xfs] [69633.097575] [c03c9f9378f0] [c021e720] do_writepages+0x60/0xc0 [69633.097627] [c03c9f937920] [c02f1db8] __writeback_single_inode+0x68/0x370 [69633.097687] [c03c9f937970] [c02f2478] writeback_sb_inodes+0x2c8/0x4f0 [69633.097747] [c03c9f937a40] [c02f2784] __writeback_inodes_wb+0xe4/0x150 [69633.097807] [c03c9f937aa0] [c02f354c] wb_writeback+0x30c/0x3e0 [69633.097859] [c03c9f937b40] [c02f405c] bdi_writeback_workfn+0x14c/0x550 [69633.097919] [c03c9f937c60] [c00d28bc] process_one_work+0x19c/0x480 [69633.097980] [c03c9f937cf0] [c00d3160] worker_thread+0x190/0x5b0 [69633.098032] [c03c9f937d80] [c00da494] kthread+0x114/0x140 [69633.098085] [c03c9f937e30] [c000956c] ret_from_kernel_thread+0x5c/0x70 NOTE: System is on a private network. Access the private network via SSH to "banner.isst.aus.stglabs.ibm.com" using your GSA ID and password. (Banner itself is behind a BSO, so must authenticate through that first.) 
Login details : ssh banner.isst.aus.stglabs.ibm.com [debug/don2rry ] Host login:- amp.isst.aus.stglabs.ibm.com [10.33.31.106 ] [root/don2rry] login via GUI: bmc-amp.isst.aus.stglabs.ibm.com [10.33.31.106 ] IPMI Login to host amp console :- - From banner machine run the following : ssh banner.isst.aus.stglabs.ibm.com [debug/don2rry ] ipmitool -I lanplus -H bmc-amp -U ADMIN -P admin sol deactivate ipmitool -I lanplus -H bmc-amp -U ADMIN -P admin sol activate - TESTING INFORMATION - SYSTEM INFORMATION - HOST NAME or NETWORK ADDRESS: amp.isst.aus.stglabs.ibm.com
[Kernel-packages] [Bug 1506327] Re: ISST-KVM: R3-0: Tuleta: PowerKVM : flyg3 : Boot during installation hangs at "Booting Linux via __start()"
Closed as unreproducible on the IBM side for some time. Closing here as well. ** Changed in: linux (Ubuntu) Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1506327 Title: ISST-KVM: R3-0: Tuleta: PowerKVM : flyg3 : Boot during installation hangs at "Booting Linux via __start()" Status in linux package in Ubuntu: Invalid Bug description: == Comment: #0 == ---Problem Description--- The installation of ubuntu1510 is hanging at the below given point: PowerKVM host info: [root@flykvm ~]# uname -a Linux flykvm 3.18.21-352.el7_1.pkvm3_1_0.3400.1.ppc64le #1 SMP Tue Sep 29 13:30:10 CDT 2015 ppc64le ppc64le ppc64le GNU/Linux [root@flykvm ~]# [root@flykvm ~]# [root@flykvm ~]# cat /etc/os-release NAME="IBM_PowerKVM" VERSION="3.1.0" ID=ibm_powerkvm VERSION_ID="3.1.0" BUILD_ID="35-beta" PRETTY_NAME="IBM_PowerKVM 3.1.0" ANSI_COLOR="0;34" CPE_NAME="cpe:/o:ibm:beta:pkvm3_1" [root@flykvm ~]# ---console logs--- [root@flykvm ~]# virsh start --console flyg3 Domain flyg3 started Connected to domain flyg3 Escape character is ^] SLOF ** QEMU Starting Build Date = Sep 21 2015 16:58:03 FW Version = mockbuild@ release 20150921 Press "s" to enter Open Firmware. Populating /vdevice methods Populating /vdevice/v-scsi@2000 SCSI: Looking for devices 8002 CD-ROM : "QEMU QEMU CD-ROM 2.3." 8001 DISK : "QEMU QEMU HARDDISK2.3." 8000 DISK : "QEMU QEMU HARDDISK2.3." Populating /vdevice/vty@30001000 Populating /vdevice/nvram@7100 00 2000 (D) : 106b 003fserial bus [ usb-ohci ] 00 1000 (D) : 1af4 1000virtio [ net ] 00 0800 (D) : 15b3 1011network [ network ] No NVRAM common partition, re-initializing... Scanning USB OHCI: initializing Using default console: /vdevice/vty@30001000 Welcome to Open Firmware Copyright (c) 2004, 2011 IBM Corporation All rights reserved. 
This program and the accompanying materials are made available under the terms of the BSD License available at http://www.opensource.org/licenses/bsd-license.php Trying to load: from: /vdevice/v-scsi@2000/disk@8000 ... E3404: Not a bootable device! Trying to load: from: /vdevice/v-scsi@2000/disk@8002 ... Successfully loaded GNU GRUB version 2.02~beta2-28 ++ |*Install | | Rescue mode | | | | | | | | | | | | | | | | | | | | | ++ Use the ^ and v keys to select which entry is highlighted. Press enter to boot the selected OS, `e' to edit the commands before booting or `c' for a command-line. OF stdout device is: /vdevice/vty@30001000 Preparing to boot Linux version 4.2.0-14-generic (buildd@denneed03) (gcc version 5.2.1 20150930 (Ubuntu 5.2.1-19ubuntu1) ) #16-Ubuntu SMP Fri Oct 2 05:18:10 UTC 2015 (Ubuntu 4.2.0-14.16-generic 4.2.2) Detected machine type: 0101 Max number of cores passed to firmware: 256 (NR_CPUS = 2048) Calling ibm,client-architecture-support... done command line: BOOT_IMAGE=/install/vmlinux tasks=standard pkgsel/language-pack-patterns= pkgsel/install-language-support=false --- quiet memory layout at init: memory_limit : (16 MB aligned) alloc_bottom : 043b alloc_top: 3000 alloc_top_hi : 0001 rmo_top : 3000 ram_top : 000
[Kernel-packages] [Bug 1483189] Re: Machine crashes when we unload the Nvidia dirver module with Ubuntu 15.10
** Changed in: linux (Ubuntu)
Status: New => Invalid

-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1483189

Title: Machine crashes when we unload the Nvidia dirver module with Ubuntu 15.10
Status in linux package in Ubuntu: Invalid

Bug description:

Problem Description ==
The machine crashes when we unload the Nvidia driver module.

---Additional Hardware Info---
root@fr111p1:~# lspci | grep -i NVIDIA
:03:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
:04:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
0002:03:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
0002:04:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)

Machine Type = P8

Steps to Reproduce ===
Install Ubuntu 15.10 on P8 PowerNV 8335-GTA hardware from the netboot images, then follow the steps below to install the latest kernel.

1) cat <<EOF >> /etc/apt/sources.list
deb http://ppa.launchpad.net/canonical-kernel-team/ppa/ubuntu wily main
deb-src http://ppa.launchpad.net/canonical-kernel-team/ppa/ubuntu wily main
EOF
2) apt-get update
3) apt-cache search linux-image-4
linux-image-4.0.0-3-generic - Linux kernel image for version 4.0.0 on PowerPC 64el SMP
linux-image-4.0.0-4-generic - Linux kernel image for version 4.0.0 on PowerPC 64el SMP
4) Choose the latest kernel version for installation:
apt-get install linux-image-4.0.0-4-generic

Then reboot the machine into the latest kernel and install the CUDA packages.

root@fr111p1:~# dpkg -i cuda-repo-ubuntu1410_7.0-28_ppc64el.deb
root@fr111p1:~# apt-get update
root@fr111p1:~# apt-get install cuda

Then try to unload the kernel module manually.
root@fr111p1:~# lsmod | grep nvidia nvidia_uvm 88636 0 nvidia 11342553 1 nvidia_uvm drm 431025 5 ast,ttm,drm_kms_helper,nvidia root@fr111p1:~# rmmod nvidia_uvm root@fr111p1:~# lsmod | grep nvidia nvidia 11342553 0 drm 431025 5 ast,ttm,drm_kms_helper,nvidia root@fr111p1:~# rmmod nvidia root@fr111p1:~# nvidia-smi ---uname output--- Linux fr111p1 4.0.0-4-generic #6-Ubuntu SMP Tue Jun 30 20:50:37 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux Stack trace output: [14134.139423] Call Trace: [14134.139448] [c01e3e1c7890] [c0287ec0] kmem_cache_alloc_trace+0x300/0x330 (unreliable) [14134.139526] [c01e3e1c7900] [c011206c] down+0x7c/0xa0 [14134.139687] [c01e3e1c7940] [d00012a94970] nvidia_open+0x370/0x730 [nvidia] [14134.139837] [c01e3e1c79e0] [d00012aa33ac] nvidia_frontend_open+0x8c/0x100 [nvidia] [14134.139901] [c01e3e1c7a70] [c02bc3f4] chrdev_open+0x114/0x260 [14134.139954] [c01e3e1c7ad0] [c02b18b0] do_dentry_open+0x2d0/0x480 [14134.140007] [c01e3e1c7b30] [c02c76a0] do_last+0x190/0x1010 [14134.140073] [c01e3e1c7c00] [c02caffc] path_openat+0xdc/0x810 [14134.140126] [c01e3e1c7cd0] [c02ccb98] do_filp_open+0x58/0xf0 [14134.140183] [c01e3e1c7db0] [c02b3698] do_sys_open+0x1c8/0x390 [14134.140271] [c01e3e1c7e30] [c0009258] system_call+0x38/0xd0 Oops output: [14134.137140] Unable to handle kernel paging request for data at address 0x [14134.137162] NVRM: loading NVIDIA UNIX ppc64le Kernel Module 346.46 Tue Feb 17 17:18:33 PST 2015 [14134.137361] Faulting instruction address: 0xc0a42154 [14134.137411] Oops: Kernel access of bad area, sig: 11 [#1] [14134.137640] SMP NR_CPUS=2048 NUMA PowerNV [14134.137684] Modules linked in: nvidia(POE) dm_round_robin dm_multipath scsi_dh cxgb3 cxgb4 ib_ipoib ib_ucm ib_uverbs ib_cm ib_umad mlx4_ib ib_sa ib_mad ib_core ib_addr joydev mac_hid hid_generic ipmi_powernv ipmi_msghandler ast powernv_rng ttm at24 drm_kms_helper usbhid uio_pdrv_genirq syscopyarea uio sysfillrect hid sysimgblt i2c_algo_bit drm autofs4 mlx4_en vxlan ip6_udp_tunnel udp_tunnel uas 
usb_storage bnx2x ahci mlx4_core libahci mdio libcrc32c [last unloaded: nvidia] [14134.138214] CPU: 62 PID: 54865 Comm: nvidia-persiste Tainted: P OE 4.0.0-4-generic #6-Ubuntu [14134.138279] task: c01e3e122200 ti: c01e3e1c4000 task.ti: c01e3e1c4000 [14134.138335] NIP: c0a42154 LR: c011206c CTR: c0111ff0 [14134.138389] REGS: c01e3e1c7610 TRAP: 0300 Tainted: P OE (4.0.0-4-generic) [14134.138453] MSR: 90009033 CR: 24002482 XER: 2000 [14134.138594] CFAR: c0008468 DAR: DSISR: 4200 SOFTE: 0 GPR00: c011206c c01e3e1c7890 c1489300 d00012d55ac8 GPR04: 0001 d00012d55b10 c01e0
[Kernel-packages] [Bug 1468605] Re: Kdump boot fails due to Kernel OOPS @tpm_ibmvtpm_probe (PowerVM)
** Changed in: kexec-tools (Ubuntu) Status: New => Fix Released -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to kexec-tools in Ubuntu. https://bugs.launchpad.net/bugs/1468605 Title: Kdump boot fails due to Kernel OOPS @tpm_ibmvtpm_probe (PowerVM) Status in kexec-tools package in Ubuntu: Fix Released Bug description: == Comment: #0 - SACHIN P. SANT - 2015-06-04 06:03:17 == ---Steps to Reproduce--- 1) Using latest daily ISO install 14.04.02 as a Power VM guest. The LPAR has vTPM functionality enabled. 2) Upgrade the kernel to 3.19 level (3.19.0-18-generic) 3) Configure kdump. A fix is required to configure kdump. Refer to defect (https://bugzilla.linux.ibm.com/show_bug.cgi?id=125712) # kdump-config load Modified cmdline:BOOT_IMAGE=/boot/vmlinux-3.19.0-18-generic root=UUID=3ea23bcf-7269-432f-bacc-f82c6cdd774e ro splash quiet vt.handoff=7 irqpoll maxcpus=1 nousb elfcorehdr=155200K segment[0].mem:0x800 memsz:24641536 segment[1].mem:0x978 memsz:65536 segment[2].mem:0x979 memsz:65536 segment[3].mem:0x97a memsz:65536 segment[4].mem:0x97b memsz:20971520 segment[5].mem:0xec7 memsz:262144 * loaded kdump kernel # cat /proc/cmdline BOOT_IMAGE=/boot/vmlinux-3.19.0-18-generic root=UUID=3ea23bcf-7269-432f-bacc-f82c6cdd774e ro splash quiet crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M vt.handoff=7 # 4) Trigger a crash. 
Kdump boot fails with a kernel panic I'm in purgatory -> smp_release_cpus() spinning_secondaries = 0 <- smp_release_cpus() <- setup_system() [0.877197] Unable to handle kernel paging request for data at address 0x0010 [0.877232] Faulting instruction address: 0xc863a044 [0.877241] Oops: Kernel access of bad area, sig: 11 [#1] [0.877246] SMP NR_CPUS=2048 NUMA pSeries [0.877255] Modules linked in: [0.877264] CPU: 8 PID: 1 Comm: swapper/8 Not tainted 3.19.0-18-generic #18~14.04.1-Ubuntu [0.877273] task: c00086bc ti: c00086c0 task.ti: c00086c0 [0.877280] NIP: c863a044 LR: c8639f54 CTR: 003fd734 [0.877288] REGS: c00086c03580 TRAP: 0300 Not tainted (3.19.0-18-generic) [0.877294] MSR: 80019033 CR: 24002022 XER: 2010 [0.877314] CFAR: c8639f60 DAR: 0010 DSISR: 4000 SOFTE: 1 GPR00: 24002022 c00086c03800 c944c760 0100 GPR04: fff00500 0010 GPR08: 0008 0001 GPR12: fff0 ce7f4800 c800bdd8 GPR16: GPR20: GPR24: c00086dece00 c8a86cb0 fff0 GPR28: c000823c9848 c0008608ad00 c000823c9800 [0.877409] NIP [c863a044] tpm_ibmvtpm_probe+0x2e4/0x580 [0.877417] LR [c8639f54] tpm_ibmvtpm_probe+0x1f4/0x580 [0.877423] Call Trace: [0.877430] [c00086c03800] [c8639e6c] tpm_ibmvtpm_probe+0x10c/0x580 (unreliable) [0.877442] [c00086c038a0] [c8033a1c] vio_bus_probe+0x1bc/0x480 [0.877451] [c00086c03940] [c864b15c] driver_probe_device+0xec/0x470 [0.877461] [c00086c039d0] [c864b69c] __driver_attach+0x11c/0x120 [0.877469] [c00086c03a10] [c864817c] bus_for_each_dev+0x9c/0x110 [0.877478] [c00086c03a60] [c864a8fc] driver_attach+0x3c/0x60 [0.877486] [c00086c03a90] [c864a2d8] bus_add_driver+0x208/0x320 [0.877495] [c00086c03b20] [c864c31c] driver_register+0x9c/0x180 [0.877504] [c00086c03b90] [c80322f8] __vio_register_driver+0x78/0xc0 [0.877513] [c00086c03c10] [c8df6de8] ibmvtpm_module_init+0x2c/0x40 [0.877523] [c00086c03c30] [c800b4bc] do_one_initcall+0x11c/0x270 [0.877532] [c00086c03d00] [c8da4100] kernel_init_freeable+0x264/0x34c [0.877543] [c00086c03dc0] [c800bdfc] kernel_init+0x2c/0x130 [0.877553] 
[c00086c03e30] [c800956c] ret_from_kernel_thread+0x5c/0x70 [0.877561] Instruction dump: [0.877567] 38e0 f8410018 7d2c4b78 4e800421 e8410018 4bfffea0 6042 38600064 [0.877583] 4bb04c35 6000 e93e0008 38600100 <80890010> 4ba46121 6000 7c6907b4 [0.877607] ---[ end trace c0064a96755f8f13 ]--- [0.881558] [2.881696] Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b [2.881696] [2.885423] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b [2.885423] == Comment: #6 - HON CHI
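The crashkernel= value shown in /proc/cmdline above uses the kernel's range syntax, start-[end]:size, where the size of the first range containing the machine's RAM is reserved for the kdump kernel. A minimal sketch of that lookup follows, assuming the start of each range is inclusive and the end exclusive; the helper names parse_size and crashkernel_reservation are hypothetical and not part of kexec-tools.

```python
import re

# Multipliers for the K/M/G suffixes used in the crashkernel= parameter.
_UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a size such as '320M' or '4G' to bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS.get(unit, 1)


def crashkernel_reservation(spec, ram_bytes):
    """Return the bytes reserved for a machine with ram_bytes of RAM.

    Assumption: a range matches when start <= RAM < end, and an empty
    end (as in '128G-') means "no upper bound". Returns 0 if no range
    matches.
    """
    for entry in spec.split(","):
        range_part, size_part = entry.split(":")
        start_text, end_text = range_part.split("-")
        start = parse_size(start_text)
        end = parse_size(end_text) if end_text else None
        if ram_bytes >= start and (end is None or ram_bytes < end):
            return parse_size(size_part)
    return 0


# Value copied from the /proc/cmdline output in the report.
SPEC = "2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M"

if __name__ == "__main__":
    # Hypothetical 8 GiB guest; the report does not state the LPAR's memory.
    print(crashkernel_reservation(SPEC, 8 << 30) // (1 << 20), "MiB reserved")
```

With this specification a guest with less than 2G of RAM falls outside every range and gets no reservation, which is worth ruling out before debugging a kdump boot failure.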
[Kernel-packages] [Bug 1555765] Re: Backport upstream bugfixes to ubuntu-16.04
** Package changed: ubuntu => linux (Ubuntu)

-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1555765

Title: Backport upstream bugfixes to ubuntu-16.04
Status in linux package in Ubuntu: New

Bug description:
Backport the following upstream "powernv-cpufreq" bug fixes to Ubuntu 16.04. The patchset listed below has been accepted upstream and is in linux-next.

1) 86622cb8c57abb05fe95bea3a068949c0ca79fc3 cpufreq: powernv: Free 'chips' on module exit
2) 6d167a44e6c8da3316e037b788585fcf96112bea cpufreq: powernv: Hot-plug safe the kworker thread
3) 96c4726f01cdbf53acf74cf2394e287d74bf40a3 cpufreq: powernv: Remove cpu_to_chip_id() from hot-path
4) 0306e481d479a58eff17c27adf213fbb5822946b cpufreq: powernv/tracing: Add powernv_throttle tracepoint
5) c89f2682a39192433c296bf97b834fd2815a758b cpufreq: powernv: Replace pr_info with trace print for throttle event

Two more patches that are posted upstream:

6) http://marc.info/?l=linuxppc-embedded&m=145648325218686&w=2
7) http://marc.info/?l=linux-pm&m=145648316718658&w=2

What each patch does:

1) Fixes a memory leak.
2) Makes the worker thread, which is scheduled on an OCC event, hot-plug safe.
3) Fixes the bug below, in which the powernv_cpufreq_throttle_check() function causes 4% CPU overhead.

    4.44% [k] _raw_spin_lock_irqsave
     |
     |--48.56%-- _raw_spin_lock_irqsave
     |          |
     |          |--72.55%-- of_find_property
     |          |          |
     |          |          |--62.14%-- of_n_addr_cells
     |          |          |          __of_find_n_match_cpu_property
     |          |          |          |
     |          |          |          |--75.72%-- of_get_cpu_node
     |          |          |          |          cpu_to_chip_id
     |          |          |          |          powernv_cpufreq_throttle_check
     |          |          |          |          powernv_cpufreq_target_index
     |          |          |          |          __cpufreq_driver_target

4), 5) and 7) are permanent fixes for BZ 131119 and BZ 130125, which complained about the throttle message reporting.
6) Fixes a bug in powernv_cpufreq_{init/exit}.
[Kernel-packages] [Bug 1370421] Re: BUG: soft lockup - CPU#15 stuck for 59737s! [genload:22734]
** Changed in: linux (Ubuntu) Status: Confirmed => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1370421 Title: BUG: soft lockup - CPU#15 stuck for 59737s! [genload:22734] Status in linux package in Ubuntu: Invalid Bug description: == Comment: #0 - ABDUL HALEEM - 2014-09-01 05:24:37 == ---Problem Description--- CPU stalls and soft lockup on cpu while running ltpstresstest.sh test of LTP suite, detailed syslog and the test logs are attached Contact Information = abdha...@in.ibm.com ---uname output--- Linux ubuntu 3.16.0-10-generic #15-Ubuntu SMP Thu Aug 21 16:32:31 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux Machine Type = POWER8 ---Debugger--- A debugger is not configured ---Steps to Reproduce--- - Ubuntu 14.10 LE guest running on Power 8 machine with Power KVM build 2_1_1.8 - Download and build LTP suite on the guest. run /opt/ltp/testscripts/ltpstress.sh -d /tmp/sardata -l /tmp/ltplog.12028 -m 128 -t 24 -S - After 2hrs of test run, dmesg start throwing below trace messages. syslog: - Aug 31 09:31:59 ubuntu kernel: [83796.274731] Adding 576k swap on swapfile29. 
Priority:-29 extents:1 across:576k FS Aug 31 09:32:00 ubuntu in.rshd[8457]: connect from 127.0.0.1 (127.0.0.1) Aug 31 09:32:01 ubuntu in.rshd[8459]: connect from 127.0.0.1 (127.0.0.1) Aug 31 09:32:02 ubuntu in.rshd[8461]: connect from 127.0.0.1 (127.0.0.1) Sep 1 04:42:36 ubuntu kernel: [147953.248523] INFO: rcu_sched detected stalls on CPUs/tasks: { 15} (detected by 2, t=92214 jiffies, g=440674, c=440673, q=304) Sep 1 04:42:36 ubuntu kernel: [147953.248720] Task dump for CPU 15: Sep 1 04:42:36 ubuntu kernel: [147953.248725] genload R running task0 22734 22733 0x0004 Sep 1 04:42:36 ubuntu kernel: [147953.248730] Call Trace: Sep 1 04:42:36 ubuntu kernel: [147953.248740] [c33239b0] [c0056fe4] ht64_call_hpte_insert1+0x4/0x3c (unreliable) Sep 1 04:42:36 ubuntu kernel: [147953.248745] [c3323ab0] [c00532c8] hash_preload+0x2f8/0x300 Sep 1 04:42:36 ubuntu kernel: [147953.248748] [c3323b30] [c004eaf0] update_mmu_cache+0xf0/0x110 Sep 1 04:42:36 ubuntu kernel: [147953.248753] [c3323b70] [c023559c] handle_mm_fault+0xa0c/0x11b0 Sep 1 04:42:36 ubuntu kernel: [147953.248758] [c3323c10] [c09e58dc] do_page_fault+0x71c/0x990 Sep 1 04:42:36 ubuntu kernel: [147953.248762] [c3323e30] [c0009568] handle_page_fault+0x10/0x30 Sep 1 04:42:36 ubuntu kernel: [147953.250365] INFO: rcu_sched detected stalls on CPUs/tasks: { 15} (detected by 2, t=16035133 jiffies, g=440674, c=440673, q=304) Sep 1 04:42:36 ubuntu kernel: [147953.250519] Task dump for CPU 15: Sep 1 04:42:36 ubuntu kernel: [147953.250522] genload R running task0 22734 22733 0x0004 Sep 1 04:42:36 ubuntu kernel: [147953.250525] Call Trace: Sep 1 04:42:36 ubuntu kernel: [147953.250528] [c33239b0] [c0056fe4] ht64_call_hpte_insert1+0x4/0x3c (unreliable) Sep 1 04:42:36 ubuntu kernel: [147953.250532] [c3323ab0] [c00532c8] hash_preload+0x2f8/0x300 Sep 1 04:42:36 ubuntu kernel: [147953.250535] [c3323b30] [c004eaf0] update_mmu_cache+0xf0/0x110 Sep 1 04:42:36 ubuntu kernel: [147953.250538] [c3323b70] [c023559c] handle_mm_fault+0xa0c/0x11b0 Sep 
1 04:42:36 ubuntu kernel: [147953.250541] [c3323c10] [c09e58dc] do_page_fault+0x71c/0x990
Sep 1 04:42:36 ubuntu kernel: [147953.250544] [c3323e30] [c0009568] handle_page_fault+0x10/0x30
Sep 1 04:42:36 ubuntu kernel: [147953.257562] BUG: soft lockup - CPU#15 stuck for 59737s! [genload:22734]
Sep 1 04:42:36 ubuntu kernel: [147953.257647] Modules linked in: nfsv2 nfsv3 nfsd auth_rpcgss nfs_acl nfs lockd sunrpc fscache pseries_rng rtc_generic e1000 ohci_pci

Other details : --

@ubuntu:/tmp$ lscpu
Architecture:          ppc64le
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             16
NUMA node(s):          1
Model:                 IBM pSeries (emulated by qemu)
L1d cache:             64K
L1i cache:             32K
NUMA node0 CPU(s):     0-15

@ubuntu:/tmp$ free
             total       used       free     shared    buffers     cached
Mem:       2072704     892480    1180224        448     274240     132480
-/+ buffers/cache:     485760    1586944
Swap:      3460160      35392    3424768

@ubuntu:/tmp$ uptime
05:22:02 up 1 day, 19:06, 2 users, load average: 10.67, 9.10, 9.32

Thanks

== Comment: #1 - ABDUL HALEEM - 2014-09-01 05:31:58 ==
== Comment: #2 - ABDUL HALEEM - 2014-09-01 05:36:48 =
[Kernel-packages] [Bug 1541534] Re: s390/cio: update measurement characteristics
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1541534 Title: s390/cio: update measurement characteristics Status in linux package in Ubuntu: New Bug description: Description: s390/cio: update measurement characteristics Symptom: lschp shows stale information in the "Cmg" and "Shared" column. Problem: Measurement characteristics are read only during IPL and are not updated when capabilities of a chpid change. Solution: Keep measurement characteristics up to date. Reproduction: chchp -c 1 ; lschp Upstream-ID: 0d9bfe9123cfde59bf5c2e375b59d2a7d5061c4c 61f0bfcf8020f02eb09adaef96745d1c1d1b3623 9f3d6d7a40a178b8a5b5274f4e55fec8c30147c9 Please integrate the following upstream commit IDs into the Ubuntu kernel: commit 0d9bfe9123cfde59bf5c2e375b59d2a7d5061c4c Author: Sebastian Ott Date: Mon Jan 25 10:30:27 2016 +0100 s390/cio: fix measurement characteristics memleak Measurement characteristics are allocated during channel path registration but not freed during deregistration. Fix this by embedding these characteristics inside struct channel_path. Signed-off-by: Sebastian Ott Reviewed-by: Peter Oberparleiter Signed-off-by: Martin Schwidefsky commit 61f0bfcf8020f02eb09adaef96745d1c1d1b3623 Author: Sebastian Ott Date: Mon Jan 25 10:31:33 2016 +0100 s390/cio: ensure consistent measurement state Make sure that in all cases where we could not obtain measurement characteristics the associated fields are set to invalid values. Note: without this change the "shared" capability of a channel path for which we could not obtain the measurement characteristics was incorrectly displayed as 0 (not shared). We will now correctly report "unknown" in this case. 
Signed-off-by: Sebastian Ott Reviewed-by: Peter Oberparleiter Signed-off-by: Martin Schwidefsky commit 9f3d6d7a40a178b8a5b5274f4e55fec8c30147c9 Author: Sebastian Ott Date: Mon Jan 25 10:32:51 2016 +0100 s390/cio: update measurement characteristics Per channel path measurement characteristics are obtained during channel path registration. However if some properties of a channel path change we don't update the measurement characteristics. Make sure to update the characteristics when we change the properties of a channel path or receive a notification from FW about such a change. Signed-off-by: Sebastian Ott Reviewed-by: Peter Oberparleiter Signed-off-by: Martin Schwidefsky
[Kernel-packages] [Bug 1541907] Re: qeth: layer2 reports unknown state to network tools.
** Package changed: ubuntu => linux (Ubuntu)

-- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1541907

Title: qeth: layer2 reports unknown state to network tools.
Status in linux package in Ubuntu: New

Bug description:
Please backport and integrate the upstream Linux commit below:

commit e5ebe63214d44d4dcf43df02edf3613e04d671b9
Author: Ursula Braun
Date: Fri Dec 11 12:27:55 2015 +0100

qeth: initialize net_device with carrier off

/sys/class/net//operstate for an active qeth network interface often shows "unknown", which translates to "state UNKNOWN" in the output of "ip link show". It is caused by a missing initialization of the __LINK_STATE_NOCARRIER bit in the net_device state field. This patch adds a netif_carrier_off() invocation when creating the net_device for a qeth device.

Signed-off-by: Ursula Braun
Acked-by: Hendrik Brueckner
Reference-ID: Bugzilla 133209
Signed-off-by: David S. Miller
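To check whether an interface is affected, the operstate attribute that the commit message refers to can be read directly from sysfs. The sketch below is only an illustration: the helper names and the interface name enc600 are hypothetical, and the uppercase mapping simply mirrors the "state UNKNOWN" text quoted in the commit message.

```python
from pathlib import Path


def read_operstate(interface, sysfs_root="/sys/class/net"):
    """Return the operstate string the kernel reports for an interface.

    sysfs_root is a parameter only so the function can be pointed at a
    test directory; on a real system the default applies.
    """
    path = Path(sysfs_root) / interface / "operstate"
    return path.read_text().strip()


def ip_link_state(operstate):
    """Render an operstate value the way it appears in 'ip link show'."""
    return "state " + operstate.upper()


if __name__ == "__main__":
    # 'enc600' is a hypothetical qeth interface name; substitute your own.
    try:
        print(ip_link_state(read_operstate("enc600")))
    except FileNotFoundError:
        print("interface not present on this system")
```

On a kernel without the fix, an active qeth interface would be expected to yield "state UNKNOWN" here even though traffic flows.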
[Kernel-packages] [Bug 1463654] Re: Kernel WARN @drivers/base/memory.c:200 during DLPAR memory operation
** Changed in: ubuntu Assignee: (unassigned) => Taco Screen team (taco-screen-team) ** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1463654 Title: Kernel WARN @drivers/base/memory.c:200 during DLPAR memory operation Status in linux package in Ubuntu: New Bug description: ---Problem Description--- Kernel WARN @drivers/base/memory.c:200 during DLPAR memory operation Contact Information = Sachin Sant / ss...@in.ibm.com ---uname output--- 3.19.0-18-generic ---Patches Installed--- A patched powerpc-ibm-utils package is required Machine Type = POWER8 ---Debugger--- A debugger is not configured ---Steps to Reproduce--- 1) Using latest daily ISO install 14.04.02 as a Power VM guest 2) Upgrade the kernel to 3.19 level (3.19.0-18-generic) 3) Ensure ksh and powerpc-ibm-utils packages are installed. 4) Download following DLPAR packages from http://ausgsa.ibm.com/projects/r/rsctdev/builds/muthu/rmuts006a/ppc64le/ devices.chrp.base.servicerm_2.5.0.1-15111_ppc64el.deb dynamicrm_2.0.1-3_ppc64el.deb rsct.core_3.2.0.6-15111_ppc64el.deb rsct.core.utils_3.2.0.6-15111_ppc64el.deb src_3.2.0.6-15111_ppc64el.deb 5) Install the packages. 
6) Perform a add memory operation via HMC Stack trace output: alp9 kernel: [44170.234662] [ cut here ] alp9 kernel: [44170.234667] WARNING: at /build/buildd/linux-lts-vivid-3.19.0/drivers/base/memory.c:200 alp9 kernel: [44170.234668] Modules linked in: rpadlpar_io rpaphp pseries_rng rtc_generic alp9 kernel: [44170.234675] CPU: 2 PID: 1391 Comm: systemd-udevd Not tainted 3.19.0-18-generic #18~14.04.1-Ubuntu alp9 kernel: [44170.234677] task: c003dbea7c80 ti: c003dbf1 task.ti: c003dbf1 Jun 4 19:44:47 alp9 kernel: [44170.234678] NIP: c0668f34 LR: c06699b0 CTR: Jun 4 19:44:47 alp9 kernel: [44170.234680] REGS: c003dbf13910 TRAP: 0700 Not tainted (3.19.0-18-generic) Jun 4 19:44:47 alp9 kernel: [44170.234680] MSR: 80029033 CR: 28042888 XER: 2000 Jun 4 19:44:47 alp9 kernel: [44170.234686] CFAR: c0668ed8 SOFTE: 1 Jun 4 19:44:47 alp9 kernel: [44170.234686] GPR00: c06699b0 c003dbf13b90 c144c760 0001 Jun 4 19:44:47 alp9 kernel: [44170.234686] GPR04: 0779 0100 00078000 f1de4000 Jun 4 19:44:47 alp9 kernel: [44170.234686] GPR08: c13ac760 0001 7790 003f Jun 4 19:44:47 alp9 kernel: [44170.234686] GPR12: c172cb00 ce831200 0100074a0010 Jun 4 19:44:47 alp9 kernel: [44170.234686] GPR16: 10032230 100311d0 3fffcb59bf20 0003 Jun 4 19:44:47 alp9 kernel: [44170.234686] GPR20: 100527f8 100322b0 01312d00 0100074af7f0 Jun 4 19:44:47 alp9 kernel: [44170.234686] GPR24: 100322d0 3fffcb59bf20 c003dbf13e00 Jun 4 19:44:47 alp9 kernel: [44170.234686] GPR28: 1000 00077000 00077900 Jun 4 19:44:47 alp9 kernel: [44170.234707] NIP [c0668f34] pages_correctly_reserved+0x134/0x1c0 Jun 4 19:44:47 alp9 kernel: [44170.234709] LR [c06699b0] memory_subsys_online+0x70/0x140 Jun 4 19:44:47 alp9 kernel: [44170.234710] Call Trace: Jun 4 19:44:47 alp9 kernel: [44170.234711] [c003dbf13b90] [0006] 0x6 (unreliable) Jun 4 19:44:47 alp9 kernel: [44170.234714] [c003dbf13c00] [c06699b0] memory_subsys_online+0x70/0x140 Jun 4 19:44:47 alp9 kernel: [44170.234716] [c003dbf13c40] [c06476f4] device_online+0xb4/0x120 Jun 4 19:44:47 
alp9 kernel: [44170.234718] [c003dbf13c80] [c066987c] store_mem_state+0x8c/0x150 Jun 4 19:44:47 alp9 kernel: [44170.234721] [c003dbf13cc0] [c0643618] dev_attr_store+0x68/0xa0 Jun 4 19:44:47 alp9 kernel: [44170.234724] [c003dbf13d00] [c035afd0] sysfs_kf_write+0x80/0xb0 Jun 4 19:44:47 alp9 kernel: [44170.234726] [c003dbf13d40] [c0359f0c] kernfs_fop_write+0x18c/0x1f0 Jun 4 19:44:47 alp9 kernel: [44170.234730] [c003dbf13d90] [c02b450c] vfs_write+0xdc/0x260 Jun 4 19:44:47 alp9 kernel: [44170.234732] [c003dbf13de0] [c02b53bc] SyS_write+0x6c/0x110 Jun 4 19:44:47 alp9 kernel: [44170.234735] [c003dbf13e30] [c0009258] system_call+0x38/0xd0 Jun 4 19:44:47 alp9 kernel: [44170.234736] Instruction dump: Jun 4 19:44:47 alp9 kernel: [44170.234737] 419e0024 788a2428 7d095214 2fa8 41de0014 7d29502a 38e74000 7928ffe3 Jun 4 19:44:47 alp9 kernel: [44170.234740] 4082ff7c 3d02fff6 892808e3 69290001 <0b09> 2fa9 40de0068 3860 Jun 4 19:44:47
[Kernel-packages] [Bug 1426216] Re: pci driver messages - BAR 13: no space for
Doing cleanup and marking bug as Invalid as it was. ** Changed in: linux (Ubuntu) Status: Confirmed => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1426216 Title: pci driver messages - BAR 13: no space for Status in linux package in Ubuntu: Invalid Bug description: ---Problem Description--- pci driver messages - BAR 13: no space for ---uname output--- 3.18.0-12-generic Machine Type = POWER8 ---Steps to Reproduce--- Install Ubuntu 15.04 daily build running bare-metal After the installation following messages are seen in dmesg. Not sure if these indicate a problem. Can some one look at these and confirm if this is problem or can be ignored ? Messages from dmesg log [0.714476] pci_bus 0001:00: max bus depth: 3 pci_try_num: 4 [0.714519] pci 0001:04:00.0: reg 0x16c: [mem 0x-0x 64bit] [0.714541] pci 0001:04:00.0: reg 0x174: [mem 0x-0x 64bit] [0.714577] pci 0001:00:00.0: BAR 15: assigned [mem 0x3b101000-0x3b103fff 64bit pref] [0.714579] pci 0001:00:00.0: BAR 14: assigned [mem 0x3fe08000-0x3fe081ff] [0.714581] pci 0001:00:00.0: BAR 13: no space for [io size 0x3000] [0.714583] pci 0001:00:00.0: BAR 13: failed to assign [io size 0x3000] [0.714586] pci 0001:01:00.0: BAR 15: assigned [mem 0x3b101000-0x3b103fff 64bit pref] [0.714588] pci 0001:01:00.0: BAR 14: assigned [mem 0x3fe08000-0x3fe0817f] [0.714590] pci 0001:01:00.0: BAR 0: assigned [mem 0x3fe08180-0x3fe08183] [0.714595] pci 0001:01:00.0: BAR 13: no space for [io size 0x3000] [0.714597] pci 0001:01:00.0: BAR 13: failed to assign [io size 0x3000] [0.714600] pci 0001:02:01.0: BAR 15: assigned [mem 0x3b101000-0x3b101fff 64bit pref] [0.714602] pci 0001:02:08.0: BAR 15: assigned [mem 0x3b102000-0x3b102fff 64bit pref] [0.714604] pci 0001:02:09.0: BAR 15: assigned [mem 0x3b103000-0x3b103fff 64bit pref] [0.714606] pci 0001:02:01.0: BAR 14: assigned [mem 0x3fe08000-0x3fe0807f] [0.714608] pci 0001:02:08.0: BAR 14: 
assigned [mem 0x3fe08080-0x3fe080ff] [0.714609] pci 0001:02:09.0: BAR 14: assigned [mem 0x3fe08100-0x3fe0817f] [0.714611] pci 0001:02:01.0: BAR 13: no space for [io size 0x1000] [0.714613] pci 0001:02:01.0: BAR 13: failed to assign [io size 0x1000] [0.714614] pci 0001:02:08.0: BAR 13: no space for [io size 0x1000] [0.714616] pci 0001:02:08.0: BAR 13: failed to assign [io size 0x1000] [0.714618] pci 0001:02:09.0: BAR 13: no space for [io size 0x1000] [0.714619] pci 0001:02:09.0: BAR 13: failed to assign [io size 0x1000] [0.714622] pci 0001:03:00.0: BAR 6: assigned [mem 0x3fe08000-0x3fe08003 pref] [0.714624] pci 0001:03:00.1: BAR 6: assigned [mem 0x3fe08004-0x3fe08007 pref] [0.714627] pci 0001:03:00.0: BAR 2: assigned [mem 0x3fe08008-0x3fe080083fff 64bit] [0.714641] pci 0001:03:00.1: BAR 2: assigned [mem 0x3fe080084000-0x3fe080087fff 64bit] [0.714656] pci 0001:03:00.0: BAR 0: assigned [mem 0x3fe080088000-0x3fe080088fff 64bit] [0.714670] pci 0001:03:00.1: BAR 0: assigned [mem 0x3fe080089000-0x3fe080089fff 64bit] [0.714685] pci 0001:03:00.0: BAR 4: no space for [io size 0x0100] [0.714686] pci 0001:03:00.0: BAR 4: failed to assign [io size 0x0100] [0.714688] pci 0001:03:00.1: BAR 4: no space for [io size 0x0100] [0.714690] pci 0001:03:00.1: BAR 4: failed to assign [io size 0x0100] [0.714691] pci 0001:02:01.0: PCI bridge to [bus 03] # lspci :00:00.0 PCI bridge: IBM Device 03dc 0001:00:00.0 PCI bridge: IBM Device 03dc 0001:01:00.0 PCI bridge: PLX Technology, Inc. PEX 8732 32-lane, 8-Port PCI Express Gen 3 (8.0 GT/s) Switch (rev ca) 0001:02:01.0 PCI bridge: PLX Technology, Inc. PEX 8732 32-lane, 8-Port PCI Express Gen 3 (8.0 GT/s) Switch (rev ca) 0001:02:08.0 PCI bridge: PLX Technology, Inc. PEX 8732 32-lane, 8-Port PCI Express Gen 3 (8.0 GT/s) Switch (rev ca) 0001:02:09.0 PCI bridge: PLX Technology, Inc. 
PEX 8732 32-lane, 8-Port PCI Express Gen 3 (8.0 GT/s) Switch (rev ca) 0001:03:00.0 Fibre Channel: Emulex Corporation Saturn-X: LightPulse Fibre Channel Host Adapter (rev 03) 0001:03:00.1 Fibre Channel: Emulex Corporation Saturn-X: LightPulse Fibre Channel Host Adapter (rev 03) 0001:04:00.0 RAID bus controller: IBM PCI-E IPR SAS Adapter (ASIC) (rev 01) 0002:00:00.0 PCI bridge: IBM Device 03dc 0003:00:00.0 PCI bridge: IBM Device 03dc 0003:01:00.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca) 0003:02:01.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca) 0003:02:08.0 PCI bridge: PLX Technology, Inc. Device 8748 (rev ca) 0003:02:09.0 PCI bridge: PLX Technology, Inc. Device
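When triaging messages like these, it helps to list which devices failed and what kind of resource they asked for. The following is a rough sketch, assuming the dmesg line format shown in this report; the function name unassigned_bars and the sample text are illustrative only.

```python
import re
from collections import Counter

# Matches lines such as:
#   pci 0001:00:00.0: BAR 13: no space for [io size 0x3000]
# The follow-up "failed to assign" lines are deliberately not matched,
# so each failure is counted once.
BAR_FAILURE = re.compile(
    r"pci (?P<device>[0-9a-f:.]+): BAR (?P<bar>\d+): "
    r"no space for \[(?P<kind>io|mem) size (?P<size>0x[0-9a-f]+)"
)


def unassigned_bars(dmesg_text):
    """Return (device, BAR number, resource kind, size in bytes) tuples."""
    return [
        (m["device"], int(m["bar"]), m["kind"], int(m["size"], 16))
        for m in BAR_FAILURE.finditer(dmesg_text)
    ]


# A few lines copied from the dmesg log in the report.
SAMPLE = """\
[0.714579] pci 0001:00:00.0: BAR 14: assigned [mem 0x3fe08000-0x3fe081ff]
[0.714581] pci 0001:00:00.0: BAR 13: no space for [io size 0x3000]
[0.714583] pci 0001:00:00.0: BAR 13: failed to assign [io size 0x3000]
[0.714685] pci 0001:03:00.0: BAR 4: no space for [io size 0x0100]
"""

if __name__ == "__main__":
    failures = unassigned_bars(SAMPLE)
    for device, bar, kind, size in failures:
        print(f"{device} BAR {bar}: {kind} window of {size:#x} bytes unassigned")
    print(Counter(kind for _, _, kind, _ in failures))
```

In the log excerpt quoted in this report, every failing BAR is an [io ...] window while the [mem ...] windows are assigned, which is the distinction this summary makes visible.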
[Kernel-packages] [Bug 1533351] Re: DLPAR operation fails on Bell adapter with Ubuntu 14.04.3 OS
** Changed in: linux-lts-utopic (Ubuntu) Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-lts-utopic in Ubuntu. https://bugs.launchpad.net/bugs/1533351 Title: DLPAR operation fails on Bell adapter with Ubuntu 14.04.3 OS Status in linux-lts-utopic package in Ubuntu: Invalid Bug description: == Comment: #0 - HARSHA THYAGARAJA - 2015-11-06 04:10:32 == ---Problem Description--- DLPAR operation fails on Bell adapter Contact Information = hathy...@in.ibm.com, iranna.an...@in.ibm.com ---uname output--- Linux tuletapio1-lp5 3.13.0-67-generic #110-Ubuntu SMP Fri Oct 23 13:24:51 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux Machine Type = 8286-41A ---Steps to Reproduce--- Necessary packages installed are: devices.chrp.base.servicerm_2.5.0.1-15111_ppc64el.deb dynamicrm_2.0.1-3_ppc64el.deb rsct.core_3.2.0.6-15111_ppc64el.deb rsct.core.utils_3.2.0.6-15111_ppc64el.deb src_3.2.0.6-15111_ppc64el.deb On the OS: root@tuletapio1-lp5:~# startsrc -g rsct 0513-059 The ctcas Subsystem has been started. Subsystem PID is 1382. 0513-029 The ctrmc Subsystem is already active. Multiple instances are not supported. root@tuletapio1-lp5:~# startsrc -g rsct_rm 0513-029 The IBM.MgmtDomainRM Subsystem is already active. Multiple instances are not supported. 0513-059 The IBM.ERRM Subsystem has been started. Subsystem PID is 1389. 0513-029 The IBM.HostRM Subsystem is already active. Multiple instances are not supported. 0513-059 The IBM.AuditRM Subsystem has been started. Subsystem PID is 1390. 0513-059 The IBM.SensorRM Subsystem has been started. Subsystem PID is 1393. 0513-029 The IBM.DRM Subsystem is already active. Multiple instances are not supported. 0513-029 The IBM.ServiceRM Subsystem is already active. Multiple instances are not supported. 
root@tuletapio1-lp5:~# lssrc -a
Subsystem         Group     PID    Status
ctrmc             rsct      921    active
IBM.DRM           rsct_rm   1025   active
IBM.MgmtDomainRM  rsct_rm   1130   active
IBM.HostRM        rsct_rm   1143   active
IBM.ServiceRM     rsct_rm   1183   active
ctcas             rsct      1382   active
IBM.ERRM          rsct_rm   1389   active
IBM.AuditRM       rsct_rm   1390   active
IBM.SensorRM      rsct_rm   1393   active
In the HMC: Run the command:
hscroot@pwrio-hmc:~> lshwres -r io -m tuletapio1-fsp --rsubtype slot --filter "lpar_names=tuletapio1-lp5-iranna"
unit_phys_loc=U78C9.001.WZS00CH,bus_id=24,phys_loc=C6,drc_index=21010018,lpar_name=tuletapio1-lp5-iranna,lpar_id=5,slot_io_pool_id=none,description=Quad Async EIA-232 PCI-Express Adapter,feature_codes=none,pci_vendor_id=114F,pci_device_id=00B6,pci_subs_vendor_id=114F,pci_subs_device_id=00B6,pci_class=,pci_revision_id=AA,bus_grouping=0,iop=0,parent_slot_drc_index=none,drc_name=U78C9.001.WZS00CH-P1-C6,interposer_present=0,interposer_pcie=0,lpar_assignment_capable=1,dynamic_lpar_assignment_capable=1
hscroot@pwrio-hmc:~> chhwres -r io -m tuletapio1-fsp -o r --id 5 -l 21010018
HSCL2929 The dynamic removal of I/O resources failed: The I/O slot dynamic partitioning operation failed. Here are the I/O slot IDs that failed and the reasons for failure: Validating PHB DLPAR capability...yes.
failed to open /sys/bus/pci/slots/U78C9.001.WZS00CH-P1-C6/power: No such file or directory failed to disable hotplug children kernel remove failed for PHB 24, rc = -1 Observed in the terminal: Nov 4 05:26:43 tuletapio1-lp5 kernel: [ 553.125671] rpadlpar_io: slot PHB 24 removed Nov 4 05:26:44 tuletapio1-lp5 kernel: [ 554.125766] rpadlpar_io: slot PHB 24 removed Nov 4 05:26:45 tuletapio1-lp5 kernel: [ 555.125862] rpadlpar_io: slot PHB 24 removed Nov 4 05:26:46 tuletapio1-lp5 kernel: [ 556.125957] rpadlpar_io: slot PHB 24 removed Nov 4 05:26:47 tuletapio1-lp5 kernel: [ 557.126052] rpadlpar_io: slot PHB 24 removed Nov 4 05:26:48 tuletapio1-lp5 kernel: [ 558.126148] rpadlpar_io: slot PHB 24 removed Nov 4 05:26:49 tuletapio1-lp5 kernel: [ 559.126243] rpadlpar_io: slot PHB 24 removed Nov 4 05:26:50 tuletapio1-lp5 kernel: [ 560.126338] rpadlpar_io: slot PHB 24 removed Nov 4 05:26:51 tuletapio1-lp5 kernel: [ 561.126432] rpadlpar_io: slot PHB 24 removed Nov 4 05:26:52 tuletapio1-lp5 kernel: [ 562.126527] rpadlpar_io: slot PHB 24 removed Nov 4 05:26:53 tuletapio1-lp5 kernel: [ 563.126622] rpadlpar_io: slot PHB 24 removed Nov 4 05:26:54 tuletapio1-lp5 kernel: [ 564.126717] rpadlpar_io: slot PHB 24 removed Nov 4 05:26:55 tuletapio1-lp5 kernel: [ 565.126813] rpadlpar_io: slot PHB 24 removed Nov 4 05:26:56 tuletapio1-lp5 kernel: [ 566.126908] rpadlpar_io: slot PHB 24 removed Nov 4 05:26:57 tuletapio1-lp5 kernel: [ 567.127004] rpadlpar
[Kernel-packages] [Bug 1532942] Re: [EEH] Recursive fenced PHB during EEH recovery on Broadcom Shiner adapter
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1532942 Title: [EEH] Recursive fenced PHB during EEH recovery on Broadcom Shiner adapter Status in linux package in Ubuntu: New Bug description: ---Problem Description--- Recursive PHB error from EEH recovery on Shiner adapter ---uname output--- Ubuntu 14.04.3 LTS (3.19.0-30) Machine Type = Firestone ---Steps to Reproduce--- When injecting an EEH error (frozen PE) into a BCM57800 on Ubuntu 14.04.3, the EEH error is recovered automatically. However, I saw a recursive fenced PHB error during the recovery, which isn't the expected behaviour. After checking Ubuntu's kernel source repository, I found that one patch, which was recently merged into the powerpc/next branch, is missing (see below). This bug was opened to track the backport of that patch to Ubuntu's distro kernel. Please help to mirror it to Canonical accordingly, thanks! https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=353169acf1858bb2dc3f91475dafabce547de14c which reads "powerpc/eeh: Fix recursive fenced PHB on Broadcom shiner adapter"
[Kernel-packages] [Bug 1532914] Re: Surelock GA2 SP1: capiredp01: cxl_init_adapter fails for CAPI devices 0000:01:00.0 and 0005:01:00.0 after upgrading to 840.10 Platform firmware build fips840/b1208b
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1532914 Title: Surelock GA2 SP1: capiredp01: cxl_init_adapter fails for CAPI devices 0000:01:00.0 and 0005:01:00.0 after upgrading to 840.10 Platform firmware build fips840/b1208b_1604.840 Status in linux package in Ubuntu: New Bug description: Problem Description I upgraded the Platform firmware to the 840.10 Platform firmware build (b1208b_1604.840) to prepare for Surelock GA2 SP1 testing. After the upgrade, I used the ipmitool to power on capiredfsp.aus.stglabs.ibm.com and boot the Ubuntu 15.10 partition (capiredp01.aus.stglabs.ibm.com) in OPAL firmware mode. In petitboot, I saw messages for "cxl-pci 0000:01:00.0: cxl_init_adapter failed: -5" and "cxl-pci 0005:01:00.0: cxl_init_adapter failed: -5." After the partition started running, I didn't see any AFU devices in /dev/cxl/ or /sys/class/cxl/ although I was able to see PCI devices for the hardware accelerators (0000:01:00.0 and 0005:01:00.0) with the lspci command. ubuntu@capiredp01:~$ ls -l /dev/cxl/ ls: cannot access /dev/cxl/: No such file or directory ubuntu@capiredp01:~$ ls -l /sys/class/cxl/ total 0 ubuntu@capiredp01:~$ sudo lscfg | grep -i afu ubuntu@capiredp01:~$ sudo lspci|egrep -i "04cf|0477" 0000:01:00.0 Processing accelerators: IBM Device 04cf (rev 01) 0005:01:00.0 Processing accelerators: IBM Device 04cf (rev 01) ubuntu@capiredp01:~$ lsscsi -g [0:0:0:0]enclosu IBM VSBPD12M1 6GSAS03 - /dev/sg1 [0:0:1:0]cd/dvd IBM.
RMBO0140512 RA65 /dev/sr0 /dev/sg2 [0:3:0:0]no dev IBM 57D7001SISIOA0150 - /dev/sg0 [1:0:0:0]enclosu IBM VSBPD12M1 6GSAS03 - /dev/sg4 [1:0:1:0]diskIBM HUC109030CSS600 E5C6 /dev/sda /dev/sg5 [1:0:2:0]diskIBM HUC101212CSS600 A5AA /dev/sdb /dev/sg6 [1:0:3:0]diskIBM HUC101212CSS600 A5AA /dev/sdc /dev/sg7 [1:0:4:0]diskIBM HUC101212CSS600 A5AA /dev/sdd /dev/sg8 [1:0:5:0]diskIBM ST1200MM0007 BF04 /dev/sde /dev/sg9 [1:0:6:0]diskIBM ST1200MM0007 BF04 /dev/sdf /dev/sg10 [1:3:0:0]no dev IBM 57D7001SISIOA0150 - /dev/sg3 This is a regression: the Linux kernel has failed to synchronize the PSL timebase. The corresponding error message is in the dmesg log attached in comment #4: [1.687586] PSL: Timebase sync: giving up! CAPI devices are not enabled, because of this failure. PSL Timebase sync should not be a requirement for CAPI initialization, nor should it make an initialized card become unavailable. Currently, timebase is an unused function of CAPI with hopes of adoption in the future. Support of this feature should be considered optional at this time. I'm not sure what the fastest way to fix this is, but it needs to be fixed as quickly as possible. CAPI is broken in Ubuntu 15.10. I can reproduce the bug, regardless of the skiboot level, with recent kernels. Older kernels behave as expected, regardless of the skiboot level. Firmware is not the cause of the regression, and kernel probably is. I sent this out to the capi-linux distro too, but I'll comment here as well. I'm not sure what is being looked at to determine the PSL timebase sync failed. As far as I know all PSL versions should support timebase. The only timebase error the PSL logs is if CAPP returns a status that says timebase has an error. I'd think if that is the issue that timebase has not been correctly enabled or sequenced correctly in the host CAPP. The PSL can't be enabled for timebase until the CAPP unit in the host has been enabled. I have installed a recent mainline Linux kernel (4.4.0-rc8) on capiredp01. 
I have rebooted this kernel and verified that the PSL timebase syncs without problem. I will now compare the source code of Ubuntu kernel 4.2.0-19 (that hits the bug) with the source of mainline kernel 4.4.0-rc8 (that operates as expected). I have updated the Ubuntu kernel and modules with: $ sudo apt-get install linux-image-4.2.0-23-generic $ sudo apt-get install linux-image-extra-4.2.0-23-generic I have rebooted Ubuntu kernel linux-image-4.2.0-23-generic, and found that the cxl driver hits the bug. I have also downloaded the source for this Ubuntu kernel (and modules) with: $ sudo apt-get source linux-image-4.2.0-23-generic I have recompiled and installed, and noticed that the resulting kernel bears the version 4.2.6 (??). I have rebooted this Ubuntu kernel 4.2.6 built from the Ubuntu source for 4.2.0-23-generic, and found that the timebase sync occurs normally. In short, the kernels linux-4.2.6 and linux-4.4.0-rc8 (that I have built from the source, respect
[Kernel-packages] [Bug 1486180] Re: Kernel OOPS during DLPAR operation with Fibre Channel adapter
** Changed in: linux (Ubuntu) Assignee: (unassigned) => Taco Screen team (taco-screen-team) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1486180 Title: Kernel OOPS during DLPAR operation with Fibre Channel adapter Status in linux package in Ubuntu: New Bug description: -- Problem Description -- Kernel OOPS during DLPAR operation with Fibre Channel adapter ---uname output--- 4.1.0-1-generic ---Additional Hardware Info--- Fibre Channel: Emulex Corporation Saturn-X: LightPulse Fibre Channel Host Adapter (rev 03) Machine Type = POWER8 ---Steps to Reproduce--- 1) Install Ubuntu 15.10 on a Power VM LPAR. 2) Configure and start rtas_errd daemon 3) Via HMC try to add a Fibre channel adapter via dynamic partitioning During the operation following OOPS message is observed Oops output: !!! 00E0806 Fcode, Copyright (c) 2000-2012 Emulex !!! Version 3.10x2 !!! 00E0806 Fcode, Copyright (c) 2000-2012 Emulex !!! Version 3.10x2 [ 8696.808703] PCI host bridge /pci@8002020 ranges: [ 8696.808708] MEM 0x0003ff84..0x0003ff847eff -> 0x8000 [ 8696.808716] PCI: I/O resource not set for host bridge /pci@8002020 (domain 1) [ 8696.808761] PCI host bridge to bus 0001:01 [ 8696.808765] pci_bus 0001:01: root bus resource [mem 0x3ff84-0x3ff847eff] (bus address [0x8000-0xfeff]) [ 8696.808768] pci_bus 0001:01: root bus resource [bus 01-ff] [ 8696.897390] rpaphp: Slot [U78C7.001.RCH0042-P1-C8] registered [ 8696.897395] rpadlpar_io: slot PHB 32 added [ 8696.972155] Emulex LightPulse Fibre Channel SCSI driver 10.5.0.0. [ 8696.972157] Copyright(c) 2004-2015 Emulex. All rights reserved. 
[ 8696.972438] lpfc 0001:01:00.1: enabling device (0140 -> 0142) [ 8696.976145] Unable to handle kernel paging request for data at address 0x000c [ 8696.976174] Faulting instruction address: 0xc0084cc4 [ 8696.976182] Oops: Kernel access of bad area, sig: 11 [#1] [ 8696.976188] SMP NR_CPUS=2048 NUMA pSeries [ 8696.976196] Modules linked in: lpfc(+) scsi_transport_fc rpadlpar_io rpaphp rtc_generic pseries_rng autofs4 [ 8696.976220] CPU: 3 PID: 1426 Comm: systemd-udevd Not tainted 4.1.0-1-generic #1~dogfoodv1-Ubuntu [ 8696.976230] task: c003857737e0 ti: c000fd08c000 task.ti: c000fd08c000 [ 8696.976239] NIP: c0084cc4 LR: c0084ca8 CTR: [ 8696.976247] REGS: c000fd08f0f0 TRAP: 0300 Not tainted (4.1.0-1-generic) [ 8696.976255] MSR: 80019033 CR: 8222 XER: 2000 [ 8696.976278] CFAR: c0008468 DAR: 000c DSISR: 4000 SOFTE: 1 GPR00: c0084ca8 c000fd08f370 c14bda00 GPR04: 0001 c000fd08f408 0003 d2c31e60 GPR08: c13bda00 c003873e6b80 d2ca7c98 GPR12: 8800 ce831b00 d29421f8 38ca4522 GPR16: c000fd08fdc0 c000fd08fe04 d2941878 c000fc8054c0 GPR20: d238 d238 d2ccff90 GPR24: c165074c c0038e17e000 c13b5e00 c0038e17e000 GPR28: c13b5e28 ca590600 c13b5df0 c13b5e20 [ 8696.976396] NIP [c0084cc4] enable_ddw+0x254/0x7b0 [ 8696.976405] LR [c0084ca8] enable_ddw+0x238/0x7b0 [ 8696.976411] Call Trace: [ 8696.976419] [c000fd08f370] [c0084ca8] enable_ddw+0x238/0x7b0 (unreliable) [ 8696.976431] [c000fd08f4b0] [c00866d8] dma_set_mask_pSeriesLP+0x218/0x2a0 [ 8696.976444] [c000fd08f540] [c0023528] dma_set_mask+0x58/0xa0 [ 8696.976474] [c000fd08f570] [d2c71280] lpfc_pci_probe_one+0xb0/0xc50 [lpfc] [ 8696.976486] [c000fd08f610] [c05987fc] local_pci_probe+0x6c/0x140 [ 8696.976497] [c000fd08f6a0] [c0598a28] pci_device_probe+0x158/0x1e0 [ 8696.976510] [c000fd08f700] [c067b744] driver_probe_device+0x1c4/0x5a0 [ 8696.976522] [c000fd08f790] [c067bcdc] __driver_attach+0x11c/0x120 [ 8696.976533] [c000fd08f7d0] [c067854c] bus_for_each_dev+0x9c/0x110 [ 8696.976544] [c000fd08f820] [c067adbc] driver_attach+0x3c/0x60 [ 
8696.976555] [c000fd08f850] [c067a768] bus_add_driver+0x208/0x320 [ 8696.976565] [c000fd08f8e0] [c067c99c] driver_register+0x9c/0x180 [ 8696.976576] [c000fd08f950] [c05978ec] __pci_register_driver+0x6c/0x90 [ 8696.976604] [c000fd08f990] [d2ca7848] lpfc_init+0x17c/0x1d8 [lpfc] [ 8696.976617] [c000fd08fa20] [c000b42c] do_one_initcall+0x12c/0x280 [ 8
[Kernel-packages] [Bug 1526946] Re: Surelock GA2: Kernel panic with GA candidate driver, warning at kernel/rcu/tree.c:2694
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1526946 Title: Surelock GA2: Kernel panic with GA candidate driver, warning at kernel/rcu/tree.c:2694 Status in linux package in Ubuntu: New Bug description: -- Problem Description -- System was loaded (Ubuntu 15.10 base (Linux z1391 4.2.0-16-generic #19-Ubuntu SMP Thu Oct 8 14:49:47 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux) Kernel panic running Hardware Test Exerciser test suite [ 8841.280827] Unable to handle kernel paging request for data at address 0x00100108 [ 8841.280873] Faulting instruction address: 0xc0981994 [ 8841.280902] Oops: Kernel access of bad area, sig: 11 [#1] [ 8841.280932] SMP NR_CPUS=2048 NUMA PowerNV [ 8841.281034] Modules linked in: iptable_filter ip_tables x_tables uio_pdrv_genirq uio powernv_rng sunrpc autofs4 ses enclosure cxlflash bnx2x ipr cxl mdio libcrc32c [ 8841.281055] CPU: 71 PID: 63157 Comm: hxecpu Not tainted 4.2.0-16-generic #19-Ubuntu [ 8841.281065] task: c01e252fa440 ti: c0305658c000 task.ti: c0305658c000 [ 8841.281077] NIP: c0981994 LR: c0981984 CTR: c0981940 [ 8841.281086] REGS: c0305658f920 TRAP: 0300 Not tainted (4.2.0-16-generic) [ 8841.281194] MSR: 90009033 CR: 39139953 XER: a000 [ 8841.281473] CFAR: c0008468 DAR: 00100108 DSISR: 4200 SOFTE: 1 [ 8841.281473] GPR00: c0981984 c0305658fba0 c151ae00 c01ff69de300 [ 8841.281473] GPR04: 0101 fec0 c093f168 000a [ 8841.281473] GPR08: 0100 00200200 00100100 0005 [ 8841.281473] GPR12: c0981940 cfb6a280 0001 [ 8841.281473] GPR16: c1431280 c0ad3988 7fff [ 8841.281473] GPR20: c01fcd2bb100 c0305658c000 c1429b80 [ 8841.281473] GPR24: 000a c01ff59ddb30 0001 [ 8841.281473] GPR28: c035fb079f00 c0305658c000 c01ff69de300 c035fb070f00 [ 8841.281514] NIP [c0981994] ipv4_dst_destroy+0x54/0xa0 [ 8841.281530] LR [c0981984] ipv4_dst_destroy+0x44/0xa0 [ 8841.281537] Call Trace: [ 8841.281561] 
[c0305658fba0] [c0981984] ipv4_dst_destroy+0x44/0xa0 (unreliable) [ 8841.281585] [c0305658fbd0] [c093f120] dst_destroy+0xf0/0x1a0 [ 8841.281631] [c0305658fc10] [c093f4a8] dst_destroy_rcu+0x28/0x50 [ 8841.281668] [c0305658fc40] [c013a020] rcu_process_callbacks+0x340/0x6f0 [ 8841.281692] [c0305658fcf0] [c00baef8] __do_softirq+0x188/0x3a0 [ 8841.281709] [c0305658fde0] [c00bb388] irq_exit+0xc8/0x100 [ 8841.281727] [c0305658fe00] [c001f734] timer_interrupt+0xa4/0xe0 [ 8841.281751] [c0305658fe30] [c0002714] decrementer_common+0x114/0x180 [ 8841.281762] Instruction dump: [ 8841.281812] 6000 e93f00b0 395f00b0 7fa95000 419e0048 ebdf00c0 7fc3f378 48116d99 [ 8841.281840] 6000 e93f00b8 e95f00b0 7fc3f378 f949 3d200010 61290100
[Kernel-packages] [Bug 1517142] Re: ubuntu guest with 10G n/w and Texan iSCSI crashes during FIO
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1517142 Title: ubuntu guest with 10G n/w and Texan iSCSI crashes during FIO Status in linux package in Ubuntu: New Bug description: Issues were found in iSCSI tests with remote hardware targets. Specifically, the kernel crashes when dereferencing a null pointer (sc->device->lun at libiscsi.c:369, with sc==NULL). During the crash, many messages about invalid list accesses appear in the kernel log. The commit 659743b02c41 ("[SCSI] libiscsi: Reduce locking contention in fast path") appears to be the cause. Reverting the commit resolves the issue, at least until we can discuss and find the exact problem in that commit and its solution. A test kernel was patched to revert the offending commit; Prashantha is running tests to check whether the problem is solved. With the patched kernel, I am unable to recreate the crash. The patch appears to be working. A discussion is ongoing on the linux-scsi mailing list about reverting the patch upstream (see the following link): http://marc.info/?l=linux-scsi&m=144730474819919 I also started a short discussion on the open-iscsi mailing list on Google Groups: https://groups.google.com/forum/#!topic/open-iscsi/0S5fEM_Aafk The iscsi maintainer wants to revert, but the patch co-author wants more study before reverting. Prashantha is performing some performance analysis to check the impact of the patch on iscsi performance. Mirroring to Launchpad for Canonical's awareness. Once the discussion settles on the final solution, a patch or link to the upstream commit will be provided for Canonical to review for acceptance in the 14.04 LTS kernel and SRU.
[Kernel-packages] [Bug 1509120] Re: Process accounting deadlock with idmapd callout when writing to NFSv4 mount
** Changed in: linux (Ubuntu) Assignee: (unassigned) => Taco Screen team (taco-screen-team) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1509120 Title: Process accounting deadlock with idmapd callout when writing to NFSv4 mount Status in linux package in Ubuntu: New Bug description: ---Problem Description--- System hang when process accounting to a NFSv4 mount. ---uname output--- Linux ppc001 3.19.0-30-generic #34~14.04.1-Ubuntu SMP Fri Oct 2 22:21:52 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux ---Problem Details--- We have a customer that is experiencing intermittent system hangs on their system. After a bit of debug, it was discovered that the trigger was turning on process accounting and writing to a file hosted via an NFSv4 mount. During the testing, several vmcores were captured, and the fingerprint indicates a mutex deadlock situation with process accounting. In the most recent vmcore, it appears that the scenario is something like the following: 1. PID: 4898 COMMAND: "ls" triggers a write to the process accounting file. 2. The resulting NFS write needs idmapd information and calls out to idmapd 3. The idmapd usermodehelper process triggers another process accounting update that blocks on the mutex being held by PID 4898. 
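For illustration only, the three steps above can be modelled in user space. This is a minimal Python sketch, not kernel code: `acct_lock` stands in for the process accounting mutex, `helper` for the idmapd usermodehelper process, and `write_accounting_record` for the accounting write from PID 4898. All three names and the timeouts are assumptions made up for this sketch; the real kernel waits without a timeout, which is why the system hangs instead of recovering.

```python
import threading

# Hypothetical stand-in for the process accounting mutex.
acct_lock = threading.Lock()


def helper(result):
    # Models the usermode helper exiting: its own accounting update
    # needs the same lock. The timeout exists only so the sketch ends.
    got = acct_lock.acquire(timeout=0.2)
    result["helper_got_lock"] = got
    if got:
        acct_lock.release()


def write_accounting_record():
    # Models the first task: it holds the accounting lock while the
    # write path calls out to a helper and waits for it to finish.
    result = {}
    with acct_lock:
        t = threading.Thread(target=helper, args=(result,))
        t.start()
        t.join(timeout=1.0)  # the kernel would block here indefinitely
    t.join()
    return result


if __name__ == "__main__":
    print(write_accounting_record())
```

The helper can never take the lock while the writer is waiting on it, which is the same circular wait visible in the two backtraces that follow.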
PID: 4898 TASK: c01fd26d7580 CPU: 7 COMMAND: "ls" #0 [c01fd274a950] __switch_to at c0015934 #1 [c01fd274ab20] __switch_to at c0015934 #2 [c01fd274ab80] __schedule at c0a11de8 #3 [c01fd274ada0] schedule_timeout at c0a16284 #4 [c01fd274ae90] wait_for_common at c0a1360c #5 [c01fd274af10] call_usermodehelper_exec at c00ccd38 #6 [c01fd274af70] call_sbin_request_key at c0429258 #7 [c01fd274b100] request_key_and_link at c042983c #8 [c01fd274b200] request_key at c0429978 #9 [c01fd274b240] nfs_idmap_get_key at d0002ca8b0bc [nfsv4] #10 [c01fd274b2b0] nfs_map_name_to_uid at d0002ca8bbd0 [nfsv4] #11 [c01fd274b320] decode_getfattr_attrs at d0002ca7f59c [nfsv4] #12 [c01fd274b420] decode_getfattr_generic.constprop.96 at d0002ca7fd78 [nfsv4] #13 [c01fd274b4d0] nfs4_xdr_dec_getattr at d0002ca80738 [nfsv4] #14 [c01fd274b530] rpcauth_unwrap_resp at d0001fe67180 [sunrpc] #15 [c01fd274b600] call_decode at d0001fe527c8 [sunrpc] #16 [c01fd274b6b0] __rpc_execute at d0001fe64260 [sunrpc] #17 [c01fd274b790] rpc_run_task at d0001fe54a78 [sunrpc] #18 [c01fd274b7c0] nfs4_call_sync_sequence at d0002ca60960 [nfsv4] #19 [c01fd274b860] _nfs4_proc_getattr at d0002ca6217c [nfsv4] #20 [c01fd274b930] nfs4_proc_getattr at d0002ca6f494 [nfsv4] #21 [c01fd274b9a0] __nfs_revalidate_inode at d000202cf614 [nfs] #22 [c01fd274ba30] nfs_revalidate_file_size at d000202c9618 [nfs] #23 [c01fd274ba70] nfs_file_write at d000202cabdc [nfs] #24 [c01fd274bb00] new_sync_write at c02b3d9c #25 [c01fd274bbd0] __kernel_write at c02b3fec #26 [c01fd274bc20] do_acct_process at c0166b78 #27 [c01fd274bcc0] acct_process at c016748c #28 [c01fd274bcf0] do_exit at c00b3660 #29 [c01fd274bdc0] do_group_exit at c00b3b14 #30 [c01fd274be00] sys_exit_group at c00b3bdc #31 [c01fd274be30] system_call at c0009258 PID: 4900 TASK: c03c9946c180 CPU: 16 COMMAND: "kworker/u320:2" #0 [c03c994fb790] __switch_to at c0015934 #1 [c03c994fb960] __switch_to at c0015934 #2 [c03c994fb9c0] __schedule at c0a11de8 #3 [c03c994fbbe0] schedule_preempt_disabled 
at c0a12980 #4 [c03c994fbc00] __mutex_lock_slowpath at c0a14aec #5 [c03c994fbc80] mutex_lock at c0a14c4c #6 [c03c994fbcb0] acct_get at c01663ec #7 [c03c994fbcf0] acct_process at c0167480 #8 [c03c994fbd20] do_exit at c00b3660 #9 [c03c994fbdf0] call_usermodehelper at c00ccaf4 #10 [c03c994fbe30] ret_from_kernel_thread at c000956c Historical bug data: from customer: I am uploading a crash dump file from a lock up event that I just had. I was reminded on a status update call this morning that I never tried running process accounting since opening the ticket and updating the kernel. So, I tried that this morning. The first time I turned it on with the default output location and didn't have any problems. Then I tried turning it on with the output going to our shared disk space, which is where I was originally sending it. I ran a couple commands that returned just fine, then when I ran a CUDA test program it didn't return and I verified that I was no longer able
[Kernel-packages] [Bug 1505178] Re: MFG: Habanero: hxestorage exerciser logs task blocked messages in dmesg when running disks under PMC Sierra
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1505178 Title: MFG: Habanero: hxestorage exerciser logs task blocked messages in dmesg when running disks under PMC Sierra Status in linux package in Ubuntu: New Bug description: == Comment: #0 == When running STX on Habanero systems with PMC Sierra, the following linux error messages are found when running "dmesg -T --level=alert,crit,err" after the run. [Fri Oct 2 12:36:52 2015] INFO: task hxestorage:18049 blocked for more than 120 seconds. [Fri Oct 2 12:36:52 2015] Tainted: G OE 3.19.0-25-generic #26~14.04.1-Ubuntu [Fri Oct 2 12:36:52 2015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Oct 2 12:36:52 2015] INFO: task hxestorage:18177 blocked for more than 120 seconds. [Fri Oct 2 12:36:52 2015] Tainted: G OE 3.19.0-25-generic #26~14.04.1-Ubuntu [Fri Oct 2 12:36:52 2015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Oct 2 12:36:52 2015] INFO: task hxestorage:18181 blocked for more than 120 seconds. [Fri Oct 2 12:36:52 2015] Tainted: G OE 3.19.0-25-generic #26~14.04.1-Ubuntu [Fri Oct 2 12:36:52 2015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Oct 2 12:36:52 2015] INFO: task hxestorage:18185 blocked for more than 120 seconds. [Fri Oct 2 12:36:52 2015] Tainted: G OE 3.19.0-25-generic #26~14.04.1-Ubuntu [Fri Oct 2 12:36:52 2015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Oct 2 12:36:52 2015] INFO: task hxestorage:18189 blocked for more than 120 seconds. [Fri Oct 2 12:36:52 2015] Tainted: G OE 3.19.0-25-generic #26~14.04.1-Ubuntu [Fri Oct 2 12:36:52 2015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Oct 2 12:36:52 2015] INFO: task hxestorage:18194 blocked for more than 120 seconds. 
[Fri Oct 2 12:36:52 2015] Tainted: G OE 3.19.0-25-generic #26~14.04.1-Ubuntu [Fri Oct 2 12:36:52 2015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Oct 2 12:36:52 2015] INFO: task hxestorage:18200 blocked for more than 120 seconds. [Fri Oct 2 12:36:52 2015] Tainted: G OE 3.19.0-25-generic #26~14.04.1-Ubuntu [Fri Oct 2 12:36:52 2015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Oct 2 12:36:52 2015] INFO: task hxestorage:18205 blocked for more than 120 seconds. [Fri Oct 2 12:36:52 2015] Tainted: G OE 3.19.0-25-generic #26~14.04.1-Ubuntu [Fri Oct 2 12:36:52 2015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Oct 2 12:36:52 2015] INFO: task hxestorage:18213 blocked for more than 120 seconds. [Fri Oct 2 12:36:52 2015] Tainted: G OE 3.19.0-25-generic #26~14.04.1-Ubuntu [Fri Oct 2 12:36:52 2015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Oct 2 12:36:52 2015] INFO: task hxestorage:18221 blocked for more than 120 seconds. [Fri Oct 2 12:36:52 2015] Tainted: G OE 3.19.0-25-generic #26~14.04.1-Ubuntu [Fri Oct 2 12:36:52 2015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. We are running the following code levels. ver 1.5.4.3 - OS, HTX, Firmware and Machine details OS: GNU/Linux OS Version: Ubuntu 14.04.3 LTS \n \l Kernel Version: 3.19.0-25-generic HTX Version: htxubuntu-357 Host Name: rcx2c357 Machine Serial No: 1035C5A Machine Type/Model: 8348-21C We have a very limited number of PMC Sierra configs. I've seen this error on both EC3S and ECSY PMC adapter types. We've only run systems with 6TB drives or a mix of 6TB and 8TB disk drives so far. == Comment: #5 == Call Trace: dmesg -T --- [Fri Oct 2 12:36:52 2015] INFO: task hxestorage:18049 blocked for more than 120 seconds. 
[Fri Oct 2 12:36:52 2015] Tainted: G OE 3.19.0-25-generic #26~14.04.1-Ubuntu [Fri Oct 2 12:36:52 2015] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Fri Oct 2 12:36:52 2015] hxestorage D 3fff78c69a20 0 18049 451 0x0004 [Fri Oct 2 12:36:52 2015] Call Trace: [Fri Oct 2 12:36:52 2015] [c0791de17490] [c079111f8980] 0xc079111f8980 (unreliable) [Fri Oct 2 12:36:52 2015] [c0791de17660] [c0015934] __switch_to+0x204/0x350 [Fri Oct 2 12:36:52 2015] [c0791de176c0] [c0a11948] __schedule+0x368/0x8d0 [Fri Oct 2 12:36:52 2015]
[Kernel-packages] [Bug 1502982] Re: STCOP810:Firestone: frsfp6 EEH on Bluefin does not recover with Ubuntu
** Package changed: ubuntu => linux (Ubuntu) ** Changed in: linux (Ubuntu) Assignee: (unassigned) => Taco Screen team (taco-screen-team) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1502982 Title: STCOP810:Firestone: frsfp6 EEH on Bluefin does not recover with Ubuntu Status in linux package in Ubuntu: New Bug description: Problem: Test Case Execution Record: 95613: EEH_Firestone_Ubuntu 14.04.03_Bluefin_Standalone on frsfp6 Error Injection Method: err_injct_inboundA Step 1. Start HTX (I used mdt.hdbuster and ran HTX only on the Bluefin disks). Step 2. Inject an EEH error; Bluefin is in slot P1-C4 (PCI0004): echo 0x8000 > /sys/kernel/debug/powerpc/PCI0004/err_injct_inboundA; sleep 1; echo 0x0 > /sys/kernel/debug/powerpc/PCI0004/err_injct_inboundA Expected Result: Adapter/SAN disks recover and HTX keeps running. Actual Result: Adapter did not recover; continuous EEH errors until the limit of 6 in 1 hour is reached. There are two patches: one for skiboot firmware, and a Linux patch that is already upstream but is missing from the Ubuntu distro (at least 15.04). The skiboot patch has been merged upstream. c7192a4 PHB3: Fix wrong PE number in error injection (skiboot) 2aa5cf9 powerpc/eeh: Fix missed PE#0 on P7IOC (linux) If I'm correct, this bug needs to be mirrored so that the Linux patch (commit 2aa5cf9) can be backported to the Ubuntu distro.
With the patch backported to Ubuntu 15.04, EEH works fine on a Broadcom adapter (not exactly the one on which the bug was initially reported): root@fstn2-p1:/# dmesg | grep EEH [0.216919] EEH: PowerNV platform initialized [0.570606] EEH: devices created [1.302482] EEH: PCI Enhanced I/O Error Handling Enabled [ 90.566761] EEH: PHB location: Slot1 [ 90.567503] EEH: Frozen PHB#4-PE#0 detected [ 90.567673] EEH: PE location: Slot1, PHB location: Slot1 [ 90.567930] EEH: Detected PCI bus error on PHB#4-PE#0 [ 90.567935] EEH: This PCI device has failed 1 times in the last hour [ 90.567937] EEH: Notify device drivers to shutdown [ 90.567985] EEH: Collect temporary log [ 90.568971] EEH: Reset without hotplug activity [ 94.585540] EEH: Notify device drivers the completion of reset [ 94.585934] EEH: Notify device driver to resume The story behind this bug: without commit 2aa5cf9 ("powerpc/eeh: Fix missed PE#0 on P7IOC"), PE#0 is regarded as invalid. When the kernel sees the frozen PE#0, it clears the frozen state, dumps the PHB diag-data, and then tries to recover the PE. While the PE is being reset, the driver, which was not completely stopped by error_detected(), accesses the MMIO space and causes another (recursive) EEH error. Eventually, the EEH recovery fails. During the PE reset, the I/O path for the PE should be frozen and any MMIO access in that period should be dropped to avoid a recursive EEH error.
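When checking a run like the one above, it can help to pull the per-PE failure count out of the dmesg text rather than reading it by eye. The following is a rough Python sketch under stated assumptions: the function name `eeh_failure_counts` and the `SAMPLE` string are made up here, and the two patterns are taken only from the message wording shown in this report, so other kernel versions may print something different.

```python
import re

# Patterns based solely on the two EEH messages quoted in this report.
EVENT_RE = re.compile(
    r"EEH: Detected PCI bus error on (?P<pe>PHB#\d+-PE#\d+)"
    r"|EEH: This PCI device has failed (?P<count>\d+) times in the last hour"
)


def eeh_failure_counts(dmesg_text):
    """Map each PE name to the highest failure count reported for it."""
    counts = {}
    current_pe = None
    for match in EVENT_RE.finditer(dmesg_text):
        if match.group("pe"):
            # Remember which PE the next failure-count line refers to.
            current_pe = match.group("pe")
        elif current_pe is not None:
            n = int(match.group("count"))
            counts[current_pe] = max(counts.get(current_pe, 0), n)
    return counts


# Hypothetical input: three lines copied from the dmesg output above.
SAMPLE = (
    "[ 90.567930] EEH: Detected PCI bus error on PHB#4-PE#0\n"
    "[ 90.567935] EEH: This PCI device has failed 1 times in the last hour\n"
    "[ 94.585934] EEH: Notify device driver to resume\n"
)

if __name__ == "__main__":
    print(eeh_failure_counts(SAMPLE))
```

A count that keeps climbing for the same PE within one test run would match the "continuous EEH errors" symptom in the original report, while a single failure followed by "Notify device driver to resume" matches the recovered case shown here.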
[Kernel-packages] [Bug 1500672] Re: cpufreq: powernv : Decrease the severity of console messages
** Package changed: ubuntu => linux (Ubuntu)

** Changed in: linux (Ubuntu)
     Assignee: (unassigned) => Taco Screen team (taco-screen-team)

https://bugs.launchpad.net/bugs/1500672

Title:
  cpufreq: powernv : Decrease the severity of console messages

Status in linux package in Ubuntu:
  New

Bug description:
  The "CPU Frequency could be throttled" dmesg message is logged at critical log level whenever the max frequency is clipped. If the frequency is throttled between nominal and turbo, this message should be logged at 'info' level rather than critical.

  The fix involves the two patches below:
  1) 053819e0bf8407746cc5febf7a4947bee50377b4 cpufreq: powernv: Handle throttling due to Pmax capping at chip level
  2) https://patchwork.ozlabs.org/patch/517268/
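The severity rule requested in this report can be illustrated with a small Python sketch. This is not the kernel patch; the function name and the frequency values are hypothetical, and the sketch assumes that a cap below nominal stays at critical level, which the report implies but does not state outright.

```python
def throttle_log_level(pmax_mhz, nominal_mhz, turbo_mhz):
    """Pick a log level for a Pmax cap.

    Returns None when the cap does not clip turbo, "info" when the cap
    lies between nominal and turbo, and "crit" when it falls below
    nominal (assumed behaviour, see the note above).
    """
    if pmax_mhz >= turbo_mhz:
        return None       # no throttling, nothing to log
    if pmax_mhz >= nominal_mhz:
        return "info"     # throttled, but still at or above nominal
    return "crit"         # throttled below nominal


if __name__ == "__main__":
    # Hypothetical frequencies in MHz, chosen only for illustration.
    for pmax in (3800, 3500, 3000):
        print(pmax, throttle_log_level(pmax, nominal_mhz=3300, turbo_mhz=3800))
```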
[Kernel-packages] [Bug 1499357] Re: 830 TI on Tuleta during IPL of Linux - bad xisr passed to PHYP
** Package changed: ubuntu => linux (Ubuntu) ** Changed in: linux (Ubuntu) Assignee: (unassigned) => Taco Screen team (taco-screen-team) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1499357 Title: 830 TI on Tuleta during IPL of Linux - bad xisr passed to PHYP Status in linux package in Ubuntu: New Bug description: I looked at the dump and the assert is due to a bad xisr. From the VIO trace the xisr was 01000A00. > ~d4a3/phypmacro/vio -globals -fr +-+ | HvVioGlobals (address=800807621100) | +-+ BitBucket:0x07F01A2D2000 AssertFr: 0x07F80F394C80 AssertEnabled: True VlanMap: 0x07F807621000 ++ | HvVioFr (address=07F01A282800) | ++ [ 0] HvVioInterruptAssertBadXisr [TB] 002C2067AB1A 01000A00 Here is the trace along with some Linux output that followed: Token = 34, timebase = 0x24222848 h_hypervisor_esw_call(0x504c) rc = 0xfffc (-4) 175: b= 0100 0A00 0001 [] 105: get_parms_ptr= 0100 0A00 0001 [] GET XIVE ERROR hcall rc=fffc buff_rc=1 [0.000517] irq: (null) didn't like hwirq-0x1000a00 to VIRQ16 mapping (rc=-22) [0.000578] hvsi_console_init: couldn't create irq mapping for 0x1000a00 - I then dumped the device tree for interrupts that PFW communicates to Linux via the device as follows: 1) Here are all the 'interrupt-ranges' properties found: 0 > showprops -i interrupt-ranges /ibm,platform-facilities00090400 0400 /event-sources 0009 0008 /interrupt-controller@8002510 37f8 0004 /interrupt-controller@8002513 3ff8 0004 /interrupt-controller@8002514 /interrupt-controller@8002515 /interrupt-controller@8002518 17f8 0004 /interrupt-controller@800251b /interrupt-controller@800251d 1ff8 0004 /interrupt-controller@800251e /interrupt-controller@800251f /interrupt-controller@8002521 0ff8 0004 /interrupt-controller@8002528 2ff8 0004 /interrupt-controller@8002529 27f8 0004 /vdevice000a 00c7 000b 007f 2) Here are all the 'ibm,msi-ranges' properties found: 0 > showprops -i 
ibm,msi-ranges /pci@8002014/ethernet@0 3be0 0001 /pci@8002014/ethernet@0,1 3be1 0001 /pci@8002014/ethernet@0,2 3be2 0001 /pci@8002014/ethernet@0,3 3be3 0001 /pci@8002015/pci1014,034A@0 3820 0001 /pci@8002018/pci@0/pci@2/fibre-channel@01000 0001 /pci@8002018/pci@0/pci@2/fibre-channel@0,1 1001 0001 /pci@8002018/pci@0/pci@3/fibre-channel@01002 0001 /pci@8002018/pci@0/pci@3/fibre-channel@0,1 1003 0001 /pci@800201b/usb@0 1fa0 0001 /pci@800201e/ethernet@0 1ce0 0001 /pci@800201e/ethernet@0,1 1ce1 0001 /pci@800201e/ethernet@0,2 1ce2 0001 /pci@800201e/ethernet@0,3 1ce3 0001 /pci@8002029/pci@0/pci@2/fibre-channel@02000 0001 /pci@8002029/pci@0/pci@2/fibre-channel@0,1 2001 0001 /pci@8002029/pci@0/pci@3/fibre-channel@02002 0001 /pci@8002029/pci@0/pci@3/fibre-channel@0,1 2003 0001 3) Here are all the 'interrupts' properties found: 0 > showprops -i interrupts /event-sources/epow-events 00090001 /vdevice/vty@3000 000a /vdevice/vty@3001 000a0001 /vdevice/ibm,vmc@3002 000a0002 ---
[Kernel-packages] [Bug 1496989] Re: ISST-LTE: system crashes at lpfc_sli4_scmd_to_wqidx_distr
** Package changed: ubuntu => linux (Ubuntu)

** Changed in: linux (Ubuntu)
     Assignee: (unassigned) => Taco Screen team (taco-screen-team)

https://bugs.launchpad.net/bugs/1496989

Title:
  ISST-LTE: system crashes at lpfc_sli4_scmd_to_wqidx_distr

Status in linux package in Ubuntu:
  New

Bug description:
  -- Problem Description --
  We have Ubuntu 15.10 installed on our system. After running a stress test for around 24 hrs, it crashes at lpfc_sli4_scmd_to_wqidx_distr+0x30/0x100.

  0:mon> e
  cpu 0x0: Vector: 300 (Data Access) at [ca0575a0]
      pc: d3115b30: lpfc_sli4_scmd_to_wqidx_distr+0x30/0x100 [lpfc]
      lr: d30b749c: lpfc_sli_calc_ring.part.20+0xdc/0x100 [lpfc]
      sp: ca057820
     msr: 80019033
     dar: 0
   dsisr: 4000
    current = 0xc00272dbbcf0
    paca    = 0xce7f   softe: 0   irq_happened: 0x01
      pid   = 246, comm = scsi_eh_0
  0:mon> t
  [ca057850] d30b749c lpfc_sli_calc_ring.part.20+0xdc/0x100 [lpfc]
  [ca057890] d30bf680 lpfc_sli_issue_iocb+0xf0/0x320 [lpfc]
  [ca0578f0] d30c3804 lpfc_sli_issue_iocb_wait+0x264/0x680 [lpfc]
  [ca0579d0] d3110fd4 lpfc_send_taskmgmt+0x2d4/0x7d0 [lpfc]
  [ca057aa0] d3111bf4 lpfc_device_reset_handler+0x114/0x210 [lpfc]
  [ca057b60] c071254c scsi_eh_ready_devs+0x68c/0xee0
  [ca057c50] c071474c scsi_error_handler+0x6ac/0x9e0
  [ca057d80] c00e1e20 kthread+0x110/0x130
  [ca057e30] c0009530 ret_from_kernel_thread+0x5c/0xac
  0:mon> di d3115b30
  d3115b30 e92a ld r9,0(r10)
  d3115b34 e929 ld r9,0(r9)
  d3115b38 e92901a8 ld r9,424(r9)
  d3115b3c 7928b7e3 rldicl. r8,r9,54,63
  d3115b40 40820090 bne d3115bd0 # lpfc_sli4_scmd_to_wqidx_distr+0xd0/0x100 [lpfc]
  d3115b44 813f0ae0 lwz r9,2784(r31)
  d3115b48 2f890001 cmpwi cr7,r9,1
  d3115b4c 419e0054 beq cr7,d3115ba0 # lpfc_sli4_scmd_to_wqidx_distr+0xa0/0x100 [lpfc]
  d3115b50 395f0d58 addi r10,r31,3416
  d3115b54 3921 li r9,1
  d3115b58 7c2004ac lwsync
  d3115b5c 7c605028 lwarx r3,0,r10
  d3115b60 7c691a14 add r3,r9,r3
  d3115b64 7c60512d stwcx. r3,0,r10
  d3115b68 40c2fff4 bne- d3115b5c # lpfc_sli4_scmd_to_wqidx_distr+0x5c/0x100 [lpfc]
  d3115b6c 7c0004ac sync
  0:mon> d c0ab00e0
  c0ab00e0 4c696e7578207665 7273696f6e20342e |Linux version 4.|
  c0ab00f0 322e302d372d6765 6e65726963202862 |2.0-7-generic (b|
  c0ab0100 75696c6464406465 6e6e656564303429 |uildd@denneed04)|
  c0ab0110 2028676363207665 7273696f6e20352e | (gcc version 5.|

  lpfc_sli4_scmd_to_wqidx_distr() was moved into lpfc_scsi.c and changed slightly by commit 8b0dff14164d3f43eba8365950b506d898e0e1e6. The crash appears to be due to an invalid address of 0x0 for struct scsi_cmnd *cmnd:

  int lpfc_sli4_scmd_to_wqidx_distr(struct lpfc_hba *phba,
                                    struct lpfc_scsi_buf *lpfc_cmd)
  {
          struct scsi_cmnd *cmnd = lpfc_cmd->pCmd;
          struct lpfc_vector_map_info *cpup;
          int chann, cpu;
          uint32_t tag;
          uint16_t hwq;

          if (shost_use_blk_mq(cmnd->device->host)) {
                  tag = blk_mq_unique_tag(cmnd->request);
                  hwq = blk_mq_unique_tag_to_hwq(tag);

                  return hwq;
          }

  0:mon> r
  R00 = d30b749c   R16 = ca057cd0
  R01 = ca057820   R17 = ca057cb8
  R02 = d3163d28   R18 = ca52a088
  R03 = c0027e9fe000   R19 = ca057cb0
  R04 = c0027139a400   R20 = 001e
  R05 = c0027139a470   R21 = 0001
  R06 = 0001   R22 = c180c268
  R07 = d3163d28   R23 = c0027139a470
  R08 = d310de90   R24 = c0027139a400
  R09 = 0004   R25 = ca057978
  R10 =   R26 = 0001
  R11 = d3137e20   R27 =
  R12 = 28641824   R28 = ca528000
  R13 = ce7f   R29 = c0027e9fe000
  R14 = ca057cb8   R30 = c0027139a400
  R15 =   R31 = c0027e9fe000
  pc  = d3115b30 lpfc_sli4_scmd_to_wqidx_distr+0x30/0x100 [lpfc]
  cfar= c0008468 slb_miss_realmode+0x50/0x78
  lr  = d30b749c lpfc_sli_calc_ring.part.20+0xdc/0x100 [lpfc]
  msr = 80019033   cr  = 28648828
  ctr = c0a95a70   xer = 2000   trap = 3
[Kernel-packages] [Bug 1491494] Re: Ubuntu 14.04.03 LPAR hits kernel oops after serial adapter is removed from profile
** Package changed: ubuntu => linux (Ubuntu)

** Changed in: linux (Ubuntu)
     Assignee: (unassigned) => Taco Screen team (taco-screen-team)

https://bugs.launchpad.net/bugs/1491494

Title:
  Ubuntu 14.04.03 LPAR hits kernel oops after serial adapter is removed from profile

Status in linux package in Ubuntu:
  New

Bug description:
  -- Problem Description --
  The failure is related to the BELL-3 (2 port-Async EIA-232 adapter). Ubuntu always hits an exception when the adapter is not present. See my test scenarios below.

  Test #1: Boot Ubuntu with BELL-3 adapter
  ===
  - The Ubuntu LPAR had been running with the BELL-3 (2 port-Async EIA-232 adapter) before, so I assigned the BELL-3 adapter to the Ubuntu LPAR profile and powered on the LPAR.
  => Ubuntu booted fine this time.

  Test #2: Boot Ubuntu with BELL-3 adapter removed from LPAR profile
  ===
  - I powered down the Ubuntu partition, removed the BELL-3 adapter from the LPAR profile, and then powered on the LPAR.
  => Ubuntu hit the exception.

  Elapsed time since release of system processors: 0 mins 9 secs
  error: no suitable video mode found.
  OF stdout device is: /vdevice/vty@3000
  Preparing to boot Linux version 3.19.0-23-generic (buildd@denneed03) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #24~14.04.1-Ubuntu SMP Wed Jul 8 11:17:19 UTC 2015 (Ubuntu 3.19.0-23.24~14.04.1-generic 3.19.8-ckt2)
  Detected machine type: 0101
  Max number of cores passed to firmware: 256 (NR_CPUS = 2048)
  Calling ibm,client-architecture-support... done
  command line: BOOT_IMAGE=/boot/vmlinux-3.19.0-23-generic root=UUID=768190e7-f633-4c63-a1e3-588d12dea265 ro quiet splash vt.handoff=7
  memory layout at init:
    memory_limit : (16 MB aligned)
    alloc_bottom : 0b42
    alloc_top    : 1000
    alloc_top_hi : 1000
    rmo_top      : 1000
    ram_top      : 1000
  instantiating rtas at 0x0ecb... done
  prom_hold_cpus: skipped
  copying OF device tree...
  Building dt strings...
  Building dt structure...
Device tree strings 0x0b43 -> 0x0b4316b1 Device tree struct 0x0b44 -> 0x0b47 Calling quiesce... returning from prom_init -> smp_release_cpus() spinning_secondaries = 15 <- smp_release_cpus() <- setup_system() [0.661510] /build/linux-lts-vivid-uV14Ja/linux-lts-vivid-3.19.0/drivers/rtc/hctosys.c: unable to open rtc device (rtc0) [0.672826] sd 0:0:1:0: [sda] Assuming drive cache: write through [4.658302] device-mapper: table: 252:0: multipath: error getting device [4.691990] device-mapper: table: 252:0: multipath: error getting device [4.934034] device-mapper: table: 252:0: multipath: error getting device [4.951977] device-mapper: table: 252:0: multipath: error getting device * Discovering and coalescing multipaths... [ OK ] Skipping profile in /etc/apparmor.d/disable: usr.sbin.rsyslogd * Starting AppArmor profiles[ OK ] Loading the saved-state of the serial devices... [5.109665] Unable to handle kernel paging request for data at address 0xd803 [5.109677] Faulting instruction address: 0xc060fec4 [5.109685] Oops: Kernel access of bad area, sig: 11 [#1] [5.109691] SMP NR_CPUS=2048 NUMA pSeries [5.109699] Modules linked in: dm_round_robin dm_multipath scsi_dh pseries_rng rtc_generic knem(OE) nfsd auth_rpcgss nfs_acl nfs lockd grace sunrpc fscache rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) configfs ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) mlx5_ib(OE) mlx5_core(OE) mlx4_ib(OE) ib_sa(OE) ib_mad(OE) ib_core(OE) ib_addr(OE) mlx4_en(OE) mlx4_core(OE) mlx_compat(OE) [5.109759] CPU: 1 PID: 1816 Comm: setserial Tainted: G OE 3.19.0-23-generic #24~14.04.1-Ubuntu [5.109769] task: c000f389c880 ti: c000f0528000 task.ti: c000f0528000 [5.109777] NIP: c060fec4 LR: c0617498 CTR: c060fe20 [5.109785] REGS: c000f052b6b0 TRAP: 0300 Tainted: G OE (3.19.0-23-generic) [5.109793] MSR: 80009033 CR: 84002022 XER: [5.109814] CFAR: c0008468 DAR: d803 DSISR: 4200 SOFTE: 1 GPR00: c0617498 c000f052b930 c144c700 00bf GPR04: d803 00bf c000f399 0141 GPR08: c0611d20 c13539e0 d800 c1351ba8 GPR12: 
c060fe20 ce830900 GPR16: GPR20: 007d 0040 GPR24: c0
[Kernel-packages] [Bug 1488495] Re: NX842 fixes for Ubuntu 15.10
** Package changed: ubuntu => linux (Ubuntu)

** Changed in: linux (Ubuntu)
     Assignee: (unassigned) => Taco Screen team (taco-screen-team)

https://bugs.launchpad.net/bugs/1488495

Title:
  NX842 fixes for Ubuntu 15.10

Status in linux package in Ubuntu:
  New

Bug description:
  ---Problem Description---
  There are some more NX 842 patches that should be in the 4.3 kernel. They are currently all in Herbert's tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git

  In the usual descending order:

  03952d9 crypto: nx - make platform drivers directly register with crypto
  174d66d crypto: nx - rename nx-842-crypto.c to nx-842.c
  d31581a crypto: nx - merge nx-compress and nx-compress-crypto
  20fc311 crypto: nx - use common code for both NX decompress success cases
  ee781b7 crypto: nx - don't register pSeries driver if ENODEV
  7f6e3aa crypto: nx - move kzalloc() out of spinlock
  90fd73f crypto: nx - remove pSeries NX 'status' field
  039af96 crypto: nx - remove __init/__exit from VIO functions
  23ad69a crypto: nx/842 - Fix context corruption
  2b93f7e crypto: nx - reduce chattiness of platform drivers
  7abd75b crypto: nx - do not emit extra output if status is disabled
  ec13bcb crypto: nx - rename nx842_{init, exit} to nx842_pseries_{init, exit}
  fa9a9a0 crypto: nx - nx842_OF_upd_status should return ENODEV if device is not 'okay'

  ---uname output---
  Ubuntu 15.10

  Machine Type = power 8

  I am working on back-porting these patches and will provide them early next week. These patches fix several bugs in the nx842 code.
[Kernel-packages] [Bug 1487085] Re: Ubuntu 14.04.3 LTS Crash in notifier_call_chain after boot
** Package changed: ubuntu => linux (Ubuntu)

** Changed in: linux (Ubuntu)
     Assignee: (unassigned) => Taco Screen team (taco-screen-team)

https://bugs.launchpad.net/bugs/1487085

Title:
  Ubuntu 14.04.3 LTS Crash in notifier_call_chain after boot

Status in linux package in Ubuntu:
  New

Bug description:
  ---Problem Description---
  Installed Ubuntu 14.04.3 LTS on Palmetto and it's crashing after booting to login. This happens every time I boot Ubuntu 14.04.3 LTS. I have reinstalled Ubuntu, and also replaced the hard disk and reinstalled. It is still crashing.

  ---uname output---
  Linux paul40 3.19.0-26-generic #28~14.04.1-Ubuntu SMP Wed Aug 12 14:10:52 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux

  Machine Type = Palmetto

  ---System Hang---
  Ubuntu OS crashes and the host cannot be accessed. The system must be rebooted.

  ---Steps to Reproduce---
  Boot system

  Oops output:
  [ 33.132376] Unable to handle kernel paging request for data at address 0x200
  [ 33.132565] Faulting instruction address: 0xc00dbc60
  [ 33.133422] Oops: Kernel access of bad area, sig: 11 [#1]
  [ 33.134410] SMP NR_CPUS=2048 NUMA PowerNV
  [ 33.134478] Modules linked in: ast ttm drm_kms_helper joydev mac_hid drm hid_generic usbhid hid syscopyarea sysfillrect sysimgblt i2c_algo_bit ofpart cmdlinepart at24 uio_pdrv_genirq powernv_flash mtd ipmi_powernv powernv_rng opal_prd ipmi_msghandler uio uas usb_storage ahci libahci
  [ 33.139112] CPU: 24 PID: 0 Comm: swapper/24 Not tainted 3.19.0-26-generic #28~14.04.1-Ubuntu
  [ 33.139943] task: c13cccb0 ti: c00fff70 task.ti: c1448000
  [ 33.141642] NIP: c00dbc60 LR: c00dbd94 CTR:
  [ 33.142605] REGS: c00fff703980 TRAP: 0300 Not tainted (3.19.0-26-generic)
  [ 33.143417] MSR: 90009033 CR: 28002888 XER:
  [ 33.144244] CFAR: c0008468 DAR: 0200 DSISR: 4000 SOFTE: 0
  GPR00: c00dbd94 c00fff703c00 c144cc00 c15f03c0
  GPR04: 0007 c15f03b8
  GPR08: 0200 c006c394 90001003
  GPR12: 2200 cfb8d800 0058
  GPR16: c1448000 c1448000 c1448080 c0e9a880
  GPR20: c1448080 0001 0002 0012
  GPR24: c00f1e432200 c15f03b8
  GPR28: 0007 c15f03c0
  [ 33.157013] NIP [c00dbc60] notifier_call_chain+0x70/0x100
  [ 33.157818] LR [c00dbd94] atomic_notifier_call_chain+0x44/0x60
  [ 33.162090] Call Trace:
  [ 33.162845] [c00fff703c00] [0008] 0x8 (unreliable)
  [ 33.163644] [c00fff703c50] [c00dbd94] atomic_notifier_call_chain+0x44/0x60
  [ 33.164647] [c00fff703c90] [c006f2a8] opal_message_notify+0xa8/0x100
  [ 33.165476] [c00fff703d00] [c00dbc88] notifier_call_chain+0x98/0x100
  [ 33.167007] [c00fff703d50] [c00dbd94] atomic_notifier_call_chain+0x44/0x60
  [ 33.167816] [c00fff703d90] [c006f654] opal_do_notifier.part.5+0x74/0xa0
  [ 33.172166] [c00fff703dd0] [c006f6d8] opal_interrupt+0x58/0x70
  [ 33.172997] [c00fff703e10] [c01273d0] handle_irq_event_percpu+0x90/0x2b0
  [ 33.174507] [c00fff703ed0] [c0127658] handle_irq_event+0x68/0xd0
  [ 33.175312] [c00fff703f00] [c012baf4] handle_fasteoi_irq+0xe4/0x240
  [ 33.176124] [c00fff703f30] [c01265c8] generic_handle_irq+0x58/0x90
  [ 33.176936] [c00fff703f60] [c0010f10] __do_irq+0x80/0x190
  [ 33.182406] [c00fff703f90] [c002476c] call_do_irq+0x14/0x24
  [ 33.183258] [c144ba30] [c00110c0] do_IRQ+0xa0/0x120
  [ 33.184072] [c144ba90] [c00025d8] hardware_interrupt_common+0x158/0x180
  [ 33.184907] --- interrupt: 501 at arch_local_irq_restore+0x5c/0x90
  [ 33.184907] LR = arch_local_irq_restore+0x40/0x90
  [ 33.186473] [c144bd80] [c00f2ae19808] 0xc00f2ae19808 (unreliable)
  [ 33.188024] [c144bda0] [c085d5d8] cpuidle_enter_state+0xa8/0x260
  [ 33.192695] [c144be00] [c0108be8] cpu_startup_entry+0x488/0x4e0
  [ 33.193543] [c144bee0] [c000bdb4] rest_init+0xa4/0xc0
  [ 33.194327] [c144bf00] [c0da3e80] start_kernel+0x53c/0x558
  [ 33.195084] [c144bf90] [c0008c6c] start_here_common+0x20/0xa8
[Kernel-packages] [Bug 1480894] Re: Frequency in Psafe after an OCC reset cycle while using performance governor
** Package changed: ubuntu => linux (Ubuntu)

** Changed in: linux (Ubuntu)
     Assignee: (unassigned) => Taco Screen team (taco-screen-team)

https://bugs.launchpad.net/bugs/1480894

Title:
  Frequency in Psafe after an OCC reset cycle while using performance governor

Status in linux package in Ubuntu:
  New

Bug description:
  == Comment: #0 - Shilpasri G. Bhat - 2015-05-08 14:27:37 ==
  ---Problem Description---
  The frequency is not restored after an OCC reset cycle, which can lead to performance degradation when using the performance governor.

  When the OCC is reset, it forces the CPU to the safe frequency. After the OCC becomes active again, the frequency is not restored to the max frequency if the governor is the performance governor, or to the last requested frequency for other static governors.

  Contact Information = Shilpasri G Bhat / shilpa.b...@linux.vnet.ibm.com

  ---Steps to Reproduce---
  1) Select the performance governor
  2) OCC reset cycle

  A fix for this is posted to lkml: https://lkml.org/lkml/2015/5/4/136

  == Comment: #1 - Shilpasri G. Bhat - 2015-08-03 06:56:15 ==
  FYI: v4 of the patches, posted at https://lkml.org/lkml/2015/7/13/375 ("[PATCH v4 0/6] powernv: cpufreq: Report frequency throttle by OCC"), consists of the following 6 patches:

    cpufreq: powernv: Handle throttling due to Pmax capping at chip level
    powerpc/powernv: Add definition of OPAL_MSG_OCC message type
    cpufreq: powernv: Register for OCC related opal_message notification
    cpufreq: powernv: Call throttle_check() on receiving OCC_THROTTLE
    cpufreq: powernv: Report Psafe only if PMSR.psafe_mode_active bit is set
    cpufreq: powernv: Restore cpu frequency to policy->cur on unthrottling

  The series is now accepted upstream as follows:

  1) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=053819e0bf8407746cc5febf7a4947bee50377b4
     ("cpufreq: powernv: Handle throttling due to Pmax capping at chip level")
  2) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=196ba2d514a13f6af1b3d78de71ce74ed2fc8bdc
     ("powerpc/powernv: Add definition of OPAL_MSG_OCC message type")
  3) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=cb166fa937a2fbc14badcafca86202354c34a213
     ("cpufreq: powernv: Register for OCC related opal_message notification")
  4) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=735366fc407755626058218fc8d0430735a669ac
     ("cpufreq: powernv: Call throttle_check() on receiving OCC_THROTTLE")
  5) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=3dd3ebe5bb3837aeac28a23f8f22b97cb84abab6
     ("cpufreq: powernv: Report Psafe only if PMSR.psafe_mode_active bit is set")
  6) https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=227942809b52f23cda414858b635c0285f11de00
     ("cpufreq: powernv: Restore cpu frequency to policy->cur on unthrottling")
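The expected behaviour after an OCC reset, as described in comment #0, can be modelled in a few lines of Python. This is only an illustration of the intended outcome and not the kernel implementation; the `Policy` class, its field names, and the frequency values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    governor: str   # e.g. "performance", "powersave", "userspace"
    max_khz: int    # maximum frequency allowed by the policy
    cur_khz: int    # last frequency requested by the governor


def frequency_after_occ_reset(policy):
    """Return the frequency the CPU should run at once the OCC is active again.

    While the OCC is resetting, the hardware is forced to the safe
    frequency. Afterwards the performance governor expects the max
    frequency, and other static governors expect the last requested one.
    """
    if policy.governor == "performance":
        return policy.max_khz
    return policy.cur_khz


if __name__ == "__main__":
    # Hypothetical values in kHz, chosen only for illustration.
    print(frequency_after_occ_reset(Policy("performance", 3800000, 3800000)))
    print(frequency_after_occ_reset(Policy("userspace", 3800000, 2500000)))
```

With the performance governor, the last requested frequency normally equals the max frequency, so restoring the last requested frequency (as the title of the last patch in the series suggests) covers both cases.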
[Kernel-packages] [Bug 1486180] Re: Kernel OOPS during DLPAR operation with Fibre Channel adapter
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1486180 Title: Kernel OOPS during DLPAR operation with Fibre Channel adapter Status in linux package in Ubuntu: New Bug description: -- Problem Description -- Kernel OOPS during DLPAR operation with Fibre Channel adapter ---uname output--- 4.1.0-1-generic ---Additional Hardware Info--- Fibre Channel: Emulex Corporation Saturn-X: LightPulse Fibre Channel Host Adapter (rev 03) Machine Type = POWER8 ---Steps to Reproduce--- 1) Install Ubuntu 15.10 on a Power VM LPAR. 2) Configure and start rtas_errd daemon 3) Via HMC try to add a Fibre channel adapter via dynamic partitioning During the operation following OOPS message is observed Oops output: !!! 00E0806 Fcode, Copyright (c) 2000-2012 Emulex !!! Version 3.10x2 !!! 00E0806 Fcode, Copyright (c) 2000-2012 Emulex !!! Version 3.10x2 [ 8696.808703] PCI host bridge /pci@8002020 ranges: [ 8696.808708] MEM 0x0003ff84..0x0003ff847eff -> 0x8000 [ 8696.808716] PCI: I/O resource not set for host bridge /pci@8002020 (domain 1) [ 8696.808761] PCI host bridge to bus 0001:01 [ 8696.808765] pci_bus 0001:01: root bus resource [mem 0x3ff84-0x3ff847eff] (bus address [0x8000-0xfeff]) [ 8696.808768] pci_bus 0001:01: root bus resource [bus 01-ff] [ 8696.897390] rpaphp: Slot [U78C7.001.RCH0042-P1-C8] registered [ 8696.897395] rpadlpar_io: slot PHB 32 added [ 8696.972155] Emulex LightPulse Fibre Channel SCSI driver 10.5.0.0. [ 8696.972157] Copyright(c) 2004-2015 Emulex. All rights reserved. 
[ 8696.972438] lpfc 0001:01:00.1: enabling device (0140 -> 0142) [ 8696.976145] Unable to handle kernel paging request for data at address 0x000c [ 8696.976174] Faulting instruction address: 0xc0084cc4 [ 8696.976182] Oops: Kernel access of bad area, sig: 11 [#1] [ 8696.976188] SMP NR_CPUS=2048 NUMA pSeries [ 8696.976196] Modules linked in: lpfc(+) scsi_transport_fc rpadlpar_io rpaphp rtc_generic pseries_rng autofs4 [ 8696.976220] CPU: 3 PID: 1426 Comm: systemd-udevd Not tainted 4.1.0-1-generic #1~dogfoodv1-Ubuntu [ 8696.976230] task: c003857737e0 ti: c000fd08c000 task.ti: c000fd08c000 [ 8696.976239] NIP: c0084cc4 LR: c0084ca8 CTR: [ 8696.976247] REGS: c000fd08f0f0 TRAP: 0300 Not tainted (4.1.0-1-generic) [ 8696.976255] MSR: 80019033 CR: 8222 XER: 2000 [ 8696.976278] CFAR: c0008468 DAR: 000c DSISR: 4000 SOFTE: 1 GPR00: c0084ca8 c000fd08f370 c14bda00 GPR04: 0001 c000fd08f408 0003 d2c31e60 GPR08: c13bda00 c003873e6b80 d2ca7c98 GPR12: 8800 ce831b00 d29421f8 38ca4522 GPR16: c000fd08fdc0 c000fd08fe04 d2941878 c000fc8054c0 GPR20: d238 d238 d2ccff90 GPR24: c165074c c0038e17e000 c13b5e00 c0038e17e000 GPR28: c13b5e28 ca590600 c13b5df0 c13b5e20 [ 8696.976396] NIP [c0084cc4] enable_ddw+0x254/0x7b0 [ 8696.976405] LR [c0084ca8] enable_ddw+0x238/0x7b0 [ 8696.976411] Call Trace: [ 8696.976419] [c000fd08f370] [c0084ca8] enable_ddw+0x238/0x7b0 (unreliable) [ 8696.976431] [c000fd08f4b0] [c00866d8] dma_set_mask_pSeriesLP+0x218/0x2a0 [ 8696.976444] [c000fd08f540] [c0023528] dma_set_mask+0x58/0xa0 [ 8696.976474] [c000fd08f570] [d2c71280] lpfc_pci_probe_one+0xb0/0xc50 [lpfc] [ 8696.976486] [c000fd08f610] [c05987fc] local_pci_probe+0x6c/0x140 [ 8696.976497] [c000fd08f6a0] [c0598a28] pci_device_probe+0x158/0x1e0 [ 8696.976510] [c000fd08f700] [c067b744] driver_probe_device+0x1c4/0x5a0 [ 8696.976522] [c000fd08f790] [c067bcdc] __driver_attach+0x11c/0x120 [ 8696.976533] [c000fd08f7d0] [c067854c] bus_for_each_dev+0x9c/0x110 [ 8696.976544] [c000fd08f820] [c067adbc] driver_attach+0x3c/0x60 [ 
8696.976555] [c000fd08f850] [c067a768] bus_add_driver+0x208/0x320 [ 8696.976565] [c000fd08f8e0] [c067c99c] driver_register+0x9c/0x180 [ 8696.976576] [c000fd08f950] [c05978ec] __pci_register_driver+0x6c/0x90 [ 8696.976604] [c000fd08f990] [d2ca7848] lpfc_init+0x17c/0x1d8 [lpfc] [ 8696.976617] [c000fd08fa20] [c000b42c] do_one_initcall+0x12c/0x280 [ 8696.976628] [c000fd08faf0] [c0a6c7c8] do
[Kernel-packages] [Bug 1483343] Re: NMI watchdog: BUG: soft lockup errors when we execute lock_torture_wr tests
** Package changed: ubuntu => linux (Ubuntu) ** Changed in: linux (Ubuntu) Assignee: (unassigned) => Taco Screen team (taco-screen-team) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1483343 Title: NMI watchdog: BUG: soft lockup errors when we execute lock_torture_wr tests Status in linux package in Ubuntu: New Bug description: ---Problem Description--- NMI watchdog: BUG: soft lockup errors when we execute lock_torture_wr tests ---uname output--- Linux alp15 3.19.0-18-generic #18~14.04.1-Ubuntu SMP Wed May 20 09:40:36 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux Machine Type = P8 ---Steps to Reproduce--- Install a P8 Power VM LPAR with Ubuntu 14.04.2 ISO. Then install the Ubuntu 14.04.3 kernel on the same and reboot. Then compile and build the LTP latest test suites on the same. root@alp15:~# tar -xvf ltp-full-20150420.tar.bz2 root@alp15:~# cd ltp-full-20150420/ root@alp15:~/ltp-full-20150420# ls aclocal.m4 configure execltp.in install-sh Makefile READMErunltplite.shtestcasesutils autom4te.cache configure.ac IDcheck.sh lib Makefile.release README.kernel_config runtest testscripts ver_linux config.guessCOPYING include ltpmenu missing runalltests.shscenario_groups TODO VERSION config.sub doc INSTALL m4 pan runltpscripts tools root@alp15:~/ltp-full-20150420# ./configure root@alp15:~/ltp-full-20150420# make root@alp15:~/ltp-full-20150420# make install root@alp15:/opt/ltp/testcases/bin# ./lock_torture.sh lock_torture 1 TINFO : estimate time 6.00 min lock_torture 1 TINFO : spin_lock: running 60 sec... Message from syslogd@alp15 at Thu Jun 18 01:23:32 2015 ... alp15 vmunix: [ 308.034386] NMI watchdog: BUG: soft lockup - CPU#10 stuck for 21s! [lock_torture_wr:2337] Message from syslogd@alp15 at Thu Jun 18 01:23:32 2015 ... alp15 vmunix: [ 308.034389] NMI watchdog: BUG: soft lockup - CPU#6 stuck for 22s! 
[lock_torture_wr:2331] Message from syslogd@alp15 at Thu Jun 18 01:23:32 2015 ... alp15 vmunix: [ 308.034394] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [lock_torture_wr:2339] Message from syslogd@alp15 at Thu Jun 18 01:23:32 2015 ... alp15 vmunix: [ 308.034396] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [lock_torture_wr:2346] Message from syslogd@alp15 at Thu Jun 18 01:23:32 2015 ... alp15 vmunix: [ 308.034398] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 21s! [lock_torture_wr:2334] Message from syslogd@alp15 at Thu Jun 18 01:23:32 2015 ... alp15 vmunix: [ 308.034410] NMI watchdog: BUG: soft lockup - CPU#11 stuck for 22s! [lock_torture_wr:2321] Message from syslogd@alp15 at Thu Jun 18 01:23:32 2015 ... alp15 vmunix: [ 308.034412] NMI watchdog: BUG: soft lockup - CPU#9 stuck for 22s! [lock_torture_wr:2333] Message from syslogd@alp15 at Thu Jun 18 01:23:32 2015 ... alp15 vmunix: [ 308.038386] NMI watchdog: BUG: soft lockup - CPU#14 stuck for 22s! [lock_torture_wr:2327] Stack trace output: root@alp15:~# dmesg | more [ 1717.146881] lock_torture_wr R running task [ 1717.146881] [ 1717.146885] 0 2555 2 0x0804 [ 1717.146887] Call Trace: [ 1717.146894] [c00c7551b820] [c00c7551b860] 0xc00c7551b860 (unreliable) [ 1717.146899] [c00c7551b860] [c00b4fb0] __do_softirq+0x220/0x3b0 [ 1717.146904] [c00c7551b960] [c00b5478] irq_exit+0x98/0x100 [ 1717.146909] [c00c7551b980] [c001fa54] timer_interrupt+0xa4/0xe0 [ 1717.146913] [c00c7551b9b0] [c0002758] decrementer_common+0x158/0x180 [ 1717.146922] --- interrupt: 901 at _raw_write_lock+0x68/0xc0 [ 1717.146922] LR = torture_rwlock_write_lock+0x28/0x40 [locktorture] [ 1717.146927] [c00c7551bca0] [c00c7551bcd0] 0xc00c7551bcd0 (unreliable) [ 1717.146934] [c00c7551bcd0] [dd4810b8] torture_rwlock_write_lock+0x28/0x40 [locktorture] [ 1717.146939] [c00c7551bcf0] [dd480578] lock_torture_writer+0x98/0x210 [locktorture] [ 1717.146944] [c00c7551bd80] [c00da4d4] kthread+0x114/0x140 [ 1717.146948] [c00c7551be30] [c000956c] 
ret_from_kernel_thread+0x5c/0x70 [ 1717.146951] Task dump for CPU 10: [ 1717.146953] lock_torture_wr R running task0 2537 2 0x0804 [ 1717.146957] Call Trace: [ 1717.146961] [c00c7557b820] [c00c7557b860] 0xc00c7557b860 (unreliable) [ 1717.146966] [c00c7557b860] [c00b4fb0] __do_softirq+0x220/0x3b0 [ 1717.146970] [c00c7557b960] [c00b5478] irq_exit+0x98/0x100 [ 1717.146975] [c00c7557b980] [c001fa54] timer_interrupt+0xa4/0xe0 [ 1717.146979] [c00c7557b9b0] [c0002758
[Kernel-packages] [Bug 1472798] Re: sensors command is not getting executed in Ubuntu 15.10 on PowerNV 8335-GTA Hardware
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1472798 Title: sensors command is not getting executed in Ubuntu 15.10 on PowerNV 8335-GTA Hardware Status in linux package in Ubuntu: New Bug description: ---Problem Description--- sensors command is not getting executed in Ubuntu 15.10 on PowerNV 8335-GTA Hardware ---uname output--- Linux tul8fp 3.19.0-22-generic #22-Ubuntu SMP Tue Jun 16 17:15:17 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux Machine Type = P8 ---Steps to Reproduce--- Install a PowerNV 8335-GTA machine with Ubuntu 15.10 ISO. Then load the kernel module ibmpowernv on the same. root@tul8fp:~# sensors No sensors found! Make sure you loaded all the kernel drivers you need. Try sensors-detect to find out which these are. root@tul8fp:~# sensors-detect # sensors-detect revision 6209 (2014-01-14 22:51:58 +0100) # DMI data unavailable, please consider installing dmidecode 2.7 # or later for better results. This program will help you determine which kernel modules you need to load to use lm_sensors most effectively. It is generally safe and recommended to accept the default answers to all questions, unless you know what you're doing. Some south bridges, CPUs or memory controllers contain embedded sensors. Do you want to scan for them? This is totally safe. (YES/no): YES modprobe: FATAL: Module cpuid not found. Failed to load module cpuid. Silicon Integrated Systems SIS5595... No VIA VT82C686 Integrated Sensors... No VIA VT8231 Integrated Sensors...No AMD K8 thermal sensors... No AMD Family 10h thermal sensors... No AMD Family 11h thermal sensors... No AMD Family 12h and 14h thermal sensors... No AMD Family 15h thermal sensors... No AMD Family 15h power sensors... No AMD Family 16h power sensors... No Intel digital thermal sensor... No Intel AMB FB-DIMM thermal sensor... 
No VIA C7 thermal sensor...No VIA Nano thermal sensor... No Lastly, we can probe the I2C/SMBus adapters for connected hardware monitoring devices. This is the most risky part, and while it works reasonably well on most systems, it has been reported to cause trouble on some systems. Do you want to probe the I2C/SMBus adapters now? (YES/no): YES Sorry, no supported PCI bus adapters found. Next adapter: p8_0008_e0p0 (i2c-0) Do you want to scan it? (YES/no/selectively): YES Client found at address 0x50 Handled by driver `at24' (already loaded), chip type `24c128' (note: this is probably NOT a sensor chip!) Client found at address 0x51 Handled by driver `at24' (already loaded), chip type `24c128' (note: this is probably NOT a sensor chip!) Next adapter: p8_0008_e0p1 (i2c-1) Do you want to scan it? (YES/no/selectively): YES Client found at address 0x50 Handled by driver `at24' (already loaded), chip type `24c128' (note: this is probably NOT a sensor chip!) Client found at address 0x51 Handled by driver `at24' (already loaded), chip type `24c128' (note: this is probably NOT a sensor chip!) Next adapter: p8__e0p0 (i2c-2) Do you want to scan it? (YES/no/selectively): YES Client found at address 0x50 Handled by driver `at24' (already loaded), chip type `24c128' (note: this is probably NOT a sensor chip!) Client found at address 0x51 Handled by driver `at24' (already loaded), chip type `24c128' (note: this is probably NOT a sensor chip!) Next adapter: p8__e0p1 (i2c-3) Do you want to scan it? (YES/no/selectively): YES Client found at address 0x50 Handled by driver `at24' (already loaded), chip type `24c128' (note: this is probably NOT a sensor chip!) Client found at address 0x51 Handled by driver `at24' (already loaded), chip type `24c128' (note: this is probably NOT a sensor chip!) Next adapter: p8__e1p2 (i2c-4) Do you want to scan it? 
(YES/no/selectively): YES Client found at address 0x50 Handled by driver `at24' (already loaded), chip type `24c128' (note: this is probably NOT a sensor chip!) Next adapter: AST i2c bit bus (i2c-5) Do you want to scan it? (yes/NO/selectively): YES Sorry, no sensors were detected. Either your system has no sensors, or they are not supported, or they are connected to an I2C or SMBus adapter that is not supported. If you find out what chips are on your board, check http://www.lm-sensors.org/wiki/Devices for driver
[Kernel-packages] [Bug 1469829] Re: Firestone system I/O hang
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1469829 Title: Firestone system I/O hang Status in linux package in Ubuntu: New Bug description: -- Problem Description -- Firestone system given to DASD group failed HTX overnight test with miscompare error. HTX mdt.hdbuster was running on secondary drive and failed about 12 hours into test HTX miscompare analysis: -== Device under test: /dev/sdb Stanza running: rule_3 miscompare offset: 0x40 Transfer size: Random Size LBA number: 0x70fc miscompare length: all the blocks in the transfer size *- STANZA 3: Creates number of threads twice the queue depth. Each thread -* *- doing 2 num_oper with RC operation with xfer size between 1 block -* *- to 256K.-* This miscompare shows read operation is unable to get the expected data from the disk. The re-read buffer also shows the same data as the first read operation. Since the first read and next re-read shows same data, there could be a write operation (of previous rule stanza to initialize disk with pattern 007 ) failure on the disk. The same miscompare behavior shows for all the blocks in the transfer size. 
/dev/sdb Jun 2 02:29:43 2015 err=03b6 sev=2 hxestorage <<=== device name (/dev/sdb) rule_3_13 numopers= 2 loop= 767 blk=0x70fc len=89088 min_blkno=0 max_blkno=0x74706daf, RANDOM access Seed Values= 37303, 290, 23235 Data Pattern Seed Values = 37303, 291, 23235 BWRC LBA fencepost Detail: th_nummin_lba max_lba status 0 01c9be3ffR 1 1d1c1b6c3a3836d7F 2 3a3836d857545243F 3 5754524474706dafF Miscompare at buffer offset 64 (0x40) <<=== miscompare offset (0x40) (Flags: badsig=0; cksum=0x6) Maximum LBA = 0x74706daf wbuf (baseaddr 0x3ffe1c0e6600) b0ff rbuf (baseaddr 0x3ffe1c0fc400) 850100fc700200fd700300fe700400ff7005 Write buffer saved in /tmp/htxsdb.wbuf1 Read buffer saved in /tmp/htxsdb.rbuf1 Re-read fails compare at offset64; buffer saved in /tmp/htxsdb.rerd1 errno: 950(Unknown error 950) Asghar reproduced that HTX hang he is seeing. Looking in the kernel logs I see some messages from the kernel that there are user threads blocked on getting reads serviced. So likely HTX is seeing the same thing. I've asked Asghar to try using the deadline I/O scheduler rather than CFQ to see if that makes any difference. If that does not make any difference, the next thing to try is reducing the queue depth of the device. Right now its 31, which I think is pretty high. Step 1: echo deadline > /sys/block/sda/queue/scheduler echo deadline > /sys/block/sdb/queue/scheduler If that reproduces the issue, go to step 2: echo cfq > /sys/block/sda/queue/scheduler echo cfq > /sys/block/sdb/queue/scheduler echo 8 > /sys/block/sda/device/queue_depth echo 8 > /sys/block/sdb/device/queue_depth Breno - it looks like the default I/O scheduler + default queue depth for the SATA disks in Firestone is not optimal, in that when running a heavy I/O workload, we see read starvation occurring, which is making the system nearly unusable. Once we changed the I/O scheduler from cfq to deadline, all the issues went away and the system is able to run the same workload yet still be responsive. 
Suggest we either encourage Canonical to change the default I/O scheduler to deadline or at the very least provide documentation to encourage our customers to make this change themselves. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1469829/+subscriptions
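For the scheduler switch discussed in Bug 1469829, it can help to confirm which scheduler is actually active before and after the echo commands. The following is a minimal sketch, assuming the usual sysfs format in which the active scheduler is shown in square brackets (for example "noop deadline [cfq]"). The function names parse_scheduler and read_scheduler are illustrative only and do not come from the bug report.

```python
from pathlib import Path


def parse_scheduler(text):
    """Split a sysfs scheduler line into (active, available).

    The active scheduler is the entry wrapped in square brackets.
    Returns (None, [...]) if no entry is bracketed.
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def read_scheduler(device, sysfs_root="/sys/block"):
    """Read and parse /sys/block/<device>/queue/scheduler (Linux only)."""
    path = Path(sysfs_root) / device / "queue" / "scheduler"
    return parse_scheduler(path.read_text())
```

On a system like the one in the report, read_scheduler("sdb") would be called once before and once after writing "deadline" to the same sysfs file, to check that the change took effect.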
[Kernel-packages] [Bug 1442878] Re: Backport upstream bugfix in cpuidle to fix memory corruption
** Changed in: ubuntu Status: New => Confirmed ** Package changed: ubuntu => linux (Ubuntu) ** Changed in: linux (Ubuntu) Assignee: (unassigned) => Taco Screen team (taco-screen-team) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1442878 Title: Backport upstream bugfix in cpuidle to fix memory corruption Status in linux package in Ubuntu: Confirmed Bug description: == Comment: #0 - Shilpasri G. Bhat - 2015-04-10 14:35:43 == This is a request to backport the upstream cpuidle bugfix for a memory corruption: d52356e7f48e powerpc: fix memory corruption by pnv_alloc_idle_core_states. Space allocated for paca is based on nr_cpu_ids, but pnv_alloc_idle_core_states() iterates over paca with cpu_nr_cores()*threads_per_core, which uses NR_CPUS. This causes pnv_alloc_idle_core_states() to write to memory outside the paca array, which may later lead to various panics. Fixes: 7cba160ad789 (powernv/cpuidle: Redesign idle states management) To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1442878/+subscriptions
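The size mismatch described in Bug 1442878 can be illustrated with a small model. This is only a sketch, not the kernel code: it assumes the loop bound is cpu_nr_cores()*threads_per_core with cpu_nr_cores() derived from NR_CPUS, as the description states, and the numeric values used in the example call are hypothetical.

```python
def paca_overrun(nr_cpu_ids, nr_cpus_max, threads_per_core):
    """Count how many entries a loop touches beyond the allocated array.

    nr_cpu_ids       -- entries actually allocated (runtime CPU count)
    nr_cpus_max      -- compile-time maximum (stands in for NR_CPUS)
    threads_per_core -- hardware threads per core
    """
    # Model of cpu_nr_cores() * threads_per_core when the core count
    # is computed from the compile-time maximum.
    loop_bound = (nr_cpus_max // threads_per_core) * threads_per_core
    return max(0, loop_bound - nr_cpu_ids)


# Hypothetical configuration: 160 possible CPUs, NR_CPUS=2048, SMT8.
overrun = paca_overrun(nr_cpu_ids=160, nr_cpus_max=2048, threads_per_core=8)
```

When the two bounds agree (nr_cpu_ids equal to the compile-time maximum) the overrun is zero, which is consistent with the problem only showing up on some configurations.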
[Kernel-packages] [Bug 1442186] Re: NVMe device driver failing w/nvme-user and htx tools
** Package changed: ubuntu => linux (Ubuntu) ** Changed in: linux (Ubuntu) Status: New => Confirmed ** Changed in: linux (Ubuntu) Assignee: (unassigned) => Taco Screen team (taco-screen-team) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1442186 Title: NVMe device driver failing w/nvme-user and htx tools Status in linux package in Ubuntu: Confirmed Bug description: ---Problem Description--- NVMe device driver running with a Samsung SM1715 card fails (hits EEH) w/nvme-user and htx tools Machine Type = Tuleta P8 ---Steps to Reproduce--- 1. Install the nvme-user tools package and run the nvme_rw tool or 2. Install the htx I/O exerciser tool and run htx ---uname output--- Linux ubuntu1504 3.18.0-12-generic #13-Ubuntu SMP Thu Jan 29 13:44:26 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux EEH error in kernel log after running HTX: [503195.365335] EEH: Frozen PE#2 on PHB#4 detected [503195.365346] EEH: PE location: U78CB.001.D123456-P1-C5 , PHB location: U 78CB.001.D123456-P1-C33 [503195.366226] EEH: This PCI device has failed 1 times in the last hour [503195.366227] EEH: Notify device drivers to shutdown [503195.366233] EEH: Collect temporary log [503195.366315] EEH: of node=/pciex@3fffe4200/pci@0/mass-storage@0 [503195.366321] EEH: PCI device/vendor: a820144d [503195.366327] EEH: PCI cmd/status register: 00100546 [503195.366329] EEH: PCI-E capabilities and status follow: [503195.366356] EEH: PCI-E 00: 00024010 10008fe1 202e 00416c43 [503195.366372] EEH: PCI-E 10: 10430040 [503195.366374] EEH: PCI-E 20: [503195.366375] EEH: PCI-E AER capability register set follows: [503195.366396] EEH: PCI-E AER 00: 18020001 0040 00440010 [503195.366413] EEH: PCI-E AER 10: e000 01e0 [503195.366429] EEH: PCI-E AER 20: [503195.366435] EEH: PCI-E AER 30: [503195.366438] PHB3 PHB#4 Diag-data (Version: 1) [503195.366440] brdgCtl: 0002 [503195.366442] UtlSts: 0020 [503195.366444] RootSts: 
0020 0040 f0430048 00100147 [503195.366446] PhbSts: 001c 001c [503195.366448] Lem: 0400 42498e327f502eae [503195.366450] InAErr: 4000 4000 [503195.366452] PE[ 2] A/B: 80003025 8000 [503195.366454] EEH: Reset with hotplug activity [503195.367741] nvme 0004:01:00.0: Cancelling I/O 0 QID 57 [503199.411764] EEH: Sleep 5s ahead of complete hotplug [503204.415815] pci 0004:01:00.0: [144d:a820] type 00 class 0x010802 [503204.415874] pci 0004:01:00.0: reg 0x10: [mem 0x-0x3fff 64bit] [503204.423731] pci 0004:01:00.0: BAR 0: assigned [mem 0x3ff0-0x3ff03fff 64bit] [503204.423970] nvme 0004:01:00.0: enabling device (0140 -> 0142) [503204.423998] nvme 0004:01:00.0: Using 64-bit DMA iommu bypass [503207.914466] nvme1n1: unknown partition table [503207.914643] EEH: Notify device driver to resume The PHB3 diag-data is telling: DMA write to PCI bus address 0x0, which doesn't have corresponding valid TCE entries. As I know, there are 2 possibilities and (A) might be the case. (A) As MSI/MSIx interrupt is essentially DMA write transaction, the PCI bus address 0x0 would be MSI/MSIx message address. The address 0x0 is obviously invalid. It indicates there is spurious MSI/MSIx interrupt whose MSI/MSIx message wasn't populated and updated to hardware yet. (B) We really have 0x0 DMA write transaction and the TCE entries wasn't populated and built yet. From the kernel log, we're using direct DMA window. So the DMA address shouldn't less than (0x1ul << 59). Recap the PHB3 diag-data [503195.366438] PHB3 PHB#4 Diag-data (Version: 1) [503195.366440] brdgCtl: 0002 [503195.366442] UtlSts: 0020 [503195.366444] RootSts: 0020 0040 f0430048 00100147 [503195.366446] PhbSts: 001c 001c [503195.366448] Lem: 0400 42498e327f502eae [503195.366450] InAErr: 4000 4000 [503195.366452] PE[ 2] A/B: 80003025 8000 William helped run following command, which triggers EEH error, which has same root cause as we saw before. 
After that, re-running the command won't trigger EEH again: # sudo dd if=/dev/nvme1n1 of=/dev/null bs=1k count=512 # dmesg | grep EEH.*has\ failed [ 2377.124104] EEH: This PCI device has failed 1 times in the last hour I'm not sure if there is possibility of content in MSIx table is
[Kernel-packages] [Bug 1442180] Re: powerpc/perf: Cap 64bit userspace backtraces to PERF_MAX_STACK_DEPTH
** Package changed: ubuntu => linux (Ubuntu) ** Changed in: linux (Ubuntu) Status: New => Confirmed ** Changed in: linux (Ubuntu) Assignee: (unassigned) => Taco Screen team (taco-screen-team) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1442180 Title: powerpc/perf: Cap 64bit userspace backtraces to PERF_MAX_STACK_DEPTH Status in linux package in Ubuntu: Confirmed Bug description: Cap 64bit userspace backtraces to PERF_MAX_STACK_DEPTH We cap 32bit userspace backtraces to PERF_MAX_STACK_DEPTH (currently 127), but we forgot to do the same for 64bit backtraces. If userspace creates a stack frame that points to itself we will loop forever in the backtrace code with interrupts off. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1442180/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
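The failure mode in Bug 1442180 (a stack frame whose back-chain points to itself) and the effect of a depth cap can be sketched in a few lines. This is a user-space model under stated assumptions, not the powerpc perf code: frames are represented as a dictionary mapping a frame address to the next frame address, and walk_frames is a hypothetical helper name. The cap value 127 is the one quoted in the report.

```python
PERF_MAX_STACK_DEPTH = 127  # value quoted in the bug description


def walk_frames(next_frame, start, max_depth=PERF_MAX_STACK_DEPTH):
    """Follow back-chain pointers from `start`, recording each frame.

    Stops when the chain ends (no next frame) or when max_depth
    entries have been recorded, so a self-referencing frame cannot
    keep the walker looping forever.
    """
    trace = []
    frame = start
    while frame is not None and len(trace) < max_depth:
        trace.append(frame)
        frame = next_frame.get(frame)
    return trace


# A frame at 0x1000 whose back-chain points to itself.
looping = {0x1000: 0x1000}
capped = walk_frames(looping, 0x1000)
```

Without the `len(trace) < max_depth` condition, the walk over `looping` would never terminate, which mirrors the "loop forever in the backtrace code with interrupts off" behaviour described above.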
[Kernel-packages] [Bug 1441856] Re: ISST-LTE:Ubuntu15.04: After migration of Ubuntu15.04 lpar, RMC connection on HMC will be lost (LPM)(kernel/powerpc-ibm-utils)
** Package changed: ubuntu => linux (Ubuntu) ** Changed in: linux (Ubuntu) Assignee: (unassigned) => Taco Screen team (taco-screen-team) ** Changed in: linux (Ubuntu) Status: New => Confirmed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1441856 Title: ISST-LTE:Ubuntu15.04: After migration of Ubuntu15.04 lpar, RMC connection on HMC will be lost (LPM)(kernel/powerpc-ibm-utils) Status in linux package in Ubuntu: Confirmed Bug description: Defect Description: For Ubuntu15.04 lpar which has Active RMC connection with HMC, once the lpar is migrated to another machine in same/different HMC, the RMC connection on HMC with the lpar will be lost. The following upstream fixes are required to ensure device tree is properly updated after migration/suspend and that the RMC connection is not lost as a result. From powerpc-utils upstream -next branch: -- commit a941cdfb9609bba04c5bad18ab8af1c85b7b6a9b Author: Tyrel Datwyler Date: Tue Mar 31 18:23:59 2015 -0400 drmgr: Use sysfs migration store to initiate migration when possible From mainline Linux 4.0: -- commit f6ff04149637723261aa4738958b0098b929ee9e Author: Tyrel Datwyler Date: Wed Mar 4 11:59:33 2015 -0800 powerpc/pseries: Little endian fixes for post mobility device tree update From mpe's PowerPC -next branch: - commit 288a298c05774dde0a8d5abac9b692503d4e41f2 Author: Tyrel Datwyler Date: Wed Mar 4 18:25:38 2015 -0800 powerpc/pseries: Introduce api_version to migration sysfs interface commit c03e73740d24fbe990291cd9ac2d6ae0d95b975f Author: Tyrel Datwyler Date: Fri Mar 27 12:47:25 2015 -0700 powerpc/pseries: Simplify check for suspendability during suspend/migration I have verified LPM after installing test packages containing the mentioned patches and now LPM works fine for me. 
To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1441856/+subscriptions
[Kernel-packages] [Bug 1440970] Re: Add cpufreq-powernv bugfix for error reporting of throttled frequency to 15.04
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1440970 Title: Add cpufreq-powernv bugfix for error reporting of throttled frequency to 15.04 Status in linux package in Ubuntu: New Bug description: This is a request to include the cpufreq-powernv bugfix in 15.04. The patch reports CPU frequency throttling scenarios: https://lkml.org/lkml/2015/3/26/574 It is a trivial reporting change that can be used to identify performance degradation caused by a throttled CPU frequency. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1440970/+subscriptions
[Kernel-packages] [Bug 1427075] Re: Stopping and starting KVM partitions results in guest kernel softlockup
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1427075 Title: Stopping and starting KVM partitions results in guest kernel softlockup Status in linux package in Ubuntu: New Bug description: == Comment: #0 - Cyril Bur - 2015-02-23 18:03:41 == +++ This bug was initially created as a clone of Bug #108455 +++ I was investigating the cause of some ppc64le KVM guest softlockup warnings. On a single CPU KVM guest, I ran something to keep the guest busy: yes > /dev/null & Followed by (qemu) stop Wait a while, then: (qemu) cont We get a softlockup error: BUG: soft lockup - CPU#0 stuck for 9220s! [yes:2389] .__getnstimeofday .getnstimeofday .ktime_get_real .netif_receive_skb .ibmveth_poll .net_rx_action .__do_softirq .irq_exit .__do_irq .call_do_irq .do_IRQ I was going to file it away in the "don't do that" bin, but I notice x86 have something to detect a paused VM and avoid spewing the soft lockup error. Do we need something like this on ppc64? commit 5d1c0f4a80a6df73395fb3fc2c302510f8f09d36 Author: Eric B Munson Date: Sat Mar 10 14:37:28 2012 -0500 watchdog: add check for suspended vm in softlockup detector A suspended VM can cause spurious soft lockup warnings. To avoid these, the watchdog now checks if the kernel knows it was stopped by the host and skips the warning if so. When the watchdog is reset successfully, clear the guest paused flag. == Comment: #1 - Cyril Bur - 2015-02-23 18:03:55 == Hi, I have been working on a fix for guest kernels. 
This requires two patches: 1/2 commit 545a2bf742fb41f17d03486dd8a8c74ad511dec2 Author: Cyril Bur Date: Thu Feb 12 15:01:24 2015 -0800 kernel/sched/clock.c: add another clock for use with the soft lockup watchdog and 2/2 commit 4be1b29795d692d512bb67b770665d6f8ea5cb0b Author: Cyril Bur Date: Thu Feb 12 15:01:28 2015 -0800 powerpc: add running_clock for powerpc to prevent spurious softlockup warnings Both are in upstream. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1427075/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
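The idea behind the watchdog fix in Bug 1427075 can be modelled outside the kernel. The sketch below assumes that the "running clock" simply excludes the time during which the guest was stopped. GuestClock and is_soft_lockup are hypothetical names, and the 20 second threshold is an illustrative value rather than something taken from the report.

```python
class GuestClock:
    """Toy clock that tracks wall time and time the guest was paused."""

    def __init__(self):
        self.wall = 0.0    # seconds elapsed on the host
        self.paused = 0.0  # seconds during which the guest did not run

    def advance(self, seconds, running=True):
        self.wall += seconds
        if not running:
            self.paused += seconds

    def running_clock(self):
        """Time the guest has actually spent running."""
        return self.wall - self.paused


def is_soft_lockup(last_touch, now, threshold=20.0):
    """Report a lockup if the watchdog was not touched for `threshold` s."""
    return now - last_touch > threshold


clock = GuestClock()
clock.advance(5)                      # guest runs, watchdog is touched
touch_wall, touch_run = clock.wall, clock.running_clock()
clock.advance(9220, running=False)    # "(qemu) stop" ... "(qemu) cont"
clock.advance(1)                      # guest runs again for one second
spurious = is_soft_lockup(touch_wall, clock.wall)
with_running_clock = is_soft_lockup(touch_run, clock.running_clock())
```

Measured against wall time, the guest appears stuck for 9221 seconds and the check fires. Measured against the running clock, only one second has passed, so no warning is raised.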
[Kernel-packages] [Bug 1434023] Re: ISST-LTE:Ubuntu15.04:20150318:kdump: linux-crashdump packages install using apt-get will stuck
** Package changed: ubuntu => linux-meta (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-meta in Ubuntu. https://bugs.launchpad.net/bugs/1434023 Title: ISST-LTE:Ubuntu15.04:20150318:kdump: linux-crashdump packages install using apt-get will stuck Status in linux-meta package in Ubuntu: New Bug description: == Comment: #0 - Hemant Kumar - 2015-03-19 04:25:43 == Defect Description: --- I installed the Ubuntu 15.04 daily build (20150318). After installation I tried installing the kdump package "linux-crashdump" using apt-get ("apt-get install linux-crashdump"). After the package installation it gets stuck and does not return to the prompt. Here is the output: root@highlp2:/home/hemant# apt-get install linux-crashdump Reading package lists... Done Building dependency tree Reading state information... Done The following extra packages will be installed: apport apport-symptoms binutils crash kdump-tools kexec-tools libdw1 libpolkit-agent-1-0 libpolkit-backend-1-0 makedumpfile policykit-1 python3-apport python3-problem-report Suggested packages: apport-gtk apport-kde binutils-doc python3-launchpadlib The following NEW packages will be installed: apport apport-symptoms binutils crash kdump-tools kexec-tools libdw1 libpolkit-agent-1-0 libpolkit-backend-1-0 linux-crashdump makedumpfile policykit-1 python3-apport python3-problem-report 0 upgraded, 14 newly installed, 0 to remove and 0 not upgraded. Need to get 5,514 kB of archives. After this operation, 30.5 MB of additional disk space will be used. Do you want to continue?
[Y/n] y Get:1 http://ports.ubuntu.com/ubuntu-ports/ vivid/main libdw1 ppc64el 0.160-0ubuntu3 [161 kB] Get:2 http://ports.ubuntu.com/ubuntu-ports/ vivid/main libpolkit-agent-1-0 ppc64el 0.105-8ubuntu2 [14.6 kB] Get:3 http://ports.ubuntu.com/ubuntu-ports/ vivid/main libpolkit-backend-1-0 ppc64el 0.105-8ubuntu2 [33.5 kB] Get:4 http://ports.ubuntu.com/ubuntu-ports/ vivid/main python3-problem-report all 2.16.2-0ubuntu3 [10.2 kB] Get:5 http://ports.ubuntu.com/ubuntu-ports/ vivid/main python3-apport all 2.16.2-0ubuntu3 [75.8 kB] Get:6 http://ports.ubuntu.com/ubuntu-ports/ vivid/main apport all 2.16.2-0ubuntu3 [113 kB] Get:7 http://ports.ubuntu.com/ubuntu-ports/ vivid/main apport-symptoms all 0.20 [14.2 kB] Get:8 http://ports.ubuntu.com/ubuntu-ports/ vivid/main binutils ppc64el 2.25-5ubuntu1 [2,184 kB] Get:9 http://ports.ubuntu.com/ubuntu-ports/ vivid/main crash ppc64el 7.0.8-1ubuntu1 [2,664 kB] Get:10 http://ports.ubuntu.com/ubuntu-ports/ vivid/main makedumpfile ppc64el 1:1.5.7-5 [99.8 kB] Get:11 http://ports.ubuntu.com/ubuntu-ports/ vivid/main kexec-tools ppc64el 1:2.0.7-5ubuntu2 [73.8 kB] Get:12 http://ports.ubuntu.com/ubuntu-ports/ vivid/main kdump-tools all 1:1.5.7-5 [16.2 kB] Get:13 http://ports.ubuntu.com/ubuntu-ports/ vivid/main linux-crashdump ppc64el 3.19.0.9.8 [2,550 B] Get:14 http://ports.ubuntu.com/ubuntu-ports/ vivid/main policykit-1 ppc64el 0.105-8ubuntu2 [50.7 kB] Fetched 5,514 kB in 0s (31.4 MB/s) Preconfiguring packages ... Package configuration ?? Configuring kexec-tools ?? ? ? ? If you choose this option, a system reboot will trigger a restart into a ? ? kernel loaded by kexec instead of going through the full system boot ? ? loader process. ? ? ? ? Should kexec-tools handle reboots?? ? ? ? ? ? ? ? Selecting previously unselected package libdw1:ppc64el. (Reading database ... 48391 files and directories currently installed.) Preparing to unpack .../libdw1_0.160-0ubuntu3_ppc64el.deb ... Unpacking libdw1:ppc64el (0.160-0ubuntu3) ... 
[ 81.163819] sda2: WRITE SAME failed. Manually zeroing. Selecting previously unselected package libpolkit-agent-1-0:ppc64el. Preparing to unpack .../libpolkit-agent-1-0_0.105-8ubuntu2_ppc64el.deb ... Unpacking libpolkit-agent-1-0:ppc64el (0.105-8ubuntu2) ... Selecting previously unselected package libpolkit-backend-1-0:ppc64el. Preparing to unpack .../libpolkit-backend-1-0_0.105-8ubuntu2_ppc64el.deb ... Unpacking libpolkit-backend-1-0:ppc64el (0.105-8ubuntu2) ... Selecting previously unselected package python3-problem-report. Preparing to unpack .../python3-problem-report_2.16.2-0ubuntu3_all.deb ... Unpacking pytho
[Kernel-packages] [Bug 1435951] Re: Add cpuidle-powernv upstream bug fixes to fix kernel regression due to cpuidle in 15.04 kernel
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1435951 Title: Add cpuidle-powernv upstream bug fixes to fix kernel regression due to cpuidle in 15.04 kernel Status in linux package in Ubuntu: New Bug description: This is a feature request to include the following patches in the 15.04 kernel.
[1] 92c83ff5b42b109 cpuidle/powernv: Read target_residency value of idle states from DT if available
[2] 70734a786acfd1 driver/cpuidle-powernv: Avoid endianness conversions while parsing DT
[3] tick/broadcast-hrtimer : Fix suspicious RCU usage in idle loop https://lkml.org/lkml/2015/3/18/185
[4] tick/hotplug: Handover time related duties before cpu offline https://patchwork.ozlabs.org/patch/435097/
Patches [1] and [2] read the latency and target_residency values of the different idle states from the device tree when they are present there. This prevents the CPU from entering fastsleep too aggressively. [3] and [4] are important fixes for a kernel regression, a CPU offline bug in the cpuidle path that was reported in the community.
Thanks for the new bug. I understand that patches [2] and [3] are not accepted upstream yet, right? To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1435951/+subscriptions
[Kernel-packages] [Bug 1435571] Re: docker: docker run --cpuset is not having any effect
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1435571 Title: docker: docker run --cpuset is not having any effect Status in linux package in Ubuntu: New Bug description: ---Problem Description--- docker: cpuset resource allocation on a container shows/lists all CPUs of the VM host ---uname output--- root@8a2d293ba30d:/sys# uname -a Linux 8a2d293ba30d 3.19.0-7-generic #7-Ubuntu SMP Fri Feb 27 00:26:30 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux Machine Type = Power 8 / PowerKVM ---Steps to Reproduce---
1 - Install the docker 1.4 developer build on a PowerKVM Ubuntu 15.04 LE guest: ftp://ftp.unicamp.br/pub/linuxpatch/docker-ppc64/ubuntu/14_10/docker.io-1.4.1-dev_ppc64el.deb
2 - Create a vivid image using debootstrap
3 - Run a container using the command below with cpuset=6,7: root@dockerbase:~# docker run -it --cpuset=6,7 vivid-cpuset-stress /bin/bash
4 - Inside the container, check: grep processor /proc/cpuinfo
root@06b22766d612:/# grep processor /proc/cpuinfo processor : 0 processor : 1 processor : 2 processor : 3 processor : 4 processor : 5 processor : 6 processor : 7 processor : 8 processor : 9 processor : 10 processor : 11 processor : 12 processor : 13 processor : 14 processor : 15 root@06b22766d612:/#
It lists all CPUs of the guest VM (the host in this case) instead of only the expected CPUs 6 and 7 bound to this container. On the host side, the cpuset shows: root@dockerbase:~# cat /sys/fs/cgroup/cpuset/docker/06b22766d61244fc5964d2a11bb0972d05ad72bb5899b6f55663c603bf5d6cba/cpuset.cpus 6-7
Nish, AFAIK, effective cpus was relevant only for the unified hierarchy... they used to set effective_cpus = cpuset.cpus, but I will investigate more on this.
Uh, I think this got fixed upstream, can we check?
Maybe build an Ubuntu test kernel (this should get auto-pulled in some future build, due to -stable): 79063bffc81f82689bd90e16da1b49408f3bf095 ("cpuset: fix a warning when clearing configured masks in old hierarchy"). -Nish
I did some more investigation and, as Nish pointed out, there were some bugs that got fixed upstream. Root cause: cgroup.clone_children was not handled properly. The bug appears when cgroup.clone_children=1; if clone_children is set to 0 it should work properly. So it would be better to backport the patches below (if not already included) to fix the current problem:
1. 790317e1b266c776765a4bdcedefea706ff0fada: cpuset: initialize effective masks when clone_children is enabled
2. 79063bffc81f82689bd90e16da1b49408f3bf095 ("cpuset: fix a warning when clearing configured masks in old hierarchy"). (Nish pointed out this patch)
Canonical, both of the above patches are targeted for 3.19-stable (in fact 3.17+), but given the possible schedule mismatch between the upstream 3.19-stable releases and the 15.04 kernel freeze, can we please manually include them until they are present in -stable? Without the changes, the cpuset cgroups are rather broken. -Nish To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1435571/+subscriptions
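When checking a report like Bug 1435571, the cpuset.cpus value (shown above as "6-7") has to be compared against the CPUs a process is really allowed to run on. A minimal sketch of parsing the kernel's CPU list format follows. It assumes the usual comma-separated list of single CPUs and inclusive ranges, and parse_cpu_list is an illustrative name, not part of docker or the kernel.

```python
def parse_cpu_list(text):
    """Convert a CPU list such as '0-3,8,10-11' into a set of ints.

    Ranges are inclusive. Empty input yields an empty set.
    """
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


configured = parse_cpu_list("6-7")
```

Inside the container, the result could then be compared with os.sched_getaffinity(0) on Linux, which reports the CPUs the calling process may be scheduled on and is a more direct check than counting "processor" lines.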
[Kernel-packages] [Bug 1429959] Re: Auto Error Recovery is failing after error injected for sailfish card in Ubuntu 14.10 [PowerNV]
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1429959 Title: Auto Error Recovery is failing after error injected for sailfish card in Ubuntu 14.10 [PowerNV] Status in linux package in Ubuntu: New Bug description: ---Problem Description--- PowerNV/Ubuntu 14.10 Auto Error Recovery is failing after error injected for sailfish ---uname output--- Linux powerio-le21 3.16.0-23-generic #31-Ubuntu SMP Tue Oct 21 17:55:08 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux Machine Type = 8286-42A ---Steps to Reproduce--- There are 2 LUNs coming across 3 different paths and multipath is configured.
1. Run I/O activity by running HTX load on the multipath devices.
2. Verify I/O activity on the multipath devices with the iostat command.
3. Inject an error with the following command: echo 0x8000 > /sys/kernel/debug/powerpc/PCI0001/err_injct_inboundA; sleep 1; echo 0x0 > /sys/kernel/debug/powerpc/PCI0001/err_injct_inboundA
4. The error injection happened and the I/O activity was suspended, as confirmed by iostat.
5. Error recovery of the PCI devices did not happen and the devices remained inaccessible.
The dmesg output during the event is as follows: [ 376.148715] systemd-logind[7123]: New session 6 of user root.
[ 497.572751] EEH: Frozen PHB#1-PE#8 detected [ 497.572799] EEH: PE location: U78C9.001.WZS006T-P1-C12 , PHB location: U78C9.001.WZS006T-P1-C32 [ 497.572890] CPU: 32 PID: 0 Comm: swapper/32 Tainted: G OE 3.16.0-23-generic #31-Ubuntu [ 497.572892] Call Trace: [ 497.572898] [c03fffe97b90] [c0017390] show_stack+0x170/0x290 (unreliable) [ 497.572902] [c03fffe97c70] [c0a05fc0] dump_stack+0x90/0xbc [ 497.572906] [c03fffe97ca0] [c0038010] eeh_dev_check_failure+0x560/0x580 [ 497.572908] [c03fffe97d40] [c00380b8] eeh_check_failure+0x88/0xe0 [ 497.572933] [c03fffe97d80] [d0001cb247a8] qla24xx_msix_rsp_q+0x108/0x200 [qla2xxx] [ 497.572936] [c03fffe97e10] [c01319b0] handle_irq_event_percpu+0x90/0x2b0 [ 497.572938] [c03fffe97ed0] [c0131c38] handle_irq_event+0x68/0xd0 [ 497.572940] [c03fffe97f00] [c0136f80] handle_fasteoi_irq+0xe0/0x2a0 [ 497.572942] [c03fffe97f30] [c0130ca8] generic_handle_irq+0x58/0x90 [ 497.572943] [c03fffe97f60] [c00119c0] __do_irq+0x80/0x190 [ 497.572945] [c03fffe97f90] [c00253d0] call_do_irq+0x14/0x24 [ 497.572946] [c02fe83abab0] [c0011b68] do_IRQ+0x98/0x140 [ 497.572948] [c02fe83abb00] [c0002794] hardware_interrupt_common+0x114/0x180 [ 497.572952] --- Exception: 501 at snooze_loop+0xd8/0x170 LR = snooze_loop+0x90/0x170 [ 497.572955] [c02fe83abdf0] [c0a33680] cpu_online_mask+0x0/0x8 (unreliable) [ 497.572957] [c02fe83abe30] [c08405bc] cpuidle_enter_state+0x6c/0x140 [ 497.572960] [c02fe83abe80] [c0113938] cpu_startup_entry+0x318/0x4c0 [ 497.572962] [c02fe83abf20] [c0043844] start_secondary+0x324/0x350 [ 497.572964] [c02fe83abf90] [c0009a6c] start_secondary_prolog+0x10/0x14 [ 497.572973] EEH: Detected PCI bus error on PHB#1-PE#8 [ 497.572978] EEH: This PCI device has failed 1 times in the last hour [ 497.572979] EEH: Notify device drivers to shutdown [ 497.573000] qla2xxx [0001:07:00.0]-015b:2: Disabling adapter. 
[ 497.573071] sd 2:0:1:1: [sdd] Unhandled error code [ 497.573072] sd 2:0:1:1: [sdd] Unhandled error code [ 497.573075] sd 2:0:1:0: [sdc] Unhandled error code [ 497.573076] sd 2:0:1:1: [sdd] Unhandled error code [ 497.573077] sd 2:0:1:1: [sdd] Unhandled error code [ 497.573078] sd 2:0:1:1: [sdd] [ 497.573079] sd 2:0:1:0: [sdc] Unhandled error code [ 497.573080] sd 2:0:1:0: [sdc] Unhandled error code [ 497.573081] sd 2:0:1:1: [sdd] [ 497.573082] sd 2:0:1:1: [sdd] Unhandled error code [ 497.573084] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [ 497.573085] sd 2:0:1:0: [sdc] Unhandled error code [ 497.573086] sd 2:0:1:1: [sdd] CDB: [ 497.573087] sd 2:0:1:1: [sdd] [ 497.573088] sd 2:0:1:0: [sdc] [ 497.573088] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [ 497.573089] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [ 497.573090] sd 2:0:1:1: [sdd] CDB: [ 497.573091] sd 2:0:1:1: [sdd] [ 497.573095] Read(10) [ 497.573095] sd 2:0:1:0: [sdc] [ 497.573096] sd 2:0:1:0: [sdc] [ 497.573097] : [ 497.573097] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [ 497.573099] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK [ 497.573100] Read(10) [ 497.573100] sd 2:0:1:1: [sdd] CDB: [ 497.573101] sd 2:0:1:0: [sdc] [ 497.573103] : [ 497.5731
[Kernel-packages] [Bug 1428351] Re: ISST-LTE: LPM on Ubuntu15.04 lpar hangs.
** Package changed: ubuntu => linux (Ubuntu)

--
https://bugs.launchpad.net/bugs/1428351

Title:
  ISST-LTE: LPM on Ubuntu15.04 lpar hangs.

Status in linux package in Ubuntu:
  New

Bug description:
  Defect Description:
  ---
  LPM on an Ubuntu 15.04 LE lpar hangs at 0%. I tried it twice and waited for more than 1 hour, but there was no progress.

  Details:
  --
  1. LPM is stuck at 0%.
  2. No workload tests are running on the lpar.
  3. The lpar is accessible throughout on the source CEC side.
  4. The destination CEC shows the lpar reference code as: "End powering on VIO slots".

  Machine version details:
  ---
  root@highlp3:~# uname -a
  Linux highlp3 3.18.0-13-generic #14-Ubuntu SMP Fri Feb 6 09:57:41 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
  root@highlp3:~#
  root@highlp3:~# cat /etc/issue
  Ubuntu Vivid Vervet (development branch) \n \l

  VIOS version: 2.2.3.3
  Source firmware: b0211p_1507.811
  Destination firmware: b0211p_1507.811

  Steps to recreate:
  ---
  1. Install Ubuntu 15.04 LE on a lpar.
  2. Set up the LPM environment (install RMC and RSCT packages).
  3. Execute the LPM operation on the HMC.

  First, a kernel fix is needed to make sure the stream id is passed in cpu endian during the hcall that checks the VASI state. The following line in the dmesg log confirms that this issue is present:

  [ 666.978230] rtas_ibm_suspend_me: vasi_state returned -4

  This is the relevant upstream commit:

  commit 3df76a9dcc74d5f012b94ea01ed6e7aaf8362c5a
  Author: Cyril Bur
  Date: Wed Jan 21 13:32:00 2015 +1100

      powerpc/pseries: Fix endian problems with LE migration

  I just found that this commit is part of kernel 3.19, which is going to be the default kernel for 15.04, so we don't need to ask Canonical to integrate it.

  # git checkout v3.19-rc7
  # git show 3df76a9dcc74d5f012b94ea01ed6e7aaf8362c5a

  I have updated the kernel in the lpar.

  root@highlp3:~# dpkg -l | grep linux | grep 3.19.0-6
  ii linux-image-3.19.0-6-generic 3.19.0-6.6 ppc64el Linux kernel image for version 3.19.0 on PowerPC 64el SMP
  ii linux-image-extra-3.19.0-6-generic 3.19.0-6.6 ppc64el Linux kernel extra modules for version 3.19.0 on PowerPC 64el SMP

  LPM still hangs. The difference is that last time it hung at 0% and this time it hung at 99%.

  I can't seem to get the console on the destination even after a rmvterm. I checked the cpus from the phyp debug console and all of the cpus seem to be stuck at the same place.

  NIA C00809AC MSR 80019033 LR C087D7B4

  Can you please recover the lpar and look up the following addresses using addr2line:

  addr2line -e C00809AC
  addr2line -e C087D7B4

  And please install the kernel source as well so we can cross-reference those lines within the kernel.
[Kernel-packages] [Bug 1425202] Re: ISST-LTE: Ubuntu 15.04 lpar crashes at iommu_free_table after adapter DLPAR operation
** Package changed: ubuntu => linux (Ubuntu)

--
https://bugs.launchpad.net/bugs/1425202

Title:
  ISST-LTE: Ubuntu 15.04 lpar crashes at iommu_free_table after adapter DLPAR operation

Status in linux package in Ubuntu:
  New

Bug description:
  Description:
  I have installed all packages required for DLPAR in Ubuntu 15.04. When I try to do a network adapter DLPAR operation, the lpar crashes.

  root@highlp1:/etc/network# [ 454.482742] kernel BUG at /build/buildd/linux-3.18.0/arch/powerpc/kernel/iommu.c:732!
  cpu 0xa: Vector: 700 (Program Check) at [c0016e67f850]
      pc: c0040484: iommu_free_table+0x64/0x140
      lr: c004047c: iommu_free_table+0x5c/0x140
      sp: c0016e67fad0
     msr: 80029033
    current = 0xc0016b4277e0
    paca    = 0xc7b35a00   softe: 0   irq_happened: 0x01
      pid   = 3288, comm = drmgr
  kernel BUG at /build/buildd/linux-3.18.0/arch/powerpc/kernel/iommu.c:732!
  enter ? for help
  [c0016e67fb40] c0083428 iommu_reconfig_notifier+0x88/0x160
  [c0016e67fb70] c00dab38 notifier_call_chain+0x98/0x100
  [c0016e67fbc0] c00db244 __blocking_notifier_call_chain+0x74/0xe0
  [c0016e67fc10] c0877290 of_reconfig_notify+0x40/0xa0
  [c0016e67fc40] c0877cdc of_detach_node+0x8c/0xb0
  [c0016e67fc70] c00809b8 ofdt_write+0x1f8/0x7b0
  [c0016e67fd40] c03459f0 proc_reg_write+0xb0/0x110
  [c0016e67fd90] c02b950c vfs_write+0xdc/0x260
  [c0016e67fde0] c02ba0ac SyS_write+0x6c/0x110
  [c0016e67fe30] c000927c syscall_exit+0x0/0x7c
  --- Exception: c01 (System Call) at 101ba638
  SP (3fffdba63300) is in userspace
  a:mon>
  a:mon>
  a:mon>
  a:mon> t
  [c0016e67fb40] c0083428 iommu_reconfig_notifier+0x88/0x160
  [c0016e67fb70] c00dab38 notifier_call_chain+0x98/0x100
  [c0016e67fbc0] c00db244 __blocking_notifier_call_chain+0x74/0xe0
  [c0016e67fc10] c0877290 of_reconfig_notify+0x40/0xa0
  [c0016e67fc40] c0877cdc of_detach_node+0x8c/0xb0
  [c0016e67fc70] c00809b8 ofdt_write+0x1f8/0x7b0
  [c0016e67fd40] c03459f0 proc_reg_write+0xb0/0x110
  [c0016e67fd90] c02b950c vfs_write+0xdc/0x260
  [c0016e67fde0] c02ba0ac SyS_write+0x6c/0x110
  [c0016e67fe30] c000927c syscall_exit+0x0/0x7c
  --- Exception: c01 (System Call) at 101ba638
  SP (3fffdba63300) is in userspace
  a:mon> e
  cpu 0xa: Vector: 700 (Program Check) at [c0016e67f850]
      pc: c0040484: iommu_free_table+0x64/0x140
      lr: c004047c: iommu_free_table+0x5c/0x140
      sp: c0016e67fad0
     msr: 80029033
    current = 0xc0016b4277e0
    paca    = 0xc7b35a00   softe: 0   irq_happened: 0x01
      pid   = 3288, comm = drmgr
  kernel BUG at /build/buildd/linux-3.18.0/arch/powerpc/kernel/iommu.c:732!

  Original patch posted as:
  https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-February/125068.html

  Updated patch posted as:
  https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-February/125133.html

  Will update again once merged into mpe's tree.

  -Nish
[Kernel-packages] [Bug 1424884] Re: Ubuntu 15.04 PowerNV install error during network due to bnx2x driver errors
** Package changed: ubuntu => linux (Ubuntu)

--
https://bugs.launchpad.net/bugs/1424884

Title:
  Ubuntu 15.04 PowerNV install error during network due to bnx2x driver errors

Status in linux package in Ubuntu:
  New

Bug description:
  ---Problem Description---
  I am trying to install Ubuntu 15.04 on a PowerNV system using a network installation. I am able to obtain an IP address with DHCP from the petitboot menu. I am also able to load the installation kernel and begin the installation, but the network configuration within the installer fails. I tried to configure the network manually, but that failed too. When I exited to the installer shell, I could see that my interface has the IP address that I manually configured (10.33.8.112), but I cannot connect to the local LAN:

  ~ # ip address
  1: lo: mtu 65536 qdisc noqueue
      link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
      inet 127.0.0.1/8 scope host lo
         valid_lft forever preferred_lft forever
      inet6 ::1/128 scope host
         valid_lft forever preferred_lft forever
  2: eth0: mtu 1500 qdisc mq qlen 1000
      link/ether 6c:ae:8b:02:10:88 brd ff:ff:ff:ff:ff:ff
  3: eth1: mtu 1500 qdisc mq qlen 1000
      link/ether 6c:ae:8b:02:10:89 brd ff:ff:ff:ff:ff:ff
  4: eth2: mtu 1500 qdisc mq qlen 1000
      link/ether 6c:ae:8b:02:10:8a brd ff:ff:ff:ff:ff:ff
  5: eth3: mtu 1500 qdisc mq qlen 1000
      link/ether 6c:ae:8b:02:10:8b brd ff:ff:ff:ff:ff:ff
  6: eth4: mtu 1500 qdisc noop qlen 1000
      link/ether 40:f2:e9:31:08:f4 brd ff:ff:ff:ff:ff:ff
      inet 10.33.8.112/16 brd 10.33.255.255 scope global eth4
         valid_lft forever preferred_lft forever
  7: eth5: mtu 1500 qdisc noop qlen 1000
      link/ether 40:f2:e9:31:08:f5 brd ff:ff:ff:ff:ff:ff
  8: eth6: mtu 1500 qdisc noop qlen 1000
      link/ether 40:f2:e9:31:08:f6 brd ff:ff:ff:ff:ff:ff
  9: eth7: mtu 1500 qdisc noop qlen 1000
      link/ether 40:f2:e9:31:08:f7 brd ff:ff:ff:ff:ff:ff
  ~ # ping 10.33.0.1
  PING 10.33.0.1 (10.33.0.1): 56 data bytes
  ping: sendto: Network is unreachable

  In the /var/log/syslog for the install, I noticed the following errors related to the bnx2x driver:

  Feb 23 18:49:34 kernel: [ 546.172944] bnx2x 0005:01:00.2: Direct firmware load for bnx2x/bnx2x-e2-7.10.51.0.fw failed with error -2
  Feb 23 18:49:34 kernel: [ 546.172950] bnx2x: [bnx2x_init_firmware:12958(eth6)]Can't load firmware file bnx2x/bnx2x-e2-7.10.51.0.fw
  Feb 23 18:49:34 kernel: [ 546.172952] bnx2x: [bnx2x_func_hw_init:5523(eth6)]Error loading firmware
  Feb 23 18:49:34 kernel: [ 546.172957] bnx2x: [bnx2x_nic_load:2704(eth6)]HW init failed, aborting

  I'm wondering if this failure message for the network driver is what is preventing me from connecting to the network during the install?

  ---uname output---
  N/A - Trying to do a fresh install of 15.04

  Machine Type = 8247-42L

  ---Steps to Reproduce---
  Boot the PowerNV system to the petitboot menu and select network install of Ubuntu 15.04.

  Install method: Network install

  Install ISO Information: vivid-server-ppc64el.iso 23-Feb-2015 14:25 388M

  The initrd has an older set of firmware files, not the one the driver is expecting to load:

  /lib/firmware/bnx2x # ls
  bnx2x-e1-7.8.19.0.fw bnx2x-e1h-7.8.19.0.fw bnx2x-e2-7.8.19.0.fw
  /lib/firmware/bnx2x #
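The mismatch reported above (the driver requests one firmware version, the initrd ships another, and the load fails with error -2, i.e. ENOENT) can be checked mechanically. The following is only a minimal sketch in Python. It assumes the file names follow the bnx2x-<chip>-<version>.fw pattern seen in the log, and the helper name missing_firmware is hypothetical, not part of any Ubuntu or kernel tool.

```python
import re

# Pattern inferred from the file names in this report, e.g. bnx2x-e2-7.10.51.0.fw
FW_NAME = re.compile(r"bnx2x-(?P<chip>[a-z0-9]+)-(?P<version>[\d.]+)\.fw$")


def missing_firmware(requested, available):
    """Return None if `requested` is available, else the versions shipped for that chip.

    Hypothetical helper for illustration only.
    """
    if requested in available:
        return None
    want = FW_NAME.search(requested)
    if want is None:
        raise ValueError(f"unexpected firmware name: {requested}")
    shipped = []
    for name in available:
        have = FW_NAME.search(name)
        if have and have.group("chip") == want.group("chip"):
            shipped.append(have.group("version"))
    return shipped


# File requested by the driver (from syslog) and the files listed in the initrd.
requested = "bnx2x/bnx2x-e2-7.10.51.0.fw"
initrd_files = [
    "bnx2x/bnx2x-e1-7.8.19.0.fw",
    "bnx2x/bnx2x-e1h-7.8.19.0.fw",
    "bnx2x/bnx2x-e2-7.8.19.0.fw",
]
print(missing_firmware(requested, initrd_files))
```

A non-empty result means the initrd carries firmware for the same chip family, but in a version other than the one the driver asks for, which matches the situation in this report.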
[Kernel-packages] [Bug 1422481] Re: mlx4 not recovering from EEH in Ubuntu 15.04 (Mellanox)
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1422481 Title: mlx4 not recovering from EEH in Ubuntu 15.04 (Mellanox) Status in linux package in Ubuntu: New Bug description: ---Problem Description--- EEH is not working with mlx4 driver. When the driver recovered it hits another EEH. ---uname output--- Linux ubuntu 3.18.0-12-generic #13 SMP Mon Feb 9 16:31:42 CST 2015 ppc64le ppc64le ppc64le GNU/Linux ---Additional Hardware Info--- Need Mellanox adapter like Connect 3 adapter. Machine Type = P8 ---Steps to Reproduce--- Just inject EEH to mlx4 device. Stack trace output: from EEH recovery then it hits this: [ 188.747571] EEH: Collect temporary log [ 188.748330] EEH: of node=/pci@8002007/ethernet@3 [ 188.748339] EEH: PCI device/vendor: 100715b3 [ 188.748361] EEH: PCI cmd/status register: 00100146 [ 188.748362] EEH: PCI-E capabilities and status follow: [ 188.748459] EEH: PCI-E 00: 00020010 10008e02 0001200e 0843f483 [ 188.748537] EEH: PCI-E 10: 1083 [ 188.748539] EEH: PCI-E 20: [ 188.748540] EEH: PCI-E AER capability register set follows: [ 188.748625] EEH: PCI-E AER 00: 00020001 00062010 [ 188.748704] EEH: PCI-E AER 10: 2000 2000 01e0 [ 188.748783] EEH: PCI-E AER 20: [ 188.748805] EEH: PCI-E AER 30: [ 188.748813] EEH: Reset without hotplug activity [ 193.833245] EEH: Notify device drivers the completion of reset [ 193.833257] mlx4_core: Initializing 0001:00:03.0 [ 193.833317] mlx4_core 0001:00:03.0: BAR 0: can't reserve [mem 0x170b000-0x170b00f] [ 193.833321] mlx4_core 0001:00:03.0: Couldn't get PCI resources, aborting [ 193.833395] EEH: Not recovered [ 193.833397] EEH: Unable to recover from failure from PHB#1-PE#1. 
Please try reseating or replacing it [ 193.834531] EEH: of node=/pci@8002007/ethernet@3 [ 193.834547] EEH: PCI device/vendor: 100715b3 [ 193.834580] EEH: PCI cmd/status register: 00100142 [ 193.834582] EEH: PCI-E capabilities and status follow: [ 193.834728] EEH: PCI-E 00: 00020010 10008e02 200e 0843f483 [ 193.834846] EEH: PCI-E 10: 1083 [ 193.834849] EEH: PCI-E 20: [ 193.834850] EEH: PCI-E AER capability register set follows: [ 193.834981] EEH: PCI-E AER 00: 00020001 00062010 [ 193.835101] EEH: PCI-E AER 10: 2000 2000 01e0 [ 193.835219] EEH: PCI-E AER 20: [ 193.835252] EEH: PCI-E AER 30: [ 193.835289] Unable to handle kernel paging request for data at address 0x0388 [ 193.835356] Faulting instruction address: 0xd1f3231c [ 193.835415] Oops: Kernel access of bad area, sig: 11 [#1] [ 193.835460] SMP NR_CPUS=2048 NUMA pSeries [ 193.835509] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc rtc_generic mlx4_en vxlan ip6_udp_tunnel udp_tunnel mlx4_core [ 193.835886] CPU: 6 PID: 50 Comm: eehd Not tainted 3.18.0-12-generic #13 [ 193.835942] task: c003f72ca880 ti: c003f707c000 task.ti: c003f707c000 [ 193.836009] NIP: d1f3231c LR: d1f32790 CTR: d1f32760 [ 193.836076] REGS: c003f707f790 TRAP: 0300 Not tainted (3.18.0-12-generic) [ 193.836141] MSR: 80019033 CR: 4448 XER: 2000 [ 193.836302] CFAR: c00a7be0 DAR: 0388 DSISR: 4000 SOFTE: 1 GPR00: d1f32790 c003f707fa10 d1f66310 c003fe0ad000 GPR04: 0003 c003fd00 GPR08: 0001 d1f32760 fffa 00011001 GPR12: d1f32760 cfb83600 c00d9118 c003f90e56c0 GPR16: GPR20: c0c4ab90 GPR24: c0c4ab68 00100100 c003fe068580 c003fe068580 GPR28: c003fe0ad000 c003fe0685e0 d1f5da50 [ 193.837205] NIP [d1f3231c] mlx4_unload_one+0x3c/0x480 [mlx4_core] [ 193.837269] LR [d1f32790] mlx4_pci_err_detected+0x30/0x60 [mlx4_core] [ 193.837336] Call Trace: [ 
193.837361] [c003f707fa10] [c003fe068580] 0xc003fe068580 (unreliable) [ 193.837447] [c003f707faa0] [d1f32790] mlx4_pci_err_detected+0x30/0x60 [mlx4_core] [ 193.837528] [c003f707fae0] [c003ac64] eeh_r
[Kernel-packages] [Bug 1415919] Re: Ubuntu - Fixing RTAS call from xmon (running ppc64le on PowerVM)
** Package changed: ubuntu => linux (Ubuntu)

--
https://bugs.launchpad.net/bugs/1415919

Title:
  Ubuntu - Fixing RTAS call from xmon (running ppc64le on PowerVM)

Status in linux package in Ubuntu:
  New

Bug description:
  ---Problem Description---
  Commit 3b8a3c010969 ("powerpc/pseries: Fix endiannes issue in RTAS call from xmon") fixed an endianness issue in the call made from xmon to RTAS.

  However, as Michael Ellerman noticed, the fix was not complete: the token value was not byte swapped. As a result, xmon called an unexpected and usually nonexistent RTAS function, which RTAS silently ignores.

  This bug was opened to ensure that Linux distros pick up this upstream patch:
  https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e6eb2eba494d6f99e69ca3c3748cd37a2544ab38

  This bug affects Ubuntu 15.04 and 15.10.
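To illustrate why a token that is not byte swapped selects the wrong RTAS function, here is a small Python model. It is a sketch under stated assumptions and not the kernel code: the token value 0x2A is made up, and cpu_to_be32 below only imitates the kernel macro of the same name.

```python
import struct


def cpu_to_be32(value, byteorder):
    """Imitate the kernel's cpu_to_be32(): return the CPU-native integer whose
    in-memory bytes equal the big-endian encoding of `value`."""
    big_endian_bytes = struct.pack(">I", value)
    return int.from_bytes(big_endian_bytes, byteorder)


def seen_by_firmware(stored, byteorder):
    """Value a big-endian reader sees when the CPU stores `stored` natively."""
    return int.from_bytes(stored.to_bytes(4, byteorder), "big")


token = 0x2A  # hypothetical RTAS token, for illustration only

# Little-endian CPU, no conversion: the firmware sees a different token.
print(hex(seen_by_firmware(token, "little")))
# Little-endian CPU, with conversion: the firmware sees the intended token.
print(hex(seen_by_firmware(cpu_to_be32(token, "little"), "little")))
# Big-endian CPU: the conversion changes nothing.
print(cpu_to_be32(token, "big") == token)
```

In this model the unconverted token reaches the firmware with its bytes reversed, which is consistent with the description of an unexpected function being called.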
[Kernel-packages] [Bug 1415562] Re: [Ubuntu 15.04] Support firmware assisted dump on ppc64le
** Package changed: ubuntu => linux (Ubuntu)

** Changed in: linux (Ubuntu)
     Assignee: (unassigned) => Taco Screen team (taco-screen-team)

--
https://bugs.launchpad.net/bugs/1415562

Title:
  [Ubuntu 15.04] Support firmware assisted dump on ppc64le

Status in linux package in Ubuntu:
  New

Bug description:
  Starting with POWER6, the firmware can preserve the partition memory dump during a system crash and boot into a fresh copy of the kernel on a fully reset system. This feature adds the necessary support to exploit the dump capture capability provided by Power firmware.

  With this feature, the production kernel registers for firmware-assisted dump using RTAS (Runtime Abstraction Service) calls and builds the required ELF header, which is then exported through '/proc/vmcore' in the second kernel after a crash. This feature improves Power serviceability by making it more robust than the current kdump mechanism on Linux.

  The Ubuntu 15.04 kernel already includes the necessary code for fadump. The only kernel change needed is to enable CONFIG_FA_DUMP in the kernel configuration. In addition, an update is needed for a script in the kdump-tools package.
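Whether a given kernel build has CONFIG_FA_DUMP enabled can be read from its configuration file. The Python sketch below parses config text in the usual "CONFIG_NAME=y" / "# CONFIG_NAME is not set" form. The helper name config_enabled and the sample text are made up for illustration and are not taken from this bug.

```python
def config_enabled(config_text, option):
    """Return True if `option` is set to y in kernel config text.

    Hypothetical helper: treats "# CONFIG_X is not set" and a missing
    option as disabled.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name == option:
            return value == "y"
    return False


# Made-up sample, not the actual Ubuntu 15.04 configuration.
sample = """\
CONFIG_PPC_PSERIES=y
# CONFIG_FA_DUMP is not set
CONFIG_CRASH_DUMP=y
"""
print(config_enabled(sample, "CONFIG_FA_DUMP"))
```

On an installed system the text would typically come from the config file shipped next to the kernel image under /boot; that path is an assumption about the usual Ubuntu layout rather than something stated in this report.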
[Kernel-packages] [Bug 1396235] Re: Ubuntu - unable to use XMON debugger (running ppc64le on PowerVM)
Hi Chris,

Can another track be opened for Vivid to provide the same fix there, or is it preferable (or even necessary) to open a new bug?

Thanks.

--
https://bugs.launchpad.net/bugs/1396235

Title:
  Ubuntu - unable to use XMON debugger (running ppc64le on PowerVM)

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Trusty:
  Fix Released
Status in linux source package in Utopic:
  Fix Released

Bug description:
  SRU Justification:

  [Impact]
  Users of LE Power8 hardware might not be able to use XMON properly.

  [Test Case]
  # echo x > /proc/sysrq-trigger
  This should put us in the xmon debugger without errors.

  [Fix]
  3b8a3c01096925a824ed3272601082289d9c23a5 can be cleanly cherry-picked to 3.13/3.16. It only adds cpu_to_be32 macros, which shouldn't change the behavior of BE mode CPUs.

  --

  == Comment: #0 - Laurent Dufour - 2014-11-25 10:13:17 ==
  ---Problem Description---
  The kernel fails to invoke xmon and instead prints the following messages:

  # echo x > /proc/sysrq-trigger
  [ 47.600133] SysRq : Entering xmon
  cpu 0xf: Vector: 0 at [c000e8603b80]
      pc: c05610c0: write_sysrq_trigger+0x120/0x260
      lr: c05610c0: write_sysrq_trigger+0x120/0x260
      sp: c000e8603ce0
     msr: 80009033
    current = 0xc000ef4a
    paca    = 0xc7df3480   softe: 0   irq_happened: 0x00
      pid   = 2303, comm = bash
  [ 47.607247] Bad kernel stack pointer fc7b4b0 at ee27cc4
  cpu 0xf: Vector: 300 (Data Access) at [c7f37d40]
      pc: 0ee27cc4
      lr: 0ee27c44
      sp: fc7b4b0
     msr: 80001000
     dar: 1000
   dsisr: 4200
    current = 0xc000ef4a
    paca    = 0xc7df3480   softe: 0   irq_happened: 0x00
      pid   = 2303, comm = bash
  cpu 0xf: Exception 300 (Data Access) in xmon, returning to main loop
  xmon: WARNING: bad recursive fault on cpu 0xf

  ---uname output---
  N/A

  Machine Type = powervm le

  ---System Hang---
  System is hung

  ---Debugger---
  A debugger is not configured

  ---Steps to Reproduce---
  See problem description.

  Stack trace output:
  no

  Oops output:
  See problem description.

  System Dump Info:
  The system is not configured to capture a system dump.

  This patch has been sent upstream:
  http://patchwork.ozlabs.org/patch/413744/
[Kernel-packages] [Bug 1415178] Re: Kernel trace message when the ipr driver is rmmod from Ubuntu 14.10 guest (GTO - PCI Passthrough)
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1415178 Title: Kernel trace message when the ipr driver is rmmod from Ubuntu 14.10 guest (GTO - PCI Passthrough) Status in linux package in Ubuntu: New Bug description: pKVM version: [root@yangtze-lp1 ~]# cat /etc/issue IBM_PowerKVM release 2.1.1 build 10 alpha (pkvm2_1_1) Kernel \r on a \m (\l) Ubuntu version: root@ubuntushinner:~# uname -a Linux ubuntushinner 3.16.0-14-generic #20-Ubuntu SMP Sat Sep 6 23:45:12 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux steps to reproduce: 1. rmmod the ipr driver and kernel traces are seen on the syslog. root@ubuntushinner:~# lsmod Module Size Used by pseries_rng 2849 0 ses 9046 0 enclosure 11198 1 ses rtc_generic 2249 0 ipr 140038 0 ohci_pci6794 0 root@ubuntushinner:~# uname -a Linux ubuntushinner 3.16.0-14-generic #20-Ubuntu SMP Sat Sep 6 23:45:12 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux root@ubuntushinner:~# rmmod ipr root@ubuntushinner:~# lsscsi [0:0:0:0]diskQEMU QEMU HARDDISK2.0. /dev/sda [0:0:0:1]cd/dvd QEMU QEMU CD-ROM 2.0. 
/dev/sr0 syslog: Sep 9 02:57:57 ubuntushinner kernel: [ 576.546878] kernfs: can not remove 'device', no directory Sep 9 02:57:57 ubuntushinner kernel: [ 576.546903] [ cut here ] Sep 9 02:57:57 ubuntushinner kernel: [ 576.546906] WARNING: at /build/buildd/linux-3.16.0/fs/kernfs/dir.c:1220 Sep 9 02:57:57 ubuntushinner kernel: [ 576.546908] Modules linked in: pseries_rng ses enclosure rtc_generic ipr(-) ohci_pci Sep 9 02:57:57 ubuntushinner kernel: [ 576.546922] CPU: 1 PID: 3537 Comm: rmmod Not tainted 3.16.0-14-generic #20-Ubuntu Sep 9 02:57:57 ubuntushinner kernel: [ 576.546926] task: c003ebad6830 ti: c003e5da8000 task.ti: c003e5da8000 Sep 9 02:57:57 ubuntushinner kernel: [ 576.546928] NIP: c0350c34 LR: c0350c30 CTR: c0517a60 Sep 9 02:57:57 ubuntushinner kernel: [ 576.546931] REGS: c003e5dab380 TRAP: 0700 Not tainted (3.16.0-14-generic) Sep 9 02:57:57 ubuntushinner kernel: [ 576.546932] MSR: 800100029033 CR: 28088844 XER: 2000 Sep 9 02:57:57 ubuntushinner kernel: [ 576.546942] CFAR: c09fd270 SOFTE: 1 Sep 9 02:57:57 ubuntushinner kernel: [ 576.546942] GPR00: c0350c30 c003e5dab600 c13d49e0 002d Sep 9 02:57:57 ubuntushinner kernel: [ 576.546942] GPR04: c1845db0 c1856618 0175 0175 Sep 9 02:57:57 ubuntushinner kernel: [ 576.546942] GPR08: c0e449e0 c0c41b80 Sep 9 02:57:57 ubuntushinner kernel: [ 576.546942] GPR12: 8800 cfb80900 0100311e01f0 Sep 9 02:57:57 ubuntushinner kernel: [ 576.546942] GPR16: 4c7133a0 4c713358 Sep 9 02:57:57 ubuntushinner kernel: [ 576.546942] GPR20: 4c713380 4c7133b8 4c713398 Sep 9 02:57:57 ubuntushinner kernel: [ 576.546942] GPR24: 0001 3fffe6f71750 c003e4e01d28 Sep 9 02:57:57 ubuntushinner kernel: [ 576.546942] GPR28: d5fe18e0 c003e4e01d18 Sep 9 02:57:57 ubuntushinner kernel: [ 576.546980] NIP [c0350c34] kernfs_remove_by_name_ns+0xe4/0xf0 Sep 9 02:57:57 ubuntushinner kernel: [ 576.546984] LR [c0350c30] kernfs_remove_by_name_ns+0xe0/0xf0 Sep 9 02:57:57 ubuntushinner kernel: [ 576.546987] Call Trace: Sep 9 02:57:57 ubuntushinner kernel: [ 
576.546990] [c003e5dab600] [c0350c30] kernfs_remove_by_name_ns+0xe0/0xf0 (unreliable) Sep 9 02:57:57 ubuntushinner kernel: [ 576.546995] [c003e5dab680] [c03542c0] sysfs_remove_link+0x40/0x90 Sep 9 02:57:57 ubuntushinner kernel: [ 576.547001] [c003e5dab6c0] [d5fe0de0] enclosure_remove_links.part.2+0x80/0xb0 [enclosure] Sep 9 02:57:57 ubuntushinner kernel: [ 576.547005] [c003e5dab730] [d5fe0e54] enclosure_component_release+0x44/0x70 [enclosure] Sep 9 02:57:57 ubuntushinner kernel: [ 576.547012] [c003e5dab760] [c0639360] device_release+0x60/0xf0 Sep 9 02:57:57 ubuntushinner kernel: [ 576.547018] [c003e5dab7e0] [c050b73c] kobject_release+0xdc/0x250 Sep 9 02:57:57 ubuntushinner kernel: [ 576.547022] [c003e5dab870] [c0639f3c] device_unregister+0x4c/0xb0 Sep 9 02:57:57 ubuntushinner kernel: [ 576.547025] [c003e5dab8e0] [d5fe0810] enclosure_unregister+0xb0/0x100 [enclosure] Sep 9 02:57:57 ubuntushinner kernel: [ 576.54702
[Kernel-packages] [Bug 1415102] Re: [Ubuntu 15.04] Corsa Update
** Package changed: ubuntu => linux (Ubuntu)

** Changed in: linux (Ubuntu)
     Assignee: (unassigned) => Taco Screen team (taco-screen-team)

--
https://bugs.launchpad.net/bugs/1415102

Title:
  [Ubuntu 15.04] Corsa Update

Status in linux package in Ubuntu:
  New

Bug description:
  Have the following features (bug fixes) by 15.04:

  * Add sysfs interface for bitstream reload
  * Add support for EEH error recovery
  * PERST
  * tracepoints
  * Userspace EEH
  * timebase sync
  * guest support
  * mambo support
  * kexec support
  * AFU dynamic download

  Target: kernel 3.18/3.19

  > I think the only ones of these we are going to get are:
  > * PERST
  > * tracepoints
  > * Add support for EEH error recovery
  >
  > The rest will probably need to be deferred.

  Canonical,

  For this feature, we will need some backports from kernel 3.20. We will do them in the 3.20-rc1 timeframe.
[Kernel-packages] [Bug 1410817] Re: Kdump triggered manually after cpu offline operation fails to collect dump
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1410817 Title: Kdump triggered manually after cpu offline operation fails to collect dump Status in linux package in Ubuntu: New Bug description: ---Problem Description--- Kdump triggered manually after cpu offline operation fails to collect dump ---uname output--- Linux ubuntu 3.18.0-9-generic #10-Ubuntu SMP Mon Jan 12 21:35:28 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux Machine Type = P8 ---System Hang--- We have to reboot the LPAR and gain access to the machine again. ---Steps to Reproduce--- Install a Power VM LPAR with Ubuntu 15.04 ISO using Virtual DVD. Then offline one of the cpu's of the machine. root@ubuntu:~# lscpu Architecture: ppc64le Byte Order:Little Endian CPU(s):16 On-line CPU(s) list: 0-15 Thread(s) per core:8 Core(s) per socket:1 Socket(s): 2 NUMA node(s): 2 Model: IBM,8284-22A Hypervisor vendor: pHyp Virtualization type: para L1d cache: 64K L1i cache: 32K NUMA node0 CPU(s): 0-15 NUMA node2 CPU(s): root@ubuntu:~# chcpu -d 15 CPU 15 disabled root@ubuntu:~# lscpu Architecture: ppc64le Byte Order:Little Endian CPU(s):16 On-line CPU(s) list: 0-14 Off-line CPU(s) list: 15 Thread(s) per core:7 Core(s) per socket:1 Socket(s): 2 NUMA node(s): 2 Model: IBM,8284-22A Hypervisor vendor: pHyp Virtualization type: para L1d cache: 64K L1i cache: 32K NUMA node0 CPU(s): 0-14 NUMA node2 CPU(s): Configure and enable kdump on the LPAR. 
root@ubuntu:~# /etc/init.d/kdump-tools status current state : ready to kdump root@ubuntu:~# kdump-config load Modified cmdline:BOOT_IMAGE=/boot/vmlinux-3.18.0-9-generic root=UUID=70957e56-8669-466f-b0e7-140f2ec39a04 ro splash quiet irqpoll maxcpus=1 nousb elfcorehdr=155072K segment[0].mem:0x800 memsz:24510464 segment[1].mem:0x976 memsz:65536 segment[2].mem:0x977 memsz:65536 segment[3].mem:0x978 memsz:65536 segment[4].mem:0x979 memsz:22020096 segment[5].mem:0xec7 memsz:196608 * loaded kdump kernel root@ubuntu:~# root@ubuntu:~# kdump-config show USE_KDUMP:1 KDUMP_SYSCTL: kernel.panic_on_oops=1 KDUMP_COREDIR:/var/crash crashkernel addr: current state:ready to kdump kexec command: /sbin/kexec -p --args-linux --command-line="BOOT_IMAGE=/boot/vmlinux-3.18.0-9-generic root=UUID=70957e56-8669-466f-b0e7-140f2ec39a04 ro splash quiet irqpoll maxcpus=1 nousb" --initrd=/boot/initrd.img-3.18.0-9-generic /boot/vmlinux-3.18.0-9-generic root@ubuntu:~# kdump-config status current state : ready to kdump root@ubuntu:~# sysctl -w kernel.sysrq=1 kernel.sysrq = 1 root@ubuntu:~# cat /proc/sys/kernel/sysrq 1 Trigger the crash manually using sysrq-trigger. 
root@ubuntu:~# echo c > /proc/sysrq-trigger root@ubuntu:~# [ 311.088315] SysRq : Trigger a crash [ 311.088331] Unable to handle kernel paging request for data at address 0x [ 311.088336] Faulting instruction address: 0xc05f9094 [ 311.088341] Oops: Kernel access of bad area, sig: 11 [#1] [ 311.088344] SMP NR_CPUS=2048 NUMA pSeries [ 311.088349] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp bridge stp llc iptable_filter ip_tables x_tables pseries_rng rtc_generic binfmt_misc [ 311.088372] CPU: 14 PID: 1705 Comm: bash Not tainted 3.18.0-9-generic #10-Ubuntu [ 311.088377] task: c0027773e470 ti: c002782d task.ti: c002782d [ 311.088381] NIP: c05f9094 LR: c05fa12c CTR: c05f9060 [ 311.088385] REGS: c002782d39d0 TRAP: 0300 Not tainted (3.18.0-9-generic) [ 311.088389] MSR: 80009033 CR: 28242822 XER: 0001 [ 311.088401] CFAR: c00084d8 DAR: DSISR: 4200 SOFTE: 1 GPR00: c05fa12c c002782d3c50 c1426890 0063 GPR04: c1b85c28 c1b965e0 00ff c15e71f0 GPR08: c0e76890 0001 0001 GPR12: c05f9060 c7b37e00 2200 GPR16: 1016d6e8 01088208 10143eb8 100c9390 GPR20: 1017b008 10143d18 GPR24: 10156c00 10178868 c13756a8 0004 GPR28: 0063 c133f598 c1375a68 [ 311.088459] NIP [c05f9094] sysrq
[Kernel-packages] [Bug 1410519] Re: [PowerVM] Kernel BUG @ kernel/irq_work.c:157! - 24x7 hw counters
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1410519 Title: [PowerVM] Kernel BUG @ kernel/irq_work.c:157! - 24x7 hw counters Status in linux package in Ubuntu: New Bug description: Steps to recreate the problem: 1. Install Ubuntu 15.04 as a PowerVM guest. 2. Install perf tool 3. Run following scripts to test 24/7 Power8 hardware counter event with perf. tool === Script 1 #!/bin/bash count=0; offset=0x128 PERF_ARGS="-r 10 -C 0" while [ $count -lt 100 ]; do EVENT="hv_24x7/domain=0x2,offset=$offset,starting_index=10/" perf stat $PERF_ARGS -x ' ' perf stat $PERF_ARGS -x ' ' -e $EVENT ls count=) done Script 2 #!/bin/bash offset=0; PERF_ARGS="-r 10 -C 0" while [ $offset -lt 8192 ]; do EVENT="hv_24x7/domain=0x2,offset=$offset,starting_index=10/" perf stat $PERF_ARGS -x ' ' perf stat $PERF_ARGS -x ' ' -e $EVENT ls offset=) done After few iterations I hit the following BUG. tt2.sh tt.sh tt2.sh tt.sh tt2.sh tt.sh 275679187521558 hv_24x7/domain=0x2,offset=6848,starting_index=10/ 0.00% tt2.sh tt.sh [ 4657.314709] softirq: huh, entered softirq 7 SCHED c010abc0 with preem pt_count 0100, exited with bfff? [ 4657.314727] kernel BUG at /build/buildd/linux-3.16.0/kernel/irq_work.c:157! 
[ 4657.314732] Oops: Exception in kernel mode, sig: 5 [#1] [ 4657.314740] Modules linked in: rtc_generic pseries_rng [ 4657.314749] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-25-generic #33-U [ 4657.314755] task: c1375e00 ti: c13d task.ti: c13d [ 4657.314759] NIP: c01e8ffc LR: c001fe70 CTR: c0002800ic) [ 4657.314770] MSR: 80029033 CR: 28042024 XER: 000a [ 4657.314782] CFAR: c001fe6c SOFTE: 0 GPR04: 0010 009c c1424a98 0002 GPR12: 80009033 ce9a 06a3fcd0 0060 GPR16: 0020 c0e57c00 GPR20: c1595dca c1595478 0001 GPR28: c0e40380 c0e40300 c13d3590 c0e56f08 [ 4657.314832] NIP [c01e8ffc] irq_work_run+0x1c/0x30 [ 4657.314841] Call Trace: 4000 (unreliable) [ 4657.314861] [c13d34f0] [c001ff90] timer_interrupt+0xa0/0xe0 [ 4657.314871] [c13d3520] [c0002914] decrementer_common+0x114/0x180 [ 4657.314884] --- Exception: 901 at arch_local_irq_restore+0x14/0x90 [ 4657.314896] [c13d3810] [c012ed08] vprintk_emit+0x3b8/0x660 (u [ 4657.314908] [c13d38e0] [c0a02650] printk+0x84/0x98 [ 4657.314918] [c13d3910] [c00b51b4] __do_softirq+0x1e4/0x410 [ 4657.314927] [c13d3a00] [c00b57b8] irq_exit+0xf8/0x1400 [ 4657.314948] [c13d3a60] [c0002c14] doorbell_super_common+0x114/0x180 [ 4657.314963] --- Exception: a01 at plpar_hcall_norets+0x8c/0xdc [ 4657.314963] LR = check_and_cede_processor+0x34/0x5020/0x50 (unreliable) [ 4657.314997] [c13d3df0] [c084077c] cpuidle_enter_state+0x6c/0x140c0 [ 4657.315030] [c13d3f00] [c0d63ea8] start_kernel+0x500/0x51c [ 4657.315047] Instruction dump: [ 4657.315052] eba1ffe8 7c0803a6 ebc1fff0 ebe1fff8 4e800020 3c4c011f 3842c110 78290464 [ 4657.315068] 81290014 752a000f 7d380026 55291ffe <0b09> 4bfffec8 6000 6000 [ 4657.315090] ---[ end trace ee202cccd2211e5d ]--- [ 4657.320224] [ 4657.362675] Unable to handle kernel paging request for data at address 0xc000 000b35515048 [ 4657.362680] Faulting instruction address: 0xc006a37c [ 4657.362684] Oops: Kernel access of bad area, sig: 11 [#2] [ 4657.362686] SMP NR_CPUS=2048 NUMA pSeries
[Kernel-packages] [Bug 1401150] Re: Endianness issue in the VPHN topology update code
** Package changed: ubuntu => linux (Ubuntu) -- https://bugs.launchpad.net/bugs/1401150 Title: Endianness issue in the VPHN topology update code Status in linux package in Ubuntu: New Bug description: -- Problem Description -- The current VPHN code assumes the NUMA topology update data is big endian. It is actually native endian, since the hypervisor passes it through registers. This has a large performance impact on little endian guests. A fix has been posted: http://patchwork.ozlabs.org/patch/396171/ Please pick the following commit from Michael Ellerman's tree: http://git.kernel.org/cgit/linux/kernel/git/mpe/linux.git/commit/?id=5c9fb1899400096c6818181c525897a31d57e488 which reads: "powerpc/vphn: NUMA node code expects big-endian" Now upstream: commit 5c9fb1899400096c6818181c525897a31d57e488 Author: Greg Kurz Date: Wed Oct 15 12:42:58 2014 +0200 powerpc/vphn: NUMA node code expects big-endian
[Kernel-packages] [Bug 1400411] Re: Feature: cpuidle: Enable fastsleep and winkle in ubuntu 14.04.02 kernel
** Package changed: ubuntu => linux (Ubuntu) -- https://bugs.launchpad.net/bugs/1400411 Title: Feature: cpuidle: Enable fastsleep and winkle in ubuntu 14.04.02 kernel Status in linux package in Ubuntu: New Bug description: This is a feature request to enable the fastsleep and winkle cpuidle power management states in the Ubuntu 14.04.02 kernel. The cpuidle state management patches have been posted to the Linux kernel community. Mailing list: [PATCH 0/4] powernv: cpuidle: Redesign idle states management https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-November/122433.html Patchset: [1/4] powerpc: powernv: Switch off MMU before entering nap/sleep/rvwinkle mode https://patchwork.ozlabs.org/patch/406249/ [2/4] powerpc/powernv: Enable Offline CPUs to enter deep idle states https://patchwork.ozlabs.org/patch/406250/ [3/4] powernv: cpuidle: Redesign idle states management https://patchwork.ozlabs.org/patch/406256/ [4/4] powernv: powerpc: Add winkle support for offline cpus https://patchwork.ozlabs.org/patch/406251/
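One way to see which idle states a given kernel exposes is to read the standard cpuidle sysfs directory. The sketch below assumes that layout (`/sys/devices/system/cpu/cpu0/cpuidle/state*/name`); the function names are hypothetical, and the exact state names a powernv kernel reports (for example whether "FastSleep" appears, or whether winkle is listed at all) are an assumption to verify on the target system.

```shell
#!/bin/bash
# List the cpuidle state names found under a cpuidle sysfs directory.
# The default is the usual Linux location for CPU 0.
list_cpuidle_states() {
    local dir="${1:-/sys/devices/system/cpu/cpu0/cpuidle}" f
    for f in "$dir"/state*/name; do
        [ -r "$f" ] && cat "$f"
    done
    return 0
}

# Succeed if a state with the given name (case-insensitive) is exposed.
# $1 = state name to look for, $2 = optional cpuidle directory.
has_cpuidle_state() {
    list_cpuidle_states "$2" | grep -qix -- "$1"
}
```

For example, `has_cpuidle_state fastsleep && echo enabled` would report whether a state by that name is visible on the running kernel.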
[Kernel-packages] [Bug 1392152] Re: ipr failing device information is received after executing vpdupdate -s command
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1392152 Title: ipr failing device information is received after executing vpdupdate -s command Status in “linux” package in Ubuntu: New Bug description: ---Problem Description--- ipr failing device information is received after executing vpdupdate -s command ---uname output--- Linux lep8d 3.16.0-9-generic #14-Ubuntu SMP Fri Aug 15 15:03:36 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux Machine Type = P8 ---Steps to Reproduce--- Install Ubuntu 14.10 iso on local hard disk of P8 machine in Non Virtualized environment. Then installed the lsvpd and iprutils package available in ubuntu repo. root@lep8d:~# vpdupdate -s sh: 1: Syntax error: EOF in backquote substitution root@lep8d:~# echo $? 0 root@lep8d:~# dmesg [ 7370.262781] ipr 0001:04:00.0: FFF4: Command to logical unit failed [ 7370.262789] ipr: -Failing Device Information- [ 7370.262791] ipr: World Wide Unique ID: 5000C500724B2FC3 [ 7370.262794] ipr: Device Resource Path: 00-01 [ 7370.262796] ipr: Primary Problem Description: Invalid command opcode/data [ 7370.262798] ipr: Secondary Problem Description: Status Check [ 7370.262800] ipr: SCSI Sense Data: [ 7370.262803] ipr: : 7500 0018 240001CF [ 7370.262805] ipr: 0010: 0002 05240001 [ 7370.262807] ipr: SCSI Command Descriptor Block: [ 7370.262809] ipr: : 12008300 2404 [ 7370.262811] ipr: Additional IOA Data: [ 7370.262813] ipr: : 455300CC 05B03B04 0024 0410 [ 7370.262816] ipr: 0010: 0B66B200 0B66B7C0 [ 7370.262818] ipr: 0020: 0B66B0E0 88008000 0024 [ 7370.262820] ipr: 0030: 5D80 8F00 [ 7370.262823] ipr: 0040: 00C095B5 [ 7370.262825] ipr: 0050: [ 7370.262827] ipr: 0060: [ 7370.262829] ipr: 0070: [ 7370.262831] ipr: 0080: D401 8000 [ 7370.262833] ipr: 0090: 0001 2AEB6BBF [ 7370.262836] ipr: 00A0: 0131BC24 [ 7370.262838] ipr: 00B0: 01608740 E003 E060 04448502 [ 7370.262840] 
ipr: 00C0: 0024 12008300 2404 [ 7370.262842] ipr: 00D0: 43440010 12008300 2404 [ 7370.262844] ipr: 00E0: 45480020 03001EB0 1200 [ 7370.262847] ipr: 00F0: 046E 05240001 01011EB0 1200 [ 7370.262849] ipr: 0100: 01448100 05240001 45540004 0192 [ 7370.262851] ipr: 0110: 43490018 0002 0001 [ 7370.262853] ipr: 0120: 5000C500 724B2FC1 1770 545209C0 [ 7370.276107] ipr 0001:04:00.0: FFF4: Command to logical unit failed [ 7370.276110] ipr: -Failing Device Information- [ 7370.276112] ipr: World Wide Unique ID: 5000C500724B2FC3 [ 7370.276114] ipr: Device Resource Path: 00-01 [ 7370.276117] ipr: Primary Problem Description: Invalid command opcode/data [ 7370.276119] ipr: Secondary Problem Description: Status Check [ 7370.276122] ipr: SCSI Sense Data: [ 7370.276124] ipr: : 7500 0018 240001CF [ 7370.276126] ipr: 0010: 0002 05240001 [ 7370.276128] ipr: SCSI Command Descriptor Block: [ 7370.276130] ipr: : 1200C700 2404 [ 7370.276132] ipr: Additional IOA Data: [ 7370.276134] ipr: : 455300CC 05B03B04 0024 0410 [ 7370.276137] ipr: 0010: 0B668A00 0B668FC0 [ 7370.276139] ipr: 0020: 0B6688E0 88008000 0024 [ 7370.276141] ipr: 0030: 6080 8F00 [ 7370.276143] ipr: 0040: 00C398B8 [ 7370.276146] ipr: 0050: [ 7370.276148] ipr: 0060: [ 7370.276150] ipr: 0070: [ 7370.276152] ipr: 0080: D401 8000 [ 7370.276155] ipr: 0090: 0001 2AEB70D7 [ 7370.276157] ipr: 00A0: 0131B124 [ 7370.276160] ipr: 00B0: 01608740 E003 E060 04448502 [ 7370.276162] ipr: 00C0: 0024 1200C700 2404 [ 7370.276164] ipr: 00D0: 43440010 1200C700 2404 [ 7370.276167] ipr: 00E0: 45480020 03001EB0 1200 [ 7370.276169] ipr: 00F0: 046E 05240001 01011EB0 1200 [ 7370.276172] ipr: 0100: 01448100 05240001 45540004 0186 [ 7370.276174] ipr: 0110: 43490018 0002 0001 [ 7370.276176] ipr: 012
[Kernel-packages] [Bug 1391953] Re: [Ubuntu 14.04.02] Kernel patches for PowerKVM host and guest
Set package to "linux" for kernel component ** Package changed: ubuntu => linux (Ubuntu) -- https://bugs.launchpad.net/bugs/1391953 Title: [Ubuntu 14.04.02] Kernel patches for PowerKVM host and guest Status in “linux” package in Ubuntu: New Bug description: ---Problem Description--- We need the patches below to support Ubuntu 14.04.02 on PowerKVM host / guest. PowerKVM host : 831cf65b0295de75f40f8cf52ce62e5d261dab4f powerpc/powernv: Check OPAL dump calls exist before using 7dc992ec7b3fd875b05f49f454a922ee94af330b powerpc/powernv: Check OPAL elog calls exist before using 035ed26fb090ff3277900259f19d57e54da2e116 powerpc/powernv: Check OPAL RTC calls exists before using bffe6bda342578deea0b74f2d9cb97cc40585a1b powerpc/powernv: Add OPAL check token call cdd91b89adedb77e3e581c40788620790edc33b5 powerpc/powernv: Improve error messages in dump code 76215b04fd297c008b91ece732ed36e67e0181fa arch/powerpc/platforms/powernv/opal-dump.c: fix world-writable sysfs files 6656c21ca10e54a84673d0ec2f0cf5f676e66a40 arch/powerpc/platforms/powernv/opal-elog.c: fix world-writable sysfs files Guest: e36d1227776a2daa2c9aa7f997ac7083d6783f2c pseries: Fix endian issues in cpu hot-removal 822e71224e07f07a07c385be869fe416ce436430 pseries: Fix endian issues in onlining cpu threads c9ac408bc7329911237c25508f578fb2fa1c4235 powerpc/pseries: Fix endian issues in memory hotplug 587870e8650a0571e895cc879cd895c78c6391bf powerpc/pseries/hvcserver: Fix endian issue in hvcs_get_partner_info d6f1e7abdb95a7ea031e7604829e4b5514d7e2c1 powerpc/pseries: Make CPU hotplug path endian safe PowerVM: PowerVM patch : 408cddd96e3b155337f9e3aba2198e92e94c6068 powerpc/fadump: Fix endianess issues in firmware assisted dump handling Note that these are the required patches on top of the Ubuntu 14.10 kernel, and the listed commit IDs are from upstream.
-Vasant
[Kernel-packages] [Bug 1380432] Re: 24x7 counters: Bug in catalog_read()
** Tags removed: verification-needed-utopic ** Tags added: verification-done-utopic -- https://bugs.launchpad.net/bugs/1380432 Title: 24x7 counters: Bug in catalog_read() Status in “linux” package in Ubuntu: Fix Released Status in “linux” source package in Trusty: Invalid Status in “linux” source package in Utopic: Fix Committed Status in “linux” source package in Vivid: Fix Released Bug description: SRU Justification: [Impact] When using POWER8 in HV mode one cannot access /sys/bus/event_source/devices/hv_24x7/interface/catalog [Test Case] cp /sys/bus/event_source/devices/hv_24x7/interface/catalog /tmp ls -l /tmp/catalog # this should show a file with a size > 0 [Fix] 56f12bee55d740dc47eed0ca9d5c72cffdffd6cf which is in 3.18-rc1 It is a clean cherry pick into 3.16/3.13. -- ---Problem Description--- 24x7 counters: Bug in catalog_read() ---Additional Hardware Info--- Power8 system with support for 24x7 counters Machine Type = Power8 IBM,9119-MME ---Steps to Reproduce--- cp /sys/bus/event_source/devices/hv_24x7/interface/catalog /tmp ls -l /tmp/catalog Shows a file size 0 for the /tmp/catalog. We recently discovered a bug in the upstream Linux kernel. It was fixed by this patch https://lkml.org/lkml/2014/10/1/35 and that patch is scheduled to be included in powerpc tree. When it is included, we should back port to Ubuntu 14.10 or next release.
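The [Test Case] above can be wrapped so that verification yields a pass/fail exit status instead of relying on reading `ls -l` output by eye. This is only a sketch of that check: the function name `check_catalog_copy` is made up for illustration, and the default paths are the ones quoted in the test case.

```shell
#!/bin/bash
# Copy the 24x7 catalog (or any file) and print the copied size in bytes.
# Returns 0 if the copy is non-empty, 1 if it is empty, 2 if cp fails.
check_catalog_copy() {
    local src="${1:-/sys/bus/event_source/devices/hv_24x7/interface/catalog}"
    local dst="${2:-/tmp/catalog}" size
    cp -- "$src" "$dst" || return 2
    size=$(wc -c < "$dst" | tr -d ' ')
    echo "$size"
    [ "$size" -gt 0 ]
}
```

On an affected kernel the function would print 0 and return 1, matching the "file size 0" symptom in the problem description.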
[Kernel-packages] [Bug 1387523] Re: CXL: Fix PSL error due to duplicate segment table entries
** Package changed: ubuntu => linux (Ubuntu) -- https://bugs.launchpad.net/bugs/1387523 Title: CXL: Fix PSL error due to duplicate segment table entries Status in “linux” package in Ubuntu: Incomplete Bug description: Problem Description == CXL: Fix PSL error due to duplicate segment table entries ---uname output--- 3.16.0-23-generic Machine Type = POWER8 + CAPI card Steps to Reproduce == stress test surelock application with HTX == Comment: #1 - Michael Neuling - == Fix is already upstream. Need to pull these 4 patches from mainline. % git log --oneline bf19edd290..eb01d4c238 eb01d4c cxl: Fix PSL error due to duplicate segment table entries 03f5439 powerpc/mm: Use appropriate ESID mask in copro_calculate_slb() b03a7f5 cxl: Refactor cxl_load_segment() and find_free_sste() 5100a9d cxl: Disable secondary hash in segment table Already fixed upstream, just need Canonical to cherry-pick the 4 patches.
[Kernel-packages] [Bug 1386587] Re: Bridge Test, please ignore
** Changed in: linux (Ubuntu) Status: Incomplete => Invalid -- https://bugs.launchpad.net/bugs/1386587 Title: Bridge Test, please ignore Status in “linux” package in Ubuntu: Invalid Bug description: just test
[Kernel-packages] [Bug 1380432] Re: 24x7 counters: Bug in catalog_read()
** Changed in: linux (Ubuntu) Status: Incomplete => Confirmed -- https://bugs.launchpad.net/bugs/1380432 Title: 24x7 counters: Bug in catalog_read() Status in “linux” package in Ubuntu: Confirmed Bug description: ---Problem Description--- 24x7 counters: Bug in catalog_read() ---Additional Hardware Info--- Power8 system with support for 24x7 counters Machine Type = Power8 IBM,9119-MME ---Steps to Reproduce--- cp /sys/bus/event_source/devices/hv_24x7/interface/catalog /tmp ls -l /tmp/catalog Shows a file size 0 for the /tmp/catalog. We recently discovered a bug in the upstream Linux kernel. It was fixed by this patch https://lkml.org/lkml/2014/10/1/35 and that patch is scheduled to be included in powerpc tree. When it is included, we should back port to Ubuntu 14.10 or next release.
[Kernel-packages] [Bug 1380432] Re: 24x7 counters: Bug in catalog_read()
** Package changed: ubuntu => linux (Ubuntu) ** Changed in: linux (Ubuntu) Assignee: (unassigned) => Taco Screen team (taco-screen-team) -- https://bugs.launchpad.net/bugs/1380432 Title: 24x7 counters: Bug in catalog_read() Status in “linux” package in Ubuntu: New Bug description: ---Problem Description--- 24x7 counters: Bug in catalog_read() ---Additional Hardware Info--- Power8 system with support for 24x7 counters Machine Type = Power8 IBM,9119-MME ---Steps to Reproduce--- cp /sys/bus/event_source/devices/hv_24x7/interface/catalog /tmp ls -l /tmp/catalog Shows a file size 0 for the /tmp/catalog. We recently discovered a bug in the upstream Linux kernel. It was fixed by this patch https://lkml.org/lkml/2014/10/1/35 and that patch is scheduled to be included in powerpc tree. When it is included, we should back port to Ubuntu 14.10 or next release.
[Kernel-packages] [Bug 1378413] Re: Feature request for Core infrastructure for Ubuntu 14.10
** Package changed: ubuntu => linux (Ubuntu) ** Changed in: linux (Ubuntu) Assignee: (unassigned) => Taco Screen team (taco-screen-team) -- https://bugs.launchpad.net/bugs/1378413 Title: Feature request for Core infrastructure for Ubuntu 14.10 Status in “linux” package in Ubuntu: New Bug description: -- Problem Description -- Feature request to track the Core infrastructure fixes for Power Virt stability into Ubuntu 14.10. Talked to Nathan Fontenot, and Benjamin Herrenschmidt pushed the sparse vmemmap patches into his next tree. IIRC, now it's time to ask Canonical to include those as a feature request for 14.10. 14.10 uses a relatively new kernel (3.16), so the upstream patches could be applied directly. I applied the patches, and the guest could boot without any issues. So I attached all 4 patches here.
[Kernel-packages] [Bug 1375441] Re: 'perf record' fails with "Perf session creation failed"
** Package changed: ubuntu => linux (Ubuntu) ** Changed in: linux (Ubuntu) Status: New => Confirmed -- https://bugs.launchpad.net/bugs/1375441 Title: 'perf record' fails with "Perf session creation failed" Status in “linux” package in Ubuntu: Confirmed Bug description: ---Problem Description--- When run as a normal user, 'perf record ls' fails with: Perf session creation failed. The command runs successfully when run as root. ---uname output--- Linux ubuntu 3.16.0-16-generic #22-Ubuntu SMP Wed Sep 17 18:45:43 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux Machine Type = POWER8E pvr 004b 0201 > Maynard, > Can you check if /proc/sys/kernel/kptr_restrict has 1 in it on your system? Yes, the value is '1'. > On my system it does have 1 and I can repro. The following seems to fix it: > echo 0 > /proc/sys/kernel/kptr_restrict Yes, changing it to '0' does "fix" it, although that shouldn't be necessary, as you know. So something is broken. I am able to repro with the 3.17.0-rc4 based mainline perf tool. I suspect that the change in behavior was introduced unintentionally by this upstream commit. machine__create_kernel_maps() now calls machine__get_kernel_start_addr(), which checks the kptr_restrict state. --- commit a93f0e551af9e194db38bfe16001e17a3a1d189a Author: Simon Que Date: Mon Jun 16 11:32:09 2014 -0700 perf symbols: Get kernel start address by symbol name This is being fixed by a recent upstream commit: https://lkml.org/lkml/2014/9/27/26 Will Ubuntu pick that fix up automatically?
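The exchange above traces the failure to `/proc/sys/kernel/kptr_restrict` being set to 1. A minimal sketch for inspecting that setting before running `perf record` as a normal user follows; the function name `describe_kptr_restrict` is made up for illustration, and the value descriptions follow the kernel's sysctl documentation rather than anything stated in this report.

```shell
#!/bin/bash
# Print the kptr_restrict value read from the given file (default: the
# real procfs path) together with a short description of what it means.
describe_kptr_restrict() {
    local path="${1:-/proc/sys/kernel/kptr_restrict}" value
    value=$(cat "$path") || return 2
    case "$value" in
        0) echo "kptr_restrict=0: no restriction from this setting" ;;
        1) echo "kptr_restrict=1: kernel pointers hidden from users without CAP_SYSLOG" ;;
        2) echo "kptr_restrict=2: kernel pointers hidden from all users" ;;
        *) echo "kptr_restrict=$value: unrecognized value"; return 1 ;;
    esac
}
```

Reading the value this way changes nothing on the system; the `echo 0 > /proc/sys/kernel/kptr_restrict` workaround quoted above requires root and is not a fix for the perf regression itself.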
[Kernel-packages] [Bug 1372975] Re: ISST-KVM : Ubuntu 14.04 guest actg1 running TCP and IO hanging
** Package changed: ubuntu => linux (Ubuntu) ** Changed in: linux (Ubuntu) Status: New => Confirmed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1372975 Title: ISST-KVM : Ubuntu 14.04 guest actg1 running TCP and IO hanging Status in “linux” package in Ubuntu: Confirmed Bug description: Problem Description: -- PowerKVM guest actg1 is hanging while running IO and TCP tests. Unable to ssh or virsh console to the guest. While running IO and TCP tests the guest actg1 is giving below CPU stall trace : [16091.824841] INFO: rcu_sched self-detected stall on CPU { 1} (t=670218 jiffies g=122793 c=122792 q=6344067) [16154.853532] INFO: rcu_sched self-detected stall on CPU { 1} (t=676521 jiffies g=122793 c=122792 q=6345885) [16217.882223] INFO: rcu_sched self-detected stall on CPU { 1} (t=682824 jiffies g=122793 c=122792 q=6347609) [16280.910913] INFO: rcu_sched self-detected stall on CPU { 1} (t=689127 jiffies g=122793 c=122792 q=6349994) [16343.939603] INFO: rcu_sched self-detected stall on CPU { 1} (t=695430 jiffies g=122793 c=122792 q=6351789) [16406.968293] INFO: rcu_sched self-detected stall on CPU { 1} (t=701733 jiffies g=122793 c=122792 q=6353519) [16469.996983] INFO: rcu_sched self-detected stall on CPU { 1} (t=708036 jiffies g=122793 c=122792 q=6355275) [16533.025674] INFO: rcu_sched self-detected stall on CPU { 1} (t=714339 jiffies g=122793 c=122792 q=6357064) [16596.054364] INFO: rcu_sched self-detected stall on CPU { 1} (t=720642 jiffies g=122793 c=122792 q=6358808) [16659.083056] INFO: rcu_sched self-detected stall on CPU { 1} (t=726945 jiffies g=122793 c=122792 q=6360538) [16722.111749] INFO: rcu_sched self-detected stall on CPU { 1} (t=733248 jiffies g=122793 c=122792 q=6362273) XML file : - actg1 01a4e83f-2275-43b2-a4b3-31ada581141d 4194304 4194304 2 /machine hvm power8 destroy restart restart /usr/bin/qemu-system-ppc64 ===> SAN (coho) > LocalCEC > 
LocalCEC > iscsi > iscsi system_u:system_r:svirt_t:s0:c405,c741 system_u:object_r:svirt_image_t:s0:c405,c741 Host configuration: --- [root@actkvm home]# cat /etc/issue IBM_PowerKVM release 2.1.0 build 27 service (pkvm2_1) Kernel \r on a \m (\l) [root@actkvm home]# virsh version Compiled against library: libvirt 1.1.3 Using library: libvirt 1.1.3 Using API: QEMU 1.1.3 Running hypervisor: QEMU 1.6.0 Dropped into xmon, much of the same, soft lockups between raw_spin_lock and shrink_dentry_list. The guest manages to get a fair bit into its syslog before locking up, I think the two most interesting bits are Sep 18 05:44:46 actg1 kernel: [ 8349.373013] [ cut here ] Sep 18 05:44:46 actg1 kernel: [ 8349.373019] WARNING: at /build/buildd/linux-3.13.0/fs/dcache.c:362 Sep 18 05:44:46 actg1 kernel: [ 8349.373020] Modules linked in: rpcsec_gss_krb5 nfsv4 dm_round_robin dm_multipath scsi_dh pseries_rng nfsd auth_rpcgss nfs_acl nfs lockd sunrpc fscache rtc_generic virtio_scsi Sep 18 05:44:46 actg1 kernel: [ 8349.373036] CPU: 1 PID: 22289 Comm: host01 Not tainted 3.13.0-35-generic #62-Ubuntu Sep 18 05:44:46 actg1 kernel: [ 8349.373038] task: c000fcb4 ti: c000faec task.ti: c000faec Sep 18 05:44:46 actg1 kernel: [ 8349.373040] NIP: c0274980 LR: c02757f8 CTR: c02e2f90 Sep 18 05:44:46 actg1 kernel: [ 8349.373042] REGS: c000faec3590 TRAP: 0700 Not tainted (3.13.0-35-generic) Sep 18 05:44:46 actg1 kernel: [ 8349.373042] MSR: 80029033 CR: 2848 XER: 2000 Sep 18 05:44:46 actg1 kernel: [ 8349.373048] CFAR: c0274900 SOFTE: 1 Sep 18 05:44:46 actg1 kernel: [ 8349.373048] GPR00: c02757f8 c000faec3810 c164f6a0 c000f9b1c840 Sep 18 05:44:46 actg1 kernel: [ 8349.373048] GPR04: c000faec39b8 01af2460 00
[Kernel-packages] [Bug 1370425] Re: kernel bug seen while try to use madvise system call with MADV_HWPOISON mode
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1370425 Title: kernel bug seen while try to use madvise system call with MADV_HWPOISON mode Status in “linux” package in Ubuntu: Incomplete Bug description: Problem Description kernel bug seen while try to use madvise system call with MADV_HWPOISON mode ---uname output--- Linux u10thp 3.16.0-9-generic #14-Ubuntu SMP Fri Aug 15 15:03:36 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux Machine Type = Power 8 Steps to Reproduce 1. Install Ubuntu 14.10 guest on PowerKVM. 2. Setup hugepage backing guest VM. 3. Try madv_poison.c code to test madvise sys. call with HWPOISON mode(test code is attached). gcc -o madv_poison madv_poison.c ./madv_poison -C -i 1 (1 - shm_test) Ubuntu 14.10 LE throws kernel bug : root@u10thp:~# ./madv_poison -C -i 1 vm.memory_failure_early_kill = 0 [pid 2301] start page-poisoning test [pid 2301] there are 1 shm_child [pid 2301] have spawned 1 processes [pid 2301] wait for Pid 2304 [pid 2304] shm dirty poisoning page 0x3fffa7ce [ 7905.009001] Injecting memory failure for page 0xe6a7 at 0x3fffa7ce [ 7905.009359] MCE 0xe6a7: dirty LRU page recovery: Recovered [pid 2304] writing 2 [ 7905.009901] [ cut here ] [ 7905.010164] kernel BUG at /build/buildd/linux-3.16.0/arch/powerpc/mm/fault.c:180! 
[ 7905.010396] Oops: Exception in kernel mode, sig: 5 [#234] [ 7905.010438] SMP NR_CPUS=2048 NUMA pSeries [ 7905.010480] Modules linked in: pseries_rng rtc_generic ohci_pci [ 7905.010614] CPU: 0 PID: 2304 Comm: madv_poison Tainted: G D 3.16.0-9-generic #14-Ubuntu [ 7905.010686] task: c000e0a92a60 ti: c000e09e8000 task.ti: c000e09e8000 [ 7905.010746] NIP: c09e3314 LR: c09e2e54 CTR: [ 7905.010864] REGS: c000e09eb990 TRAP: 0700 Tainted: G D (3.16.0-9-generic) [ 7905.010924] MSR: 80029033 CR: 28002882 XER: [ 7905.011125] CFAR: c09e3170 SOFTE: 1 GPR00: c09e2e54 c000e09ebc10 c13742e0 0010 GPR04: c000e0b37ff8 3fffa7ce 00a9 GPR08: 0010 c000e0a92a60 0020 GPR12: 48002884 cfe4 GPR16: GPR20: 00a9 c000e0597a40 c000e022b060 GPR24: 0010 c000e022b000 c0009568 3fffa7ce GPR28: 0200 c000e09ebea0 [ 7905.012189] NIP [c09e3314] do_page_fault+0x984/0x990 [ 7905.012241] LR [c09e2e54] do_page_fault+0x4c4/0x990 [ 7905.012281] Call Trace: [ 7905.012361] [c000e09ebc10] [c09e2e54] do_page_fault+0x4c4/0x990 (unreliable) [ 7905.012434] [c000e09ebe30] [c0009568] handle_page_fault+0x10/0x30 [ 7905.012494] Instruction dump: [ 7905.012580] e92d0290 e8690460 38630060 4b7274d9 6000 e93f0108 3bc0 792a97e3 [ 7905.012683] 4082f77c 3bc9 6000 4bfff774 <0fe0> 3c4c0099 [ 7905.012845] ---[ end trace a48a199a061eed79 ]--- [ 7905.019084] [pid 2301] Ins 0: Pid 2304: failed - shared memory test [pid 2301]!!! Page Poisoning Test is FAILED (1 failures found). !!! [pid 2301] page-poisoning test done! root@u10thp:~# == Comment: #1 - Kalpana Shetty - == The test code works fine with x86/Ubuntu VM so if it is not supported on power then it should have thrown an error not supported as it does with PowerKVM / RHEL 7 VM. Intel/Ubuntu 14.04 VM: => Working fine. 
root@u04vm14:~# ./madv_poison -C -i 1 (shm_test case) vm.memory_failure_early_kill = 0 [pid 7325] start page-poisoning test [pid 7325] there are 1 shm_child [pid 7325] have spawned 1 processes [pid 7325] wait for Pid 7328 [pid 7328] shm dirty poisoning page 0x7f60ca8ea000 [pid 7328] writing 2 [pid 7328] signal 7 code 4 addr 0x7f60ca8ea000 [pid 7328] pass: recovered [pid 7325] Ins 0: Pid 7328: pass - shared memory test [pid 7325]!!! Page Poisoning Test got PASS. !!! [pid 7325] page-poisoning test done! PowerKVM / RHEL 7 VM: [root@rhel7-web-VM1 ~]# ./madv_poison -C -i 1 sysctl: cannot stat /proc/sys/vm/memory_failure_early_kill: No such file or directory [pid 11512] start page-poisoning test [pid 11512] there are 1 shm_child [pid 11512] have spawned 1 processes [pid 11514] shm dirty poisoning page 0x3fff84d6 [pid 11512] wait for Pid 11514 [pid 11514] failed: Kernel doesn't support poison in
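The RHEL 7 output above shows `sysctl` failing to stat `/proc/sys/vm/memory_failure_early_kill`, followed by the test reporting that the kernel does not support poisoning. A test harness could check for that sysctl up front and skip cleanly. The sketch below assumes, without confirmation from this report, that the file's presence tracks memory-failure support in the running kernel; the function name is hypothetical.

```shell
#!/bin/bash
# Succeed if the memory-failure sysctl exists under the given procfs root.
# Its absence matches the "cannot stat" message in the RHEL 7 output.
memory_failure_sysctl_present() {
    local proc_root="${1:-/proc}"
    [ -e "$proc_root/sys/vm/memory_failure_early_kill" ]
}
```

A wrapper could then run `memory_failure_sysctl_present || echo "skipping: memory failure handling not available"` before invoking `./madv_poison`.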
[Kernel-packages] [Bug 1365655] Re: ISST-KVM: R2-0: xmon disabled in Ubuntu
** Changed in: linux (Ubuntu) Status: Incomplete => Confirmed -- https://bugs.launchpad.net/bugs/1365655 Title: ISST-KVM: R2-0: xmon disabled in Ubuntu Status in “linux” package in Ubuntu: Confirmed Bug description: ---Problem Description--- The Ubuntu kernels currently have xmon disabled. xmon is a standard debugging interface on Power platforms and is expected to be compiled in so that it can be enabled for debugging. Please enable CONFIG_XMON. ---uname output--- Linux actg1 3.13.0-34-generic #60-Ubuntu SMP Wed Aug 13 15:45:54 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux Machine Type = 8247-22L ---Steps to Reproduce--- Run the following command: echo 1 > /proc/sys/kernel/sysrq Try to invoke xmon by inputting ctrl + o, x. The system should drop into xmon, but the following message is printed instead: "[229073.090037] SysRq : This sysrq operation is disabled." 14.10 is also the same: root@hatg5:~# grep XMON /boot/config-3.16.0-8-generic # CONFIG_XMON is not set root@hatg5:~# uname -a Linux hatg5 3.16.0-8-generic #13-Ubuntu SMP Wed Aug 13 17:11:23 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux
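The `grep XMON /boot/config-...` check in the report can be made exact so that related options such as `CONFIG_XMON_DEFAULT` do not produce a false match. This is a sketch only; `kconfig_state` is a hypothetical helper name, and the default config path assumes the usual `/boot/config-$(uname -r)` naming seen in the report.

```shell
#!/bin/bash
# Print the state of one kernel config option from a config file:
# its value (for example "y" or "m"), or "not set" if it is disabled
# or absent. Returns 2 if the config file cannot be read.
kconfig_state() {
    local option="$1" config="${2:-/boot/config-$(uname -r)}" line
    [ -r "$config" ] || return 2
    line=$(grep -E "^(# )?${option}(=| is not set)" "$config" | head -n 1)
    case "$line" in
        "${option}="*) echo "${line#*=}" ;;
        *) echo "not set" ;;
    esac
}
```

On the kernels quoted above, `kconfig_state CONFIG_XMON` would be expected to print `not set`, given the `# CONFIG_XMON is not set` line shown in the grep output.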
[Kernel-packages] [Bug 1334793] Re: PowerVM: ubuntu-14.04 stuck in ibm, client-architecture-support reboot loop
** Changed in: linux (Ubuntu) Status: Incomplete => Confirmed -- https://bugs.launchpad.net/bugs/1334793 Title: PowerVM: ubuntu-14.04 stuck in ibm,client-architecture-support reboot loop Status in “linux” package in Ubuntu: Confirmed Bug description: ---Problem Description--- This bug is a follow-up of Bug#110009. The problem here is that after a successful installation of Ubuntu 14.04, the system fails to boot the OS. When I select the installed OS "Ubuntu" from the grub menu, it throws the error below and falls back to the GRUB menu again and again: .. .. OF stdout device is: /vdevice/vty@3000 Preparing to boot Linux version 3.13.0-24-generic (buildd@fisher04) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #46-Ubuntu SMP Thu Apr 10 19:09:21 UTC 2014 (Ubuntu 3.13.0-24.46-generic 3.13.9) Detected machine type: 0101 Max number of cores passed to firmware: 256 (NR_CPUS = 2048) Calling ibm,client-architecture-support... ibm,sp /vdevice/IBM,sp@4000 \ ibm,sp /vdevice/IBM,sp@4000 \ Elapsed time since release of system processors: 10675 mins 3 secs error: no suitable video mode found. Machine Type = IBM,8286-42A Mach serial number: 1069B3T ---Steps to Reproduce--- 1) Try virtual DVD installation 2) Boot the LPAR via cdrom, i.e. try the commands below from the openfirmware prompt: 0> devalias cdrom /vdevice/v-scsi@3010/disk@8300 0> boot cdrom 3) Proceed with the default installation; 100% installation completes. Boot the installed OS from hard disk and you hit this "no suitable video mode found" error and the system falls back to GRUB. Install method: virtual DVD Install ISO Information: ubuntu-14.04-server-ppc64el.iso *Additional Instructions for backup: khusr...@in.ibm.com: -Post a private note with access information to the machine that the bug is occurring on. == Comment: #1 - David Heller - 2014-06-11 15:32:48 == This seems to be a pretty common problem, albeit with multiple causes. 
I see that you already tried some of the workarounds in a similar bug here: https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/699802 Still, it seems likely this is due to the video mode that grub is attempting to use for the splash screen. Either that, or it's a video driver issue at the grub level. Can you please try the procedure in comment #4 here: http://.ubuntuforums.org/showthread.php?t=1471399 This involves booting with an alternate method, such as a live CD, and operating on the grub config files in the newly installed system. If you are unable to update the files, or have no success doing so, it would at least be good to collect the files and attach the information here before sending this back to Launchpad. If you are successful in booting to a live CD on the same system where the boot fails, it would be good to compare the files from the live CD to those in the failing system. Also, do you mind installing with Ubuntu 14.10 alpha images? http://cdimage.ubuntu.com/ubuntu-server/daily/current/utopic-server-ppc64el.iso
[Kernel-packages] [Bug 1365655] Re: ISST-KVM: R2-0: xmon disabled in Ubuntu
** Package changed: ubuntu => linux (Ubuntu)

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1365655

Title: ISST-KVM: R2-0: xmon disabled in Ubuntu

Status in “linux” package in Ubuntu: Incomplete

Bug description:
---Problem Description---
The Ubuntu kernels currently have xmon disabled. xmon is a standard debugging interface on Power platforms and is expected to be compiled in so that it can be enabled for debugging. Please enable CONFIG_XMON.

---uname output---
Linux actg1 3.13.0-34-generic #60-Ubuntu SMP Wed Aug 13 15:45:54 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = 8247-22L

---Steps to Reproduce---
Run the following command:
echo 1 > /proc/sys/kernel/sysrq
Try to invoke xmon by inputting ctrl + o, x
System should drop into xmon, but the following message is printed instead:
"[229073.090037] SysRq : This sysrq operation is disabled."

14.10 is also the same:
root@hatg5:~# grep XMON /boot/config-3.16.0-8-generic
# CONFIG_XMON is not set
root@hatg5:~# uname -a
Linux hatg5 3.16.0-8-generic #13-Ubuntu SMP Wed Aug 13 17:11:23 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux
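The grep check in the xmon report (Bug 1365655) can also be scripted. The following is only a minimal sketch in Python, which the report itself does not use: the helper name `kconfig_option_state` and the sample config text are hypothetical, and only the `# CONFIG_XMON is not set` line is taken from the report.

```python
def kconfig_option_state(config_text: str, option: str) -> str:
    """Return the value assigned to a Kconfig option, or 'not set'."""
    for raw in config_text.splitlines():
        line = raw.strip()
        # Disabled options are recorded as a comment in /boot/config-*.
        if line == f"# {option} is not set":
            return "not set"
        # Enabled options appear as OPTION=y, OPTION=m, or OPTION=<value>.
        if line.startswith(f"{option}="):
            return line.split("=", 1)[1]
    # An option that is absent altogether is treated as disabled here.
    return "not set"


# Hypothetical sample; only the CONFIG_XMON line mirrors the bug report.
sample = """\
CONFIG_PPC64=y
# CONFIG_XMON is not set
CONFIG_MAGIC_SYSRQ=y
"""
print(kconfig_option_state(sample, "CONFIG_XMON"))
```

On a real system the text would presumably come from reading a file such as /boot/config-3.16.0-8-generic, as in the grep command quoted in the report.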
[Kernel-packages] [Bug 1365140] [NEW] test ; please ignore
Public bug reported: This is a test. Will close soon. ** Affects: linux (Ubuntu) Importance: Undecided Status: Invalid ** Changed in: linux (Ubuntu) Status: New => Invalid -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1365140 Title: test ; please ignore Status in “linux” package in Ubuntu: Invalid Bug description: This is a test. Will close soon. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1365140/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1364427] Re: kexeced kernel hung
** Package changed: ubuntu => kexec-tools (Ubuntu)

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to kexec-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1364427

Title: kexeced kernel hung

Status in “kexec-tools” package in Ubuntu: New

Bug description:
---Problem Description---
kexec is not working with kernel 3.16 or later. The root cause is a missing patch in the kexec-tools package provided with Ubuntu 14.04 and 14.10.

---uname output---
Linux qemu 3.16.0-10-generic #15-Ubuntu SMP Thu Aug 21 16:32:31 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = P8

---Steps to Reproduce---
On a 14.10 guest or host, run:
$ sudo kexec -l /boot/vmlinux --initrd=/boot/initrd.img-3.16.0-10-generic
$ sudo kexec -e

The kexeced kernel will hang here:
[ 300.002862] Starting new kernel
I'm in purgatory

The root cause is that the following kexec-tools patch is missing:
2ca220389d21 kexec/ppc64: move to device tree version 17

This patch is required to kexec kernel 3.16 and later.

Along with this patch, the following upstream patches should be added to the current package:
335bad77fb07 kexec/ppc64: disabling exception handling when building the purgato
90853885a859 ppc64/purgatory: Device tree values should be read/stored in Big En

In addition, the following incoming patch should be applied:
ppc64/kdump: Fix ELF header endianess (http://lists.infradead.org/pipermail/kexec/2014-July/012247.html)

Another option to consider is to move to kexec-tools 2.0.7 and apply the missing patches.

Is it possible to get these patches also applied to the kexec-tools shipped with 14.04?
[Kernel-packages] [Bug 1361364] Re: The Crocodile bar overlap
** Package changed: ubuntu => linux (Ubuntu)

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1361364

Title: The Crocodile bar overlap

Status in “linux” package in Ubuntu: Incomplete

Bug description:
Canonical,

Is it possible to accept the following patch into Canonical 14.10? It fixes a problem that is affecting a lot of machines on our side.

https://patchwork.ozlabs.org/patch/381836/

The patch has only just been submitted upstream, and I understand that it should be accepted into the upstream tree soon.

Thank you,
Breno
[Kernel-packages] [Bug 1360428] Re: Feature: Add zpool patchset to ubuntu 14.10
** Package changed: ubuntu => linux (Ubuntu)

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1360428

Title: Feature: Add zpool patchset to ubuntu 14.10

Status in “linux” package in Ubuntu: New

Bug description:
The zpool patch set allows zswap to use either zbud or zsmalloc for its compressed storage. The patches are upstream now and will be in 3.17.

All 4 patches have been backported to the Ubuntu 14.10 kernel; only patch 2 required a minor context change.

Taken from upstream patches (listed in reverse order of attached patches):
12d79d6 mm/zpool: update zswap to use zpool
c795779 mm/zpool: zbud/zsmalloc implement zpool
af8d417 mm/zpool: implement common zpool api to zbud/zsmalloc
99eef8e mm/zbud: change zbud_alloc size type to size_t
[Kernel-packages] [Bug 1359423] Re: Kernel patches for FSP dump
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1359423 Title: Kernel patches for FSP dump Status in “linux” package in Ubuntu: Incomplete Bug description: ---Problem Description--- A few kernel patches are needed for FSP dump to work on PowerNV host. Kindly include the attached backported patches into Ubuntu 14.10. These correspond to the below upstream commits: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=14c4000a88afaaa2d0877cc86d42a74fde0f35e0 https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b09c2ec4082c63584491f35df2cb530ee8ca312d Machine Type = POWER8 To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1359423/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1352990] Re: remap_4K_pfn() safety improvement needed for Ubuntu 14.10
** Package changed: ubuntu => linux (Ubuntu)

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1352990

Title: remap_4K_pfn() safety improvement needed for Ubuntu 14.10

Status in “linux” package in Ubuntu: New

Bug description:
== Comment: #0 - Brian Hart - 2014-08-04 17:41:57 ==
---Problem Description---
The current implementation of remap_4k_pfn() trusts that it's safe to map the PFN supplied by the requestor. But there may be PFNs that are not safe to map via remap_4k_pfn(). (For example, the addresses at which PCI MMIO regions are mapped in some hypervisor configurations.) When an unsafe PFN passes through remap_4k_pfn(), some address bits may be unknowingly dropped by the underlying remapping routines. When that happens the remap will appear to succeed, but any later attempt to use the mapping will checkstop the machine because the truncated target address is not present in the machine.

A patch has been submitted that will cause remap_4k_pfn() to detect and reject these unsafe requests:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-July/119179.html

Our project needs some form of this safety improvement in the Ubuntu 14.10 release.

---uname output---
Linux tul115p1 3.16.0-6-generic #11-Ubuntu SMP Mon Jul 28 02:00:45 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = 8286-42A

---Debugger---
A debugger is not configured

---Steps to Reproduce---
The problem requires a hypervisor that allows PCI MMIO regions to span above the 46-bit line, and a device driver that maps MMIO regions using remap_4k_pfn(). I can provide detailed instructions and a driver upon request.

Stack trace output: no

Oops output: no

System Dump Info:
The system is not configured to capture a system dump.
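The failure mode described in the remap_4k_pfn() report (Bug 1352990) can be illustrated with a little arithmetic. This is a sketch under stated assumptions, not the submitted kernel patch: it is written in Python rather than kernel C, it assumes that the "46-bit line" in the report refers to the physical address width that survives remapping, and every name in it (`pfn_fits`, `checked_remap_target`, `truncated_target`) is hypothetical.

```python
PAGE_SHIFT_4K = 12   # 4K pages: the low 12 address bits are the page offset
ADDR_BITS = 46       # assumption: the "46-bit line" named in the report


def pfn_fits(pfn: int, addr_bits: int = ADDR_BITS) -> bool:
    """Return True if the address of this 4K PFN needs at most addr_bits bits."""
    return pfn >= 0 and (pfn << PAGE_SHIFT_4K) >> addr_bits == 0


def checked_remap_target(pfn: int) -> int:
    """Return the target address, or raise instead of silently truncating."""
    if not pfn_fits(pfn):
        raise ValueError(f"PFN {pfn:#x} lies above the {ADDR_BITS}-bit line")
    return pfn << PAGE_SHIFT_4K


def truncated_target(pfn: int) -> int:
    """Address an unchecked path would use if the high bits were dropped."""
    return (pfn << PAGE_SHIFT_4K) & ((1 << ADDR_BITS) - 1)


safe_pfn = (1 << 34) - 1      # last 4K page below the line
unsafe_pfn = 1 << 34          # first 4K page above it
print(pfn_fits(safe_pfn), pfn_fits(unsafe_pfn))
print(hex(truncated_target(unsafe_pfn)))
```

The point of the sketch is the contrast between the last two helpers: rejecting the request up front makes the error visible at mapping time, whereas dropping the high bits yields a different address that only fails when the mapping is used.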
[Kernel-packages] [Bug 1354024] Re: Running YCSB workload on MongoDB on Ubuntu 14.10 VM resulted in kernel bug
** Package changed: ubuntu => linux (Ubuntu)

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1354024

Title: Running YCSB workload on MongoDB on Ubuntu 14.10 VM resulted in kernel bug

Status in “linux” package in Ubuntu: Incomplete

Bug description:
== Comment: #0 - Kalpana Shetty - 2014-08-05 23:53:28 ==
---Problem Description---
Running YCSB workload on MongoDB on Ubuntu 14.10 VM resulted in kernel bug

---uname output---
root@u10vm15:~# uname -a
Linux u10vm15 3.16.0-6-generic #11-Ubuntu SMP Mon Jul 28 02:00:45 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux

---Additional Hardware Info---
Power 8 - Tuleta

Machine Type = POWER 8

---System Hang---
The Ubuntu 14.10 LE guest needs to be restarted when this issue is seen.

Steps to reproduce:
- Install Ubuntu 14.10 on 2 VMs (July 30th build)
- Run MongoDB 2.6.2 on one of the PowerKVM VMs
- Run YCSB 0.1.4 on the other VM
- Create a 1 million record load on MongoDB using YCSB; allow it to run for 4 to 5 hours or so.

Setup details:
- MongoDB server on one VM (version: 2.6.2)
- YCSB workload running on one VM (YCSB version - ycsb-0.1.4)

uname on Host:
[root@powerkvm5-lp1 ~]# uname -a
Linux powerkvm5-lp1.austin.ibm.com 3.10.42-2004.pkvm2_1_1.8.ppc64 #1 SMP Fri Jul 18 11:20:03 CDT 2014 ppc64 ppc64 ppc64 GNU/Linux

uname on Guest OS:
root@u10vm15:~# uname -a
Linux u10vm15 3.16.0-6-generic #11-Ubuntu SMP Mon Jul 28 02:00:45 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux

[23001.071911] ------------[ cut here ]------------
[23001.071922] kernel BUG at /build/buildd/linux-3.16.0/fs/dcache.c:1626!
[23001.072917] Oops: Exception in kernel mode, sig: 5 [#1] [23001.073620] SMP NR_CPUS=2048 NUMA pSeries [23001.074149] Modules linked in: pseries_rng rtc_generic ohci_pci [23001.075162] CPU: 8 PID: 3384 Comm: updatedb.mlocat Not tainted 3.16.0-6-generic #11-Ubuntu [23001.076006] task: c6e0 ti: c00130364000 task.ti: c00130364000 [23001.076834] NIP: c02abc68 LR: c02abf90 CTR: c001f880 [23001.077650] REGS: c001303676d0 TRAP: 0700 Not tainted (3.16.0-6-generic) [23001.078468] MSR: 800100029033 CR: 24004842 XER: 2000 [23001.080432] CFAR: c02abf8c SOFTE: 1 [23001.080432] GPR00: c02abf90 c00130367950 c1346618 c5dd [23001.080432] GPR04: 1000 c5dcd170 0fcc [23001.080432] GPR08: 1000 0001 8803dabf05ff 0016eb0c [23001.080432] GPR12: 4400 cfe41c00 0010 [23001.080432] GPR16: 1001d660 010034e94ec0 0001 53d94034 [23001.080432] GPR20: 0001 3fffcaa1efb8 010034e842e0 [23001.080432] GPR24: 010034e94ec0 ff9c [23001.080432] GPR28: 0040 c5dd [23001.091266] NIP [c02abc68] d_instantiate+0x38/0xf0 [23001.091837] LR [c02abf90] d_splice_alias+0x60/0x1a0 [23001.092404] Call Trace: [23001.092692] [c00130367980] [c02abf90] d_splice_alias+0x60/0x1a0 [23001.093544] [c001303679c0] [c034c5b4] ext4_lookup+0xc4/0x1c0 [23001.094399] [c00130367a50] [c0299944] lookup_real+0x64/0xc0 [23001.095261] [c00130367a90] [c029a790] __lookup_hash+0x60/0x80 [23001.096106] [c00130367ae0] [c029d610] lookup_slow+0x70/0x110 [23001.096946] [c00130367b20] [c029ea08] path_lookupat+0x958/0x9a0 [23001.097804] [c00130367be0] [c029eaa8] filename_lookup+0x58/0x140 [23001.098648] [c00130367c30] [c02a2524] user_path_at_empty+0x84/0xe0 [23001.099580] [c00130367d20] [c02937e4] vfs_fstatat+0x84/0x140 [23001.100432] [c00130367d80] [c0293eb4] SyS_newlstat+0x34/0x60 [23001.101378] [c00130367e30] [c000a0fc] syscall_exit+0x0/0x7c [23001.102193] Instruction dump: [23001.102589] 7c0802a6 fbc1fff0 fbe1fff8 f8010010 f821ffd1 7c7e1b78 7c9f2378 6000 [23001.103945] 6000 e93e00b8 3149 7d2a4910 <0b09> 2fbf 419e0060 387f0088 [23001.105276] ---[ 
end trace b20dd6fbb5b21932 ]---
[23001.118598]
root@u10vm15:~#

After rebooting, I keep seeing the call traces below:

Ubuntu Utopic Unicorn (development branch) u10vm15 hvc0

u10vm15 login: root
Password:
Last login: Wed Aug 6 00:02:18 IST 2014 on hvc0
Welcome to Ubuntu Utopic Unicorn (development branch) (GNU/Linux 3.16.0-6-generic ppc64le)

 * Documentation: https://help.ubuntu.com/

[32950.678160] systemd-logind[1071]: Removed session c1.
[32950.694697] systemd-logind[1071]: New session c2 of user root.
[32950.703411] Unable to handle kernel paging request
[Kernel-packages] [Bug 1357014] Re: Add THP fixes in 14.10 kernel
** Package changed: ubuntu => linux (Ubuntu) ** Changed in: linux (Ubuntu) Status: New => Confirmed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1357014 Title: Add THP fixes in 14.10 kernel Status in “linux” package in Ubuntu: Confirmed Bug description: Hi Canonical, We would like to include 8 patches regarding Transparent Huge Page for powerpc. The patches are already upstream on Ben powerpc-next tree and should be sent to Linus' tree today, which mean that they are expected to make 3.17. Here are the patches 9e813308a5c18c58f9ccae1ec72ed4e14eaf9025 powerpc/thp: Add tracepoints to track hugepage invalidate 85c1fafd7262e68ad821ee1808686b1392b1167d powerpc/mm: Use read barrier when creating real_pte 7e467245bf5226db34c4b12d3cbacfa2f7a15a8b powerpc/thp: Use ACCESS_ONCE when loading pmdp 969b7b208f7408712a3526856e4ae60ad13f6928 powerpc/thp: Invalidate with vpn in loop fc0479557572375100ef16c71170b29a98e0d69a powerpc/thp: Handle combo pages in invalidate 629149fae478f0ac6bf705a535708b192e9c6b59 powerpc/thp: Invalidate old 64K based hash page mapping before insert of 4k pte fa1f8ae80f8bb996594167ff4750a0b0a5a5bb5d powerpc/thp: Don't recompute vsid and ssize in loop on invalidate b0aa44a3dfae3d8f45bd1264349aa87f87b7774f powerpc/thp: Add write barrier after updating the valid bit I am also adding the patches here for reference. Thank you, Breno To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1357014/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1356948] Re: ISST-KTE:PowerNV:UBUNTU14.10: Shiner Adapter ethernet port does not come up
** Package changed: ubuntu => linux (Ubuntu)

** Changed in: linux (Ubuntu)
Status: New => Confirmed

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1356948

Title: ISST-KTE:PowerNV:UBUNTU14.10: Shiner Adapter ethernet port does not come up

Status in “linux” package in Ubuntu: Confirmed

Bug description:
---Problem Description---
When trying to bring up an ethernet port on a Shiner Network adapter, the terminal outputs the following:

root@podkvm:~# ifconfig eth7 up
[ 3126.678507] bnx2x: [bnx2x_attn_int_deasserted2:4099(eth7)]CFC hw attention 0x2
[ 3126.678586] bnx2x: [bnx2x_attn_int_deasserted2:4102(eth7)]FATAL error from CFC
[ 3136.698592] bnx2x: [bnx2x_state_wait:308(eth7)]timeout waiting for state 1
[ 3136.698678] bnx2x: [bnx2x_setup_queue:8625(eth7)]Queue(0) SETUP failed
[ 3136.698688] bnx2x: [bnx2x_nic_load:2721(eth7)]Setup leading failed!
SIOCSIFFLAGS: Device or resource busy

modules loaded:
root@podkvm:~# lsmod
Module                  Size  Used by
rtc_generic             2711  0
powernv_rng             3244  0
ses                    10118  0
enclosure              12767  1 ses
mlx4_en               118002  0
bnx2x                 920334  0
lpfc                  836357  0
mlx4_core             311074  1 mlx4_en
mdio                    6270  1 bnx2x
libcrc32c               1995  1 bnx2x
ipr                   142194  2
be2net                144413  0
scsi_transport_fc      80636  1 lpfc
scsi_tgt               18399  1 scsi_transport_fc
vxlan                  48609  2 be2net,mlx4_en

---uname output---
Linux podkvm 3.15.0-6-generic #11-Ubuntu SMP Thu Jun 12 00:40:49 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux

---Additional Hardware Info---
System firmware: 1427A

Machine Type = 8247-22L

I can get the interface up on petitboot and on a 3.10 kernel used for PowerKVM installer. I can't do it for Ubuntu 14.04 kernel, based on 3.13. I can reproduce the same thing with a 3.13 kernel on another PowerNV system.

However, it works on a PCI passthrough environment on PowerKVM, with both 14.04 kernel and 14.10. So, this is specific to PowerNV. I will try upstream kernel versions and see if it's possible to find a culprit.
I collected this with msglevel options on the driver. The options used were: driver-specific SP, timer, interrupt, link, ifup, and probe.

Cascardo.

>>However, it works on a PCI passthrough environment on PowerKVM, with both
>>14.04 kernel and 14.10. So, this is specific to PowerNV. I will try upstream
>>kernel versions and see if it's possible to find a culprit.

I couldn't find Thadeu's kernel. I built the newer kernel + bnx2x driver from upstream, and still saw the same failure. I will look into it more.

Thanks,
Wendy

Shiner info:
root@podkvm:~# ethtool -i eth7
driver: bnx2x
version: 1.78.19-0
firmware-version: bc 7.10.4
[Kernel-packages] [Bug 1355469] Re: large bar patch for 14.10's kernel
** Changed in: ubuntu
Status: New => Confirmed

** Package changed: ubuntu => linux (Ubuntu)

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1355469

Title: large bar patch for 14.10's kernel

Status in “linux” package in Ubuntu: Confirmed

Bug description:
-- Problem Description --
Please apply the following patch to the utopic kernel:
https://git.kernel.org/cgit/linux/kernel/git/benh/powerpc.git/commit/?h=next&id=262af557dd750e94adcee3f450782c743f9a92d6

This will provide the 'large bar' feature for Ubuntu 14.10.
[Kernel-packages] [Bug 1353377] Re: ISST-KVM-UBUNTU1410-LE:kidkvm:ubuntu guest crash running IO stress tests.
** Package changed: ubuntu => linux (Ubuntu)

** Changed in: linux (Ubuntu)
Status: New => Confirmed

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1353377

Title: ISST-KVM-UBUNTU1410-LE:kidkvm:ubuntu guest crash running IO stress tests.

Status in “linux” package in Ubuntu: Confirmed

Bug description:
Problem Description
ubuntu guest crash running IO stress tests.

uname output
uname -a
Linux kidg1 3.16.0-6-generic #11-Ubuntu SMP Mon Jul 28 02:00:45 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux

System Hang
System hangs/crashed.

Steps to reproduce
1) Used below XML to define the guest
[root@kidkvm ~]# virsh dumpxml kidg1
kidg1 86748b80-a84b-4c70-b1b3-d2a23d88c7ab 4194304 4194304 4 /machine hvm power8 destroy restart restart /usr/bin/qemu-system-ppc64

2) Defined guest xml and started console to start the installation.
3) Installation succeeded.
4) Guest is up and running as below
[root@magkvm ~]# virsh list --all
 Id    Name     State
 26    kidg1    running

5) Initiated IO and BASE stress tests using LE scripts
6) Stress tests running
root@kidg1:~# slj
STAF running status:
Job ID  Job Name  Start Date-Time
1       IO        20140731-21:59:41
2       BASE      20140731-21:59:58

7) After 72 hours, tried to login kidg1 guest and it fails as below
[root@kidkvm ~]# ssh root@kidg1
ssh: connect to host kidg1 port 22: No route to host
[root@kidkvm ~]

8) Verified ping test and it failed
[root@kidkvm ~]# ping kidg1
PING kidg1.isst.aus.stglabs.ibm.com (10.33.4.124) 56(84) bytes of data.
From kidkvm.isst.aus.stglabs.ibm.com (10.33.18.47) icmp_seq=1 Destination Host Unreachable
From kidkvm.isst.aus.stglabs.ibm.com (10.33.18.47) icmp_seq=2 Destination Host Unreachable
From kidkvm.isst.aus.stglabs.ibm.com (10.33.18.47) icmp_seq=5 Destination Host Unreachable
From kidkvm.isst.aus.stglab

9) Tried to login the console using below command
[root@kidkvm ~]# virsh console --force kidg1
Connected to domain kidg1
--> It hangs for a long time

10) Verified logs on kidkvm host and did not see any errors
11) Restarted the guest, kidg1 by destroying it.
12) Seen Stack Traces in /var/log/syslog file on kidg1 guest as below

Aug 1 06:47:09 kidg1 kernel: [69250.533913] zram0: detected capacity change from 536870912 to 0
Aug 1 06:47:43 kidg1 kernel: [69284.609537] ------------[ cut here ]------------
Aug 1 06:47:43 kidg1 kernel: [69284.609802] WARNING: at /build/buildd/linux-3.16.0/block/blk-mq.c:727
Aug 1 06:47:43 kidg1 kernel: [69284.610298] Modules linked in: lz4_compress rpcsec_gss_krb5 nfsv4 nfsd auth_rpcgss nfs_acl nfs lockd sunrpc fscache bochs_drm ttm drm_kms_helper drm syscopyarea sysfillrect pseries_rng sysimgblt rtc_generic ohci_pci [last unloaded: zram]
Aug 1 06:47:43 kidg1 kernel: [69284.611757] CPU: 1 PID: 15 Comm: kworker/1:0H Not tainted 3.16.0-6-generic #11-Ubuntu
Aug 1 06:47:43 kidg1 kernel: [69284.611950] Workqueue: kblockd blk_mq_run_work_fn
Aug 1 06:47:43 kidg1 kernel: [69284.612110] task: c000fc9f6830 ti: c000fcbc4000 task.ti: c000fcbc4000
Aug 1 06:47:43 kidg1 kernel: [69284.612363] NIP: c04acc18 LR: c00c1eec CTR: c04ad220
Aug 1 06:47:43 kidg1 kernel: [69284.612536] REGS: c000fcbc7900 TRAP: 0700 Not tainted (3.16.0-6-generic)
Aug 1 06:47:43 kidg1 kernel: [69284.612725] MSR: 800100029033 CR: 22002044 XER:
Aug 1 06:47:43 kidg1 kernel: [69284.613198] CFAR: c04ad258 SOFTE: 1
Aug 1 06:47:43 kidg1 kernel: [69284.613198] GPR00: c00c1eec c000fcbc7b80 c1346618 c000f9e99000
Aug 1 06:47:43 kidg1 kernel: [69284.613198] GPR04: 0060 3aa318b68523
e1fc185ab7c99088 2
[Kernel-packages] [Bug 1352994] Re: remap_4K_pfn() safety improvement needed for Ubuntu 14.10
** Changed in: linux (Ubuntu)
Status: Incomplete => Confirmed

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1352994

Title: remap_4K_pfn() safety improvement needed for Ubuntu 14.10

Status in “linux” package in Ubuntu: Confirmed

Bug description:
== Comment: #0 - Brian Hart - 2014-08-04 17:41:57 ==
---Problem Description---
The current implementation of remap_4k_pfn() trusts that it's safe to map the PFN supplied by the requestor. But there may be PFNs that are not safe to map via remap_4k_pfn(). (For example, the addresses at which PCI MMIO regions are mapped in some hypervisor configurations.) When an unsafe PFN passes through remap_4k_pfn(), some address bits may be unknowingly dropped by the underlying remapping routines. When that happens the remap will appear to succeed, but any later attempt to use the mapping will checkstop the machine because the truncated target address is not present in the machine.

A patch has been submitted that will cause remap_4k_pfn() to detect and reject these unsafe requests:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-July/119179.html

Our project needs some form of this safety improvement in the Ubuntu 14.10 release.

---uname output---
Linux tul115p1 3.16.0-6-generic #11-Ubuntu SMP Mon Jul 28 02:00:45 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = 8286-42A

---Debugger---
A debugger is not configured

---Steps to Reproduce---
The problem requires a hypervisor that allows PCI MMIO regions to span above the 46-bit line, and a device driver that maps MMIO regions using remap_4k_pfn(). I can provide detailed instructions and a driver upon request.

Stack trace output: no

Oops output: no

System Dump Info:
The system is not configured to capture a system dump.
[Kernel-packages] [Bug 1353105] Re: bnx2x crashes on bxn2x_tpa_start
** Package changed: ubuntu => linux (Ubuntu) ** Changed in: linux (Ubuntu) Status: New => Confirmed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1353105 Title: bnx2x crashes on bxn2x_tpa_start Status in “linux” package in Ubuntu: Confirmed Bug description: ---Problem Description--- bnx2x may cause crashes or stop working because of a missing memory barrier. Messages like the following appear on the log: bnx2x: [bnx2x_tpa_start:392(eth7)]start of bin not in stop [0] ---uname output--- Linux ubuntu 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:50:31 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux ---Additional Hardware Info--- 0001:00:03.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM57800 1/10 Gigabit Ethernet (rev 10) 0001:00:03.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM57800 1/10 Gigabit Ethernet (rev 10) 0001:00:03.2 Ethernet controller: Broadcom Corporation NetXtreme II BCM57800 1/10 Gigabit Ethernet (rev 10) 0001:00:03.3 Ethernet controller: Broadcom Corporation NetXtreme II BCM57800 1/10 Gigabit Ethernet (rev 10) Machine Type = model : IBM pSeries (emulated by qemu) ---Steps to Reproduce--- Send lots of traffic on a busy workload. Fix for reported issue upstream commit 9aaae044abe95de182d09004cc3fa181bf22e6e0 Fix for EEH related issue upstream commit 0c0e63410a393aae4b615849625f539db775d586 We would like to get those applied to both Ubuntu 14.04 updates and Ubuntu 14.10, please. Cascardo. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1353105/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1352995] Re: ERAT Multihit machine checks
** Changed in: linux (Ubuntu)
Status: Incomplete => Confirmed

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1352995

Title: ERAT Multihit machine checks

Status in “linux” package in Ubuntu: Confirmed

Bug description:
-- Problem Description --
Our project involves porting a 3rd-party out-of-tree module to LE Ubuntu on Power. We've been seeing occasional ERAT Multihit machine checks with kernels ranging from the LE Ubuntu 14.04 3.13-based kernel through the very latest 3.16-rc5 mainline kernels. Our kernels are running directly on top of OPAL/Sapphire in PowerNV mode, with no intervening KVM host.

FSP dumps captured at the time of the ERAT detection show that there are duplicate mappings in force for the same real page, with the duplicate mappings being for different sized pages. So, for example, the same 4K real page will be referred to by a 4K mapping and an overlapping 16M mapping.

Aneesh has been working with us on this.

We are currently testing this patchset (git format-patch --stdout format). We are still finding ERAT with these changes. Most of these changes are already posted externally. Some of them got updated after that.

Current status: when hitting a multihit ERAT, I don't find duplicate hash pte entries. So it possibly indicates a missed flush or a race.

Dar value is 3fff7d0f psize 0
slot = 453664 v = 40001f0d74ff7d01 r = 7ca0f0196 with b_size = 15 a_size = -1
Dump the rest of 256 entries
Dar value is 3fff7d0f psize 0
slot = 453664 v = 40001f0d74ff7d01 r = 7ca0f0196 with b_size = 15 a_size = -1
Done..
Dump the rest of 256 entries
Done..
Found hugepage shift 0 for ea 3fff7d0f with ptep 1f283d8000383
Severe Machine check interrupt [Recovered]
Initiator: CPU
Error type: ERAT [Multihit]
Effective address: 3fff7d0f

That is what I am finding on machine check.
I am searching the hash pte with base page size 4K and 64K and printing matching hash table entries. b_size = 15 and a_size = -1 both indicate 4K. -aneesh I guess we now have a race in the unmap path. I am auditing the hpte_slot_array usage. We do check for hpte_slot_array != NULL in invalidate. But if we hit two pmdp_splitting flush one will skip the invalidate as per current code and will go ahead and mark hpte_slot_array NULL. I have a patch in the repo which try to work around that. But I am not sure whether we really can have two pmdp_splitting flush simultaneously. because we call that under pmd_lock. Still need to look at the details. -aneesh I added more debug prints. And this is what i found. Before a hugepage flush I added debug prints to dump the hash table to see if we are failing to clear any hash table entries. After every update we seems to have clearly updated hash table. One MCE some of the relevant part of logs are pmd_hugepage_update dumping entries for 0x3fff7100 with clr = 0x set = 0x0 . . dump_hash_pte_group dumping entries for 0x3fff7191da8c with clr = 0x0 set = 0x0 func = dump_hash_pte_group, addr = 3fff7191da8c psize = 0 slot = 1174024 v = 4001a9245cff7181 r = 7dfb5d193 with b_size = 0 a_size = 0 count = 2333 func = dump_hash_pte_group, addr = 3fff7100 psize = 0 slot = 1155808 v = 4001a9245cff7105 r = 7cc038196 with b_size = 0 a_size = 9 count = 0 func = dump_hash_pte_group, addr = 3fff710a2000 psize = 0 slot = 1157104 v = 4001a9245cff7105 r = 7cc038116 with b_size = 0 a_size = 9 count = 162 func = dump_hash_pte_group, addr = 3fff710e6000 psize = 0 slot = 1156560 v = 4001a9245cff7105 r = 7cc038196 with b_size = 0 a_size = 9 count = 230 func = dump_hash_pte_group, addr = 3fff71378000 psize = 0 slot = 1161504 v = 4001a9245cff7105 r = 7cc038116 with b_size = 0 a_size = 9 count = 888 So we end up clearing the huge pmd with 0x3fff7100 and at that point we didn't had anything in hash table. 
That is the last pmdp_splitting_flush or pmd_hugepage_update event on that address. Can we audit the driver code to understand its large/huge page usage and whether it is making any x86 assumptions around the page table accessors? For example, ppc64 rules around page table access are stricter than on x86. We don't have flush_tlb_* functions, and we need to make sure we hold the ptl while updating the page table and also flush the hash pte while holding the lock. Attaching the log also -aneesh Aneesh writes: Can we audit the driver code to understand its large/huge page usage and whether it is making any x86 assumptions around the page table accessors? For example, ppc64 rules around page table access are stricter than on x86. We don't have flush_tlb_* functions, and we need to make sure we hold the ptl while updating the page table and also flush the hash pte while holding the lock. Yes, we can do that (all the driv
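The duplicate-mapping condition described in this report (a 4K mapping and an overlapping 16M mapping covering the same real page) can be illustrated with a small standalone check. This is only a sketch: the `Mapping` tuple, the `find_mixed_size_overlaps` helper, and any addresses passed to it are hypothetical and are not taken from the kernel sources or from the FSP dump format.

```python
from typing import List, NamedTuple, Tuple


class Mapping(NamedTuple):
    """Hypothetical record of one translation: where it starts and how big it is."""
    real_addr: int   # start of the mapped real (physical) range, in bytes
    page_size: int   # size of the mapped page, in bytes


def find_mixed_size_overlaps(mappings: List[Mapping]) -> List[Tuple[Mapping, Mapping]]:
    """Return pairs of mappings that overlap and use different page sizes."""
    overlaps = []
    ordered = sorted(mappings, key=lambda m: m.real_addr)
    for i, first in enumerate(ordered):
        first_end = first.real_addr + first.page_size
        for second in ordered[i + 1:]:
            # The list is sorted by start address, so once a mapping starts
            # at or past the end of `first`, no later mapping can overlap it.
            if second.real_addr >= first_end:
                break
            if first.page_size != second.page_size:
                overlaps.append((first, second))
    return overlaps
```

Calling `find_mixed_size_overlaps` with a 16M mapping and a 4K mapping that falls inside it returns that pair, which is the shape of the condition the FSP dumps are reported to show.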
[Kernel-packages] [Bug 1352640] Re: Huge PCI BAR support needed for Ubuntu 14.10
** Changed in: linux (Ubuntu) Status: Incomplete => Confirmed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1352640 Title: Huge PCI BAR support needed for Ubuntu 14.10 Status in “linux” package in Ubuntu: Confirmed Bug description: == Comment: #0 - Brian Hart - 2014-08-04 17:26:34 == ---Problem Description--- Our project requires Huge BAR Support (i.e. support for PCI BAR spaces > 1G) in Ubuntu 14.10. Guo and Gavin are working on the kernel changes (and Sapphire/OPAL changes?) and have already submitted patches. The patches did not make the 3.16 cut-off, so they will need to be back-ported to the 14.10 kernel. I believe the patches are the set described by: https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-July/119170.html Contact Information = Brian Hart (ha...@us.ibm.com), Dave Marquardt (davem...@us.ibm.com) ---uname output--- Linux tul115p1 3.16.0-6-generic #11-Ubuntu SMP Mon Jul 28 02:00:45 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux ---Additional Hardware Info--- Problem is seen with PCI adapters that request > 1G of BAR space. We've seen it with GPUs that request a BAR1 of size 16G. Machine Type = 8286-42A ---Debugger--- A debugger is not configured ---Steps to Reproduce--- Add a PCI adapter which requests huge BAR space in the system. The huge BARs will not be assigned and the adapter will likely not initialize. Stack trace output: no Oops output: no System Dump Info: The system is not configured to capture a system dump. *Additional Instructions for Brian Hart (ha...@us.ibm.com), Dave Marquardt (davem...@us.ibm.com): -Attach sysctl -a output to the bug.
To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1352640/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
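One way to spot adapters that fall into the "> 1G of BAR space" case is to look at the per-device `resource` file that Linux exposes under sysfs, where each line holds a start address, an end address, and flags in hexadecimal. The sketch below parses such text and reports regions larger than 1 GiB. The function names and the 1 GiB default threshold are assumptions made for illustration; this is not part of the patches referenced in the report.

```python
ONE_GIB = 1 << 30


def parse_resource_text(text):
    """Return (line_index, size_in_bytes) for every populated region.

    Each line is expected to hold three hex fields: start, end, flags.
    Lines whose start and end are both zero describe unused regions.
    """
    regions = []
    for index, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue
        start, end, _flags = (int(field, 16) for field in fields)
        if start == 0 and end == 0:
            continue
        regions.append((index, end - start + 1))
    return regions


def huge_regions(text, threshold=ONE_GIB):
    """Return the regions whose size exceeds the threshold (1 GiB by default)."""
    return [(index, size) for index, size in parse_resource_text(text)
            if size > threshold]
```

On a live system the text would come from reading a device's `resource` file; the parsing is kept separate from file access so it can be checked against captured output.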
[Kernel-packages] [Bug 1352994] Re: remap_4K_pfn() safety improvement needed for Ubuntu 14.10
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1352994 Title: remap_4K_pfn() safety improvement needed for Ubuntu 14.10 Status in “linux” package in Ubuntu: Incomplete Bug description: == Comment: #0 - Brian Hart - 2014-08-04 17:41:57 == ---Problem Description--- The current implementation of remap_4k_pfn() trusts that it's safe to map the PFN supplied by the requestor. But there may be PFNs that are not safe to map via remap_4k_pfn(). (For example, the addresses at which PCI MMIO regions are mapped in some hypervisor configurations.) When an unsafe PFN passes through remap_4k_pfn() some address bits may be unknowingly dropped by the underlying remapping routines. When that happens the remap will appear to succeed, but any later attempt to use the mapping will checkstop the machine because the truncated target address is not present in the machine. A patch has been submitted that will cause remap_4k_pfn() to detect and reject these unsafe requests: https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-July/119179.html Our project needs some form of this safety improvement in the Ubuntu 14.10 release. ---uname output--- Linux tul115p1 3.16.0-6-generic #11-Ubuntu SMP Mon Jul 28 02:00:45 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux Machine Type = 8286-42A ---Debugger--- A debugger is not configured ---Steps to Reproduce--- The problem requires a hypervisor that allows PCI MMIO regions to span above the 46-bit line, and a device driver that maps MMIO regions using remap_4k_pfn(). I can provide detailed instructions and a driver upon request. Stack trace output: no Oops output: no System Dump Info: The system is not configured to capture a system dump. 
To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1352994/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
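The failure mode described in this report, where high address bits are silently dropped and the remap appears to succeed, can be sketched as a range check performed before mapping. The 46-bit limit comes from the "46-bit line" wording in the report; the 4K page shift, the function names, and the idea of raising an error are assumptions made for illustration and do not reproduce the submitted kernel patch.

```python
PAGE_SHIFT_4K = 12        # assumption: PFNs are counted in 4K pages
MAX_REAL_ADDR_BITS = 46   # taken from the "46-bit line" in the report


def pfn_is_safe(pfn, addr_bits=MAX_REAL_ADDR_BITS, page_shift=PAGE_SHIFT_4K):
    """True if the page's address fits below the addr_bits boundary."""
    return 0 <= pfn < (1 << (addr_bits - page_shift))


def truncated_pfn(pfn, addr_bits=MAX_REAL_ADDR_BITS, page_shift=PAGE_SHIFT_4K):
    """Show what remains of a PFN if the bits above the boundary are dropped."""
    return pfn & ((1 << (addr_bits - page_shift)) - 1)


def checked_remap_pfn(pfn):
    """Reject unsafe PFNs instead of mapping a silently truncated address."""
    if not pfn_is_safe(pfn):
        raise ValueError(f"PFN {pfn:#x} is above the {MAX_REAL_ADDR_BITS}-bit line")
    return pfn
```

Rejecting the request up front turns a later checkstop into an immediate, reportable error at the point where the driver asks for the mapping.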
[Kernel-packages] [Bug 1353005] Re: sensors command is not getting executed in Ubuntu 14.10 Non Virtualised environment
** Changed in: linux (Ubuntu) Status: Incomplete => Confirmed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1353005 Title: sensors command is not getting executed in Ubuntu 14.10 Non Virtualised environment Status in “linux” package in Ubuntu: Confirmed Bug description: ---Problem Description--- sensors command is not getting executed in Ubuntu 14.10 ---uname output--- Linux lep8d 3.16.0-6-generic #11-Ubuntu SMP Mon Jul 28 02:00:45 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux Machine Type = P8 ---Steps to Reproduce--- Install Ubuntu 14.10 iso on local hard disk of P8 machine in Non Virtualised environment. Install the lm-sensors ppc64le package on the same. root@lep8d:~# sensors No sensors found! Make sure you loaded all the kernel drivers you need. Try sensors-detect to find out which these are. root@lep8d:~# echo $? 1 root@lep8d:~# sensors-detect # sensors-detect revision 6170 (2013-05-20 21:25:22 +0200) # DMI data unavailable, please consider installing dmidecode 2.7 # or later for better results. This program will help you determine which kernel modules you need to load to use lm_sensors most effectively. It is generally safe and recommended to accept the default answers to all questions, unless you know what you're doing. Some south bridges, CPUs or memory controllers contain embedded sensors. Do you want to scan for them? This is totally safe. (YES/no): YES modprobe: FATAL: Module cpuid not found. Failed to load module cpuid. Silicon Integrated Systems SIS5595... No VIA VT82C686 Integrated Sensors... No VIA VT8231 Integrated Sensors...No AMD K8 thermal sensors... No AMD Family 10h thermal sensors... No AMD Family 11h thermal sensors... No AMD Family 12h and 14h thermal sensors... No AMD Family 15h thermal sensors... No AMD Family 15h power sensors... No AMD Family 16h power sensors... No Intel digital thermal sensor... 
No Intel AMB FB-DIMM thermal sensor... No VIA C7 thermal sensor... No VIA Nano thermal sensor... No Lastly, we can probe the I2C/SMBus adapters for connected hardware monitoring devices. This is the most risky part, and while it works reasonably well on most systems, it has been reported to cause trouble on some systems. Do you want to probe the I2C/SMBus adapters now? (YES/no): YES Sorry, no supported PCI bus adapters found. Module i2c-dev loaded successfully. Sorry, no sensors were detected. Either your system has no sensors, or they are not supported, or they are connected to an I2C or SMBus adapter that is not supported. If you find out what chips are on your board, check http://www.lm-sensors.org/wiki/Devices for driver status. After a bit more digging I see: in Ubuntu where it is failing, there is: $ cat /boot/config-3.16.0-6-generic | grep -i sensor | grep -i ibm CONFIG_SENSORS_IBMAEM=m CONFIG_SENSORS_IBMPEX=m And in PowerKVM where it is working, there is: $ cat /boot/config-3.10.42-2004.pkvm2_1_1.8.ppc64 | grep -i sensors | grep -i ibm CONFIG_SENSORS_IBMAEM=m CONFIG_SENSORS_IBMPEX=m CONFIG_SENSORS_IBMPOWERNV=y So now I think the problem is: Ubuntu is missing "ibmpowernv", i.e. the powernv hwmon driver for sensors, as described here: http://lists.lm-sensors.org/pipermail/lm-sensors/2014-May/041867.html Adding to copy: Neelesh Gupta, who is the author of this module. Neelesh: Am I correct that this is the missing module in Ubuntu? Is there a kernel patch Ubuntu needs to pick up, or perhaps they already have the patch but just need to fix their kernel config? Thx -DaveH. Yes, the 'ibmpowernv' module is missing from Ubuntu 14.10. It's already been applied to the -next tree and should be available in mainline soon.
To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1353005/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
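The comparison made in this report with two `grep` pipelines, one against the failing Ubuntu config and one against the working PowerKVM config, can also be done as a small set difference. This is a sketch only: the helper names and the default `CONFIG_SENSORS_IBM` prefix are assumptions chosen to match the options quoted in the report.

```python
def enabled_options(config_text, prefix="CONFIG_SENSORS_IBM"):
    """Collect NAME=value pairs from kernel config text for names with the prefix."""
    options = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.startswith(prefix):
            options[name] = value
    return options


def missing_options(failing_text, working_text, prefix="CONFIG_SENSORS_IBM"):
    """Return options set in the working config but absent from the failing one."""
    failing = enabled_options(failing_text, prefix)
    working = enabled_options(working_text, prefix)
    return {name: value for name, value in working.items() if name not in failing}
```

Fed the two sets of lines quoted in the report, `missing_options` leaves only `CONFIG_SENSORS_IBMPOWERNV`, which matches the conclusion reached there.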
[Kernel-packages] [Bug 1353005] Re: sensors command is not getting executed in Ubuntu 14.10 Non Virtualised environment
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1353005 Title: sensors command is not getting executed in Ubuntu 14.10 Non Virtualised environment Status in “linux” package in Ubuntu: New Bug description: ---Problem Description--- sensors command is not getting executed in Ubuntu 14.10 ---uname output--- Linux lep8d 3.16.0-6-generic #11-Ubuntu SMP Mon Jul 28 02:00:45 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux Machine Type = P8 ---Steps to Reproduce--- Install Ubuntu 14.10 iso on local hard disk of P8 machine in Non Virtualised environment. Install the lm-sensors ppc64le package on the same. root@lep8d:~# sensors No sensors found! Make sure you loaded all the kernel drivers you need. Try sensors-detect to find out which these are. root@lep8d:~# echo $? 1 root@lep8d:~# sensors-detect # sensors-detect revision 6170 (2013-05-20 21:25:22 +0200) # DMI data unavailable, please consider installing dmidecode 2.7 # or later for better results. This program will help you determine which kernel modules you need to load to use lm_sensors most effectively. It is generally safe and recommended to accept the default answers to all questions, unless you know what you're doing. Some south bridges, CPUs or memory controllers contain embedded sensors. Do you want to scan for them? This is totally safe. (YES/no): YES modprobe: FATAL: Module cpuid not found. Failed to load module cpuid. Silicon Integrated Systems SIS5595... No VIA VT82C686 Integrated Sensors... No VIA VT8231 Integrated Sensors...No AMD K8 thermal sensors... No AMD Family 10h thermal sensors... No AMD Family 11h thermal sensors... No AMD Family 12h and 14h thermal sensors... No AMD Family 15h thermal sensors... No AMD Family 15h power sensors... No AMD Family 16h power sensors... No Intel digital thermal sensor... No Intel AMB FB-DIMM thermal sensor... 
No VIA C7 thermal sensor... No VIA Nano thermal sensor... No Lastly, we can probe the I2C/SMBus adapters for connected hardware monitoring devices. This is the most risky part, and while it works reasonably well on most systems, it has been reported to cause trouble on some systems. Do you want to probe the I2C/SMBus adapters now? (YES/no): YES Sorry, no supported PCI bus adapters found. Module i2c-dev loaded successfully. Sorry, no sensors were detected. Either your system has no sensors, or they are not supported, or they are connected to an I2C or SMBus adapter that is not supported. If you find out what chips are on your board, check http://www.lm-sensors.org/wiki/Devices for driver status. After a bit more digging I see: in Ubuntu where it is failing, there is: $ cat /boot/config-3.16.0-6-generic | grep -i sensor | grep -i ibm CONFIG_SENSORS_IBMAEM=m CONFIG_SENSORS_IBMPEX=m And in PowerKVM where it is working, there is: $ cat /boot/config-3.10.42-2004.pkvm2_1_1.8.ppc64 | grep -i sensors | grep -i ibm CONFIG_SENSORS_IBMAEM=m CONFIG_SENSORS_IBMPEX=m CONFIG_SENSORS_IBMPOWERNV=y So now I think the problem is: Ubuntu is missing "ibmpowernv", i.e. the powernv hwmon driver for sensors, as described here: http://lists.lm-sensors.org/pipermail/lm-sensors/2014-May/041867.html Adding to copy: Neelesh Gupta, who is the author of this module. Neelesh: Am I correct that this is the missing module in Ubuntu? Is there a kernel patch Ubuntu needs to pick up, or perhaps they already have the patch but just need to fix their kernel config? Thx -DaveH. Yes, the 'ibmpowernv' module is missing from Ubuntu 14.10. It's already been applied to the -next tree and should be available in mainline soon.
[Kernel-packages] [Bug 1352995] Re: ERAT Multihit machine checks
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1352995 Title: ERAT Multihit machine checks Status in “linux” package in Ubuntu: New Bug description: -- Problem Description -- Our project involves porting a 3rd-party out-of-tree module to LE Ubuntu on Power. We've been seeing occasional ERAT Multihit machine checks with kernels ranging from the LE Ubuntu 14.04 3.13-based kernel through the very latest 3.16-rc5 mainline kernels. Our kernels are running directly on top of OPAL/Sapphire in PowerNV mode, with no intervening KVM host. FSP dumps captured at the time of the ERAT detection show that there are duplicate mappings in force for the same real page, with the duplicate mappings being for different sized pages. So, for example, the same 4K real page will be referred to by a 4K mapping and an overlapping 16M mapping. Aneesh has been working with us on this. We are currently testing this patchset (git format-patch --stdout format). We are still seeing ERAT multihits with these changes. Most of these changes are already posted externally. Some of them got updated after that. Current status: when hitting a multihit ERAT, I don't find duplicate hash pte entries, so this possibly indicates a missed flush or a race. Dar value is 3fff7d0f psize 0 slot = 453664 v = 40001f0d74ff7d01 r = 7ca0f0196 with b_size = 15 a_size = -1 Dump the rest of 256 entries Dar value is 3fff7d0f psize 0 slot = 453664 v = 40001f0d74ff7d01 r = 7ca0f0196 with b_size = 15 a_size = -1 Done.. Dump the rest of 256 entries Done.. Found hugepage shift 0 for ea 3fff7d0f with ptep 1f283d8000383 Severe Machine check interrupt [Recovered] Initiator: CPU Error type: ERAT [Multihit] Effective address: 3fff7d0f That is what I am finding on machine check. I am searching the hash pte with base page sizes 4K and 64K and printing matching hash table entries.
b_size = 15 and a_size = -1 both indicate 4K. -aneesh I guess we now have a race in the unmap path. I am auditing the hpte_slot_array usage. We do check for hpte_slot_array != NULL in invalidate. But if we hit two pmdp_splitting_flush calls, one will skip the invalidate as per the current code and will go ahead and mark hpte_slot_array NULL. I have a patch in the repo which tries to work around that. But I am not sure whether we really can have two pmdp_splitting_flush calls simultaneously, because we call that under pmd_lock. Still need to look at the details. -aneesh I added more debug prints, and this is what I found. Before a hugepage flush I added debug prints to dump the hash table to see if we are failing to clear any hash table entries. After every update we seem to have clearly updated the hash table. For one MCE, some of the relevant parts of the logs are: pmd_hugepage_update dumping entries for 0x3fff7100 with clr = 0x set = 0x0 . . dump_hash_pte_group dumping entries for 0x3fff7191da8c with clr = 0x0 set = 0x0 func = dump_hash_pte_group, addr = 3fff7191da8c psize = 0 slot = 1174024 v = 4001a9245cff7181 r = 7dfb5d193 with b_size = 0 a_size = 0 count = 2333 func = dump_hash_pte_group, addr = 3fff7100 psize = 0 slot = 1155808 v = 4001a9245cff7105 r = 7cc038196 with b_size = 0 a_size = 9 count = 0 func = dump_hash_pte_group, addr = 3fff710a2000 psize = 0 slot = 1157104 v = 4001a9245cff7105 r = 7cc038116 with b_size = 0 a_size = 9 count = 162 func = dump_hash_pte_group, addr = 3fff710e6000 psize = 0 slot = 1156560 v = 4001a9245cff7105 r = 7cc038196 with b_size = 0 a_size = 9 count = 230 func = dump_hash_pte_group, addr = 3fff71378000 psize = 0 slot = 1161504 v = 4001a9245cff7105 r = 7cc038116 with b_size = 0 a_size = 9 count = 888 So we end up clearing the huge pmd with 0x3fff7100 and at that point we didn't have anything in the hash table. That is the last pmdp_splitting_flush or pmd_hugepage_update event on that address.
Can we audit the driver code to understand its large/huge page usage and whether it is making any x86 assumptions around the page table accessors? For example, ppc64 rules around page table access are stricter than on x86. We don't have flush_tlb_* functions, and we need to make sure we hold the ptl while updating the page table and also flush the hash pte while holding the lock. Attaching the log also -aneesh Aneesh writes: Can we audit the driver code to understand its large/huge page usage and whether it is making any x86 assumptions around the page table accessors? For example, ppc64 rules around page table access are stricter than on x86. We don't have flush_tlb_* functions, and we need to make sure we hold the ptl while updating the page table and also flush the hash pte while holding the lock. Yes, we can do that (all the driver code that's specific to lin
[Kernel-packages] [Bug 1352056] Re: kdump on Ubuntu 14.04 is not generating a dump.
** Changed in: linux (Ubuntu) Status: Incomplete => Confirmed -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1352056 Title: kdump on Ubuntu 14.04 is not generating a dump. Status in “linux” package in Ubuntu: Confirmed Bug description: ---Problem Description--- kdump is not producing a dump on powerKVM LE P8 Ubuntu 14.04 ---uname output--- 3.13.0-30-generic ---Additional Hardware Info--- Power8 LE configuration. ---Patches Installed--- 1324544 - kdump-config load fails with vmlinux kernel (vs. vmlinuz) Machine Type = 8247-22L ---Steps to Reproduce--- Installed kdump-tools 1.5.5-2ubuntu1 and crash 7.0.3-3ubuntu3. Updated /etc/default/kdump-tools, first I updated just USE_KDUMP=1. Rebooted the node and see: root=UUID=87986483-5fec-4b4d-b22e-bf2a72096df8 ro quiet splash crashkernel=384M-:128M root@c656f2n02:~# cat /proc/sys/kernel/sysrq 1 root@c656f2n02:~# cat /proc/sys/kernel/sysrq 1 root@c656f2n02:~# ^Cnd /proc | grep sysrq root@c656f2n02:~# kdump-config status current state : ready to kdump root@c656f2n02:~# kdump-config show USE_KDUMP:1 KDUMP_SYSCTL: kernel.panic_on_oops=1 KDUMP_COREDIR:/var/crash crashkernel addr: current state:ready to kdump kexec command: /sbin/kexec -p --args-linux --command-line="root=UUID=87986483-5fec-4b4d-b22e-bf2a72096df8 ro quiet splash irqpoll maxcpus=1 nousb" --initrd=/boot/initrd.img-3.13.0-30-generic /boot/vmlinux-3.13.0-30-generic root@c656f2n02:/boot/grub# cat /sys/kernel/kexec_crash_loaded 1 root@c656f2n02:/boot/grub# cat /sys/kernel/kexec_loaded 0 echo c > /proc/sysrq-trigger root@c656f2n02:/var/log# echo c > /proc/sysrq-trigger [ 1956.014243] SysRq : Trigger a crash [ 1956.014328] Unable to handle kernel paging request for data at address 0x [ 1956.014404] Faulting instruction address: 0xc0586c2c [ 1956.014468] Oops: Kernel access of bad area, sig: 11 [#1] [ 1956.014518] SMP NR_CPUS=2048 NUMA PowerNV [ 1956.014570] Modules 
linked in: ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp bridge stp llc ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables autofs4 rdma_ucm(OF) ib_ucm(OF) rdma_cm(OF) iw_cm(OF) ib_ipoib(OF) ib_cm(OF) ib_uverbs(OF) ib_umad(OF) mlx5_ib(OF) mlx5_core(OF) mlx4_ib(OF) ib_sa(OF) ib_mad(OF) ib_core(OF) ib_addr(OF) mlx4_en(OF) mlx4_core(OF) compat(OF) nfsd auth_rpcgss nfs_acl nfs lockd sunrpc fscache rtc_generic powernv_rng ses enclosure ipr [ 1956.015306] CPU: 146 PID: 2522 Comm: bash Tainted: GF O 3.13.0-30-generic #54-Ubuntu [ 1956.015394] task: c03fcabda120 ti: c03fcac58000 task.ti: c03fcac58000 [ 1956.015469] NIP: c0586c2c LR: c0587b8c CTR: c0586c00 [ 1956.015543] REGS: c03fcac5b820 TRAP: 0300 Tainted: GF O (3.13.0-30-generic) [ 1956.015617] MSR: 90009033 CR: 42422822 XER: 2000 [ 1956.015804] CFAR: c0009318 DAR: DSISR: 4200 SOFTE: 0 GPR00: c0587b8c c03fcac5baa0 c162e840 0063 GPR04: c2f45bd0 c2f564c8 00015ad0 c1827480 GPR08: c0dfe840 0001 00015ad0 GPR12: 42422822 c7e5ff00 01002fe90648 1016e008 GPR16: 1013ad70 01002fe94648 1016fed0 1016e008 GPR20: 100c31e0 10171fc8 1016f840 GPR24: 0004 0001 c14b7dc8 GPR28: c1974c90 0063 c148d9c0 c14b8188 [ 1956.016794] NIP [c0586c2c] .sysrq_handle_crash+0x2c/0x40 [ 1956.016858] LR [c0587b8c] .__handle_sysrq+0xfc/0x260 [ 1956.016920] Call Trace: [ 1956.016948] [c03fcac5baa0] [10172a34] 0x10172a34 (unreliable) [ 1956.017025] [c03fcac5bb10] [c0587b8c] .__handle_sysrq+0xfc/0x260 [ 1956.017101] [c03fcac5bbd0] [c0588324] .write_sysrq_trigger+0x74/0x90 [ 1956.017190] [c03fcac5bc50] [c02dff1c] .proc_reg_write+0xac/0x110 [ 1956.017266] [c03fcac5bcf0] [c0254c00] .vfs_write+0xe0/0x260 [ 1956.017342] [c03fcac5bd90] [c02558f4] .SyS_write+0x64/0xe0 [ 1956.017418] [c03fcac5be30] [c000a158] syscall_exit+0x0/0x98 [ 1956.017492] Instruction dump: [ 1956.017530] 4bac 7c0802a6 f8010010 f821ff91 6000 6000 3d42001f 
392a8ca8 [ 1956.017658] 3941 9149 7c0004ac 3920 <9949> 38210070 e8010010 7c0803a6 [ 1956.017894] ---[ end trace d163ff42366bde72 ]--- [ 1956.017986] [ 1956.018042] Sending IPI to other CPUs [ 1956.019188] IPI complete -> smp_releas
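The kernel command line in this report carries `crashkernel=384M-:128M`, which uses the range form of the parameter: reserve 128M when system RAM is at least 384M, with no upper bound. A simplified sketch of how such a value resolves is shown below. The function names are hypothetical, and the parser only covers the plain `size` form and the `start-[end]:size` list form with an optional `@offset`; it is not the kernel's own parser.

```python
_UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert strings such as '384M' or '2G' to a byte count."""
    text = text.strip().upper()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)


def crashkernel_reservation(value, system_ram):
    """Return the bytes reserved for the given crashkernel= value, or 0."""
    value = value.split("@", 1)[0]          # drop an optional @offset
    for entry in value.split(","):
        range_part, sep, size_part = entry.partition(":")
        if not sep:
            # Plain "crashkernel=size" form, independent of system RAM.
            return parse_size(entry)
        start_text, _, end_text = range_part.partition("-")
        start = parse_size(start_text)
        end = parse_size(end_text) if end_text else None
        # A range matches when start <= RAM < end; an empty end is unbounded.
        if system_ram >= start and (end is None or system_ram < end):
            return parse_size(size_part)
    return 0
```

With the value from the report, any machine with at least 384M of RAM resolves to a 128M reservation, so the missing dump here is unlikely to be explained by the range simply not matching.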
[Kernel-packages] [Bug 1352056] Re: kdump on Ubuntu 14.04 is not generating a dump.
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1352056 Title: kdump on Ubuntu 14.04 is not generating a dump. Status in “linux” package in Ubuntu: New Bug description: ---Problem Description--- kdump is not producing a dump on powerKVM LE P8 Ubuntu 14.04 ---uname output--- 3.13.0-30-generic ---Additional Hardware Info--- Power8 LE configuration. ---Patches Installed--- 1324544 - kdump-config load fails with vmlinux kernel (vs. vmlinuz) Machine Type = 8247-22L ---Steps to Reproduce--- Installed kdump-tools 1.5.5-2ubuntu1 and crash 7.0.3-3ubuntu3. Updated /etc/default/kdump-tools, first I updated just USE_KDUMP=1. Rebooted the node and see: root=UUID=87986483-5fec-4b4d-b22e-bf2a72096df8 ro quiet splash crashkernel=384M-:128M root@c656f2n02:~# cat /proc/sys/kernel/sysrq 1 root@c656f2n02:~# cat /proc/sys/kernel/sysrq 1 root@c656f2n02:~# ^Cnd /proc | grep sysrq root@c656f2n02:~# kdump-config status current state : ready to kdump root@c656f2n02:~# kdump-config show USE_KDUMP:1 KDUMP_SYSCTL: kernel.panic_on_oops=1 KDUMP_COREDIR:/var/crash crashkernel addr: current state:ready to kdump kexec command: /sbin/kexec -p --args-linux --command-line="root=UUID=87986483-5fec-4b4d-b22e-bf2a72096df8 ro quiet splash irqpoll maxcpus=1 nousb" --initrd=/boot/initrd.img-3.13.0-30-generic /boot/vmlinux-3.13.0-30-generic root@c656f2n02:/boot/grub# cat /sys/kernel/kexec_crash_loaded 1 root@c656f2n02:/boot/grub# cat /sys/kernel/kexec_loaded 0 echo c > /proc/sysrq-trigger root@c656f2n02:/var/log# echo c > /proc/sysrq-trigger [ 1956.014243] SysRq : Trigger a crash [ 1956.014328] Unable to handle kernel paging request for data at address 0x [ 1956.014404] Faulting instruction address: 0xc0586c2c [ 1956.014468] Oops: Kernel access of bad area, sig: 11 [#1] [ 1956.014518] SMP NR_CPUS=2048 NUMA PowerNV [ 1956.014570] Modules linked in: 
ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp bridge stp llc ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables autofs4 rdma_ucm(OF) ib_ucm(OF) rdma_cm(OF) iw_cm(OF) ib_ipoib(OF) ib_cm(OF) ib_uverbs(OF) ib_umad(OF) mlx5_ib(OF) mlx5_core(OF) mlx4_ib(OF) ib_sa(OF) ib_mad(OF) ib_core(OF) ib_addr(OF) mlx4_en(OF) mlx4_core(OF) compat(OF) nfsd auth_rpcgss nfs_acl nfs lockd sunrpc fscache rtc_generic powernv_rng ses enclosure ipr [ 1956.015306] CPU: 146 PID: 2522 Comm: bash Tainted: GF O 3.13.0-30-generic #54-Ubuntu [ 1956.015394] task: c03fcabda120 ti: c03fcac58000 task.ti: c03fcac58000 [ 1956.015469] NIP: c0586c2c LR: c0587b8c CTR: c0586c00 [ 1956.015543] REGS: c03fcac5b820 TRAP: 0300 Tainted: GF O (3.13.0-30-generic) [ 1956.015617] MSR: 90009033 CR: 42422822 XER: 2000 [ 1956.015804] CFAR: c0009318 DAR: DSISR: 4200 SOFTE: 0 GPR00: c0587b8c c03fcac5baa0 c162e840 0063 GPR04: c2f45bd0 c2f564c8 00015ad0 c1827480 GPR08: c0dfe840 0001 00015ad0 GPR12: 42422822 c7e5ff00 01002fe90648 1016e008 GPR16: 1013ad70 01002fe94648 1016fed0 1016e008 GPR20: 100c31e0 10171fc8 1016f840 GPR24: 0004 0001 c14b7dc8 GPR28: c1974c90 0063 c148d9c0 c14b8188 [ 1956.016794] NIP [c0586c2c] .sysrq_handle_crash+0x2c/0x40 [ 1956.016858] LR [c0587b8c] .__handle_sysrq+0xfc/0x260 [ 1956.016920] Call Trace: [ 1956.016948] [c03fcac5baa0] [10172a34] 0x10172a34 (unreliable) [ 1956.017025] [c03fcac5bb10] [c0587b8c] .__handle_sysrq+0xfc/0x260 [ 1956.017101] [c03fcac5bbd0] [c0588324] .write_sysrq_trigger+0x74/0x90 [ 1956.017190] [c03fcac5bc50] [c02dff1c] .proc_reg_write+0xac/0x110 [ 1956.017266] [c03fcac5bcf0] [c0254c00] .vfs_write+0xe0/0x260 [ 1956.017342] [c03fcac5bd90] [c02558f4] .SyS_write+0x64/0xe0 [ 1956.017418] [c03fcac5be30] [c000a158] syscall_exit+0x0/0x98 [ 1956.017492] Instruction dump: [ 1956.017530] 4bac 7c0802a6 f8010010 f821ff91 6000 6000 3d42001f 392a8ca8 [ 
1956.017658] 3941 9149 7c0004ac 3920 <9949> 38210070 e8010010 7c0803a6 [ 1956.017894] ---[ end trace d163ff42366bde72 ]--- [ 1956.017986] [ 1956.018042] Sending IPI to other CPUs [ 1956.019188] IPI complete -> smp_release_cpus() spinning_secondarie
[Kernel-packages] [Bug 1350443] Re: Not able to load the kdump kernel and generate the dump in Ubuntu14.10 on Non virtualised system
** Package changed: ubuntu => makedumpfile (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to makedumpfile in Ubuntu. https://bugs.launchpad.net/bugs/1350443 Title: Not able to load the kdump kernel and generate the dump in Ubuntu14.10 on Non virtualised system Status in “makedumpfile” package in Ubuntu: New Bug description: ---Problem Description--- Not able to load the kdump kernel and save the vmcore in /var/crash/ ---uname output--- Linux lep8d 3.16.0-5-generic #10-Ubuntu SMP Mon Jul 21 16:17:25 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux Machine Type = P8 ---Steps to Reproduce--- Install a P8 machine with Ubuntu 14.10 in Non virtualised environment. Installed all the kexec-tools and kdump-tools packages. Then trying to start the kdump service and loading. root@lep8d:~# /etc/init.d/kdump-tools start Starting kdump-tools: Cannot open `/boot/vmlinuz-3.16.0-5-generic': No such file or directory * failed to load kdump kernel root@lep8d:~# echo $? 
0 root@lep8d:~# kdump-config load Cannot open `/boot/vmlinuz-3.16.0-5-generic': No such file or directory * failed to load kdump kernel root@lep8d:~# ls -l /boot/vmlinux-3.16.0-5-generic -rw--- 1 root root 20712936 Jul 21 22:02 /boot/vmlinux-3.16.0-5-generic root@lep8d:~# root@lep8d:~# dpkg -l | grep kexec ii kexec-tools1:2.0.6-0ubuntu2 ppc64el tools to support fast kexec reboots ii pxe-kexec 0.2.4-3 ppc64el Fetch PXE configuration file and netboot using kexec root@lep8d:~# dpkg -l | grep kdump ii kdump-tools1.5.6-2 all scripts and tools for automating kdump (Linux crash dumps) root@lep8d:~# cat /sys/kernel/kexec_crash_loaded 0 root@lep8d:~# kdump-config show USE_KDUMP:1 KDUMP_SYSCTL: kernel.panic_on_oops=1 KDUMP_COREDIR:/var/crash crashkernel addr: current state:Not ready to kdump kexec command: no kexec command recorded root@lep8d:~# kdump-config status current state : Not ready to kdump Tried to manually trigger a crash as below: root@lep8d:~# sysctl -w kernel.sysrq=1 kernel.sysrq = 1 root@lep8d:~# cat /proc/sys/kernel/sysrq 1 root@lep8d:~# echo c > /proc/sysrq-trigger [ 4252.703681] SysRq : Trigger a crash [ 4252.703773] Unable to handle kernel paging request for data at address 0x [ 4252.703779] Faulting instruction address: 0xc05b88f4 [ 4252.703807] Oops: Kernel access of bad area, sig: 11 [#1] [ 4252.703852] SMP NR_CPUS=2048 NUMA PowerNV [ 4252.703899] Modules linked in: dm_multipath scsi_dh shpchp powernv_rng uio_pdrv_genirq uio rtc_generic binfmt_misc parport_pc ppdev lp parport ses enclosure lpfc scsi_transport_fc ipr scsi_tgt [ 4252.704162] CPU: 76 PID: 4635 Comm: bash Not tainted 3.16.0-5-generic #10-Ubuntu [ 4252.704230] task: c01fdf7cbeb0 ti: c01fdf8e4000 task.ti: c01fdf8e4000 [ 4252.704298] NIP: c05b88f4 LR: c05b997c CTR: c05b88c0 [ 4252.704365] REGS: c01fdf8e79d0 TRAP: 0300 Not tainted (3.16.0-5-generic) [ 4252.704432] MSR: 90009033 CR: 28422824 XER: 2000 [ 4252.704602] CFAR: c0009358 DAR: DSISR: 4200 SOFTE: 1 GPR00: c05b997c c01fdf8e7c50 c1346498 
0063 GPR04: c00014305db0 c00014316618 00018010 c14ff2d8 GPR08: c0dd6498 0001 00018010 GPR12: c05b88c0 c7e50a00 010016f94818 1016e008 GPR16: 1013ad70 010016f9c958 1016fed0 1016e008 GPR20: 100c31e0 10171fc8 1016f840 GPR24: 1014d9b0 1014d0b0 0004 GPR28: c127eee8 0063 c125d6a0 c127f2a8 [ 4252.705502] NIP [c05b88f4] sysrq_handle_crash+0x34/0x50 [ 4252.705558] LR [c05b997c] __handle_sysrq+0xec/0x270 [ 4252.705604] Call Trace: [ 4252.705631] [c01fdf8e7c50] [c018e3f0] __acct_update_integrals+0x80/0x170 (unreliable) [ 4252.705722] [c01fdf8e7c70] [c05b997c] __handle_sysrq+0xec/0x270 [ 4252.705790] [c01fdf8e7d10] [c05ba138] write_sysrq_trigger+0x78/0xa0 [ 4252.705871] [c01fdf8e7d40] [c03141f0] proc_reg_write+0xb0/0x110 [ 4252.705940] [c01fdf8e7d90] [c028c07c] vfs_write+0xdc/0x260 [ 4252.706007] [c01fdf8e7de0] [c028ce1c] SyS_write+0x6c/0x110 [ 4252.706076] [c01fdf8e7e30] [c000a0fc] syscall_exit+0x0/0x7c [ 4252.706143] Instruction dump: [ 4252.706177] 3842dbd8 7c0802a6 f8010010 f821ffe1 6000 6000 3d
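The load failure in this report comes down to a file name mismatch: the tooling looks for `/boot/vmlinuz-3.16.0-5-generic`, while the listing shows that only `/boot/vmlinux-3.16.0-5-generic` exists. A lookup that tries both names could be sketched as follows. The `find_kernel_image` helper and its candidate order are assumptions made for illustration and do not describe how kdump-config is implemented.

```python
import os


def find_kernel_image(boot_dir, release, names=("vmlinuz", "vmlinux")):
    """Return the first existing '<name>-<release>' file in boot_dir, else None."""
    for name in names:
        candidate = os.path.join(boot_dir, f"{name}-{release}")
        if os.path.isfile(candidate):
            return candidate
    return None
```

On the system described above, a call with `/boot` and `3.16.0-5-generic` would fall through to the `vmlinux` name instead of stopping at the missing `vmlinuz` file.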
[Kernel-packages] [Bug 1334793] Re: PowerVM: ubuntu-14.04 stuck in ibm,client-architecture-support reboot loop
** Package changed: ubuntu => linux (Ubuntu) -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1334793 Title: PowerVM: ubuntu-14.04 stuck in ibm,client-architecture-support reboot loop Status in “linux” package in Ubuntu: New Bug description: ---Problem Description--- This bug is a follow-up of Bug#110009. The problem here is that after a successful installation of Ubuntu 14.04, the system fails to boot the OS. When I select the installed OS "Ubuntu" from the grub menu, it throws the error below and falls back to the GRUB menu again and again: .. .. OF stdout device is: /vdevice/vty@3000 Preparing to boot Linux version 3.13.0-24-generic (buildd@fisher04) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #46-Ubuntu SMP Thu Apr 10 19:09:21 UTC 2014 (Ubuntu 3.13.0-24.46-generic 3.13.9) Detected machine type: 0101 Max number of cores passed to firmware: 256 (NR_CPUS = 2048) Calling ibm,client-architecture-support... ibm,sp /vdevice/IBM,sp@4000 \ ibm,sp /vdevice/IBM,sp@4000 \ Elapsed time since release of system processors: 10675 mins 3 secs error: no suitable video mode found. Machine Type = IBM,8286-42A Mach serial number: 1069B3T ---Steps to Reproduce--- 1) Try virtual DVD installation 2) Boot the LPAR via cdrom, i.e. try the commands below from the openfirmware prompt: 0> devalias cdrom /vdevice/v-scsi@3010/disk@8300 0> boot cdrom 3) Proceed with the default installation until it completes 100%. Boot the installed OS from hard disk and you hit this "no suitable video mode found" error and the system falls back to GRUB. Install method: virtual DVD Install ISO Information: ubuntu-14.04-server-ppc64el.iso *Additional Instructions for backup: khusr...@in.ibm.com: -Post a private note with access information to the machine that the bug is occurring on. == Comment: #1 - David Heller - 2014-06-11 15:32:48 == This seems to be a pretty common problem, albeit with multiple causes.
I see that you already tried some of the workarounds in a similar bug here: https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/699802 Still, it seems likely this is due to the video mode that grub is attempting to use for the splash screen. Either that, or it's a video driver issue at the grub level. Can you please try the procedure in comment #4 here: http://.ubuntuforums.org/showthread.php?t=1471399 This involves booting by an alternate method, such as a live CD, and operating on the grub config files in the newly installed system. If you are unable to update the files, or have no success with doing so, it would at least be good to collect the files and attach the information here before sending this back to Launchpad. If you are successful in booting to a live CD on the same system where the boot fails, it would be good to compare the files from the live CD to those in the failing system. Also, do you mind installing with Ubuntu 14.10 alpha images? http://cdimage.ubuntu.com/ubuntu-server/daily/current/utopic-server-ppc64el.iso To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1334793/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp