Re: [PATCH] scsi: eata: drop VLA in reorder()
Linus Torvalds wrote on 13/03/18 05:15:

On Sun, Mar 11, 2018 at 8:08 PM, Tobin C. Harding <to...@apporbit.com> wrote:

I think we are going to see a recurring theme here. MAX_MAILBOXES==64 so this patch adds 1536 bytes to the stack on a 64 bit machine or 768 bytes on a 32 bit machine.

Yeah, that's a bit excessive. It probably works, but one or two of those allocations will make the kernel stack really tight, so in general I really would suggest using kmalloc() instead, or figuring out some way to simply shrink the data structures.

That said, I wonder if the solution to this particular driver is "delete it". Because the hardware is truly ancient and nobody sane would use it any more.

The last patch that seemed to come from an actual _user_ finding a problem was in 2008 (commit 20c09df7eb9c: "[SCSI] eata: fix the data buffer accessors conversion regression"). And even then it apparently took a year for people to have noticed the breakage.

But because the person who reported that problem is still around, I'll just add him to the cc, just in case.

Arthur Marsh, you have the dubious honor and distinction of being the only person to have apparently used that driver in the last ten years. Do you still have hardware using that? Because maybe it's really time to retire that driver.

Linus

Hi Linus and maintainers, thanks for the courtesy email and all the help with the driver.

I am unable to make use of the driver any more due to failed hardware. The DPT2044W SCSI controller and the IBM disk from May 1998 last officially ran on 7 August 2017. I had previously been able to get the data off it, and disconnected the controller and disk following recurring problems with booting.

Aug 7 16:40:24 localhost kernel: [ 105.098705] sd 0:0:6:0: [sda] Synchronizing SCSI cache
Aug 7 16:40:24 localhost kernel: [ 105.233166] EATA0: IRQ 11 mapped to IO-APIC IRQ 18.
Aug 7 16:40:24 localhost kernel: [ 105.233475] EATA/DMA 2.0x: Copyright (C) 1994-2003 Dario Ballabio.
Aug 7 16:40:24 localhost kernel: [ 105.233485] EATA config options -> tm:1, lc:y, mq:16, rs:y, et:n, ip:n, ep:n, pp:y.
Aug 7 16:40:24 localhost kernel: [ 105.233492] EATA0: 2.0C, PCI 0x9010, IRQ 18, BMST, SG 122, MB 64.
Aug 7 16:40:24 localhost kernel: [ 105.233499] EATA0: wide SCSI support enabled, max_id 16, max_lun 8.
Aug 7 16:40:24 localhost kernel: [ 105.233505] EATA0: SCSI channel 0 enabled, host target ID 7.
Aug 7 16:40:24 localhost kernel: [ 105.233521] scsi host0: EATA/DMA 2.0x rev. 8.10.00

Arthur Marsh.
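
For anyone checking the stack figures quoted at the top of this message: 1536 bytes at MAX_MAILBOXES == 64 works out to 24 bytes per mailbox, which is consistent with three arrays of 8-byte elements, and the 768-byte figure with three arrays of 4-byte elements. The three-array layout is only an inference from that arithmetic, not something stated in the thread, and the helper name below is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MAILBOXES 64	/* value quoted in the thread */
#define N_ARRAYS 3		/* assumption: inferred from 1536 / (64 * 8) */

/*
 * Hypothetical helper: bytes of stack taken by N_ARRAYS fixed-size arrays
 * of n_elems elements, each elem_size bytes wide.
 */
static size_t fixed_array_stack_bytes(size_t n_elems, size_t elem_size)
{
	assert(elem_size > 0);	/* a zero-width element makes no sense here */
	return (size_t)N_ARRAYS * n_elems * elem_size;
}
```

Calling fixed_array_stack_bytes(MAX_MAILBOXES, 8) reproduces the 64-bit figure and fixed_array_stack_bytes(MAX_MAILBOXES, 4) the 32-bit one, which is why a handful of such on-stack arrays is enough to make a small kernel stack tight.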
Re: CPU lock-ups with 4.12.0+ kernels related to usb_storage
Arthur Marsh wrote on 14/07/17 04:18:

Alan Stern wrote on 14/07/17 02:30:

All right. In the meantime, changing usb-storage won't hurt. Arthur, can you test the patch below?

Alan Stern

Index: usb-4.x/drivers/usb/storage/usb.c
===================================================================
--- usb-4.x.orig/drivers/usb/storage/usb.c
+++ usb-4.x/drivers/usb/storage/usb.c
@@ -315,6 +315,7 @@ static int usb_stor_control_thread(void
 {
 	struct us_data *us = (struct us_data *)__us;
 	struct Scsi_Host *host = us_to_host(us);
+	struct scsi_cmnd *srb;
 
 	for (;;) {
 		usb_stor_dbg(us, "*** thread sleeping\n");
@@ -330,6 +331,7 @@ static int usb_stor_control_thread(void
 		scsi_lock(host);
 
 		/* When we are called with no command pending, we're done */
+		srb = us->srb;
 		if (us->srb == NULL) {
 			scsi_unlock(host);
 			mutex_unlock(&us->dev_mutex);
@@ -398,14 +400,11 @@ static int usb_stor_control_thread(void
 		/* lock access to the state */
 		scsi_lock(host);
 
-		/* indicate that the command is done */
-		if (us->srb->result != DID_ABORT << 16) {
-			usb_stor_dbg(us, "scsi cmd done, result=0x%x\n",
-				     us->srb->result);
-			us->srb->scsi_done(us->srb);
-		} else {
+		/* was the command aborted? */
+		if (us->srb->result == DID_ABORT << 16) {
 SkipForAbort:
 			usb_stor_dbg(us, "scsi command aborted\n");
+			srb = NULL;	/* Don't call srb->scsi_done() */
 		}
 
 		/*
@@ -429,6 +428,13 @@ SkipForAbort:
 
 		/* unlock the device pointers */
 		mutex_unlock(&us->dev_mutex);
+
+		/* now that the locks are released, notify the SCSI core */
+		if (srb) {
+			usb_stor_dbg(us, "scsi cmd done, result=0x%x\n",
+					srb->result);
+			srb->scsi_done(srb);
+		}
 	} /* for (;;) */
 
 	/* Wait until we are told to stop */

Hi, just to confirm no further lock-ups occurred in the last 4 days with this patch applied.

Arthur.
Re: CPU lock-ups with 4.12.0+ kernels related to usb_storage
Alan Stern wrote on 14/07/17 02:30:

All right. In the meantime, changing usb-storage won't hurt. Arthur, can you test the patch below?

Alan Stern

Index: usb-4.x/drivers/usb/storage/usb.c
===================================================================
--- usb-4.x.orig/drivers/usb/storage/usb.c
+++ usb-4.x/drivers/usb/storage/usb.c
@@ -315,6 +315,7 @@ static int usb_stor_control_thread(void
 {
 	struct us_data *us = (struct us_data *)__us;
 	struct Scsi_Host *host = us_to_host(us);
+	struct scsi_cmnd *srb;
 
 	for (;;) {
 		usb_stor_dbg(us, "*** thread sleeping\n");
@@ -330,6 +331,7 @@ static int usb_stor_control_thread(void
 		scsi_lock(host);
 
 		/* When we are called with no command pending, we're done */
+		srb = us->srb;
 		if (us->srb == NULL) {
 			scsi_unlock(host);
 			mutex_unlock(&us->dev_mutex);
@@ -398,14 +400,11 @@ static int usb_stor_control_thread(void
 		/* lock access to the state */
 		scsi_lock(host);
 
-		/* indicate that the command is done */
-		if (us->srb->result != DID_ABORT << 16) {
-			usb_stor_dbg(us, "scsi cmd done, result=0x%x\n",
-				     us->srb->result);
-			us->srb->scsi_done(us->srb);
-		} else {
+		/* was the command aborted? */
+		if (us->srb->result == DID_ABORT << 16) {
 SkipForAbort:
 			usb_stor_dbg(us, "scsi command aborted\n");
+			srb = NULL;	/* Don't call srb->scsi_done() */
 		}
 
 		/*
@@ -429,6 +428,13 @@ SkipForAbort:
 
 		/* unlock the device pointers */
 		mutex_unlock(&us->dev_mutex);
+
+		/* now that the locks are released, notify the SCSI core */
+		if (srb) {
+			usb_stor_dbg(us, "scsi cmd done, result=0x%x\n",
+					srb->result);
+			srb->scsi_done(srb);
+		}
 	} /* for (;;) */
 
 	/* Wait until we are told to stop */

Thanks for the patch! I have applied it and am running the resulting kernel. As I didn't have a reproducible way to trigger the problem on demand, I'll just have to see whether any related-looking lock-up occurs over the next several days.

Arthur.
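
The shape of the change in the patch above is: take a snapshot of the pending command while the locks are held, clear the snapshot if the command was aborted, and only call the completion callback after the locks have been dropped. A minimal user-space sketch of that ordering follows. Everything in it (the toy_* names, the integer "lock", the callback that re-takes the lock) is hypothetical and is not the usb-storage code; it only assumes that a completion callback may want to re-enter the device, which is one plausible reason to call it unlocked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
struct toy_lock { int held; };

struct toy_cmd {
	int aborted;				/* set when the command was cancelled */
	void (*done)(struct toy_cmd *cmd);	/* completion callback */
	void *owner;				/* the toy_dev this command belongs to */
};

struct toy_dev {
	struct toy_lock lock;
	struct toy_cmd *pending;
	int completions;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);	/* taking it twice models a self-deadlock */
	l->held = 1;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Completion callback that re-enters the device, so it must run unlocked. */
static void toy_done(struct toy_cmd *cmd)
{
	struct toy_dev *dev = cmd->owner;

	toy_lock_acquire(&dev->lock);
	dev->completions++;
	toy_lock_release(&dev->lock);
}

/* One pass of a worker loop; returns 1 if a completion was reported. */
static int toy_process_one(struct toy_dev *dev)
{
	struct toy_cmd *cmd;

	toy_lock_acquire(&dev->lock);
	cmd = dev->pending;		/* snapshot taken under the lock */
	if (cmd == NULL) {
		toy_lock_release(&dev->lock);
		return 0;
	}
	dev->pending = NULL;
	if (cmd->aborted)
		cmd = NULL;		/* aborted: do not call done() */
	toy_lock_release(&dev->lock);

	/* now that the lock is released, notify the submitter */
	if (cmd) {
		cmd->done(cmd);
		return 1;
	}
	return 0;
}
```

Calling cmd->done() before toy_lock_release() in this sketch would trip the assertion in toy_lock_acquire(), which is the toy equivalent of the thread getting stuck on its own lock.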
Re: [PATCH 0/4] block: Fixes for bdi handling
Jan Kara wrote on 09/03/17 03:18:

Hi!

patches in this series fix the most urgent bugs that were introduced by commit 165a5e22fafb "block: Move bdi_unregister() to del_gendisk()" and by 0dba1314d4f8 "scsi, block: fix duplicate bdi name registration crashes". In fact before these commits we had a different set of problems in the code but they were less visible :).

I'm still waiting for test confirmation from Omar and Arthur Marsh who reported issues but I'm not able to hit any problem anymore in my testing. I think it would be nice to get the patches to rc2 so to speed up things I'm posting the patches now so that review can happen in parallel with the testing.

Other BDI handling fixes I have in my queue can wait a bit more since they are either theoretical or long-standing issues. So I'll repost them once these four are sorted out.

Honza

Sorry for the delay in replying. I had to leave the kernel with all 4 patches applied rebuilding while I was at work, and have only just booted it.

I've only done a kexec reboot so far but there were no problems: no errors in dmesg, all disks were recognised and all attempted mounts worked.

Thanks very much for the quick fix!

Arthur.
problem with block: Move bdi_unregister() to del_gendisk() commit 165a5e22fafb127ecb5914e12e8c32a1f0d3f820
On one of my PCs I have 2 PATA disks (one, the WDC below, is used for booting; the other, the SAMSUNG, is not mounted), plus an IBM SCSI disk using a DPT 2044W controller with the eata driver, and sometimes a Verbatim Storengo USB stick.

On recent 4.10.0+ kernel builds (i386), the resulting kernel would pause during start-up when the USB stick was inserted but boot normally otherwise.

A git-bisect led to:

commit 165a5e22fafb127ecb5914e12e8c32a1f0d3f820
Author: Jan Kara
Date:   Wed Feb 8 08:05:56 2017 +0100

    block: Move bdi_unregister() to del_gendisk()

    Commit 6cd18e711dd8 "block: destroy bdi before blockdev is
    unregistered." moved bdi unregistration (at that time through
    bdi_destroy()) from blk_release_queue() to blk_cleanup_queue() because
    it needs to happen before blk_unregister_region() call in del_gendisk()
    for MD. SCSI though will free up the device number from sd_remove()
    called through a maze of callbacks from device_del() in
    __scsi_remove_device() before blk_cleanup_queue() and thus similar races
    as described in 6cd18e711dd8 can happen for SCSI as well as reported by
    Omar [1].

    Moving bdi_unregister() to del_gendisk() works for MD and fixes the
    problem for SCSI since del_gendisk() gets called from sd_remove()
    before freeing the device number.

    This also makes device_add_disk() (calling bdi_register_owner()) more
    symmetric with del_gendisk().

    [1] http://marc.info/?l=linux-block&m=148554717109098&w=2

When booting the bad kernel, I would eventually get a prompt to press the enter key to boot and it eventually started, but the SCSI disk partitions were not found by blkid, nor could they be mounted.

lsscsi reports:

[0:0:6:0]    disk    IBM      DCAS-34330W      S65A  /dev/sda
[1:0:0:0]    disk    ATA      WDC WD3200AAJB-0 2C01  /dev/sdc
[2:0:0:0]    cd/dvd  HL-DT-ST DVDRAM GSA-4163B A103  /dev/sr0
[2:0:1:0]    disk    ATA      SAMSUNG SP4002H  0-57  /dev/sdd
[3:0:0:0]    disk    Verbatim STORE N GO       5.00  /dev/sdb

blkid reports:

/dev/sdb1: LABEL="STORENGO" UUID="B08B-79DA" TYPE="vfat" PARTUUID="961d9655-01"
/dev/sdc1: UUID="bfdeb6d6-0b77-4beb-a63d-bdc3e455b8ea" TYPE="ext3" PTTYPE="dos" PARTUUID="000750bf-01"
/dev/sdc5: UUID="26b7280a-f40c-49dd-a086-dbbb9b7e3def" TYPE="swap" PARTUUID="000750bf-05"
/dev/sdc6: UUID="7417-5AFF" TYPE="vfat" PARTUUID="000750bf-06"
/dev/sdc7: UUID="96c96a61-8615-4715-86d0-09cb8c62638c" TYPE="ext3" PARTUUID="000750bf-07"
/dev/sdd1: LABEL="W-98 SE" UUID="3571-16DE" TYPE="vfat" PARTUUID="43598af3-01"
/dev/sdd3: UUID="fd6a052e-c062-4c47-801d-087595635c5d" SEC_TYPE="ext2" TYPE="ext3" PARTUUID="43598af3-03"
/dev/sdd5: UUID="026a3f5c-0064-4ae7-869e-519d2cee05e7" SEC_TYPE="ext2" TYPE="ext3" PTTYPE="dos" PARTUUID="43598af3-05"
/dev/sdd6: UUID="9a0970fa-74ba-4426-98ac-1e8b81933e0e" TYPE="swap" PARTUUID="43598af3-06"
/dev/sdd7: UUID="4912-06CA" TYPE="vfat" PARTUUID="43598af3-07"

The boot screen is at:

http://www.users.on.net/~arthur.marsh/20170308_18.jpg

and the dmesg output from booting the bad kernel is attached.

I'm happy to supply any other configuration details needed and run further tests.

Regards,

Arthur.
[0.00] Linux version 4.10.0+ (root@victoria) (gcc version 6.3.0 20170221 (Debian 6.3.0-8) ) #652 SMP PREEMPT Wed Mar 8 04:45:06 ACDT 2017 [0.00] x86/fpu: x87 FPU will use FXSAVE [0.00] e820: BIOS-provided physical RAM map: [0.00] BIOS-e820: [mem 0x-0x0009fbff] usable [0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved [0.00] BIOS-e820: [mem 0x000e-0x000f] reserved [0.00] BIOS-e820: [mem 0x0010-0x3ffa] usable [0.00] BIOS-e820: [mem 0x3ffb-0x3ffbdfff] ACPI data [0.00] BIOS-e820: [mem 0x3ffbe000-0x3ffd] ACPI NVS [0.00] BIOS-e820: [mem 0x3ffe-0x3fff] reserved [0.00] BIOS-e820: [mem 0xe000-0xefff] reserved [0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved [0.00] BIOS-e820: [mem 0xff78-0x] reserved [0.00] NX (Execute Disable) protection: active [0.00] SMBIOS 2.3 present. [0.00] DMI: System manufacturer System Product Name/A8V-MX, BIOS 0503 12/06/2005 [0.00] e820: update [mem 0x-0x0fff] usable ==> reserved [0.00] e820: remove [mem 0x000a-0x000f] usable [0.00] e820: last_pfn = 0x3ffb0 max_arch_pfn = 0x100 [0.00] MTRR default type: uncachable [0.00] MTRR fixed ranges enabled: [0.00] 0-9 write-back [0.00] A-E uncachable [0.00] F-F write-protect [0.00] MTRR variable ranges enabled: [0.00] 0 base 00 mask FFC000 write-back [0.00] 1 base 00D000 mask FFF000
Re: [PATCH] eata: Convert eata driver as normal PCI and platform device drivers
Jiang Liu wrote on 02/03/16 13:50:

I spoke too soon. Without removing and re-inserting the eata module before any filesystems on disks attached to the DPT controller were mounted, I'd get the following messages, similar to ones previously reported:

sd 0:0:6:0: tag#0 abort, mbox 1.
EATA0: abort, mbox 1 is in use.
sd 0:0:6:0: tag#0 reset, enter.
EATA0: reset, mbox 1 in reset.
EATA0: reset, board reset done, enabling interrupts.
EATA0: reset, interrupts disabled, loops 100415.
EATA0, reset, mbox 1 locked, DID_RESET, done.
EATA0: reset, exit, done.

and so on, finally hanging after printing "kexec_core: Starting new kernel" (I have a photo of the messages if they're needed).

So I'm still using the new patch but have to continue to remove and reinsert eata at start-up before any attempts to mount disks attached to the DPT SCSI controller.

Hi Arthur,

Thanks for testing. So the current situation is that we have a working driver for the normal case, but still have issues during kexec. Per my understanding, we need to implement a PCI device driver shutdown callback to reset the RAID controller. I once tried to implement the shutdown callback, but it didn't work. I have no deep understanding of the RAID controller and no hardware to experiment with either, so I have no idea about the next step.

Maybe one acceptable way is to merge this patch first, so we get a basic working driver, and then ask for help from an expert to solve the kexec issue.

Thanks!
Gerry

My controller is a DPT2044W; it does not provide any hardware RAID capabilities.

I'm not sure where responsibility lies in driver development, but I'm still using the DPT2044W controller, which worked on the 4.2.0 kernels and earlier, and this problem has been around for nearly 5 months now.

I can do builds and tests of any patches that people can provide, but am not a C programmer, much less a Linux driver developer.

Regards,

Arthur.
-- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] eata: Convert eata driver as normal PCI and platform device drivers
Arthur Marsh wrote on 02/03/16 03:57:

Christoph Hellwig wrote on 01/03/16 17:22:

Hi Jiang. I'd love to see this patch in and abuse of the old PCI API gone. Did you resolve the problems Arthur saw with the previous iterations of the patch?

I applied Jiang Liu's patch of 1st March 2016 to a clean kernel 4.5.0-rc6 source, removed my workaround of removing and re-adding the eata module before mounting file-systems that are on disks attached to the DPT SCSI card using the eata driver, and was able to kexec from the new kernel successfully.

Arthur.

I spoke too soon. Without removing and re-inserting the eata module before any filesystems on disks attached to the DPT controller were mounted, I'd get the following messages, similar to ones previously reported:

sd 0:0:6:0: tag#0 abort, mbox 1.
EATA0: abort, mbox 1 is in use.
sd 0:0:6:0: tag#0 reset, enter.
EATA0: reset, mbox 1 in reset.
EATA0: reset, board reset done, enabling interrupts.
EATA0: reset, interrupts disabled, loops 100415.
EATA0, reset, mbox 1 locked, DID_RESET, done.
EATA0: reset, exit, done.

and so on, finally hanging after printing "kexec_core: Starting new kernel" (I have a photo of the messages if they're needed).

So I'm still using the new patch but have to continue to remove and reinsert eata at start-up before any attempts to mount disks attached to the DPT SCSI controller.

Arthur.
Re: [PATCH] eata: Convert eata driver as normal PCI and platform device drivers
Christoph Hellwig wrote on 01/03/16 17:22:

Hi Jiang. I'd love to see this patch in and abuse of the old PCI API gone. Did you resolve the problems Arthur saw with the previous iterations of the patch?

I applied Jiang Liu's patch of 1st March 2016 to a clean kernel 4.5.0-rc6 source, removed my workaround of removing and re-adding the eata module before mounting file-systems that are on disks attached to the DPT SCSI card using the eata driver, and was able to kexec from the new kernel successfully.

Arthur.
eata module for DPT SCSI cards
Hi, I still need the following patches applied to be able to use the eata driver for my DPT2044W SCSI card.

Is there any chance that this could be mainlined, or another fix implemented that can be mainlined?

As it is, with the following patches applied, I still have to unload and reload the eata driver before mounting filesystems on the disk attached to the DPT2044W SCSI card that uses the eata driver, otherwise kexec reboots fail.

Without the patches applied, the machine locks up when it tries to load the eata module.

Arthur.

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index d7ffd66..8321c46 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -391,6 +391,7 @@ int __weak pcibios_alloc_irq(struct pci_dev *dev)
 {
 	return 0;
 }
+EXPORT_SYMBOL_GPL(pcibios_alloc_irq);
 
 void __weak pcibios_free_irq(struct pci_dev *dev)
 {
diff --git a/drivers/scsi/eata.c b/drivers/scsi/eata.c
index 227dd2c..7e6eaf8 100644
--- a/drivers/scsi/eata.c
+++ b/drivers/scsi/eata.c
@@ -1061,6 +1061,7 @@ static void enable_pci_ports(void)
 		       driver_name, dev->bus->number, dev->devfn);
 #endif
 
+		pcibios_alloc_irq(dev);
 		if (pci_enable_device(dev))
 			printk
 			    ("%s: warning, pci_enable_device failed, bus %d devfn 0x%x.\n",
@@ -1520,6 +1521,7 @@ static void add_pci_ports(void)
 		if (!(dev = pci_get_class(PCI_CLASS_STORAGE_SCSI << 8, dev)))
 			break;
 
+		pcibios_alloc_irq(dev);
 		if (pci_enable_device(dev)) {
 #if defined(DEBUG_PCI_DETECT)
 			printk

## end
Re: [RFT v3] eata: Convert eata driver as normal PCI and platform device drivers
Jiang Liu wrote on 03/10/15 17:41:

If I do a normal boot which includes eata being loaded, the disk attached to the DPT2044W controller having its filesystems checked and mounted, then attempt a kexec reboot, I get the reboot pausing after the "synchronizing SCSI cache" messages as before.

If I un-mount the filesystems on the disk attached to the DPT2044W controller after start-up and try a reboot I get the same problem.

If I do modprobe -r eata after un-mounting the filesystems on the disk attached to the DPT2044W controller after a start-up kexec *works fine*.

Hi Arthur,

The above results suggest that we need to shutdown eata controller for kexec. So could you please try to apply the attached patch upon the previous two patches?

Thanks!
Gerry

To clarify, if the eata driver gets loaded once and stays loaded, at a kexec reboot attempt the "Synchronising SCSI cache" message is missing for the SCSI disk attached to the controller using the eata driver, and eventually other error messages appear as seen in screen images that I have previously posted.

If the eata driver is loaded, unloaded via modprobe -r, then reloaded, a kexec reboot shows 2 "Synchronising SCSI cache" messages for the SCSI disk attached to the controller using the eata driver and the kexec reboot is successful.

Arthur.
Re: [RFT v3] eata: Convert eata driver as normal PCI and platform device drivers
Jiang Liu wrote on 03/10/15 17:41:

Hi Arthur,

The above results suggest that we need to shutdown eata controller for kexec. So could you please try to apply the attached patch upon the previous two patches?

Thanks!
Gerry

Hi, I still get kexec shutdown errors like this with the 3rd patch applied:

http://www.users.on.net/~arthur.marsh/20151003566.jpg

I can still unmount filesystems, modprobe -r eata and modprobe eata to get things into a state where a kexec reboot works.

Regards,

Arthur.
Re: [RFT v3] eata: Convert eata driver as normal PCI and platform device drivers
Arthur Marsh wrote on 24/09/15 15:26:

Jiang Liu wrote on 24/09/15 13:58:

Hi James,

Thanks for review. How about the attached patch which addresses the three suggestions from you?

Thanks!
Gerry

I've applied the patch, rebuilt the kernel and verified that it allows unloading of the eata module and reloading it, as well as a successful kexec.

Regards,

Arthur.

After some more thorough testing I've encountered an ongoing problem trying to use kexec with filesystems mounted with the eata driver.

If I boot up and have the eata driver loaded but no filesystem check or mounting of filesystems on the disk attached to the DPT2044W controller, then attempt a kexec reboot, I get the reboot pausing after the "synchronizing scsi cache" messages and getting the errors that I have included as pictures in my previous reports.

If I do a normal boot which includes eata being loaded, the disk attached to the DPT2044W controller having its filesystems checked and mounted, then attempt a kexec reboot, I get the reboot pausing after the "synchronizing SCSI cache" messages as before.

If I un-mount the filesystems on the disk attached to the DPT2044W controller after start-up and try a reboot I get the same problem.

If I do modprobe -r eata after un-mounting the filesystems on the disk attached to the DPT2044W controller after a start-up kexec *works fine*.

If I do:

start-up
un-mount filesystems on disk attached to DPT2044W controller
modprobe -r eata
modprobe eata
fsck -a of filesystems on disk attached to DPT2044W controller
mount filesystems

then a kexec reboot works fine.

I did some more experimenting and found a workaround. I was unable to blacklist the eata module, but if I did:

modprobe -r eata
modprobe eata

in a cron job before the fsck and mount commands then I could perform a kexec reboot successfully.

I also verified that if I did:

modprobe -r eata

after eata was loaded on boot-up, without any fsck or mounting of filesystems on the disk attached to the DPT2044W controller using the eata driver, the kexec reboot worked fine.

In summary:

If eata is loaded, a kexec reboot will fail unless a modprobe -r eata is done, either manually or by a cron job.

If a modprobe -r eata has been done, then even if I modprobe eata and fsck and mount filesystems, a kexec reboot works.

Any suggestions for further tests or checks welcome.

Arthur.
Re: [RFT v3] eata: Convert eata driver as normal PCI and platform device drivers
Jiang Liu wrote on 23/09/15 14:54:

Hi Arthur,

I have found the cause of the warning messages, it's caused by a flaw in the conversion. But according to my understanding, it isn't related to the kexec/kdump failure. Could you please help to test the attached new version?

Thanks!
Gerry

Thanks, the patch worked, I could successfully unload and reload the eata module, and perform a kexec reboot with the eata module loading successfully afterwards.

Arthur.
Re: [RFT v3] eata: Convert eata driver as normal PCI and platform device drivers
Jiang Liu wrote on 24/09/15 13:58:

Hi James,

Thanks for review. How about the attached patch which addresses the three suggestions from you?

Thanks!
Gerry

I've applied the patch, rebuilt the kernel and verified that it allows unloading of the eata module and reloading it, as well as a successful kexec.

Regards,

Arthur.
Re: [RFT v3] eata: Convert eata driver as normal PCI and platform device drivers
Jiang Liu wrote on 22/09/15 17:00:

Previously the eata driver just grabs and accesses eata PCI devices without implementing a PCI device driver, that causes troubles with latest IRQ related

Commit 991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()") changes the way to allocate PCI legacy IRQ for PCI devices on x86 platforms. Instead of allocating PCI legacy IRQs when pcibios_enable_device() gets called, now pcibios_alloc_irq() will be called by pci_device_probe() to allocate PCI legacy IRQs when binding PCI drivers to PCI devices.

But the eata driver directly accesses PCI devices without implementing corresponding PCI drivers, so pcibios_alloc_irq() won't be called for those PCI devices and wrong IRQ number may be used to manage the PCI device.

This patch implements a PCI device driver to manage eata PCI devices, so eata driver could properly cooperate with the PCI core. It also provides headroom for PCI hotplug with eata driver.

It also represents non-PCI eata devices as platform devices, so it could be managed as normal devices.

Signed-off-by: Jiang Liu
Cc: Hannes Reinecke
Cc: Ballabio, Dario
Cc: Christoph Hellwig
---

Not really any change with this driver:

previously

http://www.users.on.net/~arthur.marsh/20150915547.jpg

now

http://www.users.on.net/~arthur.marsh/20150922553.jpg

If there was any way of capturing any more debug output I'd be happy to do it.

Arthur.
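
The ordering problem described in the patch text above can be pictured with a small user-space model: once IRQ allocation happens only when a driver is bound through the core, a driver that grabs the device directly keeps seeing the old, unrouted IRQ number. The sketch below is only a toy. The toy_* names are hypothetical and are not kernel APIs, and the numbers 11 and 18 are borrowed from the "IRQ 11 mapped to IO-APIC IRQ 18" boot log line elsewhere in this archive purely as sample values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a PCI device; not the kernel's struct pci_dev. */
struct toy_pci_dev {
	int irq;		/* what a driver reads from the device */
	int routed_irq;		/* value once interrupt routing is set up */
};

/* Stand-in for the per-device IRQ allocation step. */
static void toy_alloc_irq(struct toy_pci_dev *dev)
{
	assert(dev != NULL);
	dev->irq = dev->routed_irq;
}

/* In the new flow, enabling the device does not touch the IRQ. */
static void toy_enable_device(struct toy_pci_dev *dev)
{
	(void)dev;
}

/* Binding through the core: the IRQ is allocated before the driver runs. */
static int toy_probe_via_core(struct toy_pci_dev *dev)
{
	toy_alloc_irq(dev);
	toy_enable_device(dev);
	return dev->irq;
}

/* Grabbing the device directly, with no driver binding: allocation is skipped. */
static int toy_grab_without_binding(struct toy_pci_dev *dev)
{
	toy_enable_device(dev);
	return dev->irq;	/* still the unrouted value */
}
```

With irq initialised to 11 and routed_irq to 18, toy_grab_without_binding() returns 11 while toy_probe_via_core() returns 18. The workaround patch posted in the "eata module for DPT SCSI cards" message corresponds, in this model, to calling toy_alloc_irq() by hand before toy_enable_device().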
Re: [RFT v3] eata: Convert eata driver as normal PCI and platform device drivers
James Bottomley wrote on 23/09/15 08:15: On Wed, 2015-09-23 at 07:55 +0930, Arthur Marsh wrote: Jiang Liu wrote on 22/09/15 17:00: Previously the eata driver just grabs and accesses eata PCI devices without implementing a PCI device driver, that causes troubles with latest IRQ related Commit 991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()") changes the way to allocate PCI legacy IRQ for PCI devices on x86 platforms. Instead of allocating PCI legacy IRQs when pcibios_enable_device() gets called, now pcibios_alloc_irq() will be called by pci_device_probe() to allocate PCI legacy IRQs when binding PCI drivers to PCI devices. But the eata driver directly accesses PCI devices without implementing corresponding PCI drivers, so pcibios_alloc_irq() won't be called for those PCI devices and wrong IRQ number may be used to manage the PCI device. This patch implements a PCI device driver to manage eata PCI devices, so eata driver could properly cooperate with the PCI core. It also provides headroom for PCI hotplug with eata driver. It also represents non-PCI eata devices as platform devices, so it could be managed as normal devices. Signed-off-by: Jiang Liu <jiang@linux.intel.com> Cc: Hannes Reinecke <h...@suse.de> Cc: Ballabio, Dario <dario.balla...@emc.com> Cc: Christoph Hellwig <h...@infradead.org> --- Not really any change with this driver: previously http://www.users.on.net/~arthur.marsh/20150915547.jpg now http://www.users.on.net/~arthur.marsh/20150922553.jpg If there was any way of capturing any more debug output I'd be happy to do it. It looks to be some problem in shut down. Can you simply remove and re-insert the driver successfully? If it's your root disk driver, you'll have to do this from an initrd so as not to have root mounted from the eata controller. If the remove and reinsert fails, it means we have a problem in the driver shut down. If not, it's likely something kexec related. 
James OK, it looks like there was a problem with unloading the driver. After un-mounting file systems on the disk attached to the SCSI controller using the eata driver I could do a: modprobe -r eata but received the output of the attached dmesg log. Attempting to do modprobe eata after the previous modprobe -r eata resulted in a complete lock-up. Arthur. [0.00] Initializing cgroup subsys cpuset [0.00] Initializing cgroup subsys cpu [0.00] Initializing cgroup subsys cpuacct [0.00] Linux version 4.3.0-rc2+ (root@victoria) (gcc version 5.2.1 20150911 (Debian 5.2.1-17) ) #49 SMP PREEMPT Tue Sep 22 04:58:18 ACST 2015 [0.00] x86/fpu: Legacy x87 FPU detected. [0.00] x86/fpu: Using 'lazy' FPU context switches. [0.00] e820: BIOS-provided physical RAM map: [0.00] BIOS-e820: [mem 0x-0x0009fbff] usable [0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved [0.00] BIOS-e820: [mem 0x000e-0x000f] reserved [0.00] BIOS-e820: [mem 0x0010-0x3ffa] usable [0.00] BIOS-e820: [mem 0x3ffb-0x3ffbdfff] ACPI data [0.00] BIOS-e820: [mem 0x3ffbe000-0x3ffd] ACPI NVS [0.00] BIOS-e820: [mem 0x3ffe-0x3fff] reserved [0.00] BIOS-e820: [mem 0xe000-0xefff] reserved [0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved [0.00] BIOS-e820: [mem 0xff78-0x] reserved [0.00] Notice: NX (Execute Disable) protection cannot be enabled: non-PAE kernel! [0.00] SMBIOS 2.3 present. 
[0.00] DMI: System manufacturer System Product Name/A8V-MX, BIOS 0503 12/06/2005 [0.00] e820: update [mem 0x-0x0fff] usable ==> reserved [0.00] e820: remove [mem 0x000a-0x000f] usable [0.00] e820: last_pfn = 0x3ffb0 max_arch_pfn = 0x10 [0.00] MTRR default type: uncachable [0.00] MTRR fixed ranges enabled: [0.00] 0-9 write-back [0.00] A-E uncachable [0.00] F-F write-protect [0.00] MTRR variable ranges enabled: [0.00] 0 base 00 mask FFC000 write-back [0.00] 1 base 00D000 mask FFF000 write-combining [0.00] 2 disabled [0.00] 3 disabled [0.00] 4 disabled [0.00] 5 disabled [0.00] 6 disabled [0.00] 7 disabled [0.00] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT [0.00] found SMP MP-table at [mem 0x000ff780-0x000ff78f] mapped at [c00ff780] [0.00] initial memory mapped: [mem 0x-0x023f] [0.00] Base memory trampoline at [c009b000] 9b000 size 16384 [0.00] init_memory_mapping: [mem 0x-0x000f] [0.00] [mem 0x0
Re: [Bugfix 3/3] eata: Enhance eata driver to support PCI device hot-removal
Christoph Hellwig wrote on 16/09/15 23:12:

Jiang, you also need to convert the driver to scsi_add_host/scsi_remove_host from the legacy scsi_register interface, otherwise the SCSI layer will be very unhappy. Take a look at commit 0d31f8759109cbc1e6fc196d08e6b0e8a9e93b3f for example, the change should be straight forward.

I am pleased to note that when I tried a Linus git head kernel from the last 24 hours, the IRQ routing for my DPT2044W SCSI card using eata module worked again, although the shut-down/kexec issue remains.

Arthur.
Re: [Bugfix 0/3] Convert eata driver to a normal PCI device driver
Jiang Liu wrote on 16/09/15 14:37:

On 2015/9/15 15:19, Arthur Marsh wrote:

Jiang Liu wrote on 15/09/15 12:01:

Hi Arthur,

Really appreciate your help to test the patches. That's a good sign we have moved forward a bit:) For kexec, it's always challenging to me. So could you please help to provide full dmesg logs with working kernels so I could try to figure out the order among scsi and PCI devices. It may be shutdown order related.

Thanks!
Gerry

OK, attached is the dmesg output from the 4.2.0 kernel where kexec worked.

Hi Arthur,

Could you please also help to capture the log messages of kexec? I need those log messages to figure out the order in which PCI devices and scsi devices are shut down during kexec.

Thanks!
Gerry

How would I capture the log messages of kexec (assuming that there are any, I couldn't see from the manual page entries and haven't seen anything beyond the screen images that I have already sent you)?

Regards,

Arthur.
Re: [Bugfix 0/3] Convert eata driver to a normal PCI device driver
Jiang Liu wrote on 16/09/15 17:51:

Hi Arthur,
It would be great if we could capture the text as in the picture posted by you at:
http://www.users.on.net/~arthur.marsh/20150915547.jpg
I guess a serial console could help us to capture those log messages. To use a serial console, we need to set up a serial cable, and configure grub and the kernel to use the serial port as console.
Thanks!
Gerry

Regards,

Arthur.

I've already included the text of what appeared in the image above:

sd 0:0:6:0: abort, mbox 63.
EATA0: abort, mbox 63 is in use.
sd 0:0:6:0: reset, enter.
EATA0: reset, mbox 63 in reset.
EATA0: reset, board reset done, enabling interrupts.
EATA0: reset, interrupts disabled, loops 100469.
EATA0: reset, mbox 63 locked, DID_RESET, done.
EATA0: reset, exit, done.
sd 0:0:6:0: qcomm, mbox 0, adapter busy, will start
sd 0:0:6:0: abort, mbox 0.
EATA0: abort, timeout error.
sd 0:0:6:0: reset, enter.
EATA0: reset, exit, timeout error.
sd 0:0:6:0 Device offlined - not ready after error recovery
sd 0:0:6:0 rejecting I/O to offline device
sd 0:0:6:0 rejecting I/O to offline device
sd 0:0:6:0 [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
starting new kernel

As mentioned previously this occurred after the normal Synchronizing SCSI cache messages. I don't think that there is anything else that gets sent to the console.

Arthur.
Re: [Bugfix 0/3] Convert eata driver to a normal PCI device driver
Jiang Liu wrote on 15/09/15 12:01:

Hi Arthur,
Really appreciate your help to test the patches. That's a good sign we have moved forward a bit:)
For kexec, it's always challenging to me. So could you please help to provide full dmesg logs with working kernels so I could try to figure out the order among scsi and PCI devices. It may be shutdown order related.
Thanks!
Gerry

OK, attached is the dmesg output from the 4.2.0 kernel where kexec worked.

Arthur.

[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Initializing cgroup subsys cpuacct
[0.00] Linux version 4.2.0 (root@am64) (gcc version 5.1.1 20150711 (Debian 5.1.1-14) ) #1921 SMP PREEMPT Sun Sep 6 00:08:31 ACST 2015
[0.00] x86/fpu: Legacy x87 FPU detected.
[0.00] x86/fpu: Using 'lazy' FPU context switches.
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x3ffa] usable
[0.00] BIOS-e820: [mem 0x3ffb-0x3ffbdfff] ACPI data
[0.00] BIOS-e820: [mem 0x3ffbe000-0x3ffd] ACPI NVS
[0.00] BIOS-e820: [mem 0x3ffe-0x3fff] reserved
[0.00] BIOS-e820: [mem 0xe000-0xefff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved
[0.00] BIOS-e820: [mem 0xff78-0x] reserved
[0.00] Notice: NX (Execute Disable) protection cannot be enabled: non-PAE kernel!
[0.00] SMBIOS 2.3 present.
[0.00] DMI: System manufacturer System Product Name/A8V-MX, BIOS 0503 12/06/2005
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0x3ffb0 max_arch_pfn = 0x10
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00] 0-9 write-back
[0.00] A-E uncachable
[0.00] F-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00] 0 base 00 mask FFC000 write-back
[0.00] 1 base 00D000 mask FFF000 write-combining
[0.00] 2 disabled
[0.00] 3 disabled
[0.00] 4 disabled
[0.00] 5 disabled
[0.00] 6 disabled
[0.00] 7 disabled
[0.00] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT
[0.00] found SMP MP-table at [mem 0x000ff780-0x000ff78f] mapped at [c00ff780]
[0.00] initial memory mapped: [mem 0x-0x023f]
[0.00] Base memory trampoline at [c009b000] 9b000 size 16384
[0.00] init_memory_mapping: [mem 0x-0x000f]
[0.00] [mem 0x-0x000f] page 4k
[0.00] init_memory_mapping: [mem 0x35c0-0x35ff]
[0.00] [mem 0x35c0-0x35ff] page 4M
[0.00] init_memory_mapping: [mem 0x0010-0x35bf]
[0.00] [mem 0x0010-0x003f] page 4k
[0.00] [mem 0x0040-0x35bf] page 4M
[0.00] init_memory_mapping: [mem 0x3600-0x377fdfff]
[0.00] [mem 0x3600-0x373f] page 4M
[0.00] [mem 0x3740-0x377fdfff] page 4k
[0.00] BRK [0x0207d000, 0x0207dfff] PGTABLE
[0.00] RAMDISK: [mem 0x3614-0x37097fff]
[0.00] ACPI: Early table checksum verification disabled
[0.00] ACPI: RSDP 0x000FAC60 24 (v02 ACPIAM)
[0.00] ACPI: XSDT 0x3FFB0100 3C (v01 A M I OEMXSDT 12000506 MSFT 0097)
[0.00] ACPI: FACP 0x3FFB0290 F4 (v03 A M I OEMFACP 12000506 MSFT 0097)
[0.00] ACPI: DSDT 0x3FFB03F0 0046F0 (v01 A0347 A0347001 0001 INTL 02002026)
[0.00] ACPI: FACS 0x3FFBE000 40
[0.00] ACPI: FACS 0x3FFBE000 40
[0.00] ACPI: APIC 0x3FFB0390 5C (v01 A M I OEMAPIC 12000506 MSFT 0097)
[0.00] ACPI: OEMB 0x3FFBE040 46 (v01 A M I AMI_OEM 12000506 MSFT 0097)
[0.00] ACPI: Local APIC address 0xfee0
[0.00] 135MB HIGHMEM available.
[0.00] 887MB LOWMEM available.
[0.00] mapped low ram: 0 - 377fe000
[0.00] low ram: 0 - 377fe000
[0.00] BRK [0x0207e000, 0x0207efff] PGTABLE
[0.00] Zone ranges:
[0.00] DMA [mem 0x1000-0x00ff]
[0.00] Normal [mem 0x0100-0x377fdfff]
[0.00] HighMem [mem 0x377fe000-0x3ffa]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]
Re: [Bugfix 0/3] Convert eata driver to a normal PCI device driver
Jiang Liu wrote on 14/09/15 12:38:

Hi Arthur,
As suggested by Bjorn, patches 1-2 implement a PCI device driver to manage eata PCI devices. Patch 3 tries to support PCI device hot-removal for eata, but I have had no chance to test it due to limited knowledge about the scsi subsystem and lack of hardware for tests. So could you please help to test patches 1-2? Patch 3 is just for comments.
Thanks!
Gerry

Jiang Liu (3):
  eata: Use IDA to manage eata board IDs
  eata: Implement PCI driver to manage eata PCI devices
  eata: Enhance eata driver to support PCI device hot-removal

 drivers/scsi/eata.c | 232 +++
 1 file changed, 125 insertions(+), 107 deletions(-)

With patches 1 and 2 applied, I get a successful boot with IRQ mapping:

[1.147056] EATA0: IRQ 10 mapped to IO-APIC IRQ 17.
[1.160404] EATA/DMA 2.0x: Copyright (C) 1994-2003 Dario Ballabio.
[1.160469] EATA config options -> tm:1, lc:y, mq:16, rs:y, et:n, ip:n, ep:n, pp:y.
[1.160541] EATA0: 2.0C, PCI 0xd890, IRQ 17, BMST, SG 122, MB 64.
[1.160600] EATA0: wide SCSI support enabled, max_id 16, max_lun 8.
[1.160658] EATA0: SCSI channel 0 enabled, host target ID 7.
[1.161207] scsi host0: EATA/DMA 2.0x rev. 8.10.00

but I still get errors when trying to do a kexec reboot, see
http://www.users.on.net/~arthur.marsh/20150915547.jpg

Roughly it reads (after the synchronising SCSI cache reboot messages and a long period of a dark screen):

sd 0:0:6:0: abort, mbox 63.
EATA0: abort, mbox 63 is in use.
sd 0:0:6:0: reset, enter.
EATA0: reset, mbox 63 in reset.
EATA0: reset, board reset done, enabling interrupts.
EATA0: reset, interrupts disabled, loops 100469.
EATA0: reset, mbox 63 locked, DID_RESET, done.
EATA0: reset, exit, done.
sd 0:0:6:0: qcomm, mbox 0, adapter busy, will start
sd 0:0:6:0: abort, mbox 0.
EATA0: abort, timeout error.
sd 0:0:6:0: reset, enter.
EATA0: reset, exit, timeout error.
sd 0:0:6:0 Device offlined - not ready after error recovery
sd 0:0:6:0 rejecting I/O to offline device
sd 0:0:6:0 rejecting I/O to offline device
sd 0:0:6:0 [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
starting new kernel

It would be great if this problem could be fixed.

Arthur.
Re: eata fails to load on post 4.2 kernels
Jiang Liu wrote on 08/09/15 14:49:

Hi Arthur,
Could you please help to apply the test patch against the latest mainstream linux kernel?
Thanks!
Gerry

...

git bisect good
991de2e59090e55c65a7f59a049142e3c480f7bd is the first bad commit
commit 991de2e59090e55c65a7f59a049142e3c480f7bd
Author: Jiang Liu
Date: Wed Jun 10 16:54:59 2015 +0800

    PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()

    To support IOAPIC hotplug, we need to allocate PCI IRQ resources on demand and free them when not used anymore.

    Implement pcibios_alloc_irq() and pcibios_free_irq() to dynamically allocate and free PCI IRQs.

    Remove mp_should_keep_irq(), which is no longer used.

    [bhelgaas: changelog]
    Signed-off-by: Jiang Liu
    Signed-off-by: Bjorn Helgaas
    Acked-by: Thomas Gleixner

:04 04 765e2d5232d53247ec260b34b51589c3bccb36ae f680234a27685e94b1a35ae2a7218f8eafa9071a M arch
:04 04 d55a682bcde72682e883365e88ad1df6186fd54d f82c470a04a6845fcf5e0aa934512c75628f798d M drivers

I tried to do a kexec shut-down with the first version of your patch:

>From 3085626fb2e677c1d88f158397948935b73f5239 Mon Sep 17 00:00:00 2001
From: Jiang Liu
Date: Tue, 8 Sep 2015 10:41:19 +0800
Subject: [PATCH]

Signed-off-by: Jiang Liu
---
 drivers/pci/pci-driver.c | 1 +
 drivers/scsi/eata.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 52a880ca1768..17d2a0b1de18 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -392,6 +392,7 @@ int __weak pcibios_alloc_irq(struct pci_dev *dev)
 {
 	return 0;
 }
+EXPORT_SYMBOL_GPL(pcibios_alloc_irq);
 
 void __weak pcibios_free_irq(struct pci_dev *dev)
 {
diff --git a/drivers/scsi/eata.c b/drivers/scsi/eata.c
index 227dd2c2ec2f..7e6eaf867987 100644
--- a/drivers/scsi/eata.c
+++ b/drivers/scsi/eata.c
@@ -1061,6 +1061,7 @@ static void enable_pci_ports(void)
 		       driver_name, dev->bus->number, dev->devfn);
 #endif
 
+		pcibios_alloc_irq(dev);
 		if (pci_enable_device(dev))
 			printk
 			    ("%s: warning, pci_enable_device failed, bus %d devfn 0x%x.\n",
@@ -1520,6 +1521,7 @@ static void add_pci_ports(void)
 		if (!(dev = pci_get_class(PCI_CLASS_STORAGE_SCSI << 8, dev)))
 			break;
 
+		pcibios_alloc_irq(dev);
 		if (pci_enable_device(dev)) {
 #if defined(DEBUG_PCI_DETECT)
 			printk
--
1.7.10.4

but I experience identical kexec shutdown and restart problems as with the second version of your patch, as seen here:
http://www.users.on.net/~arthur.marsh/20150910541.jpg

The original commit 991de2e59090e55c65a7f59a049142e3c480f7bd quoted above seems to have led not only to start-up problems unless irqpoll was enabled but also to kexec shutdown/restart problems.

I'm not sure what the solution is but it would be good to keep kexec reboots working.

Arthur.
Re: eata fails to load on post 4.2 kernels
Jiang Liu wrote on 10/09/15 17:43:

Hi Arthur,
Thanks for the update. It seems Bjorn doesn't like either of my two patches. So I'm trying to convert eata to a formal PCI driver, but the change will be much bigger and I'm still not sure whether we can achieve that. Will keep you updated.
Thanks!
Gerry

Thanks, I'm a bit concerned since the original commit 991de2e59090e55c65a7f59a049142e3c480f7bd broke things badly for me (requiring irqpoll to avoid a kernel hang) and neither of the patches enabled kexec reboots to work like before the original commit.

I just tested a kexec reboot with irqpoll enabled and that continues to fail, so I'm back to running the 4.2 kernel until there is a patch that works with kexec reboots.

Arthur.
Re: [Bugfix] PCI, x86: Correctly allocate IRQs for PCI devices managed by non-PCI drivers
Jiang Liu wrote on 08/09/15 16:56:

Commit 991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()") changes the way to allocate PCI legacy IRQ for PCI devices on x86 platforms. Instead of allocating PCI legacy IRQs when pcibios_enable_device() gets called, now pcibios_alloc_irq() will be called by pci_device_probe() to allocate PCI legacy IRQs when binding PCI drivers to PCI devices.

But some device drivers, such as eata, directly access PCI devices without implementing corresponding PCI drivers, so pcibios_alloc_irq() won't be called for those PCI devices and wrong IRQ number may be used to manage the PCI device.

So detect such a case in pcibios_enable_device() by checking pci_dev->driver is NULL and call pcibios_alloc_irq() to allocate PCI legacy IRQs.

Signed-off-by: Jiang Liu
---
 arch/x86/pci/common.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 09d3afc0a181..60b237783582 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -685,6 +685,16 @@ void pcibios_free_irq(struct pci_dev *dev)
 
 int pcibios_enable_device(struct pci_dev *dev, int mask)
 {
+	/*
+	 * By design, pcibios_alloc_irq() will be called by pci_device_probe()
+	 * when binding a PCI device to a PCI driver. But some device drivers,
+	 * such as eata, directly make use of PCI devices without implementing
+	 * PCI device drivers, so pcibios_alloc_irq() won't be called for those
+	 * PCI devices.
+	 */
+	if (!dev->driver)
+		pcibios_alloc_irq(dev);
+
 	return pci_enable_resources(dev, mask);
 }
 

Sorry for the late report but this patch messes up things for kexec - rebooting is delayed with the error messages as shown in the fuzzy screen image here:
http://www.users.on.net/~arthur.marsh/20150910541.jpg

(the error messages are similar to what I was seeing on boot-up before Jiang Liu's patch) and the SCSI card is not recognised by the kernel after a kexec restart, and eata fails to load.

Arthur.
Re: eata fails to load on post 4.2 kernels
Jiang Liu wrote on 08/09/15 14:49:

Hi Arthur,
Could you please help to apply the test patch against the latest mainstream linux kernel?
Thanks!
Gerry

Done, and it appears to work properly thanks!

Arthur.

[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Initializing cgroup subsys cpuacct
[0.00] Linux version 4.2.0+ (root@victoria) (gcc version 5.2.1 20150903 (Debian 5.2.1-16) ) #30 SMP PREEMPT Tue Sep 8 15:10:49 ACST 2015
[0.00] x86/fpu: Legacy x87 FPU detected.
[0.00] x86/fpu: Using 'lazy' FPU context switches.
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x3ffa] usable
[0.00] BIOS-e820: [mem 0x3ffb-0x3ffbdfff] ACPI data
[0.00] BIOS-e820: [mem 0x3ffbe000-0x3ffd] ACPI NVS
[0.00] BIOS-e820: [mem 0x3ffe-0x3fff] reserved
[0.00] BIOS-e820: [mem 0xe000-0xefff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved
[0.00] BIOS-e820: [mem 0xff78-0x] reserved
[0.00] Notice: NX (Execute Disable) protection cannot be enabled: non-PAE kernel!
[0.00] SMBIOS 2.3 present.
[0.00] DMI: System manufacturer System Product Name/A8V-MX, BIOS 0503 12/06/2005
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0x3ffb0 max_arch_pfn = 0x10
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00] 0-9 write-back
[0.00] A-E uncachable
[0.00] F-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00] 0 base 00 mask FFC000 write-back
[0.00] 1 base 00D000 mask FFF000 write-combining
[0.00] 2 disabled
[0.00] 3 disabled
[0.00] 4 disabled
[0.00] 5 disabled
[0.00] 6 disabled
[0.00] 7 disabled
[0.00] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT
[0.00] found SMP MP-table at [mem 0x000ff780-0x000ff78f] mapped at [c00ff780]
[0.00] initial memory mapped: [mem 0x-0x023f]
[0.00] Base memory trampoline at [c009b000] 9b000 size 16384
[0.00] init_memory_mapping: [mem 0x-0x000f]
[0.00] [mem 0x-0x000f] page 4k
[0.00] init_memory_mapping: [mem 0x35c0-0x35ff]
[0.00] [mem 0x35c0-0x35ff] page 4M
[0.00] init_memory_mapping: [mem 0x0010-0x35bf]
[0.00] [mem 0x0010-0x003f] page 4k
[0.00] [mem 0x0040-0x35bf] page 4M
[0.00] init_memory_mapping: [mem 0x3600-0x377fdfff]
[0.00] [mem 0x3600-0x373f] page 4M
[0.00] [mem 0x3740-0x377fdfff] page 4k
[0.00] BRK [0x02075000, 0x02075fff] PGTABLE
[0.00] RAMDISK: [mem 0x3614c000-0x3709dfff]
[0.00] ACPI: Early table checksum verification disabled
[0.00] ACPI: RSDP 0x000FAC60 24 (v02 ACPIAM)
[0.00] ACPI: XSDT 0x3FFB0100 3C (v01 A M I OEMXSDT 12000506 MSFT 0097)
[0.00] ACPI: FACP 0x3FFB0290 F4 (v03 A M I OEMFACP 12000506 MSFT 0097)
[0.00] ACPI: DSDT 0x3FFB03F0 0046F0 (v01 A0347 A0347001 0001 INTL 02002026)
[0.00] ACPI: FACS 0x3FFBE000 40
[0.00] ACPI: FACS 0x3FFBE000 40
[0.00] ACPI: APIC 0x3FFB0390 5C (v01 A M I OEMAPIC 12000506 MSFT 0097)
[0.00] ACPI: OEMB 0x3FFBE040 46 (v01 A M I AMI_OEM 12000506 MSFT 0097)
[0.00] ACPI: Local APIC address 0xfee0
[0.00] 135MB HIGHMEM available.
[0.00] 887MB LOWMEM available.
[0.00] mapped low ram: 0 - 377fe000
[0.00] low ram: 0 - 377fe000
[0.00] BRK [0x02076000, 0x02076fff] PGTABLE
[0.00] Zone ranges:
[0.00] DMA [mem 0x1000-0x00ff]
[0.00] Normal [mem 0x0100-0x377fdfff]
[0.00] HighMem [mem 0x377fe000-0x3ffa]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00] node 0: [mem 0x1000-0x0009efff]
[0.00] node 0: [mem 0x0010-0x3ffa]
[0.00] Initmem setup node 0 [mem 0x1000-0x3ffa]
[0.00] On node 0 totalpages: 261966
[
Re: [Bugfix] PCI, x86: Correctly allocate IRQs for PCI devices managed by non-PCI drivers
Jiang Liu wrote on 08/09/15 16:56:

Commit 991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()") changes the way to allocate PCI legacy IRQ for PCI devices on x86 platforms. Instead of allocating PCI legacy IRQs when pcibios_enable_device() gets called, now pcibios_alloc_irq() will be called by pci_device_probe() to allocate PCI legacy IRQs when binding PCI drivers to PCI devices.

But some device drivers, such as eata, directly access PCI devices without implementing corresponding PCI drivers, so pcibios_alloc_irq() won't be called for those PCI devices and wrong IRQ number may be used to manage the PCI device.

So detect such a case in pcibios_enable_device() by checking pci_dev->driver is NULL and call pcibios_alloc_irq() to allocate PCI legacy IRQs.

Signed-off-by: Jiang Liu
---
 arch/x86/pci/common.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 09d3afc0a181..60b237783582 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -685,6 +685,16 @@ void pcibios_free_irq(struct pci_dev *dev)
 
 int pcibios_enable_device(struct pci_dev *dev, int mask)
 {
+	/*
+	 * By design, pcibios_alloc_irq() will be called by pci_device_probe()
+	 * when binding a PCI device to a PCI driver. But some device drivers,
+	 * such as eata, directly make use of PCI devices without implementing
+	 * PCI device drivers, so pcibios_alloc_irq() won't be called for those
+	 * PCI devices.
+	 */
+	if (!dev->driver)
+		pcibios_alloc_irq(dev);
+
 	return pci_enable_resources(dev, mask);
 }
 

Thanks, I removed the test patch, applied the revised patch, built and rebooted the kernel, and successfully mounted file systems on a disk attached to the DPT 2044W card using the eata driver:

[0.00] Linux version 4.2.0+ (root@victoria) (gcc version 5.2.1 20150903 (Debian 5.2.1-16) ) #31 SMP PREEMPT Tue Sep 8 17:36:28 ACST 2015
...
[ 80.691097] EATA0: IRQ 10 mapped to IO-APIC IRQ 17.
[ 80.724519] EATA/DMA 2.0x: Copyright (C) 1994-2003 Dario Ballabio.
[ 80.752035] EATA config options -> tm:1, lc:y, mq:16, rs:y, et:n, ip:n, ep:n, pp:y.
[ 80.777063] EATA0: 2.0C, PCI 0xd890, IRQ 17, BMST, SG 122, MB 64.
[ 80.802391] EATA0: wide SCSI support enabled, max_id 16, max_lun 8.
[ 80.827959] EATA0: SCSI channel 0 enabled, host target ID 7.
[ 80.853413] scsi host3: EATA/DMA 2.0x rev. 8.10.00
[ 82.445662] scsi 3:0:6:0: Direct-Access IBM DCAS-34330W S65A PQ: 0 ANSI: 2
[ 82.471584] scsi 3:0:6:0: cmds/lun 16, sorted, simple tags.
[ 84.571451] sd 3:0:6:0: Attached scsi generic sg4 type 0
[ 84.597572] sd 3:0:6:0: [sdd] 8466688 512-byte logical blocks: (4.33 GB/4.03 GiB)
[ 84.659874] sd 3:0:6:0: [sdd] Write Protect is off
[ 84.688543] sd 3:0:6:0: [sdd] Mode Sense: b3 00 00 08
[ 84.714021] sd 3:0:6:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 84.817682] sdd: sdd1 sdd2 < sdd5 >
[ 84.919267] sd 3:0:6:0: [sdd] Attached SCSI disk

Arthur.
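The reasoning in the patch description above can be illustrated with a small userspace C model. This is only a sketch, not kernel code: every toy_* name is hypothetical, and the fixed routing of legacy IRQ 10 to IO-APIC IRQ 17 is an assumption taken from the log lines in this thread. It shows why a device that never goes through the probe path keeps its stale legacy IRQ number unless the enable path allocates the IRQ itself.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for kernel structures (hypothetical, for illustration only). */
struct toy_driver {
	const char *name;
};

struct toy_pci_dev {
	struct toy_driver *driver; /* NULL when no PCI driver is bound */
	int legacy_irq;            /* IRQ line reported by firmware */
	int irq;                   /* IRQ number a driver would request */
	int irq_allocated;         /* nonzero once routing has been done */
};

/* Assumed routing, mirroring "IRQ 10 mapped to IO-APIC IRQ 17" in the logs. */
int toy_route(int legacy_irq)
{
	return legacy_irq == 10 ? 17 : legacy_irq;
}

/* Models the role of pcibios_alloc_irq(): route the legacy IRQ once. */
void toy_alloc_irq(struct toy_pci_dev *dev)
{
	assert(dev != NULL);
	if (dev->irq_allocated)
		return;
	dev->irq = toy_route(dev->legacy_irq);
	dev->irq_allocated = 1;
}

/* Models the probe path: only reached when a PCI driver binds. */
void toy_device_probe(struct toy_pci_dev *dev, struct toy_driver *drv)
{
	toy_alloc_irq(dev);
	dev->driver = drv;
}

/*
 * Models the enable path. With fixed == 0 it never allocates the IRQ
 * (behaviour after commit 991de2e59090); with fixed != 0 it applies the
 * check from the patch: allocate when no driver is bound.
 */
void toy_enable_device(struct toy_pci_dev *dev, int fixed)
{
	if (fixed && dev->driver == NULL)
		toy_alloc_irq(dev);
}
```

Calling toy_enable_device() on a device with no bound driver leaves irq at 10 when fixed is 0 and yields 17 when fixed is 1, which corresponds to the "IRQ 10" and "IRQ 17" lines reported with and without the problem.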
Fwd: Re: eata fails to load on post 4.2 kernels
Forwarding without image attachment to get below message size limit of the mailing lists. I've uploaded the image to:
http://www.users.on.net/~arthur.marsh/20150907539.jpg

Forwarded Message
Subject: Re: eata fails to load on post 4.2 kernels
Date: Mon, 07 Sep 2015 15:56:02 +0930
From: Arthur Marsh <arthur.ma...@internode.on.net>
To: Jiang Liu <jiang@linux.intel.com>
CC: Bjorn Helgaas <bhelg...@google.com>, t...@linutronix.de, linux-scsi@vger.kernel.org, linux-ker...@vger.kernel.org

Jiang Liu wrote on 07/09/15 12:36:
On 2015/9/7 4:31, Arthur Marsh wrote:
Arthur Marsh wrote on 06/09/15 21:07:
Arthur Marsh wrote on 06/09/15 18:34:
Arthur Marsh wrote on 06/09/15 15:58:

Hi, I'm seeing the following on post 4.2 kernels, am currently bisecting to find where it started:

First kernel in the bisection that worked without needing irqpoll:

[ 73.751482] EATA0: IRQ 10 mapped to IO-APIC IRQ 17.
[ 73.776711] EATA/DMA 2.0x: Copyright (C) 1994-2003 Dario Ballabio.
[ 73.802005] EATA config options -> tm:1, lc:y, mq:16, rs:y, et:n, ip:n, ep:n, pp:y.
[ 73.829175] EATA0: 2.0C, PCI 0xd890, IRQ 17, BMST, SG 122, MB 64.
[ 73.82] EATA0: wide SCSI support enabled, max_id 16, max_lun 8.
[ 73.881125] EATA0: SCSI channel 0 enabled, host target ID 7.

After a git bisect, I get:

git bisect good
991de2e59090e55c65a7f59a049142e3c480f7bd is the first bad commit
commit 991de2e59090e55c65a7f59a049142e3c480f7bd
Author: Jiang Liu <jiang@linux.intel.com>
Date: Wed Jun 10 16:54:59 2015 +0800

    PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()

    To support IOAPIC hotplug, we need to allocate PCI IRQ resources on demand and free them when not used anymore.

    Implement pcibios_alloc_irq() and pcibios_free_irq() to dynamically allocate and free PCI IRQs.

    Remove mp_should_keep_irq(), which is no longer used.

    [bhelgaas: changelog]
    Signed-off-by: Jiang Liu <jiang@linux.intel.com>
    Signed-off-by: Bjorn Helgaas <bhelg...@google.com>
    Acked-by: Thomas Gleixner <t...@linutronix.de>

:04 04 765e2d5232d53247ec260b34b51589c3bccb36ae f680234a27685e94b1a35ae2a7218f8eafa9071a M arch
:04 04 d55a682bcde72682e883365e88ad1df6186fd54d f82c470a04a6845fcf5e0aa934512c75628f798d M drivers

I'm happy to supply more details if needed.

Hi Arthur,
Thanks for reporting this. It seems to be an irq misrouting issue. Could you please help to provide:
1) full dmesg with the latest code
2) full dmesg and /proc/interrupts with the latest code and kernel parameter "irqpoll" specified
Thanks!
Gerry

The pc locks up when loading the eata module so I've attached a photo of the monitor screen.

Arthur.

[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Initializing cgroup subsys cpuacct
[0.00] Linux version 4.2.0+ (root@victoria) (gcc version 5.2.1 20150903 (Debian 5.2.1-16) ) #29 SMP PREEMPT Mon Sep 7 07:10:45 ACST 2015
[0.00] x86/fpu: Legacy x87 FPU detected.
[0.00] x86/fpu: Using 'lazy' FPU context switches.
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x3ffa] usable
[0.00] BIOS-e820: [mem 0x3ffb-0x3ffbdfff] ACPI data
[0.00] BIOS-e820: [mem 0x3ffbe000-0x3ffd] ACPI NVS
[0.00] BIOS-e820: [mem 0x3ffe-0x3fff] reserved
[0.00] BIOS-e820: [mem 0xe000-0xefff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved
[0.00] BIOS-e820: [mem 0xff78-0x] reserved
[0.00] Notice: NX (Execute Disable) protection cannot be enabled: non-PAE kernel!
[0.00] SMBIOS 2.3 present.
[0.00] DMI: System manufacturer System Product Name/A8V-MX, BIOS 0503 12/06/2005
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0x3ffb0 max_arch_pfn = 0x10
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00] 0-9 write-back
[0.00] A-E uncachable
[0.00] F-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00] 0 base 00 mask FFC000 write-back
[0.00] 1 base 00D000 mask FFF000 write-combining
[0.00] 2 disabled
[0.00] 3 disabled
[0.00] 4 disabled
[0.00] 5 disabled
[0.00] 6 disabled
[0.0
Re: eata fails to load on post 4.2 kernels
Arthur Marsh wrote on 06/09/15 15:58:

Hi, I'm seeing the following on post 4.2 kernels, am currently bisecting to find where it started:

An error message suggested trying setting irqpoll on the kernel command line, which worked:

[ 85.230148] EATA/DMA 2.0x: Copyright (C) 1994-2003 Dario Ballabio.
[ 85.255929] EATA config options -> tm:1, lc:y, mq:16, rs:y, et:n, ip:n, ep:n, pp:y.
[ 85.282472] EATA0: 2.0C, PCI 0xd890, IRQ 10, BMST, SG 122, MB 64.
[ 85.308281] EATA0: wide SCSI support enabled, max_id 16, max_lun 8.
[ 85.333237] EATA0: SCSI channel 0 enabled, host target ID 7.
[ 85.358097] scsi host3: EATA/DMA 2.0x rev. 8.10.00
[ 86.950246] scsi 3:0:6:0: Direct-Access IBM DCAS-34330W S65A PQ: 0 ANSI: 2
[ 86.975531] scsi 3:0:6:0: cmds/lun 16, sorted, simple tags.
[ 89.075921] sd 3:0:6:0: Attached scsi generic sg4 type 0
[ 89.101628] sd 3:0:6:0: [sdd] 8466688 512-byte logical blocks: (4.33 GB/4.03 GiB)
[ 89.166331] sd 3:0:6:0: [sdd] Write Protect is off
[ 89.192023] sd 3:0:6:0: [sdd] Mode Sense: b3 00 00 08
[ 89.209400] sd 3:0:6:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 89.312977] sdd: sdd1 sdd2 < sdd5 >
[ 89.402386] sd 3:0:6:0: [sdd] Attached SCSI disk
Re: eata fails to load on post 4.2 kernels
Arthur Marsh wrote on 06/09/15 18:34:

Arthur Marsh wrote on 06/09/15 15:58:

Hi, I'm seeing the following on post 4.2 kernels, am currently bisecting to find where it started:

First kernel in the bisection that worked without needing irqpoll:

[ 73.751482] EATA0: IRQ 10 mapped to IO-APIC IRQ 17.
[ 73.776711] EATA/DMA 2.0x: Copyright (C) 1994-2003 Dario Ballabio.
[ 73.802005] EATA config options -> tm:1, lc:y, mq:16, rs:y, et:n, ip:n, ep:n, pp:y.
[ 73.829175] EATA0: 2.0C, PCI 0xd890, IRQ 17, BMST, SG 122, MB 64.
[ 73.82] EATA0: wide SCSI support enabled, max_id 16, max_lun 8.
[ 73.881125] EATA0: SCSI channel 0 enabled, host target ID 7.
[ 73.906599] scsi host3: EATA/DMA 2.0x rev. 8.10.00
[ 75.466016] scsi 3:0:6:0: Direct-Access IBM DCAS-34330W S65A PQ: 0 ANSI: 2
[ 75.491947] scsi 3:0:6:0: cmds/lun 16, sorted, simple tags.
[ 77.560139] sd 3:0:6:0: Attached scsi generic sg4 type 0
[ 77.586272] sd 3:0:6:0: [sdd] 8466688 512-byte logical blocks: (4.33 GB/4.03 GiB)
[ 77.671836] sd 3:0:6:0: [sdd] Write Protect is off
[ 77.700217] sd 3:0:6:0: [sdd] Mode Sense: b3 00 00 08
[ 77.725970] sd 3:0:6:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 77.829574] sdd: sdd1 sdd2 < sdd5 >
[ 77.929879] sd 3:0:6:0: [sdd] Attached SCSI disk
Re: [PATCH] eata: remove driver_lock
Christoph Hellwig wrote, on 14/07/14 17:56:

port_detect is only called from the module_init routine and thus implicitly serialized, so remove the driver lock which was held over potentially sleeping function calls.

Signed-off-by: Christoph Hellwig <h...@lst.de>
Reported-by: Arthur Marsh <arthur.ma...@internode.on.net>
Tested-by: Arthur Marsh <arthur.ma...@internode.on.net>
---
 drivers/scsi/eata.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/drivers/scsi/eata.c b/drivers/scsi/eata.c
index 03372cf..980898e 100644
--- a/drivers/scsi/eata.c
+++ b/drivers/scsi/eata.c
@@ -837,7 +837,6 @@ struct hostdata {
 static struct Scsi_Host *sh[MAX_BOARDS];
 static const char *driver_name = "EATA";
 static char sha[MAX_BOARDS];
-static DEFINE_SPINLOCK(driver_lock);
 
 /* Initialize num_boards so that ihdlr can work while detect is in progress */
 static unsigned int num_boards = MAX_BOARDS;
@@ -1097,8 +1096,6 @@ static int port_detect(unsigned long port_base, unsigned int j,
 		goto fail;
 	}
 
-	spin_lock_irq(&driver_lock);
-
 	if (do_dma(port_base, 0, READ_CONFIG_PIO)) {
 #if defined(DEBUG_DETECT)
 		printk("%s: detect, do_dma failed at 0x%03lx.\n", name,
@@ -1265,10 +1262,7 @@ static int port_detect(unsigned long port_base, unsigned int j,
 	}
 #endif
 
-	spin_unlock_irq(&driver_lock);
 	sh[j] = shost = scsi_register(tpnt, sizeof(struct hostdata));
-	spin_lock_irq(&driver_lock);
-
 	if (shost == NULL) {
 		printk("%s: unable to register host, detaching.\n", name);
 		goto freedma;
@@ -1345,8 +1339,6 @@ static int port_detect(unsigned long port_base, unsigned int j,
 	else
 		sprintf(dma_name, "DMA %u", dma_channel);
 
-	spin_unlock_irq(&driver_lock);
-
 	for (i = 0; i < shost->can_queue; i++)
 		ha->cp[i].cp_dma_addr = pci_map_single(ha->pdev,
 							  &ha->cp[i],
@@ -1439,7 +1431,6 @@ static int port_detect(unsigned long port_base, unsigned int j,
 freeirq:
 	free_irq(irq, &sha[j]);
 freelock:
-	spin_unlock_irq(&driver_lock);
 	release_region(port_base, REGION_SIZE);
 fail:
 	return 0;

Not sure if this is related but it only appeared in the last few days in Linus' git master:

[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Initializing cgroup subsys cpuacct
[0.00] Linux version 3.16.0+ (root@am64) (gcc version 4.9.1 (Debian 4.9.1-5) ) #1141 SMP Sun Aug 10 20:50:33 ACST 2014
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x0100-0x0009efff] usable
[0.00] BIOS-e820: [mem 0x0009f000-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000f-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x5fffbfff] usable
[0.00] BIOS-e820: [mem 0x5fffc000-0x5fffefff] ACPI data
[0.00] BIOS-e820: [mem 0x5000-0x5fff] ACPI NVS
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved
[0.00] BIOS-e820: [mem 0xfee0-0xfee00fff] reserved
[0.00] BIOS-e820: [mem 0x-0x] reserved
[0.00] Notice: NX (Execute Disable) protection missing in CPU!
[0.00] SMBIOS 2.3 present.
[0.00] DMI: System Manufacturer System Name/P4S800, BIOS ASUS P4S800 ACPI BIOS Revision 1011 Beta 001 08/30/2005
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0x5fffc max_arch_pfn = 0x10
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00] 0-9 write-back
[0.00] A-B uncachable
[0.00] C-C7FFF write-protect
[0.00] C8000-E uncachable
[0.00] F-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00] 0 base 0 mask FC000 write-back
[0.00] 1 base 04000 mask FE000 write-back
[0.00] 2 base 0C000 mask FF000 write-combining
[0.00] 3 disabled
[0.00] 4 disabled
[0.00] 5 disabled
[0.00] 6 disabled
[0.00] 7 disabled
[0.00] x86 PAT enabled: cpu 0, old 0x7010600070106, new 0x7010600070106
[0.00] initial memory mapped: [mem 0x-0x023f]
[0.00] Base memory trampoline at [c009b000] 9b000 size 16384
[0.00] init_memory_mapping: [mem 0x-0x000f]
[0.00] [mem 0x-0x000f] page 4k
[0.00] init_memory_mapping: [mem 0x3700-0x373f]
[0.00] [mem 0x3700-0x373f] page 2M
[0.00] init_memory_mapping: [mem 0x3000-0x36ff
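The problem the patch description above refers to, a potentially sleeping call made while a spinlock is held with interrupts disabled, can be illustrated with a small userspace C model. This is only a sketch under stated assumptions: the toy_* names are hypothetical, nothing here blocks or locks for real, and the "violation" counter merely stands in for the kind of warning a kernel would print.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy execution context (hypothetical, for illustration only). */
struct toy_ctx {
	bool irqs_disabled; /* true while a spin_lock_irq()-style lock is held */
	int violations;     /* sleeping calls attempted in atomic context */
};

void toy_lock_irq(struct toy_ctx *c)
{
	c->irqs_disabled = true;
}

void toy_unlock_irq(struct toy_ctx *c)
{
	c->irqs_disabled = false;
}

/* Stand-in for any routine that may block, such as a waiting allocation. */
void toy_may_sleep(struct toy_ctx *c)
{
	assert(c != NULL);
	if (c->irqs_disabled)
		c->violations++;
}

/* Old shape: detection work performed with the lock held. */
int toy_detect_locked(void)
{
	struct toy_ctx c = { false, 0 };

	toy_lock_irq(&c);
	toy_may_sleep(&c);
	toy_unlock_irq(&c);
	return c.violations;
}

/* New shape: no lock; the single module-init caller serializes the work. */
int toy_detect_unlocked(void)
{
	struct toy_ctx c = { false, 0 };

	toy_may_sleep(&c);
	return c.violations;
}
```

In this model toy_detect_locked() reports one violation and toy_detect_unlocked() reports none, which is the argument for dropping the lock when there is only ever one caller.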
Re: eata - issue appeared in Linus git master in last 24-48 hours
Christoph Hellwig wrote, on 11/07/14 18:50:
> On Mon, Jun 30, 2014 at 04:31:33AM +0930, Arthur Marsh wrote:
>> Hi,
>>
>> I haven't had time to do a git bisect yet, but just saw this after rebuilding the kernel in the last day or so:
>
> It seems like some of the routines called during the driver initialization may sleep while the driver_lock is held and irqs are disabled. As eata2x_detect is only called during module load the lock seems entirely pointless and should be removed, like in the patch below:
>
> diff --git a/drivers/scsi/eata.c b/drivers/scsi/eata.c
> index 03372cf..980898e 100644
> --- a/drivers/scsi/eata.c
> +++ b/drivers/scsi/eata.c
> @@ -837,7 +837,6 @@ struct hostdata {
>  static struct Scsi_Host *sh[MAX_BOARDS];
>  static const char *driver_name = "EATA";
>  static char sha[MAX_BOARDS];
> -static DEFINE_SPINLOCK(driver_lock);
>
>  /* Initialize num_boards so that ihdlr can work while detect is in progress */
>  static unsigned int num_boards = MAX_BOARDS;
> @@ -1097,8 +1096,6 @@ static int port_detect(unsigned long port_base, unsigned int j,
>  		goto fail;
>  	}
>
> -	spin_lock_irq(&driver_lock);
> -
>  	if (do_dma(port_base, 0, READ_CONFIG_PIO)) {
>  #if defined(DEBUG_DETECT)
>  		printk("%s: detect, do_dma failed at 0x%03lx.\n", name,
> @@ -1265,10 +1262,7 @@ static int port_detect(unsigned long port_base, unsigned int j,
>  	}
>  #endif
>
> -	spin_unlock_irq(&driver_lock);
>  	sh[j] = shost = scsi_register(tpnt, sizeof(struct hostdata));
> -	spin_lock_irq(&driver_lock);
> -
>  	if (shost == NULL) {
>  		printk("%s: unable to register host, detaching.\n", name);
>  		goto freedma;
> @@ -1345,8 +1339,6 @@ static int port_detect(unsigned long port_base, unsigned int j,
>  	else
>  		sprintf(dma_name, "DMA %u", dma_channel);
>
> -	spin_unlock_irq(&driver_lock);
> -
>  	for (i = 0; i < shost->can_queue; i++)
>  		ha->cp[i].cp_dma_addr = pci_map_single(ha->pdev,
>  						       &ha->cp[i],
> @@ -1439,7 +1431,6 @@ static int port_detect(unsigned long port_base, unsigned int j,
>  freeirq:
>  	free_irq(irq, &sha[j]);
>  freelock:
> -	spin_unlock_irq(&driver_lock);
>  	release_region(port_base, REGION_SIZE);
>  fail:
>  	return 0;

Thanks, I've rebuilt the kernel with this patch applied and am running the rebuilt kernel without problems using a DPT 2044W SCSI adaptor:

$ lspci|grep DPT
00:0c.0 SCSI storage controller: Adaptec (formerly DPT) SmartCache/Raid I-IV Controller (rev 02)

$ dmesg|grep -i eata
[1.038968] EATA0: warning, DMA protocol support not asserted.
[1.039041] EATA0: IRQ 11 mapped to IO-APIC IRQ 16.
[1.040801] EATA/DMA 2.0x: Copyright (C) 1994-2003 Dario Ballabio.
[1.040861] EATA config options - tm:1, lc:y, mq:16, rs:y, et:n, ip:n, ep:n, pp:y.
[1.040922] EATA0: 2.0C, PCI 0x7410, IRQ 16, BMST, SG 122, MB 64.
[1.040973] EATA0: wide SCSI support enabled, max_id 16, max_lun 8.
[1.041025] EATA0: SCSI channel 0 enabled, host target ID 7.
[1.041095] scsi2 : EATA/DMA 2.0x rev. 8.10.00

Arthur.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
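The reasoning in the quoted patch description (a step that may sleep must not run while a spinlock is held with irqs disabled, and a path that only runs once at module load does not need the lock at all) can be sketched in plain userspace C. This is only an illustration under stated assumptions: every toy_* name below is hypothetical, none of it is the eata driver or a kernel API, and the "may sleep" step stands in for whichever init routines actually sleep.

```c
#include <assert.h>

/* Toy execution context: tracks whether we are inside an "atomic" section. */
struct toy_ctx {
	int atomic_depth;      /* > 0 while a toy spinlock is held */
	int sleep_violations;  /* times a sleeping step ran while atomic */
};

/* Hypothetical stand-in for taking a spinlock with irqs disabled. */
void toy_spin_lock_irq(struct toy_ctx *c)
{
	c->atomic_depth++;
}

/* Hypothetical stand-in for releasing that lock. */
void toy_spin_unlock_irq(struct toy_ctx *c)
{
	assert(c->atomic_depth > 0);
	c->atomic_depth--;
}

/* Hypothetical init step that may sleep; records misuse instead of crashing. */
void toy_step_may_sleep(struct toy_ctx *c)
{
	if (c->atomic_depth > 0)
		c->sleep_violations++;
}

/* Shape of a detect path that holds the lock across a sleeping step. */
int toy_detect_with_lock(void)
{
	struct toy_ctx c = { 0, 0 };

	toy_spin_lock_irq(&c);
	toy_step_may_sleep(&c);
	toy_spin_unlock_irq(&c);
	return c.sleep_violations;
}

/* Shape of the same path once the lock is dropped: the single caller
 * (module load) already serializes it, so nothing is lost. */
int toy_detect_without_lock(void)
{
	struct toy_ctx c = { 0, 0 };

	toy_step_may_sleep(&c);
	return c.sleep_violations;
}
```

In this model toy_detect_with_lock() reports one violation and toy_detect_without_lock() reports none, which is the difference the patch is after.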
eata - issue appeared in Linus git master in last 24-48 hours
Hi,

I haven't had time to do a git bisect yet, but just saw this after rebuilding the kernel in the last day or so:

[1.044035] EATA0: warning, DMA protocol support not asserted.
[1.044035] EATA0: IRQ 11 mapped to IO-APIC IRQ 16.
[1.046040] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[1.046123] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[1.046204] usb usb1: Product: EHCI Host Controller
[1.046275] usb usb1: Manufacturer: Linux 3.16.0-rc2+ ehci_hcd
[1.046348] usb usb1: SerialNumber: :00:03.3
[1.049496] hub 1-0:1.0: USB hub found
[1.050029] hub 1-0:1.0: 6 ports detected
[1.050625] BUG: spinlock wrong CPU on CPU#1, systemd-udevd/63
[1.050700] lock: driver_lock+0x0/0xef00 [eata], .magic: dead4ead, .owner: systemd-udevd/63, .owner_cpu: 0
[1.050785] CPU: 1 PID: 63 Comm: systemd-udevd Not tainted 3.16.0-rc2+ #1038
[1.050850] Hardware name: System Manufacturer System Name/P4S800, BIOS ASUS P4S800 ACPI BIOS Revision 1011 Beta 001 08/30/2005
[1.050935] f8048100 f707fad8 c1416ba9 f43aad44 f707fb04 c1081b6f c158aab0
[1.051301] f8048100 dead4ead f43aad44 003f f8048100 c155b1eb 0 010
[1.051678] f707fb14 c1081bdc f8048100 f50e8000 f707fb20 c1081e43 f8048100 f707fb2c
[1.052051] Call Trace:
[1.052119] [<c1416ba9>] dump_stack+0x41/0x52
[1.052183] [<c1081b6f>] spin_dump+0x8c/0xde
[1.052249] [<c1081bdc>] spin_bug+0x1b/0x1f
[1.052310] [<c1081e43>] do_raw_spin_unlock+0x79/0x7b
[1.052377] [<c141c495>] _raw_spin_unlock_irq+0x1d/0x26
[1.052444] [<f8046176>] port_detect+0xa54/0xefc [eata]
[1.052509] [<c141a510>] ? __mutex_unlock_slowpath+0xb6/0x136
[1.052576] [<c141a598>] ? mutex_unlock+0x8/0xa
[1.052641] [<c102d766>] ? ioapic_write_entry+0x17/0x43
[1.052706] [<c102d78b>] ? ioapic_write_entry+0x3c/0x43
[1.052771] [<c102d78b>] ? ioapic_write_entry+0x3c/0x43
[1.052837] [<c102e633>] ? io_apic_setup_irq_pin+0x175/0x319
[1.052904] [<c12804bf>] ? acpi_os_release_lock+0x8/0xa
[1.052970] [<c131f7dc>] ? pci_conf1_read+0x43/0xdd
[1.053036] [<c131f801>] ? pci_conf1_read+0x68/0xdd
[1.053101] [<c1410ccc>] ? klist_next+0x1b/0xef
[1.053166] [<c1410d9e>] ? klist_next+0xed/0xef
[1.053237] [<c141c449>] ? _raw_spin_unlock+0x1d/0x20
[1.053304] [<c1410d9e>] ? klist_next+0xed/0xef
[1.053383] [<c12505f6>] ? pci_do_find_bus+0x36/0x36
[1.053449] [<c12e1a18>] ? bus_find_device+0x5b/0x7d
[1.053511] [<c12dfc7c>] ? put_device+0xf/0x11
[1.053571] [<c124f172>] ? pci_dev_put+0xf/0x11
[1.053635] [<c125078e>] ? pci_get_dev_by_id+0x3f/0x8a
[1.053701] [<c12505f6>] ? pci_do_find_bus+0x36/0x36
[1.053763] [<c12508d4>] ? pci_get_class+0x46/0x48
[1.053829] [<f80466f7>] eata2x_detect+0xd9/0x3ef [eata]
[1.053836] ohci-pci: OHCI PCI platform driver
[1.054213] ohci-pci :00:03.0: OHCI PCI host controller
[1.054231] ohci-pci :00:03.0: new USB bus registered, assigned bus number 2
[1.054293] ohci-pci :00:03.0: irq 9, io mem 0xbe80
[1.054782] [<f8021000>] ? 0xf8020fff
[1.054853] [<f8021054>] init_this_scsi_driver+0x54/0x1000 [eata]
[1.054923] [<f8021000>] ? 0xf8020fff
[1.054987] [<c100041b>] do_one_initcall+0x75/0x198
[1.055051] [<f8021000>] ? 0xf8020fff
[1.055115] [<c11255cd>] ? __vunmap+0x77/0xce
[1.055179] [<c10aa53b>] load_module+0x19a6/0x224a
[1.055248] [<c10aaed2>] SyS_finit_module+0x5c/0x6b
[1.055320] [<f8039000>] ? 0xf8038fff
[1.055385] [<c141ce0e>] syscall_call+0x7/0xb
[1.060902] EATA/DMA 2.0x: Copyright (C) 1994-2003 Dario Ballabio.
[1.060966] EATA config options - tm:1, lc:y, mq:16, rs:y, et:n, ip:n, ep:n, pp:y.
[1.061029] EATA0: 2.0C, PCI 0x7410, IRQ 16, BMST, SG 122, MB 64.
[1.061080] EATA0: wide SCSI support enabled, max_id 16, max_lun 8.
[1.061132] EATA0: SCSI channel 0 enabled, host target ID 7.
[1.061192] scsi0 : EATA/DMA 2.0x rev. 8.10.00

The machine has a dual core P4 and the kernel was compiled with gcc-4.9.0:

Linux version 3.16.0-rc2+ (root@am64) (gcc version 4.9.0 (Debian 4.9.0-9) ) #1038 SMP Sun Jun 29 10:19:20 CST 2014

The actual SCSI HBA is a DPT 2044W.

Regards,

Arthur.
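For readers wondering what "spinlock wrong CPU" means in the report above: the log shows the lock recording .owner_cpu: 0 while the unlock ran on CPU#1, i.e. the task holding the lock ended up on a different CPU before releasing it. A minimal userspace sketch of that kind of owner check follows. It is a toy model under stated assumptions, not the kernel's spinlock debugging code; all toy_* names and the result codes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the debug bookkeeping attached to a lock. */
struct toy_spinlock {
	int locked;     /* 1 while held */
	int owner_cpu;  /* CPU that took the lock, -1 when free */
};

enum toy_result { TOY_OK = 0, TOY_WRONG_CPU = 1, TOY_NOT_LOCKED = 2 };

/* Put the lock into its free state. */
void toy_lock_init(struct toy_spinlock *l)
{
	assert(l != NULL);
	l->locked = 0;
	l->owner_cpu = -1;
}

/* Take the lock and remember which CPU the caller was running on. */
void toy_lock(struct toy_spinlock *l, int cpu)
{
	assert(l != NULL && !l->locked);
	l->locked = 1;
	l->owner_cpu = cpu;
}

/* Release the lock; report a mismatch if the caller is now on another CPU. */
enum toy_result toy_unlock(struct toy_spinlock *l, int cpu)
{
	enum toy_result r = TOY_OK;

	assert(l != NULL);
	if (!l->locked)
		return TOY_NOT_LOCKED;
	if (l->owner_cpu != cpu)
		r = TOY_WRONG_CPU;  /* the situation the BUG line reports */
	l->locked = 0;
	l->owner_cpu = -1;
	return r;
}
```

Locking on CPU 0 and unlocking on CPU 0 returns TOY_OK; locking on CPU 0 and unlocking on CPU 1 returns TOY_WRONG_CPU. A task can only change CPUs in between if it was allowed to sleep or be rescheduled while holding the lock, which is why the mismatch points at a sleeping call under the lock.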
Re: eata - issue appeared in Linus git master in last 24-48 hours
Arthur Marsh wrote, on 30/06/14 04:31:
> Hi, I haven't had time to do a git bisect yet, but just saw this after rebuilding the kernel in the last day or so:
> [...]
> [1.050625] BUG: spinlock wrong CPU on CPU#1, systemd-udevd/63
> [...]

This wasn't repeated on a reboot, so at this stage it is a one-off problem.

Arthur.