Re: [PATCH] scsi: eata: drop VLA in reorder()

2018-03-12 Thread Arthur Marsh



Linus Torvalds wrote on 13/03/18 05:15:

On Sun, Mar 11, 2018 at 8:08 PM, Tobin C. Harding <to...@apporbit.com> wrote:


I think we are going to see a recurring theme here.  MAX_MAILBOXES==64
so this patch adds 1536 bytes to the stack on a 64 bit machine or 768
bytes on a 32 bit machine.
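
As a rough check of those figures, the sketch below assumes the patch turns three variable-length arrays into fixed arrays of MAX_MAILBOXES machine words each. The array count of three and the helper name fixed_array_stack_bytes() are assumptions chosen to match the quoted totals; they are not taken from the patch text in this thread.

```c
#include <stddef.h>

#define MAX_MAILBOXES 64	/* value quoted in the thread */

/*
 * Stack bytes taken by n_arrays fixed-size arrays of n_entries words,
 * where one word is word_bytes wide (4 on 32-bit, 8 on 64-bit).
 * Hypothetical helper, for illustration only.
 */
size_t fixed_array_stack_bytes(size_t n_arrays, size_t n_entries,
			       size_t word_bytes)
{
	return n_arrays * n_entries * word_bytes;
}
```

With three arrays this gives 3 * 64 * 8 = 1536 bytes for 8-byte words and 3 * 64 * 4 = 768 bytes for 4-byte words, matching the numbers above.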


Yeah, that's a bit excessive. It probably works, but one or two of
those allocations will make the kernel stack really tight, so in
general I really would suggest using kmalloc() instead, or figuring
out some way to simply shrink the data structures.
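
A minimal userspace sketch of the heap alternative suggested above, assuming the scratch data is only needed for the duration of one call. calloc()/free() stand in for kmalloc()/kfree() here, and sort_keys_heap() is a hypothetical helper, not a function from the eata driver.

```c
#include <stdlib.h>
#include <string.h>

/* Ascending comparison of two unsigned long keys for qsort(). */
static int cmp_ulong(const void *a, const void *b)
{
	unsigned long x = *(const unsigned long *)a;
	unsigned long y = *(const unsigned long *)b;

	return (x > y) - (x < y);
}

/*
 * Sort n keys into out[] using heap scratch space instead of an
 * on-stack array. Returns 0 on success, -1 if the allocation fails.
 */
int sort_keys_heap(const unsigned long *keys, size_t n, unsigned long *out)
{
	unsigned long *scratch;

	if (n == 0)
		return 0;	/* nothing to sort, nothing to allocate */

	scratch = calloc(n, sizeof(*scratch));
	if (!scratch)
		return -1;	/* caller must cope with allocation failure */

	memcpy(scratch, keys, n * sizeof(*scratch));
	qsort(scratch, n, sizeof(*scratch), cmp_ulong);
	memcpy(out, scratch, n * sizeof(*scratch));
	free(scratch);
	return 0;
}
```

The trade-off is that the stack stays small, but the caller gains an error path that a stack array never needed.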

That said, I wonder if the solution to this particular driver is
"delete it". Because the hardware is truly ancient and nobody sane
would use it any more.

The last patch that seemed to come from an actual _user_ finding a
problem was in 2008 (commit 20c09df7eb9c: "[SCSI] eata: fix the data
buffer accessors conversion regression"). And even then it apparently
took a year for people to have noticed the breakage.

But because the person who reported that problem is still around, I'll
just add him to the cc, just in case.

Arthur Marsh, you have the dubious honor and distinction of being the
only person to have apparently used that driver in the last ten years.
Do you still have hardware using that? Because maybe it's really time
to retire that driver.

Linus



Hi Linus and maintainers, thanks for the courtesy email and all the help 
with the driver.


I am unable to make use of the driver any more due to failed hardware.

The DPT2044W SCSI controller and the IBM disk from May 1998 last 
officially ran on 7 August 2017. I had previously been able to get 
the data off it, and disconnected the controller and disk following 
recurring problems with booting.


Aug  7 16:40:24 localhost kernel: [  105.098705] sd 0:0:6:0: [sda] Synchronizing SCSI cache
Aug  7 16:40:24 localhost kernel: [  105.233166] EATA0: IRQ 11 mapped to IO-APIC IRQ 18.
Aug  7 16:40:24 localhost kernel: [  105.233475] EATA/DMA 2.0x: Copyright (C) 1994-2003 Dario Ballabio.
Aug  7 16:40:24 localhost kernel: [  105.233485] EATA config options -> tm:1, lc:y, mq:16, rs:y, et:n, ip:n, ep:n, pp:y.
Aug  7 16:40:24 localhost kernel: [  105.233492] EATA0: 2.0C, PCI 0x9010, IRQ 18, BMST, SG 122, MB 64.
Aug  7 16:40:24 localhost kernel: [  105.233499] EATA0: wide SCSI support enabled, max_id 16, max_lun 8.
Aug  7 16:40:24 localhost kernel: [  105.233505] EATA0: SCSI channel 0 enabled, host target ID 7.
Aug  7 16:40:24 localhost kernel: [  105.233521] scsi host0: EATA/DMA 2.0x rev. 8.10.00


Arthur Marsh.


Re: CPU lock-ups with 4.12.0+ kernels related to usb_storage

2017-07-17 Thread Arthur Marsh

Arthur Marsh wrote on 14/07/17 04:18:



Alan Stern wrote on 14/07/17 02:30:


All right.  In the meantime, changing usb-storage won't hurt.
Arthur, can you test the patch below?

Alan Stern



Index: usb-4.x/drivers/usb/storage/usb.c
===================================================================
--- usb-4.x.orig/drivers/usb/storage/usb.c
+++ usb-4.x/drivers/usb/storage/usb.c
@@ -315,6 +315,7 @@ static int usb_stor_control_thread(void
 {
 	struct us_data *us = (struct us_data *)__us;
 	struct Scsi_Host *host = us_to_host(us);
+	struct scsi_cmnd *srb;
 
 	for (;;) {
 		usb_stor_dbg(us, "*** thread sleeping\n");
@@ -330,6 +331,7 @@ static int usb_stor_control_thread(void
 		scsi_lock(host);
 
 		/* When we are called with no command pending, we're done */
+		srb = us->srb;
 		if (us->srb == NULL) {
 			scsi_unlock(host);
 			mutex_unlock(&us->dev_mutex);
@@ -398,14 +400,11 @@ static int usb_stor_control_thread(void
 		/* lock access to the state */
 		scsi_lock(host);
 
-		/* indicate that the command is done */
-		if (us->srb->result != DID_ABORT << 16) {
-			usb_stor_dbg(us, "scsi cmd done, result=0x%x\n",
-				     us->srb->result);
-			us->srb->scsi_done(us->srb);
-		} else {
+		/* was the command aborted? */
+		if (us->srb->result == DID_ABORT << 16) {
 SkipForAbort:
 			usb_stor_dbg(us, "scsi command aborted\n");
+			srb = NULL;	/* Don't call srb->scsi_done() */
 		}
 
 		/*
@@ -429,6 +428,13 @@ SkipForAbort:
 
 		/* unlock the device pointers */
 		mutex_unlock(&us->dev_mutex);
+
+		/* now that the locks are released, notify the SCSI core */
+		if (srb) {
+			usb_stor_dbg(us, "scsi cmd done, result=0x%x\n",
+					srb->result);
+			srb->scsi_done(srb);
+		}
 	} /* for (;;) */
 
 	/* Wait until we are told to stop */


Hi, just to confirm no further lock-ups occurred in the last 4 days with 
this patch applied.


Arthur.
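
The ordering that the patch in this message introduces (snapshot the command pointer while the state is locked, drop the locks, then report completion) can be sketched in plain userspace C. Everything in the sketch is a hypothetical stand-in: the boolean `locked` models scsi_lock() and dev_mutex together, ABORTED models DID_ABORT << 16, and the struct and function names are invented for illustration rather than taken from usb-storage.

```c
#include <stdbool.h>
#include <stddef.h>

#define ABORTED (-1)	/* hypothetical stand-in for DID_ABORT << 16 */

struct device;

struct command {
	int result;
	struct device *dev;
	void (*done)(struct command *cmd);	/* models srb->scsi_done */
};

struct device {
	bool locked;		/* models scsi_lock() plus dev_mutex */
	struct command *cur;	/* models us->srb */
	int completed;		/* completions run with the lock released */
	int completed_locked;	/* completions run with the lock held */
};

/* Completion callback: records whether the lock was held when it ran. */
void record_done(struct command *cmd)
{
	if (cmd->dev->locked)
		cmd->dev->completed_locked++;
	else
		cmd->dev->completed++;
}

/* Finish the current command in the order the patched loop uses. */
void finish_current(struct device *dev)
{
	struct command *cmd;

	dev->locked = true;
	cmd = dev->cur;			/* snapshot while state is locked */
	if (cmd && cmd->result == ABORTED)
		cmd = NULL;		/* aborted: do not report completion */
	dev->cur = NULL;		/* finished working on this command */
	dev->locked = false;

	if (cmd)			/* locks released: notify the caller */
		cmd->done(cmd);
}
```

Because the callback only runs after `locked` is cleared, `completed_locked` stays at zero, which is the property the patch comment "now that the locks are released, notify the SCSI core" describes.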


Re: CPU lock-ups with 4.12.0+ kernels related to usb_storage

2017-07-13 Thread Arthur Marsh



Alan Stern wrote on 14/07/17 02:30:


All right.  In the meantime, changing usb-storage won't hurt.
Arthur, can you test the patch below?

Alan Stern



Index: usb-4.x/drivers/usb/storage/usb.c
===================================================================
--- usb-4.x.orig/drivers/usb/storage/usb.c
+++ usb-4.x/drivers/usb/storage/usb.c
@@ -315,6 +315,7 @@ static int usb_stor_control_thread(void
 {
 	struct us_data *us = (struct us_data *)__us;
 	struct Scsi_Host *host = us_to_host(us);
+	struct scsi_cmnd *srb;
 
 	for (;;) {
 		usb_stor_dbg(us, "*** thread sleeping\n");
@@ -330,6 +331,7 @@ static int usb_stor_control_thread(void
 		scsi_lock(host);
 
 		/* When we are called with no command pending, we're done */
+		srb = us->srb;
 		if (us->srb == NULL) {
 			scsi_unlock(host);
 			mutex_unlock(&us->dev_mutex);
@@ -398,14 +400,11 @@ static int usb_stor_control_thread(void
 		/* lock access to the state */
 		scsi_lock(host);
 
-		/* indicate that the command is done */
-		if (us->srb->result != DID_ABORT << 16) {
-			usb_stor_dbg(us, "scsi cmd done, result=0x%x\n",
-				     us->srb->result);
-			us->srb->scsi_done(us->srb);
-		} else {
+		/* was the command aborted? */
+		if (us->srb->result == DID_ABORT << 16) {
 SkipForAbort:
 			usb_stor_dbg(us, "scsi command aborted\n");
+			srb = NULL;	/* Don't call srb->scsi_done() */
 		}
 
 		/*
@@ -429,6 +428,13 @@ SkipForAbort:
 
 		/* unlock the device pointers */
 		mutex_unlock(&us->dev_mutex);
+
+		/* now that the locks are released, notify the SCSI core */
+		if (srb) {
+			usb_stor_dbg(us, "scsi cmd done, result=0x%x\n",
+					srb->result);
+			srb->scsi_done(srb);
+		}
 	} /* for (;;) */
 
 	/* Wait until we are told to stop */





Thanks for the patch!

I have applied it and am running the resulting kernel.

As I didn't have a reproducible way to trigger the problem on demand, 
I'll just have to see that there isn't a lock-up that looks related over 
the next several days.


Arthur.


Re: [PATCH 0/4] block: Fixes for bdi handling

2017-03-09 Thread Arthur Marsh



Jan Kara wrote on 09/03/17 03:18:

Hi!

patches in this series fix the most urgent bugs that were introduced by commit
165a5e22fafb "block: Move bdi_unregister() to del_gendisk()" and by
0dba1314d4f8 "scsi, block: fix duplicate bdi name registration crashes".
In fact before these commits we had a different set of problems in the
code but they were less visible :).

I'm still waiting for test confirmation from Omar and Arthur Marsh who reported
issues but I'm not able to hit any problem anymore in my testing.  I think it
would be nice to get the patches to rc2 so to speed up things I'm posting the
patches now so that review can happen in parallel with the testing.

Other BDI handling fixes I have in my queue can wait a bit more since they are
either theoretical or long-standing issues. So I'll repost them once these four
are sorted out.

Honza



Sorry for the delay in replying. I had to leave the kernel, with all 4 
patches applied, rebuilding while I was at work, and have only just booted it.


I've only done a kexec reboot so far but there were no problems - no 
errors in dmesg, all disks were recognised and all attempted mounts worked.


Thanks very much for the quick fix!

Arthur.


problem with block: Move bdi_unregister() to del_gendisk() commit 165a5e22fafb127ecb5914e12e8c32a1f0d3f820

2017-03-08 Thread Arthur Marsh


On one of my PCs I have 2 PATA disks (one, the WDC below, is used for 
booting; the other, the SAMSUNG, is not mounted), plus an IBM SCSI disk on 
a DPT 2044W controller using the eata driver, and sometimes a Verbatim 
Storengo USB stick.


On recent 4.10.0+ kernel builds (i386), the resulting kernel would pause 
during the start-up when the USB stick was inserted but boot normally 
otherwise.


A git-bisect led to:

commit 165a5e22fafb127ecb5914e12e8c32a1f0d3f820
Author: Jan Kara 
Date:   Wed Feb 8 08:05:56 2017 +0100

block: Move bdi_unregister() to del_gendisk()

Commit 6cd18e711dd8 "block: destroy bdi before blockdev is
unregistered." moved bdi unregistration (at that time through
bdi_destroy()) from blk_release_queue() to blk_cleanup_queue() because
it needs to happen before blk_unregister_region() call in del_gendisk()
for MD. SCSI though will free up the device number from sd_remove()
called through a maze of callbacks from device_del() in
__scsi_remove_device() before blk_cleanup_queue() and thus similar races
as described in 6cd18e711dd8 can happen for SCSI as well as reported by
Omar [1].

Moving bdi_unregister() to del_gendisk() works for MD and fixes the
problem for SCSI since del_gendisk() gets called from sd_remove() before
freeing the device number.

This also makes device_add_disk() (calling bdi_register_owner()) more
symmetric with del_gendisk().

[1] http://marc.info/?l=linux-block&m=148554717109098&w=2

When booting the bad kernel, I would eventually get a prompt to press 
the enter key to boot and it eventually started, but the SCSI disk 
partitions were not found by blkid nor could they be mounted.


lsscsi reports:

[0:0:6:0]    disk    IBM      DCAS-34330W      S65A  /dev/sda
[1:0:0:0]    disk    ATA      WDC WD3200AAJB-0 2C01  /dev/sdc
[2:0:0:0]    cd/dvd  HL-DT-ST DVDRAM GSA-4163B A103  /dev/sr0
[2:0:1:0]    disk    ATA      SAMSUNG SP4002H  0-57  /dev/sdd
[3:0:0:0]    disk    Verbatim STORE N GO       5.00  /dev/sdb

blkid reports:

/dev/sdb1: LABEL="STORENGO" UUID="B08B-79DA" TYPE="vfat" PARTUUID="961d9655-01"
/dev/sdc1: UUID="bfdeb6d6-0b77-4beb-a63d-bdc3e455b8ea" TYPE="ext3" PTTYPE="dos" PARTUUID="000750bf-01"
/dev/sdc5: UUID="26b7280a-f40c-49dd-a086-dbbb9b7e3def" TYPE="swap" PARTUUID="000750bf-05"
/dev/sdc6: UUID="7417-5AFF" TYPE="vfat" PARTUUID="000750bf-06"
/dev/sdc7: UUID="96c96a61-8615-4715-86d0-09cb8c62638c" TYPE="ext3" PARTUUID="000750bf-07"
/dev/sdd1: LABEL="W-98 SE" UUID="3571-16DE" TYPE="vfat" PARTUUID="43598af3-01"
/dev/sdd3: UUID="fd6a052e-c062-4c47-801d-087595635c5d" SEC_TYPE="ext2" TYPE="ext3" PARTUUID="43598af3-03"
/dev/sdd5: UUID="026a3f5c-0064-4ae7-869e-519d2cee05e7" SEC_TYPE="ext2" TYPE="ext3" PTTYPE="dos" PARTUUID="43598af3-05"
/dev/sdd6: UUID="9a0970fa-74ba-4426-98ac-1e8b81933e0e" TYPE="swap" PARTUUID="43598af3-06"
/dev/sdd7: UUID="4912-06CA" TYPE="vfat" PARTUUID="43598af3-07"

The boot screen is at:

http://www.users.on.net/~arthur.marsh/20170308_18.jpg

and the dmesg output from booting the bad kernel is attached.

I'm happy to supply any other configuration details needed and run 
further tests.


Regards,

Arthur.



[0.00] Linux version 4.10.0+ (root@victoria) (gcc version 6.3.0 20170221 (Debian 6.3.0-8) ) #652 SMP PREEMPT Wed Mar 8 04:45:06 ACDT 2017
[0.00] x86/fpu: x87 FPU will use FXSAVE
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x3ffa] usable
[0.00] BIOS-e820: [mem 0x3ffb-0x3ffbdfff] ACPI data
[0.00] BIOS-e820: [mem 0x3ffbe000-0x3ffd] ACPI NVS
[0.00] BIOS-e820: [mem 0x3ffe-0x3fff] reserved
[0.00] BIOS-e820: [mem 0xe000-0xefff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved
[0.00] BIOS-e820: [mem 0xff78-0x] reserved
[0.00] NX (Execute Disable) protection: active
[0.00] SMBIOS 2.3 present.
[0.00] DMI: System manufacturer System Product Name/A8V-MX, BIOS 0503 12/06/2005
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0x3ffb0 max_arch_pfn = 0x100
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00]   0-9 write-back
[0.00]   A-E uncachable
[0.00]   F-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00]   0 base 00 mask FFC000 write-back
[0.00]   1 base 00D000 mask FFF000 

Re: [PATCH] eata: Convert eata driver as normal PCI and platform device drivers

2016-03-01 Thread Arthur Marsh



Jiang Liu wrote on 02/03/16 13:50:


I spoke too soon, without removing and re-inserting the eata module
before any filesystems on disks attached to the DPT controller were
mounted, I'd get the following messages, similar to ones previously
reported:

sd 0:0:6:0: tag#0 abort, mbox 1.
EATA0: abort, mbox 1 is in use.
sd 0:0:6:0: tag#0 reset, enter.
EATA0: reset, mbox 1 in reset.
EATA0: reset, board reset done, enabling interrupts.
EATA0: reset, interrupts disabled, loops 100415.
EATA0, reset, mbox 1 locked, DID_RESET, done.
EATA0: reset, exit, done.


and so on, finally hanging after printing "kexec_core: Starting new
kernel" (I have a photo of the messages if they're needed).

So I'm still using the new patch but have to continue to remove and
reinsert eata at start-up before any attempts to mount disks attached
to the DPT SCSI controller.

Hi Arthur,
Thanks for testing. So current situation is that we have
a working driver for normal case, but still have issues during kexec.
Per my understanding, we need to implement a PCI device driver shutdown
callback to reset the RAID controller. I have once tried to implement
the shutdown callback, but it doesn't work. And I have no deep
understanding of the RAID controller and have no hardware for
experiment too, so have no idea about next step.
Maybe one acceptable way is to merge this patch first, so
we get a basic working driver, and then ask help from expert to
solve the kexec issue.
Thanks!
Gerry


My controller is a DPT2044W; it does not provide any hardware RAID 
capabilities.


I'm not sure where responsibility lies in driver development but I'm 
still using the DPT2044W controller which worked on the 4.2.0 kernels 
and earlier and this problem has been around for nearly 5 months now.


I can do builds and tests of any patches that people can provide, but am 
not a C programmer, much less a Linux driver developer.


Regards,

Arthur.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] eata: Convert eata driver as normal PCI and platform device drivers

2016-03-01 Thread Arthur Marsh



Arthur Marsh wrote on 02/03/16 03:57:



Christoph Hellwig wrote on 01/03/16 17:22:

Hi Jiang.

I'd love to see this patch in and abuse of the old PCI API gone.

Did you resolve the problems Arthur saw with the previous iterations
of the patch?



I applied Jiang Liu's patch of 1st March 2016 to a clean kernel
4.5.0-rc6 source, removed my workaround of removing and re-adding the
eata module before mounting file-systems that are on disks attached to
the DPT SCSI card using the eata driver, and was able to kexec from the
new kernel successfully.

Arthur.


I spoke too soon: without removing and re-inserting the eata module 
before any filesystems on disks attached to the DPT controller were 
mounted, I'd get the following messages, similar to ones previously 
reported:


sd 0:0:6:0: tag#0 abort, mbox 1.
EATA0: abort, mbox 1 is in use.
sd 0:0:6:0: tag#0 reset, enter.
EATA0: reset, mbox 1 in reset.
EATA0: reset, board reset done, enabling interrupts.
EATA0: reset, interrupts disabled, loops 100415.
EATA0, reset, mbox 1 locked, DID_RESET, done.
EATA0: reset, exit, done.


and so on, finally hanging after printing "kexec_core: Starting new 
kernel" (I have a photo of the messages if they're needed).


So I'm still using the new patch but have to continue to remove and 
reinsert eata at start-up before any attempts to mount disks attached 
to the DPT SCSI controller.


Arthur.


Re: [PATCH] eata: Convert eata driver as normal PCI and platform device drivers

2016-03-01 Thread Arthur Marsh



Christoph Hellwig wrote on 01/03/16 17:22:

Hi Jiang.

I'd love to see this patch in and abuse of the old PCI API gone.

Did you resolve the problems Arthur saw with the previous iterations
of the patch?



I applied Jiang Liu's patch of 1st March 2016 to a clean kernel 
4.5.0-rc6 source, removed my workaround of removing and re-adding the 
eata module before mounting file-systems that are on disks attached to 
the DPT SCSI card using the eata driver, and was able to kexec from the 
new kernel successfully.


Arthur.


eata module for DPT SCSI cards

2015-12-06 Thread Arthur Marsh


Hi, I still need the following patch applied to be able to use the 
eata driver for my DPT2044W SCSI card.


Is there any chance that this could be mainlined or another fix 
implemented that can be mainlined?


As it is with the following patches applied, I still have to unload and 
reload the eata driver before mounting filesystems on the disk attached 
to the DPT2044W SCSI card that uses the eata driver, otherwise kexec 
reboots fail.


Without the patches applied, the machine locks up when it tries to load 
the eata module.


Arthur.


diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index d7ffd66..8321c46 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -391,6 +391,7 @@ int __weak pcibios_alloc_irq(struct pci_dev *dev)
 {
 	return 0;
 }
+EXPORT_SYMBOL_GPL(pcibios_alloc_irq);
 
 void __weak pcibios_free_irq(struct pci_dev *dev)
 {
diff --git a/drivers/scsi/eata.c b/drivers/scsi/eata.c
index 227dd2c..7e6eaf8 100644
--- a/drivers/scsi/eata.c
+++ b/drivers/scsi/eata.c
@@ -1061,6 +1061,7 @@ static void enable_pci_ports(void)
 		       driver_name, dev->bus->number, dev->devfn);
 #endif
 
+		pcibios_alloc_irq(dev);
 		if (pci_enable_device(dev))
 			printk
 			    ("%s: warning, pci_enable_device failed, bus %d devfn 0x%x.\n",
@@ -1520,6 +1521,7 @@ static void add_pci_ports(void)
 		if (!(dev = pci_get_class(PCI_CLASS_STORAGE_SCSI << 8, dev)))
 			break;
 
+		pcibios_alloc_irq(dev);
 		if (pci_enable_device(dev)) {
 #if defined(DEBUG_PCI_DETECT)
 			printk

## end


Re: [RFT v3] eata: Convert eata driver as normal PCI and platform device drivers

2015-10-05 Thread Arthur Marsh



Jiang Liu wrote on 03/10/15 17:41:


If I do a normal boot which includes eata being loaded, the disk
attached to the DPT2044W controller having its filesystems checked and
mounted, then attempt a kexec reboot, I get the reboot pausing after the
"synchronizing SCSI cache" messages as before.

If I un-mount the filesystems on the disk attached to the DPT2044W
controller after start-up and try a reboot I get the same problem.

If I do modprobe -r eata after un-mounting the filesystems on the disk
attached to the DPT2044W controller after a start-up kexec *works fine*.

Hi Arthur,
The above results suggest that we need to shutdown eata
controller for kexec. So could you please try to apply the attached
patch upon the previous two patches?
Thanks!
Gerry



To clarify, if the eata driver gets loaded once and stays loaded, at a 
kexec reboot attempt the "Synchronising SCSI cache" message is missing 
for the SCSI disk attached to the controller using the eata driver and 
eventually other error messages appear as seen in screen images that I 
have previously posted.


If the eata driver is loaded, unloaded via modprobe -r, then reloaded, a 
kexec reboot shows 2 "Synchronising SCSI cache" messages for the SCSI 
disk attached to the controller using the eata driver and the kexec 
reboot is successful.



Arthur.


Re: [RFT v3] eata: Convert eata driver as normal PCI and platform device drivers

2015-10-03 Thread Arthur Marsh



Jiang Liu wrote on 03/10/15 17:41:


Hi Arthur,
The above results suggest that we need to shutdown eata
controller for kexec. So could you please try to apply the attached
patch upon the previous two patches?
Thanks!
Gerry



Hi, I still get kexec shutdown errors like this with the 3rd patch applied:

http://www.users.on.net/~arthur.marsh/20151003566.jpg

I can still unmount filesystems, modprobe -r eata and modprobe eata to 
get things into a state where a kexec reboot works.


Regards,

Arthur.


Re: [RFT v3] eata: Convert eata driver as normal PCI and platform device drivers

2015-09-26 Thread Arthur Marsh



Arthur Marsh wrote on 24/09/15 15:26:



Jiang Liu wrote on 24/09/15 13:58:


Hi James,
Thanks for review. How about the attached patch which addresses
the three suggestions from you?
Thanks!
Gerry


I've applied the patch, rebuilt the kernel and verified that it allows
unloading of the eata module and reloading it, as well as a successful
kexec.

Regards,

Arthur.


After some more thorough testing I've encountered an ongoing problem 
trying to use kexec with filesystems mounted with the eata driver.


If I boot up and have the eata driver loaded but no filesystem check or 
mounting of filesystems on the disk attached to the DPT2044W controller, 
then attempt a kexec reboot I get the reboot pausing after the 
"synchronizing scsi cache" messages and getting the errors that I have 
included as pictures in my previous reports.


If I do a normal boot which includes eata being loaded, the disk 
attached to the DPT2044W controller having its filesystems checked and 
mounted, then attempt a kexec reboot, I get the reboot pausing after the 
"synchronizing SCSI cache" messages as before.


If I un-mount the filesystems on the disk attached to the DPT2044W 
controller after start-up and try a reboot I get the same problem.


If I do modprobe -r eata after un-mounting the filesystems on the disk 
attached to the DPT2044W controller after a start-up kexec *works fine*.


If I do:

start-up
un-mount filesystems on disk attached to DPT2044W controller
modprobe -r eata
modprobe eata
fsck -a of filesystems on disk attached to DPT2044W controller
mount filesystems

then a kexec reboot works fine.

I did some more experimenting and found a workaround:

I was unable to blacklist the eata module but if I did:

modprobe -r eata
modprobe eata

in a cron job before the fsck and mount commands then
I could then perform a kexec reboot successfully.

I also verified that if I did:

modprobe -r eata

after eata was loaded on boot-up, without any fsck or mounting of 
filesystems on the disk attached to the DPT2044W controller using the 
eata driver, the kexec reboot worked fine.


In summary:

if eata is loaded kexec reboot will fail unless a modprobe -r eata is 
done either manually or by a cron job.


if a modprobe -r eata has been done, then even if I modprobe eata and 
fsck and mount filesystems, kexec reboot works.


Any suggestions for further tests or checks welcome.

Arthur.




Re: [RFT v3] eata: Convert eata driver as normal PCI and platform device drivers

2015-09-23 Thread Arthur Marsh



Jiang Liu wrote on 23/09/15 14:54:


Hi Arthur,
I have found the cause of the warning messages, it's caused
by a flaw in the conversion. But according to my understanding,
it isn't related to the kexec/kdump failure. Could you please help
to test the attached new version?
Thanks!
Gerry



Thanks, the patch worked, I could successfully unload and reload the 
eata module, and perform a kexec reboot with the eata module loading 
successfully afterwards.


Arthur.



Re: [RFT v3] eata: Convert eata driver as normal PCI and platform device drivers

2015-09-23 Thread Arthur Marsh



Jiang Liu wrote on 24/09/15 13:58:


Hi James,
Thanks for review. How about the attached patch which addresses
the three suggestions from you?
Thanks!
Gerry


I've applied the patch, rebuilt the kernel and verified that it allows 
unloading of the eata module and reloading it, as well as a successful 
kexec.


Regards,

Arthur.


Re: [RFT v3] eata: Convert eata driver as normal PCI and platform device drivers

2015-09-22 Thread Arthur Marsh



Jiang Liu wrote on 22/09/15 17:00:

Previously the eata driver just grabs and accesses eata PCI devices
without implementing a PCI device driver, that causes troubles with
latest IRQ related

Commit 991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq() and
pcibios_free_irq()") changes the way to allocate PCI legacy IRQ
for PCI devices on x86 platforms. Instead of allocating PCI legacy
IRQs when pcibios_enable_device() gets called, now pcibios_alloc_irq()
will be called by pci_device_probe() to allocate PCI legacy IRQs
when binding PCI drivers to PCI devices.

But the eata driver directly accesses PCI devices without implementing
corresponding PCI drivers, so pcibios_alloc_irq() won't be called for
those PCI devices and wrong IRQ number may be used to manage the PCI
device.

This patch implements a PCI device driver to manage eata PCI devices,
so eata driver could properly cooperate with the PCI core. It also
provides headroom for PCI hotplug with eata driver.

It also represents non-PCI eata devices as platform devices, so it could
be managed as normal devices.

Signed-off-by: Jiang Liu 
Cc: Hannes Reinecke 
Cc: Ballabio, Dario 
Cc: Christoph Hellwig 
---


Not really any change with this driver:

previously

http://www.users.on.net/~arthur.marsh/20150915547.jpg

now

http://www.users.on.net/~arthur.marsh/20150922553.jpg

If there was any way of capturing any more debug output I'd be happy to 
do it.


Arthur.


Re: [RFT v3] eata: Convert eata driver as normal PCI and platform device drivers

2015-09-22 Thread Arthur Marsh



James Bottomley wrote on 23/09/15 08:15:

On Wed, 2015-09-23 at 07:55 +0930, Arthur Marsh wrote:


Jiang Liu wrote on 22/09/15 17:00:

Previously the eata driver just grabs and accesses eata PCI devices
without implementing a PCI device driver, that causes troubles with
latest IRQ related

Commit 991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq() and
pcibios_free_irq()") changes the way to allocate PCI legacy IRQ
for PCI devices on x86 platforms. Instead of allocating PCI legacy
IRQs when pcibios_enable_device() gets called, now pcibios_alloc_irq()
will be called by pci_device_probe() to allocate PCI legacy IRQs
when binding PCI drivers to PCI devices.

But the eata driver directly accesses PCI devices without implementing
corresponding PCI drivers, so pcibios_alloc_irq() won't be called for
those PCI devices and wrong IRQ number may be used to manage the PCI
device.

This patch implements a PCI device driver to manage eata PCI devices,
so eata driver could properly cooperate with the PCI core. It also
provides headroom for PCI hotplug with eata driver.

It also represents non-PCI eata devices as platform devices, so it could
be managed as normal devices.

Signed-off-by: Jiang Liu <jiang@linux.intel.com>
Cc: Hannes Reinecke <h...@suse.de>
Cc: Ballabio, Dario <dario.balla...@emc.com>
Cc: Christoph Hellwig <h...@infradead.org>
---


Not really any change with this driver:

previously

http://www.users.on.net/~arthur.marsh/20150915547.jpg

now

http://www.users.on.net/~arthur.marsh/20150922553.jpg

If there was any way of capturing any more debug output I'd be happy to
do it.


It looks to be some problem in shut down.  Can you simply remove and
re-insert the driver successfully?  If it's your root disk driver,
you'll have to do this from an initrd so as not to have root mounted
from the eata controller.

If the remove and reinsert fails, it means we have a problem in the
driver shut down.  If not, it's likely something kexec related.

James


OK, it looks like there was a problem with unloading the driver.

After un-mounting file systems on the disk attached to the SCSI 
controller using the eata driver I could do a:


modprobe -r eata

but received the output of the attached dmesg log.

Attempting to do

modprobe eata

after the previous modprobe -r eata resulted in a complete lock-up.

Arthur.
[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Initializing cgroup subsys cpuacct
[0.00] Linux version 4.3.0-rc2+ (root@victoria) (gcc version 5.2.1 20150911 (Debian 5.2.1-17) ) #49 SMP PREEMPT Tue Sep 22 04:58:18 ACST 2015
[0.00] x86/fpu: Legacy x87 FPU detected.
[0.00] x86/fpu: Using 'lazy' FPU context switches.
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x3ffa] usable
[0.00] BIOS-e820: [mem 0x3ffb-0x3ffbdfff] ACPI data
[0.00] BIOS-e820: [mem 0x3ffbe000-0x3ffd] ACPI NVS
[0.00] BIOS-e820: [mem 0x3ffe-0x3fff] reserved
[0.00] BIOS-e820: [mem 0xe000-0xefff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved
[0.00] BIOS-e820: [mem 0xff78-0x] reserved
[0.00] Notice: NX (Execute Disable) protection cannot be enabled: non-PAE kernel!
[0.00] SMBIOS 2.3 present.
[0.00] DMI: System manufacturer System Product Name/A8V-MX, BIOS 0503 12/06/2005
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0x3ffb0 max_arch_pfn = 0x10
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00]   0-9 write-back
[0.00]   A-E uncachable
[0.00]   F-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00]   0 base 00 mask FFC000 write-back
[0.00]   1 base 00D000 mask FFF000 write-combining
[0.00]   2 disabled
[0.00]   3 disabled
[0.00]   4 disabled
[0.00]   5 disabled
[0.00]   6 disabled
[0.00]   7 disabled
[0.00] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WC  UC- WT  
[0.00] found SMP MP-table at [mem 0x000ff780-0x000ff78f] mapped at [c00ff780]
[0.00] initial memory mapped: [mem 0x-0x023f]
[0.00] Base memory trampoline at [c009b000] 9b000 size 16384
[0.00] init_memory_mapping: [mem 0x-0x000f]
[0.00]  [mem 0x0

Re: [Bugfix 3/3] eata: Enhance eata driver to support PCI device hot-removal

2015-09-18 Thread Arthur Marsh



Christoph Hellwig wrote on 16/09/15 23:12:


Jiang, you also need to convert the driver to
scsi_add_host/scsi_remove_host from the legacy scsi_register interface,
otherwise the SCSI layer will be very unhappy.

Take a look at commit 0d31f8759109cbc1e6fc196d08e6b0e8a9e93b3f for
example, the change should be straightforward.



I am pleased to note that when I tried a Linus git head kernel from the 
last 24 hours, the IRQ routing for my DPT2044W SCSI card using the eata 
module worked again, although the shut-down/kexec issue remains.


Arthur.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Bugfix 0/3] Convert eata driver to a normal PCI device driver

2015-09-16 Thread Arthur Marsh



Jiang Liu wrote on 16/09/15 14:37:

On 2015/9/15 15:19, Arthur Marsh wrote:



Jiang Liu wrote on 15/09/15 12:01:


Hi Arthur,
 I really appreciate your help testing the patches. That's
a good sign that we have moved forward a bit :)
 kexec is always challenging for me, so could you
please provide full dmesg logs from working kernels
so I can try to figure out the ordering of SCSI and PCI devices?
The problem may be related to shutdown order.
Thanks!
Gerry


OK, attached is the dmesg output from the 4.2.0 kernel where kexec worked.

Hi Arthur,
Could you please also help capture the log messages
of kexec? I need those log messages to figure out the order
in which PCI devices and SCSI devices are shut down during kexec.
Thanks!
Gerry



How would I capture the log messages of kexec, assuming that there are
any? I couldn't see how from the manual page entries, and I haven't seen
anything beyond the screen images that I have already sent you.


Regards,

Arthur.


Re: [Bugfix 0/3] Convert eata driver to a normal PCI device driver

2015-09-16 Thread Arthur Marsh



Jiang Liu wrote on 16/09/15 17:51:


Hi Arthur,
It would be great if we could capture the text shown in the
picture you posted at:
http://www.users.on.net/~arthur.marsh/20150915547.jpg
I guess a serial console could help us capture those
log messages. To use a serial console, we need to set up a serial cable
and configure grub and the kernel to use the serial port as the console.
Thanks!
Gerry



Regards,

Arthur.




I've already included the text of what appeared in the image above:

sd 0:0:6:0: abort, mbox 63.
EATA0: abort, mbox 63 is in use.
sd 0:0:6:0: reset, enter.
EATA0: reset, mbox 63 in reset.
EATA0: reset, board reset done, enabling interrupts.
EATA0: reset, interrupts disabled, loops 100469.
EATA0: reset, mbox 63 locked, DID_RESET, done.
EATA0: reset, exit, done.
sd 0:0:6:0: qcomm, mbox 0, adapter busy, will start
sd 0:0:6:0: abort, mbox 0.
EATA0: abort, timeout error.
sd 0:0:6:0: reset, enter.
EATA0: reset, exit, timeout error.
sd 0:0:6:0 Device offlined - not ready after error recovery
sd 0:0:6:0 rejecting I/O to offline device
sd 0:0:6:0 rejecting I/O to offline device
sd 0:0:6:0 [sda] Synchronize Cache(10) failed: Result: 
hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK

starting new kernel

As mentioned previously this occurred after the normal Synchronizing 
SCSI cache messages.


I don't think that there is anything else that gets sent to the console.

Arthur.


Re: [Bugfix 0/3] Convert eata driver to a normal PCI device driver

2015-09-15 Thread Arthur Marsh



Jiang Liu wrote on 15/09/15 12:01:


Hi Arthur,
I really appreciate your help testing the patches. That's
a good sign that we have moved forward a bit :)
kexec is always challenging for me, so could you
please provide full dmesg logs from working kernels
so I can try to figure out the ordering of SCSI and PCI devices?
The problem may be related to shutdown order.
Thanks!
Gerry


OK, attached is the dmesg output from the 4.2.0 kernel where kexec worked.

Arthur.
[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Initializing cgroup subsys cpuacct
[0.00] Linux version 4.2.0 (root@am64) (gcc version 5.1.1 20150711 
(Debian 5.1.1-14) ) #1921 SMP PREEMPT Sun Sep 6 00:08:31 ACST 2015
[0.00] x86/fpu: Legacy x87 FPU detected.
[0.00] x86/fpu: Using 'lazy' FPU context switches.
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x3ffa] usable
[0.00] BIOS-e820: [mem 0x3ffb-0x3ffbdfff] ACPI data
[0.00] BIOS-e820: [mem 0x3ffbe000-0x3ffd] ACPI NVS
[0.00] BIOS-e820: [mem 0x3ffe-0x3fff] reserved
[0.00] BIOS-e820: [mem 0xe000-0xefff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved
[0.00] BIOS-e820: [mem 0xff78-0x] reserved
[0.00] Notice: NX (Execute Disable) protection cannot be enabled: 
non-PAE kernel!
[0.00] SMBIOS 2.3 present.
[0.00] DMI: System manufacturer System Product Name/A8V-MX, BIOS 0503   
 12/06/2005
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0x3ffb0 max_arch_pfn = 0x10
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00]   0-9 write-back
[0.00]   A-E uncachable
[0.00]   F-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00]   0 base 00 mask FFC000 write-back
[0.00]   1 base 00D000 mask FFF000 write-combining
[0.00]   2 disabled
[0.00]   3 disabled
[0.00]   4 disabled
[0.00]   5 disabled
[0.00]   6 disabled
[0.00]   7 disabled
[0.00] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WC  UC- WT  
[0.00] found SMP MP-table at [mem 0x000ff780-0x000ff78f] mapped at 
[c00ff780]
[0.00] initial memory mapped: [mem 0x-0x023f]
[0.00] Base memory trampoline at [c009b000] 9b000 size 16384
[0.00] init_memory_mapping: [mem 0x-0x000f]
[0.00]  [mem 0x-0x000f] page 4k
[0.00] init_memory_mapping: [mem 0x35c0-0x35ff]
[0.00]  [mem 0x35c0-0x35ff] page 4M
[0.00] init_memory_mapping: [mem 0x0010-0x35bf]
[0.00]  [mem 0x0010-0x003f] page 4k
[0.00]  [mem 0x0040-0x35bf] page 4M
[0.00] init_memory_mapping: [mem 0x3600-0x377fdfff]
[0.00]  [mem 0x3600-0x373f] page 4M
[0.00]  [mem 0x3740-0x377fdfff] page 4k
[0.00] BRK [0x0207d000, 0x0207dfff] PGTABLE
[0.00] RAMDISK: [mem 0x3614-0x37097fff]
[0.00] ACPI: Early table checksum verification disabled
[0.00] ACPI: RSDP 0x000FAC60 24 (v02 ACPIAM)
[0.00] ACPI: XSDT 0x3FFB0100 3C (v01 A M I  OEMXSDT  
12000506 MSFT 0097)
[0.00] ACPI: FACP 0x3FFB0290 F4 (v03 A M I  OEMFACP  
12000506 MSFT 0097)
[0.00] ACPI: DSDT 0x3FFB03F0 0046F0 (v01 A0347  A0347001 
0001 INTL 02002026)
[0.00] ACPI: FACS 0x3FFBE000 40
[0.00] ACPI: FACS 0x3FFBE000 40
[0.00] ACPI: APIC 0x3FFB0390 5C (v01 A M I  OEMAPIC  
12000506 MSFT 0097)
[0.00] ACPI: OEMB 0x3FFBE040 46 (v01 A M I  AMI_OEM  
12000506 MSFT 0097)
[0.00] ACPI: Local APIC address 0xfee0
[0.00] 135MB HIGHMEM available.
[0.00] 887MB LOWMEM available.
[0.00]   mapped low ram: 0 - 377fe000
[0.00]   low ram: 0 - 377fe000
[0.00] BRK [0x0207e000, 0x0207efff] PGTABLE
[0.00] Zone ranges:
[0.00]   DMA  [mem 0x1000-0x00ff]
[0.00]   Normal   [mem 0x0100-0x377fdfff]
[0.00]   HighMem  [mem 0x377fe000-0x3ffa]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   

Re: [Bugfix 0/3] Convert eata driver to a normal PCI device driver

2015-09-14 Thread Arthur Marsh



Jiang Liu wrote on 14/09/15 12:38:

Hi Arthur,
As suggested by Bjorn, patches 1-2 implement a PCI device
driver to manage eata PCI devices. Patch 3 tries to support PCI
device hot-removal for eata, but I have had no chance to test it due to
my limited knowledge of the SCSI subsystem and a lack of hardware for
tests.
So could you please help to test patches 1-2? Patch 3 is just
for comments.
Thanks!
Gerry

Jiang Liu (3):
   eata: Use IDA to manage eata board IDs
   eata: Implement PCI driver to manage eata PCI devices
   eata: Enhance eata driver to support PCI device hot-removal

  drivers/scsi/eata.c |  232 +++
  1 file changed, 125 insertions(+), 107 deletions(-)



With patches 1 and 2 applied, I get a successful boot with IRQ mapping:

[1.147056] EATA0: IRQ 10 mapped to IO-APIC IRQ 17.
[1.160404] EATA/DMA 2.0x: Copyright (C) 1994-2003 Dario Ballabio.
[1.160469] EATA config options -> tm:1, lc:y, mq:16, rs:y, et:n, 
ip:n, ep:n, pp:y.

[1.160541] EATA0: 2.0C, PCI 0xd890, IRQ 17, BMST, SG 122, MB 64.
[1.160600] EATA0: wide SCSI support enabled, max_id 16, max_lun 8.
[1.160658] EATA0: SCSI channel 0 enabled, host target ID 7.
[1.161207] scsi host0: EATA/DMA 2.0x rev. 8.10.00


but I still get errors when trying to do a kexec reboot, see 
http://www.users.on.net/~arthur.marsh/20150915547.jpg


Roughly, it reads (after the synchronising SCSI cache reboot messages
and a long period of a dark screen):


sd 0:0:6:0: abort, mbox 63.
EATA0: abort, mbox 63 is in use.
sd 0:0:6:0: reset, enter.
EATA0: reset, mbox 63 in reset.
EATA0: reset, board reset done, enabling interrupts.
EATA0: reset, interrupts disabled, loops 100469.
EATA0: reset, mbox 63 locked, DID_RESET, done.
EATA0: reset, exit, done.
sd 0:0:6:0: qcomm, mbox 0, adapter busy, will start
sd 0:0:6:0: abort, mbox 0.
EATA0: abort, timeout error.
sd 0:0:6:0: reset, enter.
EATA0: reset, exit, timeout error.
sd 0:0:6:0 Device offlined - not ready after error recovery
sd 0:0:6:0 rejecting I/O to offline device
sd 0:0:6:0 rejecting I/O to offline device
sd 0:0:6:0 [sda] Synchronize Cache(10) failed: Result: 
hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK

starting new kernel

It would be great if this problem could be fixed.

Arthur.



Re: eata fails to load on post 4.2 kernels

2015-09-10 Thread Arthur Marsh



Jiang Liu wrote on 08/09/15 14:49:

Hi Arthur,
Could you please help to apply the test patch
against the latest mainstream linux kernel?
Thanks!
Gerry

...


git bisect good
991de2e59090e55c65a7f59a049142e3c480f7bd is the first bad commit
commit 991de2e59090e55c65a7f59a049142e3c480f7bd
Author: Jiang Liu 
Date:   Wed Jun 10 16:54:59 2015 +0800

  PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()

  To support IOAPIC hotplug, we need to allocate PCI IRQ resources on
demand
  and free them when not used anymore.

  Implement pcibios_alloc_irq() and pcibios_free_irq() to dynamically
  allocate and free PCI IRQs.

  Remove mp_should_keep_irq(), which is no longer used.

  [bhelgaas: changelog]
  Signed-off-by: Jiang Liu 
  Signed-off-by: Bjorn Helgaas 
  Acked-by: Thomas Gleixner 

:040000 040000 765e2d5232d53247ec260b34b51589c3bccb36ae
f680234a27685e94b1a35ae2a7218f8eafa9071a M  arch
:040000 040000 d55a682bcde72682e883365e88ad1df6186fd54d
f82c470a04a6845fcf5e0aa934512c75628f798d M  drivers


I tried to do a kexec shut-down with the first version of your patch:

From 3085626fb2e677c1d88f158397948935b73f5239 Mon Sep 17 00:00:00 2001
From: Jiang Liu 
Date: Tue, 8 Sep 2015 10:41:19 +0800
Subject: [PATCH]


Signed-off-by: Jiang Liu 
---
 drivers/pci/pci-driver.c |1 +
 drivers/scsi/eata.c  |2 ++
 2 files changed, 3 insertions(+)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 52a880ca1768..17d2a0b1de18 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -392,6 +392,7 @@ int __weak pcibios_alloc_irq(struct pci_dev *dev)
 {
return 0;
 }
+EXPORT_SYMBOL_GPL(pcibios_alloc_irq);

 void __weak pcibios_free_irq(struct pci_dev *dev)
 {
diff --git a/drivers/scsi/eata.c b/drivers/scsi/eata.c
index 227dd2c2ec2f..7e6eaf867987 100644
--- a/drivers/scsi/eata.c
+++ b/drivers/scsi/eata.c
@@ -1061,6 +1061,7 @@ static void enable_pci_ports(void)
   driver_name, dev->bus->number, dev->devfn);
 #endif

+   pcibios_alloc_irq(dev);
if (pci_enable_device(dev))
printk
("%s: warning, pci_enable_device failed, bus %d devfn 
0x%x.\n",
@@ -1520,6 +1521,7 @@ static void add_pci_ports(void)
if (!(dev = pci_get_class(PCI_CLASS_STORAGE_SCSI << 8, dev)))
break;

+   pcibios_alloc_irq(dev);
if (pci_enable_device(dev)) {
 #if defined(DEBUG_PCI_DETECT)
printk
--
1.7.10.4

but I experience identical kexec shutdown and restart problems as with 
the second version of your patch, as seen here:


http://www.users.on.net/~arthur.marsh/20150910541.jpg

The original commit 991de2e59090e55c65a7f59a049142e3c480f7bd quoted
above seems not only to have led to start-up problems unless irqpoll
was enabled but also to have led to kexec shutdown/restart problems.


I'm not sure what the solution is but it is good to continue to allow 
kexec reboots to work.


Arthur.


Re: eata fails to load on post 4.2 kernels

2015-09-10 Thread Arthur Marsh



Jiang Liu wrote on 10/09/15 17:43:

Hi Arthur,
Thanks for the update. It seems Bjorn doesn't like
either of my two patches, so I'm trying to convert eata
to a formal PCI driver. The change will be much
bigger, though, and I'm still not sure whether we can achieve that.
I will keep you updated.
Thanks!
Gerry


Thanks, I'm a bit concerned since the original

commit 991de2e59090e55c65a7f59a049142e3c480f7bd

broke things badly for me (requiring irqpoll to avoid a kernel hang) and 
neither of the patches enabled kexec reboots to work like before the 
original commit.


I just tested a kexec reboot with irqpoll enabled and that continues to 
fail, so I'm back to running 4.2 kernel until there is a patch that 
works with kexec reboots.


Arthur.


Re: [Bugfix] PCI, x86: Correctly allocate IRQs for PCI devices managed by non-PCI drivers

2015-09-09 Thread Arthur Marsh



Jiang Liu wrote on 08/09/15 16:56:

Commit 991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq() and
pcibios_free_irq()") changes the way to allocate PCI legacy IRQ
for PCI devices on x86 platforms. Instead of allocating PCI legacy
IRQs when pcibios_enable_device() gets called, now pcibios_alloc_irq()
will be called by pci_device_probe() to allocate PCI legacy IRQs
when binding PCI drivers to PCI devices.

But some device drivers, such as eata, directly access PCI devices
without implementing corresponding PCI drivers, so pcibios_alloc_irq()
won't be called for those PCI devices and wrong IRQ number may be
used to manage the PCI device.

So detect such a case in pcibios_enable_device() by checking
pci_dev->driver is NULL and call pcibios_alloc_irq() to allocate PCI
legacy IRQs.

Signed-off-by: Jiang Liu 
---
  arch/x86/pci/common.c |   10 ++
  1 file changed, 10 insertions(+)

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 09d3afc0a181..60b237783582 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -685,6 +685,16 @@ void pcibios_free_irq(struct pci_dev *dev)

  int pcibios_enable_device(struct pci_dev *dev, int mask)
  {
+   /*
+* By design, pcibios_alloc_irq() will be called by pci_device_probe()
+* when binding a PCI device to a PCI driver. But some device drivers,
+* such as eata, directly make use of PCI devices without implementing
+* PCI device drivers, so pcibios_alloc_irq() won't be called for those
+* PCI devices.
+*/
+   if (!dev->driver)
+   pcibios_alloc_irq(dev);
+
return pci_enable_resources(dev, mask);
  }




Sorry for the late report but this patch messes up things for kexec - 
rebooting is delayed with the error messages as shown in the fuzzy 
screen image here:


http://www.users.on.net/~arthur.marsh/20150910541.jpg

(the error messages are similar to what I was seeing on boot-up before 
Jiang Liu's patch)


and the SCSI card is not recognised by the kernel after a kexec restart, 
and eata fails to load.


Arthur.


Re: eata fails to load on post 4.2 kernels

2015-09-08 Thread Arthur Marsh



Jiang Liu wrote on 08/09/15 14:49:

Hi Arthur,
Could you please help to apply the test patch
against the latest mainstream linux kernel?
Thanks!
Gerry


Done, and it appears to work properly thanks!

Arthur.
[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Initializing cgroup subsys cpuacct
[0.00] Linux version 4.2.0+ (root@victoria) (gcc version 5.2.1 20150903 
(Debian 5.2.1-16) ) #30 SMP PREEMPT Tue Sep 8 15:10:49 ACST 2015
[0.00] x86/fpu: Legacy x87 FPU detected.
[0.00] x86/fpu: Using 'lazy' FPU context switches.
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x3ffa] usable
[0.00] BIOS-e820: [mem 0x3ffb-0x3ffbdfff] ACPI data
[0.00] BIOS-e820: [mem 0x3ffbe000-0x3ffd] ACPI NVS
[0.00] BIOS-e820: [mem 0x3ffe-0x3fff] reserved
[0.00] BIOS-e820: [mem 0xe000-0xefff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved
[0.00] BIOS-e820: [mem 0xff78-0x] reserved
[0.00] Notice: NX (Execute Disable) protection cannot be enabled: 
non-PAE kernel!
[0.00] SMBIOS 2.3 present.
[0.00] DMI: System manufacturer System Product Name/A8V-MX, BIOS 0503   
 12/06/2005
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0x3ffb0 max_arch_pfn = 0x10
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00]   0-9 write-back
[0.00]   A-E uncachable
[0.00]   F-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00]   0 base 00 mask FFC000 write-back
[0.00]   1 base 00D000 mask FFF000 write-combining
[0.00]   2 disabled
[0.00]   3 disabled
[0.00]   4 disabled
[0.00]   5 disabled
[0.00]   6 disabled
[0.00]   7 disabled
[0.00] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WC  UC- WT  
[0.00] found SMP MP-table at [mem 0x000ff780-0x000ff78f] mapped at 
[c00ff780]
[0.00] initial memory mapped: [mem 0x-0x023f]
[0.00] Base memory trampoline at [c009b000] 9b000 size 16384
[0.00] init_memory_mapping: [mem 0x-0x000f]
[0.00]  [mem 0x-0x000f] page 4k
[0.00] init_memory_mapping: [mem 0x35c0-0x35ff]
[0.00]  [mem 0x35c0-0x35ff] page 4M
[0.00] init_memory_mapping: [mem 0x0010-0x35bf]
[0.00]  [mem 0x0010-0x003f] page 4k
[0.00]  [mem 0x0040-0x35bf] page 4M
[0.00] init_memory_mapping: [mem 0x3600-0x377fdfff]
[0.00]  [mem 0x3600-0x373f] page 4M
[0.00]  [mem 0x3740-0x377fdfff] page 4k
[0.00] BRK [0x02075000, 0x02075fff] PGTABLE
[0.00] RAMDISK: [mem 0x3614c000-0x3709dfff]
[0.00] ACPI: Early table checksum verification disabled
[0.00] ACPI: RSDP 0x000FAC60 24 (v02 ACPIAM)
[0.00] ACPI: XSDT 0x3FFB0100 3C (v01 A M I  OEMXSDT  
12000506 MSFT 0097)
[0.00] ACPI: FACP 0x3FFB0290 F4 (v03 A M I  OEMFACP  
12000506 MSFT 0097)
[0.00] ACPI: DSDT 0x3FFB03F0 0046F0 (v01 A0347  A0347001 
0001 INTL 02002026)
[0.00] ACPI: FACS 0x3FFBE000 40
[0.00] ACPI: FACS 0x3FFBE000 40
[0.00] ACPI: APIC 0x3FFB0390 5C (v01 A M I  OEMAPIC  
12000506 MSFT 0097)
[0.00] ACPI: OEMB 0x3FFBE040 46 (v01 A M I  AMI_OEM  
12000506 MSFT 0097)
[0.00] ACPI: Local APIC address 0xfee0
[0.00] 135MB HIGHMEM available.
[0.00] 887MB LOWMEM available.
[0.00]   mapped low ram: 0 - 377fe000
[0.00]   low ram: 0 - 377fe000
[0.00] BRK [0x02076000, 0x02076fff] PGTABLE
[0.00] Zone ranges:
[0.00]   DMA  [mem 0x1000-0x00ff]
[0.00]   Normal   [mem 0x0100-0x377fdfff]
[0.00]   HighMem  [mem 0x377fe000-0x3ffa]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x1000-0x0009efff]
[0.00]   node   0: [mem 0x0010-0x3ffa]
[0.00] Initmem setup node 0 [mem 0x1000-0x3ffa]
[0.00] On node 0 totalpages: 261966
[

Re: [Bugfix] PCI, x86: Correctly allocate IRQs for PCI devices managed by non-PCI drivers

2015-09-08 Thread Arthur Marsh



Jiang Liu wrote on 08/09/15 16:56:

Commit 991de2e59090 ("PCI, x86: Implement pcibios_alloc_irq() and
pcibios_free_irq()") changes the way to allocate PCI legacy IRQ
for PCI devices on x86 platforms. Instead of allocating PCI legacy
IRQs when pcibios_enable_device() gets called, now pcibios_alloc_irq()
will be called by pci_device_probe() to allocate PCI legacy IRQs
when binding PCI drivers to PCI devices.

But some device drivers, such as eata, directly access PCI devices
without implementing corresponding PCI drivers, so pcibios_alloc_irq()
won't be called for those PCI devices and wrong IRQ number may be
used to manage the PCI device.

So detect such a case in pcibios_enable_device() by checking
pci_dev->driver is NULL and call pcibios_alloc_irq() to allocate PCI
legacy IRQs.

Signed-off-by: Jiang Liu 
---
  arch/x86/pci/common.c |   10 ++
  1 file changed, 10 insertions(+)

diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 09d3afc0a181..60b237783582 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -685,6 +685,16 @@ void pcibios_free_irq(struct pci_dev *dev)

  int pcibios_enable_device(struct pci_dev *dev, int mask)
  {
+   /*
+* By design, pcibios_alloc_irq() will be called by pci_device_probe()
+* when binding a PCI device to a PCI driver. But some device drivers,
+* such as eata, directly make use of PCI devices without implementing
+* PCI device drivers, so pcibios_alloc_irq() won't be called for those
+* PCI devices.
+*/
+   if (!dev->driver)
+   pcibios_alloc_irq(dev);
+
return pci_enable_resources(dev, mask);
  }




Thanks. I removed the test patch, applied the revised patch, built and
rebooted the kernel, and successfully mounted file systems on a disk
attached to the DPT 2044W card using the eata driver:


[0.00] Linux version 4.2.0+ (root@victoria) (gcc version 5.2.1 20150903 
(Debian 5.2.1-16) ) #31 SMP PREEMPT Tue Sep 8 17:36:28 ACST 2015
...

[   80.691097] EATA0: IRQ 10 mapped to IO-APIC IRQ 17.
[   80.724519] EATA/DMA 2.0x: Copyright (C) 1994-2003 Dario Ballabio.
[   80.752035] EATA config options -> tm:1, lc:y, mq:16, rs:y, et:n, 
ip:n, ep:n, pp:y.

[   80.777063] EATA0: 2.0C, PCI 0xd890, IRQ 17, BMST, SG 122, MB 64.
[   80.802391] EATA0: wide SCSI support enabled, max_id 16, max_lun 8.
[   80.827959] EATA0: SCSI channel 0 enabled, host target ID 7.
[   80.853413] scsi host3: EATA/DMA 2.0x rev. 8.10.00
[   82.445662] scsi 3:0:6:0: Direct-Access IBM  DCAS-34330W 
 S65A PQ: 0 ANSI: 2

[   82.471584] scsi 3:0:6:0: cmds/lun 16, sorted, simple tags.
[   84.571451] sd 3:0:6:0: Attached scsi generic sg4 type 0
[   84.597572] sd 3:0:6:0: [sdd] 8466688 512-byte logical blocks: (4.33 
GB/4.03 GiB)

[   84.659874] sd 3:0:6:0: [sdd] Write Protect is off
[   84.688543] sd 3:0:6:0: [sdd] Mode Sense: b3 00 00 08
[   84.714021] sd 3:0:6:0: [sdd] Write cache: enabled, read cache: 
enabled, doesn't support DPO or FUA

[   84.817682]  sdd: sdd1 sdd2 < sdd5 >
[   84.919267] sd 3:0:6:0: [sdd] Attached SCSI disk

Arthur.


Fwd: Re: eata fails to load on post 4.2 kernels

2015-09-07 Thread Arthur Marsh
Forwarding without image attachment to get below message size limit of 
the mailing lists.


I've uploaded the image to:
http://www.users.on.net/~arthur.marsh/20150907539.jpg


-------- Forwarded Message --------
Subject: Re: eata fails to load on post 4.2 kernels
Date: Mon, 07 Sep 2015 15:56:02 +0930
From: Arthur Marsh <arthur.ma...@internode.on.net>
To: Jiang Liu <jiang@linux.intel.com>
CC: Bjorn Helgaas <bhelg...@google.com>, t...@linutronix.de, 
linux-scsi@vger.kernel.org, linux-ker...@vger.kernel.org




Jiang Liu wrote on 07/09/15 12:36:

On 2015/9/7 4:31, Arthur Marsh wrote:

Arthur Marsh wrote on 06/09/15 21:07:

Arthur Marsh wrote on 06/09/15 18:34:

Arthur Marsh wrote on 06/09/15 15:58:

Hi, I'm seeing the following on post 4.2 kernels, am currently bisecting
to find where it started:


First kernel in the bisection that worked without needing irqpoll:

[   73.751482] EATA0: IRQ 10 mapped to IO-APIC IRQ 17.
[   73.776711] EATA/DMA 2.0x: Copyright (C) 1994-2003 Dario Ballabio.
[   73.802005] EATA config options -> tm:1, lc:y, mq:16, rs:y, et:n,
ip:n, ep:n, pp:y.
[   73.829175] EATA0: 2.0C, PCI 0xd890, IRQ 17, BMST, SG 122, MB 64.
[   73.82] EATA0: wide SCSI support enabled, max_id 16, max_lun 8.
[   73.881125] EATA0: SCSI channel 0 enabled, host target ID 7.


After a git bisect, I get:

git bisect good
991de2e59090e55c65a7f59a049142e3c480f7bd is the first bad commit
commit 991de2e59090e55c65a7f59a049142e3c480f7bd
Author: Jiang Liu <jiang@linux.intel.com>
Date:   Wed Jun 10 16:54:59 2015 +0800

 PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()

 To support IOAPIC hotplug, we need to allocate PCI IRQ resources on
demand
 and free them when not used anymore.

 Implement pcibios_alloc_irq() and pcibios_free_irq() to dynamically
 allocate and free PCI IRQs.

 Remove mp_should_keep_irq(), which is no longer used.

 [bhelgaas: changelog]
 Signed-off-by: Jiang Liu <jiang@linux.intel.com>
 Signed-off-by: Bjorn Helgaas <bhelg...@google.com>
 Acked-by: Thomas Gleixner <t...@linutronix.de>

:040000 040000 765e2d5232d53247ec260b34b51589c3bccb36ae
f680234a27685e94b1a35ae2a7218f8eafa9071a M  arch
:040000 040000 d55a682bcde72682e883365e88ad1df6186fd54d
f82c470a04a6845fcf5e0aa934512c75628f798d M  drivers

I'm happy to supply more details if needed.

Hi Arthur,
Thanks for reporting this. It seems to be an irq misrouting
issue. Could you please help to provide:
1) full dmesg with the latest code
2) full dmesg and /proc/interrupts with the latest code and
kernel parameter "irqpoll" specified
Thanks!
Gerry


The pc locks up when loading the eata module so I've attached a photo of
the monitor screen.

Arthur.





[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Initializing cgroup subsys cpuacct
[0.00] Linux version 4.2.0+ (root@victoria) (gcc version 5.2.1 20150903 
(Debian 5.2.1-16) ) #29 SMP PREEMPT Mon Sep 7 07:10:45 ACST 2015
[0.00] x86/fpu: Legacy x87 FPU detected.
[0.00] x86/fpu: Using 'lazy' FPU context switches.
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
[0.00] BIOS-e820: [mem 0x0009fc00-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x3ffa] usable
[0.00] BIOS-e820: [mem 0x3ffb-0x3ffbdfff] ACPI data
[0.00] BIOS-e820: [mem 0x3ffbe000-0x3ffd] ACPI NVS
[0.00] BIOS-e820: [mem 0x3ffe-0x3fff] reserved
[0.00] BIOS-e820: [mem 0xe000-0xefff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved
[0.00] BIOS-e820: [mem 0xff78-0x] reserved
[0.00] Notice: NX (Execute Disable) protection cannot be enabled: 
non-PAE kernel!
[0.00] SMBIOS 2.3 present.
[0.00] DMI: System manufacturer System Product Name/A8V-MX, BIOS 0503   
 12/06/2005
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0x3ffb0 max_arch_pfn = 0x10
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00]   0-9 write-back
[0.00]   A-E uncachable
[0.00]   F-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00]   0 base 00 mask FFC000 write-back
[0.00]   1 base 00D000 mask FFF000 write-combining
[0.00]   2 disabled
[0.00]   3 disabled
[0.00]   4 disabled
[0.00]   5 disabled
[0.00]   6 disabled
[0.0

Re: eata fails to load on post 4.2 kernels

2015-09-06 Thread Arthur Marsh

Arthur Marsh wrote on 06/09/15 15:58:

Hi, I'm seeing the following on post 4.2 kernels, am currently bisecting
to find where it started:


An error message suggested setting irqpoll on the kernel command
line, which worked:


[   85.230148] EATA/DMA 2.0x: Copyright (C) 1994-2003 Dario Ballabio.
[   85.255929] EATA config options -> tm:1, lc:y, mq:16, rs:y, et:n, 
ip:n, ep:n, pp:y.

[   85.282472] EATA0: 2.0C, PCI 0xd890, IRQ 10, BMST, SG 122, MB 64.
[   85.308281] EATA0: wide SCSI support enabled, max_id 16, max_lun 8.
[   85.333237] EATA0: SCSI channel 0 enabled, host target ID 7.
[   85.358097] scsi host3: EATA/DMA 2.0x rev. 8.10.00
[   86.950246] scsi 3:0:6:0: Direct-Access IBM  DCAS-34330W 
 S65A PQ: 0 ANSI: 2

[   86.975531] scsi 3:0:6:0: cmds/lun 16, sorted, simple tags.
[   89.075921] sd 3:0:6:0: Attached scsi generic sg4 type 0
[   89.101628] sd 3:0:6:0: [sdd] 8466688 512-byte logical blocks: (4.33 
GB/4.03 GiB)

[   89.166331] sd 3:0:6:0: [sdd] Write Protect is off
[   89.192023] sd 3:0:6:0: [sdd] Mode Sense: b3 00 00 08
[   89.209400] sd 3:0:6:0: [sdd] Write cache: enabled, read cache: 
enabled, doesn't support DPO or FUA

[   89.312977]  sdd: sdd1 sdd2 < sdd5 >
[   89.402386] sd 3:0:6:0: [sdd] Attached SCSI disk



Re: eata fails to load on post 4.2 kernels

2015-09-06 Thread Arthur Marsh

Arthur Marsh wrote on 06/09/15 18:34:

Arthur Marsh wrote on 06/09/15 15:58:

Hi, I'm seeing the following on post 4.2 kernels, am currently bisecting
to find where it started:


First kernel in the bisection that worked without needing irqpoll:

[   73.751482] EATA0: IRQ 10 mapped to IO-APIC IRQ 17.
[   73.776711] EATA/DMA 2.0x: Copyright (C) 1994-2003 Dario Ballabio.
[   73.802005] EATA config options -> tm:1, lc:y, mq:16, rs:y, et:n, 
ip:n, ep:n, pp:y.

[   73.829175] EATA0: 2.0C, PCI 0xd890, IRQ 17, BMST, SG 122, MB 64.
[   73.82] EATA0: wide SCSI support enabled, max_id 16, max_lun 8.
[   73.881125] EATA0: SCSI channel 0 enabled, host target ID 7.
[   73.906599] scsi host3: EATA/DMA 2.0x rev. 8.10.00
[   75.466016] scsi 3:0:6:0: Direct-Access IBM  DCAS-34330W 
 S65A PQ: 0 ANSI: 2

[   75.491947] scsi 3:0:6:0: cmds/lun 16, sorted, simple tags.
[   77.560139] sd 3:0:6:0: Attached scsi generic sg4 type 0
[   77.586272] sd 3:0:6:0: [sdd] 8466688 512-byte logical blocks: (4.33 
GB/4.03 GiB)

[   77.671836] sd 3:0:6:0: [sdd] Write Protect is off
[   77.700217] sd 3:0:6:0: [sdd] Mode Sense: b3 00 00 08
[   77.725970] sd 3:0:6:0: [sdd] Write cache: enabled, read cache: 
enabled, doesn't support DPO or FUA

[   77.829574]  sdd: sdd1 sdd2 < sdd5 >
[   77.929879] sd 3:0:6:0: [sdd] Attached SCSI disk




Re: [PATCH] eata: remove driver_lock

2014-08-10 Thread Arthur Marsh

Christoph Hellwig wrote, on 14/07/14 17:56:

port_detect is only called from the module_init routine and thus implicitly
serialized, so remove the driver lock which was held over potentially
sleeping function calls.

Signed-off-by: Christoph Hellwig <h...@lst.de>
Reported-by: Arthur Marsh <arthur.ma...@internode.on.net>
Tested-by: Arthur Marsh <arthur.ma...@internode.on.net>
---
  drivers/scsi/eata.c | 9 -
  1 file changed, 9 deletions(-)

diff --git a/drivers/scsi/eata.c b/drivers/scsi/eata.c
index 03372cf..980898e 100644
--- a/drivers/scsi/eata.c
+++ b/drivers/scsi/eata.c
@@ -837,7 +837,6 @@ struct hostdata {
  static struct Scsi_Host *sh[MAX_BOARDS];
  static const char *driver_name = "EATA";
  static char sha[MAX_BOARDS];
-static DEFINE_SPINLOCK(driver_lock);

  /* Initialize num_boards so that ihdlr can work while detect is in progress */
  static unsigned int num_boards = MAX_BOARDS;
@@ -1097,8 +1096,6 @@ static int port_detect(unsigned long port_base, unsigned 
int j,
goto fail;
}

-   spin_lock_irq(&driver_lock);
-
if (do_dma(port_base, 0, READ_CONFIG_PIO)) {
  #if defined(DEBUG_DETECT)
printk("%s: detect, do_dma failed at 0x%03lx.\n", name,
@@ -1265,10 +1262,7 @@ static int port_detect(unsigned long port_base, unsigned 
int j,
}
  #endif

-   spin_unlock_irq(&driver_lock);
sh[j] = shost = scsi_register(tpnt, sizeof(struct hostdata));
-   spin_lock_irq(&driver_lock);
-
if (shost == NULL) {
printk("%s: unable to register host, detaching.\n", name);
goto freedma;
@@ -1345,8 +1339,6 @@ static int port_detect(unsigned long port_base, unsigned 
int j,
else
sprintf(dma_name, DMA %u, dma_channel);

-   spin_unlock_irq(driver_lock);
-
for (i = 0; i  shost-can_queue; i++)
ha-cp[i].cp_dma_addr = pci_map_single(ha-pdev,
  ha-cp[i],
@@ -1439,7 +1431,6 @@ static int port_detect(unsigned long port_base, unsigned 
int j,
freeirq:
free_irq(irq, sha[j]);
freelock:
-   spin_unlock_irq(driver_lock);
release_region(port_base, REGION_SIZE);
fail:
return 0;



Not sure if this is related but it only appeared in the last few days in 
Linus' git master:


[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Initializing cgroup subsys cpuacct
[0.00] Linux version 3.16.0+ (root@am64) (gcc version 4.9.1 (Debian 4.9.1-5) ) #1141 SMP Sun Aug 10 20:50:33 ACST 2014
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x0100-0x0009efff] usable
[0.00] BIOS-e820: [mem 0x0009f000-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000f-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x5fffbfff] usable
[0.00] BIOS-e820: [mem 0x5fffc000-0x5fffefff] ACPI data
[0.00] BIOS-e820: [mem 0x5000-0x5fff] ACPI NVS
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved
[0.00] BIOS-e820: [mem 0xfee0-0xfee00fff] reserved
[0.00] BIOS-e820: [mem 0x-0x] reserved
[0.00] Notice: NX (Execute Disable) protection missing in CPU!
[0.00] SMBIOS 2.3 present.
[0.00] DMI: System Manufacturer System Name/P4S800, BIOS ASUS P4S800 ACPI BIOS Revision 1011 Beta 001 08/30/2005
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0x5fffc max_arch_pfn = 0x10
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00]   0-9 write-back
[0.00]   A-B uncachable
[0.00]   C-C7FFF write-protect
[0.00]   C8000-E uncachable
[0.00]   F-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00]   0 base 0 mask FC000 write-back
[0.00]   1 base 04000 mask FE000 write-back
[0.00]   2 base 0C000 mask FF000 write-combining
[0.00]   3 disabled
[0.00]   4 disabled
[0.00]   5 disabled
[0.00]   6 disabled
[0.00]   7 disabled
[0.00] x86 PAT enabled: cpu 0, old 0x7010600070106, new 0x7010600070106
[0.00] initial memory mapped: [mem 0x-0x023f]
[0.00] Base memory trampoline at [c009b000] 9b000 size 16384
[0.00] init_memory_mapping: [mem 0x-0x000f]
[0.00]  [mem 0x-0x000f] page 4k
[0.00] init_memory_mapping: [mem 0x3700-0x373f]
[0.00]  [mem 0x3700-0x373f] page 2M
[0.00] init_memory_mapping: [mem 0x3000-0x36ff

Re: eata - issue appeared in Linus git master in last 24-48 hours

2014-07-11 Thread Arthur Marsh



Christoph Hellwig wrote, on 11/07/14 18:50:

On Mon, Jun 30, 2014 at 04:31:33AM +0930, Arthur Marsh wrote:

Hi, I haven't had time to do a git bisect yet, but just saw this after
rebuilding the kernel in the last day or so:


It seems like some of the routines called during the driver
initialization may sleep while the driver_lock is held and irqs are
disabled.
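
For context, the "spinlock wrong CPU" report in the original message is what the kernel's spinlock debugging prints when a lock is released on a different CPU from the one that acquired it, which can happen if the task sleeps and is rescheduled while still holding the lock. Below is a minimal userspace sketch of that check in C, under a deliberately simplified model. The names struct dbg_lock, dbg_lock_acquire(), dbg_lock_release() and init_path() are hypothetical and exist only for illustration; they are not kernel APIs and not code from eata.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified model of a debug spinlock: it records which
 * CPU took the lock so that the release path can compare. */
struct dbg_lock {
	bool held;
	int owner_cpu;
};

/* Take the lock on behalf of code currently running on `cpu`. */
void dbg_lock_acquire(struct dbg_lock *l, int cpu)
{
	assert(!l->held);	/* this model has no nesting or contention */
	l->held = true;
	l->owner_cpu = cpu;
}

/* Release the lock from `cpu`. Returns false, and reports on stderr,
 * when the lock is not held or the releasing CPU differs from the
 * recorded owner CPU. */
bool dbg_lock_release(struct dbg_lock *l, int cpu)
{
	bool ok = l->held && l->owner_cpu == cpu;

	if (!ok)
		fprintf(stderr, "BUG: lock wrong CPU on CPU#%d, owner_cpu: %d\n",
			cpu, l->owner_cpu);
	l->held = false;
	return ok;
}

/* Model of an init path: lock on `start_cpu`, do some work, unlock on
 * `end_cpu`. A different end_cpu stands in for the task having slept
 * and been rescheduled elsewhere while the lock was still held. */
bool init_path(int start_cpu, int end_cpu)
{
	struct dbg_lock l = { false, -1 };

	dbg_lock_acquire(&l, start_cpu);
	/* ... a call that may sleep would go here ... */
	return dbg_lock_release(&l, end_cpu);
}
```

In this model, init_path(0, 1) corresponds to the report's ".owner_cpu: 0" and "CPU#1", while init_path(0, 0) passes the check. That would be consistent with the warning only showing up on boots where the task happens to migrate, though the sketch does not prove that is what happened here.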

As eata2x_detect is only called during module load the lock seems
entirely pointless and should be removed, like in the patch below:


diff --git a/drivers/scsi/eata.c b/drivers/scsi/eata.c
index 03372cf..980898e 100644
--- a/drivers/scsi/eata.c
+++ b/drivers/scsi/eata.c
@@ -837,7 +837,6 @@ struct hostdata {
 static struct Scsi_Host *sh[MAX_BOARDS];
 static const char *driver_name = "EATA";
 static char sha[MAX_BOARDS];
-static DEFINE_SPINLOCK(driver_lock);
 
 /* Initialize num_boards so that ihdlr can work while detect is in progress */
 static unsigned int num_boards = MAX_BOARDS;
@@ -1097,8 +1096,6 @@ static int port_detect(unsigned long port_base, unsigned int j,
 		goto fail;
 	}
 
-	spin_lock_irq(&driver_lock);
-
 	if (do_dma(port_base, 0, READ_CONFIG_PIO)) {
 #if defined(DEBUG_DETECT)
 		printk("%s: detect, do_dma failed at 0x%03lx.\n", name,
@@ -1265,10 +1262,7 @@ static int port_detect(unsigned long port_base, unsigned int j,
 	}
 #endif
 
-	spin_unlock_irq(&driver_lock);
 	sh[j] = shost = scsi_register(tpnt, sizeof(struct hostdata));
-	spin_lock_irq(&driver_lock);
-
 	if (shost == NULL) {
 		printk("%s: unable to register host, detaching.\n", name);
 		goto freedma;
@@ -1345,8 +1339,6 @@ static int port_detect(unsigned long port_base, unsigned int j,
 	else
 		sprintf(dma_name, "DMA %u", dma_channel);
 
-	spin_unlock_irq(&driver_lock);
-
 	for (i = 0; i < shost->can_queue; i++)
 		ha->cp[i].cp_dma_addr = pci_map_single(ha->pdev,
 							  &ha->cp[i],
@@ -1439,7 +1431,6 @@ static int port_detect(unsigned long port_base, unsigned int j,
       freeirq:
 	free_irq(irq, &sha[j]);
       freelock:
-	spin_unlock_irq(&driver_lock);
 	release_region(port_base, REGION_SIZE);
       fail:
 	return 0;



Thanks, I've rebuilt the kernel with this patch applied, and the rebuilt 
kernel is running fine with a DPT 2044W SCSI adaptor:


$ lspci|grep DPT
00:0c.0 SCSI storage controller: Adaptec (formerly DPT) SmartCache/Raid I-IV Controller (rev 02)



$ dmesg|grep -i eata
[1.038968] EATA0: warning, DMA protocol support not asserted.
[1.039041] EATA0: IRQ 11 mapped to IO-APIC IRQ 16.
[1.040801] EATA/DMA 2.0x: Copyright (C) 1994-2003 Dario Ballabio.
[1.040861] EATA config options - tm:1, lc:y, mq:16, rs:y, et:n, ip:n, ep:n, pp:y.
[1.040922] EATA0: 2.0C, PCI 0x7410, IRQ 16, BMST, SG 122, MB 64.
[1.040973] EATA0: wide SCSI support enabled, max_id 16, max_lun 8.
[1.041025] EATA0: SCSI channel 0 enabled, host target ID 7.
[1.041095] scsi2 : EATA/DMA 2.0x rev. 8.10.00

Arthur.


eata - issue appeared in Linus git master in last 24-48 hours

2014-06-29 Thread Arthur Marsh

Hi, I haven't had time to do a git bisect yet, but just saw this after
rebuilding the kernel in the last day or so:


[1.044035] EATA0: warning, DMA protocol support not asserted.
[1.044035] EATA0: IRQ 11 mapped to IO-APIC IRQ 16.
[1.046040] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[1.046123] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[1.046204] usb usb1: Product: EHCI Host Controller
[1.046275] usb usb1: Manufacturer: Linux 3.16.0-rc2+ ehci_hcd
[1.046348] usb usb1: SerialNumber: :00:03.3
[1.049496] hub 1-0:1.0: USB hub found
[1.050029] hub 1-0:1.0: 6 ports detected
[1.050625] BUG: spinlock wrong CPU on CPU#1, systemd-udevd/63
[1.050700]  lock: driver_lock+0x0/0xef00 [eata], .magic: dead4ead, .owner: systemd-udevd/63, .owner_cpu: 0
[1.050785] CPU: 1 PID: 63 Comm: systemd-udevd Not tainted 3.16.0-rc2+ #1038
[1.050850] Hardware name: System Manufacturer System Name/P4S800, BIOS ASUS P4S800 ACPI BIOS Revision 1011 Beta 001 08/30/2005
[1.050935]  f8048100  f707fad8 c1416ba9 f43aad44 f707fb04 c1081b6f c158aab0
[1.051301]  f8048100 dead4ead f43aad44 003f  f8048100 c155b1eb 0010
[1.051678]  f707fb14 c1081bdc f8048100 f50e8000 f707fb20 c1081e43 f8048100 f707fb2c
[1.052051] Call Trace:
[1.052119]  [<c1416ba9>] dump_stack+0x41/0x52
[1.052183]  [<c1081b6f>] spin_dump+0x8c/0xde
[1.052249]  [<c1081bdc>] spin_bug+0x1b/0x1f
[1.052310]  [<c1081e43>] do_raw_spin_unlock+0x79/0x7b
[1.052377]  [<c141c495>] _raw_spin_unlock_irq+0x1d/0x26
[1.052444]  [<f8046176>] port_detect+0xa54/0xefc [eata]
[1.052509]  [<c141a510>] ? __mutex_unlock_slowpath+0xb6/0x136
[1.052576]  [<c141a598>] ? mutex_unlock+0x8/0xa
[1.052641]  [<c102d766>] ? ioapic_write_entry+0x17/0x43
[1.052706]  [<c102d78b>] ? ioapic_write_entry+0x3c/0x43
[1.052771]  [<c102d78b>] ? ioapic_write_entry+0x3c/0x43
[1.052837]  [<c102e633>] ? io_apic_setup_irq_pin+0x175/0x319
[1.052904]  [<c12804bf>] ? acpi_os_release_lock+0x8/0xa
[1.052970]  [<c131f7dc>] ? pci_conf1_read+0x43/0xdd
[1.053036]  [<c131f801>] ? pci_conf1_read+0x68/0xdd
[1.053101]  [<c1410ccc>] ? klist_next+0x1b/0xef
[1.053166]  [<c1410d9e>] ? klist_next+0xed/0xef
[1.053237]  [<c141c449>] ? _raw_spin_unlock+0x1d/0x20
[1.053304]  [<c1410d9e>] ? klist_next+0xed/0xef
[1.053383]  [<c12505f6>] ? pci_do_find_bus+0x36/0x36
[1.053449]  [<c12e1a18>] ? bus_find_device+0x5b/0x7d
[1.053511]  [<c12dfc7c>] ? put_device+0xf/0x11
[1.053571]  [<c124f172>] ? pci_dev_put+0xf/0x11
[1.053635]  [<c125078e>] ? pci_get_dev_by_id+0x3f/0x8a
[1.053701]  [<c12505f6>] ? pci_do_find_bus+0x36/0x36
[1.053763]  [<c12508d4>] ? pci_get_class+0x46/0x48
[1.053829]  [<f80466f7>] eata2x_detect+0xd9/0x3ef [eata]
[1.053836] ohci-pci: OHCI PCI platform driver
[1.054213] ohci-pci :00:03.0: OHCI PCI host controller
[1.054231] ohci-pci :00:03.0: new USB bus registered, assigned bus number 2
[1.054293] ohci-pci :00:03.0: irq 9, io mem 0xbe80
[1.054782]  [<f8021000>] ? 0xf8020fff
[1.054853]  [<f8021054>] init_this_scsi_driver+0x54/0x1000 [eata]
[1.054923]  [<f8021000>] ? 0xf8020fff
[1.054987]  [<c100041b>] do_one_initcall+0x75/0x198
[1.055051]  [<f8021000>] ? 0xf8020fff
[1.055115]  [<c11255cd>] ? __vunmap+0x77/0xce
[1.055179]  [<c10aa53b>] load_module+0x19a6/0x224a
[1.055248]  [<c10aaed2>] SyS_finit_module+0x5c/0x6b
[1.055320]  [<f8039000>] ? 0xf8038fff
[1.055385]  [<c141ce0e>] syscall_call+0x7/0xb
[1.060902] EATA/DMA 2.0x: Copyright (C) 1994-2003 Dario Ballabio.
[1.060966] EATA config options - tm:1, lc:y, mq:16, rs:y, et:n, ip:n, ep:n, pp:y.
[1.061029] EATA0: 2.0C, PCI 0x7410, IRQ 16, BMST, SG 122, MB 64.
[1.061080] EATA0: wide SCSI support enabled, max_id 16, max_lun 8.
[1.061132] EATA0: SCSI channel 0 enabled, host target ID 7.
[1.061192] scsi0 : EATA/DMA 2.0x rev. 8.10.00

The machine has a dual core P4 and the kernel was compiled with gcc-4.9.0:

 Linux version 3.16.0-rc2+ (root@am64) (gcc version 4.9.0 (Debian 4.9.0-9) ) #1038 SMP Sun Jun 29 10:19:20 CST 2014

The actual SCSI HBA is a DPT 2044W.

Regards,

Arthur.



Re: eata - issue appeared in Linus git master in last 24-48 hours

2014-06-29 Thread Arthur Marsh

Arthur Marsh wrote, on 30/06/14 04:31:

Hi, I haven't had time to do a git bisect yet, but just saw this after
rebuilding the kernel in the last day or so:

[1.044035] EATA0: warning, DMA protocol support not asserted.
[1.044035] EATA0: IRQ 11 mapped to IO-APIC IRQ 16.
[1.046040] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[1.046123] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[1.046204] usb usb1: Product: EHCI Host Controller
[1.046275] usb usb1: Manufacturer: Linux 3.16.0-rc2+ ehci_hcd
[1.046348] usb usb1: SerialNumber: :00:03.3
[1.049496] hub 1-0:1.0: USB hub found
[1.050029] hub 1-0:1.0: 6 ports detected
[1.050625] BUG: spinlock wrong CPU on CPU#1, systemd-udevd/63
[1.050700]  lock: driver_lock+0x0/0xef00 [eata], .magic: dead4ead, .owner: systemd-udevd/63, .owner_cpu: 0
[1.050785] CPU: 1 PID: 63 Comm: systemd-udevd Not tainted 3.16.0-rc2+ #1038
[1.050850] Hardware name: System Manufacturer System Name/P4S800, BIOS ASUS P4S800 ACPI BIOS Revision 1011 Beta 001 08/30/2005
[1.050935]  f8048100  f707fad8 c1416ba9 f43aad44 f707fb04 c1081b6f c158aab0
[1.051301]  f8048100 dead4ead f43aad44 003f  f8048100 c155b1eb 0010
[1.051678]  f707fb14 c1081bdc f8048100 f50e8000 f707fb20 c1081e43 f8048100 f707fb2c
[1.052051] Call Trace:
[1.052119]  [<c1416ba9>] dump_stack+0x41/0x52
[1.052183]  [<c1081b6f>] spin_dump+0x8c/0xde
[1.052249]  [<c1081bdc>] spin_bug+0x1b/0x1f
[1.052310]  [<c1081e43>] do_raw_spin_unlock+0x79/0x7b
[1.052377]  [<c141c495>] _raw_spin_unlock_irq+0x1d/0x26
[1.052444]  [<f8046176>] port_detect+0xa54/0xefc [eata]
[1.052509]  [<c141a510>] ? __mutex_unlock_slowpath+0xb6/0x136
[1.052576]  [<c141a598>] ? mutex_unlock+0x8/0xa
[1.052641]  [<c102d766>] ? ioapic_write_entry+0x17/0x43
[1.052706]  [<c102d78b>] ? ioapic_write_entry+0x3c/0x43
[1.052771]  [<c102d78b>] ? ioapic_write_entry+0x3c/0x43
[1.052837]  [<c102e633>] ? io_apic_setup_irq_pin+0x175/0x319
[1.052904]  [<c12804bf>] ? acpi_os_release_lock+0x8/0xa
[1.052970]  [<c131f7dc>] ? pci_conf1_read+0x43/0xdd
[1.053036]  [<c131f801>] ? pci_conf1_read+0x68/0xdd
[1.053101]  [<c1410ccc>] ? klist_next+0x1b/0xef
[1.053166]  [<c1410d9e>] ? klist_next+0xed/0xef
[1.053237]  [<c141c449>] ? _raw_spin_unlock+0x1d/0x20
[1.053304]  [<c1410d9e>] ? klist_next+0xed/0xef
[1.053383]  [<c12505f6>] ? pci_do_find_bus+0x36/0x36
[1.053449]  [<c12e1a18>] ? bus_find_device+0x5b/0x7d
[1.053511]  [<c12dfc7c>] ? put_device+0xf/0x11
[1.053571]  [<c124f172>] ? pci_dev_put+0xf/0x11
[1.053635]  [<c125078e>] ? pci_get_dev_by_id+0x3f/0x8a
[1.053701]  [<c12505f6>] ? pci_do_find_bus+0x36/0x36
[1.053763]  [<c12508d4>] ? pci_get_class+0x46/0x48
[1.053829]  [<f80466f7>] eata2x_detect+0xd9/0x3ef [eata]
[1.053836] ohci-pci: OHCI PCI platform driver
[1.054213] ohci-pci :00:03.0: OHCI PCI host controller
[1.054231] ohci-pci :00:03.0: new USB bus registered, assigned bus number 2
[1.054293] ohci-pci :00:03.0: irq 9, io mem 0xbe80
[1.054782]  [<f8021000>] ? 0xf8020fff
[1.054853]  [<f8021054>] init_this_scsi_driver+0x54/0x1000 [eata]
[1.054923]  [<f8021000>] ? 0xf8020fff
[1.054987]  [<c100041b>] do_one_initcall+0x75/0x198
[1.055051]  [<f8021000>] ? 0xf8020fff
[1.055115]  [<c11255cd>] ? __vunmap+0x77/0xce
[1.055179]  [<c10aa53b>] load_module+0x19a6/0x224a
[1.055248]  [<c10aaed2>] SyS_finit_module+0x5c/0x6b
[1.055320]  [<f8039000>] ? 0xf8038fff
[1.055385]  [<c141ce0e>] syscall_call+0x7/0xb
[1.060902] EATA/DMA 2.0x: Copyright (C) 1994-2003 Dario Ballabio.
[1.060966] EATA config options - tm:1, lc:y, mq:16, rs:y, et:n, ip:n, ep:n, pp:y.
[1.061029] EATA0: 2.0C, PCI 0x7410, IRQ 16, BMST, SG 122, MB 64.
[1.061080] EATA0: wide SCSI support enabled, max_id 16, max_lun 8.
[1.061132] EATA0: SCSI channel 0 enabled, host target ID 7.
[1.061192] scsi0 : EATA/DMA 2.0x rev. 8.10.00


This wasn't repeated on a reboot, so at this stage it looks like a one-off problem.

Arthur.
