O hang caused by irq vector automatic affinity
Works on HP DL360G6 with integrated smartarray, with no visible
regressions.
--
Meelis Roos (mr...@linux.ee)
> Hi Meelis,
>
> This issue should already be addressed by a very recent commit:
>
> 6a2cf8d3663e13e1 scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe
> failure
What tree is that commit in?
--
Meelis Roos (mr...@linux.ee)
> + qla2x00_mem_free(ha);
> +
> + qla2x00_free_queues(ha);
> +
>
> are unnecessary. These routines are already called by qla2x00_free_device
> just above in qla2x00_remove_one.
No, that was the point of my changes - they must not be called from
qla2x00_free_device or they will be d
, and also zero
the req and rsp pointers after freeing them once in the error handler of
qla2x00_probe_one().
This fixes memory corruption and further crashes in unrelated code when qla2200
init fails for some reason.
Signed-off-by: Meelis Roos <mr...@linux.ee>
---
drivers/scsi/qla2xxx/qla_os.
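The approach described in this patch (free the queues exactly once and zero the pointers so a later cleanup path cannot free them again) can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct fake_ha`, `free_queues_once()` and the use of `malloc`/`free` are hypothetical stand-ins, not the driver's real structures or allocators.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the adapter state; not the real qla2xxx struct. */
struct fake_ha {
    void *req;   /* request queue memory */
    void *rsp;   /* response queue memory */
};

/*
 * Free both queues and zero the pointers, so that a second call from
 * another cleanup path is harmless (free(NULL) is a no-op).
 */
static void free_queues_once(struct fake_ha *ha)
{
    free(ha->req);
    ha->req = NULL;
    free(ha->rsp);
    ha->rsp = NULL;
}
```

Calling `free_queues_once()` from both an error handler and a later teardown routine is then safe, because the second call only sees NULL pointers.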
Fix an obvious copy-paste error in freeing QLAFX00 response queue - the code
checked for rsp->ring but freed rsp->ring_fx00.
Signed-off-by: Meelis Roos <mr...@linux.ee>
---
drivers/scsi/qla2xxx/qla_os.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/
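The copy-paste error described above comes down to testing one pointer and releasing another. A minimal userspace sketch of the corrected pattern follows; `struct fake_rsp`, `free_ring_fx00()` and `malloc`/`free` are hypothetical stand-ins chosen for illustration, and the real driver releases its rings through its own allocation routines.

```c
#include <stdlib.h>

/* Hypothetical response-queue struct with two separately allocated rings. */
struct fake_rsp {
    void *ring;
    void *ring_fx00;
};

/*
 * Test the same member that is released, then clear it.
 * Returns 1 if memory was freed, 0 if there was nothing to free.
 */
static int free_ring_fx00(struct fake_rsp *rsp)
{
    if (!rsp->ring_fx00)      /* the buggy pattern tested rsp->ring here */
        return 0;
    free(rsp->ring_fx00);
    rsp->ring_fx00 = NULL;
    return 1;
}
```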
- will submit them separately). After that I can observe firmware
being loaded and verified but qla2x00_init_firmware fails.
A detailed debug trace with qla2xxx.ql2xextended_error_logging=0x7fff
is available at http://kodu.ut.ee/~mroos/qla2200-sparc64-trace.txt
How can I debug it further?
--
Meelis
> This happens on a HP DL360 G6 with Smart Array 410i.
>
> Will try to bisect.
>
> IO completion timeout could be because of some IRQ troubles?
Reverting 84676c1f21e8ff54befe985f4f14dc1edc10046b fixes it for me (as
suggested by Laurence Oberman).
--
Meelis Roos (mr...@linux.ee)
e_one() so PCI layer assumes there is
driver attached, and tries to shut it down later.
Fix it by returning error from aac_probe_one() when card-specific init
function fails.
This fixes reboot on my HP NetRAID-4M with dead battery.
Signed-off-by: Meelis Roos <mr...@linux.ee>
diff --gi
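The fix described above (return an error from the probe routine when the card-specific init function fails, so the bus layer does not consider a driver attached) can be sketched as follows. All names here are hypothetical, and the choice of `-ENODEV` is an assumption for illustration; the excerpt does not show which error code the actual patch returns.

```c
#include <errno.h>

/* Hypothetical card-specific init hook: 0 on success, nonzero on failure. */
typedef int (*card_init_fn)(void *dev);

/*
 * Sketch of a probe routine that propagates init failure instead of
 * returning 0, so the caller never treats the device as bound.
 */
static int fake_probe_one(void *dev, card_init_fn init)
{
    if (init(dev) != 0)
        return -ENODEV;   /* assumed error code, for illustration only */
    return 0;
}

/* Two trivial init hooks used to exercise both paths. */
static int init_ok(void *dev)   { (void)dev; return 0; }
static int init_fail(void *dev) { (void)dev; return 1; }
```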
reg_2xxx __iomem *reg = &ha->iobase->isp;
/* Read all mbox registers? */
- mboxes = (1 << ha->mbx_count) - 1;
+ mboxes = (1ULL << ha->mbx_count) - 1;
if (!ha->mcp)
ql_dbg(ql_dbg_async, vha, 0x5001, "MBX pointer ERROR.\n");
else
--
Meelis Roos (mr...@linux.ee)
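The one-character change in the hunk above matters because a shift on a plain `int` constant is undefined once the shift count reaches the width of the type. A minimal sketch of the 64-bit form is below; `low_bits_mask()` is a hypothetical helper, and whether `mbx_count` actually reaches 32 on a given adapter is an assumption not confirmed by the excerpt.

```c
#include <stdint.h>

/*
 * Build a mask with the low `count` bits set, for count in 0..63.
 * The shift is done on a 64-bit unsigned value; with a plain 32-bit
 * int constant, shifting 1 left by 31 or more is undefined behavior in C.
 */
static uint64_t low_bits_mask(unsigned int count)
{
    return (1ULL << count) - 1;
}
```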
> Hello again.
And again...
>
> > > On Sep 18, 2017, at 3:49 AM, Meelis Roos <mr...@linux.ee> wrote:
> > >
> > > Hello, I decided to widen the coverage of my kernel testbed and put some
> > > FC cards into servers. This one is a PCI-X QLA23
map+0x99/0x120
? do_init_module+0x1a/0x245
do_init_module+0x83/0x245
load_module+0x2764/0x34a0
? kernel_read_file+0x150/0x320
SyS_finit_module+0x82/0xa0
do_fast_syscall_32+0xba/0x340
Signed-off-by: Meelis Roos <mr...@linux.ee>
diff --git a/drivers/scsi/aacraid/sa.c b/drivers/scsi/aac
> if (now.tv_nsec > NSEC_PER_SEC / 2)
> ++now.tv_sec;
>
> but I don't see why we add in half a second here. Any ideas?
I did not try to understand the details but I can confirm that this
patch makes the warnings go away.
--
Meelis Roos (mr...@linux.ee)
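One possible reading of the quoted two lines is that they round the timestamp to the nearest second instead of truncating it; the excerpt does not confirm that this was the intent. A userspace sketch of that reading, with the hypothetical helper `round_to_second()`:

```c
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Round a timespec to the nearest whole second.
 * A fraction of exactly half a second rounds down, matching the
 * strict '>' comparison in the quoted code.
 */
static time_t round_to_second(struct timespec now)
{
    if (now.tv_nsec > NSEC_PER_SEC / 2)
        ++now.tv_sec;
    return now.tv_sec;
}
```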
+0x13d/0x1f0
[ 12.293172] ? aac_send_hosttime+0xf0/0xf0 [aacraid]
[ 12.293231] ? __kthread_create_worker+0x110/0x110
[ 12.293289] ret_from_fork+0x19/0x24
[ 12.293345]
--
Meelis Roos (mr...@linux.ee)
Hello again.
> > On Sep 18, 2017, at 3:49 AM, Meelis Roos <mr...@linux.ee> wrote:
> >
> > Hello, I decided to widen the coverage of my kernel testbed and put some
> > FC cards into servers. This one is a PCI-X QLA2340 in HP Proliant DL 380
> > G4 (first 64
> On 08/19/2017 10:41 PM, Meelis Roos wrote:
> > Hello, I just tried Linux with the latest kernel (4.13-rc5+git) on a HP
> > DL360 G6 with HP branded ISP2432 HBA. The driver mentions unsupported
> > model of the card:
> >
> > [3.868589] scsi host1: qla2xxx
devinfo->target_mask = (0x01 << devinfo->target_offset);
+ } else {
+ devinfo->target_mask = 0;
+ }
}
void
--
Meelis Roos (mr...@linux.ee)
249,10 +251,12 @@ ahc_linux_pci_dev_probe(struct pci_dev *pdev, const
struct pci_device_id *ent)
return (-ENODEV);
}
}
+ ahc_set_unit(ahc, ahc_linux_unit++);
ahc->dev_softc = pci;
error = ahc_pci_config(ahc, entry);
if (error != 0) {
ahc_free(ahc);
+ ahc_linux_unit--;
return (-error);
}
--
Meelis Roos (mr...@linux.ee)
976] qla2xxx [:06:02.0]-00fc:4: ISP2312: PCI-X (100 MHz) @
:06:02.0 hdma+ host#=4 fw=3.03.28 IPX.
--
Meelis Roos (mr...@linux.ee)
Just went and changed kernel conf to HPSA instead of old CCISS but got a
compilation failure:
drivers/scsi/scsi_transport_sas.o: In function `sas_bsg_initialize':
scsi_transport_sas.c:(.text+0x12fd): undefined reference to `bsg_setup_queue'
scsi_transport_sas.c:(.text+0x13b2): undefined
+Smart Array P400
> +Smart Array P400i
> +Smart Array P600
> +Smart Array P700m
> +Smart Array P800
> +.fi
> .SS Configuration details
> To configure HP Smart Array controllers,
> use the HP Array Configuration Utility (either
>
--
Meelis Roos (mr...@ut.ee) http://www.cs.ut.ee/~mroos/
2432
SSVID/SSDID (0x103C,0x7041).
Is there some information I can provide to get this card included in the fully
supported list?
--
Meelis Roos (mr...@linux.ee)
> You should be able to suppress the "can't get device id' messages with:
Yes, these messages are gone and it still works.
--
Meelis Roos (mr...@linux.ee)
ct-Access MAN3735MC
hpsa0: hpsa_update_device_info: can't get device id for host 0:C0:T-1:L-1
Direct-Access MAP3735NC
report luns requested format 2, got 0
--
Meelis Roos (mr...@linux.ee)
be worth trying hpsa driver instead of cciss,
with a longer term goal to move users of cciss over to hpsa if
possible. Now that I have tested it, it seems not all older cards are
supported in hpsa - it's more than ID-s and interrupt masks.
--
Meelis Roos (mr...@linux.ee)
ind it but no sda is
detected and no bootup.
What next?
And, for readability, we should use something like "Using unsupported
board ID", not plain "unsupported board ID" - the latter leaves the
impression that it will not work, although it should.
--
Meelis Roos (mr...@linux.ee)
[5.199125] hpsa :00:04.0: unrecognized board ID: 0x40800e11, ignoring.
[5.282517] hpsa :00:04.0: Board ID not found
I added the specific PCI ID and subdevice ID quad and I still get the same
messages and the adapter is ignored.
What am I doing wrong?
--
Meelis Roos (mr...@linux.ee)
nd of June, 5.4.0 appeared masked. Unmasking kgcc64
5.4.0 and building it made my pariscs work again, with 4.7-rc6 running
fine on all of them.
--
Meelis Roos (mr...@linux.ee)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to ma
0xc900035d2000.
[4.167084] qla2xxx [:07:00.0]-0034:3: MSI-X: Unsupported ISP 2432
SSVID/SSDID (0x103C,0x7041).
Why does the driver need to know subsystem ID-s at all?
--
Meelis Roos (mr...@linux.ee)
[4.260755] [] entry_SYSCALL64_slow_path+0x25/0x25
[4.260933]
--
Meelis Roos (mr...@linux.ee)
hange:
Out of interest, how do other parisc users get the new compiler?
What distro are you using?
My pariscs are gentoo and still only the old versions are available on
hppa.
--
Meelis Roos (mr...@linux.ee)
] [c0069378] worker_thread+0x74/0x704
[2.650764] [ef111ef0] [c00731fc] kthread+0xd8/0x134
[2.657326] [ef111f40] [c0019394] ret_from_kernel_thread+0x5c/0x64
[2.663877]
--
Meelis Roos (mr...@linux.ee
] [] ? rest_init+0x60/0x60
[1.732026]
--
Meelis Roos (mr...@linux.ee)
f0/0x838
[<404067b8>] ext4_filemap_fault+0x58/0x90
[<403193f8>] __do_fault+0x78/0x180
[<40320504>] handle_mm_fault+0x134c/0x1ec0
CPU: 2 PID: 1 Comm: init Tainted: GW 4.6.0 #85
Backtrace:
[<40216b58>] show_stack+0x20/0x38
[
+0xb6/0x1d0
[4.900284] [] ? sysenter_past_esp+0x40/0x6a
[4.900284]
--
Meelis Roos (mr...@linux.ee)
x160/0x1b0
Caller[00426ea4]: do_one_initcall+0xe4/0x1e0
Caller[00a70b8c]: kernel_init_freeable+0x130/0x1e0
Caller[0087cfa4]: kernel_init+0x4/0x100
Caller[00406124]: ret_from_fork+0x1c/0x2c
Caller[]: (null)
Instruction DUMP: 15002703 02c6401a 0320 82088001
22c84009 c20e203c 11002704 92100
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0009
Press Stop-A (L1-A) to return to the boot prom
--
Meelis Roos (mr...@linux.ee)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[ 74.084068] Press Stop-A (L1-A) to return to the boot prom
[ 74.155305] ---[ end Kernel panic - not syncing: Irrecoverable deferred
error trap.
--
Meelis Roos (mr...@linux.ee)
it to you:
Revert change that breaks QLA2XXX on big-endian systems,
__constant_cpu_to_le16() is still needed.
Signed-off-by: Meelis Roos <mr...@linux.ee>
diff --git a/drivers/scsi/qla2xxx/qla_fw.h b/drivers/scsi/qla2xxx/qla_fw.h
index 42bb357..88d3143 100644
--- a/drivers/scsi/qla2xxx/qla_fw.h
+++ b
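The reason a conversion such as `__constant_cpu_to_le16()` matters on big-endian machines is that the device expects the low byte first in memory, which is not the host layout there. The userspace sketch below shows the idea only; `to_le16()` is a hypothetical helper written for this illustration and is not the kernel macro.

```c
#include <stdint.h>
#include <string.h>

/*
 * Userspace analogue of a cpu-to-little-endian 16-bit conversion:
 * returns a value whose in-memory bytes are low byte first,
 * whatever the host byte order is.
 */
static uint16_t to_le16(uint16_t host)
{
    unsigned char bytes[2] = { (unsigned char)(host & 0xff),
                               (unsigned char)(host >> 8) };
    uint16_t out;

    memcpy(&out, bytes, sizeof(out));
    return out;
}
```

On a little-endian host this returns its argument unchanged, which is why dropping the conversion goes unnoticed there and only breaks big-endian systems.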
[00646a3c] add_disk+0x33c/0x480
[00791aa8] sd_probe_async+0x148/0x180
[0047d4dc] async_run_entry_fn+0x3c/0x100
--
Meelis Roos (mr...@linux.ee)
] systemd-journald[70]: Received request to flush runtime journal
from PID 1
[ 387.067316] eth0: Link is up using internal transceiver at 100Mb/s, Full
Duplex.
--
Meelis Roos (mr...@linux.ee)
easier, and would make this sort
of problem much less likely to occur!
How about this one?
It makes the machine work.
Thanks for testing!
What's the status of this fix? It is still not applied on yesterday's
3.19.0-rc6-00105-gc59c961 git...
--
Meelis Roos (mr...@linux.ee
] unhandled_fault+0x84/0x90
[ 160.452681] [00444298] do_sparc64_fault+0x498/0x6e0
[ 160.520397] [00407bcc] sparc64_realfault_common+0x10/0x20
[ 160.594378] [005f1cf8] blk_mq_map_queue+0x18/0x20
[ 160.660012] [0066c244] scsi_execute+0x24/0x140
--
Meelis Roos (mr
The second oops is in blk_mq_map_queue() which is a trivial
two level cpu lookup. I wonder if there's something odd about
cpu numbers on these big old sparc systems?
CPU numbers are sparse - they are determined by hardware slot number and
some models only fill every other mainboard
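To illustrate why sparse CPU numbers can matter for a per-CPU lookup table, here is a hedged sketch. It assumes a table indexed directly by CPU id; whether that is what went wrong in the reported oops is not established by this excerpt. `struct cpu_queue_map`, `map_init()` and `map_queue()` are hypothetical names, not blk-mq functions.

```c
#include <stdlib.h>

/*
 * Hypothetical CPU-to-queue map. The table is sized by the highest
 * CPU id plus one, not by the number of CPUs present, so sparse ids
 * (for example 0, 2, 4, 6) still index inside the array.
 */
struct cpu_queue_map {
    unsigned int nr_ids;   /* highest CPU id + 1 */
    unsigned int *map;     /* map[cpu] = queue index */
};

/* Returns 0 on success, -1 on allocation failure. nr_queues must be > 0. */
static int map_init(struct cpu_queue_map *m, unsigned int highest_cpu_id,
                    unsigned int nr_queues)
{
    m->nr_ids = highest_cpu_id + 1;
    m->map = calloc(m->nr_ids, sizeof(*m->map));
    if (!m->map)
        return -1;
    for (unsigned int cpu = 0; cpu < m->nr_ids; cpu++)
        m->map[cpu] = cpu % nr_queues;
    return 0;
}

/* Bounds-checked lookup; returns -1 for an id outside the table. */
static int map_queue(const struct cpu_queue_map *m, unsigned int cpu)
{
    return cpu < m->nr_ids ? (int)m->map[cpu] : -1;
}
```

Had the table been sized by a CPU count of 4 in this example, a lookup for CPU id 6 would have read past the end of the array.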
On Mon, Nov 03, 2014 at 11:32:14PM +0200, Meelis Roos wrote:
Yes. I took the same 3.18.0-rc1-00422-g2cc9188-dirty kernel that had
just this patch reverted, it started the controller fine, detected disk,
mounted root, started multiple tasks and then some time after starting
exim it just
.
Perhaps DaveM can tell which one is correct or if there is any related
problem.
Do either of these help?
Thanx, Paul
On Thu, Nov 06, 2014 at 05:45:43PM +0200, Meelis Roos wrote:
I tested a machine with multiple scsi adapters
bytes left
--
Meelis Roos (mr...@linux.ee)
.
Works fine on both DL380G3 and the other server with MPT and IDE CD.
--
Meelis Roos
, started multiple tasks and then some time after starting
exim it just hangs. This is consistent with what I saw during bisection.
--
Meelis Roos (mr...@linux.ee)
at the
moment.
--
Meelis Roos (mr...@linux.ee)
. The
problem happens with 3.17 too with blk-mq.
--
Meelis Roos (mr...@linux.ee)
-locked)
scsi_eh_lock_door(sdev);
}
+#endif
/*
* next free up anything directly waiting upon the host. this
--
Meelis Roos (mr...@linux.ee)
[ 741.176246] [c10f4d5d] SyS_open+0x1d/0x20
[ 741.227304] [c138070c] sysenter_do_call+0x12/0x12
--
Meelis Roos (mr...@linux.ee)
has similar problem,
scsi_mod.use_blk_mq=0 cures it but I cannot get a good trace (no serial
console). 3.18.0-rc2-00043-gf7e87a4 was tested there.
--
Meelis Roos (mr...@linux.ee)
On 2014-10-29 05:46, Meelis Roos wrote:
I tried 3.18-rc2 with blk-mq on by default on an HP ProLiant DL380 G3 (with HP
CCISS RAID controller). It fails late in the bootup with "task
scsi_eh_1:720 blocked for more than 120 seconds." messages.
Booting with scsi_mod.use_blk_mq=0 fixes
.
--
Meelis Roos (mr...@linux.ee)
the
question was asked during make oldconfig. Will try 3.17 with use_blk_mq
today.
--
Meelis Roos (mr...@linux.ee)
---end quoted text---
--
Meelis Roos (mr...@linux.ee)
fixed upstream:
commit 480cadc2b7e0fa2bbab20141efb547dfe0c3707c
Yes, works for both sparc64 and parisc.
--
Meelis Roos (mr...@linux.ee)
[ 389.720324] [] (null)
[ 389.775518] no locks held by swapper/0/1.
--
Meelis Roos (mr...@linux.ee)
On Tue, 2014-08-19 at 14:25 +0300, Meelis Roos wrote:
3.16 scsi worked fine, 3.17-rc1 misbehaves on 3 of my sparc64 test
machines. E220R and E420R are with onboard 5c3875, V210 is with onboard
53c1010 and all behave the same. Any ideas where to dig deeper? Bisection
might be nontrivial
[006a7164] do_scan_async+0x4/0x20
[004817b8] async_run_entry_fn+0x58/0x120
---[ end trace 9a1420108ebfd590 ]---
Signed-off-by: Meelis Roos <mr...@linux.ee>
diff --git a/drivers/scsi/qla1280.c b/drivers/scsi/qla1280.c
index 5a522c5..97dabd3 100644
--- a/drivers/scsi/qla1280.c
+++ b
Therefore I think the fix is going to involve adding a member to
struct esp_cmd_entry called ->orig_tag[] so that we can see what
the original tag[] values were at esp_alloc_lun_tag() time.
Please try this patch:
It works on 3 consecutive boots, thank you!
Tested-by: Meelis Roos mr
[ 355.350042] Kernel panic - not syncing: Aiee, killing interrupt handler!
[ 355.430295] Press Stop-A (L1-A) to return to the boot prom
.] Waiting for /dev to be fully populated...
--
Meelis Roos (mr...@linux.ee)
/0x320 SS:ESP
0068:f5f8be2c
[42051.800011] CR2:
[42051.800320] ---[ end trace c351067a2986e126 ]---
--
Meelis Roos (mr...@linux.ee)
.
--
Meelis Roos (mr...@linux.ee)
- SBus-only Ultra 1 with sparc64
architecture. The test I meant to do but couldn't was to load the libsas
module.
--
Meelis Roos ([EMAIL PROTECTED])
Eliminate unnecessary PCI dependencies in libsas. It should use generic DMA
and struct device like other subsystems.
Compiles fine, unfortunately I can not test kernels on this machine
since I have yet to dig out the reason my kernels do not boot.
--
Meelis Roos ([EMAIL PROTECTED
# CONFIG_SCSI_QLOGICPTI is not set
# CONFIG_SCSI_DEBUG is not set
CONFIG_SCSI_SUNESP=y
--
Meelis Roos ([EMAIL PROTECTED])