Re: tws bug ? (LSI SAS 9750)

2012-12-12 Thread John-Mark Gurney
Jim Harris wrote this message on Fri, Oct 26, 2012 at 13:24 -0700:
 On Fri, Oct 26, 2012 at 1:18 PM, John-Mark Gurney j...@funkthat.com wrote:
 
  I'm seeing similar stuff on the hpt27xx driver:
  (probe18:hpt27xx0:0:18:0): INQUIRY. CDB: 12 0 0 0 24 0
  (probe18:hpt27xx0:0:18:0): CAM status: Invalid Target ID
  (probe18:hpt27xx0:0:18:0): Error 22, Unretryable error
 
  Should I make a similar change in sys/dev/hpt27xx/osm_bsd.c?  Looks like
  there are two CAM_TID_INVALID lines, but from reading the comments, only
  the second one should change...
 
  Correct?  If so, I'll try making the change and make sure everything
  works well.
 
 
 Yes - I agree that a similar change is needed, and only to the second
 one in that file.

Ok, I've tested a patch, and so far things look much better...  It shuts
up all the bad probe messages...

Though I ran across a bug where the card went out to lunch giving these
messages:
(da2:hpt27xx0:0:2:0): READ(10). CDB: 28 0 a5 4c ae d8 0 0 58 0 
(da2:hpt27xx0:0:2:0): CAM status: SCSI Status Error
(da2:hpt27xx0:0:2:0): SCSI status: OK
(da3:hpt27xx0:0:3:0): READ(10). CDB: 28 0 a5 4c b9 f0 0 0 50 0 
(da3:hpt27xx0:0:3:0): CAM status: SCSI Status Error
(da3:hpt27xx0:0:3:0): SCSI status: OK

Scott Long suggested the first part of the patch so that an error is
actually generated...  It would be good for the sense data to be set
as well, but I'm not sure where to get it...

Index: osm_bsd.c
===================================================================
--- osm_bsd.c   (revision 241041)
+++ osm_bsd.c   (working copy)
@@ -453,7 +453,7 @@
 		ccb->ccb_h.status = CAM_BUSY;
 		break;
 	default:
-		ccb->ccb_h.status = CAM_SCSI_STATUS_ERROR;
+		ccb->ccb_h.status = CAM_AUTOSENSE_FAIL;
 		break;
 	}
 
@@ -569,7 +569,7 @@
 	vd = ldm_find_target(vbus, ccb->ccb_h.target_id);
 
 	if (!vd) {
-		ccb->ccb_h.status = CAM_TID_INVALID;
+		ccb->ccb_h.status = CAM_SEL_TIMEOUT;
 		xpt_done(ccb);
 		return;
 	}
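
To illustrate why the first hunk matters, here is a minimal sketch of how
an upper layer might turn a completed request into an errno.  It is a toy
model written for this note, not CAM's actual code: the enum names, their
values, and toy_request_error() are all hypothetical.  The assumption is
that a "SCSI Status Error" completion defers to the SCSI status byte, so a
status of OK yields no error at all, while an autosense failure is always
reported.

```c
#include <errno.h>

/* Hypothetical stand-ins; the values are arbitrary, not those in CAM's headers. */
enum toy_cam_status {
	TOY_CAM_REQ_CMP,
	TOY_CAM_SCSI_STATUS_ERROR,
	TOY_CAM_AUTOSENSE_FAIL
};

enum toy_scsi_status {
	TOY_SCSI_OK,
	TOY_SCSI_CHECK_COND
};

/*
 * Simplified, assumed mapping from a completed request to an errno.
 * Returns 0 when the caller would see success.
 */
static int
toy_request_error(enum toy_cam_status cam, enum toy_scsi_status scsi)
{
	switch (cam) {
	case TOY_CAM_REQ_CMP:
		return (0);
	case TOY_CAM_SCSI_STATUS_ERROR:
		/* Defer to the SCSI status; "OK" leaves nothing to report. */
		return (scsi == TOY_SCSI_OK ? 0 : EIO);
	case TOY_CAM_AUTOSENSE_FAIL:
		/* No sense data is available, so fail the request outright. */
		return (EIO);
	}
	return (EIO);
}
```

Under these assumptions, the combination seen in the log above
(TOY_CAM_SCSI_STATUS_ERROR with TOY_SCSI_OK) returns 0, whereas
TOY_CAM_AUTOSENSE_FAIL returns EIO regardless of the SCSI status.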

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 All that I will do, has been done, All that I have, has not.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: tws bug ? (LSI SAS 9750)

2012-10-26 Thread John-Mark Gurney
Jim Harris wrote this message on Fri, Sep 21, 2012 at 20:17 -0700:
 On Fri, Sep 21, 2012 at 5:37 PM, Mike Tancsa m...@sentex.net wrote:
  On 9/21/2012 8:03 PM, Jim Harris wrote:
  .
  then a lot of
  .
  (probe65:tws0:0:65:0): INQUIRY. CDB: 12 0 0 0 24 0
  (probe65:tws0:0:65:0): CAM status: Invalid Target ID
  (probe65:tws0:0:65:0): Error 22, Unretryable error
  (probe1:tws0:0:1:0): INQUIRY. CDB: 12 0 0 0 24 0
  (probe1:tws0:0:1:0): CAM status: Invalid Target ID
  (probe1:tws0:0:1:0): Error 22, Unretryable error
  (probe2:tws0:0:2:0): INQUIRY. CDB: 12 0 0 0 24 0
  (probe2:tws0:0:2:0): CAM status: Invalid Target ID
  .
  .
  .
  (probe63:tws0:0:63:0): INQUIRY. CDB: 12 0 0 0 24 0
  (probe63:tws0:0:63:0): CAM status: Invalid Target ID
  (probe63:tws0:0:63:0): Error 22, Unretryable error
  (probe64:tws0:0:64:0): INQUIRY. CDB: 12 0 0 0 24 0
  (probe64:tws0:0:64:0): CAM status: Invalid Target ID
  (probe64:tws0:0:64:0): Error 22, Unretryable error
 
  These can be ignored.  CAM is just telling you that there are no
  devices attached at these target IDs.
 
  What about a change similar to what Alexander Motin did in
 
  http://lists.freebsd.org/pipermail/svn-src-head/2012-June/038196.html
 
 Ah, yes.  I was thinking you had CAM_DEBUG enabled which is why you
 were seeing this spew - but that's not the case.  This indeed should
 be fixed and not just ignored.
 
 Seeing the attributions on Alexander's commit, you certainly seem to
 have a monopoly on controllers that exhibit this problem on FreeBSD.
 :)
 
 I believe the CAM_LUN_INVALID here should be fixed as well, similar to
 the twa commit.  If you send me a revised patch I will commit it.

I'm seeing similar stuff on the hpt27xx driver:
(probe18:hpt27xx0:0:18:0): INQUIRY. CDB: 12 0 0 0 24 0 
(probe18:hpt27xx0:0:18:0): CAM status: Invalid Target ID
(probe18:hpt27xx0:0:18:0): Error 22, Unretryable error

Should I make a similar change in sys/dev/hpt27xx/osm_bsd.c?  Looks like
there are two CAM_TID_INVALID lines, but from reading the comments, only
the second one should change...

Correct?  If so, I'll try making the change and make sure everything
works well.

Thanks.

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 All that I will do, has been done, All that I have, has not.


Re: tws bug ? (LSI SAS 9750)

2012-10-26 Thread Jim Harris
On Fri, Oct 26, 2012 at 1:18 PM, John-Mark Gurney j...@funkthat.com wrote:

 I'm seeing similar stuff on the hpt27xx driver:
 (probe18:hpt27xx0:0:18:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe18:hpt27xx0:0:18:0): CAM status: Invalid Target ID
 (probe18:hpt27xx0:0:18:0): Error 22, Unretryable error

 Should I make a similar change in sys/dev/hpt27xx/osm_bsd.c?  Looks like
 there are two CAM_TID_INVALID lines, but from reading the comments, only
 the second one should change...

 Correct?  If so, I'll try making the change and make sure everything
 works well.


Yes - I agree that a similar change is needed, and only to the second
one in that file.


Re: tws bug ? (LSI SAS 9750)

2012-09-22 Thread Thomas Mueller
On Fri, Sep 21, 2012 at 5:37 PM, Mike Tancsa m...@sentex.net wrote:
 On 9/21/2012 8:03 PM, Jim Harris wrote:
 .
 then a lot of
 .
 (probe65:tws0:0:65:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe65:tws0:0:65:0): CAM status: Invalid Target ID
 (probe65:tws0:0:65:0): Error 22, Unretryable error
 (probe1:tws0:0:1:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe1:tws0:0:1:0): CAM status: Invalid Target ID
 (probe1:tws0:0:1:0): Error 22, Unretryable error
 (probe2:tws0:0:2:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe2:tws0:0:2:0): CAM status: Invalid Target ID
 .
 .
 .
 (probe63:tws0:0:63:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe63:tws0:0:63:0): CAM status: Invalid Target ID
 (probe63:tws0:0:63:0): Error 22, Unretryable error
 (probe64:tws0:0:64:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe64:tws0:0:64:0): CAM status: Invalid Target ID
 (probe64:tws0:0:64:0): Error 22, Unretryable error

 These can be ignored.  CAM is just telling you that there are no
 devices attached at these target IDs.

 What about a change similar to what Alexander Motin did in

 http://lists.freebsd.org/pipermail/svn-src-head/2012-June/038196.html

Jim Harris jimhar...@freebsd.org responded:

 Ah, yes.  I was thinking you had CAM_DEBUG enabled which is why you
 were seeing this spew - but that's not the case.  This indeed should
 be fixed and not just ignored.

 Seeing the attributions on Alexander's commit, you certainly seem to
 have a monopoly on controllers that exhibit this problem on FreeBSD.
 :)

 I believe the CAM_LUN_INVALID here should be fixed as well, similar to
 the twa commit.  If you send me a revised patch I will commit it.

The specific subject of this thread is not my issue, but I did notice 
problems apparently related to CAM on a SATA hard drive.

I use one UFS partition, with FreeBSD 9.0-BETA1 installed (subsequently 
updated on another partition, using GPT as opposed to MBR), for ports tree
and also NetBSD pkgsrc and NetBSD source code.  I built NetBSD 5.1_STABLE i386
from FreeBSD and also built xorg-modular on the new NetBSD installation from
pkgsrc.  Going into and out of the newly installed Xorg resulted in some
crashes with the FreeBSD 9.0-BETA1 partition mounted and not cleanly 
unmounted.  File system was damaged, and FreeBSD fsck_ffs wouldn't fix it,
went into a loop:


Script started on Wed Sep 19 04:15:02 2012
fsck_ffs /dev/ada0p9
** /dev/ada0p9
** Last Mounted on /BETA1
** Phase 1 - Check Blocks and Sizes

CANNOT READ BLK: 7584192
CONTINUE? [yn] y

THE FOLLOWING DISK SECTORS COULD NOT BE READ: 7584318, 7584319,
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
1475900 files, 4638292 used, 21162419 free (61643 frags, 2637597 blocks, 0.2% fragmentation)

* FILE SYSTEM STILL DIRTY *

* PLEASE RERUN FSCK *

Script done on Wed Sep 19 04:17:27 2012


This happened repeatedly, meaning an impasse.

I didn't get to record preceding error messages relating to ATA and CAM but,
seeing this last message, wonder if there are some bugs in the CAM.

I booted that new NetBSD 5.1_STABLE i386 installation, on a USB stick, was
able to mount that partition and see it wasn't trashed though there was a 
message about the dirty flag.  I then umounted and ran NetBSD fsck_ffs
successfully, just a few files were lost, and FreeBSD can access that
partition again.

I still intend to be more cautious when in NetBSD, not mounting a FreeBSD
partition unnecessarily when doing something crash-prone on my system in 
NetBSD, such as going into and out of X.

Tom


Re: tws bug ? (LSI SAS 9750)

2012-09-22 Thread Mike Tancsa
On 9/21/2012 11:17 PM, Jim Harris wrote:
 What about a change similar to what Alexander Motin did in

 http://lists.freebsd.org/pipermail/svn-src-head/2012-June/038196.html
 
 Ah, yes.  I was thinking you had CAM_DEBUG enabled which is why you
 were seeing this spew - but that's not the case.  This indeed should
 be fixed and not just ignored.
 
 Seeing the attributions on Alexander's commit, you certainly seem to
 have a monopoly on controllers that exhibit this problem on FreeBSD.
 :)
 
 I believe the CAM_LUN_INVALID here should be fixed as well, similar to
 the twa commit.  If you send me a revised patch I will commit it.

Thanks, yes seems to be déjà vu all over again with this manufacturer :)

I think this is correct ?

0(ich10)# diff -u tws_cam.c.orig tws_cam.c
--- tws_cam.c.orig  2012-09-21 20:10:43.0 -0400
+++ tws_cam.c   2012-09-22 08:21:36.0 -0400
@@ -529,10 +529,10 @@

         if ( ccb->ccb_h.target_lun ) {
             TWS_TRACE_DEBUG(sc, "invalid lun error",0,0);
-            ccb->ccb_h.status |= CAM_LUN_INVALID;
+            ccb->ccb_h.status |= CAM_SEL_TIMEOUT;
         } else {
             TWS_TRACE_DEBUG(sc, "invalid target error",0,0);
-            ccb->ccb_h.status |= CAM_TID_INVALID;
+            ccb->ccb_h.status |= CAM_SEL_TIMEOUT;
         }

     } else {
1(ich10)#

---Mike
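
As a rough illustration of what the patch above changes from the scanning
side, here is a small self-contained model.  It is a sketch only, not CAM
code: toy_scan(), the toy_status values, and the two probe callbacks are
hypothetical names made up for this example.  It assumes the scan treats a
selection timeout as an empty slot and complains about any other failure
status.

```c
#include <stdio.h>

/* Hypothetical status codes; arbitrary values, not CAM's. */
enum toy_status { TOY_REQ_CMP, TOY_SEL_TIMEOUT, TOY_TID_INVALID };

#define TOY_NTARGETS 8

/* Hypothetical driver hook: report the completion status for one target ID. */
typedef enum toy_status (*toy_probe_fn)(int target);

struct toy_scan_result {
	int devices;     /* targets that answered */
	int empty;       /* targets quietly treated as absent */
	int complaints;  /* targets that produced an error message */
};

/* Walk every target ID and classify the status the driver hands back. */
static struct toy_scan_result
toy_scan(toy_probe_fn probe)
{
	struct toy_scan_result r = { 0, 0, 0 };
	int target;

	for (target = 0; target < TOY_NTARGETS; target++) {
		switch (probe(target)) {
		case TOY_REQ_CMP:
			r.devices++;
			break;
		case TOY_SEL_TIMEOUT:
			/* Nothing answered: an empty slot, not an error. */
			r.empty++;
			break;
		default:
			fprintf(stderr, "(probe%d): unexpected status\n", target);
			r.complaints++;
			break;
		}
	}
	return (r);
}

/* One device at target 0; the driver before and after the change. */
static enum toy_status
toy_probe_before(int target)
{
	return (target == 0 ? TOY_REQ_CMP : TOY_TID_INVALID);
}

static enum toy_status
toy_probe_after(int target)
{
	return (target == 0 ? TOY_REQ_CMP : TOY_SEL_TIMEOUT);
}
```

In this model toy_scan(toy_probe_before) finds one device and complains
about the other seven target IDs, while toy_scan(toy_probe_after) finds the
same device and stays quiet about the rest.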


 Thanks,
 
 -Jim
 
 

 0(ich10)# diff -u tws_cam.c.orig tws_cam.c
 --- tws_cam.c.orig  2012-09-21 20:10:43.0 -0400
 +++ tws_cam.c   2012-09-21 20:11:11.0 -0400
 @@ -532,7 +532,7 @@
              ccb->ccb_h.status |= CAM_LUN_INVALID;
          } else {
              TWS_TRACE_DEBUG(sc, "invalid target error",0,0);
 -            ccb->ccb_h.status |= CAM_TID_INVALID;
 +            ccb->ccb_h.status |= CAM_SEL_TIMEOUT;
          }

      } else {
 1(ich10)#

 ---Mike


 --
 ---
 Mike Tancsa, tel +1 519 651 3400
 Sentex Communications, m...@sentex.net
 Providing Internet services since 1994 www.sentex.net
 Cambridge, Ontario Canada   http://www.tancsa.com/
 
 


-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: tws bug ? (LSI SAS 9750)

2012-09-22 Thread Jim Harris
On Sat, Sep 22, 2012 at 1:32 AM, Thomas Mueller muelle...@insightbb.com wrote:

 The specific subject of this thread is not my issue, but I did notice
 problems apparently related to CAM on a SATA hard drive.


I would suggest starting a new thread if you have a different issue.

 I use one UFS partition, with FreeBSD 9.0-BETA1 installed (subsequently
 updated on another partition, using GPT as opposed to MBR), for ports tree
 and also NetBSD pkgsrc and NetBSD source code.  I built NetBSD 5.1_STABLE i386
 from FreeBSD and also built xorg-modular on the new NetBSD installation from
 pkgsrc.  Going into and out of the newly installed Xorg resulted in some
 crashes with the FreeBSD 9.0-BETA1 partition mounted and not cleanly
 unmounted.  File system was damaged, and FreeBSD fsck_ffs wouldn't fix it,
 went into a loop:


 Script started on Wed Sep 19 04:15:02 2012
 fsck_ffs /dev/ada0p9
 ** /dev/ada0p9
 ** Last Mounted on /BETA1
 ** Phase 1 - Check Blocks and Sizes

 CANNOT READ BLK: 7584192
 CONTINUE? [yn] y

 THE FOLLOWING DISK SECTORS COULD NOT BE READ: 7584318, 7584319,
 ** Phase 2 - Check Pathnames
 ** Phase 3 - Check Connectivity
 ** Phase 4 - Check Reference Counts
 ** Phase 5 - Check Cyl groups
 1475900 files, 4638292 used, 21162419 free (61643 frags, 2637597 blocks, 0.2% fragmentation)

 * FILE SYSTEM STILL DIRTY *

 * PLEASE RERUN FSCK *

 Script done on Wed Sep 19 04:17:27 2012


 This happened repeatedly, meaning an impasse.

 I didn't get to record preceding error messages relating to ATA and CAM but,
 seeing this last message, wonder if there are some bugs in the CAM.

 I booted that new NetBSD 5.1_STABLE i386 installation, on a USB stick, was
 able to mount that partition and see it wasn't trashed though there was a
 message about the dirty flag.  I then umounted and ran NetBSD fsck_ffs
 successfully, just a few files were lost, and FreeBSD can access that
 partition again.

 I still intend to be more cautious when in NetBSD, not mounting a FreeBSD
 partition unnecessarily when doing something crash-prone on my system in
 NetBSD, such as going into and out of X.

 Tom


Re: tws bug ? (LSI SAS 9750)

2012-09-21 Thread Jim Harris
On Fri, Sep 21, 2012 at 1:07 PM, Mike Tancsa m...@sentex.net wrote:
 Hi,
 I have been trying out a nice new tws controller and decided to enable
 debugging in the kernel and run some stress tests.  With a regular
 GENERIC kernel, it boots up fine.  But with debugging, it panics on
 boot. Anyone know what's up? Is this something that should be sent
 directly to LSI?

From code inspection, this mutex is recursed whether or not debugging
is enabled.  No code path here is specific to INVARIANTS, and the main
I/O path in this driver always recurses on this lock; the recursion is
not specific to the initialization call stack you listed below.

The best course of action seems to be initializing the lock with
MTX_RECURSE, since the driver seems to expect to be able to recurse on
the io_lock.  Can you try the following patch?

diff --git a/sys/dev/tws/tws.c b/sys/dev/tws/tws.c
index b1615db..d156d40 100644
--- a/sys/dev/tws/tws.c
+++ b/sys/dev/tws/tws.c
@@ -197,7 +197,7 @@ tws_attach(device_t dev)
     mtx_init( &sc->q_lock, "tws_q_lock", NULL, MTX_DEF);
     mtx_init( &sc->sim_lock,  "tws_sim_lock", NULL, MTX_DEF);
     mtx_init( &sc->gen_lock,  "tws_gen_lock", NULL, MTX_DEF);
-    mtx_init( &sc->io_lock,  "tws_io_lock", NULL, MTX_DEF);
+    mtx_init( &sc->io_lock,  "tws_io_lock", NULL, MTX_DEF | MTX_RECURSE);

     if ( tws_init_trace_q(sc) == FAILURE )
         printf("trace init failure\n");
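
For anyone following along, the effect of the flag can be sketched with a
toy, single-threaded lock in plain C.  This is only an illustration under
stated assumptions, not the kernel's mutex implementation: struct toy_mtx,
TOY_MTX_RECURSE, and the toy_* functions are hypothetical names, and a
return value of -1 stands in for the spot where the debugging kernel
panicked.

```c
#include <stdbool.h>

/* Hypothetical flag, standing in for the kernel's MTX_RECURSE. */
#define TOY_MTX_RECURSE 0x1

/* Toy single-threaded lock: tracks only whether it is held and how deep. */
struct toy_mtx {
	int flags;          /* TOY_MTX_RECURSE or 0 */
	bool held;          /* true while the (only) thread owns the lock */
	unsigned recursed;  /* extra acquisitions beyond the first */
};

static void
toy_mtx_init(struct toy_mtx *m, int flags)
{
	m->flags = flags;
	m->held = false;
	m->recursed = 0;
}

/* Returns 0 on success, -1 where the debugging kernel would panic. */
static int
toy_mtx_lock(struct toy_mtx *m)
{
	if (!m->held) {
		m->held = true;
		return (0);
	}
	if ((m->flags & TOY_MTX_RECURSE) == 0)
		return (-1);	/* "recursed on non-recursive mutex" */
	m->recursed++;
	return (0);
}

static void
toy_mtx_unlock(struct toy_mtx *m)
{
	if (m->recursed > 0)
		m->recursed--;
	else
		m->held = false;
}

/* Inner routine: takes the lock for the duration of the submit. */
static int
toy_submit(struct toy_mtx *m)
{
	int error;

	error = toy_mtx_lock(m);
	if (error == 0)
		toy_mtx_unlock(m);
	return (error);
}

/* Outer routine: already holds the lock when it calls the inner one. */
static int
toy_map_and_submit(struct toy_mtx *m)
{
	int error;

	if (toy_mtx_lock(m) != 0)
		return (-1);
	error = toy_submit(m);	/* second acquisition by the same owner */
	toy_mtx_unlock(m);
	return (error);
}
```

With a lock initialised with flags 0, toy_map_and_submit() returns -1 on
the nested acquisition; with TOY_MTX_RECURSE it returns 0 and the lock is
fully released afterwards.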




 pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
 pci0: ACPI PCI bus on pcib0
 pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
 pci1: ACPI PCI bus on pcib1
 pcib2: ACPI PCI-PCI bridge irq 17 at device 1.1 on pci0
 pci2: ACPI PCI bus on pcib2
 LSI 3ware device driver for SAS/SATA storage controllers, version:
 10.80.00.003
 tws0: LSI 3ware SAS/SATA Storage Controller port 0x4000-0x40ff mem
 0xc246-0xc2463fff,0xc240-0xc243 irq 17 at device 0.0 on pci2
 tws0: Using legacy INTx
 panic: _mtx_lock_sleep: recursed on non-recursive mutex tws_io_lock @
 /usr/HEAD/src/sys/dev/tws/tws_hdm.c:287

 cpuid = 0
 KDB: stack backtrace:
 db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
 kdb_backtrace() at kdb_backtrace+0x37
 panic() at panic+0x1d8
 _mtx_lock_sleep() at _mtx_lock_sleep+0x27f
 _mtx_lock_flags() at _mtx_lock_flags+0xf1
 tws_submit_command() at tws_submit_command+0x3f
 tws_dmamap_data_load_cbfn() at tws_dmamap_data_load_cbfn+0xb7
 bus_dmamap_load() at bus_dmamap_load+0x16c
 tws_map_request() at tws_map_request+0x78
 tws_get_param() at tws_get_param+0xe1
 tws_display_ctlr_info() at tws_display_ctlr_info+0x4c
 tws_init_ctlr() at tws_init_ctlr+0x6d
 tws_attach() at tws_attach+0x68c
 device_attach() at device_attach+0x72
 bus_generic_attach() at bus_generic_attach+0x1a
 acpi_pci_attach() at acpi_pci_attach+0x164
 device_attach() at device_attach+0x72
 bus_generic_attach() at bus_generic_attach+0x1a
 acpi_pcib_attach() at acpi_pcib_attach+0x1a7
 acpi_pcib_pci_attach() at acpi_pcib_pci_attach+0x9b
 device_attach() at device_attach+0x72
 bus_generic_attach() at bus_generic_attach+0x1a
 acpi_pci_attach() at acpi_pci_attach+0x164
 device_attach() at device_attach+0x72
 bus_generic_attach() at bus_generic_attach+0x1a
 acpi_pcib_attach() at acpi_pcib_attach+0x1a7
 acpi_pcib_acpi_attach() at acpi_pcib_acpi_attach+0x1f6
 device_attach() at device_attach+0x72
 bus_generic_attach() at bus_generic_attach+0x1a
 acpi_attach() at acpi_attach+0xbc1
 device_attach() at device_attach+0x72
 bus_generic_attach() at bus_generic_attach+0x1a
 nexus_acpi_attach() at nexus_acpi_attach+0x69
 device_attach() at device_attach+0x72
 bus_generic_new_pass() at bus_generic_new_pass+0xd6
 bus_set_pass() at bus_set_pass+0x7a
 configure() at configure+0xa
 mi_startup() at mi_startup+0x77
 btext() at btext+0x2c
 KDB: enter: panic
 [ thread pid 0 tid 10 ]
 Stopped at  kdb_enter+0x3b: movq    $0,0x993262(%rip)
 db>


 int
 tws_submit_command(struct tws_softc *sc, struct tws_request *req)
 {
     u_int32_t regl, regh;
     u_int64_t mfa=0;

     /*
      * mfa register  read and write must be in order.
      * Get the io_lock to protect against simultinous
      * passthru calls
      */
     mtx_lock(&sc->io_lock);

     if ( sc->obfl_q_overrun ) {
         tws_init_obfl_q(sc);
     }



 With no debugging in the kernel, it boots up fine

 pcib2: ACPI PCI-PCI bridge irq 17 at device 1.1 on pci0
 pci2: ACPI PCI bus on pcib2
 LSI 3ware device driver for SAS/SATA storage controllers, version:
 10.80.00.003
 tws0: LSI 3ware SAS/SATA Storage Controller port 0x4000-0x40ff mem
 0xc246-0xc2463fff,0xc240-0xc243 irq 17 at device 0.0 on pci2
 tws0: Using legacy INTx
 tws0: Controller details: Model 9750-4i, 8 Phys, Firmware FH9X
 5.12.00.007, BIOS BE9X 5.11.00.006
 em0: Intel(R) PRO/1000 Network Connection 7.3.2 port 0x5040-0x505f mem
 0xc250-0xc251,0xc257-0xc2570fff irq 19 at device 25.0 on pci0
 em0: Using an MSI interrupt
 em0: Ethernet address: 00:1e:67:45:b6:29
 ehci0: EHCI (generic) USB 2.0 controller mem 

Re: tws bug ? (LSI SAS 9750)

2012-09-21 Thread Mike Tancsa
On 9/21/2012 4:59 PM, Jim Harris wrote:
 boot. Anyone know what's up? Is this something that should be sent
 directly to LSI?
 
 Through a code inspection, this mutex is being recursed whether or not
 debugging is enabled.  There is no code path here specific to
 INVARIANTS.  And the main IO path in this driver is always recursing
 on this lock - it is not specific to the initialization callstack you
 listed below.
 
 The best course of action seems to be initializing the lock with
 MTX_RECURSE, since the driver seems to expect to be able to recurse on
 the io_lock.  Can you try the following patch?
 
 diff --git a/sys/dev/tws/tws.c b/sys/dev/tws/tws.c
 index b1615db..d156d40 100644
 --- a/sys/dev/tws/tws.c
 +++ b/sys/dev/tws/tws.c
 @@ -197,7 +197,7 @@ tws_attach(device_t dev)
      mtx_init( &sc->q_lock, "tws_q_lock", NULL, MTX_DEF);
      mtx_init( &sc->sim_lock,  "tws_sim_lock", NULL, MTX_DEF);
      mtx_init( &sc->gen_lock,  "tws_gen_lock", NULL, MTX_DEF);
 -    mtx_init( &sc->io_lock,  "tws_io_lock", NULL, MTX_DEF);
 +    mtx_init( &sc->io_lock,  "tws_io_lock", NULL, MTX_DEF | MTX_RECURSE);
 
      if ( tws_init_trace_q(sc) == FAILURE )
          printf("trace init failure\n");


Thanks, that allows it to boot up now!

pci2: ACPI PCI bus on pcib2
LSI 3ware device driver for SAS/SATA storage controllers, version:
10.80.00.003
tws0: LSI 3ware SAS/SATA Storage Controller port 0x4000-0x40ff mem
0xc246-0xc2463fff,0xc240-0xc243 irq 17 at device 0.0 on pci2
tws0: Using MSI
tws0: Controller details: Model 9750-4i, 8 Phys, Firmware FH9X
5.12.00.007, BIOS BE9X 5.11.00.006
em0: Intel(R) PRO/1000 Network Connection 7.3.2 port 0x5040-0x505f mem
0xc250-0xc251,0xc257-0xc2570fff irq 19 at device 25.0 on pci0
.
then a lot of
.
(probe65:tws0:0:65:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe65:tws0:0:65:0): CAM status: Invalid Target ID
(probe65:tws0:0:65:0): Error 22, Unretryable error
(probe1:tws0:0:1:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe1:tws0:0:1:0): CAM status: Invalid Target ID
(probe1:tws0:0:1:0): Error 22, Unretryable error
(probe2:tws0:0:2:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe2:tws0:0:2:0): CAM status: Invalid Target ID
.
.
.
(probe63:tws0:0:63:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe63:tws0:0:63:0): CAM status: Invalid Target ID
(probe63:tws0:0:63:0): Error 22, Unretryable error
(probe64:tws0:0:64:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe64:tws0:0:64:0): CAM status: Invalid Target ID
(probe64:tws0:0:64:0): Error 22, Unretryable error
da0 at tws0 bus 0 scbus0 target 0 lun 0
da0: LSI 9750-4iDISK 5.12 Fixed Direct Access SCSI-5 device
da0: 6000.000MB/s transfers
da0: 953654MB (1953083392 512 byte sectors: 255H 63S/T 121573C)
SMP: AP CPU #1 Launched!
SMP: AP CPU #4 Launched!




 Also, any reason NOT to set hw.tws.enable_msi=1 in /boot/loader.conf ?


Any thoughts on MSI vs. no MSI?  Time to run some stress tests.  It's
certainly a fast little controller for the money!


---Mike


Re: tws bug ? (LSI SAS 9750)

2012-09-21 Thread Jim Harris
On Fri, Sep 21, 2012 at 3:11 PM, Mike Tancsa m...@sentex.net wrote:
 On 9/21/2012 4:59 PM, Jim Harris wrote:

<snip>

 Thanks, that allows it to boot up now!

 pci2: ACPI PCI bus on pcib2
 LSI 3ware device driver for SAS/SATA storage controllers, version:
 10.80.00.003
 tws0: LSI 3ware SAS/SATA Storage Controller port 0x4000-0x40ff mem
 0xc246-0xc2463fff,0xc240-0xc243 irq 17 at device 0.0 on pci2
 tws0: Using MSI
 tws0: Controller details: Model 9750-4i, 8 Phys, Firmware FH9X
 5.12.00.007, BIOS BE9X 5.11.00.006
 em0: Intel(R) PRO/1000 Network Connection 7.3.2 port 0x5040-0x505f mem
 0xc250-0xc251,0xc257-0xc2570fff irq 19 at device 25.0 on pci0
 .
 then a lot of
 .
 (probe65:tws0:0:65:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe65:tws0:0:65:0): CAM status: Invalid Target ID
 (probe65:tws0:0:65:0): Error 22, Unretryable error
 (probe1:tws0:0:1:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe1:tws0:0:1:0): CAM status: Invalid Target ID
 (probe1:tws0:0:1:0): Error 22, Unretryable error
 (probe2:tws0:0:2:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe2:tws0:0:2:0): CAM status: Invalid Target ID
 .
 .
 .
 (probe63:tws0:0:63:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe63:tws0:0:63:0): CAM status: Invalid Target ID
 (probe63:tws0:0:63:0): Error 22, Unretryable error
 (probe64:tws0:0:64:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe64:tws0:0:64:0): CAM status: Invalid Target ID
 (probe64:tws0:0:64:0): Error 22, Unretryable error

These can be ignored.  CAM is just telling you that there are no
devices attached at these target IDs.

 da0 at tws0 bus 0 scbus0 target 0 lun 0
 da0: LSI 9750-4iDISK 5.12 Fixed Direct Access SCSI-5 device
 da0: 6000.000MB/s transfers
 da0: 953654MB (1953083392 512 byte sectors: 255H 63S/T 121573C)
 SMP: AP CPU #1 Launched!
 SMP: AP CPU #4 Launched!


<snip>


 Any thoughts on MSI vs. no MSI?  Time to run some stress tests.  It's
 certainly a fast little controller for the money!


Typically MSI is preferred to INTx for performance reasons.  I can't
speak to why the original author made INTx the default, though.

Regards,

-Jim


Re: tws bug ? (LSI SAS 9750)

2012-09-21 Thread Mike Tancsa
On 9/21/2012 8:03 PM, Jim Harris wrote:
 .
 then a lot of
 .
 (probe65:tws0:0:65:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe65:tws0:0:65:0): CAM status: Invalid Target ID
 (probe65:tws0:0:65:0): Error 22, Unretryable error
 (probe1:tws0:0:1:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe1:tws0:0:1:0): CAM status: Invalid Target ID
 (probe1:tws0:0:1:0): Error 22, Unretryable error
 (probe2:tws0:0:2:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe2:tws0:0:2:0): CAM status: Invalid Target ID
 .
 .
 .
 (probe63:tws0:0:63:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe63:tws0:0:63:0): CAM status: Invalid Target ID
 (probe63:tws0:0:63:0): Error 22, Unretryable error
 (probe64:tws0:0:64:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe64:tws0:0:64:0): CAM status: Invalid Target ID
 (probe64:tws0:0:64:0): Error 22, Unretryable error
 
 These can be ignored.  CAM is just telling you that there are no
 devices attached at these target IDs.

What about a change similar to what Alexander Motin did in

http://lists.freebsd.org/pipermail/svn-src-head/2012-June/038196.html


0(ich10)# diff -u tws_cam.c.orig tws_cam.c
--- tws_cam.c.orig  2012-09-21 20:10:43.0 -0400
+++ tws_cam.c   2012-09-21 20:11:11.0 -0400
@@ -532,7 +532,7 @@
             ccb->ccb_h.status |= CAM_LUN_INVALID;
         } else {
             TWS_TRACE_DEBUG(sc, "invalid target error",0,0);
-            ccb->ccb_h.status |= CAM_TID_INVALID;
+            ccb->ccb_h.status |= CAM_SEL_TIMEOUT;
         }

     } else {
1(ich10)#

---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: tws bug ? (LSI SAS 9750)

2012-09-21 Thread Jim Harris
On Fri, Sep 21, 2012 at 5:37 PM, Mike Tancsa m...@sentex.net wrote:
 On 9/21/2012 8:03 PM, Jim Harris wrote:
 .
 then a lot of
 .
 (probe65:tws0:0:65:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe65:tws0:0:65:0): CAM status: Invalid Target ID
 (probe65:tws0:0:65:0): Error 22, Unretryable error
 (probe1:tws0:0:1:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe1:tws0:0:1:0): CAM status: Invalid Target ID
 (probe1:tws0:0:1:0): Error 22, Unretryable error
 (probe2:tws0:0:2:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe2:tws0:0:2:0): CAM status: Invalid Target ID
 .
 .
 .
 (probe63:tws0:0:63:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe63:tws0:0:63:0): CAM status: Invalid Target ID
 (probe63:tws0:0:63:0): Error 22, Unretryable error
 (probe64:tws0:0:64:0): INQUIRY. CDB: 12 0 0 0 24 0
 (probe64:tws0:0:64:0): CAM status: Invalid Target ID
 (probe64:tws0:0:64:0): Error 22, Unretryable error

 These can be ignored.  CAM is just telling you that there are no
 devices attached at these target IDs.

 What about a change similar to what Alexander Motin did in

 http://lists.freebsd.org/pipermail/svn-src-head/2012-June/038196.html

Ah, yes.  I was thinking you had CAM_DEBUG enabled which is why you
were seeing this spew - but that's not the case.  This indeed should
be fixed and not just ignored.

Seeing the attributions on Alexander's commit, you certainly seem to
have a monopoly on controllers that exhibit this problem on FreeBSD.
:)

I believe the CAM_LUN_INVALID here should be fixed as well, similar to
the twa commit.  If you send me a revised patch I will commit it.

Thanks,

-Jim



 0(ich10)# diff -u tws_cam.c.orig tws_cam.c
 --- tws_cam.c.orig  2012-09-21 20:10:43.0 -0400
 +++ tws_cam.c   2012-09-21 20:11:11.0 -0400
 @@ -532,7 +532,7 @@
              ccb->ccb_h.status |= CAM_LUN_INVALID;
          } else {
              TWS_TRACE_DEBUG(sc, "invalid target error",0,0);
 -            ccb->ccb_h.status |= CAM_TID_INVALID;
 +            ccb->ccb_h.status |= CAM_SEL_TIMEOUT;
          }

      } else {
 1(ich10)#

 ---Mike


 --
 ---
 Mike Tancsa, tel +1 519 651 3400
 Sentex Communications, m...@sentex.net
 Providing Internet services since 1994 www.sentex.net
 Cambridge, Ontario Canada   http://www.tancsa.com/