ISCI bus_alloc_resource failed

2015-09-07 Thread Bradley W. Dutton

Hi,

I'm having trouble with the isci driver in both stable and current. I  
see the following dmesg in stable:


isci0: <Intel(R) C600 Series Chipset SAS Controller (SATA mode)> port
0x5000-0x50ff mem 0xe7afc000-0xe7af,0xe740-0xe77f irq 19
at device 0.0 on pci11

isci: 1:51 ISCI bus_alloc_resource failed


I'm running FreeBSD on VMware ESXi 6 with VT-d passthrough of the isci
devices; here is the relevant pciconf output:


none2@pci0:3:0:0:   class=0x0c0500 card=0x062815d9 chip=0x1d708086 rev=0x06 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'C600/X79 series chipset SMBus Controller 0'
    class      = serial bus
    subclass   = SMBus
    cap 10[90] = PCI-Express 2 endpoint max data 128(128) link x32(x32) speed 5.0(5.0) ASPM disabled(L0s)
    cap 01[cc] = powerspec 3  supports D0 D3  current D0
    cap 05[d4] = MSI supports 1 message
    ecap 000e[100] = ARI 1
isci0@pci0:11:0:0:  class=0x010700 card=0x062815d9 chip=0x1d6b8086 rev=0x06 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'C602 chipset 4-Port SATA Storage Control Unit'
    class      = mass storage
    subclass   = SAS
    cap 01[98] = powerspec 3  supports D0 D3  current D0
    cap 10[c4] = PCI-Express 2 endpoint max data 128(128) link x32(x32) speed 5.0(5.0) ASPM disabled(L0s)
    cap 11[a0] = MSI-X supports 2 messages
                 Table in map 0x10[0x2000], PBA in map 0x10[0x3000]
    ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
    ecap 000e[138] = ARI 1
    ecap 0017[180] = TPH Requester 1
    ecap 0010[140] = SRIOV 1


I haven't tried booting on bare metal, but running a Linux distro
(CentOS 7) in the same VM works without issue. Is it possible the
SRIOV option is causing trouble? I don't see a BIOS option to disable
that setting on this server like I have on some others. Any other
ideas to get this working?


Thanks,
Brad
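
For context on what is failing here: in a FreeBSD driver, the interrupt
resource is obtained with a call of roughly the shape below. This is a
minimal sketch with a hypothetical helper name, not isci(4)'s actual code;
the dmesg line above means the equivalent call inside the driver returned
NULL.

#include <sys/param.h>
#include <sys/bus.h>
#include <sys/rman.h>
#include <machine/resource.h>

/* Hypothetical helper, for illustration only. */
static struct resource *
example_alloc_irq(device_t dev)
{
	struct resource *irq;
	int rid = 0;		/* rid 0 is the legacy INTx interrupt */

	irq = bus_alloc_resource_any(dev, SYS_RES_IRQ, &rid, RF_ACTIVE);
	if (irq == NULL)
		device_printf(dev, "bus_alloc_resource failed\n");
	return (irq);
}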



Re: ISCI bus_alloc_resource failed

2015-09-07 Thread Bradley W. Dutton

Quoting Jim Harris <jim.har...@gmail.com>:


On Mon, Sep 7, 2015 at 10:34 AM, Bradley W. Dutton <
brad-fbsd-sta...@duttonbros.com> wrote:


Hi,

I'm having trouble with the isci driver in both stable and current. I see
the following dmesg in stable:

isci0: <Intel(R) C600 Series Chipset SAS Controller (SATA mode)> port
0x5000-0x50ff mem 0xe7afc000-0xe7af,0xe740-0xe77f irq 19 at
device 0.0 on pci11
isci: 1:51 ISCI bus_alloc_resource failed


I'm running FreeBSD on VMware ESXi 6 with VT-d passthrough of the isci
devices; here is the relevant pciconf output:

none2@pci0:3:0:0:   class=0x0c0500 card=0x062815d9 chip=0x1d708086 rev=0x06 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'C600/X79 series chipset SMBus Controller 0'
    class      = serial bus
    subclass   = SMBus
    cap 10[90] = PCI-Express 2 endpoint max data 128(128) link x32(x32) speed 5.0(5.0) ASPM disabled(L0s)
    cap 01[cc] = powerspec 3  supports D0 D3  current D0
    cap 05[d4] = MSI supports 1 message
    ecap 000e[100] = ARI 1
isci0@pci0:11:0:0:  class=0x010700 card=0x062815d9 chip=0x1d6b8086 rev=0x06 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'C602 chipset 4-Port SATA Storage Control Unit'
    class      = mass storage
    subclass   = SAS
    cap 01[98] = powerspec 3  supports D0 D3  current D0
    cap 10[c4] = PCI-Express 2 endpoint max data 128(128) link x32(x32) speed 5.0(5.0) ASPM disabled(L0s)
    cap 11[a0] = MSI-X supports 2 messages
                 Table in map 0x10[0x2000], PBA in map 0x10[0x3000]
    ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
    ecap 000e[138] = ARI 1
    ecap 0017[180] = TPH Requester 1
    ecap 0010[140] = SRIOV 1


I haven't tried booting on bare metal, but running a Linux distro (CentOS
7) in the same VM works without issue. Is it possible the SRIOV option is
causing trouble? I don't see a BIOS option to disable that setting on this
server like I have on some others. Any other ideas to get this working?



I do not think the SRIOV is the problem here.  I do notice that isci(4)
does not explicitly enable PCI busmaster, which will cause problems with
PCI passthrough.  I've attached a patch that rectifies that issue.  I'm not
certain that is the root cause of the interrupt resource allocation failure
though.

Could you:

1) Apply the attached patch and retest.
2) If you still see the resource allocation failure, reboot in verbose mode
and provide the resulting dmesg output.

Thanks,

-Jim
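
For context, a minimal sketch of what explicitly enabling busmastering in
a driver's attach routine looks like. This illustrates the idea only; it
is not the attached patch, and the function name and surrounding structure
are assumptions:

#include <sys/param.h>
#include <sys/bus.h>
#include <dev/pci/pcivar.h>

/* Hypothetical attach routine, for illustration only. */
static int
example_pci_attach(device_t dev)
{
	/*
	 * Make sure bus mastering (DMA) is enabled in the device's PCI
	 * command register.  A hypervisor doing PCI passthrough may not
	 * enable it for the guest the way a bare-metal BIOS typically does.
	 */
	pci_enable_busmaster(dev);

	/* ... remainder of device setup ... */
	return (0);
}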


Thanks for the patch, although the same error persists. Here is
the verbose boot log:

http://duttonbrosllc.com/misc/dmesg.boot

Thanks,
Brad



Re: ISCI bus_alloc_resource failed

2015-09-07 Thread Bradley W. Dutton

Quoting Jim Harris <jim.har...@gmail.com>:


On Mon, Sep 7, 2015 at 7:29 PM, Bradley W. Dutton <
brad-fbsd-sta...@duttonbros.com> wrote:


There are 2 devices in the same group, so I passed both of them:
http://duttonbrosllc.com/misc/vmware_esxi_passthrough_config.png

At the time I wasn't sure if this was necessary, but I just tried the
CentOS 7 VM and it worked without the SMBus device being passed through. I
then tried the FreeBSD VM without SMBus and saw the same allocation error
as before. Looks like the SMBus device is a red herring?



Looks like on ESXi we are using Xen HVM init ops, which do not enable MSI,
and the isci driver is not reverting to INTx resource allocation when MSI-X
vector allocation fails.  I've added reverting to INTx in the attached
patch - can you try once more?

Thanks,

-Jim
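
A sketch of the fallback described above, assuming the driver's IRQ setup
is factored into a helper like this (the name is hypothetical; this is not
the attached patch): try MSI-X first, and revert to the legacy INTx
interrupt when vector allocation fails.

#include <sys/param.h>
#include <sys/errno.h>
#include <sys/bus.h>
#include <sys/rman.h>
#include <machine/resource.h>
#include <dev/pci/pcivar.h>

/* Hypothetical helper, for illustration only. */
static int
example_allocate_irq(device_t dev, struct resource **irq_res)
{
	int count = 2;	/* the controller advertises two MSI-X messages */
	int rid;

	if (pci_alloc_msix(dev, &count) == 0 && count == 2) {
		rid = 1;	/* MSI-X vectors use rids starting at 1 */
	} else {
		/* No MSI-X (e.g. under ESXi here): revert to INTx. */
		pci_release_msi(dev);
		rid = 0;	/* rid 0 is the shared legacy interrupt */
	}
	*irq_res = bus_alloc_resource_any(dev, SYS_RES_IRQ, &rid,
	    RF_ACTIVE | (rid == 0 ? RF_SHAREABLE : 0));
	return (*irq_res == NULL ? ENXIO : 0);
}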


That patch worked. No allocation errors and the drives work as expected.

Thanks again,
Brad



ZFS issue?

2009-05-28 Thread Bradley W. Dutton
I updated my stable box (i386) on Tuesday and started seeing errors like
those below intermittently. Is anyone else seeing anything like this? I saw
similar errors periodically while doing portupgrade and portinstall.

Besides the intermittent errors, stability is much improved: my i386
box hasn't crashed at all since the update. Previously I could crash the
box fairly reproducibly.

Thanks,
Brad

mv:
/home/bdutton/email/spam/cur/1243427579.M855387P56232VCB777092I0003F135_0.uno,S=4010:2,S:
set owner/group (was: 1000/89): Operation not permitted
mv:
/home/bdutton/email/spam/cur/1243427579.M855387P56232VCB777092I0003F135_0.uno,S=4010:2,S:
set flags (was: ): Invalid argument
mv:
/home/bdutton/email/spam/cur/1243428013.M780180P56404VCB777092I0003F139_0.uno,S=3330:2,S:
set owner/group (was: 1000/89): Operation not permitted
mv:
/home/bdutton/email/spam/cur/1243428013.M780180P56404VCB777092I0003F139_0.uno,S=3330:2,S:
set flags (was: ): Invalid argument




em watchdog timeout

2006-12-14 Thread Bradley W. Dutton
Hi,

I'm still seeing watchdog timeouts on my em card using stable from Nov 28.
Before the flurry of messages/activity about this last month I never had
any timeout problems; they only started occurring after the em changes
were committed in November. Generally when the timeout occurs everything
hangs for a bit (30-60 seconds maybe?), then resumes normal activity.
Please let me know what I can do to help.

Thanks,
Brad


# the log file starts on the morning of the 11th
[EMAIL PROTECTED]/var/log][119]% grep -i watchdog messages
Dec 11 11:38:49 backup kernel: em0: watchdog timeout -- resetting
Dec 11 12:07:03 backup kernel: em0: watchdog timeout -- resetting
Dec 11 19:35:26 backup kernel: em0: watchdog timeout -- resetting
Dec 11 20:33:07 backup kernel: em0: watchdog timeout -- resetting
Dec 11 23:03:35 backup kernel: em0: watchdog timeout -- resetting
Dec 12 06:06:58 backup kernel: em0: watchdog timeout -- resetting
Dec 12 14:05:28 backup kernel: em0: watchdog timeout -- resetting
Dec 12 19:02:12 backup kernel: em0: watchdog timeout -- resetting
Dec 12 19:06:43 backup kernel: em0: watchdog timeout -- resetting
Dec 12 20:02:46 backup kernel: em0: watchdog timeout -- resetting
Dec 12 20:03:54 backup kernel: em0: watchdog timeout -- resetting
Dec 12 21:15:37 backup kernel: em0: watchdog timeout -- resetting
Dec 12 23:05:24 backup kernel: em0: watchdog timeout -- resetting
Dec 13 01:38:36 backup kernel: em0: watchdog timeout -- resetting
Dec 13 02:35:26 backup kernel: em0: watchdog timeout -- resetting
Dec 13 11:37:27 backup kernel: em0: watchdog timeout -- resetting
Dec 13 19:47:56 backup kernel: em0: watchdog timeout -- resetting
Dec 13 20:29:45 backup kernel: em0: watchdog timeout -- resetting
Dec 13 23:35:00 backup kernel: em0: watchdog timeout -- resetting
Dec 14 06:35:40 backup kernel: em0: watchdog timeout -- resetting

[EMAIL PROTECTED]/var/log][120]% uname -a
FreeBSD backup 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #0: Tue Nov 28
17:52:48 PST 2006 [EMAIL PROTECTED]:/home/obj/usr/src/sys/BACKUP  i386

- debug.mpsafenet is already 0 because I have IPsec enabled.
- I'm using PF/ALTQ.
- Polling is compiled into the kernel but not enabled; HZ=1000.
- em0 is a PCI card, not on board, with jumbo frames enabled. This
interface is transferring between 20-80 Mbit/sec, 24 hours/day.
- I have vr0 on board (cable internet connection, light traffic) and a
4-port sis PCI card that has some light traffic.
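
For context, the repeated log line is produced by the driver's watchdog
handler. A rough sketch of the classic FreeBSD 6-era pattern follows, with
illustrative names (this is not em(4)'s actual code): the driver arms
ifp->if_timer when it queues transmit descriptors, and if they fail to
complete before the timer expires the stack calls the watchdog routine,
which logs the message seen above and resets the adapter.

#include <sys/param.h>
#include <sys/socket.h>
#include <net/if.h>
#include <net/if_var.h>

struct example_softc;				/* hypothetical driver state */
void example_init_hw(struct example_softc *);	/* hypothetical reset/init */

static void
example_watchdog(struct ifnet *ifp)
{
	/* TX descriptors did not complete in time: log and reset. */
	if_printf(ifp, "watchdog timeout -- resetting\n");
	example_init_hw(ifp->if_softc);
}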




graid3 rebuild panic: mb_dtor_pack: ext_size != MCLBYTES

2006-07-06 Thread Bradley W. Dutton
Hi,

I get the below panic when rebuilding a graid3 array. Is this indicative
of a hardware or software problem? Or is some of the data on my array
corrupt, such that I should just rebuild the array? I searched on Google
and didn't find much.

panic: mb_dtor_pack: ext_size != MCLBYTES

Thanks for your time,
Brad
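
For context, that panic string comes from a consistency check in the
kernel's mbuf packed-zone destructor: an mbuf returned to the
mbuf+cluster zone must still carry a standard-size (MCLBYTES) cluster.
A paraphrase of the check, in a hypothetical wrapper (the KASSERT text
mirrors the panic string above):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mbuf.h>

/* Illustrative paraphrase only, not the actual kern_mbuf.c source. */
static void
example_check_cluster(struct mbuf *m)
{
	KASSERT(m->m_ext.ext_size == MCLBYTES,
	    ("mb_dtor_pack: ext_size != MCLBYTES"));
}

Tripping this kind of assertion generally suggests in-memory corruption
(a software bug or bad RAM) rather than corrupt on-disk data.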



graid3 configure on 6 stable

2006-07-01 Thread Bradley W. Dutton
Hi,

I just tried 'graid3 configure -a' on a degraded array and received the
following:
panic: lock geom topology not exclusively locked @
/usr/src/sys/geom/raid3/g_raid3_ctl.c:105

Thanks,
Brad
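
For context, that panic is GEOM's topology-lock assertion firing: code
paths that modify GEOM topology must hold the global topology lock (an sx
lock) exclusively, and entry points assert this before proceeding. A
minimal sketch of the pattern, with a hypothetical handler name (this is
not the raid3 code itself):

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/sx.h>
#include <geom/geom.h>

/* Hypothetical control-request handler, for illustration only. */
static void
example_ctl_request(void)
{
	g_topology_assert();	/* panics with a message like the one
				 * above if the topology lock is not
				 * held exclusively */

	/* ... safe to modify geoms, providers, and consumers here ... */
}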



Re: 6.1 prerelease graid3 livelock?

2006-04-30 Thread Bradley W. Dutton
I can reproduce the panic I experienced before. When in single-user mode,
if I try to mount a raid3 array that isn't complete, I get the following
error:
panic: Lock (sx) GEOM topology locked @
/usr/src/sys/geom/raid3/g_raid3.c:775.

The full error and alltrace is here:
http://duttonbros.com/freebsd/ddb.log

As a workaround I would boot a pre-March 20 kernel, rebuild the array,
then boot the new kernel again.

Thanks,
Brad



 It looks like the second patch fixed the problem. The box has been up for
 just over a day and a half without any problems.

 Thanks,
 Brad


 Well I don't know what was going on earlier, but I reverted to a good
 kernel, synced my raid arrays (no longer degraded from the panics), then
 booted a kernel with the second patch applied; this time, no problems so
 far. I'll let you know how things go after the box is running for a while.

 Thanks,
 Brad


 Ok, I had already installed and booted the first patch. I then rebuilt
 the kernel with the second patch. Trying to reboot from the first patch
 to the second resulted in a crash/panic on shutdown. I didn't capture
 the output from this. Once I booted the second patch the machine panics
 in the boot process, in short:
 /dev/raid3t/moviesf: clean,o 190615 free (19p11 frags, 23588 oblocks, 0.8%
 fralogy locked @ /usr/src/sys/geom/raid3/g_raid3.c:773.
 KDB: enter: panic
 [thread pid 35 tid 100030 ]
 Stopped at  kdb_enter+0x30: leave

 The text copied from the serial console was a little garbled; it did
 say something like:
 sx lock, geom topology locked...

 I did an alltrace at that point which I'll send separately.

 Thanks,
 Brad


 On Thu, Apr 27, 2006 at 08:55:35AM +0200, Pawel Jakub Dawidek wrote:
 + On Sun, Apr 23, 2006 at 12:04:33PM -0700, Bradley W. Dutton wrote:
 + + Hi,
 + +
 + + I'm experiencing a sort of livelock on a 6.1 prerelease box. It appears
 + + all of the IO related activity hangs but the box continues to do
 + + routing/NAT/etc for internet access from my other boxes. I can usually
 + + get the lockup to occur within about 12 hours of booting.
 +
 + Ok, I think I found it. Could you try this patch:
 +
 + http://people.freebsd.org/~pjd/patches/g_raid3.c.4.patch

 markus@ reported the livelock is still there, so please try this patch
 instead:

 http://people.freebsd.org/~pjd/patches/g_raid3.c.5.patch





Re: 6.1 prerelease graid3 livelock?

2006-04-29 Thread Bradley W. Dutton
It looks like the second patch fixed the problem. The box has been up for
just over a day and a half without any problems.

Thanks,
Brad


 Well I don't know what was going on earlier, but I reverted to a good
 kernel, synced my raid arrays (no longer degraded from the panics), then
 booted a kernel with the second patch applied; this time, no problems so
 far. I'll let you know how things go after the box is running for a while.

 Thanks,
 Brad


 Ok, I had already installed and booted the first patch. I then rebuilt
 the kernel with the second patch. Trying to reboot from the first patch
 to the second resulted in a crash/panic on shutdown. I didn't capture
 the output from this. Once I booted the second patch the machine panics
 in the boot process, in short:
 /dev/raid3t/moviesf: clean,o 190615 free (19p11 frags, 23588 oblocks, 0.8%
 fralogy locked @ /usr/src/sys/geom/raid3/g_raid3.c:773.
 KDB: enter: panic
 [thread pid 35 tid 100030 ]
 Stopped at  kdb_enter+0x30: leave

 The text copied from the serial console was a little garbled; it did say
 something like:
 sx lock, geom topology locked...

 I did an alltrace at that point which I'll send separately.

 Thanks,
 Brad


 On Thu, Apr 27, 2006 at 08:55:35AM +0200, Pawel Jakub Dawidek wrote:
 + On Sun, Apr 23, 2006 at 12:04:33PM -0700, Bradley W. Dutton wrote:
 + + Hi,
 + +
 + + I'm experiencing a sort of livelock on a 6.1 prerelease box. It appears
 + + all of the IO related activity hangs but the box continues to do
 + + routing/NAT/etc for internet access from my other boxes. I can usually
 + + get the lockup to occur within about 12 hours of booting.
 +
 + Ok, I think I found it. Could you try this patch:
 +
 +  http://people.freebsd.org/~pjd/patches/g_raid3.c.4.patch

 markus@ reported the livelock is still there, so please try this patch
 instead:

 http://people.freebsd.org/~pjd/patches/g_raid3.c.5.patch










Re: 6.1 prerelease graid3 livelock?

2006-04-27 Thread Bradley W. Dutton
Ok, I had already installed and booted the first patch. I then rebuilt the
kernel with the second patch. Trying to reboot from the first patch to the
second resulted in a crash/panic on shutdown. I didn't capture the output
from this. Once I booted the second patch the machine panics in the boot
process, in short:
/dev/raid3t/moviesf: clean,o 190615 free (19p11 frags, 23588 oblocks, 0.8%
fralogy locked @ /usr/src/sys/geom/raid3/g_raid3.c:773.
KDB: enter: panic
[thread pid 35 tid 100030 ]
Stopped at  kdb_enter+0x30: leave

The text copied from the serial console was a little garbled; it did say
something like:
sx lock, geom topology locked...

I did an alltrace at that point which I'll send separately.

Thanks,
Brad


 On Thu, Apr 27, 2006 at 08:55:35AM +0200, Pawel Jakub Dawidek wrote:
 + On Sun, Apr 23, 2006 at 12:04:33PM -0700, Bradley W. Dutton wrote:
 + + Hi,
 + +
 + + I'm experiencing a sort of livelock on a 6.1 prerelease box. It appears
 + + all of the IO related activity hangs but the box continues to do
 + + routing/NAT/etc for internet access from my other boxes. I can usually
 + + get the lockup to occur within about 12 hours of booting.
 +
 + Ok, I think I found it. Could you try this patch:
 +
 +http://people.freebsd.org/~pjd/patches/g_raid3.c.4.patch

 markus@ reported the livelock is still there, so please try this patch
 instead:

   http://people.freebsd.org/~pjd/patches/g_raid3.c.5.patch




Re: 6.1 prerelease graid3 livelock?

2006-04-27 Thread Bradley W. Dutton
Well I don't know what was going on earlier, but I reverted to a good
kernel, synced my raid arrays (no longer degraded from the panics), then
booted a kernel with the second patch applied; this time, no problems so
far. I'll let you know how things go after the box is running for a while.

Thanks,
Brad


 Ok, I had already installed and booted the first patch. I then rebuilt the
 kernel with the second patch. Trying to reboot from the first patch to the
 second resulted in a crash/panic on shutdown. I didn't capture the output
 from this. Once I booted the second patch the machine panics in the boot
 process, in short:
 /dev/raid3t/moviesf: clean,o 190615 free (19p11 frags, 23588 oblocks, 0.8%
 fralogy locked @ /usr/src/sys/geom/raid3/g_raid3.c:773.
 KDB: enter: panic
 [thread pid 35 tid 100030 ]
 Stopped at  kdb_enter+0x30: leave

 The text copied from the serial console was a little garbled; it did say
 something like:
 sx lock, geom topology locked...

 I did an alltrace at that point which I'll send separately.

 Thanks,
 Brad


 On Thu, Apr 27, 2006 at 08:55:35AM +0200, Pawel Jakub Dawidek wrote:
 + On Sun, Apr 23, 2006 at 12:04:33PM -0700, Bradley W. Dutton wrote:
 + + Hi,
 + +
 + + I'm experiencing a sort of livelock on a 6.1 prerelease box. It appears
 + + all of the IO related activity hangs but the box continues to do
 + + routing/NAT/etc for internet access from my other boxes. I can usually
 + + get the lockup to occur within about 12 hours of booting.
 +
 + Ok, I think I found it. Could you try this patch:
 +
 +   http://people.freebsd.org/~pjd/patches/g_raid3.c.4.patch

 markus@ reported the livelock is still there, so please try this patch
 instead:

  http://people.freebsd.org/~pjd/patches/g_raid3.c.5.patch







6.1 prerelease graid3 livelock?

2006-04-23 Thread Bradley W. Dutton
Hi,

I'm experiencing a sort of livelock on a 6.1 prerelease box. It appears
all of the IO related activity hangs, but the box continues to do
routing/NAT/etc for internet access from my other boxes. I can usually
get the lockup to occur within about 12 hours of booting.

I've narrowed the commits down to those from March 20 (a kernel from
before then works, a kernel from after causes problems) and I think the
problem is geom/raid related. Besides a small gmirrored root partition,
the rest of my partitions are all graid3. I'm not sure what information
to provide to help troubleshoot, but I'm happy to do what's needed.

On an unrelated note, the rebuild speed was about 50% faster on my box
when using the new geom/raid code introduced on March 20th.

Thanks,
Brad
