
Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-09-24 Thread Thomas E. Spanjaard

On 09/24/2010 16:42, Marius Strobl wrote:
> On Tue, Jul 20, 2010 at 01:55:28PM +0200, Ståle Kristoffersen wrote:
>> I got the timeouts with STABLE as well, that was the reason for me to
>> try out CURRENT. I'm sorry I didn't mention that earlier.
>>
>> My main concern is to get rid of the timeouts, but a panic on one can't be
>> right. How can I debug this further? I can get timeout fairly consistent by
>> putting a bit of load on the drives. If it would help I can also provide
>> remote access.
>>
> 
> FYI, that panic is fixed with r213105.

That doesn't build as a module, at least. The build errors out on an
implicit declaration warning for mpt_req_on_pending_list, followed by a
warning about a nested extern declaration of the same function (line 853
of sys/dev/mpt/mpt.c).
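
For readers unfamiliar with this class of build failure, here is a minimal C sketch of what an implicit-declaration warning means and how it is usually avoided. This is not the mpt(4) code: every name below (sketch_req, sketch_complete, sketch_req_on_pending_list) is hypothetical, and the thread does not show how r213105 actually declares mpt_req_on_pending_list. The only point is that a call needs a prototype in scope; when the build treats warnings as errors, a missing one stops the build.

```c
#include <stdbool.h>

/* Hypothetical request type, for illustration only. */
struct sketch_req {
	bool pending;
};

/*
 * Prototype visible before the first use. If this line were missing (for
 * example because it sat behind a conditional-compilation guard that is
 * not active in this build), the call in sketch_complete() would be an
 * implicit declaration.
 */
static bool sketch_req_on_pending_list(const struct sketch_req *req);

/* Complete a request; refuse if it is not marked as pending. */
static int
sketch_complete(struct sketch_req *req)
{
	if (!sketch_req_on_pending_list(req))
		return (-1);
	req->pending = false;
	return (0);
}

static bool
sketch_req_on_pending_list(const struct sketch_req *req)
{
	return (req->pending);
}
```

Completing the same request twice returns 0 the first time and -1 the second time.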

Cheers,
-- 
Thomas E. Spanjaard
t...@netphreax.net
t...@deepbone.net




Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-09-24 Thread Marius Strobl

On Tue, Jul 20, 2010 at 01:55:28PM +0200, Ståle Kristoffersen wrote:
> On 2010-07-20 at 12:17, Marius Strobl wrote:
> > On Mon, Jul 19, 2010 at 07:06:54PM +0200, Ståle Kristoffersen wrote:
> > > On 2010-07-18 at 14:20, Marius Strobl wrote:
> > > > > > Downgrading now...
> > > > > 
> > > > > And it crashed again, with current from r209598...
> > > > > 
> > > > 
> > > > Ok, this at least means that your problem isn't caused by the recent
> > > > changes to mpt(4) as the pre-r209599 version only differed from the
> > > > 8-STABLE one in a cosmetic change at that time.
> > > 
> > > I have another data-point, I cvsup'ed to the latest current again, and
> > > rebuilt without INVARIANT and WITNESS, and now it seems to survive the
> > > timeouts.
> > 
> > That's more or less expected as the sanity check issuing the panic
> > just isn't compiled in then. However, my understanding was that with
> > STABLE you don't get the timeouts in the first place, or do you see
> > them there also?
> 
> I got the timeouts with STABLE as well, that was the reason for me to
> try out CURRENT. I'm sorry I didn't mention that earlier.
> 
> My main concern is to get rid of the timeouts, but a panic on one can't be
> right. How can I debug this further? I can get timeout fairly consistent by
> putting a bit of load on the drives. If it would help I can also provide
> remote access.
> 

FYI, that panic is fixed with r213105.

Marius

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-07-26 Thread Ståle Kristoffersen

On 2010-07-21 at 20:40, Svein Skogen (Listmail account) wrote:
> On 21.07.2010 18:33, Ståle Kristoffersen wrote:



> > I -might- have solved my problem. It has now ran for 24h without timeouts,
> > and with a bit of load on it. I think I might have ran into the seagate +
> > NCQ-problem, even tho seagate's webpage told me my drives was not affected
> > (according to the serial numbers). I did however update the following
> > num drives   firmware 
> > 6x  ST31000340AS SD15
> > 4x  ST31500341AS SD17
> 
> I have 8 of the last type (31500341AS) mine running on CC1H firmware,
> connected to my MFI. Not a single glitch so far.

I also have 8 of those :) Part of my problem is that they are all connected
to a SAS expander, and when one drive gets into trouble everything is reset,
so I can't see which drive is causing the problems. That's why I flashed
every drive I could find an update for.

> > to firmware SD1B (old SD17) and SD1A (old SD15), and that looks like it has
> > done the trick. I'll report back in a week or so if the problem has not
> > reappeared.
> 
> Hope it's fixed for you. I'm still keeping an eye on the MPT code to see
> if someone changes something that CAN be affecting my timeout
> issues/reset, and if I see something promising, I'm willing to dump out
> the entire server to tapes, and test run (I have sufficient spare tapes
> to actually test without losing data), but such a job will take me a
> week to prepare, and another to test. Quite a bit of time for something
> that "may" solve my problem... ;)

It still runs fine now after 6 days, so I'm optimistic :)
Not a single timeout.
Good luck with your tape drive.
-- 
Ståle Kristoffersen

Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-07-21 Thread Svein Skogen (Listmail account)

On 21.07.2010 18:33, Ståle Kristoffersen wrote:
> On 2010-07-20 at 14:16, Svein Skogen (Listmail account) wrote:
>> Sorry for the late response here, but what you're describing matches
>> fairly well what I saw with RELENG_8 (just after 8.0 was released), but
>> luckily I didn't have any disks on my MPT, just my tape autoloader.
>>
>> Random timeouts, and then bus resets (that made tape IO unreliable).
>>
>> The bad news, is that I had the exact same trouble with OpenSolaris
>> (134), and something-similar with Linux (can't remember versions), at
>> the time.
>>
>> I never did find a solution, and ended up throwing windows on the box,
>> just to get reliable backups.
>>
>> My MPT is a 3801 LSI1068e based card running the latest bios.
> 
> Hmm, that does not sound good. Did windows work on the same hardware
> without problems?

Yup. But notice that I do _NOT_ have any disks on my MPT (I have an MFI
for that), it's just a mini-sas<-->mini-sas into a HP 1/8G2 LTO3 Autoloader.

> I -might- have solved my problem. It has now ran for 24h without timeouts,
> and with a bit of load on it. I think I might have ran into the seagate +
> NCQ-problem, even tho seagate's webpage told me my drives was not affected
> (according to the serial numbers). I did however update the following
> num drives   firmware 
> 6x  ST31000340AS SD15
> 4x  ST31500341AS SD17

I have 8 of the last type (ST31500341AS), mine running CC1H firmware and
connected to my MFI. Not a single glitch so far.

> 
> to firmware SD1B (old SD17) and SD1A (old SD15), and that looks like it has
> done the trick. I'll report back in a week or so if the problem has not
> reappeared.

Hope it's fixed for you. I'm still keeping an eye on the MPT code to see
if someone changes something that CAN be affecting my timeout
issues/reset, and if I see something promising, I'm willing to dump out
the entire server to tapes, and test run (I have sufficient spare tapes
to actually test without losing data), but such a job will take me a
week to prepare, and another to test. Quite a bit of time for something
that "may" solve my problem... ;)

//Svein

-- 
+---+---
  /"\   |Svein Skogen   | sv...@d80.iso100.no
  \ /   |Solberg Østli 9| PGP Key:  0xE5E76831
   X|2020 Skedsmokorset | sv...@jernhuset.no
  / \   |Norway | PGP Key:  0xCE96CE13
|   | sv...@stillbilde.net
 ascii  |   | PGP Key:  0x58CD33B6
 ribbon |System Admin   | svein-listm...@stillbilde.net
Campaign|stillbilde.net | PGP Key:  0x22D494A4
+---+---
|msn messenger: | Mobile Phone: +47 907 03 575
|sv...@jernhuset.no | RIPE handle:SS16503-RIPE
+---+---
 If you really are in a hurry, mail me at
   svein-mob...@stillbilde.net
 This mailbox goes directly to my cellphone and is checked
even when I'm not in front of my computer.

 Picture Gallery:
  https://gallery.stillbilde.net/v/svein/





Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-07-21 Thread Ståle Kristoffersen

On 2010-07-20 at 14:16, Svein Skogen (Listmail account) wrote:
> Sorry for the late response here, but what you're describing matches
> fairly well what I saw with RELENG_8 (just after 8.0 was released), but
> luckily I didn't have any disks on my MPT, just my tape autoloader.
> 
> Random timeouts, and then bus resets (that made tape IO unreliable).
> 
> The bad news, is that I had the exact same trouble with OpenSolaris
> (134), and something-similar with Linux (can't remember versions), at
> the time.
> 
> I never did find a solution, and ended up throwing windows on the box,
> just to get reliable backups.
> 
> My MPT is a 3801 LSI1068e based card running the latest bios.

Hmm, that does not sound good. Did windows work on the same hardware
without problems?

I -might- have solved my problem. It has now run for 24h without timeouts,
with a bit of load on it. I think I might have run into the Seagate +
NCQ problem, even though Seagate's webpage told me my drives were not
affected (according to the serial numbers). I did, however, update the
following drives:

num  drives        old firmware  new firmware
6x   ST31000340AS  SD15          SD1A
4x   ST31500341AS  SD17          SD1B

That looks like it has done the trick. I'll report back in a week or so if
the problem has not reappeared.

-- 
Ståle Kristoffersen

Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-07-20 Thread Svein Skogen (Listmail account)

On 20.07.2010 13:55, Ståle Kristoffersen wrote:
> On 2010-07-20 at 12:17, Marius Strobl wrote:
>> On Mon, Jul 19, 2010 at 07:06:54PM +0200, Ståle Kristoffersen wrote:
>>> On 2010-07-18 at 14:20, Marius Strobl wrote:
>>>>>> Downgrading now...
>>>>>
>>>>> And it crashed again, with current from r209598...
>>>>>
>>>>
>>>> Ok, this at least means that your problem isn't caused by the recent
>>>> changes to mpt(4) as the pre-r209599 version only differed from the
>>>> 8-STABLE one in a cosmetic change at that time.
>>>
>>> I have another data-point, I cvsup'ed to the latest current again, and
>>> rebuilt without INVARIANT and WITNESS, and now it seems to survive the
>>> timeouts.
>>
>> That's more or less expected as the sanity check issuing the panic
>> just isn't compiled in then. However, my understanding was that with
>> STABLE you don't get the timeouts in the first place, or do you see
>> them there also?
> 
> I got the timeouts with STABLE as well, that was the reason for me to
> try out CURRENT. I'm sorry I didn't mention that earlier.
> 
> My main concern is to get rid of the timeouts, but a panic on one can't be
> right. How can I debug this further? I can get timeout fairly consistent by
> putting a bit of load on the drives. If it would help I can also provide
> remote access.
> 
> I'm trying to update the firmware on some of the drives now to see if that
> helps with the timeouts.

Sorry for the late response here, but what you're describing matches
fairly well what I saw with RELENG_8 (just after 8.0 was released), but
luckily I didn't have any disks on my MPT, just my tape autoloader.

Random timeouts, and then bus resets (that made tape IO unreliable).

The bad news is that I had the exact same trouble with OpenSolaris
(134), and something similar with Linux (can't remember versions), at
the time.

I never did find a solution, and ended up throwing Windows on the box,
just to get reliable backups.

My MPT is a 3801 LSI1068e-based card running the latest BIOS.

//Svein


Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-07-20 Thread Ståle Kristoffersen

On 2010-07-20 at 12:17, Marius Strobl wrote:
> On Mon, Jul 19, 2010 at 07:06:54PM +0200, Ståle Kristoffersen wrote:
> > On 2010-07-18 at 14:20, Marius Strobl wrote:
> > > > > Downgrading now...
> > > > 
> > > > And it crashed again, with current from r209598...
> > > > 
> > > 
> > > Ok, this at least means that your problem isn't caused by the recent
> > > changes to mpt(4) as the pre-r209599 version only differed from the
> > > 8-STABLE one in a cosmetic change at that time.
> > 
> > I have another data-point, I cvsup'ed to the latest current again, and
> > rebuilt without INVARIANT and WITNESS, and now it seems to survive the
> > timeouts.
> 
> That's more or less expected as the sanity check issuing the panic
> just isn't compiled in then. However, my understanding was that with
> STABLE you don't get the timeouts in the first place, or do you see
> them there also?

I got the timeouts with STABLE as well, that was the reason for me to
try out CURRENT. I'm sorry I didn't mention that earlier.

My main concern is to get rid of the timeouts, but a panic on a timeout can't
be right. How can I debug this further? I can trigger the timeouts fairly
consistently by putting a bit of load on the drives. If it would help, I can
also provide remote access.

I'm trying to update the firmware on some of the drives now to see if that
helps with the timeouts.

-- 
Ståle Kristoffersen

Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-07-20 Thread Marius Strobl

On Mon, Jul 19, 2010 at 07:06:54PM +0200, Ståle Kristoffersen wrote:
> On 2010-07-18 at 14:20, Marius Strobl wrote:
> > > > Downgrading now...
> > > 
> > > And it crashed again, with current from r209598...
> > > 
> > 
> > Ok, this at least means that your problem isn't caused by the recent
> > changes to mpt(4) as the pre-r209599 version only differed from the
> > 8-STABLE one in a cosmetic change at that time.
> 
> I have another data-point, I cvsup'ed to the latest current again, and
> rebuilt without INVARIANT and WITNESS, and now it seems to survive the
> timeouts.

That's more or less expected as the sanity check issuing the panic
just isn't compiled in then. However, my understanding was that with
STABLE you don't get the timeouts in the first place, or do you see
them there also?
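
To illustrate the point about the check not being compiled in, here is a minimal C sketch, assuming a hypothetical SKETCH_INVARIANTS macro that stands in for the kernel's INVARIANTS option. SKETCH_ASSERT, checks_enabled(), consume() and the message text are made up for this example and are not taken from mpt(4). Without the macro defined, the assertion expands to nothing, so the bad state is no longer detected; it is not repaired either.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the kernel's INVARIANTS option; define it to enable checks. */
#ifdef SKETCH_INVARIANTS
#define SKETCH_ASSERT(cond, msg) do {				\
	if (!(cond)) {						\
		fprintf(stderr, "panic: %s\n", (msg));		\
		abort();					\
	}							\
} while (0)
#else
/* Checks disabled: the condition is not even evaluated. */
#define SKETCH_ASSERT(cond, msg) do { } while (0)
#endif

/* Returns 1 when the checks are compiled in, 0 otherwise. */
static int
checks_enabled(void)
{
#ifdef SKETCH_INVARIANTS
	return (1);
#else
	return (0);
#endif
}

/* Take one request off a (hypothetical) queue of the given length. */
static int
consume(int queued)
{
	SKETCH_ASSERT(queued > 0, "request not on pending list");
	return (queued - 1);
}
```

Built without -DSKETCH_INVARIANTS, consume(0) quietly returns -1; built with it, the same call aborts with the panic message.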

Marius


Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-07-19 Thread Ståle Kristoffersen

On 2010-07-18 at 14:20, Marius Strobl wrote:
> > > Downgrading now...
> > 
> > And it crashed again, with current from r209598...
> > 
> 
> Ok, this at least means that your problem isn't caused by the recent
> changes to mpt(4) as the pre-r209599 version only differed from the
> 8-STABLE one in a cosmetic change at that time.

I have another data point: I cvsup'ed to the latest current again and
rebuilt without INVARIANTS and WITNESS, and now it seems to survive the
timeouts.
-- 
Ståle Kristoffersen

Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-07-18 Thread Ståle Kristoffersen

On 2010-07-16 at 12:31, Ståle Kristoffersen wrote:
> On 2010-07-15 at 19:52, Ståle Kristoffersen wrote:
> > On 2010-07-15 at 18:00, Marius Strobl wrote:
> > > On Thu, Jul 15, 2010 at 02:34:23PM +0200, Ståle Kristoffersen wrote:
> > > > Upgraded to from stable to current yesterday and very quickly received a
> > > > panic. It did however not dump it's core, so I was unable to debug it.
> > > > Today it did panic again, and I took a picture: (Sorry about the bad
> > > > quality)
> > > > 
> > > > http://folk.uio.no/stalk/mpt/IMG_1403.JPG
> > > > 
> > > > And from the backtrace:
> > > > http://folk.uio.no/stalk/mpt/IMG_1404.JPG
> > > > 
> > > > Both times I hade the mpt0: request timed out just before the panic.
> > > > 
> > > > I'm not sure why it's not dumping it's core (It was working under 
> > > > stable,
> > > > and I have dumpdev="AUTO" and dumpdir="/var/crash" in rc.conf)
> > > 
> > > What revision were you using?
> > 
> > Not sure exactly what revision I was using, is there an easy way to figure
> > that out? I ran cvsupdate around 13:00 CEST yesterday.
> > 
> > > Does using current as of r209598 make a difference?
> > 
> > Downgrading now...
> 
> And it crashed again, with current from r209598...

It still keeps on crashing :/
I grabbed the output of show alllocks:
http://folk.uio.no/stalk/mpt/IMAG0047.jpg

To me it looks like there may be a race condition or something similar that
makes the TAILQ_REMOVE call in mpt_scsi_tmf_reply_handler() operate on an
element that has already been removed, but this is an uneducated guess ;)
I do not understand enough of the driver to follow the flow of requests
through it.
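
The kind of failure guessed at above can be sketched in plain C. This is a simplified stand-in for a tail queue, not the sys/queue.h macros and not the mpt(4) code; struct req, struct req_list and all function names are hypothetical. link_ok() checks the invariant named in the panic message (the successor's back pointer must refer to this element), and list_remove_checked() returns false where a kernel built with the checks would panic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request type; the link layout mirrors a tail-queue entry. */
struct req {
	struct req *next;	/* next element, or NULL at the tail */
	struct req **prev;	/* address of the previous element's next pointer */
};

struct req_list {
	struct req *first;
	struct req **last;	/* address of the last element's next pointer */
};

static void
list_init(struct req_list *l)
{
	l->first = NULL;
	l->last = &l->first;
}

static void
list_insert_tail(struct req_list *l, struct req *r)
{
	r->next = NULL;
	r->prev = l->last;
	*l->last = r;
	l->last = &r->next;
}

/* The invariant behind "next->prev != elm": my successor must point back at me. */
static bool
link_ok(struct req *r)
{
	return (r->next == NULL || r->next->prev == &r->next);
}

/* Remove r if its forward link is sane; report false instead of panicking. */
static bool
list_remove_checked(struct req_list *l, struct req *r)
{
	if (!link_ok(r))
		return (false);
	if (r->next != NULL)
		r->next->prev = r->prev;
	else
		l->last = r->prev;
	*r->prev = r->next;
	return (true);
}
```

With three elements a, b and c queued in that order, removing b once succeeds. A second removal of b is rejected, because b still points at c while c's back pointer now refers to a.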

-- 
Ståle Kristoffersen

Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-07-18 Thread Marius Strobl

On Fri, Jul 16, 2010 at 12:31:26PM +0200, Ståle Kristoffersen wrote:
> On 2010-07-15 at 19:52, Ståle Kristoffersen wrote:
> > On 2010-07-15 at 18:00, Marius Strobl wrote:
> > > On Thu, Jul 15, 2010 at 02:34:23PM +0200, Ståle Kristoffersen wrote:
> > > > Upgraded to from stable to current yesterday and very quickly received a
> > > > panic. It did however not dump it's core, so I was unable to debug it.
> > > > Today it did panic again, and I took a picture: (Sorry about the bad
> > > > quality)
> > > > 
> > > > http://folk.uio.no/stalk/mpt/IMG_1403.JPG
> > > > 
> > > > And from the backtrace:
> > > > http://folk.uio.no/stalk/mpt/IMG_1404.JPG
> > > > 
> > > > Both times I hade the mpt0: request timed out just before the panic.
> > > > 
> > > > I'm not sure why it's not dumping it's core (It was working under 
> > > > stable,
> > > > and I have dumpdev="AUTO" and dumpdir="/var/crash" in rc.conf)
> > > 
> > > What revision were you using?
> > 
> > Not sure exactly what revision I was using, is there an easy way to figure
> > that out? I ran cvsupdate around 13:00 CEST yesterday.
> > 
> > > Does using current as of r209598 make a difference?
> > 
> > Downgrading now...
> 
> And it crashed again, with current from r209598...
> 

Ok, this at least means that your problem isn't caused by the recent
changes to mpt(4) as the pre-r209599 version only differed from the
8-STABLE one in a cosmetic change at that time.

Marius


Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-07-16 Thread Ståle Kristoffersen

On 2010-07-15 at 19:52, Ståle Kristoffersen wrote:
> On 2010-07-15 at 18:00, Marius Strobl wrote:
> > On Thu, Jul 15, 2010 at 02:34:23PM +0200, Ståle Kristoffersen wrote:
> > > Upgraded to from stable to current yesterday and very quickly received a
> > > panic. It did however not dump it's core, so I was unable to debug it.
> > > Today it did panic again, and I took a picture: (Sorry about the bad
> > > quality)
> > > 
> > > http://folk.uio.no/stalk/mpt/IMG_1403.JPG
> > > 
> > > And from the backtrace:
> > > http://folk.uio.no/stalk/mpt/IMG_1404.JPG
> > > 
> > > Both times I hade the mpt0: request timed out just before the panic.
> > > 
> > > I'm not sure why it's not dumping it's core (It was working under stable,
> > > and I have dumpdev="AUTO" and dumpdir="/var/crash" in rc.conf)
> > 
> > What revision were you using?
> 
> Not sure exactly what revision I was using, is there an easy way to figure
> that out? I ran cvsupdate around 13:00 CEST yesterday.
> 
> > Does using current as of r209598 make a difference?
> 
> Downgrading now...

And it crashed again, with current from r209598...


-- 
Ståle Kristoffersen

Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-07-15 Thread Ståle Kristoffersen

On 2010-07-15 at 18:00, Marius Strobl wrote:
> On Thu, Jul 15, 2010 at 02:34:23PM +0200, Ståle Kristoffersen wrote:
> > Upgraded to from stable to current yesterday and very quickly received a
> > panic. It did however not dump it's core, so I was unable to debug it.
> > Today it did panic again, and I took a picture: (Sorry about the bad
> > quality)
> > 
> > http://folk.uio.no/stalk/mpt/IMG_1403.JPG
> > 
> > And from the backtrace:
> > http://folk.uio.no/stalk/mpt/IMG_1404.JPG
> > 
> > Both times I hade the mpt0: request timed out just before the panic.
> > 
> > I'm not sure why it's not dumping it's core (It was working under stable,
> > and I have dumpdev="AUTO" and dumpdir="/var/crash" in rc.conf)
> 
> What revision were you using?

I'm not sure exactly which revision I was using. Is there an easy way to
figure that out? I ran cvsupdate around 13:00 CEST yesterday.

> Does using current as of r209598 make a difference?

Downgrading now...

-- 
Ståle Kristoffersen
staal...@ifi.uio.no

Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-07-15 Thread Marius Strobl

On Thu, Jul 15, 2010 at 02:34:23PM +0200, Ståle Kristoffersen wrote:
> Upgraded to from stable to current yesterday and very quickly received a
> panic. It did however not dump it's core, so I was unable to debug it.
> Today it did panic again, and I took a picture: (Sorry about the bad
> quality)
> 
> http://folk.uio.no/stalk/mpt/IMG_1403.JPG
> 
> And from the backtrace:
> http://folk.uio.no/stalk/mpt/IMG_1404.JPG
> 
> Both times I hade the mpt0: request timed out just before the panic.
> 
> I'm not sure why it's not dumping it's core (It was working under stable,
> and I have dumpdev="AUTO" and dumpdir="/var/crash" in rc.conf)

What revision were you using?
Does using current as of r209598 make a difference?

Marius


Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-07-15 Thread Ståle Kristoffersen

On 2010-07-15 at 14:34, Ståle Kristoffersen wrote:
> Upgraded to from stable to current yesterday and very quickly received a
> panic. It did however not dump it's core, so I was unable to debug it.
> Today it did panic again, and I took a picture: (Sorry about the bad
> quality)
> 
> http://folk.uio.no/stalk/mpt/IMG_1403.JPG
> 
> And from the backtrace:
> http://folk.uio.no/stalk/mpt/IMG_1404.JPG
> 
> Both times I hade the mpt0: request timed out just before the panic.
> 
> I'm not sure why it's not dumping it's core (It was working under stable,
> and I have dumpdev="AUTO" and dumpdir="/var/crash" in rc.conf)

Just to be complete: I also get this LOR at boot:
lock order reversal:
 1st 0xff80a5108b38 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2607
 2nd 0xff0002dc6000 dirhash (dirhash) @ /usr/src/sys/ufs/ufs/ufs_dirhash.c:283
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x2e
witness_checkorder() at witness_checkorder+0x81e
_sx_xlock() at _sx_xlock+0x55
ufsdirhash_acquire() at ufsdirhash_acquire+0x33
ufsdirhash_remove() at ufsdirhash_remove+0x16
ufs_dirremove() at ufs_dirremove+0x1a4
ufs_remove() at ufs_remove+0x92
VOP_REMOVE_APV() at VOP_REMOVE_APV+0x93
kern_unlinkat() at kern_unlinkat+0x2cb
syscallenter() at syscallenter+0x1b5
syscall() at syscall+0x4c
Xfast_syscall() at Xfast_syscall+0xe2
--- syscall (10, FreeBSD ELF64, unlink), rip = 0x80072f3cc, rsp = 0x7fffdb08, rbp = 0x7fffef58 ---
lock order reversal:
 1st 0xff00407a4458 ufs (ufs) @ /usr/src/sys/kern/vfs_mount.c:1058
 2nd 0xff00407aedb8 devfs (devfs) @ /usr/src/sys/kern/vfs_subr.c:2090
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x2e
witness_checkorder() at witness_checkorder+0x81e
__lockmgr_args() at __lockmgr_args+0xd11
vop_stdlock() at vop_stdlock+0x39
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x47
vget() at vget+0x7b
devfs_allocv() at devfs_allocv+0x100
devfs_root() at devfs_root+0x48
vfs_donmount() at vfs_donmount+0xfb2
nmount() at nmount+0x63
syscallenter() at syscallenter+0x1b5
syscall() at syscall+0x4c
Xfast_syscall() at Xfast_syscall+0xe2
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007b2b4c, rsp = 0x7fffdd28, rbp = 0x800c09048 ---

-- 
Ståle Kristoffersen

current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-07-15 Thread Ståle Kristoffersen

I upgraded from stable to current yesterday and very quickly got a
panic. However, it did not dump its core, so I was unable to debug it.
Today it panicked again, and I took a picture (sorry about the bad
quality):

http://folk.uio.no/stalk/mpt/IMG_1403.JPG

And from the backtrace:
http://folk.uio.no/stalk/mpt/IMG_1404.JPG

Both times I had an "mpt0: request timed out" message just before the panic.

I'm not sure why it's not dumping its core (it was working under stable,
and I have dumpdev="AUTO" and dumpdir="/var/crash" in rc.conf).
-- 
Ståle Kristoffersen