Re: New FreeBSD snapshots available: stable/10 (20150625 r284813)

Kurt Lidl Mon, 06 Jul 2015 13:40:47 -0700

On 7/2/15 11:00 AM, Glen Barber wrote:

On Thu, Jul 02, 2015 at 10:52:00AM -0400, Kurt Lidl wrote:

Kurt, can you re-enable the ipv6 line in rc.conf(5), and add '-tso6' to
your rc.conf(5) lines?


  ifconfig_bge0="DHCP"
  ifconfig_bge0_ipv6="inet6 accept_rtadv -tso6"


I tried this, and it panic'd in the same manner.  (Note - I've upgraded
this machine to the second 10.2-PRERELEASE build.)


Okay, thank you for testing.  The last commits that I see specifically
referencing this bge(4) model were a long time ago, but TSO was
mentioned.  It was worth a shot.


Sure, no problem.

[...]

I've also seen (now that it's been running a bit longer), a couple of
other occurrences of the "spin lock held too long" panic. So while
having the IPv6 configuration in /etc/rc.conf causes this crash to
occur most of the time on boot, the same crash occurs at other times
too, which don't appear to be IPv6 related.


Can you update the PR with this information, please?


Already done by the time I sent the email.

1) when making the requested change, I edited my /etc/rc.conf file,
and then issued "reboot".  The machine panic'd during the reboot
processing:

root@spork:~ # reboot
Jul  2 09:48:53 spork reboot: rebooted by root
Jul  2 09:48:53 spork syslogd: exiting on signal 15
Waiting (max 60 seconds) for system process `vnlru' to stop...done
Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining...0 0 0 0 done
All buffers synced.
Uptime: 14h34m16s
GEOM_MIRROR: Device gswap: provider mirror/gswap destroyed.
GEOM_MIRROR: Device gswap destroyed.
pid 1 (init), uid 0: exited on signal 4
spin lock 0xc0cba338 (smp rendezvous) held by 0xfffff8000bbbe920 (tid 100367) too long
timeout stopping cpus
panic: spin lock held too long
cpuid = 1
KDB: stack backtrace:
#0 0xc05757c0 at panic+0x20
#1 0xc0559250 at _mtx_lock_spin_failed+0x50
#2 0xc0559318 at _mtx_lock_spin_cookie+0xb8
#3 0xc08d801c at tick_get_timecount_mp+0xdc
#4 0xc05840c8 at binuptime+0x48
#5 0xc08a400c at timercb+0x6c
#6 0xc08d8380 at tick_intr+0x220
Uptime: 14h34m16s
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
timeout stopping cpus
timeout shutting down CPUs.

SC Alert: Host System has Reset

Note: the "SC Alert:" message comes from the Sparc's ALOM management system,
so that's from the hardware directly, not from FreeBSD's kernel.


Hmm.  Any chance this could be hardware (failure) related?


Highly unlikely.

First, Chris and I both see this same error on our V240 machines.

Also, I took the time this weekend to re-install from the
10.0-RELEASE media onto the other disks in this machine.[*]
My V240 has 4x72GB drives, so I now have 10.0-RELEASE running
on a ZFS mirror on disk0/disk1 and have the second 10.2-PRERELEASE
bits installed onto a ZFS mirror on disk2/disk3.  So I can boot
into either of those environments pretty easily.

When running 10.0-RELEASE, the hardware does not exhibit the
"spin lock held too long" message.

-Kurt

[*] This turned out to be unexpectedly hard.  I was able to boot
from the 10.0-RELEASE cdrom, and create a ZFS mirror, and install
to it, but when I rebooted, I got this error:

Trying to mount root from zfs:sys/ROOT/default []...
Mounting from zfs:sys/ROOT/default failed with error 45.

It took me a while to figure out what was going on.  In 10.0,
the sparc ZFS support probed all the disk devices, looking
for the disks in the boot zpool.  In 10.2, it only probes the
devices configured in the eeprom's "boot-device" setting.
I had installed the 10.2-ish bits into the zpool called "sys",
and when I reinstalled the 10.0 bits, I also put them into
a zpool named "sys".  So I had two entirely different "sys"
zpools, the first on disk0/disk1 and the second on disk2/disk3.

The 10.2 code could handle this (since it only looked at disk2/disk3),
and happily booted from disk2/disk3.
The 10.0 code, on the other hand, examined all the disks, found
devices that didn't match up, and gave up.  I ultimately ended up
reinstalling the 10.0-RELEASE software into a zpool named "sys0".
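
The difference between the two probing behaviours described above can be
illustrated with a toy model.  This is only a sketch under stated
assumptions: the function name, the pool ids, and the data layout are
hypothetical, and it does not reflect how the sparc64 boot code is
actually implemented.  It only shows why two distinct pools that share
the name "sys" confuse a probe of every disk but not a probe restricted
to the "boot-device" disks.

```python
# Toy model only: find_boot_pool, the pool ids, and the dict layout are
# hypothetical illustrations, not the real FreeBSD loader logic.

def find_boot_pool(disks, pool_name, probe_only=None):
    """Return the sorted disk names backing pool_name.

    disks:      mapping of disk name -> (pool name, pool id) from its label.
    probe_only: optional list of disk names to restrict probing to,
                standing in for the eeprom "boot-device" setting.
    Raises LookupError if no pool is found or if the matching disks
    belong to more than one distinct pool.
    """
    if probe_only is None:
        candidates = disks
    else:
        candidates = {d: disks[d] for d in probe_only}

    # Keep only the disks whose label carries the requested pool name.
    members = {d: pid for d, (name, pid) in candidates.items()
               if name == pool_name}
    if not members:
        raise LookupError(f"no pool named {pool_name!r} found")
    if len(set(members.values())) > 1:
        raise LookupError(f"devices for pool {pool_name!r} do not match up")
    return sorted(members)


# Two different pools, both named "sys" (ids are made up).
disks = {
    "disk0": ("sys", 1001), "disk1": ("sys", 1001),
    "disk2": ("sys", 2002), "disk3": ("sys", 2002),
}

# 10.2-style probe: only the boot-device disks are examined.
boot_102 = find_boot_pool(disks, "sys", probe_only=["disk2", "disk3"])

# 10.0-style probe: every disk is examined, so the name is ambiguous.
try:
    find_boot_pool(disks, "sys")
    probe_100_failed = False
except LookupError:
    probe_100_failed = True

# After reinstalling 10.0 into a pool named "sys0", the names are unique.
renamed = dict(disks, disk0=("sys0", 1001), disk1=("sys0", 1001))
boot_100 = find_boot_pool(renamed, "sys0")
```

In this model the restricted probe succeeds, the unrestricted probe
fails on the duplicate name, and giving the two pools distinct names
removes the ambiguity, which matches the workaround described above.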

_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[email protected]"
