panic on boot

2010-12-22 Thread Daniel Braniss
The hardware is a Sun Fire X2200 M2, and it's diskless, PXE booted.

This seems to have started sometime before 8.2, and it
'sometimes happens':

FreeBSD 8.2-PRERELEASE #15 r4274: Wed Dec 22 09:11:27 IST 2010c40, rbp = 
0x80ef5c60 ---
da...@rnd:/home/obj/rnd/r+d/stable/8/sys/HUJI amd64
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: Dual-Core AMD Opteron(tm) Processor 2218 (2613.40-MHz K8-class CPU)
  Origin = AuthenticAMD  Id = 0x40f13  Family = f  Model = 41  Stepping = 3
  Features=0x178bfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,
CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT
  Features2=0x2001SSE3,CX16
  AMD Features=0xea500800SYSCALL,NX,MMX+,FFXSR,RDTSCP,LM,3DNow!+,3DNow!
  AMD Features2=0x1fLAHF,CMP,SVM,ExtAPIC,CR8
...
SMP: AP CPU #3 Launched!
(cd0:ata0:0:0:0): SCSI status: Check Condition
cpu3 AP:
(cd0:ata0:0:0:0): SCSI sense: NOT READY asc:3a,0 (Medium not present)
 ID: 0x0300   VER: 0x80050010 LDR: 0x DFR: 0x
(cd0:  lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff
ata0:0:  timer: 0x000200ef therm: 0x0001 err: 0x00f00: pmc: 
0x000104000): 
Error 6, Unretryable error
SMP: AP CPU #2 Launched!
cd0 at ata0 bus 0 scbus0 target 0 lun 0
cpu2 AP:
cd0:  ID: 0x0200   VER: 0x80050010 LDR: 0x DFR: 0x
TEAC DV-28E-N P.6A Removable CD-ROM SCSI-0 device 
  lint0: 0x00010700 lint1: 0x0400 TPR: 0x SVR: 0x01ff
cd0: 33.300MB/s transfers  timer: 0x000200ef therm: 0x0001 err: 0x00f0 
( pmc: 0x00010400UDMA2, 
ATAPI 12bytes, ioapic0: routing intpin 3 (PIO 65534bytesISA IRQ 3)) to lapic 1 
vector 48
f
loiwotaapbilce0 :c lreoaunteirn gs tianrttpeidn
 4 (cd0: Attempt to query device size failed: NOT READY, Medium not present
ISA IRQ 4) to lapic 2 vector 48
ioapic0: routing intpin 9 (ISA IRQ 9) to lapic 3 vector 48
ioapic0: routing intpin 15 (ISA IRQ 15) to lapic 1 vector 49
ioapic0: routing intpin 17 (PCI IRQ 17) to lapic 2 vector 49
ioapic0: routing intpin 18 (PCI IRQ 18) to lapic 3 vector 49
ioapic0: routing intpin 22 (PCI IRQ 22) to lapic 1 vector 50
ioapic0: routing intpin 23 (PCI IRQ 23) to lapic 2 vector 50
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x10
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x808b1581
stack pointer   = 0x28:0x80ef5b20
frame pointer   = 0x28:0x80ef5b50
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= resume, IOPL = 0
current process = 0 (swapper)
trap number = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
panic() at panic+0x187
trap_fatal() at trap_fatal+0x290
trap_pfault() at trap_pfault+0x28f
trap() at trap+0x3df
calltrap() at calltrap+0x8
--- trap 0xc, rip = 0x808b1581, rsp = 0x80ef5b20, rbp = 0x80ef5b50 ---
intr_execute_handlers() at intr_execute_handlers+0x21
lapic_handle_intr() at lapic_handle_intr+0x37
Xapic_isr1() at Xapic_isr1+0xa5
--- interrupt, rip = 0x808b6cf3, rsp = 0x80ef5c40, rbp = 0x80ef5c60 ---
spinlock_exit() at spinlock_exit+0x33
ioapic_assign_cpu() at ioapic_assign_cpu+0x123
intr_shuffle_irqs() at intr_shuffle_irqs+0x9d
mi_startup() at mi_startup+0x77
btext() at btext+0x2c
Uptime: 2s


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


recent 8.2-STABLE commits break nullfs for tinderbox?

2010-12-22 Thread Matthias Andree
Greetings,

I'm tracking 8.2-PRERELEASE, and it appears that recent commits to nullfs, zfs,
vfs, or thereabouts have broken Tinderbox for me.

I'm mounting my ports tree via nullfs, which has been working fine for a year.

Any ideas, or further info needed?

Best regards
Matthias


MCA messages after upgrade to 8.2-BETA1

2010-12-22 Thread Miroslav Lachman

Hi,
the machine in question was upgraded from 7.3 to FreeBSD 8.2-BETA1 i386
GENERIC.
After this upgrade, I got the following messages in /var/log/messages every
hour. The machine is almost idle (for testing only).


Dec 21 12:42:26 kavkaz kernel: MCA: Bank 0, Status 0xd40e4833
Dec 21 12:42:26 kavkaz kernel: MCA: Global Cap 0x0105, Status 0x
Dec 21 12:42:26 kavkaz kernel: MCA: Vendor AuthenticAMD, ID 0x40f33, APIC ID 0
Dec 21 12:42:26 kavkaz kernel: MCA: CPU 0 COR OVER BUSLG Source DRD Memory
Dec 21 12:42:26 kavkaz kernel: MCA: Address 0x236493c0
Dec 21 12:42:26 kavkaz kernel: MCA: Bank 1, Status 0xd4004853
Dec 21 12:42:26 kavkaz kernel: MCA: Global Cap 0x0105, Status 0x
Dec 21 12:42:26 kavkaz kernel: MCA: Vendor AuthenticAMD, ID 0x40f33, APIC ID 0
Dec 21 12:42:26 kavkaz kernel: MCA: CPU 0 COR OVER BUSLG Source IRD Memory
Dec 21 12:42:26 kavkaz kernel: MCA: Address 0x2a1c9440
Dec 21 12:42:26 kavkaz kernel: MCA: Bank 2, Status 0xd0004863
Dec 21 12:42:26 kavkaz kernel: MCA: Global Cap 0x0105, Status 0x
Dec 21 12:42:26 kavkaz kernel: MCA: Vendor AuthenticAMD, ID 0x40f33, APIC ID 0
Dec 21 12:42:26 kavkaz kernel: MCA: CPU 0 COR OVER BUSLG Source PREFETCH Memory
Dec 21 12:42:26 kavkaz kernel: MCA: Bank 4, Status 0xdc0e40020813
Dec 21 12:42:26 kavkaz kernel: MCA: Global Cap 0x0105, Status 0x
Dec 21 12:42:26 kavkaz kernel: MCA: Vendor AuthenticAMD, ID 0x40f33, APIC ID 0
Dec 21 12:42:26 kavkaz kernel: MCA: CPU 0 COR OVER BUSLG Source RD Memory
Dec 21 12:42:26 kavkaz kernel: MCA: Address 0x2cac9678
Dec 21 12:42:26 kavkaz kernel: MCA: Misc 0xe00d0fff
Dec 21 12:42:26 kavkaz kernel: MCA: Bank 0, Status 0xd40e4833
Dec 21 12:42:26 kavkaz kernel: MCA: Global Cap 0x0105, Status 0x
Dec 21 12:42:26 kavkaz kernel: MCA: Vendor AuthenticAMD, ID 0x40f33, APIC ID 1
Dec 21 12:42:26 kavkaz kernel: MCA: CPU 1 COR OVER BUSLG Source DRD Memory
Dec 21 12:42:26 kavkaz kernel: MCA: Address 0x23649640
Dec 21 12:42:26 kavkaz kernel: MCA: Bank 1, Status 0xd4004853
Dec 21 12:42:26 kavkaz kernel: MCA: Global Cap 0x0105, Status 0x
Dec 21 12:42:26 kavkaz kernel: MCA: Vendor AuthenticAMD, ID 0x40f33, APIC ID 1
Dec 21 12:42:26 kavkaz kernel: MCA: CPU 1 COR OVER BUSLG Source IRD Memory
Dec 21 12:42:26 kavkaz kernel: MCA: Address 0x2a1c9440
Dec 21 12:42:26 kavkaz kernel: MCA: Bank 2, Status 0xd0004863
Dec 21 12:42:26 kavkaz kernel: MCA: Global Cap 0x0105, Status 0x
Dec 21 12:42:26 kavkaz kernel: MCA: Vendor AuthenticAMD, ID 0x40f33, APIC ID 1
Dec 21 12:42:26 kavkaz kernel: MCA: CPU 1 COR OVER BUSLG Source PREFETCH Memory


Can somebody tell me what these messages are?

Miroslav Lachman



Re: recent 8.2-STABLE commits break nullfs for tinderbox?

2010-12-22 Thread Mike Tancsa
On 12/22/2010 7:03 AM, Matthias Andree wrote:
> Greetings,
>
> I'm tracking 8.2-PRERELEASE, and it appears that recent commits to nullfs,
> zfs, vfs, or thereabouts have broken Tinderbox for me.
>
> I'm mounting my ports tree via nullfs, which has been working fine for a year.
>
> Any ideas, or further info needed?

Hi,
What specifically is broken?  Two of the FreeBSD tinderbox machines are
RELENG_8 from Dec 3 and they are fine.  However, they don't use nullfs,
just zfs and ufs.  Is it just nullfs that's broken?  What are the errors
you are getting?

---Mike

 


Re: recent 8.2-STABLE commits break nullfs for tinderbox?

2010-12-22 Thread Matthias Andree
On 22.12.2010 13:44, Mike Tancsa wrote:
> Hi,
> What specifically is broken?  Two of the FreeBSD tinderbox machines are
> RELENG_8 from Dec 3 and they are fine.  However, they don't use nullfs,
> just zfs and ufs.  Is it just nullfs that's broken?  What are the errors
> you are getting?

I updated after that.

mount_nullfs /usr/ports.cvs /usr/local/tinderbox/portstrees/FreeBSD/ports fails
with resource conflict avoided.  I'll now rebuild GENERIC from scratch
(including make clean) to see if that helps.

Tried switching to NFS, this appears to work now.


Re: recent 8.2-STABLE commits break nullfs for tinderbox?

2010-12-22 Thread Jeremy Chadwick
On Wed, Dec 22, 2010 at 02:21:20PM +0100, Matthias Andree wrote:
> I updated after that.
>
> mount_nullfs /usr/ports.cvs /usr/local/tinderbox/portstrees/FreeBSD/ports
> fails with resource conflict avoided.  I'll now rebuild GENERIC from scratch
> (including make clean) to see if that helps.
>
> Tried switching to NFS, this appears to work now.

FWIW, I can't find this error message ("resource conflict avoided")
anywhere in /usr/src, /usr/include, or /usr/ports on RELENG_8 source
dated from 2 hours ago.

grep -ri "resource conflict" /usr/src does return some results, but
nothing that looks identical to the string you posted.

Only reason I'm pointing this out: it would be good to find the commit
that breaks things for you, if there is such a commit, but we need
something to key off of.

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



Re: recent 8.2-STABLE commits break nullfs for tinderbox?

2010-12-22 Thread Mike Tancsa
On 12/22/2010 8:21 AM, Matthias Andree wrote:
> I updated after that.
>
> mount_nullfs /usr/ports.cvs /usr/local/tinderbox/portstrees/FreeBSD/ports
> fails with resource conflict avoided.  I'll now rebuild GENERIC from scratch
> (including make clean) to see if that helps.

Is the error "resource deadlock avoided" (EDEADLK), or "conflict"?


Strange, I am able to do this on RELENG_8 i386 from Dec 15th and AMD64
from the 12th:

0(ich10)# mount_nullfs /usr/ports /mnt
0(ich10)# mount
/dev/ada0s1a on / (ufs, local)
devfs on /dev (devfs, local, multilabel)
/dev/ada0s1g on /home (ufs, NFS exported, local, soft-updates)
/dev/ada0s1f on /tmp (ufs, local, soft-updates)
/dev/ada0s1d on /usr (ufs, local, soft-updates)
/dev/ada0s1e on /var (ufs, local, soft-updates)
/usr/ports on /mnt (nullfs, local)
0(ich10)# ls -l /mnt/ | head
total 22888
drwxr-xr-x  69 root  wheel  -     1536 Dec 14 09:03 .
drwxr-xr-x  25 root  wheel  -      512 Dec 15 16:33 ..
-rw-r--r--   1 root  wheel  -       19 Jul 14  1997 .cvsignore
-rw-r--r--   1 root  wheel  -    57038 Sep 29 14:06 CHANGES
-rw-r--r--   1 root  wheel  -     1498 Dec 31  2009 COPYRIGHT
-rw-r--r--   1 root  wheel  -     2680 Dec 14 09:03 GIDs
-rw-r--r--   1 root  wheel  - 21942459 Jul 18 22:23 INDEX-8
-rw-r--r--   1 root  wheel  -     9184 Oct 12 11:10 KNOBS
-rw-r--r--   1 root  wheel  -    32882 Dec  3 13:55 LEGAL
0(ich10)# md5 /mnt/INDEX-8 /usr/ports/INDEX-8
MD5 (/mnt/INDEX-8) = 2dd40914941dadac0afe3e0d86038322
MD5 (/usr/ports/INDEX-8) = 2dd40914941dadac0afe3e0d86038322
0(ich10)#

---Mike


ZFS v28 on 8.2-PRERELEASE

2010-12-22 Thread ciaby
I just downloaded and installed the latest zfs v28 patch (this one: 
http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101218.patch.xz).
The system boots fine, a 4-disk RAIDZ set gets mounted properly, and there
are no problems so far.
Thanks a lot to all the FreeBSD community for this great piece of
software! :-)


Ciaby
P.S. Can I remove the SSD ZIL without upgrading the pool?


Re: recent 8.2-STABLE commits break nullfs for tinderbox?

2010-12-22 Thread Matthias Andree
On 22.12.2010 14:29, Jeremy Chadwick wrote:
> FWIW, I can't find this error message ("resource conflict avoided")
> anywhere in /usr/src, /usr/include, or /usr/ports on RELENG_8 source
> dated from 2 hours ago.
>
> grep -ri "resource conflict" /usr/src does return some results, but
> nothing that looks identical to the string you posted.

Sorry, my fault, was quoting from memory.

This is now pasted:
mount_nullfs: Resource deadlock avoided

> Only reason I'm pointing this out: it would be good to find the commit
> that breaks things for you, if there is such a commit, but we need
> something to key off of.

Provided above.  Tests of kernel rebuilt from scratch are pending (requires
reboot and reconfiguration for nullfs).


Re: recent 8.2-STABLE commits break nullfs for tinderbox?

2010-12-22 Thread Paul B Mahol
On 12/22/10, Jeremy Chadwick free...@jdc.parodius.com wrote:
> FWIW, I can't find this error message ("resource conflict avoided")
> anywhere in /usr/src, /usr/include, or /usr/ports on RELENG_8 source
> dated from 2 hours ago.
>
> grep -ri "resource conflict" /usr/src does return some results, but
> nothing that looks identical to the string you posted.
>
> Only reason I'm pointing this out: it would be good to find the commit
> that breaks things for you, if there is such a commit, but we need
> something to key off of.

Perhaps the OP means "resource deadlock avoided"?

That message appears if you try to mount the same mount point with nullfs
twice, which doesn't make sense.
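
A minimal sketch of guarding against mounting nullfs on the same mount point twice, assuming `mount` prints lines in the format quoted earlier in this thread ("/usr/ports on /mnt (nullfs, local)"). The helper names `parse_mounts` and `nullfs_already_mounted` are hypothetical and not part of Tinderbox or FreeBSD; paths containing spaces are not handled.

```python
import re

# Matches lines in the style of the `mount` output quoted in this thread,
# e.g. "/usr/ports on /mnt (nullfs, local)". Assumes no spaces in paths.
MOUNT_LINE = re.compile(r"^(?P<source>\S+) on (?P<target>\S+) \((?P<opts>[^)]*)\)$")


def parse_mounts(mount_output):
    """Return (source, target, fstype) tuples parsed from `mount` output."""
    mounts = []
    for line in mount_output.splitlines():
        m = MOUNT_LINE.match(line.strip())
        if m:
            # The first item inside the parentheses is the filesystem type.
            fstype = m.group("opts").split(",")[0].strip()
            mounts.append((m.group("source"), m.group("target"), fstype))
    return mounts


def nullfs_already_mounted(mount_output, target):
    """True if a nullfs mount already sits on `target`."""
    return any(t == target and fs == "nullfs"
               for _, t, fs in parse_mounts(mount_output))


# Sample text copied from the `mount` listing posted in this thread.
sample = """\
/dev/ada0s1a on / (ufs, local)
devfs on /dev (devfs, local, multilabel)
/usr/ports on /mnt (nullfs, local)
"""

print(nullfs_already_mounted(sample, "/mnt"))  # True
print(nullfs_already_mounted(sample, "/tmp"))  # False
```

A wrapper script could run this check on the real `mount` output and skip mount_nullfs when it returns True.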


Re: recent 8.2-STABLE commits break nullfs for tinderbox?

2010-12-22 Thread Matthias Andree
On 22.12.2010 14:53, Paul B Mahol wrote:
> Perhaps the OP means "resource deadlock avoided"?
>
> That message appears if you try to mount the same mount point with nullfs
> twice, which doesn't make sense.

Then either Tinderbox's check for whether the directory exists is broken
(otherwise it wouldn't retry mounting it), or something else is broken.


Re: panic on boot

2010-12-22 Thread John Baldwin
On Wednesday, December 22, 2010 5:12:03 am Daniel Braniss wrote:
> the hardware is Sun Fire X2200 M2, and it's diskless, PXE booted.
>
> this seems to have started sometime before 8.2, and it
> 'sometimes happens':
>
> [...]
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> kdb_backtrace() at kdb_backtrace+0x37
> panic() at panic+0x187
> trap_fatal() at trap_fatal+0x290
> trap_pfault() at trap_pfault+0x28f
> trap() at trap+0x3df
> calltrap() at calltrap+0x8
> --- trap 0xc, rip = 0x808b1581, rsp = 0x80ef5b20, rbp = 0x80ef5b50 ---
> intr_execute_handlers() at intr_execute_handlers+0x21
> lapic_handle_intr() at lapic_handle_intr+0x37
> Xapic_isr1() at Xapic_isr1+0xa5
> --- interrupt, rip = 0x808b6cf3, rsp = 0x80ef5c40, rbp = 0x80ef5c60 ---
> spinlock_exit() at spinlock_exit+0x33
> ioapic_assign_cpu() at ioapic_assign_cpu+0x123
> intr_shuffle_irqs() at intr_shuffle_irqs+0x9d
> mi_startup() at mi_startup+0x77
> btext() at btext+0x2c
> Uptime: 2s

Can you do 'l *intr_execute_handlers+0x21' and 'l *ioapic_assign_cpu+0x123'
in 'gdb kernel.debug' of your kernel?
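
For anyone who wants to look up more frames than the two above, here is a rough sketch that turns KDB backtrace lines into the same kind of gdb `l *symbol+offset` commands. It assumes frames are printed as "func() at func+0xNN", as in the trace posted in this thread; `gdb_list_commands` is a hypothetical helper name, not an existing tool.

```python
import re

# A KDB backtrace frame in this thread looks like "func() at func+0x21".
FRAME = re.compile(r"^\s*(\w+)\(\) at (\w+)\+(0x[0-9a-fA-F]+)\s*$")


def gdb_list_commands(backtrace, wanted=None):
    """Turn backtrace frames into gdb 'l *symbol+offset' commands.

    If `wanted` is given, only frames whose symbol is in that set are kept.
    """
    commands = []
    for line in backtrace.splitlines():
        m = FRAME.match(line)
        if not m:
            continue  # skip the "--- trap ..." / "--- interrupt ..." separators
        symbol, offset = m.group(2), m.group(3)
        if wanted is None or symbol in wanted:
            commands.append(f"l *{symbol}+{offset}")
    return commands


# A few frames copied from the backtrace in the original report.
trace = """\
intr_execute_handlers() at intr_execute_handlers+0x21
lapic_handle_intr() at lapic_handle_intr+0x37
--- interrupt, rip = 0x808b6cf3, rsp = 0x80ef5c40, rbp = 0x80ef5c60 ---
spinlock_exit() at spinlock_exit+0x33
ioapic_assign_cpu() at ioapic_assign_cpu+0x123
"""

for cmd in gdb_list_commands(trace, {"intr_execute_handlers", "ioapic_assign_cpu"}):
    print(cmd)
# l *intr_execute_handlers+0x21
# l *ioapic_assign_cpu+0x123
```

The printed lines can be pasted at the prompt of `gdb kernel.debug`.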

-- 
John Baldwin


Re: MCA messages after upgrade to 8.2-BETA1

2010-12-22 Thread John Baldwin
On Wednesday, December 22, 2010 7:41:25 am Miroslav Lachman wrote:
> Dec 21 12:42:26 kavkaz kernel: MCA: Bank 0, Status 0xd40e4833
> Dec 21 12:42:26 kavkaz kernel: MCA: Global Cap 0x0105, Status 0x
> Dec 21 12:42:26 kavkaz kernel: MCA: Vendor AuthenticAMD, ID 0x40f33, APIC ID 0
> Dec 21 12:42:26 kavkaz kernel: MCA: CPU 0 COR OVER BUSLG Source DRD Memory
> Dec 21 12:42:26 kavkaz kernel: MCA: Address 0x236493c0

You are getting corrected ECC errors in your RAM.  You see them once an hour
because we poll the machine check registers once an hour.  If this happens
constantly, you might have a DIMM that is dying.

% ~/mcelog --ascii < foo.txt
mcelog: Cannot open /dev/mem for DMI decoding: Permission denied
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 0 data cache 
ADDR 236493c0 
  Data cache ECC error (syndrome 1c)
   bit46 = corrected ecc error
   bit62 = error overflow (multiple errors)
  bus error 'local node origin, request didn't time out
 data read mem transaction
 memory access, level generic'
STATUS d40e4833 MCGSTATUS 0
MCGCAP 105 APICID 0 SOCKETID 0 
CPUID Vendor AMD Family 15 Model 67
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 1 instruction cache 
ADDR 2a1c9440 
  Instruction cache ECC error
   bit46 = corrected ecc error
   bit62 = error overflow (multiple errors)
  bus error 'local node origin, request didn't time out
 instruction fetch mem transaction
 memory access, level generic'
STATUS d4004853 MCGSTATUS 0
MCGCAP 105 APICID 0 SOCKETID 0 
CPUID Vendor AMD Family 15 Model 67
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 2 bus unit 
  L2 cache ECC error
  Bus or cache array error
   bit46 = corrected ecc error
   bit62 = error overflow (multiple errors)
  bus error 'local node origin, request didn't time out
 prefetch mem transaction
 memory access, level generic'
STATUS d0004863 MCGSTATUS 0
MCGCAP 105 APICID 0 SOCKETID 0 
CPUID Vendor AMD Family 15 Model 67
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 0 4 northbridge 
MISC e00d0fff ADDR 2cac9678 
  Northbridge RAM ECC error
  ECC syndrome = 1c
   bit33 = err cpu1
   bit46 = corrected ecc error
   bit59 = misc error valid
   bit62 = error overflow (multiple errors)
  bus error 'local node origin, request didn't time out
 generic read mem transaction
 memory access, level generic'
STATUS dc0e40020813 MCGSTATUS 0
MCGCAP 105 APICID 0 SOCKETID 0 
CPUID Vendor AMD Family 15 Model 67
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 1 0 data cache 
ADDR 23649640 
  Data cache ECC error (syndrome 1c)
   bit46 = corrected ecc error
   bit62 = error overflow (multiple errors)
  bus error 'local node origin, request didn't time out
 data read mem transaction
 memory access, level generic'
STATUS d40e4833 MCGSTATUS 0
MCGCAP 105 APICID 1 SOCKETID 0 
CPUID Vendor AMD Family 15 Model 67
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 1 1 instruction cache 
ADDR 2a1c9440 
  Instruction cache ECC error
   bit46 = corrected ecc error
   bit62 = error overflow (multiple errors)
  bus error 'local node origin, request didn't time out
 instruction fetch mem transaction
 memory access, level generic'
STATUS d4004853 MCGSTATUS 0
MCGCAP 105 APICID 1 SOCKETID 0 
CPUID Vendor AMD Family 15 Model 67
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 1 2 bus unit 
  L2 cache ECC error
  Bus or cache array error
   bit46 = corrected ecc error
   bit62 = error overflow (multiple errors)
  bus error 'local node origin, request didn't time out
 prefetch mem transaction
 memory access, level generic'
STATUS d0004863 MCGSTATUS 0
MCGCAP 105 APICID 1 SOCKETID 0 
CPUID Vendor AMD Family 15 Model 67
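
The flag annotations in the mcelog output (bit46, bit59, bit62) can be read straight off a 64-bit bank status value. Below is a small sketch, assuming the usual x86 machine check status layout plus the AMD-specific bit 46 that mcelog labels above. The status values quoted in this thread look shortened by the list archive, so the example uses a synthetic value built from bit positions rather than one of them; `decode_status` and `is_corrected` are hypothetical helper names.

```python
# Flag bits of a machine check bank status register (MCi_STATUS).
# Bits 57-63 are the architectural flags; bit 46 is the AMD-specific
# "corrected ECC error" bit that mcelog reports in the output above.
STATUS_BITS = {
    63: "VAL (register holds a valid error)",
    62: "OVER (error overflow, multiple errors)",
    61: "UC (uncorrected error)",
    60: "EN (error reporting enabled)",
    59: "MISCV (MISC register valid)",
    58: "ADDRV (ADDR register valid)",
    57: "PCC (processor context corrupt)",
    46: "CECC (corrected ECC error, AMD-specific)",
}


def decode_status(status):
    """Return the names of the flag bits set in `status`, highest bit first."""
    return [name for bit, name in sorted(STATUS_BITS.items(), reverse=True)
            if (status >> bit) & 1]


def is_corrected(status):
    """A valid record with the UC bit clear describes a corrected error."""
    valid = (status >> 63) & 1
    uncorrected = (status >> 61) & 1
    return bool(valid and not uncorrected)


# Synthetic example value (an assumption, not copied from the logs):
# valid, overflow, address valid, corrected ECC.
example = (1 << 63) | (1 << 62) | (1 << 58) | (1 << 46)

for flag in decode_status(example):
    print(flag)
print("corrected:", is_corrected(example))
```

A record like this one, with VAL set and UC clear, is the kind reported as corrected (COR) in the kernel messages.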


-- 
John Baldwin


Re: panic on boot

2010-12-22 Thread Daniel Braniss
> Can you do 'l *intr_execute_handlers+0x21' and 'l *ioapic_assign_cpu+0x123'
> in 'gdb kernel.debug' of your kernel?

Sure, as soon as it happens, and it ain't happening now :-(
But when it does happen, I think it won't let me into the debugger;
I'll probably have to recompile.
Thanks,
danny





Re: panic on boot

2010-12-22 Thread Daniel Braniss
ok, it happened
...
Cannot dump. Device not defined or unavailable.
Automatic reboot in 15 seconds - press a key on the console to abort
-- Press a key on the console to reboot,
-- or switch off the system now.


but
a- the 15 seconds never happen :-)
b- there is some magic to get into the debugger,
   but I can't find it.
danny




Re: panic on boot

2010-12-22 Thread John Baldwin
On Wednesday, December 22, 2010 10:58:56 am Daniel Braniss wrote:
  Can you do 'l *intr_execute_handlers+0x21' and 'l *ioapic_assign_cpu+0x123'
  in 'gdb kernel.debug' of your kernel?
 
 sure, as soon as it happens, and it aint happening now :-(
 but when it will happen, I think it won't let me into the debugger
 - probably will have to recompile

You don't need to trigger the panic, you can just run
'gdb /path/to/kernel.debug' (e.g.
'gdb /usr/obj/usr/src/sys/GENERIC/kernel.debug')
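
As a rough sketch of the above, the two lookups can also be run non-interactively from a command file. The KERNEL default below is just the example path from this message, and the command-file name is made up for illustration; point KERNEL at the kernel.debug that matches the kernel that panicked.

```shell
#!/bin/sh
# Assumption: KERNEL points at the kernel.debug built alongside the
# panicking kernel. The default is only the example path given above.
KERNEL="${KERNEL:-/usr/obj/usr/src/sys/GENERIC/kernel.debug}"

# Hypothetical scratch file holding the two 'list' requests from the thread.
CMDS="${TMPDIR:-/tmp}/panic-lookup.gdb"
cat > "$CMDS" <<'EOF'
list *intr_execute_handlers+0x21
list *ioapic_assign_cpu+0x123
EOF

# Run gdb in batch mode only when both gdb and the debug kernel are present;
# otherwise just report where the command file was written.
if [ -r "$KERNEL" ] && command -v gdb >/dev/null 2>&1; then
    gdb -batch -x "$CMDS" "$KERNEL"
else
    echo "debug kernel or gdb not found; commands saved in $CMDS"
fi
```

No panic and no running debugger are needed for this: gdb only reads the symbols and line tables from the kernel.debug file.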

-- 
John Baldwin


Re: ZFS v28 on 8.2-PRERELEASE

2010-12-22 Thread jhell
On 12/22/2010 08:48, ciaby wrote:
 P.S. Can i remove the SSD ZIL without upgrading the pool?

Simply put: no.

Longer answer: ZFS will complain when you try to replace or remove the
log device, and tell you it was formatted using an older version.
OpenSolaris and OpenIndiana both do this, so FreeBSD can be expected
to do the same.


Regards,

-- 

 jhell,v


Re: panic on boot

2010-12-22 Thread Daniel Braniss
 On Wednesday, December 22, 2010 10:58:56 am Daniel Braniss wrote:
   Can you do 'l *intr_execute_handlers+0x21' and 'l 
   *ioapic_assign_cpu+0x123'
   in 'gdb kernel.debug' of your kernel?
  
  sure, as soon as it happens, and it aint happening now :-(
  but when it will happen, I think it won't let me into the debugger
  - probably will have to recompile
 
 You don't need to trigger the panic, you can just run
 'gdb /path/to/kernel.debug' (e.g.
 'gdb /usr/obj/usr/src/sys/GENERIC/kernel.debug')
sorry, missed the gdb part.

gdb /d/7/boot/kernel/kernel
...
(gdb) l *intr_execute_handlers+0x21
0x808b1581 is in intr_execute_handlers (/r+d/stable/8/sys/amd64/amd64/intr_machdep.c:243).
238  * We count software interrupts when we process them.  The
239  * code here follows previous practice, but there's an
240  * argument for counting hardware interrupts when they're
241  * processed too.
242  */
243