Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
Eric Saxe wrote:
> Dennis Clarke wrote:
> Dennis, was this an Intel or AMD based system?
>
> Actually neither .. it is a low power appliance motherboard based on VIA technology.
>
> I see. If you can provide me access to a crash dump somehow, that would be helpful. Otherwise, if you can reproduce this let's take the conversation offline (or to a chat session), and we can debug it live...

FYI, this bug:

6580117 panic: assertion failed: ISP2(CPUSETSIZE()) on VIA Esther based system

...has been fixed in build 74 (Kit Chow fixed it)...

Thanks,
-Eric
___
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org
Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
Jürgen Keil wrote:
> cpu0: x86 (CentaurHauls 6A9 family 6 model 10 step 9 clock 1200 MHz)
> cpu0: VIA Esther processor 1200MHz
>
> Could be missing / broken or incomplete VIA/CentaurHauls x86 cpu support:

I'm suspecting that you are correct Jürgen. According to:

http://en.wikipedia.org/wiki/VIA_C7#Esther

...the CPU's L2 is 32-way set associative, so we're likely mis-characterizing it. I'm trying to get access to the CPUID specification for this CPU. If someone can send me a pointer, i'd be grateful. :)

-Eric
Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
Jürgen Keil wrote:
> cpu0: x86 (CentaurHauls 6A9 family 6 model 10 step 9 clock 1200 MHz)
> cpu0: VIA Esther processor 1200MHz
>
> Could be missing / broken or incomplete VIA/CentaurHauls x86 cpu support:
>
> I'm suspecting that you are correct Jürgen. According to:
>
> http://en.wikipedia.org/wiki/VIA_C7#Esther
>
> ...the CPU's L2 is 32-way set associative, so we're likely mis-characterizing it. I'm trying to get access to the CPUID specification for this CPU. If someone can send me a pointer, i'd be grateful. :)

here is what I have in front of me :

module /platform/i86pc/kernel//unix: text at [0xfe80, 0xfe8ae71b] data at 0xfec0
module /kernel/genunix: text at [0xfe8ae720, 0xfea4497f] data at 0xfec47d80
SunOS Release 5.11 Version snv_68 32-bit
Copyright 1983-2007 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
features: 1007effcpuid,sse2,sse,sep,pat,cx8,pae,mmx,cmov,de,pge,mtrr,msr,tsc,lgpg
cpu0: initialized cpu module 'cpu.generic'
mem = 981500K (0x3be7f000)
root nexus = i86pc
pseudo0 at root
pseudo0 is /pseudo
scsi_vhci0 at root
scsi_vhci0 is /scsi_vhci
isa0 at root
pci0 at root: space 0 offset 0
pci0 is /[EMAIL PROTECTED],0
IDE device at targ 0, lun 0 lastlun 0x0
model ST340014A
ATA/ATAPI-6 supported, majver 0x7e minver 0x1b
ata_set_feature: (0x66,0x0) failed
ATAPI device at targ 1, lun 0 lastlun 0x0
model SAMSUNG CDRW/DVD SM-352N
PCI-device: [EMAIL PROTECTED], ata2
ata2 is /[EMAIL PROTECTED],0/[EMAIL PROTECTED],1/[EMAIL PROTECTED]
UltraDMA mode 2 selected
UltraDMA mode 2 selected
UltraDMA mode 2 selected
UltraDMA mode 2 selected
cmdk0 at ata2 target 0 lun 0
cmdk0 is /[EMAIL PROTECTED],0/[EMAIL PROTECTED],1/[EMAIL PROTECTED]/[EMAIL PROTECTED],0
SMBIOS v2.3 loaded (902 bytes)pseudo-device: dld0
dld0 is /pseudo/[EMAIL PROTECTED]
PCI-device: pci1106,[EMAIL PROTECTED], pci_pci0
pci_pci0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED]
ISA-device: asy0
asy0 is /isa/[EMAIL PROTECTED],3f8
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4 (ehci0): Unable to take control from BIOS. Failure is ignored.
PCI-device: pci1106,[EMAIL PROTECTED],4, ehci0
ehci0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4
USB 2.0 device (usb951,1600) operating at hi speed (USB 2.x) on USB 2.0 root hub: [EMAIL PROTECTED], scsa2usb0 at bus address 2
Kingston DataTraveler II 5B571B0016A3
scsa2usb0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED] (scsa2usb0) online
USB 2.0 device (usb781,5151) operating at hi speed (USB 2.x) on USB 2.0 root hub: [EMAIL PROTECTED], scsa2usb1 at bus address 3
SanDisk Corporation Cruzer Micro 200517402304FB60FF18
scsa2usb1 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED] (scsa2usb1) online
USB 2.0 device (usb154b,5) operating at hi speed (USB 2.x) on USB 2.0 root hub: [EMAIL PROTECTED], scsa2usb2 at bus address 4
PNY USB 2.0 FD 6E6A09002646
scsa2usb2 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED] (scsa2usb2) online
sd0 at scsa2usb0: target 0 lun 0
sd0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]/[EMAIL PROTECTED],0
sd1 at scsa2usb1: target 0 lun 0
sd1 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]/[EMAIL PROTECTED],0
sd2 at scsa2usb2: target 0 lun 0
sd2 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]/[EMAIL PROTECTED],0
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd0) online
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd1) online
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd2) online
PCI-device: pci1106,[EMAIL PROTECTED], uhci0
uhci0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED]
PCI-device: pci1106,[EMAIL PROTECTED],1, uhci1
uhci1 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],1
PCI-device: pci1106,[EMAIL PROTECTED],2, uhci2
uhci2 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],2
PCI-device: pci1106,[EMAIL PROTECTED],3, uhci3
uhci3 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],3
cpu0: x86 (CentaurHauls 6A9 family 6 model 10 step 9 clock 1200 MHz)
cpu0: VIA Esther processor 1200MHz
pseudo-device: tzmon0
tzmon0 is /pseudo/[EMAIL PROTECTED]
UltraDMA mode 2 selected
UltraDMA mode 2 selected
UltraDMA mode 2 selected
UltraDMA mode 2 selected
dump on /dev/dsk/c0d0s1 size 2047 MB
Hostname: aequitas
pseudo-device: zfs0
zfs0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: devinfo0
devinfo0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: pm0
pm0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: power0
power0 is /pseudo/[EMAIL PROTECTED]
/dev/rdsk/c0d0s7 is clean
Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
Dennis Clarke wrote:
> Jürgen Keil wrote:
> cpu0: x86 (CentaurHauls 6A9 family 6 model 10 step 9 clock 1200 MHz)
> cpu0: VIA Esther processor 1200MHz
>
> Could be missing / broken or incomplete VIA/CentaurHauls x86 cpu support:
>
> I'm suspecting that you are correct Jürgen. According to:
>
> http://en.wikipedia.org/wiki/VIA_C7#Esther
>
> ...the CPU's L2 is 32-way set associative, so we're likely mis-characterizing it. I'm trying to get access to the CPUID specification for this CPU. If someone can send me a pointer, i'd be grateful. :)
>
> here is what I have in front of me :
>
> snip
>
> How can I be of help to you? I am sure that this machine here can provide all that you want to know.

I think what Eric is looking for is a reference or spec to the VIA C7 chip, specifically to find out what values it returns for CPUID queries.

The fact that OpenSolaris is detecting it as a 10-way set associative cache when it's in fact a 32-way is a bug in the way we're querying the CPUID function on the VIA chip (or a bug in the way we're interpreting the response); so we need to find out what the correct query is, and how to correctly interpret it.

cheers,
steve

--
stephen lau // [EMAIL PROTECTED] | 650.786.0845 | http://whacked.net
opensolaris // solaris kernel development
Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
Dennis Clarke wrote:
> Jürgen Keil wrote:
> cpu0: x86 (CentaurHauls 6A9 family 6 model 10 step 9 clock 1200 MHz)
> cpu0: VIA Esther processor 1200MHz
>
> Could be missing / broken or incomplete VIA/CentaurHauls x86 cpu support:
>
> I'm suspecting that you are correct Jürgen. According to:
>
> http://en.wikipedia.org/wiki/VIA_C7#Esther
>
> ...the CPU's L2 is 32-way set associative, so we're likely mis-characterizing it. I'm trying to get access to the CPUID specification for this CPU. If someone can send me a pointer, i'd be grateful. :)
>
> here is what I have in front of me :
>
> snip
>
> How can I be of help to you? I am sure that this machine here can provide all that you want to know.
>
> I think what Eric is looking for is a reference or spec to the VIA C7 chip, specifically to find out what values it returns for CPUID queries.

Oh, I don't think I have *that* in front of me.

> The fact that OpenSolaris is detecting it as a 10-way set associative cache when it's in fact a 32-way is a bug in the way we're querying the CPUID function on the VIA chip (or a bug in the way we're interpreting the response); so we need to find out what the correct query is, and how to correctly interpret it.

This seems to be an opportunity to employ some code early in the boot phase and to perhaps print out some debugging info at that time. The question is .. where and what to debug.

Dennis
Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
Dennis Clarke wrote:
> This seems to be an opportunity to employ some code early in the boot phase and to perhaps print out some debugging info at that time. The question is .. where and what to debug.

Can you send me the output of:

# echo cpuid_info0::print | mdb -k

Those are the tea leaves that i'll be using the VIA CPUID guide to interpret. :)

Thanks,
-Eric
Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
Dennis Clarke wrote:
> Dennis Clarke wrote:
> This seems to be an opportunity to employ some code early in the boot phase and to perhaps print out some debugging info at that time. The question is .. where and what to debug.
>
> Can you send me the output of:
>
> # echo cpuid_info0::print | mdb -k
>
> # echo cpuid_info0::print | mdb -k
> {
>     cpi_pass = 0x4
>     cpi_maxeax = 0x1
>     cpi_vendorstr = [ CentaurHauls ]
>     cpi_vendor = 0x6
>     cpi_family = 0x6
>     cpi_model = 0xa
>     cpi_step = 0x9
>     cpi_chipid = 0x
>     cpi_brandid = 0
>     cpi_clogid = 0
>     cpi_ncpu_per_chip = 0x1
>     cpi_cacheinfo = [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]
>     cpi_ncache = 0
>     cpi_std = [
>         { cp_eax = 0x1 cp_ebx = 0x746e6543 cp_ecx = 0x736c7561 cp_edx = 0x48727561 }
>         { cp_eax = 0x6a9 cp_ebx = 0x10800 cp_ecx = 0 cp_edx = 0xa7c9bbff }
>         { cp_eax = 0 cp_ebx = 0 cp_ecx = 0 cp_edx = 0 }
>         { cp_eax = 0 cp_ebx = 0 cp_ecx = 0 cp_edx = 0 }
>         { cp_eax = 0 cp_ebx = 0 cp_ecx = 0 cp_edx = 0 }
>         { cp_eax = 0 cp_ebx = 0 cp_ecx = 0 cp_edx = 0 }
>     ]
>     cpi_xmaxeax = 0x8006
>     cpi_brandstr = [ VIA Esther processor 1200MHz ]
>     cpi_pabits = 0x24
>     cpi_vabits = 0x20
>     cpi_extd = [
>         { cp_eax = 0x8006 cp_ebx = 0 cp_ecx = 0 cp_edx = 0 }
>         { cp_eax = 0 cp_ebx = 0 cp_ecx = 0 cp_edx = 0 }
>         { cp_eax = 0x20202020 cp_ebx = 0x20202020 cp_ecx = 0x20202020 cp_edx = 0x20202020 }
>         { cp_eax = 0x56202020 cp_ebx = 0x45204149 cp_ecx = 0x65687473 cp_edx = 0x72702072 }
>         { cp_eax = 0x7365636f cp_ebx = 0x20726f73 cp_ecx = 0x30303231 cp_edx = 0x7a484d }
>         { cp_eax = 0 cp_ebx = 0x8800880 cp_ecx = 0x40040140 cp_edx = 0x40040140 }
>         { cp_eax = 0 cp_ebx = 0 cp_ecx = 0x80a140 cp_edx = 0 }
>         { cp_eax = 0 cp_ebx = 0 cp_ecx = 0 cp_edx = 0 }
>         { cp_eax = 0 cp_ebx = 0 cp_ecx = 0 cp_edx = 0 }
>     ]
>     cpi_coreid = 0
>     cpi_ncore_per_chip = 0x1
>     cpi_support = [ 0xa7c9bbff, 0, 0, 0, 0 ]
>     cpi_chiprev = 0
>     cpi_chiprevstr = 0xfe8a6a80 Unknown
>     cpi_socket = 0
>     cpi_mwait = { mon_min = 0 mon_max = 0 support = 0 }
> }
>
> Those are the tea leaves that i'll be using the VIA CPUID guide to interpret. :)
>
> let me know anything else that you need.

Thanks, I can see from cpuid_info0.cpi_extd[6].cp_ecx[15:12] where the associativity was derived from (0xa), so we were assuming an AMD style cache description (amd_l2cacheinfo()).

cpuid_info.cpi_std[2] is filled with 0x20 descriptors...for which we don't have an entry in our tables. If the appropriate CPUID reference says we should be using function 2, then that's probably the problem.

I've filed:

6580117 panic: assertion failed: ISP2(CPUSETSIZE()) on VIA Esther based system

for this.

Thanks,
-Eric
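As an illustration of the bit field Eric refers to, here is a minimal C sketch that splits the cpi_extd[6].cp_ecx value from the dump (0x80a140) into its parts. It assumes the AMD-style layout for CPUID function 0x80000006 %ecx: size in KB in bits 31:16, associativity field in bits 15:12, lines per tag in bits 11:8, line size in bits 7:0. The struct and function names are hypothetical and are not taken from cpuid.c.

```c
#include <stdint.h>

/* Raw fields of an AMD-style L2 descriptor (CPUID 0x80000006, %ecx). */
struct l2_fields {
	uint32_t size_kb;       /* bits 31:16, cache size in KB */
	uint32_t assoc_field;   /* bits 15:12, raw associativity field */
	uint32_t lines_per_tag; /* bits 11:8 */
	uint32_t line_size;     /* bits 7:0, line size in bytes */
};

/* Split the register value into its fields; no interpretation is applied. */
static struct l2_fields
decode_l2_ecx(uint32_t ecx)
{
	struct l2_fields f;

	f.size_kb = (ecx >> 16) & 0xffff;
	f.assoc_field = (ecx >> 12) & 0xf;
	f.lines_per_tag = (ecx >> 8) & 0xf;
	f.line_size = ecx & 0xff;
	return (f);
}
```

With 0x80a140 this yields size_kb = 128, assoc_field = 0xa and line_size = 0x40, which lines up with the (…, 40, a) arguments shown for page_coloring_init() in the panic backtrace.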
Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
Dennis Clarke wrote:
> Dennis Clarke wrote:
> This seems to be an opportunity to employ some code early in the boot phase and to perhaps print out some debugging info at that time. The question is .. where and what to debug.
>
> Can you send me the output of:
>
> # echo cpuid_info0::print | mdb -k
>
> # echo cpuid_info0::print | mdb -k
> {
>     cpi_pass = 0x4
>     cpi_maxeax = 0x1
>     cpi_vendorstr = [ CentaurHauls ]
>     cpi_vendor = 0x6
>     cpi_family = 0x6
>     cpi_model = 0xa
>     cpi_step = 0x9
>     cpi_chipid = 0x
>     cpi_brandid = 0
>     cpi_clogid = 0
>     cpi_ncpu_per_chip = 0x1
>     cpi_cacheinfo = [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]
>     cpi_ncache = 0
>     cpi_std = [
>         { cp_eax = 0x1 cp_ebx = 0x746e6543 cp_ecx = 0x736c7561 cp_edx = 0x48727561 }
>         { cp_eax = 0x6a9 cp_ebx = 0x10800 cp_ecx = 0 cp_edx = 0xa7c9bbff }
>         { cp_eax = 0 cp_ebx = 0 cp_ecx = 0 cp_edx = 0 }
>         { cp_eax = 0 cp_ebx = 0 cp_ecx = 0 cp_edx = 0 }
>         { cp_eax = 0 cp_ebx = 0 cp_ecx = 0 cp_edx = 0 }
>         { cp_eax = 0 cp_ebx = 0 cp_ecx = 0 cp_edx = 0 }
>     ]
>     cpi_xmaxeax = 0x8006
>     cpi_brandstr = [ VIA Esther processor 1200MHz ]
>     cpi_pabits = 0x24
>     cpi_vabits = 0x20
>     cpi_extd = [
>         { cp_eax = 0x8006 cp_ebx = 0 cp_ecx = 0 cp_edx = 0 }
>         { cp_eax = 0 cp_ebx = 0 cp_ecx = 0 cp_edx = 0 }
>         { cp_eax = 0x20202020 cp_ebx = 0x20202020 cp_ecx = 0x20202020 cp_edx = 0x20202020 }
>         { cp_eax = 0x56202020 cp_ebx = 0x45204149 cp_ecx = 0x65687473 cp_edx = 0x72702072 }
>         { cp_eax = 0x7365636f cp_ebx = 0x20726f73 cp_ecx = 0x30303231 cp_edx = 0x7a484d }
>         { cp_eax = 0 cp_ebx = 0x8800880 cp_ecx = 0x40040140 cp_edx = 0x40040140 }
>         { cp_eax = 0 cp_ebx = 0 cp_ecx = 0x80a140 cp_edx = 0 }
>         { cp_eax = 0 cp_ebx = 0 cp_ecx = 0 cp_edx = 0 }
>         { cp_eax = 0 cp_ebx = 0 cp_ecx = 0 cp_edx = 0 }
>     ]
>     cpi_coreid = 0
>     cpi_ncore_per_chip = 0x1
>     cpi_support = [ 0xa7c9bbff, 0, 0, 0, 0 ]
>     cpi_chiprev = 0
>     cpi_chiprevstr = 0xfe8a6a80 Unknown
>     cpi_socket = 0
>     cpi_mwait = { mon_min = 0 mon_max = 0 support = 0 }
> }
>
> Those are the tea leaves that i'll be using the VIA CPUID guide to interpret. :)
>
> let me know anything else that you need.
>
> Thanks, I can see from cpuid_info0.cpi_extd[6].cp_ecx[15:12] where the associativity was derived from (0xa), so we were assuming an AMD style cache description (amd_l2cacheinfo()).
>
> cpuid_info.cpi_std[2] is filled with 0x20 descriptors...for which we don't have an entry in our tables. If the appropriate CPUID reference says we should be using function 2, then that's probably the problem.

gee ... I'll just agree ;-)

> I've filed:
>
> 6580117 panic: assertion failed: ISP2(CPUSETSIZE()) on VIA Esther based system
>
> for this.
>
> Thanks,
> -Eric

Thank you .. I'll keep an eye on this link :

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6580117

and in the meantime I'll update my code via hg and fire off another build just to test the results. When you're ready.

anything else you need .. just let me know.

Dennis Clarke
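The first cpi_std entry in the dump holds the vendor string packed into three registers. A small sketch of how those values turn back into text, assuming the usual CPUID function 0 convention (ebx, then edx, then ecx, low byte first). The function name is made up for illustration and does not come from the Solaris source.

```c
#include <stdint.h>

/*
 * Unpack the 12-character vendor string returned by CPUID function 0.
 * Register order is ebx, edx, ecx; each register is read low byte first.
 * 'out' must have room for 13 bytes (12 characters plus the terminator).
 */
static void
cpuid_vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
	const uint32_t regs[3] = { ebx, edx, ecx };
	int r, b;

	for (r = 0; r < 3; r++) {
		for (b = 0; b < 4; b++)
			out[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xff);
	}
	out[12] = '\0';
}
```

Fed the values from the dump (ebx = 0x746e6543, edx = 0x48727561, ecx = 0x736c7561) it produces CentaurHauls, the same string shown in cpi_vendorstr.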
Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
Gavin wrote:
> On 07/07/07 18:55, Dennis Clarke wrote:
> after BFU of snv_68 :
>
> module /platform/i86pc/kernel//unix: text at [0xfe80, 0xfe8d4a8b] data at 0xfec0
> module /kernel/genunix: text at [0xfe8d4a90, 0xfead88ff] data at 0xfec4cdc0
>
> panic[cpu0]/thread=fec1f2e0: assertion failed: l2cache_assoc ? (l2cache_sz / l2cache_assoc) : 0x1000)) (((l2cache_assoc ? (l2cache_sz / l2cache_assoc) : 0x1000)) - 1)) == 0), file
>
> fec382d0 genunix:assfail+5a (fe8c936c, fe8c95ec,)
> fec38300 unix:page_coloring_init+35a (2, 40, a)
> fec38358 unix:startup_memlist+3f5 (fec38384, fe954503,)
> fec38360 unix:startup+1c (fe800010, fec34128,)
> fec38384 genunix:main+5b ()
>
> That is this assert:
>
>     ASSERT(ISP2(CPUSETSIZE()));
>
> checking that the number of distinct l2 sets is a power of 2:
>
>     #define CPUSETSIZE()\
>         (l2cache_assoc ? (l2cache_sz / l2cache_assoc) : MMU_PAGESIZE)
>
>     #define ISP2(x) (((x) & ((x) - 1)) == 0)
>
> So could you boot under kmdb and at the time of panic (when you drop to the debugger) utter:
>
>     l2cache_assoc/D
>     l2cache_sz/X
>     l2cache_linesz/D

Hmm, startup_memlist+3f5 passes these as parameters to page_coloring_init(), so if we trust the parameters shown in the stack backtrace, we have

    pagecolor_memsz =
        page_coloring_init(l2cache_sz, l2cache_linesz, l2cache_assoc);

    fec38300 unix:page_coloring_init+35a (2, 40, a)

    l2cache_sz == 0x2, l2cache_linesz == 0x40, l2cache_assoc == 0xa

That should give us a CPUSETSIZE() of 0x2 / 0xa == 0x, which is not a power of 2.

This message posted from opensolaris.org
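To make the arithmetic concrete, here is a small sketch of the CPUSETSIZE() computation written as a plain function. The 128 KB cache size used in the note below is an assumption: it is read from bits 31:16 of the cpi_extd[6].cp_ecx value (0x80a140) reported elsewhere in this thread, since the size printed in the backtrace above is cut short. The function name is hypothetical, and 0x1000 stands in for MMU_PAGESIZE as it appears in the expanded assertion text.

```c
#include <stdint.h>

/*
 * Same expression as the CPUSETSIZE() macro, with the globals passed in
 * as arguments: cache size divided by associativity, or the page size
 * (0x1000 here) when the associativity is zero.
 */
static uint32_t
l2_setsize(uint32_t l2cache_sz, uint32_t l2cache_assoc)
{
	return (l2cache_assoc ? (l2cache_sz / l2cache_assoc) : 0x1000);
}
```

Under that assumption, l2_setsize(128 * 1024, 10) gives 13107, which is odd and so cannot be a power of two, while the same size with 32 ways gives 4096.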
Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
Gavin wrote:
> On 07/07/07 18:55, Dennis Clarke wrote:
> after BFU of snv_68 :
>
> module /platform/i86pc/kernel//unix: text at [0xfe80, 0xfe8d4a8b] data at 0xfec0
> module /kernel/genunix: text at [0xfe8d4a90, 0xfead88ff] data at 0xfec4cdc0
>
> panic[cpu0]/thread=fec1f2e0: assertion failed: l2cache_assoc ? (l2cache_sz / l2cache_assoc) : 0x1000)) (((l2cache_assoc ? (l2cache_sz / l2cache_assoc) : 0x1000)) - 1)) == 0), file
>
> fec382d0 genunix:assfail+5a (fe8c936c, fe8c95ec,)
> fec38300 unix:page_coloring_init+35a (2, 40, a)
> fec38358 unix:startup_memlist+3f5 (fec38384, fe954503,)
> fec38360 unix:startup+1c (fe800010, fec34128,)
> fec38384 genunix:main+5b ()
>
> That is this assert:
>
>     ASSERT(ISP2(CPUSETSIZE()));
>
> checking that the number of distinct l2 sets is a power of 2:
>
>     #define CPUSETSIZE()\
>         (l2cache_assoc ? (l2cache_sz / l2cache_assoc) : MMU_PAGESIZE)
>
>     #define ISP2(x) (((x) & ((x) - 1)) == 0)
>
> So could you boot under kmdb and at the time of panic (when you drop to the debugger) utter:
>
>     l2cache_assoc/D
>     l2cache_sz/X
>     l2cache_linesz/D
>
> Hmm, startup_memlist+3f5 passes these as parameters to page_coloring_init(), so if we trust the parameters shown in the stack backtrace, we have
>
>     pagecolor_memsz =
>         page_coloring_init(l2cache_sz, l2cache_linesz, l2cache_assoc);
>
>     fec38300 unix:page_coloring_init+35a (2, 40, a)
>
>     l2cache_sz == 0x2, l2cache_linesz == 0x40, l2cache_assoc == 0xa
>
> That should give us a CPUSETSIZE() of 0x2 / 0xa == 0x, which is not a power of 2.

If I had realized that this experiment would produce data of some value I would not have tossed out the whole thing so fast. But .. I needed a successful build ( which I did get ) and a successful BFU/ACR which I did not. So I installed snv_64a Developers Edition fresh and started over. That was another mistake because now I was building with Studio 12 as opposed to Studio 11. The build ran for 16 hours and then failed.

This is turning out to be a long week already with snv_68

Dennis
Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
Dennis Clarke wrote:
> Hmm, startup_memlist+3f5 passes these as parameters to page_coloring_init(), so if we trust the parameters shown in the stack backtrace, we have
>
>     pagecolor_memsz =
>         page_coloring_init(l2cache_sz, l2cache_linesz, l2cache_assoc);
>
>     fec38300 unix:page_coloring_init+35a (2, 40, a)
>
>     l2cache_sz == 0x2, l2cache_linesz == 0x40, l2cache_assoc == 0xa
>
> That should give us a CPUSETSIZE() of 0x2 / 0xa == 0x, which is not a power of 2.

10 way set associative cache eh. ;) Dennis, was this an Intel or AMD based system?

Thanks,
-Eric
Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
Eric Saxe wrote:
> Dennis Clarke wrote:
> Hmm, startup_memlist+3f5 passes these as parameters to page_coloring_init(), so if we trust the parameters shown in the stack backtrace, we have
>
>     pagecolor_memsz =
>         page_coloring_init(l2cache_sz, l2cache_linesz, l2cache_assoc);
>
>     fec38300 unix:page_coloring_init+35a (2, 40, a)
>
>     l2cache_sz == 0x2, l2cache_linesz == 0x40, l2cache_assoc == 0xa
>
> That should give us a CPUSETSIZE() of 0x2 / 0xa == 0x, which is not a power of 2.
>
> 10 way set associative cache eh. ;) Dennis, was this an Intel or AMD based system?

Never mind...I neglected to look closely at the verbose boot output you provided...

Thanks...
-Eric
Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
Dennis Clarke wrote:
> Hmm, startup_memlist+3f5 passes these as parameters to page_coloring_init(), so if we trust the parameters shown in the stack backtrace, we have
>
>     pagecolor_memsz =
>         page_coloring_init(l2cache_sz, l2cache_linesz, l2cache_assoc);
>
>     fec38300 unix:page_coloring_init+35a (2, 40, a)
>
>     l2cache_sz == 0x2, l2cache_linesz == 0x40, l2cache_assoc == 0xa
>
> That should give us a CPUSETSIZE() of 0x2 / 0xa == 0x, which is not a power of 2.
>
> 10 way set associative cache eh. ;) Dennis, was this an Intel or AMD based system?

Actually neither .. it is a low power appliance motherboard based on VIA technology.

Dennis
Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
Eric Saxe wrote:
> Dennis Clarke wrote:
> Hmm, startup_memlist+3f5 passes these as parameters to page_coloring_init(), so if we trust the parameters shown in the stack backtrace, we have
>
>     pagecolor_memsz =
>         page_coloring_init(l2cache_sz, l2cache_linesz, l2cache_assoc);
>
>     fec38300 unix:page_coloring_init+35a (2, 40, a)
>
>     l2cache_sz == 0x2, l2cache_linesz == 0x40, l2cache_assoc == 0xa
>
> That should give us a CPUSETSIZE() of 0x2 / 0xa == 0x, which is not a power of 2.
>
> 10 way set associative cache eh. ;) Dennis, was this an Intel or AMD based system?
>
> Never mind...I neglected to look closely at the verbose boot output you provided...
>
> Thanks...

I'm starting over from scratch again and in 16 hours or so .. hopefully .. I'll have a clean build. I have done this four times now and am not having great success .. for various little reasons.

I'll stay in touch.

Dennis
Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
Dennis Clarke wrote:
> Dennis, was this an Intel or AMD based system?
>
> Actually neither .. it is a low power appliance motherboard based on VIA technology.

I see. If you can provide me access to a crash dump somehow, that would be helpful. Otherwise, if you can reproduce this let's take the conversation offline (or to a chat session), and we can debug it live...

Thanks,
-Eric
Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
> after BFU of snv_68 :
>
> module /platform/i86pc/kernel//unix: text at [0xfe80, 0xfe8d4a8b] data at 0xfec0
> module /kernel/genunix: text at [0xfe8d4a90, 0xfead88ff] data at 0xfec4cdc0
>
> panic[cpu0]/thread=fec1f2e0: assertion failed: l2cache_assoc ? (l2cache_sz / l2cache_assoc) : 0x1000)) (((l2cache_assoc ? (l2cache_sz / l2cache_assoc) : 0x1000)) - 1)) == 0), file
>
> fec382d0 genunix:assfail+5a (fe8c936c, fe8c95ec,)
> fec38300 unix:page_coloring_init+35a (2, 40, a)
> fec38358 unix:startup_memlist+3f5 (fec38384,fe954503,)
> fec38360 unix:startup+1c (fe800010, fec34128,)
> fec38384 genunix:main+5b ()

Can you try to boot with kmdb enabled (kernel boot option: -k) and print the values for l2cache_sz and l2cache_assoc, when it stops at the panic / failed assertion ?

    l2cache_sz::print
    l2cache_assoc::print

> cpu0: x86 (CentaurHauls 6A9 family 6 model 10 step 9 clock 1200 MHz)
> cpu0: VIA Esther processor 1200MHz

Could be missing / broken or incomplete VIA/CentaurHauls x86 cpu support:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/i86pc/os/startup.c#1000

    /*
     * determine l2 cache info and memory size for page coloring
     */
    (void) getl2cacheinfo(CPU,
        &l2cache_sz, &l2cache_linesz, &l2cache_assoc);
    pagecolor_memsz =
        page_coloring_init(l2cache_sz, l2cache_linesz, l2cache_assoc);

In getl2cacheinfo(), I see support for Intel, AMD and Cyrix, but no support for VIA / CentaurHauls cpus?

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/i86pc/os/cpuid.c#3300

    int
    getl2cacheinfo(cpu_t *cpu, int *csz, int *lsz, int *assoc)
    {
        struct cpuid_info *cpi = cpu->cpu_m.mcpu_cpi;
        struct l2info __l2info, *l2i = &__l2info;

        l2i->l2i_csz = csz;
        l2i->l2i_lsz = lsz;
        l2i->l2i_assoc = assoc;
        l2i->l2i_ret = -1;

        switch (x86_which_cacheinfo(cpi)) {
        case X86_VENDOR_Intel:
            intel_walk_cacheinfo(cpi, l2i, intel_l2cinfo);
            break;
        case X86_VENDOR_Cyrix:
            cyrix_walk_cacheinfo(cpi, l2i, intel_l2cinfo);
            break;
        case X86_VENDOR_AMD:
            amd_l2cacheinfo(cpi, l2i);
            break;
        default:
            break;
        }
        return (l2i->l2i_ret);
    }
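The concern raised here can be shown in miniature: when a switch has no case for a vendor, the out-parameters are never written and the caller keeps whatever defaults it started with. The sketch below is a simplified stand-in, not the code in cpuid.c. The enum, the stub decoder and the values it writes are all hypothetical.

```c
/* Hypothetical vendor tags; not the kernel's X86_VENDOR_* constants. */
enum vendor_sketch { V_INTEL, V_AMD, V_CYRIX, V_OTHER };

/* Hypothetical stand-in for a vendor decoder: writes made-up values. */
static int
decode_stub(int *csz, int *lsz, int *assoc)
{
	*csz = 0x40000;
	*lsz = 64;
	*assoc = 8;
	return (0);
}

/*
 * Same shape as the switch quoted above: known vendors fill in the
 * out-parameters, anything else falls through and returns -1 with the
 * caller's values left untouched.
 */
static int
get_l2_sketch(enum vendor_sketch v, int *csz, int *lsz, int *assoc)
{
	int ret = -1;

	switch (v) {
	case V_INTEL:
	case V_CYRIX:
	case V_AMD:
		ret = decode_stub(csz, lsz, assoc);
		break;
	default:
		break;
	}
	return (ret);
}
```

Because the startup.c excerpt casts the return value to (void), a -1 result of this kind would go unnoticed and page_coloring_init() would simply be handed the compiled-in defaults.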
Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
On 07/07/07 18:55, Dennis Clarke wrote:
> after BFU of snv_68 :
>
> module /platform/i86pc/kernel//unix: text at [0xfe80, 0xfe8d4a8b] data at 0xfec0
> module /kernel/genunix: text at [0xfe8d4a90, 0xfead88ff] data at 0xfec4cdc0
>
> panic[cpu0]/thread=fec1f2e0: assertion failed: l2cache_assoc ? (l2cache_sz / l2cache_assoc) : 0x1000)) (((l2cache_assoc ? (l2cache_sz / l2cache_assoc) : 0x1000)) - 1)) == 0), file
>
> fec382d0 genunix:assfail+5a (fe8c936c, fe8c95ec,)
> fec38300 unix:page_coloring_init+35a (2, 40, a)
> fec38358 unix:startup_memlist+3f5 (fec38384, fe954503,)
> fec38360 unix:startup+1c (fe800010, fec34128,)
> fec38384 genunix:main+5b ()

That is this assert:

    ASSERT(ISP2(CPUSETSIZE()));

It's checking that the number of distinct l2 sets is a power of 2:

    #define CPUSETSIZE()\
        (l2cache_assoc ? (l2cache_sz / l2cache_assoc) : MMU_PAGESIZE)

    #define ISP2(x) (((x) & ((x) - 1)) == 0)

So could you boot under kmdb and at the time of panic (when you drop to the debugger) utter:

    l2cache_assoc/D
    l2cache_sz/X
    l2cache_linesz/D

Note the comment where these are defined:

    /*
     * XX64 need a comment here.. are these just default values, surely
     * we read the cpuid type information to figure this out.
     */
    int l2cache_sz = 0x8;
    int l2cache_linesz = 0x40;
    int l2cache_assoc = 1;

Gavin
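For readers unfamiliar with the ISP2() idiom: a power of two has exactly one bit set, and x & (x - 1) clears the lowest set bit, so the result is zero only when there was a single bit to begin with. A minimal standalone sketch follows. The function name isp2 is hypothetical, and the kernel uses the macro form quoted in the message above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True when x has at most one bit set. Note that zero also passes,
 * because 0 & (0 - 1) is 0.
 */
static bool
isp2(uint32_t x)
{
	return ((x & (x - 1)) == 0);
}
```

For example, isp2(0x1000) holds, while isp2(12) does not, since 12 & 11 leaves 8.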
Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
On 07/07/07 18:55, Dennis Clarke wrote:
> after BFU of snv_68 :
>
> module /platform/i86pc/kernel//unix: text at [0xfe80, 0xfe8d4a8b] data at 0xfec0
> module /kernel/genunix: text at [0xfe8d4a90, 0xfead88ff] data at 0xfec4cdc0
>
> panic[cpu0]/thread=fec1f2e0: assertion failed: l2cache_assoc ? (l2cache_sz / l2cache_assoc) : 0x1000)) (((l2cache_assoc ? (l2cache_sz / l2cache_assoc) : 0x1000)) - 1)) == 0), file
>
> fec382d0 genunix:assfail+5a (fe8c936c, fe8c95ec,)
> fec38300 unix:page_coloring_init+35a (2, 40, a)
> fec38358 unix:startup_memlist+3f5 (fec38384, fe954503,)
> fec38360 unix:startup+1c (fe800010, fec34128,)
> fec38384 genunix:main+5b ()
>
> That is this assert:
>
>     ASSERT(ISP2(CPUSETSIZE()));
>
> It's checking that the number of distinct l2 sets is a power of 2:
>
>     #define CPUSETSIZE()\
>         (l2cache_assoc ? (l2cache_sz / l2cache_assoc) : MMU_PAGESIZE)
>
>     #define ISP2(x) (((x) & ((x) - 1)) == 0)
>
> So could you boot under kmdb and at the time of panic (when you drop to the debugger) utter:
>
>     l2cache_assoc/D
>     l2cache_sz/X
>     l2cache_linesz/D
>
> Note the comment where these are defined:
>
>     /*
>      * XX64 need a comment here.. are these just default values, surely
>      * we read the cpuid type information to figure this out.
>      */
>     int l2cache_sz = 0x8;
>     int l2cache_linesz = 0x40;
>     int l2cache_assoc = 1;

oh gee ... well I wish I could.

I made the wild assumption that something was terribly wrong at my end, so I decided to reinstall snv_64a from scratch :

$ uname -a
SunOS aequitas 5.11 snv_64a i86pc i386 i86pc
$ cat /etc/release
Solaris Nevada snv_64a X86
Copyright 2007 Sun Microsystems, Inc. All Rights Reserved.
Use is subject to license terms.
Assembled 18 May 2007
$

I have everything ready to go for another build and I'll fire that off shortly. It takes over 14 hours on this appliance and then I can BFU/ACR boot+pray sequence :-)

so .. I'll check in about 15 hours from now.

Dennis
[osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed
after BFU of snv_68 :

module /platform/i86pc/kernel//unix: text at [0xfe80, 0xfe8d4a8b] data at 0xfec0
module /kernel/genunix: text at [0xfe8d4a90, 0xfead88ff] data at 0xfec4cdc0

panic[cpu0]/thread=fec1f2e0: assertion failed: l2cache_assoc ? (l2cache_sz / l2cache_assoc) : 0x1000)) (((l2cache_assoc ? (l2cache_sz / l2cache_assoc) : 0x1000)) - 1)) == 0), file

fec382d0 genunix:assfail+5a (fe8c936c, fe8c95ec,)
fec38300 unix:page_coloring_init+35a (2, 40, a)
fec38358 unix:startup_memlist+3f5 (fec38384, fe954503,)
fec38360 unix:startup+1c (fe800010, fec34128,)
fec38384 genunix:main+5b ()

system reboots right away ... I select Solaris failsafe that seems to work :

module /platform/i86pc/kernel/unix: text at [0xfe80, 0xfe8abc47] data at 0xfec0
module /kernel/genunix: text at [0xfe8abc48, 0xfe9f1887] data at 0xfec4a840
SunOS Release 5.11 Version snv_64a 32-bit
Copyright 1983-2007 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
features: 1007effcpuid,sse2,sse,sep,pat,cx8,pae,mmx,cmov,de,pge,mtrr,msr,tsc,lgpg
Using default device instance data
cpu0: initialized cpu module 'cpu.generic'
mem = 981500K (0x3be7f000)
root nexus = i86pc
pseudo0 at root
pseudo0 is /pseudo
scsi_vhci0 at root
scsi_vhci0 is /scsi_vhci
isa0 at root
ramdisk0 at root
ramdisk0 is /ramdisk
SMBIOS v2.3 loaded (902 bytes)pseudo-device: dld0
dld0 is /pseudo/[EMAIL PROTECTED]
pci0 at root: space 0 offset 0
pci0 is /[EMAIL PROTECTED],0
PCI-device: pci1106,[EMAIL PROTECTED], pci_pci0
pci_pci0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED]
ISA-device: asy0
asy0 is /isa/[EMAIL PROTECTED],3f8
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4 (ehci0): Unable to take control from BIOS. Failure is ignored.
PCI-device: pci1106,[EMAIL PROTECTED],4, ehci0
ehci0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4
PCI-device: pci1106,[EMAIL PROTECTED], uhci0
uhci0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED]
PCI-device: pci1106,[EMAIL PROTECTED],1, uhci1
uhci1 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],1
PCI-device: pci1106,[EMAIL PROTECTED],2, uhci2
uhci2 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],2
PCI-device: pci1106,[EMAIL PROTECTED],3, uhci3
uhci3 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],3
cpu0: x86 (CentaurHauls 6A9 family 6 model 10 step 9 clock 1200 MHz)
cpu0: VIA Esther processor 1200MHz
pseudo-device: tzmon0
tzmon0 is /pseudo/[EMAIL PROTECTED]
USB 2.0 device (usb951,1600) operating at hi speed (USB 2.x) on USB 2.0 root hub: [EMAIL PROTECTED], scsa2usb0 at bus address 2
Kingston DataTraveler II 5B571B0016A3
scsa2usb0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED] (scsa2usb0) online
sd0 at scsa2usb0: target 0 lun 0
sd0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]/[EMAIL PROTECTED],0
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd0) online
Booting to milestone milestone/single-user:default.
Configuring /dev
pseudo-device: devinfo0
devinfo0 is /pseudo/[EMAIL PROTECTED]
xsvc0 at root: space 0 offset 0
xsvc0 is /[EMAIL PROTECTED],0
pseudo-device: pseudo1
pseudo1 is /pseudo/[EMAIL PROTECTED]
PCI-device: pci16f3,[EMAIL PROTECTED],5, audiovia823x0
audiovia823x0 is /[EMAIL PROTECTED],0/pci16f3,[EMAIL PROTECTED],5
IDE device at targ 0, lun 0 lastlun 0x0
model ST340014A
ATA/ATAPI-6 supported, majver 0x7e minver 0x1b
ata_set_feature: (0x66,0x0) failed
ATAPI device at targ 1, lun 0 lastlun 0x0
model SAMSUNG CDRW/DVD SM-352N
PCI-device: [EMAIL PROTECTED], ata2
ata2 is /[EMAIL PROTECTED],0/[EMAIL PROTECTED],1/[EMAIL PROTECTED]
UltraDMA mode 2 selected
UltraDMA mode 2 selected
UltraDMA mode 2 selected
sd1 at ata2: target 1 lun 0
sd1 is /[EMAIL PROTECTED],0/[EMAIL PROTECTED],1/[EMAIL PROTECTED]/[EMAIL PROTECTED],0
pseudo-device: fssnap0
fssnap0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: ramdisk1024
ramdisk1024 is /pseudo/[EMAIL PROTECTED]
pseudo-device: winlock0
winlock0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: fcp0
fcp0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: fcsm0
fcsm0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: llc10
llc10 is /pseudo/[EMAIL PROTECTED]
pseudo-device: lofi0
lofi0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: pool0
pool0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: power0
power0 is /pseudo/[EMAIL PROTECTED]
cmdk0 at ata2 target 0 lun 0
cmdk0 is /[EMAIL PROTECTED],0/[EMAIL PROTECTED],1/[EMAIL PROTECTED]/[EMAIL PROTECTED],0
pseudo-device: zfs0
zfs0 is /pseudo/[EMAIL PROTECTED]
Searching for installed OS instances...

Solaris Nevada snv_64a X86 was found on /dev/dsk/c0d0s0.
Do you wish to have it mounted read-write on /a? [y,n,?] y
mounting /dev/dsk/c0d0s0 on /a

Starting shell.
#

.. too much to hope for a core dump ...

# fsck -F ufs -Y /dev/rdsk/c0d0s1
** /dev/rdsk/c0d0s1
** Last Mounted on /var
** Phase 1 - Check Blocks and