Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-09-21 Thread Eric Saxe
Eric Saxe wrote:
 Dennis Clarke wrote:
   
 Dennis, was this an Intel or AMD based system?
 
   
 Actually neither .. it is a low power appliance motherboard based on VIA
 technology.
   
 
 I see. If you can provide me access to a crash dump somehow, that would
 be helpful. Otherwise, if you can reproduce this,
 let's take the conversation offline (or to a chat session), and we can
 debug it live...
   
FYI, this bug:
6580117 panic: assertion failed: ISP2(CPUSETSIZE()) on VIA Esther 
based system

...has been fixed in build 74 (Kit Chow fixed it)...

Thanks,
-Eric


___
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org


Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-12 Thread Eric Saxe
Jürgen Keil wrote:
   
 cpu0: x86 (CentaurHauls 6A9 family 6 model 10 step 9 clock 1200 MHz)
 cpu0: VIA Esther processor 1200MHz
 

 Could be missing / broken or incomplete VIA/CentaurHauls x86 cpu support:
   
I'm suspecting that you are correct Jürgen. According to:
http://en.wikipedia.org/wiki/VIA_C7#Esther

...the CPU's L2 is 32-way set associative, so we're likely
mis-characterizing it. I'm trying to get access to the CPUID
specification for this CPU. If someone can send me a pointer,
I'd be grateful. :)

-Eric


Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-12 Thread Dennis Clarke

 Jürgen Keil wrote:

 cpu0: x86 (CentaurHauls 6A9 family 6 model 10 step 9 clock 1200 MHz)
 cpu0: VIA Esther processor 1200MHz


 Could be missing / broken or incomplete VIA/CentaurHauls x86 cpu support:

 I'm suspecting that you are correct Jürgen. According to:
 http://en.wikipedia.org/wiki/VIA_C7#Esther

 ...the CPU's L2 is 32-way set associative, so we're likely
 mis-characterizing it. I'm trying to get access
 to the CPUID specification for this CPU. If someone can send me a
 pointer, I'd be grateful. :)

here is what I have in front of me :

module /platform/i86pc/kernel//unix: text at [0xfe80, 0xfe8ae71b] data
at 0xfec0
module /kernel/genunix: text at [0xfe8ae720, 0xfea4497f] data at 0xfec47d80
SunOS Release 5.11 Version snv_68 32-bit
Copyright 1983-2007 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
features:
1007effcpuid,sse2,sse,sep,pat,cx8,pae,mmx,cmov,de,pge,mtrr,msr,tsc,lgpg
cpu0: initialized cpu module 'cpu.generic'
mem = 981500K (0x3be7f000)
root nexus = i86pc
pseudo0 at root
pseudo0 is /pseudo
scsi_vhci0 at root
scsi_vhci0 is /scsi_vhci
isa0 at root
pci0 at root: space 0 offset 0
pci0 is /[EMAIL PROTECTED],0
IDE device at targ 0, lun 0 lastlun 0x0
model ST340014A
ATA/ATAPI-6 supported, majver 0x7e minver 0x1b
ata_set_feature: (0x66,0x0) failed
ATAPI device at targ 1, lun 0 lastlun 0x0
model SAMSUNG CDRW/DVD SM-352N
PCI-device: [EMAIL PROTECTED], ata2
ata2 is /[EMAIL PROTECTED],0/[EMAIL PROTECTED],1/[EMAIL PROTECTED]
UltraDMA mode 2 selected
UltraDMA mode 2 selected
UltraDMA mode 2 selected
UltraDMA mode 2 selected
cmdk0 at ata2 target 0 lun 0
cmdk0 is /[EMAIL PROTECTED],0/[EMAIL PROTECTED],1/[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0
SMBIOS v2.3 loaded (902 bytes)
pseudo-device: dld0
dld0 is /pseudo/[EMAIL PROTECTED]
PCI-device: pci1106,[EMAIL PROTECTED], pci_pci0
pci_pci0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED]
ISA-device: asy0
asy0 is /isa/[EMAIL PROTECTED],3f8
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4 (ehci0): Unable to take 
control from BIOS.
Failure is ignored.
PCI-device: pci1106,[EMAIL PROTECTED],4, ehci0
ehci0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4
USB 2.0 device (usb951,1600) operating at hi speed (USB 2.x) on USB 2.0 root
hub: [EMAIL PROTECTED], scsa2usb0 at bus address 2
Kingston DataTraveler II  5B571B0016A3
scsa2usb0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED] (scsa2usb0) 
online
USB 2.0 device (usb781,5151) operating at hi speed (USB 2.x) on USB 2.0 root
hub: [EMAIL PROTECTED], scsa2usb1 at bus address 3
SanDisk Corporation Cruzer Micro 200517402304FB60FF18
scsa2usb1 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED] (scsa2usb1) 
online
USB 2.0 device (usb154b,5) operating at hi speed (USB 2.x) on USB 2.0 root
hub: [EMAIL PROTECTED], scsa2usb2 at bus address 4
PNY  USB 2.0 FD   6E6A09002646
scsa2usb2 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED] (scsa2usb2) 
online
sd0 at scsa2usb0: target 0 lun 0
sd0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL 
PROTECTED]/[EMAIL PROTECTED],0
sd1 at scsa2usb1: target 0 lun 0
sd1 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL 
PROTECTED]/[EMAIL PROTECTED],0
sd2 at scsa2usb2: target 0 lun 0
sd2 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL 
PROTECTED]/[EMAIL PROTECTED],0
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd0) online
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd1) online
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd2) online
PCI-device: pci1106,[EMAIL PROTECTED], uhci0
uhci0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED]
PCI-device: pci1106,[EMAIL PROTECTED],1, uhci1
uhci1 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],1
PCI-device: pci1106,[EMAIL PROTECTED],2, uhci2
uhci2 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],2
PCI-device: pci1106,[EMAIL PROTECTED],3, uhci3
uhci3 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],3
cpu0: x86 (CentaurHauls 6A9 family 6 model 10 step 9 clock 1200 MHz)
cpu0: VIA Esther processor 1200MHz
pseudo-device: tzmon0
tzmon0 is /pseudo/[EMAIL PROTECTED]
UltraDMA mode 2 selected
UltraDMA mode 2 selected
UltraDMA mode 2 selected
UltraDMA mode 2 selected
dump on /dev/dsk/c0d0s1 size 2047 MB
Hostname: aequitas
pseudo-device: zfs0
zfs0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: devinfo0
devinfo0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: pm0
pm0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: power0
power0 is /pseudo/[EMAIL PROTECTED]
/dev/rdsk/c0d0s7 is clean

Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-12 Thread Stephen Lau
Dennis Clarke wrote:
 Jürgen Keil wrote:
 cpu0: x86 (CentaurHauls 6A9 family 6 model 10 step 9 clock 1200 MHz)
 cpu0: VIA Esther processor 1200MHz

 Could be missing / broken or incomplete VIA/CentaurHauls x86 cpu support:

 I'm suspecting that you are correct Jürgen. According to:
 http://en.wikipedia.org/wiki/VIA_C7#Esther

 ...the CPU's L2 is 32-way set associative, so we're likely
 mis-characterizing it. I'm trying to get access
 to the CPUID specification for this CPU. If someone can send me a
 pointer, I'd be grateful. :)
 
 here is what I have in front of me :
 

snip

 How can I be of help to you?  I am sure that this machine here can
 provide all that you want to know.

I think what Eric is looking for is a reference or spec to the VIA C7 
chip, specifically to find out what values it returns for CPUID queries.

The fact that OpenSolaris is detecting it as a 10-way set associative 
cache when it's in fact a 32-way is a bug in the way we're querying the 
CPUID function on the VIA chip (or a bug in the way we're interpreting 
the response); so we need to find out what the correct query is, and how 
to correctly interpret it.

cheers,
steve
-- 
stephen lau // [EMAIL PROTECTED] | 650.786.0845 | http://whacked.net
opensolaris // solaris kernel development


Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-12 Thread Dennis Clarke

 Dennis Clarke wrote:
 Jürgen Keil wrote:
 cpu0: x86 (CentaurHauls 6A9 family 6 model 10 step 9 clock 1200 MHz)
 cpu0: VIA Esther processor 1200MHz

 Could be missing / broken or incomplete VIA/CentaurHauls x86 cpu
 support:

 I'm suspecting that you are correct Jürgen. According to:
 http://en.wikipedia.org/wiki/VIA_C7#Esther

 ...the CPU's L2 is 32-way set associative, so we're likely
 mis-characterizing it. I'm trying to get access
 to the CPUID specification for this CPU. If someone can send me a
 pointer, I'd be grateful. :)

 here is what I have in front of me :


 snip

 How can I be of help to you?  I am sure that this machine here can
 provide all that you want to know.

 I think what Eric is looking for is a reference or spec to the VIA C7
 chip, specifically to find out what values it returns for CPUID queries.

Oh, I don't think I have *that* in front of me.

 The fact that OpenSolaris is detecting it as a 10-way set associative
 cache when it's in fact a 32-way is a bug in the way we're querying the
 CPUID function on the VIA chip (or a bug in the way we're interpreting
 the response); so we need to find out what the correct query is, and how
 to correctly interpret it.

This seems to be an opportunity to employ some code early in the boot phase
and to perhaps print out some debugging info at that time.

The question is .. where and what to debug.

Dennis



Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-12 Thread Eric Saxe
Dennis Clarke wrote:
 This seems to be an opportunity to employ some code early in the boot phase
 and to perhaps print out some debugging info at that time.

 The question is .. where and what to debug.
   
Can you send me the output of:

# echo cpuid_info0::print | mdb -k

Those are the tea leaves that I'll be using the VIA CPUID guide to
interpret. :)
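
As an illustration of the kind of decoding involved: CPUID function 0 returns the 12-character vendor string packed into EBX, EDX and ECX, in that order, least significant byte first. Below is a minimal sketch in C. The helper names put_reg and vendor_string are made up for this example, and the register values mentioned afterwards are the cpi_std[0] entries from the cpuid_info0 output posted in reply to this message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the four bytes of a register into out, least significant byte first. */
static void put_reg(char *out, uint32_t reg)
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((reg >> (8 * i)) & 0xff);
}

/*
 * Build the 12-character vendor string from CPUID function 0 results.
 * The register order is EBX, EDX, ECX. buf must hold at least 13 bytes.
 * (Hypothetical helper, not taken from the OpenSolaris source.)
 */
static void vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx,
    char buf[13])
{
    assert(buf != NULL);
    put_reg(buf, ebx);
    put_reg(buf + 4, edx);
    put_reg(buf + 8, ecx);
    buf[12] = '\0';
}
```

Calling vendor_string(0x746e6543, 0x48727561, 0x736c7561, buf) yields "CentaurHauls", which matches the cpi_vendorstr field in that output.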

Thanks,
-Eric


Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-12 Thread Eric Saxe
Dennis Clarke wrote:
 Dennis Clarke wrote:
 
 This seems to be an opportunity to employ some code early in the boot
 phase
 and to perhaps print out some debugging info at that time.

 The question is .. where and what to debug.

   
 Can you send me the output of:

 # echo cpuid_info0::print | mdb -k
 

 # echo cpuid_info0::print | mdb -k
 {
 cpi_pass = 0x4
 cpi_maxeax = 0x1
 cpi_vendorstr = [ CentaurHauls ]
 cpi_vendor = 0x6
 cpi_family = 0x6
 cpi_model = 0xa
 cpi_step = 0x9
 cpi_chipid = 0x
 cpi_brandid = 0
 cpi_clogid = 0
 cpi_ncpu_per_chip = 0x1
 cpi_cacheinfo = [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]
 cpi_ncache = 0
 cpi_std = [
 {
 cp_eax = 0x1
 cp_ebx = 0x746e6543
 cp_ecx = 0x736c7561
 cp_edx = 0x48727561
 }
 {
 cp_eax = 0x6a9
 cp_ebx = 0x10800
 cp_ecx = 0
 cp_edx = 0xa7c9bbff
 }
 {
 cp_eax = 0
 cp_ebx = 0
 cp_ecx = 0
 cp_edx = 0
 }
 {
 cp_eax = 0
 cp_ebx = 0
 cp_ecx = 0
 cp_edx = 0
 }
 {
 cp_eax = 0
 cp_ebx = 0
 cp_ecx = 0
 cp_edx = 0
 }
 {
 cp_eax = 0
 cp_ebx = 0
 cp_ecx = 0
 cp_edx = 0
 }
 ]
 cpi_xmaxeax = 0x8006
 cpi_brandstr = [ VIA Esther processor 1200MHz ]
 cpi_pabits = 0x24
 cpi_vabits = 0x20
 cpi_extd = [
 {
 cp_eax = 0x8006
 cp_ebx = 0
 cp_ecx = 0
 cp_edx = 0
 }
 {
 cp_eax = 0
 cp_ebx = 0
 cp_ecx = 0
 cp_edx = 0
 }
 {
 cp_eax = 0x20202020
 cp_ebx = 0x20202020
 cp_ecx = 0x20202020
 cp_edx = 0x20202020
 }
 {
 cp_eax = 0x56202020
 cp_ebx = 0x45204149
 cp_ecx = 0x65687473
 cp_edx = 0x72702072
 }
 {
 cp_eax = 0x7365636f
 cp_ebx = 0x20726f73
 cp_ecx = 0x30303231
 cp_edx = 0x7a484d
 }
 {
 cp_eax = 0
 cp_ebx = 0x8800880
 cp_ecx = 0x40040140
 cp_edx = 0x40040140
 }
 {
 cp_eax = 0
 cp_ebx = 0
 cp_ecx = 0x80a140
 cp_edx = 0
 }
 {
 cp_eax = 0
 cp_ebx = 0
 cp_ecx = 0
 cp_edx = 0
 }
 {
 cp_eax = 0
 cp_ebx = 0
 cp_ecx = 0
 cp_edx = 0
 }
 ]
 cpi_coreid = 0
 cpi_ncore_per_chip = 0x1
 cpi_support = [ 0xa7c9bbff, 0, 0, 0, 0 ]
 cpi_chiprev = 0
 cpi_chiprevstr = 0xfe8a6a80 Unknown
 cpi_socket = 0
 cpi_mwait = {
 mon_min = 0
 mon_max = 0
 support = 0
 }
 }

   
 Those are the tea leaves that I'll be using the VIA CPUID guide to
 interpret. :)
 

 let me know anything else that you need.
Thanks, I can see from cpuid_info0.cpi_extd[6].cp_ecx[15:12] where the
associativity (0xa) was derived from, so we were assuming an AMD-style
cache description (amd_l2cacheinfo()). cpuid_info.cpi_std[2] is filled
with 0x20 descriptors, for which we don't have an entry in our tables.
If the appropriate CPUID reference says we should be using function 2,
then that's probably the problem.
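
To make the field extraction concrete, here is a minimal sketch in C of pulling the L2 fields out of that ECX value. The 15:12 associativity position comes from the message above; the size (bits 31:16, in KB) and line size (bits 7:0) positions follow the AMD-style layout as I understand it and are assumptions here. The names bits, l2_fields and decode_l2_ecx are invented for illustration and are not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits [hi:lo] of a 32-bit register value. */
static uint32_t bits(uint32_t reg, unsigned hi, unsigned lo)
{
    assert(hi >= lo && hi < 32);
    unsigned width = hi - lo + 1;
    uint32_t mask = (width == 32) ? 0xffffffffu : ((1u << width) - 1u);
    return (reg >> lo) & mask;
}

/* Raw fields of an AMD-style L2 descriptor (assumed layout). */
struct l2_fields {
    uint32_t size_kb;      /* bits 31:16, cache size in KB */
    uint32_t assoc_field;  /* bits 15:12, raw 4-bit value */
    uint32_t line_size;    /* bits 7:0, line size in bytes */
};

static struct l2_fields decode_l2_ecx(uint32_t ecx)
{
    struct l2_fields f;
    f.size_kb = bits(ecx, 31, 16);
    f.assoc_field = bits(ecx, 15, 12);
    f.line_size = bits(ecx, 7, 0);
    return f;
}
```

For cp_ecx = 0x80a140 this gives a size of 0x80 KB (128 KB), a raw associativity field of 0xa and a line size of 0x40. Whether the 4-bit field is a literal way count or an encoded value is exactly the kind of thing the vendor's CPUID reference has to settle, so the sketch deliberately stops at the raw value.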

I've filed:
6580117 panic: assertion failed: ISP2(CPUSETSIZE()) on VIA Esther 
based system

for this.

Thanks,
-Eric


Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-12 Thread Dennis Clarke

 Dennis Clarke wrote:
 Dennis Clarke wrote:

 This seems to be an opportunity to employ some code early in the boot
 phase
 and to perhaps print out some debugging info at that time.

 The question is .. where and what to debug.


 Can you send me the output of:

 # echo cpuid_info0::print | mdb -k


 # echo cpuid_info0::print | mdb -k
 {
 cpi_pass = 0x4
 cpi_maxeax = 0x1
 cpi_vendorstr = [ CentaurHauls ]
 cpi_vendor = 0x6
 cpi_family = 0x6
 cpi_model = 0xa
 cpi_step = 0x9
 cpi_chipid = 0x
 cpi_brandid = 0
 cpi_clogid = 0
 cpi_ncpu_per_chip = 0x1
 cpi_cacheinfo = [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]
 cpi_ncache = 0
 cpi_std = [
 {
 cp_eax = 0x1
 cp_ebx = 0x746e6543
 cp_ecx = 0x736c7561
 cp_edx = 0x48727561
 }
 {
 cp_eax = 0x6a9
 cp_ebx = 0x10800
 cp_ecx = 0
 cp_edx = 0xa7c9bbff
 }
 {
 cp_eax = 0
 cp_ebx = 0
 cp_ecx = 0
 cp_edx = 0
 }
 {
 cp_eax = 0
 cp_ebx = 0
 cp_ecx = 0
 cp_edx = 0
 }
 {
 cp_eax = 0
 cp_ebx = 0
 cp_ecx = 0
 cp_edx = 0
 }
 {
 cp_eax = 0
 cp_ebx = 0
 cp_ecx = 0
 cp_edx = 0
 }
 ]
 cpi_xmaxeax = 0x8006
 cpi_brandstr = [ VIA Esther processor 1200MHz ]
 cpi_pabits = 0x24
 cpi_vabits = 0x20
 cpi_extd = [
 {
 cp_eax = 0x8006
 cp_ebx = 0
 cp_ecx = 0
 cp_edx = 0
 }
 {
 cp_eax = 0
 cp_ebx = 0
 cp_ecx = 0
 cp_edx = 0
 }
 {
 cp_eax = 0x20202020
 cp_ebx = 0x20202020
 cp_ecx = 0x20202020
 cp_edx = 0x20202020
 }
 {
 cp_eax = 0x56202020
 cp_ebx = 0x45204149
 cp_ecx = 0x65687473
 cp_edx = 0x72702072
 }
 {
 cp_eax = 0x7365636f
 cp_ebx = 0x20726f73
 cp_ecx = 0x30303231
 cp_edx = 0x7a484d
 }
 {
 cp_eax = 0
 cp_ebx = 0x8800880
 cp_ecx = 0x40040140
 cp_edx = 0x40040140
 }
 {
 cp_eax = 0
 cp_ebx = 0
 cp_ecx = 0x80a140
 cp_edx = 0
 }
 {
 cp_eax = 0
 cp_ebx = 0
 cp_ecx = 0
 cp_edx = 0
 }
 {
 cp_eax = 0
 cp_ebx = 0
 cp_ecx = 0
 cp_edx = 0
 }
 ]
 cpi_coreid = 0
 cpi_ncore_per_chip = 0x1
 cpi_support = [ 0xa7c9bbff, 0, 0, 0, 0 ]
 cpi_chiprev = 0
 cpi_chiprevstr = 0xfe8a6a80 Unknown
 cpi_socket = 0
 cpi_mwait = {
 mon_min = 0
 mon_max = 0
 support = 0
 }
 }


 Those are the tea leaves that I'll be using the VIA CPUID guide to
 interpret. :)


 let me know anything else that you need.
 Thanks, I can see from cpuid_info0.cpi_extd[6].cp_ecx[15:12] where the
 associativity was
 derived from (0xa), so we were assuming an AMD style cache description
 (amd_l2cacheinfo()). cpuid_info.cpi_std[2] is filled with 0x20
 descriptors...for which we don't have an entry in our
 tables.  If the appropriate CPUID reference says we should be using
 function 2, then that's probably the problem.

gee ... I'll just agree ;-)

 I've filed:
 6580117 panic: assertion failed: ISP2(CPUSETSIZE()) on VIA Esther
 based system
 for this.

 Thanks,
 -Eric


Thank you .. I'll keep an eye on this link :

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6580117

and in the meantime I'll update my code via hg and fire off another build
just to test the results.  When you're ready.

anything else you need .. just let me know.

Dennis Clarke



Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-10 Thread Jürgen Keil
Gavin wrote:
 On 07/07/07 18:55, Dennis Clarke wrote:
  after BFU of snv_68 :
  
  module /platform/i86pc/kernel//unix: text at [0xfe80, 0xfe8d4a8b] data 
  at 0xfec0
  module /kernel/genunix: text at [0xfe8d4a90, 0xfead88ff] data at 0xfec4cdc0
  
  panic[cpu0]/thread=fec1f2e0: assertion failed: ((((l2cache_assoc ?
  (l2cache_sz / l2cache_assoc) : 0x1000)) & (((l2cache_assoc ? (l2cache_sz /
  l2cache_assoc) : 0x1000)) - 1)) == 0), file
  
  fec382d0 genunix:assfail+5a (fe8c936c, fe8c95ec,)
  fec38300 unix:page_coloring_init+35a (2, 40, a)
  fec38358 unix:startup_memlist+3f5 (fec38384, fe954503,)
  fec38360 unix:startup+1c (fe800010, fec34128,)
  fec38384 genunix:main+5b ()
 
 That is this assert:
 
  ASSERT(ISP2(CPUSETSIZE()));
 checking that the number of distinct l2 sets is a power of 2:
 
 #define CPUSETSIZE()\
 (l2cache_assoc ? (l2cache_sz / l2cache_assoc) : MMU_PAGESIZE)
 #define ISP2(x) (((x) & ((x) - 1)) == 0)
 
 So could you boot under kmdb and at the time of panic
 (when you drop to the debugger) utter:
 
 l2cache_assoc/D
 l2cache_sz/X
 l2cache_linesz/D

Hmm, startup_memlist+3f5 passes these as parameters to page_coloring_init(),
so if we trust the parameters shown in the stack backtrace, we have

pagecolor_memsz =
page_coloring_init(l2cache_sz, l2cache_linesz, l2cache_assoc);

   fec38300 unix:page_coloring_init+35a (2, 40, a)

l2cache_sz == 0x2,
l2cache_linesz == 0x40,
l2cache_assoc ==  0xa

That should give us a CPUSETSIZE() of 0x2 / 0xa == 0x,
which is not a power of 2.
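
The archived copy of this message appears to have lost digits from the hex values above. As a worked example under a stated assumption: if the size argument was 0x20000 (128 KB, which is what a size field of 0x80 KB in cpi_extd[6].cp_ecx = 0x80a140, posted elsewhere in this thread, would give), the division can be checked with a small C sketch. L2_SIZE_BYTES, bytes_per_way and has_single_bit are names made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed input: 0x20000 is inferred from the cpuid_info0 output in this
 * thread, not read from the backtrace, which shows the value truncated.
 */
enum { L2_SIZE_BYTES = 0x20000 };

/* Cache size divided by the number of ways, as CPUSETSIZE() computes it. */
static uint32_t bytes_per_way(uint32_t cache_sz, uint32_t ways)
{
    assert(ways != 0);
    return cache_sz / ways;
}

/* Non-zero when v has exactly one bit set, i.e. is a power of two. */
static int has_single_bit(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}
```

With 10 ways, bytes_per_way(L2_SIZE_BYTES, 10) is 0x3333, which is not a power of two, so the assertion would fire. With 32 ways the result is 0x1000, which is.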
 
 
This message posted from opensolaris.org


Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-10 Thread Dennis Clarke

 Gavin wrote:
 On 07/07/07 18:55, Dennis Clarke wrote:
  after BFU of snv_68 :
 
  module /platform/i86pc/kernel//unix: text at [0xfe80, 0xfe8d4a8b]
 data at 0xfec0
  module /kernel/genunix: text at [0xfe8d4a90, 0xfead88ff] data at
 0xfec4cdc0
 
  panic[cpu0]/thread=fec1f2e0: assertion failed: ((((l2cache_assoc ?
 (l2cache_sz / l2cache_assoc) : 0x1000)) & (((l2cache_assoc ? (l2cache_sz
 / l2cache_assoc) : 0x1000)) - 1)) == 0), file
 
  fec382d0 genunix:assfail+5a (fe8c936c, fe8c95ec,)
  fec38300 unix:page_coloring_init+35a (2, 40, a)
  fec38358 unix:startup_memlist+3f5 (fec38384, fe954503,)
  fec38360 unix:startup+1c (fe800010, fec34128,)
  fec38384 genunix:main+5b ()

 That is this assert:

  ASSERT(ISP2(CPUSETSIZE()));
 checking that the number of distinct l2 sets is a power of 2:

 #define CPUSETSIZE()\
 (l2cache_assoc ? (l2cache_sz / l2cache_assoc) : MMU_PAGESIZE)
 #define ISP2(x) (((x) & ((x) - 1)) == 0)

 So could you boot under kmdb and at the time of panic
 (when you drop to the debugger) utter:

 l2cache_assoc/D
 l2cache_sz/X
 l2cache_linesz/D

 Hmm, startup_memlist+3f5 passes these as parameters to page_coloring_init(),
 so if we trust the parameters shown in the stack backtrace, we have

 pagecolor_memsz =
 page_coloring_init(l2cache_sz, l2cache_linesz, l2cache_assoc);

fec38300 unix:page_coloring_init+35a (2, 40, a)

 l2cache_sz == 0x2,
 l2cache_linesz == 0x40,
 l2cache_assoc ==  0xa

 That should give us a CPUSETSIZE() of 0x2 / 0xa == 0x,
 which is not a power of 2.

If I had realized that this experiment would produce data of some value I
would not have tossed out the whole thing so fast.  But .. I needed a
successful build ( which I did get ) and a successful BFU/ACR which I did
not.  So I installed snv_64a Developers Edition fresh and started over. 
That was another mistake because now I was building with Studio 12 as
opposed to Studio 11.  The build ran for 16 hours and then failed.

This is turning out to be a long week already with snv_68

Dennis


Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-10 Thread Eric Saxe
Dennis Clarke wrote:
 Hmm, startup_memlist+3f5 passes these as parameters to page_coloring_init(),
 so if we trust the parameters shown in the stack backtrace, we have

 pagecolor_memsz =
 page_coloring_init(l2cache_sz, l2cache_linesz, l2cache_assoc);

fec38300 unix:page_coloring_init+35a (2, 40, a)

 l2cache_sz == 0x2,
 l2cache_linesz == 0x40,
 l2cache_assoc ==  0xa

 That should give us a CPUSETSIZE() of 0x2 / 0xa == 0x,
 which is not a power of 2.
 
10 way set associative cache eh. ;)

Dennis, was this an Intel or AMD based system?

Thanks,
-Eric


Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-10 Thread Eric Saxe
Eric Saxe wrote:
 Dennis Clarke wrote:
   
 Hmm, startup_memlist+3f5 passes these as parameters to page_coloring_init(),
 so if we trust the parameters shown in the stack backtrace, we have

 pagecolor_memsz =
 page_coloring_init(l2cache_sz, l2cache_linesz, l2cache_assoc);

fec38300 unix:page_coloring_init+35a (2, 40, a)

 l2cache_sz == 0x2,
 l2cache_linesz == 0x40,
 l2cache_assoc ==  0xa

 That should give us a CPUSETSIZE() of 0x2 / 0xa == 0x,
 which is not a power of 2.
 
   
 10 way set associative cache eh. ;)

 Dennis, was this an Intel or AMD based system?
   
Never mind...I neglected to look closely at the verbose boot output you 
provided...
Thanks...

-Eric


Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-10 Thread Dennis Clarke

 Dennis Clarke wrote:
 Hmm, startup_memlist+3f5 passes these as parameters to
 page_coloring_init(),
 so if we trust the parameters shown in the stack backtrace, we have

 pagecolor_memsz =
 page_coloring_init(l2cache_sz, l2cache_linesz,
 l2cache_assoc);

fec38300 unix:page_coloring_init+35a (2, 40, a)

 l2cache_sz == 0x2,
 l2cache_linesz == 0x40,
 l2cache_assoc ==  0xa

 That should give us a CPUSETSIZE() of 0x2 / 0xa == 0x,
 which is not a power of 2.

 10 way set associative cache eh. ;)

 Dennis, was this an Intel or AMD based system?

Actually neither .. it is a low power appliance motherboard based on VIA
technology.

Dennis



Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-10 Thread Dennis Clarke

 Eric Saxe wrote:
 Dennis Clarke wrote:

 Hmm, startup_memlist+3f5 passes these as parameters to
 page_coloring_init(),
 so if we trust the parameters shown in the stack backtrace, we have

 pagecolor_memsz =
 page_coloring_init(l2cache_sz, l2cache_linesz,
 l2cache_assoc);

fec38300 unix:page_coloring_init+35a (2, 40, a)

 l2cache_sz == 0x2,
 l2cache_linesz == 0x40,
 l2cache_assoc ==  0xa

 That should give us a CPUSETSIZE() of 0x2 / 0xa == 0x,
 which is not a power of 2.


 10 way set associative cache eh. ;)

 Dennis, was this an Intel or AMD based system?

 Never mind...I neglected to look closely at the verbose boot output you
 provided...
 Thanks...


I'm starting over from scratch again and in 16 hours or so .. hopefully ..
I'll have a clean build.  I have done this four times now and am not having
great success .. for various little reasons.

I'll stay in touch.

Dennis



Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-10 Thread Eric Saxe
Dennis Clarke wrote:
 Dennis, was this an Intel or AMD based system?
 

 Actually neither .. it is a low power appliance motherboard based on VIA
 technology.
   
I see. If you can provide me access to a crash dump somehow, that would
be helpful. Otherwise, if you can reproduce this,
let's take the conversation offline (or to a chat session), and we can
debug it live...

Thanks,
-Eric




Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-09 Thread Jürgen Keil
 after BFU of snv_68 :
 
 module /platform/i86pc/kernel//unix: text at [0xfe80, 0xfe8d4a8b] data at 
 0xfec0
 module /kernel/genunix: text at [0xfe8d4a90, 0xfead88ff] data at 0xfec4cdc0
 
 panic[cpu0]/thread=fec1f2e0: assertion failed: ((((l2cache_assoc ?
 (l2cache_sz / l2cache_assoc) : 0x1000)) & (((l2cache_assoc ? (l2cache_sz /
 l2cache_assoc) : 0x1000)) - 1)) == 0), file
 
 fec382d0 genunix:assfail+5a (fe8c936c, fe8c95ec,)
 fec38300 unix:page_coloring_init+35a (2, 40, a)
 fec38358 unix:startup_memlist+3f5 (fec38384,fe954503,)
 fec38360 unix:startup+1c (fe800010, fec34128,)
 fec38384 genunix:main+5b ()

Can you try to boot with kmdb enabled (kernel boot option: -k)
and print the values for l2cache_sz and l2cache_assoc, when it
stops at the panic / failed assertion ?


l2cache_sz::print
l2cache_assoc::print


 cpu0: x86 (CentaurHauls 6A9 family 6 model 10 step 9 clock 1200 MHz)
 cpu0: VIA Esther processor 1200MHz

Could be missing / broken or incomplete VIA/CentaurHauls x86 cpu support:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/i86pc/os/startup.c#1000

    /*
     * determine l2 cache info and memory size for page coloring
     */
    (void) getl2cacheinfo(CPU,
        &l2cache_sz, &l2cache_linesz, &l2cache_assoc);
    pagecolor_memsz =
        page_coloring_init(l2cache_sz, l2cache_linesz, l2cache_assoc);


In getl2cacheinfo(), I see support for Intel, AMD and Cyrix, but no support
for VIA / CentaurHauls cpus?

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/i86pc/os/cpuid.c#3300

    int
    getl2cacheinfo(cpu_t *cpu, int *csz, int *lsz, int *assoc)
    {
        struct cpuid_info *cpi = cpu->cpu_m.mcpu_cpi;
        struct l2info __l2info, *l2i = &__l2info;

        l2i->l2i_csz = csz;
        l2i->l2i_lsz = lsz;
        l2i->l2i_assoc = assoc;
        l2i->l2i_ret = -1;

        switch (x86_which_cacheinfo(cpi)) {
        case X86_VENDOR_Intel:
            intel_walk_cacheinfo(cpi, l2i, intel_l2cinfo);
            break;
        case X86_VENDOR_Cyrix:
            cyrix_walk_cacheinfo(cpi, l2i, intel_l2cinfo);
            break;
        case X86_VENDOR_AMD:
            amd_l2cacheinfo(cpi, l2i);
            break;
        default:
            break;
        }
        return (l2i->l2i_ret);
    }
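
The shape of that function matters for debugging: results come back through out-parameters, and a vendor with no matching case returns -1 without touching them. Since the caller in startup.c casts the return value to void, it cannot tell the two outcomes apart. Here is a minimal, stand-alone sketch of that pattern in C. The enum, the function name lookup_l2 and the numbers are all placeholders invented for this example, not the kernel's code or real hardware data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vendor tags for the sketch. */
enum vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_CYRIX, VENDOR_OTHER };

/*
 * Fill the out-parameters only for vendors with a known decoder.
 * Returns 0 on success and -1 when no decoder matched, in which case
 * the caller's previous values are left untouched.
 */
static int lookup_l2(enum vendor v, int *csz, int *lsz, int *assoc)
{
    assert(csz != NULL && lsz != NULL && assoc != NULL);

    switch (v) {
    case VENDOR_INTEL:
    case VENDOR_AMD:
    case VENDOR_CYRIX:
        /* Placeholder numbers standing in for decoded CPUID data. */
        *csz = 0x40000;
        *lsz = 0x40;
        *assoc = 8;
        return 0;
    default:
        return -1;
    }
}
```

A caller that checks the result, for example if (lookup_l2(v, &sz, &lsz, &assoc) != 0), can at least log that built-in defaults are in use. Later messages in this thread indicate the VIA part was in fact being decoded by the AMD-style path rather than falling through to the default case.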
 
 


Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-09 Thread Gavin Maltby


On 07/07/07 18:55, Dennis Clarke wrote:
 after BFU of snv_68 :
 
 module /platform/i86pc/kernel//unix: text at [0xfe80, 0xfe8d4a8b] data
 at 0xfec0
 module /kernel/genunix: text at [0xfe8d4a90, 0xfead88ff] data at 0xfec4cdc0
 
 panic[cpu0]/thread=fec1f2e0: assertion failed: ((((l2cache_assoc ?
 (l2cache_sz / l2cache_assoc) : 0x1000)) & (((l2cache_assoc ? (l2cache_sz /
 l2cache_assoc) : 0x1000)) - 1)) == 0), file
 
 fec382d0 genunix:assfail+5a (fe8c936c, fe8c95ec,)
 fec38300 unix:page_coloring_init+35a (2, 40, a)
 fec38358 unix:startup_memlist+3f5 (fec38384, fe954503,)
 fec38360 unix:startup+1c (fe800010, fec34128,)
 fec38384 genunix:main+5b ()

That is this assert:

 ASSERT(ISP2(CPUSETSIZE()));

It's checking that the number of distinct l2 sets is a power of 2:

#define CPUSETSIZE()\
 (l2cache_assoc ? (l2cache_sz / l2cache_assoc) : MMU_PAGESIZE)

#define ISP2(x) (((x) & ((x) - 1)) == 0)
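
The ISP2() test relies on a bit trick: a power of two has a single bit set, so ANDing it with itself minus one gives zero. A minimal sketch in C follows, with the macro written using the bitwise AND it needs. The helper names passes_isp2 and count_passing are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Power-of-two test: clears the lowest set bit and checks nothing is left. */
#define ISP2(x) (((x) & ((x) - 1)) == 0)

/* Returns 1 if v passes the ISP2() test, 0 otherwise. */
static int passes_isp2(uint32_t v)
{
    return ISP2(v) ? 1 : 0;
}

/* Counts how many values in the inclusive range [lo, hi] pass the test. */
static unsigned count_passing(uint32_t lo, uint32_t hi)
{
    unsigned n = 0;

    assert(lo <= hi);
    for (uint32_t v = lo; ; v++) {
        n += (unsigned)passes_isp2(v);
        if (v == hi)
            break;
    }
    return n;
}
```

Note that zero also passes this test, so a set size that truncates to zero would slip through the assertion; any non-zero value with more than one bit set fails it.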

So could you boot under kmdb and at the time of panic (when you drop
to the debugger) utter:

l2cache_assoc/D
l2cache_sz/X
l2cache_linesz/D

Note the comment where these are defined:

/*
  * XX64 need a comment here.. are these just default values, surely
  * we read the cpuid type information to figure this out.
  */
int l2cache_sz = 0x8;
int l2cache_linesz = 0x40;
int l2cache_assoc = 1;

Gavin


Re: [osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-09 Thread Dennis Clarke



 On 07/07/07 18:55, Dennis Clarke wrote:
 after BFU of snv_68 :

 module /platform/i86pc/kernel//unix: text at [0xfe80, 0xfe8d4a8b] data
 at 0xfec0
 module /kernel/genunix: text at [0xfe8d4a90, 0xfead88ff] data at
 0xfec4cdc0

 panic[cpu0]/thread=fec1f2e0: assertion failed: ((((l2cache_assoc ?
 (l2cache_sz / l2cache_assoc) : 0x1000)) & (((l2cache_assoc ? (l2cache_sz /
 l2cache_assoc) : 0x1000)) - 1)) == 0), file

 fec382d0 genunix:assfail+5a (fe8c936c, fe8c95ec,)
 fec38300 unix:page_coloring_init+35a (2, 40, a)
 fec38358 unix:startup_memlist+3f5 (fec38384, fe954503,)
 fec38360 unix:startup+1c (fe800010, fec34128,)
 fec38384 genunix:main+5b ()

 That is this assert:

  ASSERT(ISP2(CPUSETSIZE()));

 It's checking that the number of distinct l2 sets is a power of 2:

 #define CPUSETSIZE()\
  (l2cache_assoc ? (l2cache_sz / l2cache_assoc) : MMU_PAGESIZE)

 #define ISP2(x) (((x) & ((x) - 1)) == 0)

 So could you boot under kmdb and at the time of panic (when you drop
 to the debugger) utter:

 l2cache_assoc/D
 l2cache_sz/X
 l2cache_linesz/D

 Note the comment where these are defined:

 /*
   * XX64 need a comment here.. are these just default values, surely
   * we read the cpuid type information to figure this out.
   */
 int l2cache_sz = 0x8;
 int l2cache_linesz = 0x40;
 int l2cache_assoc = 1;


  oh gee ... well I wish I could.

  I made the wild assumption that something was terribly wrong at my end,
so I decided to reinstall snv_64a from scratch :

$ uname -a
SunOS aequitas 5.11 snv_64a i86pc i386 i86pc
$ cat /etc/release
Solaris Nevada snv_64a X86
   Copyright 2007 Sun Microsystems, Inc.  All Rights Reserved.
Use is subject to license terms.
  Assembled 18 May 2007
$

I have everything ready to go for another build and I'll fire that off
shortly. It takes over 14 hours on this appliance, and then I can run the
BFU/ACR boot+pray sequence :-)

so .. I'll check in about 15 hours from now.

Dennis



[osol-discuss] panic[cpu0]/thread=fec1f2e0: assertion failed

2007-07-07 Thread Dennis Clarke

after BFU of snv_68 :

module /platform/i86pc/kernel//unix: text at [0xfe80, 0xfe8d4a8b] data
at 0xfec0
module /kernel/genunix: text at [0xfe8d4a90, 0xfead88ff] data at 0xfec4cdc0

panic[cpu0]/thread=fec1f2e0: assertion failed: ((((l2cache_assoc ?
(l2cache_sz / l2cache_assoc) : 0x1000)) & (((l2cache_assoc ? (l2cache_sz /
l2cache_assoc) : 0x1000)) - 1)) == 0), file

fec382d0 genunix:assfail+5a (fe8c936c, fe8c95ec,)
fec38300 unix:page_coloring_init+35a (2, 40, a)
fec38358 unix:startup_memlist+3f5 (fec38384, fe954503,)
fec38360 unix:startup+1c (fe800010, fec34128,)
fec38384 genunix:main+5b ()


The system reboots right away ... I select Solaris failsafe,
and that seems to work :

module /platform/i86pc/kernel/unix: text at [0xfe80, 0xfe8abc47] data at
0xfec0
module /kernel/genunix: text at [0xfe8abc48, 0xfe9f1887] data at 0xfec4a840
SunOS Release 5.11 Version snv_64a 32-bit
Copyright 1983-2007 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
features:
1007effcpuid,sse2,sse,sep,pat,cx8,pae,mmx,cmov,de,pge,mtrr,msr,tsc,lgpg
Using default device instance data
cpu0: initialized cpu module 'cpu.generic'
mem = 981500K (0x3be7f000)
root nexus = i86pc
pseudo0 at root
pseudo0 is /pseudo
scsi_vhci0 at root
scsi_vhci0 is /scsi_vhci
isa0 at root
ramdisk0 at root
ramdisk0 is /ramdisk
SMBIOS v2.3 loaded (902 bytes)
pseudo-device: dld0
dld0 is /pseudo/[EMAIL PROTECTED]
pci0 at root: space 0 offset 0
pci0 is /[EMAIL PROTECTED],0
PCI-device: pci1106,[EMAIL PROTECTED], pci_pci0
pci_pci0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED]
ISA-device: asy0
asy0 is /isa/[EMAIL PROTECTED],3f8
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4 (ehci0): Unable to take 
control from BIOS.
Failure is ignored.
PCI-device: pci1106,[EMAIL PROTECTED],4, ehci0
ehci0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4
PCI-device: pci1106,[EMAIL PROTECTED], uhci0
uhci0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED]
PCI-device: pci1106,[EMAIL PROTECTED],1, uhci1
uhci1 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],1
PCI-device: pci1106,[EMAIL PROTECTED],2, uhci2
uhci2 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],2
PCI-device: pci1106,[EMAIL PROTECTED],3, uhci3
uhci3 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],3
cpu0: x86 (CentaurHauls 6A9 family 6 model 10 step 9 clock 1200 MHz)
cpu0: VIA Esther processor 1200MHz
pseudo-device: tzmon0
tzmon0 is /pseudo/[EMAIL PROTECTED]
USB 2.0 device (usb951,1600) operating at hi speed (USB 2.x) on USB 2.0 root
hub: [EMAIL PROTECTED], scsa2usb0 at bus address 2
Kingston DataTraveler II  5B571B0016A3
scsa2usb0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED] (scsa2usb0) 
online
sd0 at scsa2usb0: target 0 lun 0
sd0 is /[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL 
PROTECTED]/[EMAIL PROTECTED],0
/[EMAIL PROTECTED],0/pci1106,[EMAIL PROTECTED],4/[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd0) online
Booting to milestone milestone/single-user:default.
Configuring /dev
pseudo-device: devinfo0
devinfo0 is /pseudo/[EMAIL PROTECTED]
xsvc0 at root: space 0 offset 0
xsvc0 is /[EMAIL PROTECTED],0
pseudo-device: pseudo1
pseudo1 is /pseudo/[EMAIL PROTECTED]
PCI-device: pci16f3,[EMAIL PROTECTED],5, audiovia823x0
audiovia823x0 is /[EMAIL PROTECTED],0/pci16f3,[EMAIL PROTECTED],5
IDE device at targ 0, lun 0 lastlun 0x0
model ST340014A
ATA/ATAPI-6 supported, majver 0x7e minver 0x1b
ata_set_feature: (0x66,0x0) failed
ATAPI device at targ 1, lun 0 lastlun 0x0
model SAMSUNG CDRW/DVD SM-352N
PCI-device: [EMAIL PROTECTED], ata2
ata2 is /[EMAIL PROTECTED],0/[EMAIL PROTECTED],1/[EMAIL PROTECTED]
UltraDMA mode 2 selected
UltraDMA mode 2 selected
UltraDMA mode 2 selected
sd1 at ata2: target 1 lun 0
sd1 is /[EMAIL PROTECTED],0/[EMAIL PROTECTED],1/[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0
pseudo-device: fssnap0
fssnap0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: ramdisk1024
ramdisk1024 is /pseudo/[EMAIL PROTECTED]
pseudo-device: winlock0
winlock0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: fcp0
fcp0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: fcsm0
fcsm0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: llc10
llc10 is /pseudo/[EMAIL PROTECTED]
pseudo-device: lofi0
lofi0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: pool0
pool0 is /pseudo/[EMAIL PROTECTED]
pseudo-device: power0
power0 is /pseudo/[EMAIL PROTECTED]
cmdk0 at ata2 target 0 lun 0
cmdk0 is /[EMAIL PROTECTED],0/[EMAIL PROTECTED],1/[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0
pseudo-device: zfs0
zfs0 is /pseudo/[EMAIL PROTECTED]
Searching for installed OS instances...

Solaris Nevada snv_64a X86 was found on /dev/dsk/c0d0s0.
Do you wish to have it mounted read-write on /a? [y,n,?] y
mounting /dev/dsk/c0d0s0 on /a

Starting shell.
#


 .. too much to hope for a core dump ...

# fsck -F ufs -Y /dev/rdsk/c0d0s1
** /dev/rdsk/c0d0s1
** Last Mounted on /var
** Phase 1 - Check Blocks and