Re: installer amd64 'Get/Verify bsd' -> 'Illegal instruction' - shuttle ds47d
> Date: Tue, 2 Feb 2016 09:43:54 -0800
> From: Philip Guenther
>
> On Tue, 2 Feb 2016, Philip Guenther wrote:
> ...
> > Currently we seem to assume that the presence of certain CPU features like
> > AVX implies that CPUID supports the related leaf; that BIOS option breaks
> > that assumption, resulting in the bogus fpu_save_len sizing you hit.
> > From the dmesg you posted I see it also explains the bogus mwait sizing
> > that has been reported by some others.  Your machine will perform better
> > with that option off; I guess we should add check to the code to catch
> > this sort of setup by checking the cpuid_level variable before using the
> > higher CPUID leafs.
>
> Revised version that switches a few places to check cpuid_level instead of
> calling CPUID(0) again and similar for using curcpu()->ci_pnfeatset
> instead of calling CPUID(0x8000) once identifycpu() sets that, and add
> a check of ci->ci_pnfeatset before using CPUID(CPUID_AMD_SVM_CAP) in the
> vmm bits.
>
> ok?

ok kettenis@

> Index: i386/i386/cpu.c
> ===
> RCS file: /data/src/openbsd/src/sys/arch/i386/i386/cpu.c,v
> retrieving revision 1.70
> diff -u -p -r1.70 cpu.c
> --- i386/i386/cpu.c 27 Dec 2015 04:31:34 - 1.70
> +++ i386/i386/cpu.c 2 Feb 2016 16:54:09 -
> @@ -784,7 +784,7 @@ cpu_init_mwait(struct device *dv)
>  {
>  	unsigned int smallest, largest, extensions, c_substates;
>
> -	if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0)
> +	if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0 || cpuid_level < 0x5)
>  		return;
>
>  	/* get the monitor granularity */
> Index: amd64/amd64/amd64_mem.c
> ===
> RCS file: /data/src/openbsd/src/sys/arch/amd64/amd64/amd64_mem.c,v
> retrieving revision 1.11
> diff -u -p -r1.11 amd64_mem.c
> --- amd64/amd64/amd64_mem.c 14 Mar 2015 03:38:46 - 1.11
> +++ amd64/amd64/amd64_mem.c 2 Feb 2016 17:37:55 -
> @@ -583,8 +583,7 @@ mrinit(struct mem_range_softc *sc)
>  	 * If CPUID does not support leaf function 0x8008, use the
>  	 * default a 36-bit address size.
>  	 */
> -	CPUID(0x8000, regs[0], regs[1], regs[2], regs[3]);
> -	if (regs[0] >= 0x8008) {
> +	if (curcpu()->ci_pnfeatset >= 0x8008) {
>  		CPUID(0x8008, regs[0], regs[1], regs[2], regs[3]);
>  		if (regs[0] & 0xff) {
>  			mtrrmask = (1ULL << (regs[0] & 0xff)) - 1;
> Index: amd64/amd64/cacheinfo.c
> ===
> RCS file: /data/src/openbsd/src/sys/arch/amd64/amd64/cacheinfo.c,v
> retrieving revision 1.7
> diff -u -p -r1.7 cacheinfo.c
> --- amd64/amd64/cacheinfo.c 13 Nov 2015 07:52:20 - 1.7
> +++ amd64/amd64/cacheinfo.c 2 Feb 2016 17:36:11 -
> @@ -159,7 +159,6 @@ amd_cpu_cacheinfo(struct cpu_info *ci)
>  	struct x86_cache_info *cai;
>  	int family, model;
>  	u_int descs[4];
> -	u_int lfunc;
>
>  	family = ci->ci_family;
>  	model = ci->ci_model;
> @@ -171,15 +170,9 @@ amd_cpu_cacheinfo(struct cpu_info *ci)
>  		return;
>
>  	/*
> -	 * Determine the largest extended function value.
> -	 */
> -	CPUID(0x8000, descs[0], descs[1], descs[2], descs[3]);
> -	lfunc = descs[0];
> -
> -	/*
>  	 * Determine L1 cache/TLB info.
>  	 */
> -	if (lfunc < 0x8005) {
> +	if (ci->ci_pnfeatset < 0x8005) {
>  		/* No L1 cache info available. */
>  		return;
>  	}
> @@ -228,7 +221,7 @@ amd_cpu_cacheinfo(struct cpu_info *ci)
>  	/*
>  	 * Determine L2 cache/TLB info.
>  	 */
> -	if (lfunc < 0x8006) {
> +	if (ci->ci_pnfeatset < 0x8006) {
>  		/* No L2 cache info available. */
>  		return;
>  	}
> Index: amd64/amd64/cpu.c
> ===
> RCS file: /data/src/openbsd/src/sys/arch/amd64/amd64/cpu.c,v
> retrieving revision 1.94
> diff -u -p -r1.94 cpu.c
> --- amd64/amd64/cpu.c 27 Dec 2015 04:31:34 - 1.94
> +++ amd64/amd64/cpu.c 2 Feb 2016 17:03:04 -
> @@ -282,7 +282,7 @@ cpu_init_mwait(struct cpu_softc *sc)
>  {
>  	unsigned int smallest, largest, extensions, c_substates;
>
> -	if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0)
> +	if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0 || cpuid_level < 0x5)
>  		return;
>
>  	/* get the monitor granularity */
> @@ -505,7 +505,7 @@ cpu_init(struct cpu_info *ci)
>  	cr4 |= CR4_OSXSAVE;
>  	lcr4(cr4);
>
> -	if (cpu_ecxfeature & CPUIDECX_XSAVE) {
> +	if ((cpu_ecxfeature & CPUIDECX_XSAVE) && cpuid_level >= 0xd) {
>  		u_int32_t eax, ebx, ecx, edx;
>
>  		xsave_mask = XCR0_X87 | XCR0_SSE;
> Index: amd64/amd64/identcpu.c
>
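To illustrate the guard the patch above adds, here is a minimal userland sketch. It is not the kernel code: struct fake_cpu, FEAT_XSAVE and xsave_area_len() are hypothetical names made up for this example, and the CPUID instruction is replaced by plain struct fields so the sketch runs on any machine. It assumes the situation described in this thread, where the XSAVE feature bit is set while the highest basic leaf reported by CPUID(0) is below 0xd, and it assumes that falling back to the 512-byte FXSAVE size is the right default.

```c
#include <stdint.h>

#define FEAT_XSAVE	0x1u	/* hypothetical stand-in for CPUIDECX_XSAVE */
#define FXSAVE_LEN	512u	/* legacy save area size, as in the panic message */

/* Hypothetical model of the values recorded during CPU identification. */
struct fake_cpu {
	uint32_t cpuid_level;	/* highest basic leaf, from CPUID(0) */
	uint32_t ecxfeature;	/* feature bits from leaf 1 */
	uint32_t leaf_d_ebx;	/* what a CPUID(0xd) query would leave in ebx */
};

/*
 * Only trust the leaf 0xd result when the CPU says that leaf exists;
 * otherwise keep the legacy size (an assumption of this sketch).
 */
uint32_t
xsave_area_len(const struct fake_cpu *cpu)
{
	if ((cpu->ecxfeature & FEAT_XSAVE) && cpu->cpuid_level >= 0xd)
		return cpu->leaf_d_ebx;
	return FXSAVE_LEN;
}
```

With the feature bit set but cpuid_level below 0xd, the bogus 0xf0b0ff value seen in this thread is never used as a length; with cpuid_level at 0xd or above the reported size is taken as-is.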
drm panic
Hello, Yesterday, at the point that I did a mouse drag in gvim, my X session died, and I was dumped into the console with the following message appearing a split second later: error: [drm:pid29704:i915_reset] *ERROR* Failed to reset chip: -60 At that point, the machine was completely locked (keyboard didn't respond etc., so no ddb output) and I had to hard reboot. [Possibly relevant information: I had ZZZ'd this machine once before this happened, so the machine had come back after resuming from hibernation.] I'm not sure if the gvim connection is relevant or not. There is also an intermittent bug where gvim causes X to reset, but I've never seen it lock up the machine in this way before. Laurie dmesg: OpenBSD 5.9-beta (GENERIC.MP) #1864: Mon Jan 25 19:11:29 MST 2016 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP real mem = 8476475392 (8083MB) avail mem = 8215384064 (7834MB) mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xeb170 (52 entries) bios0: vendor Intel Corp. version "BLH6710H.86A.0160.2012.1204.1156" date 12/04/2012 bios0: TranquilPC IXL acpi0 at bios0: rev 2 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP APIC SSDT MCFG HPET acpi0: wakeup devices PS2K(S3) PS2M(S3) UAR1(S3) P0P1(S4) P0P2(S4) P0P3(S4) P0P4(S4) GBE_(S4) BR20(S3) EUSB(S3) USBE(S3) PEX0(S4) BR21(S4) PEX1(S4) PEX2(S4) PEX3(S4) [...] 
acpitimer0 at acpi0: 3579545 Hz, 24 bits acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Core(TM) i7-2600S CPU @ 2.80GHz, 2794.10 MHz cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,LONG,LAHF,PERF,ITSC,SENSOR,ARAT cpu0: 256KB 64b/line 8-way L2 cache cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges cpu0: apic clock running at 99MHz cpu0: mwait min=64, max=64, C-substates=0.2.1.1, IBE cpu1 at mainbus0: apid 2 (application processor) cpu1: Intel(R) Core(TM) i7-2600S CPU @ 2.80GHz, 2793.66 MHz cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,LONG,LAHF,PERF,ITSC,SENSOR,ARAT cpu1: 256KB 64b/line 8-way L2 cache cpu1: smt 0, core 1, package 0 cpu2 at mainbus0: apid 1 (application processor) cpu2: Intel(R) Core(TM) i7-2600S CPU @ 2.80GHz, 2793.66 MHz cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,LONG,LAHF,PERF,ITSC,SENSOR,ARAT cpu2: 256KB 64b/line 8-way L2 cache cpu2: smt 1, core 0, package 0 cpu3 at mainbus0: apid 3 (application processor) cpu3: Intel(R) Core(TM) i7-2600S CPU @ 2.80GHz, 2793.66 MHz cpu3: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,LONG,LAHF,PERF,ITSC,SENSOR,ARAT cpu3: 256KB 64b/line 8-way L2 cache cpu3: smt 1, core 1, package 0 ioapic0 at mainbus0: apid 0 pa 0xfec0, version 20, 24 pins acpimcfg0 at acpi0 addr 0xf800, bus 0-63 acpihpet0 at acpi0: 14318179 Hz acpiprt0 at acpi0: bus 0 (PCI0) acpiprt1 at acpi0: bus -1 (P0P1) acpiprt2 at acpi0: bus -1 (P0P2) acpiprt3 at acpi0: bus -1 (P0P3) acpiprt4 at acpi0: bus -1 (P0P4) acpiprt5 at acpi0: bus 1 (PEX0) acpiprt6 at acpi0: bus -1 (BR21) acpiprt7 at acpi0: bus 2 (PEX1) acpiprt8 at acpi0: bus -1 (PEX2) acpiprt9 at acpi0: bus -1 (PEX3) acpiprt10 at acpi0: bus -1 (PEX4) acpiprt11 at acpi0: bus -1 (PEX5) acpiprt12 at acpi0: bus -1 (PEX6) acpiprt13 at acpi0: bus -1 (PEX7) acpicpu0 at acpi0 0x800a4008 cnt:01 stk:00 package: 06 0x800a3a88 cnt:01 stk:00 integer: 6 0x8009fc08 cnt:01 stk:00 integer: 0 0x800a4d88 cnt:01 stk:00 integer: 0 0x800a4d08 cnt:01 stk:00 integer: fe 0x800a1508 cnt:01 stk:00 integer: 2 0x800a1308 cnt:01 stk:00 integer: 2 CSD r=0 d=0 c=fe n=2 i=2 : C3(350@104 mwait.3@0x20), C2(500@80 mwait.3@0x10), C1(1000@1 mwait.1), PSS acpicpu1 at acpi0 0x8009f188 cnt:01 stk:00 package: 06 0x8009f308 cnt:01 stk:00 integer: 6 0x800a1a08 cnt:01 stk:00 integer: 0 0x800a1e08 cnt:01 stk:00 integer: 0 0x800a1488 cnt:01 stk:00 integer: fe 0x8009f388 cnt:01 stk:00 integer: 2 0x800a4e08 cnt:01 stk:00 integer: 2 CSD r=0 d=0 c=fe n=2 i=2 : C3(350@104 mwait.3@0x20), C2(500@80 mwait.3@0x10), C1(1000@1 mwait.1), PSS acpicpu2
Re: UEFI Boot Report: Screen corruption and kernel panic
On Tue, Feb 02, 2016 at 10:27:24PM +1100, Jonathan Gray wrote:
> On Tue, Feb 02, 2016 at 05:49:33AM -0500, James Hastings wrote:
> > On 2/2/16, Jonathan Gray wrote:
> > > On Tue, Feb 02, 2016 at 03:56:13AM -0500, James Hastings wrote:
> > >> On 2/2/16, Jonathan Gray wrote:
> > >> >
> > >> > The bios may have to be fetched from the acpi VFCT table for the uefi
> > >> > case.
> > >> >
> > >> > Here's a quick attempt at trying to avoid the crash at least:
> > >> >
> > >>
> > >> Different panic this time.
> > >
> > > Thanks.  Here is a version modified to have a better check for efi.
> > > Still may need some things shuffled around to deal with the root hook.
> > >
> >
> > Identical panic again.
> >
> > Thinking out loud; could these two issues be caused by bad information
> > passed by the bootloader or firmware? memory maps? framebuffer address?
>
> It's because one of the ways of getting the video bios is to check where
> it has long been mapped and reading out of that memory.
>
> It sounds like your machine has enough of it there to convince the code
> that checks for a signature but not all of the expected 256k of memory
> after 0xc is actually mapped.
>
> Putting a "return false;" at the top of radeon_read_platform_bios()
> should prevent this method from being tried entirely.
>

Can you try this to limit the size?

Index: radeon_bios.c
===
RCS file: /cvs/src/sys/dev/pci/drm/radeon/radeon_bios.c,v
retrieving revision 1.6
diff -u -p -r1.6 radeon_bios.c
--- radeon_bios.c 12 Apr 2015 12:14:30 - 1.6
+++ radeon_bios.c 2 Feb 2016 21:30:25 -
@@ -48,11 +48,10 @@ radeon_read_platform_bios(struct radeon_
 {
 #if defined(__amd64__) || defined(__i386__)
 	uint8_t __iomem *bios;
-	bus_size_t size = 256 * 1024; /* ??? */
+	bus_size_t size = 0x2;
 	uint8_t *found = NULL;
 	int i;
-
-
+
 	if (!(rdev->flags & RADEON_IS_IGP))
 		if (!radeon_card_posted(rdev))
 			return false;
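The failure mode described above (a valid signature, but less memory mapped than the code goes on to read) can be sketched in plain C. This is only an illustration and not what radeon_read_platform_bios() does: rom_image_len() is a hypothetical helper, and it relies on the legacy PC option ROM header layout, where bytes 0 and 1 are 0x55 0xAA and byte 2 gives the image size in 512-byte units. The idea is to take the length from the header and refuse anything that would run past the window that is actually mapped.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the usable length of an option ROM image, or 0 if the buffer
 * does not start with the 0x55 0xAA signature or the image claims to
 * be larger than the mapped window.  Hypothetical helper.
 */
size_t
rom_image_len(const uint8_t *rom, size_t mapped_len)
{
	size_t len;

	if (mapped_len < 3)
		return 0;
	if (rom[0] != 0x55 || rom[1] != 0xaa)
		return 0;

	/* byte 2 of the legacy header: image size in 512-byte units */
	len = (size_t)rom[2] * 512;
	if (len == 0 || len > mapped_len)
		return 0;
	return len;
}
```

A caller would copy at most the returned number of bytes, so a signature match alone never justifies reading a fixed 256k.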
Re: UEFI Boot Report: Screen corruption and kernel panic
On 2/2/16, Jonathan Gray wrote:
> On Tue, Feb 02, 2016 at 10:27:24PM +1100, Jonathan Gray wrote:
>> On Tue, Feb 02, 2016 at 05:49:33AM -0500, James Hastings wrote:
>> > On 2/2/16, Jonathan Gray wrote:
>> > > On Tue, Feb 02, 2016 at 03:56:13AM -0500, James Hastings wrote:
>> > >> On 2/2/16, Jonathan Gray wrote:
>> > >> >
>> > >> > The bios may have to be fetched from the acpi VFCT table for the
>> > >> > uefi case.
>> > >> >
>> > >> > Here's a quick attempt at trying to avoid the crash at least:
>> > >> >
>> > >>
>> > >> Different panic this time.
>> > >
>> > > Thanks.  Here is a version modified to have a better check for efi.
>> > > Still may need some things shuffled around to deal with the root
>> > > hook.
>> > >
>> >
>> > Identical panic again.
>> >
>> > Thinking out loud; could these two issues be caused by bad information
>> > passed by the bootloader or firmware? memory maps? framebuffer address?
>>
>> It's because one of the ways of getting the video bios is to check where
>> it has long been mapped and reading out of that memory.
>>
>> It sounds like your machine has enough of it there to convince the code
>> that checks for a signature but not all of the expected 256k of memory
>> after 0xc is actually mapped.
>>
>> Putting a "return false;" at the top of radeon_read_platform_bios()
>> should prevent this method from being tried entirely.
>>
>
> Can you try this to limit the size?
>
> Index: radeon_bios.c
> ===
> RCS file: /cvs/src/sys/dev/pci/drm/radeon/radeon_bios.c,v
> retrieving revision 1.6
> diff -u -p -r1.6 radeon_bios.c
> --- radeon_bios.c 12 Apr 2015 12:14:30 - 1.6
> +++ radeon_bios.c 2 Feb 2016 21:30:25 -
> @@ -48,11 +48,10 @@ radeon_read_platform_bios(struct radeon_
>  {
>  #if defined(__amd64__) || defined(__i386__)
>  	uint8_t __iomem *bios;
> -	bus_size_t size = 256 * 1024; /* ???
*/ > + bus_size_t size = 0x2; > uint8_t *found = NULL; > int i; > - > - > + > if (!(rdev->flags & RADEON_IS_IGP)) > if (!radeon_card_posted(rdev)) > return false; > slightly different panic OpenBSD 5.9 (GENERIC.MP) #1: Tue Feb 2 16:50:43 EST 2016 r...@cq58.example.test:/usr/src/sys/arch/amd64/compile/GENERIC.MP RTC BIOS diagnostic error 80 real mem = 1690714112 (1612MB) avail mem = 1635336192 (1559MB) mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 2.7 @ 0x66abc000 (45 entries) bios0: vendor Insyde version "F.65" date 06/04/2014 bios0: Hewlett-Packard Compaq CQ58 Notebook PC acpi0 at bios0: rev 2 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP UEFI HPET APIC MCFG ASF! BOOT SPCR WDRT WDAT FPDT MSDM SSDT SSDT VFCT BGRT acpi0: wakeup devices PB6_(S4) SPB0(S4) XPDV(S4) SPB1(S4) SPB3(S4) GEC_(S4) OHC1(S3) OHC2(S3) OHC3(S3) OHC4(S3) EHC1(S3) EHC2(S3) EHC3(S3) P2P_(S5) acpitimer0 at acpi0: 3579545 Hz, 32 bits acpihpet0 at acpi0: 14318180 Hz acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: AMD C-60 APU with Radeon(tm) HD Graphics, 998.71 MHz cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,MWAIT,SSSE3,CX16,POPCNT,NXE,MMXX,FFXSR,PAGE1GB,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,IBS,SKINIT,ITSC cpu0: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 16-way L2 cache cpu0: 8 4MB entries fully associative cpu0: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges cpu0: apic clock running at 199MHz cpu0: mwait min=64, max=64, IBE cpu1 at mainbus0: apid 1 (application processor) cpu1: AMD C-60 APU with Radeon(tm) HD Graphics, 997.87 MHz cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,MWAIT,SSSE3,CX16,POPCNT,NXE,MMXX,FFXSR,PAGE1GB,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,IBS,SKINIT,ITSC cpu1: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 16-way L2 cache cpu1: 8 4MB entries fully associative cpu1: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative cpu1: smt 0, core 1, package 0 ioapic0 at mainbus0: apid 4 pa 0xfec0, version 21, 24 pins ioapic0: misconfigured as apic 0, remapped to apid 4 acpimcfg0 at acpi0 addr 0xf800, bus 0-63 acpiprt0 at acpi0: bus 0 (PCI0) acpiprt1 at acpi0: bus -1 (PB4_) acpiprt2 at acpi0: bus -1 (PB5_) acpiprt3 at acpi0: bus -1 (PB6_) acpiprt4 at acpi0: bus -1 (PB7_) acpiprt5 at acpi0: bus 2 (SPB0) acpiprt6 at acpi0: bus 6 (SPB1) acpiprt7 at acpi0: bus 7 (SPB2) acpiprt8 at acpi0: bus -1 (SPB3) acpiprt9 at acpi0: bus 1 (P2P_) acpiec0 at acpi0 acpicpu0 at acpi0: C2(0@100 io@0xf800), C1(@1 halt!), PSS acpicpu1 at acpi0:
Re: UEFI Boot Report: Screen corruption and kernel panic
On 2/2/16, Mark Kettenis wrote:
>> Date: Tue, 2 Feb 2016 21:32:13 +0100 (CET)
>> From: Mark Kettenis
>>
>> > Date: Tue, 2 Feb 2016 14:14:04 -0500
>> > From: James Hastings
>> >
>> > Native screen size is 1366x768
>> >
>> > Results:
>> > ei.config_acpi: 0x66bfe014
>> > ei.config_smbios: 0x66abef98
>> > ei.fb_addr: 0x8000
>> > ei.fb_size: 0x42
>> > ei.fb_width: 1024
>> > ei.fb_height: 768
>> > ei.fb_pixpsl: 1024
>>
>> Right.  Looks like the firmware is giving us bogus values.  But
>> perhaps there is something subtly wrong with our bootloader code.
>
> Does the diff below help?
>
> Index: efiboot.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/stand/efiboot/efiboot.c,v
> retrieving revision 1.10
> diff -u -p -r1.10 efiboot.c
> --- efiboot.c 26 Nov 2015 20:26:20 - 1.10
> +++ efiboot.c 2 Feb 2016 21:26:49 -
> @@ -526,7 +526,7 @@ efi_makebootargs(void)
>  			bestsiz = gopsiz;
>  		}
>  	}
> -	if (bestmode >= 0 && conout->Mode->Mode != bestmode) {
> +	if (bestmode >= 0) {
>  		status = EFI_CALL(gop->SetMode, gop, bestmode);
>  		if (EFI_ERROR(status))
>  			printf("GOP setmode failed(%d)\n", status);
>

Yes! Now I can read the console. Looks great. Thank you.
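For readers following the diff: the context lines (bestsiz = gopsiz) suggest the loop before it keeps the largest mode it has seen, and the change makes the bootloader always set that mode once one is found. A rough standalone sketch of such a selection follows. It is an assumption-laden illustration, not the efiboot code: struct video_mode and pick_best_mode() are hypothetical, and "size" is assumed here to mean width times height.

```c
#include <stddef.h>

struct video_mode {		/* hypothetical stand-in for GOP mode info */
	unsigned int width;
	unsigned int height;
};

/* Return the index of the mode with the most pixels, or -1 if none. */
int
pick_best_mode(const struct video_mode *modes, size_t nmodes)
{
	unsigned long size, bestsiz = 0;
	int bestmode = -1;
	size_t i;

	for (i = 0; i < nmodes; i++) {
		size = (unsigned long)modes[i].width * modes[i].height;
		if (size > bestsiz) {
			bestmode = (int)i;
			bestsiz = size;
		}
	}
	return bestmode;
}
```

In this sketch the caller would switch to the returned mode whenever it is >= 0, without comparing the index against any other mode numbering, which mirrors the condition left after the diff.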
Re: ral(4) leaks mbufs instead of setting oactive
On Sat, Jan 30, 2016 at 10:49:38PM +1300, Richard Procter wrote:
> -	ring->queued--;
> +	atomic_dec_int(&ring->queued);

> -	ring->queued += ntxds;
> +	atomic_add_int(&ring->queued, ntxds);

I don't think these make a difference in the current way of things.
Wireless drivers run interrupts under the kernel big lock, interrupts
aren't preemptible, and AFAIK (most?) 32bit integer operations are atomic.

> * fix dropped frames, watchdog timeouts.
>
>   - on full tx ring, ring->cur wraps to an active tx descriptor. Passing
>     that wrapped value to the card was observed to cause general flakiness.

Nice find.

> * replace custom defrag with m_defrag()

Also good. We've already done this in some other drivers.

> +	/* XXX card may interrupt here and invalidate this guard; the

You can easily prevent the card from interrupting by making rt2860_tx()
call splnet() before modifying data shared with the interrupt handler.
I think that's the real bug you're looking for.
Could you try that and send an updated diff if it works?

Thank you very much for your work on this. I wrote some fixes for the
rt2560 line of ral chips some years ago fixing similar driver bugs.
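The "ring->cur wraps to an active tx descriptor" problem can be shown with a small model. This is a sketch under stated assumptions, not the rt2860 driver: struct tx_ring, TX_RING_COUNT, tx_ring_reserve() and tx_ring_complete() are hypothetical, with only the field names cur and queued borrowed from the quoted diff. The point is that the free-space check happens before cur is advanced, so a full ring is refused rather than wrapped.

```c
#include <stdbool.h>

#define TX_RING_COUNT	64	/* hypothetical ring size */

struct tx_ring {		/* hypothetical model of a tx ring */
	unsigned int cur;	/* next descriptor to fill */
	unsigned int queued;	/* descriptors still owned by the card */
};

/* Reserve ntxds descriptors; refuse instead of wrapping onto active ones. */
bool
tx_ring_reserve(struct tx_ring *ring, unsigned int ntxds)
{
	if (ntxds > TX_RING_COUNT - ring->queued)
		return false;	/* ring full: caller would set oactive */
	ring->queued += ntxds;
	ring->cur = (ring->cur + ntxds) % TX_RING_COUNT;
	return true;
}

/* Called when the card reports one descriptor as transmitted. */
void
tx_ring_complete(struct tx_ring *ring)
{
	if (ring->queued > 0)
		ring->queued--;
}
```

When tx_ring_reserve() returns false the caller keeps the packet queued and waits for completions, rather than dropping it or handing the card a wrapped index.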
Re: ral(4) leaks mbufs instead of setting oactive
On Tue, Feb 02, 2016 at 09:14:06AM +0100, Stefan Sperling wrote:
> On Sat, Jan 30, 2016 at 10:49:38PM +1300, Richard Procter wrote:
> > +	atomic_add_int(&ring->queued, ntxds);
> > +	/* XXX card may interrupt here and invalidate this guard; the
>
> You can easily prevent the card from interrupting by making rt2860_tx()
> call splnet() before modifying data shared with the interrupt handler.
> I think that's the real bug you're looking for.
> Could you try that and send an updated diff if it works?

Hmm. Taking a closer look, if_start() is already called under splnet.
So adding splnet to rt2860_tx() shouldn't make a difference.

This also means the card cannot interrupt in the way your comment
describes, i.e. the problem you're "fixing" here cannot exist... ?
Re: UEFI Boot Report: Screen corruption and kernel panic
On 2/2/16, Jonathan Graywrote: > > The bios may have to be fetched from the acpi VFCT table for the uefi case. > > Here's a quick attempt at trying to avoid the crash at least: > Different panic this time. OpenBSD 5.9 (GENERIC.MP) #0: Tue Feb 2 03:30:32 EST 2016 r...@cq58.example.test:/usr/src/sys/arch/amd64/compile/GENERIC.MP RTC BIOS diagnostic error 80 real mem = 1690714112 (1612MB) avail mem = 1635336192 (1559MB) mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 2.7 @ 0x66abc000 (45 entries) bios0: vendor Insyde version "F.65" date 06/04/2014 bios0: Hewlett-Packard Compaq CQ58 Notebook PC acpi0 at bios0: rev 2 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP UEFI HPET APIC MCFG ASF! BOOT SPCR WDRT WDAT FPDT MSDM SSDT SSDT VFCT BGRT acpi0: wakeup devices PB6_(S4) SPB0(S4) XPDV(S4) SPB1(S4) SPB3(S4) GEC_(S4) OHC1(S3) OHC2(S3) OHC3(S3) OHC4(S3) EHC1(S3) EHC2(S3) EHC3(S3) P2P_(S5) acpitimer0 at acpi0: 3579545 Hz, 32 bits acpihpet0 at acpi0: 14318180 Hz acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: AMD C-60 APU with Radeon(tm) HD Graphics, 998.72 MHz cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,MWAIT,SSSE3,CX16,POPCNT,NXE,MMXX,FFXSR,PAGE1GB,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,IBS,SKINIT,ITSC cpu0: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 16-way L2 cache cpu0: 8 4MB entries fully associative cpu0: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges cpu0: apic clock running at 199MHz cpu0: mwait min=64, max=64, IBE cpu1 at mainbus0: apid 1 (application processor) cpu1: AMD C-60 APU with Radeon(tm) HD Graphics, 997.88 MHz cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,MWAIT,SSSE3,CX16,POPCNT,NXE,MMXX,FFXSR,PAGE1GB,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,IBS,SKINIT,ITSC cpu1: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 16-way L2 cache cpu1: 8 4MB entries fully associative cpu1: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative cpu1: smt 0, core 1, package 0 ioapic0 at mainbus0: apid 4 pa 0xfec0, version 21, 24 pins ioapic0: misconfigured as apic 0, remapped to apid 4 acpimcfg0 at acpi0 addr 0xf800, bus 0-63 acpiprt0 at acpi0: bus 0 (PCI0) acpiprt1 at acpi0: bus -1 (PB4_) acpiprt2 at acpi0: bus -1 (PB5_) acpiprt3 at acpi0: bus -1 (PB6_) acpiprt4 at acpi0: bus -1 (PB7_) acpiprt5 at acpi0: bus 2 (SPB0) acpiprt6 at acpi0: bus 6 (SPB1) acpiprt7 at acpi0: bus 7 (SPB2) acpiprt8 at acpi0: bus -1 (SPB3) acpiprt9 at acpi0: bus 1 (P2P_) acpiec0 at acpi0 acpicpu0 at acpi0: C2(0@100 io@0xf800), C1(@1 halt!), PSS acpicpu1 at acpi0: C2(0@100 io@0xf800), C1(@1 halt!), PSS acpipwrres0 at acpi0: FN00, resource for FAN0 acpitz0 at acpi0: critical temperature is 125 degC acpibtn0 at acpi0: PWRB acpiac0 at acpi0: AC unit online acpibat0 at acpi0: BAT0 not present acpibtn1 at acpi0: LID_ acpivideo0 at acpi0: VGA_ acpivideo1 at acpi0: VGA_ cpu0: 998 MHz: speeds: 1000 800 MHz pci0 at mainbus0 bus 0 pchb0 at pci0 dev 0 function 0 "AMD AMD64 14h Host" rev 0x00 radeondrm0 at pci0 dev 1 function 0 "ATI Radeon HD 6290" rev 0x00 drm0 at radeondrm0 radeondrm0: msi ahci0 at pci0 dev 17 function 0 "ATI SBx00 SATA" rev 0x00: apic 4 int 19, AHCI 1.2 ahci0: port 0: 3.0Gb/s ahci0: port 2: 1.5Gb/s scsibus1 at ahci0: 32 targets sd0 at scsibus1 targ 0 lun 0: SCSI3 0/direct fixed naa.5000c500536c9072 sd0: 305245MB, 512 bytes/sector, 625142448 sectors cd0 at scsibus1 targ 2 lun 0: ATAPI 5/cdrom removable ohci0 at pci0 dev 18 function 0 "ATI SB700 USB" rev 0x00: apic 4 int 18, version 1.0, legacy 
support ehci0 at pci0 dev 18 function 2 "ATI SB700 USB2" rev 0x00: apic 4 int 17 usb0 at ehci0: USB revision 2.0 uhub0 at usb0 "ATI EHCI root hub" rev 2.00/1.00 addr 1 ohci1 at pci0 dev 19 function 0 "ATI SB700 USB" rev 0x00: apic 4 int 18, version 1.0, legacy support ehci1 at pci0 dev 19 function 2 "ATI SB700 USB2" rev 0x00: apic 4 int 17 usb1 at ehci1: USB revision 2.0 uhub1 at usb1 "ATI EHCI root hub" rev 2.00/1.00 addr 1 piixpm0 at pci0 dev 20 function 0 "ATI SBx00 SMBus" rev 0x42: polling iic0 at piixpm0 spdmem0 at iic0 addr 0x51: 2GB DDR3 SDRAM PC3-12800 SO-DIMM azalia0 at pci0 dev 20 function 2 "ATI SBx00 HD Audio" rev 0x40: apic 4 int 16 azalia0: codecs: IDT 92HD81B1X audio0 at azalia0 pcib0 at pci0 dev 20 function 3 "ATI SB700 ISA" rev 0x40 ppb0 at pci0 dev 20 function 4 "ATI SB600 PCI" rev 0x40 pci1 at ppb0 bus 1 ppb1 at pci0 dev 21 function 0 "ATI SB800 PCIE" rev 0x00 pci2 at ppb1 bus 2 re0 at pci2 dev 0 function 0 "Realtek 8101E" rev 0x05: RTL8105E (0x4080), msi, address c8:cb:b8:01:0d:22 rlphy0 at
Re: ral(4) leaks mbufs instead of setting oactive
On 02/02/16(Tue) 09:14, Stefan Sperling wrote:
> On Sat, Jan 30, 2016 at 10:49:38PM +1300, Richard Procter wrote:
> > -	ring->queued--;
> > +	atomic_dec_int(&ring->queued);
> >
> > -	ring->queued += ntxds;
> > +	atomic_add_int(&ring->queued, ntxds);
>
> I don't think these make a difference in the current way of things.

It does not, the wireless stack needs some love before wifi drivers
can have their interrupt handlers marked as mp-safe.

> Wireless drivers run interrupts under the kernel big lock, interrupts
> aren't preemptible, and AFAIK (most?) 32bit integer operations are atomic.

Adding/decrementing are not, because you need to read the existing
value and then write the modified one :)
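The read-then-write point can be spelled out in a few lines of C. This is a userland sketch only: plain_add() and counter_add_atomic() are hypothetical names, and C11 <stdatomic.h> is used as a stand-in for the kernel's atomic_add_int() quoted above, not as a description of how that function is implemented.

```c
#include <stdatomic.h>

/*
 * A plain "counter += n" is conceptually three steps.  If another
 * context updates the counter between step 1 and step 3, that update
 * is overwritten and lost.
 */
unsigned int
plain_add(unsigned int *counter, unsigned int n)
{
	unsigned int old = *counter;	/* 1. read the existing value */
	unsigned int sum = old + n;	/* 2. modify */

	*counter = sum;			/* 3. write the modified value back */
	return sum;
}

/* The same update done as one indivisible read-modify-write. */
unsigned int
counter_add_atomic(atomic_uint *counter, unsigned int n)
{
	return atomic_fetch_add(counter, n) + n;
}
```

A single aligned 32-bit load or store may well be atomic on its own; it is the combination of the two around the addition that is not, unless something else (a lock, a raised spl, or an atomic operation) keeps other contexts out.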
Re: installer amd64 'Get/Verify bsd' -> 'Illegal instruction' - shuttle ds47d
sisnk...@gmail.com (Stefan Kempf), 2016.02.01 (Mon) 19:13 (CET): > Marcus MERIGHI wrote: > > sisnk...@gmail.com (Stefan Kempf), 2016.01.30 (Sat) 10:49 (CET): > > > We need to see how it looks like from within the kernel (and whether > > > the illegal instruction is really raised from within sendsig()). Can you > > > try the diff below? > > > > > You should get a kernel panic now instead of an illegal instruction > > > signal if you try running ping or top. We need the output of the panic > > > message and the output of the following commands: > > > > ping(1), top(1) messed up the screen. > > > > # ping 192.168.188.189 > > PING 192.168.188.189 (192.168.188.189): 56 data bytes > > 64 bytes from 192.168.188.189: icmp_seq=0 ttl=255 time=166.533 ms > > panic: sendsig 1: fxsave 0x800032c8a000, sp 0x7f7fff0d20b1, > > fxave_size 512, savefpu_size 832, fpu_save_len 15773951, tf_rsp > > 0x7f7dd238, userstack 1 > > fpu_save_len is way too large (0xf0b0ff in hex). It should be 832 at > most. And that causes the kernel to attempt writes outside of the > process stack (and/or to read beyond the saved FPU state). > > Either the value we get from CPUID is strange (or we handle CPUID > wrongly), or something trashes fpu_save_len. > > Can you try this diff and paste the "cpuid1:" "cpuid2:" lines? Please > revert the previous diff. That will show us what CPUID returns. Twice in dmesg: cpuid1: ebx: 0xf0b0ff cpuid2: fpu_save_len: 0xf0b0ff, ebx: 0xf0b0ff cpuid1: ebx: 0xf0b0ff cpuid2: fpu_save_len: 0xf0b0ff, ebx: 0xf0b0ff Full dmesg attached. Thanks once more, Marcus OpenBSD 5.9 (GENERIC.MP) #3: Tue Feb 2 10:31:48 CET 2016 fifi@dax:/usr/src/sys/arch/amd64/compile/GENERIC.MP real mem = 4161052672 (3968MB) avail mem = 4030750720 (3844MB) mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xeb530 (73 entries) bios0: vendor American Megatrends Inc. version "1.03" date 08/09/2013 bios0: Shuttle Inc. 
DS47D acpi0 at bios0: rev 2 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP APIC FPDT MCFG SLIC HPET SSDT SSDT SSDT acpi0: wakeup devices P0P1(S4) USB1(S3) USB2(S3) USB3(S3) USB4(S3) USB5(S3) USB6(S3) USB7(S3) PXSX(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) RP04(S4) [...] acpitimer0 at acpi0: 3579545 Hz, 24 bits acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Celeron(R) CPU 847 @ 1.10GHz, 1097.68 MHz cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,XSAVE,NXE,LONG,LAHF,PERF,ITSC cpu0: 256KB 64b/line 8-way L2 cache mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges cpuid1: ebx: 0xf0b0ff cpuid2: fpu_save_len: 0xf0b0ff, ebx: 0xf0b0ff cpu0: apic clock running at 99MHz cpu0: mwait min=23041, max=45311 (bogus) cpu1 at mainbus0: apid 2 (application processor) cpu1: Intel(R) Celeron(R) CPU 847 @ 1.10GHz, 1097.51 MHz cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,XSAVE,NXE,LONG,LAHF,PERF,ITSC cpu1: 256KB 64b/line 8-way L2 cache ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 24 pins acpimcfg0 at acpi0 addr 0xf800, bus 0-63 acpihpet0 at acpi0: 14318179 Hz acpiprt0 at acpi0: bus 0 (PCI0) acpiprt1 at acpi0: bus -1 (P0P1) acpiprt2 at acpi0: bus 1 (RP01) acpiprt3 at acpi0: bus 2 (RP02) acpiprt4 at acpi0: bus 3 (RP03) acpiprt5 at acpi0: bus 4 (RP04) acpiprt6 at acpi0: bus -1 (RP05) acpiprt7 at acpi0: bus -1 (RP06) acpiprt8 at acpi0: bus -1 (RP07) acpiprt9 at acpi0: bus -1 (RP08) acpiprt10 at acpi0: bus -1 (PEG0) acpiprt11 at acpi0: bus -1 (PEG1) acpiprt12 at acpi0: bus -1 (PEG2) acpiprt13 at acpi0: bus -1 (PEG3) acpiec0 at acpi0: 
not present acpicpu0 at acpi0: C1(@1 halt!), PSS acpicpu1 at acpi0: C1(@1 halt!), PSS acpipwrres0 at acpi0: FN00, resource for FAN0 acpipwrres1 at acpi0: FN01, resource for FAN1 acpipwrres2 at acpi0: FN02, resource for FAN2 acpipwrres3 at acpi0: FN03, resource for FAN3 acpipwrres4 at acpi0: FN04, resource for FAN4 acpitz0 at acpi0: critical temperature is 101 degC acpitz1 at acpi0: critical temperature is 101 degC acpibat0 at acpi0: BAT0 not present acpibat1 at acpi0: BAT1 not present acpibat2 at acpi0: BAT2 not present acpibtn0 at acpi0: PWRB acpibtn1 at acpi0: LID0 acpivideo0 at acpi0: GFX0 cpu0: Enhanced SpeedStep 1097 MHz: speeds: 1100, 1000, 900, 800 MHz pci0 at mainbus0 bus 0 pchb0 at pci0 dev 0 function 0 "Intel Core 2G Host" rev 0x09 inteldrm0 at pci0 dev 2 function 0 "Intel HD Graphics 2000" rev 0x09 drm0 at inteldrm0 inteldrm0: msi inteldrm0: 1280x1024 wsdisplay0 at inteldrm0 mux 1: console (std, vt100 emulation) wsdisplay0:
Re: installer amd64 'Get/Verify bsd' -> 'Illegal instruction' - shuttle ds47d
sisnk...@gmail.com (Stefan Kempf), 2016.02.01 (Mon) 19:13 (CET):
> Marcus MERIGHI wrote:
> > sisnk...@gmail.com (Stefan Kempf), 2016.01.30 (Sat) 10:49 (CET):
> > > We need to see how it looks like from within the kernel (and whether
> > > the illegal instruction is really raised from within sendsig()). Can you
> > > try the diff below?
> > >
> > > You should get a kernel panic now instead of an illegal instruction
> > > signal if you try running ping or top. We need the output of the panic
> > > message and the output of the following commands:
> >
> > ping(1), top(1) messed up the screen.
> >
> > # ping 192.168.188.189
> > PING 192.168.188.189 (192.168.188.189): 56 data bytes
> > 64 bytes from 192.168.188.189: icmp_seq=0 ttl=255 time=166.533 ms
> > panic: sendsig 1: fxsave 0x800032c8a000, sp 0x7f7fff0d20b1,
> > fxave_size 512, savefpu_size 832, fpu_save_len 15773951, tf_rsp
> > 0x7f7dd238, userstack 1
>
> fpu_save_len is way too large (0xf0b0ff in hex). It should be 832 at
> most. And that causes the kernel to attempt writes outside of the
> process stack (and/or to read beyond the saved FPU state).
>
> Either the value we get from CPUID is strange (or we handle CPUID
> wrongly), or something trashes fpu_save_len.

Now that you mention CPUID...

If I switch 'Max CPUID Value Limit' to 'disabled' in the BIOS, the
symptom is gone. It re-appears when setting to 'enabled'.
Diff between dmesgs (I did some line wrapping; file attached for better readability): --- dmesg.out.enabled Tue Feb 2 09:55:41 2016 +++ dmesg.out.disabled Tue Feb 2 09:55:41 2016 @@ -15,7 +15,7 @@ acpitimer0 at acpi0: 3579545 Hz, 24 bits acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) -cpu0: Intel(R) Celeron(R) CPU 847 @ 1.10GHz, 1097.70 MHz +cpu0: Intel(R) Celeron(R) CPU 847 @ 1.10GHz, 1097.68 MHz cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV, PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3, PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM, PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,XSAVE,NXE,LONG,LAHF, PERF,ITSC cpu0: 256KB 64b/line 8-way L2 cache mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges @@ -160,16 +160,18 @@ acpitimer0 at acpi0: 3579545 Hz, 24 bits acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) -cpu0: Intel(R) Celeron(R) CPU 847 @ 1.10GHz, 1097.68 MHz -cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV, PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3, PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM, PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,XSAVE,NXE,LONG,LAHF, PERF,ITSC +cpu0: Intel(R) Celeron(R) CPU 847 @ 1.10GHz, 1097.67 MHz +cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV, PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3, PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM, PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,XSAVE,NXE,LONG,LAHF, PERF,ITSC,SENSOR,ARAT cpu0: 256KB 64b/line 8-way L2 cache +cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges cpu0: apic clock running at 99MHz -cpu0: mwait min=23041, max=45311 (bogus) +cpu0: mwait min=64, max=64, C-substates=0.2.1.1.2, IBE cpu1 at mainbus0: apid 2 (application processor) cpu1: Intel(R) Celeron(R) CPU 847 @ 1.10GHz, 1097.51 MHz -cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV, PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3, PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM, PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,XSAVE,NXE,LONG,LAHF, PERF,ITSC +cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV, PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3, PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM, PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,XSAVE,NXE,LONG,LAHF, PERF,ITSC,SENSOR,ARAT cpu1: 256KB 64b/line 8-way L2 cache +cpu1: smt 0, core 1, package 0 ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 24 pins acpimcfg0 at acpi0 addr 0xf800, bus 0-63 acpihpet0 at acpi0: 14318179 Hz @@ -188,8 +190,8 @@ acpiprt12 at acpi0: bus -1 (PEG2) acpiprt13 at acpi0: bus -1 (PEG3) acpiec0 at acpi0: not present -acpicpu0 at acpi0: C1(@1 halt!), PSS -acpicpu1 at acpi0: C1(@1 halt!), PSS +acpicpu0 at acpi0: C2(350@104 mwait.1@0x20), C1(1000@1 mwait.1), PSS +acpicpu1 at acpi0: C2(350@104 mwait.1@0x20), C1(1000@1 mwait.1), PSS acpipwrres0 at acpi0: FN00, resource for FAN0 acpipwrres1 at acpi0: FN01, resource for FAN1 acpipwrres2 at acpi0: FN02, resource for FAN2 I'm now off to working off your instructions below... Bye+Thanks, Marcus > Can you try this diff and paste the "cpuid1:" "cpuid2:" lines? Please >
UDMA problem / slow disc performance compared to FreeBSD
Hi all, after using FreeBSD for many years now, I thought to give OpenBSD a try. Installed V5.8/i386 on an ALIX board (www.pcengines.ch) in order to build a X11 client. Installation went fine so far. But then I noticed that disc performance is bad and OpenBSD does not enable UDMA4. The system uses a 32GB CompactFlash card which supports UDMA5. The ALIX board can do UDMA4 but OpenBSD only enables UDMA2. I checked against FreeBSD and it looks that 1) FreeBSD enables UDMA4 2) and is about two times faster than OpenBSD. dmesg, original kernel: OpenBSD 5.8 (GENERIC) #1066: Sun Aug 16 02:33:00 MDT 2015 dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC cpu0: Geode(TM) Integrated Processor by AMD PCS ("AuthenticAMD" 586-class) 499 MHz cpu0: FPU,DE,PSE,TSC,MSR,CX8,SEP,PGE,CMOV,CFLUSH,MMX,MMXX,3DNOW2,3DNOW real mem = 232992768 (222MB) avail mem = 216035328 (206MB) mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: date 07/19/10, BIOS32 rev. 0 @ 0xfa950 apm0 at bios0: Power Management spec V1.2 (slowidle) pcibios0 at bios0: rev 2.1 @ 0xf/0xdfb4 pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfdf30/128 (6 entries) pcibios0: PCI Exclusive IRQs: 5 10 11 pcibios0: no compatible PCI ICU found: ICU vendor 0x1022 product 0x2090 pcibios0: Warning, unable to fix up PCI interrupt routing pcibios0: PCI bus #0 is the last bus bios0: ROM list: 0xc/0x8000 0xc8000/0xa800 0xef000/0x1000! 
cpu0 at mainbus0: (uniprocessor) mtrr: K6-family MTRR support (2 registers) amdmsr0 at mainbus0 pci0 at mainbus0 bus 0: configuration mode 1 (bios) pchb0 at pci0 dev 1 function 0 "AMD Geode LX" rev 0x33 vga1 at pci0 dev 1 function 1 "AMD Geode LX Video" rev 0x00 wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation) wsdisplay0: screen 1-5 added (80x25, vt100 emulation) glxsb0 at pci0 dev 1 function 2 "AMD Geode LX Crypto" rev 0x00: RNG AES vr0 at pci0 dev 13 function 0 "VIA VT6105M RhineIII" rev 0x96: irq 11, address 00:0d:b9:0f:1a:98 ukphy0 at vr0 phy 1: Generic IEEE 802.3u media interface, rev. 3: OUI 0x004063, model 0x0034 athn0 at pci0 dev 14 function 0 "Atheros AR9280" rev 0x01: irq 10 athn0: AR9280 rev 2 (2T2R), ROM rev 22, address 04:f0:21:17:32:91 glxpcib0 at pci0 dev 15 function 0 "AMD CS5536 ISA" rev 0x03: rev 3, 32-bit 3579545Hz timer, watchdog, gpio, i2c gpio0 at glxpcib0: 32 pins iic0 at glxpcib0 pciide0 at pci0 dev 15 function 2 "AMD CS5536 IDE" rev 0x01: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility wd0 at pciide0 channel 0 drive 0: wd0: 1-sector PIO, LBA48, 30535MB, 62537328 sectors >>> UDMA2 detected: wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2 pciide0: channel 1 ignored (disabled) auglx0 at pci0 dev 15 function 3 "AMD CS5536 Audio" rev 0x01: irq 11, CS5536 AC97 ac97: codec id 0x414c4770 (Avance Logic ALC203 rev 0) ac97: codec features headphone, 20 bit DAC, 18 bit ADC, No 3D Stereo audio0 at auglx0 ohci0 at pci0 dev 15 function 4 "AMD CS5536 USB" rev 0x02: irq 5, version 1.0, legacy support ehci0 at pci0 dev 15 function 5 "AMD CS5536 USB" rev 0x02: irq 5 usb0 at ehci0: USB revision 2.0 uhub0 at usb0 "AMD EHCI root hub" rev 2.00/1.00 addr 1 isa0 at glxpcib0 isadma0 at isa0 com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo com2: irq 5 already in use pckbc0 at isa0 port 0x60/5 irq 1 irq 12 pckbd0 at pckbc0 (kbd slot) wskbd0 at pckbd0: console keyboard, using wsdisplay0 pcppi0 at isa0 port 0x61 spkr0 at 
pcppi0 lpt0 at isa0 port 0x378/4 irq 7 wbsio0 at isa0 port 0x2e/2: W83627HF rev 0x41 lm1 at wbsio0 port 0x290/8: W83627HF npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16 usb1 at ohci0: USB revision 1.0 uhub1 at usb1 "AMD OHCI root hub" rev 1.00/1.00 addr 1 uhidev0 at uhub1 port 4 configuration 1 interface 0 "Logitech 2.4GHz Cordless Desktop" rev 2.00/55.01 addr 2 uhidev0: iclass 3/1 ukbd0 at uhidev0: 8 variable keys, 6 key codes wskbd1 at ukbd0 mux 1 wskbd1: connecting to wsdisplay0 uhidev1 at uhub1 port 4 configuration 1 interface 1 "Logitech 2.4GHz Cordless Desktop" rev 2.00/55.01 addr 2 uhidev1: iclass 3/1, 6 report ids ums0 at uhidev1 reportid 1: 8 buttons, Z dir wsmouse0 at ums0 mux 0 uhid0 at uhidev1 reportid 2: input=2, output=0, feature=0 uhid1 at uhidev1 reportid 3: input=1, output=0, feature=0 uhid2 at uhidev1 reportid 6: input=0, output=0, feature=6 vscsi0 at root scsibus1 at vscsi0: 256 targets softraid0 at root scsibus2 at softraid0: 256 targets root on wd0a (e00b9c2f0019db9c.a) swap on wd0b dump on wd0b audio0: different play and record parameters returned by hardware FreeBSD, 10.1, i386: dmesg: atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xff00-0xff0f at device 15.2 on pci0 ... ada0 at ata0 bus 0 scbus0 target 0 lun 0 ada0: CFA-0 device ada0: Serial Number OAZ072715093110 ada0: 100.000MB/s transfers (UDMA5, PIO 512bytes) ada0: 30535MB (62537328 512 byte sectors: 16H 63S/T 62041C) ada0: Previously was known as ad0 Write: dd if=/dev/zero of=bigfile
Re: UEFI Boot Report: Screen corruption and kernel panic
On Tue, Feb 02, 2016 at 03:56:13AM -0500, James Hastings wrote: > On 2/2/16, Jonathan Graywrote: > > > > The bios may have to be fetched from the acpi VFCT table for the uefi case. > > > > Here's a quick attempt at trying to avoid the crash at least: > > > > Different panic this time. Thanks. Here is a version modified to have a better check for efi. Still may need some things shuffled around to deal with the root hook. Index: radeon.h === RCS file: /cvs/src/sys/dev/pci/drm/radeon/radeon.h,v retrieving revision 1.17 diff -u -p -r1.17 radeon.h --- radeon.h27 Sep 2015 11:09:26 - 1.17 +++ radeon.h2 Feb 2016 07:33:39 - @@ -1554,6 +1554,7 @@ struct radeon_device { struct drm_device *ddev; struct pci_dev *pdev; + struct pci_attach_args *pa; pci_chipset_tag_t pc; pcitag_tpa_tag; pci_intr_handle_t intrh; @@ -1595,6 +1596,7 @@ struct radeon_device { /* BIOS */ uint8_t *bios; boolis_atom_bios; + int uefi; uint16_tbios_header_start; struct radeon_bo*stollen_vga_memory; /* Register mmio */ Index: radeon_bios.c === RCS file: /cvs/src/sys/dev/pci/drm/radeon/radeon_bios.c,v retrieving revision 1.6 diff -u -p -r1.6 radeon_bios.c --- radeon_bios.c 12 Apr 2015 12:14:30 - 1.6 +++ radeon_bios.c 2 Feb 2016 07:34:17 - @@ -51,8 +51,10 @@ radeon_read_platform_bios(struct radeon_ bus_size_t size = 256 * 1024; /* ??? 
*/ uint8_t *found = NULL; int i; - - + + if (rdev->uefi) + return false; + if (!(rdev->flags & RADEON_IS_IGP)) if (!radeon_card_posted(rdev)) return false; Index: radeon_kms.c === RCS file: /cvs/src/sys/dev/pci/drm/radeon/radeon_kms.c,v retrieving revision 1.46 diff -u -p -r1.46 radeon_kms.c --- radeon_kms.c6 Jan 2016 19:56:08 - 1.46 +++ radeon_kms.c2 Feb 2016 09:16:40 - @@ -41,6 +41,15 @@ extern int vga_console_attached; #endif +#ifdef __amd64__ +#include "efifb.h" +#endif + +#if NEFIFB > 0 +#include +#include +#endif + #define DRIVER_NAME"radeon" #define DRIVER_DESC"ATI Radeon" #define DRIVER_DATE"20080613" @@ -481,6 +490,7 @@ radeondrm_attach_kms(struct device *pare id_entry = drm_find_description(PCI_VENDOR(pa->pa_id), PCI_PRODUCT(pa->pa_id), radeondrm_pciidlist); rdev->flags = id_entry->driver_data; + rdev->pa = pa; rdev->pc = pa->pa_pc; rdev->pa_tag = pa->pa_tag; rdev->iot = pa->pa_iot; @@ -501,6 +511,15 @@ radeondrm_attach_kms(struct device *pare vga_console_attached = 1; #endif } +#if NEFIFB > 0 + if (efifb_is_console(pa)) + rdev->console = 1; + if (bios_efiinfo != NULL) + rdev->uefi = 1; +#else + rdev->uefi = 0; +#endif + #endif #define RADEON_PCI_MEM 0x10 @@ -713,6 +732,12 @@ radeondrm_attachhook(struct device *self #ifdef __sparc64__ fbwscons_setcolormap(>sf, radeondrm_setcolor); #endif + +#if NEFIFB > 0 + if (efifb_is_console(rdev->pa)) + efifb_cndetach(); +#endif + drm_modeset_lock_all(rdev->ddev); drm_fb_helper_restore_fbdev_mode((void *)rdev->mode_info.rfbdev); drm_modeset_unlock_all(rdev->ddev);
Re: UEFI Boot Report: Screen corruption and kernel panic
On 2/2/16, Jonathan Graywrote: > On Tue, Feb 02, 2016 at 03:56:13AM -0500, James Hastings wrote: >> On 2/2/16, Jonathan Gray wrote: >> > >> > The bios may have to be fetched from the acpi VFCT table for the uefi >> > case. >> > >> > Here's a quick attempt at trying to avoid the crash at least: >> > >> >> Different panic this time. > > Thanks. Here is a version modified to have a better check for efi. > Still may need some things shuffled around to deal with the root hook. > Identical panic again. Thinking out loud; could these two issues be caused by bad information passed by the bootloader or firmware? memory maps? framebuffer address? OpenBSD 5.9 (GENERIC.MP) #0: Tue Feb 2 05:04:30 EST 2016 r...@cq58.example.test:/usr/src/sys/arch/amd64/compile/GENERIC.MP RTC BIOS diagnostic error 80 real mem = 1690714112 (1612MB) avail mem = 1635336192 (1559MB) mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 2.7 @ 0x66abc000 (45 entries) bios0: vendor Insyde version "F.65" date 06/04/2014 bios0: Hewlett-Packard Compaq CQ58 Notebook PC acpi0 at bios0: rev 2 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP UEFI HPET APIC MCFG ASF! 
BOOT SPCR WDRT WDAT FPDT MSDM SSDT SSDT VFCT BGRT acpi0: wakeup devices PB6_(S4) SPB0(S4) XPDV(S4) SPB1(S4) SPB3(S4) GEC_(S4) OHC1(S3) OHC2(S3) OHC3(S3) OHC4(S3) EHC1(S3) EHC2(S3) EHC3(S3) P2P_(S5) acpitimer0 at acpi0: 3579545 Hz, 32 bits acpihpet0 at acpi0: 14318180 Hz acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: AMD C-60 APU with Radeon(tm) HD Graphics, 998.42 MHz cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,MWAIT,SSSE3,CX16,POPCNT,NXE,MMXX,FFXSR,PAGE1GB,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,IBS,SKINIT,ITSC cpu0: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 16-way L2 cache cpu0: 8 4MB entries fully associative cpu0: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges cpu0: apic clock running at 199MHz cpu0: mwait min=64, max=64, IBE cpu1 at mainbus0: apid 1 (application processor) cpu1: AMD C-60 APU with Radeon(tm) HD Graphics, 997.88 MHz cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,MWAIT,SSSE3,CX16,POPCNT,NXE,MMXX,FFXSR,PAGE1GB,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,IBS,SKINIT,ITSC cpu1: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 16-way L2 cache cpu1: 8 4MB entries fully associative cpu1: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative cpu1: smt 0, core 1, package 0 ioapic0 at mainbus0: apid 4 pa 0xfec0, version 21, 24 pins ioapic0: misconfigured as apic 0, remapped to apid 4 acpimcfg0 at acpi0 addr 0xf800, bus 0-63 acpiprt0 at acpi0: bus 0 (PCI0) acpiprt1 at acpi0: bus -1 (PB4_) acpiprt2 at acpi0: bus -1 (PB5_) acpiprt3 at acpi0: bus -1 (PB6_) acpiprt4 at acpi0: bus -1 (PB7_) acpiprt5 at acpi0: bus 2 (SPB0) acpiprt6 at acpi0: bus 6 (SPB1) acpiprt7 at 
acpi0: bus 7 (SPB2) acpiprt8 at acpi0: bus -1 (SPB3) acpiprt9 at acpi0: bus 1 (P2P_) acpiec0 at acpi0 acpicpu0 at acpi0: C2(0@100 io@0xf800), C1(@1 halt!), PSS acpicpu1 at acpi0: C2(0@100 io@0xf800), C1(@1 halt!), PSS acpipwrres0 at acpi0: FN00, resource for FAN0 acpitz0 at acpi0: critical temperature is 125 degC acpibtn0 at acpi0: PWRB acpiac0 at acpi0: AC unit online acpibat0 at acpi0: BAT0 not present acpibtn1 at acpi0: LID_ acpivideo0 at acpi0: VGA_ acpivideo1 at acpi0: VGA_ cpu0: 998 MHz: speeds: 1000 800 MHz pci0 at mainbus0 bus 0 pchb0 at pci0 dev 0 function 0 "AMD AMD64 14h Host" rev 0x00 radeondrm0 at pci0 dev 1 function 0 "ATI Radeon HD 6290" rev 0x00 drm0 at radeondrm0 radeondrm0: msi ahci0 at pci0 dev 17 function 0 "ATI SBx00 SATA" rev 0x00: apic 4 int 19, AHCI 1.2 ahci0: port 0: 3.0Gb/s ahci0: port 2: 1.5Gb/s scsibus1 at ahci0: 32 targets sd0 at scsibus1 targ 0 lun 0: SCSI3 0/direct fixed naa.5000c500536c9072 sd0: 305245MB, 512 bytes/sector, 625142448 sectors cd0 at scsibus1 targ 2 lun 0: ATAPI 5/cdrom removable ohci0 at pci0 dev 18 function 0 "ATI SB700 USB" rev 0x00: apic 4 int 18, version 1.0, legacy support ehci0 at pci0 dev 18 function 2 "ATI SB700 USB2" rev 0x00: apic 4 int 17 usb0 at ehci0: USB revision 2.0 uhub0 at usb0 "ATI EHCI root hub" rev 2.00/1.00 addr 1 ohci1 at pci0 dev 19 function 0 "ATI SB700 USB" rev 0x00: apic 4 int 18, version 1.0, legacy support ehci1 at pci0 dev 19 function 2 "ATI SB700 USB2" rev 0x00: apic 4 int 17 usb1 at ehci1: USB revision 2.0 uhub1 at usb1 "ATI EHCI root hub" rev 2.00/1.00 addr 1 piixpm0 at pci0 dev 20 function 0 "ATI SBx00 SMBus" rev 0x42: polling iic0 at piixpm0 spdmem0 at iic0 addr 0x51: 2GB DDR3 SDRAM PC3-12800
Re: installer amd64 'Get/Verify bsd' -> 'Illegal instruction' - shuttle ds47d
On Tue, 2 Feb 2016, Marcus MERIGHI wrote:
> sisnk...@gmail.com (Stefan Kempf), 2016.02.01 (Mon) 19:13 (CET):
> > Marcus MERIGHI wrote:
> > > sisnk...@gmail.com (Stefan Kempf), 2016.01.30 (Sat) 10:49 (CET):
> > > > We need to see how it looks like from within the kernel (and whether
> > > > the illegal instruction is really raised from within sendsig()). Can you
> > > > try the diff below?
> > > >
> > > > You should get a kernel panic now instead of an illegal instruction
> > > > signal if you try running ping or top. We need the output of the panic
> > > > message and the output of the following commands:
> > >
> > > ping(1), top(1) messed up the screen.
> > >
> > > # ping 192.168.188.189
> > > PING 192.168.188.189 (192.168.188.189): 56 data bytes
> > > 64 bytes from 192.168.188.189: icmp_seq=0 ttl=255 time=166.533 ms
> > > panic: sendsig 1: fxsave 0x800032c8a000, sp 0x7f7fff0d20b1,
> > > fxave_size 512, savefpu_size 832, fpu_save_len 15773951, tf_rsp
> > > 0x7f7dd238, userstack 1
> >
> > fpu_save_len is way too large (0xf0b0ff in hex). It should be 832 at
> > most. And that causes the kernel to attempt writes outside of the
> > process stack (and/or to read beyond the saved FPU state).
> >
> > Either the value we get from CPUID is strange (or we handle CPUID
> > wrongly), or something trashes fpu_save_len.
>
> Now that you mention CPUID...
> If I switch 'Max CPUID Value Limit' to 'disabled' in the BIOS, the
> symptom is gone. It re-appears when setting to 'enabled'.

"Doctor, it hurts when I do this..."

That BIOS option exists to support ancient OSes (Windows NT, etc) and
shouldn't be enabled when using OpenBSD.

Currently we seem to assume that the presence of certain CPU features like
AVX implies that CPUID supports the related leaf; that BIOS option breaks
that assumption, resulting in the bogus fpu_save_len sizing you hit.
From the dmesg you posted I see it also explains the bogus mwait sizing
that has been reported by some others. Your machine will perform better
with that option off; I guess we should add check to the code to catch
this sort of setup by checking the cpuid_level variable before using the
higher CPUID leafs.

Can you try applying the diff below, temporarily re-enable that BIOS
option, then report the resulting dmesg and verify that ping works
properly?

Philip Guenther

Index: i386/i386/cpu.c
===================================================================
RCS file: /data/src/openbsd/src/sys/arch/i386/i386/cpu.c,v
retrieving revision 1.70
diff -u -p -r1.70 cpu.c
--- i386/i386/cpu.c	27 Dec 2015 04:31:34 -0000	1.70
+++ i386/i386/cpu.c	2 Feb 2016 16:54:09 -0000
@@ -784,7 +784,7 @@ cpu_init_mwait(struct device *dv)
 {
 	unsigned int smallest, largest, extensions, c_substates;
 
-	if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0)
+	if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0 || cpuid_level < 0x5)
 		return;
 
 	/* get the monitor granularity */
Index: amd64/amd64/cpu.c
===================================================================
RCS file: /data/src/openbsd/src/sys/arch/amd64/amd64/cpu.c,v
retrieving revision 1.94
diff -u -p -r1.94 cpu.c
--- amd64/amd64/cpu.c	27 Dec 2015 04:31:34 -0000	1.94
+++ amd64/amd64/cpu.c	2 Feb 2016 16:54:30 -0000
@@ -282,7 +282,7 @@ cpu_init_mwait(struct cpu_softc *sc)
 {
 	unsigned int smallest, largest, extensions, c_substates;
 
-	if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0)
+	if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0 || cpuid_level < 0x5)
 		return;
 
 	/* get the monitor granularity */
@@ -505,7 +505,7 @@ cpu_init(struct cpu_info *ci)
 		cr4 |= CR4_OSXSAVE;
 	lcr4(cr4);
 
-	if (cpu_ecxfeature & CPUIDECX_XSAVE) {
+	if (cpu_ecxfeature & CPUIDECX_XSAVE && cpuid_level >= 0xd) {
 		u_int32_t eax, ebx, ecx, edx;
 
 		xsave_mask = XCR0_X87 | XCR0_SSE;
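For readers following along, the idea behind the diff can be sketched in plain userland C. This is only an illustration under stated assumptions, not the kernel code: the struct, the helper names and the "limited" example value are made up here. Only the leaf numbers (0x5 for MONITOR/MWAIT, 0xd for the XSAVE state enumeration) and the two CPUID.1:ECX feature bits are taken from the diff and the CPUID definition. The point is that a feature bit alone is not treated as proof that the matching leaf can be queried; the maximum basic leaf is consulted too.

```c
#include <stdint.h>

/* CPUID.1:ECX feature bits used below */
#define FEAT_MWAIT	(1u << 3)	/* MONITOR/MWAIT */
#define FEAT_XSAVE	(1u << 26)	/* XSAVE/XRSTOR */

/*
 * Hypothetical snapshot of what was learned at boot.  max_basic_leaf
 * plays the role of cpuid_level, ecx_features that of cpu_ecxfeature.
 */
struct cpu_caps {
	uint32_t max_basic_leaf;
	uint32_t ecx_features;
};

/* Leaf 0x5 may only be queried if MWAIT is advertised AND the leaf exists. */
static int
can_query_mwait_leaf(const struct cpu_caps *c)
{
	return (c->ecx_features & FEAT_MWAIT) != 0 &&
	    c->max_basic_leaf >= 0x5;
}

/* Leaf 0xd may only be queried if XSAVE is advertised AND the leaf exists. */
static int
can_query_xsave_leaf(const struct cpu_caps *c)
{
	return (c->ecx_features & FEAT_XSAVE) != 0 &&
	    c->max_basic_leaf >= 0xd;
}
```

With a firmware option that caps the maximum leaf, both helpers return 0 even though the feature bits are still set, so the caller would fall back to defaults instead of interpreting garbage from an unsupported leaf.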
Re: UEFI Boot Report: Screen corruption and kernel panic
On Tue, Feb 02, 2016 at 05:49:33AM -0500, James Hastings wrote:
> On 2/2/16, Jonathan Gray wrote:
> > On Tue, Feb 02, 2016 at 03:56:13AM -0500, James Hastings wrote:
> >> On 2/2/16, Jonathan Gray wrote:
> >> >
> >> > The bios may have to be fetched from the acpi VFCT table for the uefi
> >> > case.
> >> >
> >> > Here's a quick attempt at trying to avoid the crash at least:
> >> >
> >>
> >> Different panic this time.
> >
> > Thanks. Here is a version modified to have a better check for efi.
> > Still may need some things shuffled around to deal with the root hook.
> >
>
> Identical panic again.
>
> Thinking out loud; could these two issues be caused by bad information
> passed by the bootloader or firmware? memory maps? framebuffer address?

It's because one of the ways of getting the video bios is to check where
it has long been mapped and read it out of that memory.

It sounds like your machine has enough of it there to convince the code
that checks for a signature, but not all of the expected 256k of memory
after 0xc0000 is actually mapped.

Putting a "return false;" at the top of radeon_read_platform_bios()
should prevent this method from being tried entirely.
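To make the failure mode concrete, here is a small standalone C sketch. It is an assumption-laden illustration and not the radeon driver's code: the function name and the mapped_len parameter are invented for the example. It relies only on the generic option ROM layout, where an image starts with the bytes 0x55 0xAA and the third byte gives its length in 512-byte units. A signature match alone says nothing about whether the rest of the image is readable, which appears to be the situation described above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return 1 only if buf starts with the option ROM
 * signature AND the size the header claims fits inside the region that
 * is known to be mapped (mapped_len bytes).  Return 0 otherwise.
 */
static int
rom_image_fits(const uint8_t *buf, size_t mapped_len)
{
	size_t claimed;

	if (mapped_len < 3)
		return 0;		/* header itself is not readable */
	if (buf[0] != 0x55 || buf[1] != 0xaa)
		return 0;		/* no option ROM signature */

	claimed = (size_t)buf[2] * 512;	/* length in 512-byte units */
	return claimed != 0 && claimed <= mapped_len;
}
```

A caller that checked only the first two bytes would accept an image whose tail lies in unmapped memory; checking the claimed length against what is actually mapped is one way to reject it before reading further.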
Re: UEFI Boot Report: Screen corruption and kernel panic
> Date: Tue, 2 Feb 2016 02:09:08 -0500
> From: James Hastings
>
> Testing UEFI booting. Willing to test patches and debug.
>
> 1) Encountered screen corruption immediately after EFIBOOT.
> http://imgur.com/mXURlgV

Looks like the framebuffer configuration passed to the kernel isn't
quite right. Or the kernel is misinterpreting that information. The
code in sys/arch/amd64/stand/efiboot/efiboot.c:efi_makebootargs()
queries UEFI for the framebuffer layout. It would help if you could
print that information. This requires building and installing a new
EFIBOOT bootloader.

It would also be useful to know the resolution of your screen.

Thanks,

Mark
Re: installer amd64 'Get/Verify bsd' -> 'Illegal instruction' - shuttle ds47d
On Tue, 2 Feb 2016, Philip Guenther wrote:
...
> Currently we seem to assume that the presence of certain CPU features like
> AVX implies that CPUID supports the related leaf; that BIOS option breaks
> that assumption, resulting in the bogus fpu_save_len sizing you hit.
> From the dmesg you posted I see it also explains the bogus mwait sizing
> that has been reported by some others. Your machine will perform better
> with that option off; I guess we should add check to the code to catch
> this sort of setup by checking the cpuid_level variable before using the
> higher CPUID leafs.

Revised version that switches a few places to check cpuid_level instead of
calling CPUID(0) again and similar for using curcpu()->ci_pnfeatset
instead of calling CPUID(0x80000000) once identifycpu() sets that, and add
a check of ci->ci_pnfeatset before using CPUID(CPUID_AMD_SVM_CAP) in the
vmm bits.

ok?

Philip Guenther

Index: i386/i386/cpu.c
===================================================================
RCS file: /data/src/openbsd/src/sys/arch/i386/i386/cpu.c,v
retrieving revision 1.70
diff -u -p -r1.70 cpu.c
--- i386/i386/cpu.c	27 Dec 2015 04:31:34 -0000	1.70
+++ i386/i386/cpu.c	2 Feb 2016 16:54:09 -0000
@@ -784,7 +784,7 @@ cpu_init_mwait(struct device *dv)
 {
 	unsigned int smallest, largest, extensions, c_substates;
 
-	if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0)
+	if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0 || cpuid_level < 0x5)
 		return;
 
 	/* get the monitor granularity */
Index: amd64/amd64/amd64_mem.c
===================================================================
RCS file: /data/src/openbsd/src/sys/arch/amd64/amd64/amd64_mem.c,v
retrieving revision 1.11
diff -u -p -r1.11 amd64_mem.c
--- amd64/amd64/amd64_mem.c	14 Mar 2015 03:38:46 -0000	1.11
+++ amd64/amd64/amd64_mem.c	2 Feb 2016 17:37:55 -0000
@@ -583,8 +583,7 @@ mrinit(struct mem_range_softc *sc)
 	 * If CPUID does not support leaf function 0x80000008, use the
 	 * default a 36-bit address size.
 	 */
-	CPUID(0x80000000, regs[0], regs[1], regs[2], regs[3]);
-	if (regs[0] >= 0x80000008) {
+	if (curcpu()->ci_pnfeatset >= 0x80000008) {
 		CPUID(0x80000008, regs[0], regs[1], regs[2], regs[3]);
 		if (regs[0] & 0xff) {
 			mtrrmask = (1ULL << (regs[0] & 0xff)) - 1;
Index: amd64/amd64/cacheinfo.c
===================================================================
RCS file: /data/src/openbsd/src/sys/arch/amd64/amd64/cacheinfo.c,v
retrieving revision 1.7
diff -u -p -r1.7 cacheinfo.c
--- amd64/amd64/cacheinfo.c	13 Nov 2015 07:52:20 -0000	1.7
+++ amd64/amd64/cacheinfo.c	2 Feb 2016 17:36:11 -0000
@@ -159,7 +159,6 @@ amd_cpu_cacheinfo(struct cpu_info *ci)
 	struct x86_cache_info *cai;
 	int family, model;
 	u_int descs[4];
-	u_int lfunc;
 
 	family = ci->ci_family;
 	model = ci->ci_model;
@@ -171,15 +170,9 @@ amd_cpu_cacheinfo(struct cpu_info *ci)
 		return;
 
 	/*
-	 * Determine the largest extended function value.
-	 */
-	CPUID(0x80000000, descs[0], descs[1], descs[2], descs[3]);
-	lfunc = descs[0];
-
-	/*
 	 * Determine L1 cache/TLB info.
 	 */
-	if (lfunc < 0x80000005) {
+	if (ci->ci_pnfeatset < 0x80000005) {
 		/* No L1 cache info available. */
 		return;
 	}
@@ -228,7 +221,7 @@ amd_cpu_cacheinfo(struct cpu_info *ci)
 	/*
 	 * Determine L2 cache/TLB info.
 	 */
-	if (lfunc < 0x80000006) {
+	if (ci->ci_pnfeatset < 0x80000006) {
 		/* No L2 cache info available. */
 		return;
 	}
Index: amd64/amd64/cpu.c
===================================================================
RCS file: /data/src/openbsd/src/sys/arch/amd64/amd64/cpu.c,v
retrieving revision 1.94
diff -u -p -r1.94 cpu.c
--- amd64/amd64/cpu.c	27 Dec 2015 04:31:34 -0000	1.94
+++ amd64/amd64/cpu.c	2 Feb 2016 17:03:04 -0000
@@ -282,7 +282,7 @@ cpu_init_mwait(struct cpu_softc *sc)
 {
 	unsigned int smallest, largest, extensions, c_substates;
 
-	if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0)
+	if ((cpu_ecxfeature & CPUIDECX_MWAIT) == 0 || cpuid_level < 0x5)
 		return;
 
 	/* get the monitor granularity */
@@ -505,7 +505,7 @@ cpu_init(struct cpu_info *ci)
 		cr4 |= CR4_OSXSAVE;
 	lcr4(cr4);
 
-	if (cpu_ecxfeature & CPUIDECX_XSAVE) {
+	if ((cpu_ecxfeature & CPUIDECX_XSAVE) && cpuid_level >= 0xd) {
 		u_int32_t eax, ebx, ecx, edx;
 
 		xsave_mask = XCR0_X87 | XCR0_SSE;
Index: amd64/amd64/identcpu.c
===================================================================
RCS file: /data/src/openbsd/src/sys/arch/amd64/amd64/identcpu.c,v
retrieving revision 1.71
diff -u -p -r1.71 identcpu.c
--- amd64/amd64/identcpu.c	27 Dec 2015 04:31:34 -
Re: ral(4) leaks mbufs instead of setting oactive
On Tue, 2 Feb 2016, Stefan Sperling wrote:
> > On Sat, Jan 30, 2016 at 10:49:38PM +1300, Richard Procter wrote:
> > -	ring->queued--;
> > +	atomic_dec_int(&ring->queued);
>
> > -	ring->queued += ntxds;
> > +	atomic_add_int(&ring->queued, ntxds);
>
> I don't think these make a difference in the current way of things.
> Wireless drivers run interrupts under the kernel big lock, interrupts
> aren't preemptible, and AFAIK (most?) 32bit integer operations are atomic.

[...]

> Hmm. Taking a closer look, if_start() is already called under splnet.
> So adding splnet to rt2860_tx() shouldn't make a difference.

You're right, the atomic is unnecessary. I'd assumed rt2860_tx() was
running under splsoftnet. I could have sworn I'd seen errors without the
atomic but that now tests fine, too. (I still see errors without the
ring->cur fix.)

> This also means the card cannot interrupt in the way your comment
> describes, i.e. the problem you're "fixing" here cannot exist... ?

Also right.

This simplifies things --- see below for the patch minus the above.
Without it my card under stress sees 1 oerror per ~217 packets; it's now
sent 5E6 without seeing any.

cheers,
Richard.

---

* fix watchdog timeouts and dropped frames under load.

- on full tx ring, ring->cur wraps to an active tx descriptor. Passing
  that wrapped value to the card was observed to cause general
  flakiness. Fix prevents the wrap at the cost of reducing usable tx
  descriptors by one.
Index: sys/dev/ic/rt2860.c
===================================================================
--- sys.orig/dev/ic/rt2860.c
+++ sys/dev/ic/rt2860.c
@@ -1171,7 +1171,7 @@ rt2860_tx_intr(struct rt2860_softc *sc,
 	}
 
 	sc->sc_tx_timer = 0;
-	if (ring->queued < RT2860_TX_RING_COUNT)
+	if (ring->queued < RT2860_TX_RING_MAX)
 		sc->qfullmsk &= ~(1 << qid);
 	ifq_clr_oactive(&ifp->if_snd);
 	rt2860_start(ifp);
@@ -1618,7 +1618,7 @@ rt2860_tx(struct rt2860_softc *sc, struc
 		/* determine how many TXDs are required */
 		ntxds = 1 + (data->map->dm_nsegs / 2);
 
-		if (ring->queued + ntxds >= RT2860_TX_RING_COUNT) {
+		if (ring->queued + ntxds >= RT2860_TX_RING_MAX) {
 			/* not enough free TXDs, force mbuf defrag */
 			bus_dmamap_unload(sc->sc_dmat, data->map);
 			error = EFBIG;
@@ -1656,7 +1656,7 @@ rt2860_tx(struct rt2860_softc *sc, struc
 		/* determine how many TXDs are now required */
 		ntxds = 1 + (data->map->dm_nsegs / 2);
 
-		if (ring->queued + ntxds >= RT2860_TX_RING_COUNT) {
+		if (ring->queued + ntxds >= RT2860_TX_RING_MAX) {
 			/* this is a hopeless case, drop the mbuf! */
 			bus_dmamap_unload(sc->sc_dmat, data->map);
 			m_freem(m);
@@ -1714,7 +1714,7 @@ rt2860_tx(struct rt2860_softc *sc, struc
 
 	ring->cur = (ring->cur + 1) % RT2860_TX_RING_COUNT;
 	ring->queued += ntxds;
-	if (ring->queued >= RT2860_TX_RING_COUNT)
+	if (ring->queued >= RT2860_TX_RING_MAX)
 		sc->qfullmsk |= 1 << qid;
 
 	/* kick Tx */
Index: sys/dev/ic/rt2860var.h
===================================================================
--- sys.orig/dev/ic/rt2860var.h
+++ sys/dev/ic/rt2860var.h
@@ -17,8 +17,9 @@
  * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  */
 
-#define RT2860_TX_RING_COUNT	64
 #define RT2860_RX_RING_COUNT	128
+#define RT2860_TX_RING_COUNT	64
+#define RT2860_TX_RING_MAX	(RT2860_TX_RING_COUNT - 1)
 #define RT2860_TX_POOL_COUNT	(RT2860_TX_RING_COUNT * 2)
 
 #define RT2860_MAX_SCATTER	((RT2860_TX_RING_COUNT * 2) - 1)

---

* replace custom defrag with m_defrag()

- This fixes an error in the existing code: the "hopeless case" guard
  is equivalent to 'ring now full', so oactive is never set: the code
  drops any mbuf that would fill the ring. This occurs often in
  practice.
- The preceding patch allows the ring to fill safely.

- The new code avoids some hoop-jumping. Currently, a tx dma-map can
  map an entire tx ring. Therefore an mbuf that fits a dma-map may yet
  not fit into the tx ring's remaining space. To be sure it can, we
  must in general count the mbuf's fragments and, if necessary, defrag
  it and reload the dmamap.

  The new code limits the dmamap to cover at most 8 tx descriptors
  (= 15 fragments): now, if an mbuf fits a dma-map it will fit any ring
  with at least 8 free descriptors. So we need only check for 8 free
  descriptors and are no longer obliged to count fragments and jump
  hoops. The cost is unused tx ring descriptors, at most 7 of 63, when
  a one-fragment mbuf occupies one descriptor in the last block of 8.

- For simplicity on error return, shift responsibility for calling
  m_freem() to rt2860_tx()'s caller (which already calls
  ieee80211_release_node()).
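The descriptor arithmetic in the notes above can be checked with a few lines of standalone C. This is a sketch under stated assumptions rather than driver code: the macro and function names are invented for the example, while the formula 1 + nsegs / 2, the ring size of 64 and the "one descriptor held back" limit are taken from the diff.

```c
/* Ring geometry as in the patch: 64 descriptors, one never used. */
#define TX_RING_COUNT	64
#define TX_RING_MAX	(TX_RING_COUNT - 1)

/* Descriptors needed for a frame whose dma-map has nsegs segments. */
static int
txds_for_segs(int nsegs)
{
	return 1 + nsegs / 2;
}

/*
 * Mirrors the guard in the patch: the frame is refused when queueing
 * it would reach TX_RING_MAX descriptors in use.
 */
static int
ring_would_overflow(int queued, int ntxds)
{
	return queued + ntxds >= TX_RING_MAX;
}
```

For instance, a 15-segment map needs txds_for_segs(15) = 8 descriptors, which is where the "8 tx descriptors (= 15 fragments)" figure comes from, and a single-segment frame needs just one.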
Re: access softraid(4) raid-5, retrying read on block, kernel panic
The rebuild has finished:

Volume      Status               Size Device
softraid0 0 Online     12002360033280 sd5     RAID5
          0 Online      4000786726912 0:0.0   noencl
          1 Online      4000786726912 0:1.0   noencl
          2 Online      4000786726912 0:2.0   noencl
          3 Online      4000786726912 0:3.0   noencl

I'm hoping for advice on how to proceed; I need to try to get as much
data off of this before things get worse. Or aren't they even bad at
all?

What bothers me is the message 'retrying read on block 52992'. IIRC
I've seen this on dying HDDs but not on a softraid disk. Does this mean
the underlying softraid is broken?

Thanks in advance, Marcus

mcmer-open...@tor.at (Marcus MERIGHI), 2016.01.28 (Thu) 10:58 (CET):
> Softraid RAID-5 on four 4TB HDDs.
> The four disks are in an external enclosure (JBOD), connected via USB.
> The array worked for about a month.
> Tested before loading (bioctl -O, bioctl -R).
> Copied roughly 6TB onto it while rebuilding was ongoing.
> Took about 3 weeks (via network).
> When the data was there I started another copy from the array to yet another
> external disk.
> While this was running the first kernel panic occurred:
>
> sd5: retrying read on block 52992
> panic: softraid0: sd5: invalid volume state transition 1 -> 1
> Starting stack trace...
> panic() at panic+0x10b
> sr_raid5_set_vol_state() at sr_raid5_set_vol_state+0xe7
> sr_raid5_set_chunk_state() at sr_raid5_set_chunk_state+0xc7
> sr_ccb_done() at sr_ccb_done+0x76
> sr_raid5_intr() at sr_raid5_intr+0x3c
> sd_buf_done() at sd_buf_done+0x7b
> scsi_done() at scsi_done+0x1e
> usb_transfer_complete() at usb_transfer_complete+0x26c
> ehci_softintr() at ehci_softintr+0x3f
> softintr_dispatch() at softintr_dispatch+0x8b
> Xsoftnet() at Xsoftnet+0x1f
> --- interrupt ---
> end of kernel
> end trace frame: 0x1388, count: 246
> 0x8:
> End of stack trace.
> syncing disks... 80 48 10 1 1 1 1 1 1 [...] giving up
>
> After that I retried (reboot, fsck, mount) and when accessing the mount
> point there was another kernel panic.
Silly me did not take another > picture. > Only after unpowering/powering the external enclosure and rebooting the > machine the array was automagically assembled again, with one disk > degraded. It is now rebuilding, 13% took about 14 hours. > > Thanks for reading, Marcus > > # bioctl softraid0 > Volume Status Size Device > softraid0 0 Rebuild12002360033280 sd5 RAID5 13% done > 0 Rebuild 4000786726912 0:0.0 noencl > 1 Online 4000786726912 0:1.0 noencl > 2 Online 4000786726912 0:2.0 noencl > 3 Online 4000786726912 0:3.0 noencl > softraid0 1 Online53691555840 sd6 CRYPTO > 0 Online53691555840 1:0.0 noencl > softraid0 2 Online53949354496 sd7 CRYPTO > 0 Online53949354496 2:0.0 noencl > > OpenBSD 5.8 (GENERIC.MP) #1236: Sun Aug 16 02:31:04 MDT 2015 > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP > real mem = 4276822016 (4078MB) > avail mem = 4143312896 (3951MB) > mpath0 at root > scsibus0 at mpath0: 256 targets > mainbus0 at root > bios0 at mainbus0: SMBIOS rev. 2.5 @ 0xcff9c000 (46 entries) > bios0: vendor Dell Inc. version "1.4.3" date 06/05/2009 > bios0: Dell Inc. 
PowerEdge R200 > acpi0 at bios0: rev 2 > acpi0: sleep states S0 S4 S5 > acpi0: tables DSDT FACP APIC SPCR HPET MCFG WDAT SLIC ERST HEST BERT EINJ > SSDT SSDT SSDT > acpi0: wakeup devices PCI0(S5) > acpitimer0 at acpi0: 3579545 Hz, 24 bits > acpimadt0 at acpi0 addr 0xfee0: PC-AT compat > cpu0 at mainbus0: apid 0 (boot processor) > cpu0: Intel(R) Core(TM)2 Duo CPU E7300 @ 2.66GHz, 1600.27 MHz > cpu0: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,NXE,LONG,LAHF,PERF,SENSOR > cpu0: 3MB 64b/line 8-way L2 cache > cpu0: smt 0, core 0, package 0 > mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges > cpu0: apic clock running at 266MHz > cpu0: mwait min=64, max=64, C-substates=0.2.2.2.2, IBE > cpu1 at mainbus0: apid 1 (application processor) > cpu1: Intel(R) Core(TM)2 Duo CPU E7300 @ 2.66GHz, 1600.06 MHz > cpu1: > FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,NXE,LONG,LAHF,PERF,SENSOR > cpu1: 3MB 64b/line 8-way L2 cache > cpu1: smt 0, core 1, package 0 > ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 24 pins > ioapic0: misconfigured as apic 0, remapped to apid 2 > ioapic1 at mainbus0: apid 3 pa 0xfec1, version 20, 24 pins > ioapic1: misconfigured as apic 0, remapped to apid 3 > acpihpet0 at acpi0: 14318179 Hz > acpimcfg0 at acpi0 addr 0xe000, bus 0-255 > acpiprt0 at acpi0: bus 0 (PCI0) > acpiprt1 at acpi0: bus 1