Re: mcelog?
On Mon, Apr 08, 2019 at 12:01:00AM -0700, John Nemeth wrote:
> On Apr 7, 9:48pm, "Aaron J. Grier" wrote:
> > On Wed, Mar 20, 2019 at 11:22:13AM -0700, John Nemeth wrote:
> > > (XEN) Bank 4: 945a4000fd080813 atef3581180
> > > (XEN) MCE: polling routine found correctable error. Use mcelog to parse
> > > above error output.
> > [...]
> > > In any event, if I'm reading the above correctly, I believe
> > > that it is telling that there is bad memory?
> >
> > which CPU manufacturer and model is this? memory is just one of
> > many possibilities which can generate machine check events.
>
> cpu0: "AMD Opteron(tm) Processor 6386 SE "
> cpu0: AMD Family 15h (686-class)
> cpu0: family 0x15 model 0x2 stepping 0 (id 0x600f20)

https://www.amd.com/system/files/TechDocs/42301_15h_Mod_00h-0Fh_BKDG.pdf

according to a cursory register decode based on the above document, it
does look like it could be an ECC-correctable memory error.

there's another MSR that keeps a count of how many DRAM errors have been
detected -- too bad NetBSD doesn't have an MSR driver. ;)

--
Aaron J. Grier | "Not your ordinary poofy goof." | agr...@poofygoof.com
"The price of reliability is the pursuit of the utmost simplicity. It is
a price which the very rich find most hard to pay." -- Tony Hoare
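For anyone who wants to repeat the cursory decode, here is a minimal sketch in C. The bits 63 down to 57 are the architectural MCi_STATUS flags; the CECC and UECC positions (46 and 45) are my reading of the northbridge bank in the BKDG linked above and should be treated as an assumption, as should every name in the code (`mce_decode`, `struct mce_summary`), none of which comes from NetBSD or Xen.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural flag bits of an x86 MCi_STATUS register. */
#define MCI_STATUS_VAL   (UINT64_C(1) << 63) /* register holds a valid error */
#define MCI_STATUS_OVER  (UINT64_C(1) << 62) /* an earlier error was overwritten */
#define MCI_STATUS_UC    (UINT64_C(1) << 61) /* error was NOT corrected */
#define MCI_STATUS_ADDRV (UINT64_C(1) << 58) /* companion address register is valid */
#define MCI_STATUS_PCC   (UINT64_C(1) << 57) /* processor context corrupt */

/* Assumption: bank 4 (northbridge) ECC flags as read from the AMD BKDG. */
#define MC4_STATUS_CECC  (UINT64_C(1) << 46) /* correctable ECC error */
#define MC4_STATUS_UECC  (UINT64_C(1) << 45) /* uncorrectable ECC error */

/* Hypothetical summary type, for illustration only. */
struct mce_summary {
	bool valid;
	bool overflow;
	bool uncorrected;
	bool addr_valid;
	bool context_corrupt;
	bool cecc;
	bool uecc;
	uint16_t error_code;	/* low 16 bits: the machine check error code */
};

struct mce_summary
mce_decode(uint64_t status)
{
	struct mce_summary s;

	s.valid = (status & MCI_STATUS_VAL) != 0;
	s.overflow = (status & MCI_STATUS_OVER) != 0;
	s.uncorrected = (status & MCI_STATUS_UC) != 0;
	s.addr_valid = (status & MCI_STATUS_ADDRV) != 0;
	s.context_corrupt = (status & MCI_STATUS_PCC) != 0;
	s.cecc = (status & MC4_STATUS_CECC) != 0;
	s.uecc = (status & MC4_STATUS_UECC) != 0;
	s.error_code = (uint16_t)(status & 0xffff);
	return s;
}
```

Feeding it the value from the Xen log, `mce_decode(UINT64_C(0x945a4000fd080813))`, gives valid, not uncorrected, address valid, and the (assumed) CECC bit set, which is consistent with a corrected ECC event. It says nothing about which DIMM is involved.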
daily CVS update output
Updating src tree:
P src/distrib/sets/lists/etc/mi
P src/etc/defaults/Makefile
P src/etc/rc.d/Makefile
P src/external/bsd/jemalloc/include/jemalloc/internal/jemalloc_internal_defs.h
P src/external/mit/xorg/lib/dri/Makefile
P src/external/mit/xorg/lib/libGL/Makefile
P src/sbin/fdisk/fdisk.8
P src/share/man/man4/carp.4
P src/share/man/man9/sched_4bsd.9
P src/share/man/man9/sched_m2.9
P src/sys/arch/arm/samsung/exynos_dwcmmc.c
P src/sys/arch/arm/samsung/exynos_platform.c
P src/sys/arch/macppc/conf/GENERIC
P src/sys/conf/files
P src/sys/conf/param.c
P src/sys/dev/ic/gem.c
P src/sys/dev/ic/i82596.c
P src/sys/dev/isa/if_ai.c
P src/sys/dev/isa/if_ef.c
P src/sys/dev/isa/if_ix.c
P src/sys/dev/mca/if_le_mca.c
P src/sys/dev/mii/mii.c
P src/sys/dev/mii/mii_physubr.c
P src/sys/dev/mii/miivar.h
P src/sys/kern/sysv_ipc.c
P src/usr.bin/make/parse.c
P src/usr.sbin/altq/altqd/altq.conf.5
P src/usr.sbin/altq/altqstat/altqstat.1

Updating xsrc tree:
P xsrc/external/mit/MesaLib/dist/src/glx/glxclient.h
P xsrc/external/mit/MesaLib/dist/src/glx/glxcurrent.c
P xsrc/external/mit/MesaLib/dist/src/mapi/entry_x86-64_tls.h
P xsrc/external/mit/MesaLib/dist/src/mapi/entry_x86_tls.h
P xsrc/external/mit/MesaLib/dist/src/mapi/u_current.c
P xsrc/external/mit/MesaLib/dist/src/mapi/u_current.h

Killing core files:

Updating release-7 src tree (netbsd-7):

Updating release-7 xsrc tree (netbsd-7):

Updating release-8 src tree (netbsd-8):

Updating release-8 xsrc tree (netbsd-8):

Updating file list:
-rw-rw-r-- 1 srcmastr netbsd 42422769 Apr 10 03:09 ls-lRA.gz
Re: mcelog?
On Mon, Apr 08, 2019 at 11:39:31AM +0530, Mathew, Cherry G. wrote:
> On 8 April 2019 10:18:16 AM GMT+05:30, "Aaron J. Grier" wrote:
> >are we going to get an MSR interface for NetBSD any time soon?
>
> What would such an interface look like ?

- the starting interface would be a set of ioctls on an
  i386/amd64-specific /dev/msr that generate kernel-mode rdmsr and
  wrmsr operations from the thread that made the ioctl.
- start hardening by only allowing a set of whitelisted MSRs to
  complete, so userland couldn't relocate the APIC for example.
- continue to harden with separate access for read vs write MSRs
- add data-massaging and bounds checking on a per-MSR basis in case we
  have a mix of permissions within fields.
- add arguments for running the MSR operation on a specific package /
  socket or hardware thread (rather than under the context of the
  caller).

if ioctls are out of style, possibly a sysctl interface of some sort.

it's probably a google SoC project waiting to happen (if it hasn't
already.)

end-goal would be the ability to run a native x86info with feature
parity to linux, as well as capture machine check errors, gather die
temperature data, power states, poll hardware performance counters...

--
Aaron J. Grier | "Not your ordinary poofy goof." | agr...@poofygoof.com
"The price of reliability is the pursuit of the utmost simplicity. It is
a price which the very rich find most hard to pay." -- Tony Hoare
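To make the hardening steps in the list above concrete, here is a rough userland sketch in C of the policy check such a driver might run before issuing rdmsr or wrmsr. Nothing here is an existing NetBSD interface: `struct msr_policy`, `msr_request_allowed()` and the table contents are all hypothetical, and the two MSR numbers (0x10 for the time stamp counter, 0x411 for MC4_STATUS) are only there as examples. The APIC base MSR (0x1b) is deliberately absent from the table, so it is refused.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access flags: read and write are granted separately. */
enum msr_access {
	MSR_READ = 1,
	MSR_WRITE = 2
};

/* Hypothetical per-MSR policy entry. */
struct msr_policy {
	uint32_t msr;		/* MSR number */
	unsigned allowed;	/* bitwise OR of enum msr_access values */
	uint64_t write_mask;	/* bits userland is allowed to set on write */
};

/* Illustrative whitelist; anything not listed is denied. */
static const struct msr_policy msr_whitelist[] = {
	{ 0x00000010, MSR_READ, 0 },			/* TSC: read only */
	{ 0x00000411, MSR_READ | MSR_WRITE, 0 },	/* MC4_STATUS: may only be cleared */
};

/*
 * Return true if the request passes the whitelist, the read/write
 * split, and the per-MSR bounds check on the value being written.
 */
bool
msr_request_allowed(uint32_t msr, enum msr_access op, uint64_t value)
{
	size_t i;

	for (i = 0; i < sizeof(msr_whitelist) / sizeof(msr_whitelist[0]); i++) {
		const struct msr_policy *p = &msr_whitelist[i];

		if (p->msr != msr)
			continue;
		if ((p->allowed & (unsigned)op) == 0)
			return false;
		if (op == MSR_WRITE && (value & ~p->write_mask) != 0)
			return false;
		return true;
	}
	return false;	/* not whitelisted */
}
```

Keeping the policy in a table means the per-MSR rules can grow (field masks, CPU family checks) without touching the ioctl path itself. Selecting the package or hardware thread to run on is left out of this sketch.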
Re: Mesa update
Just FYI - I am getting the same message when starting kde4's konqueror:

There was an error loading the module Dolphin View. The diagnostics is:
Cannot load library /usr/pkg/lib/kde4/dolphinpart.so:
(/usr/X11R7/lib/libGL.so.3: Use of initialized Thread Local Storage with
model initial-exec and dlopen is not supported)
...

On Tue, 9 Apr 2019 at 17:19, wrote:
>
> glmark2 aborts for me, but I don't understand why.
>
> > gdb -q (which glmark2)
> [...]
> --
Re: Jemalloc fallout on sandpoint
Hi Christos,

> The problem is that on powerpc we use MAX_PAGE_SHIFT not the right page
> shift for the machine. I will fix it to compute and use the MIN_PAGE_SHIFT
> soon.

This:

  Modified Files:
      src/external/bsd/jemalloc/include/jemalloc/internal:
          jemalloc_internal_defs.h

  Log Message:
  Use MIN_PAGE_SHIFT if PAGE_SHIFT is not available instead of MAX_PAGE_SHIFT.

  To generate a diff of this commit:
  cvs rdiff -u -r1.5 -r1.6 \
      src/external/bsd/jemalloc/include/jemalloc/internal/jemalloc_internal_defs.h

fixes it for me. Thank you!

Regards,

Julian
Re: Jemalloc fallout on sandpoint
In article , Robert Swindells wrote:
>On 2019-04-09 01:26, Jason Thorpe wrote:
>>> On Apr 8, 2019, at 2:40 PM, Julian Coleman wrote:
>>>
>>> Hi all,
>>>
>>> Upgraded my QNAP TS-201 (sandpoint) to current, and all binaries crash
>>> with:
>>
>> What kind of CPU is in this device? It's possible that jemalloc is
>> making a page size assumption that isn't true for this particular
>> powerpc CPU (I think there were other issues like this on another
>> port...).
>
>Do powerpc kernels have a sysctl for page size ? I'm fairly sure that
>arm does.

It is better I think if it is a constant.

christos
Re: Mesa update
glmark2 aborts for me, but I don't understand why.

> gdb -q (which glmark2)
Reading symbols from /usr/pkg/bin/glmark2...done.
(gdb) r
Starting program: /usr/pkg/bin/glmark2
[New LWP 1 of process 27174]

Thread 2 received signal SIGABRT, Aborted.
0x74c647e427ca in _sys___sigprocmask14 () from /usr/lib/libc.so.12
(gdb) info registers
rax            0x0                 0
rbx            0x74c64a457900      128394998413568
rcx            0x74c647e427ca      128394958481354
rdx            0x0                 0
rsi            0x7f7fff33f090      140187719168144
rdi            0x3                 3
rbp            0x74c64a4ac0e0      0x74c64a4ac0e0
rsp            0x7f7fff33f038      0x7f7fff33f038
r8             0x74c64a4adb38      128394998766392
r9             0x74c644400900      128394897393920
r10            0x0                 0
r11            0x202               514
r12            0x0                 0
r13            0x74c64a4ab120      128394998755616
r14            0x0                 0
r15            0x                  -1
rip            0x74c647e427ca      0x74c647e427ca <_sys___sigprocmask14+10>
eflags         0x202               [ IF ]
cs             0x47                71
ss             0x3f                63
ds             0x23                35
es             0x23                35
fs             0x0                 0
gs             0x0                 0
(gdb) bt
#0  0x74c647e427ca in _sys___sigprocmask14 () from /usr/lib/libc.so.12
#1  0x74c642e096ee in pthread_sigmask () from /usr/lib/libpthread.so.1
#2  0x74c643628eee in util_queue_init () from /usr/X11R7/lib/modules/dri/i965_dri.so
#3  0x74c643627c8b in disk_cache_create () from /usr/X11R7/lib/modules/dri/i965_dri.so
#4  0x74c643626b1c in brw_disk_cache_init () from /usr/X11R7/lib/modules/dri/i965_dri.so
#5  0x74c6434e9091 in ?? () from /usr/X11R7/lib/modules/dri/i965_dri.so
#6  0x74c643939da7 in ?? () from /usr/X11R7/lib/modules/dri/i965_dri.so
#7  0x74c649247956 in ?? () from /usr/X11R7/lib/libGL.so.3
#8  0x74c64922e576 in __glXInitialize () from /usr/X11R7/lib/libGL.so.3
#9  0x74c64923013a in glXQueryVersion () from /usr/X11R7/lib/libGL.so.3
#10 0x00408cdd in GLStateGLX::check_glx_version (this=this@entry=0x7f7fff340100) at ../src/gl-state-glx.cpp:160
#11 0x0040904f in GLStateGLX::ensure_glx_fbconfig (this=this@entry=0x7f7fff340100) at ../src/gl-state-glx.cpp:241
#12 0x00409329 in GLStateGLX::gotNativeConfig (this=0x7f7fff340100, vid=@0x7f7fff33ff5c: 0) at ../src/gl-state-glx.cpp:128
#13 0x00407268 in CanvasGeneric::resize_no_viewport (this=this@entry=0x7f7fff340150, width=800, height=600) at ../src/canvas-generic.cpp:225
#14 0x00408739 in CanvasGeneric::reset (this=0x7f7fff340150) at ../src/canvas-generic.cpp:56
#15 0x004cdfa3 in main (argc=, argv=) at ../src/main.cpp:198
(gdb) disas
Dump of assembler code for function _sys___sigprocmask14:
   0x74c647e427c0 <+0>:     mov    $0x125,%eax
   0x74c647e427c5 <+5>:     mov    %rcx,%r10
   0x74c647e427c8 <+8>:     syscall
=> 0x74c647e427ca <+10>:    jb     0x74c647e427cd <_sys___sigprocmask14+13>
   0x74c647e427cc <+12>:    retq
   0x74c647e427cd <+13>:    jmpq   0x74c647f999e0 <__cerror>
End of assembler dump.
Re: Jemalloc fallout on sandpoint
On 2019-04-09 01:26, Jason Thorpe wrote:
>> On Apr 8, 2019, at 2:40 PM, Julian Coleman wrote:
>>
>> Hi all,
>>
>> Upgraded my QNAP TS-201 (sandpoint) to current, and all binaries crash
>> with:
>
> What kind of CPU is in this device? It's possible that jemalloc is
> making a page size assumption that isn't true for this particular
> powerpc CPU (I think there were other issues like this on another
> port...).

Do powerpc kernels have a sysctl for page size ? I'm fairly sure that
arm does.
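On the question of asking the system for its page size: a portable way to do that from userland, independent of any particular sysctl node, is POSIX `sysconf(_SC_PAGESIZE)`. The sketch below is only an illustration of a runtime query; the helper names (`runtime_page_size`, `page_shift_of`) are made up for this example, and it is not what jemalloc does.

```c
#include <stdbool.h>
#include <unistd.h>

/* Ask the running system for its page size in bytes (POSIX). */
long
runtime_page_size(void)
{
	return sysconf(_SC_PAGESIZE);
}

/* True if n is a positive power of two, as any sane page size is. */
bool
is_power_of_two(long n)
{
	return n > 0 && (n & (n - 1)) == 0;
}

/* Convert a power-of-two page size into the matching shift count. */
unsigned
page_shift_of(long page_size)
{
	unsigned shift = 0;

	while (page_size > 1) {
		page_size >>= 1;
		shift++;
	}
	return shift;
}
```

For a 4096-byte page, `page_shift_of()` returns 12. The catch is that a runtime value cannot be used where the allocator needs a compile-time constant.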
Re: Jemalloc fallout on sandpoint
In article <20190408213840.ga11...@orion.coris.org.uk>, Julian Coleman wrote:
>Hi all,
>
>Upgraded my QNAP TS-201 (sandpoint) to current, and all binaries crash with:
>
> :
>/usr/src/external/bsd/jemalloc/lib/../dist/src/pages.c:273: Failed
>assertion: "PAGE_ADDR2BASE(addr) == addr"
> [1] Abort trap (core dumped) sh
>
>Not sure how we can pass in an address that isn't the page base address here.
>It looks like the allocations from pages.c have the same assertion, so they
>shouldn't cause this. Could base_block_alloc() be allocating a block that
>starts at an address that isn't a multiple of the page size? Should we
>assert "PAGE_ADDR2BASE(block) == block" every time we allocate a block?

The problem is that on powerpc we use MAX_PAGE_SHIFT, not the right page
shift for the machine. I will fix it to compute and use the MIN_PAGE_SHIFT
soon.

christos
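To illustrate why a too-large page shift trips that assertion, here is a small standalone C sketch. It is not the jemalloc macro itself, only the same rounding idea, and the shifts 12 and 14 used in the explanation are illustrative assumptions rather than the actual MIN_PAGE_SHIFT and MAX_PAGE_SHIFT values on powerpc.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round addr down to the start of the page of size (1 << page_shift). */
uintptr_t
page_addr2base(uintptr_t addr, unsigned page_shift)
{
	uintptr_t mask = ((uintptr_t)1 << page_shift) - 1;

	return addr & ~mask;
}

/* The check the failed assertion performs, for a given page shift. */
bool
is_page_base(uintptr_t addr, unsigned page_shift)
{
	return page_addr2base(addr, page_shift) == addr;
}
```

An address such as 0x1000 handed out by a kernel with 4 KiB pages is a page base for shift 12, but `is_page_base(0x1000, 14)` is false because it rounds down to 0. If the allocator is compiled with a shift larger than the real one, perfectly valid mappings therefore fail the check.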
Re: Mesa update
On Tue, Apr 09, 2019 at 01:41:58PM +0200, Joerg Sonnenberger wrote:
> Sorry, my magic 8-ball is broken. dlerror?

coypu@ already did the honours:

/usr/X11R7/lib/libGL.so: Use of initialized Thread Local Storage with model
initial-exec and dlopen is not supported

Cheers,

Patrick
Re: Mesa update
... but maybe we want to define -DPTHREADS?
Re: Mesa update
On Tue, Apr 09, 2019 at 12:17:46PM +0100, Patrick Welche wrote:
> On Tue, Apr 09, 2019 at 10:38:14AM +, co...@sdf.org wrote:
> > thanks to jmcneill for suggesting dlerror();
> >
> > perhaps we need to remove -DGLX_USE_TLS. it otherwise uses TLS via
> > pthread.
> >
> > /usr/X11R7/lib/libGL.so: Use of initialized Thread Local Storage with model
> > initial-exec and dlopen is not supported
> >
> > I'm going to test
>
> Thanks!
>
> (This might end up being "interesting":
>
> https://gitlab.freedesktop.org/mesa/mesa/blob/master/meson.build#L346
>
> so AFAICT the "default" meson build defines GLX_USE_TLS no matter what...)
>
>
> Cheers,
>
> Patrick

There's a bunch of patches in the pkgsrc mesalib18, I'll give them a try.

patch-src_glx_glxclient.h
patch-src_glx_glxcurrent.c
patch-src_mapi_entry__x86-64__tls.h
patch-src_mapi_entry__x86__tls.h
patch-src_mapi_u__current.c
patch-src_mapi_u__current.h

(they apply with some fuzz)
Re: Mesa update
On Tue, Apr 09, 2019 at 10:38:14AM +, co...@sdf.org wrote:
> thanks to jmcneill for suggesting dlerror();
>
> perhaps we need to remove -DGLX_USE_TLS. it otherwise uses TLS via
> pthread.
>
> /usr/X11R7/lib/libGL.so: Use of initialized Thread Local Storage with model
> initial-exec and dlopen is not supported
>
> I'm going to test

Thanks!

(This might end up being "interesting":

https://gitlab.freedesktop.org/mesa/mesa/blob/master/meson.build#L346

so AFAICT the "default" meson build defines GLX_USE_TLS no matter what...)


Cheers,

Patrick
Re: Mesa update
thanks to jmcneill for suggesting dlerror();

perhaps we need to remove -DGLX_USE_TLS. it otherwise uses TLS via
pthread.

/usr/X11R7/lib/libGL.so: Use of initialized Thread Local Storage with model
initial-exec and dlopen is not supported

I'm going to test

Index: libmesa.mk
===
RCS file: /cvsroot/src/external/mit/xorg/lib/libmesa.mk,v
retrieving revision 1.6
diff -u -r1.6 libmesa.mk
--- libmesa.mk 3 Apr 2019 15:26:34 - 1.6
+++ libmesa.mk 9 Apr 2019 10:37:54 -
@@ -521,7 +521,6 @@
 	-DHAVE_LIBDRM -DGLX_USE_DRM \
 	-DGLX_INDIRECT_RENDERING \
 	-DGLX_DIRECT_RENDERING \
-	-DGLX_USE_TLS \
 	-DHAVE_X11_PLATFORM \
 	-DHAVE_DRM_PLATFORM \
 	-DENABLE_SHADER_CACHE \
Index: libGL/Makefile
===
RCS file: /cvsroot/src/external/mit/xorg/lib/libGL/Makefile,v
retrieving revision 1.23
diff -u -r1.23 Makefile
--- libGL/Makefile 10 Mar 2019 10:51:58 - 1.23
+++ libGL/Makefile 9 Apr 2019 10:37:55 -
@@ -164,7 +164,7 @@
 	-DHAVE_FUNC_ATTRIBUTE_NORETURN=1 -DHAVE_ENDIAN_H=1 -DHAVE_DLADDR=1 \
 	-DHAVE_CLOCK_GETTIME=1 -DHAVE_PTHREAD_PRIO_INHERIT=1 \
 	-DHAVE_PTHREAD=1 -DENABLE_ST_OMX_BELLAGIO=0 -DENABLE_ST_OMX_TIZONIA=0 \
-	-DHAVE_TIMESPEC_GET -DGLX_USE_TLS
+	-DHAVE_TIMESPEC_GET
 .include "../asm.mk"
Index: libglapi/Makefile
===
RCS file: /cvsroot/src/external/mit/xorg/lib/libglapi/Makefile,v
retrieving revision 1.4
diff -u -r1.4 Makefile
--- libglapi/Makefile 10 Mar 2019 10:51:58 - 1.4
+++ libglapi/Makefile 9 Apr 2019 10:37:55 -
@@ -68,7 +68,6 @@
 	-DGLX_USE_DRM \
 	-DGLX_INDIRECT_RENDERING \
 	-DGLX_DIRECT_RENDERING \
-	-DGLX_USE_TLS \
 	-DHAVE_X11_PLATFORM \
 	-DHAVE_DRM_PLATFORM \
 	-DENABLE_SHADER_CACHE \
Re: Mesa update
On Tue, Apr 09, 2019 at 10:05:08AM +0100, Patrick Welche wrote:
> On Fri, Apr 05, 2019 at 04:10:24PM +, co...@sdf.org wrote:
> > hi current-users,
> >
> > -current is now going to use mesa 18.3.4, and on x86, LLVM for radeon
> > and software acceleration. It's faster and supports more modern OpenGL
> > functionality. Software raster on x86 is now done using the faster
> > llvmpipe.
> > (Thanks to mrg@ and joerg@).
> >
> > This will increase your build times dramatically if you build Xorg on
> > x86, from building LLVM libraries.
> >
> > If you would like to do an update build, you will likely have to remove
> > many directories in OBJDIR/external/mit/xorg/lib/*.
> > I didn't test this, sorry.
> >
> > Let me know if there are any situations for which this fails to work. We
> > got really good testing of things before committing it so I don't expect
> > much trouble.
>
> Wondering why I couldn't get glmark2 to work, I see the following oddity:
>
>
> #include <dlfcn.h>
> #include <stdio.h>
>
> int main()
> {
>     void *handle;
>
>     handle = dlopen("/usr/X11R7/lib/libGL.so.2", RTLD_NOW | RTLD_NODELETE);
>     printf("GL version 2 handle = %p\n", handle);
>
>     handle = dlopen("/usr/X11R7/lib/libGL.so.3", RTLD_NOW | RTLD_NODELETE);
>     printf("GL version 3 handle = %p\n", handle);
>
>     return 0;
> }
>
> $ ./glmark
> GL version 2 handle = 0x7f7ff7ef9800
> GL version 3 handle = 0x0
> $ file /usr/X11R7/lib/libGL.so.*
> /usr/X11R7/lib/libGL.so.2: symbolic link to libGL.so.2.0
> /usr/X11R7/lib/libGL.so.2.0: ELF 64-bit LSB shared object, x86-64, version 1
> (SYSV), dynamically linked, for NetBSD 8.99.3, not stripped
> /usr/X11R7/lib/libGL.so.3: symbolic link to libGL.so.3.0
> /usr/X11R7/lib/libGL.so.3.0: ELF 64-bit LSB shared object, x86-64, version 1
> (SYSV), dynamically linked, for NetBSD 8.99.37, not stripped
>
> so why will the old library dlopen, but not the new one?!

(if you try so.3 first, then so.2, you get 0x0 and a seg fault - I know you
wouldn't want to load 2 libraries with the same symbols in - but this
illustrates that there is a problem)
Re: Mesa update
On Fri, Apr 05, 2019 at 04:10:24PM +, co...@sdf.org wrote:
> hi current-users,
>
> -current is now going to use mesa 18.3.4, and on x86, LLVM for radeon
> and software acceleration. It's faster and supports more modern OpenGL
> functionality. Software raster on x86 is now done using the faster
> llvmpipe.
> (Thanks to mrg@ and joerg@).
>
> This will increase your build times dramatically if you build Xorg on
> x86, from building LLVM libraries.
>
> If you would like to do an update build, you will likely have to remove
> many directories in OBJDIR/external/mit/xorg/lib/*.
> I didn't test this, sorry.
>
> Let me know if there are any situations for which this fails to work. We
> got really good testing of things before committing it so I don't expect
> much trouble.

Wondering why I couldn't get glmark2 to work, I see the following oddity:


#include <dlfcn.h>
#include <stdio.h>

int main()
{
    void *handle;

    handle = dlopen("/usr/X11R7/lib/libGL.so.2", RTLD_NOW | RTLD_NODELETE);
    printf("GL version 2 handle = %p\n", handle);

    handle = dlopen("/usr/X11R7/lib/libGL.so.3", RTLD_NOW | RTLD_NODELETE);
    printf("GL version 3 handle = %p\n", handle);

    return 0;
}

$ ./glmark
GL version 2 handle = 0x7f7ff7ef9800
GL version 3 handle = 0x0
$ file /usr/X11R7/lib/libGL.so.*
/usr/X11R7/lib/libGL.so.2: symbolic link to libGL.so.2.0
/usr/X11R7/lib/libGL.so.2.0: ELF 64-bit LSB shared object, x86-64, version 1
(SYSV), dynamically linked, for NetBSD 8.99.3, not stripped
/usr/X11R7/lib/libGL.so.3: symbolic link to libGL.so.3.0
/usr/X11R7/lib/libGL.so.3.0: ELF 64-bit LSB shared object, x86-64, version 1
(SYSV), dynamically linked, for NetBSD 8.99.37, not stripped

so why will the old library dlopen, but not the new one?!

Cheers,

Patrick
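A NULL handle by itself does not say why the load failed; `dlerror()` returns the dynamic linker's own message. Below is a minimal sketch of a variant of the test above that reports it. The helper name `try_dlopen` is made up for this example, and unlike the program above it drops RTLD_NODELETE and closes the handle again, which is a simplification on my part.

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * Try to load a shared object. Returns 0 on success and -1 on failure,
 * printing the dynamic linker's explanation to stderr in the latter case.
 */
int
try_dlopen(const char *path)
{
	void *handle = dlopen(path, RTLD_NOW);

	if (handle == NULL) {
		/* dlerror() describes the most recent dlopen failure. */
		const char *msg = dlerror();

		fprintf(stderr, "dlopen(%s) failed: %s\n", path,
		    msg != NULL ? msg : "unknown error");
		return -1;
	}
	dlclose(handle);
	return 0;
}
```

Calling `try_dlopen("/usr/X11R7/lib/libGL.so.3")` on an affected system should print whatever reason the runtime linker gives. On systems where the dl functions live in a separate library, this needs `-ldl` at link time.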