11.0-CURRENT: SBUF_INCLUDENUL, /head/sys/kern/subr_sbuf.c -r280193 vs. updating to head snapshot -r280598
Basic context: # freebsd-version -ku; uname -apKU 11.0-CURRENT 11.0-CURRENT FreeBSD FBSDG5C0 11.0-CURRENT FreeBSD 11.0-CURRENT #1 r279514M: Sat Mar 21 05:15:23 PDT 2015 root@FBSDG5C0:/usr/obj/usr/srcC/sys/GENERIC64vtsc-NODEBUG powerpc powerpc64 1100062 1100062 The problem: Summary of the details that are listed later. Both of the following exist: /usr/src/sys/sys/sbuf.h /usr/include/sys/sbuf.h The first can be newer than the 2nd during buildworld. The buildworld compile of /head/sys/kern/subr_sbuf.c from an updated /usr/src can/does end up using the second instead of the first, at least for the powerpc64-xtoolchain-gcc style of buildworld activity that I am trying. The recent addition of SBUF_INCLUDENUL use ends up with its definition missing because of this: during the build /usr/include/sys/sbuf.h ends up being the file included and the compile fails from the missing additional definition. Either the #include paths in /head/sys/kern/subr_sbuf.c or the command line arguments should force the /usr/src/sys/sys/sbuf.h vintage file to be found. The /head/sys/kern/subr_sbuf.c relevant includes are shown below... #include sys/cdefs.h __FBSDID($FreeBSD: head/sys/kern/subr_sbuf.c 280193 2015-03-17 21:00:31Z ian $); #include sys/param.h #ifdef _KERNEL #include sys/ctype.h #include sys/errno.h #include sys/kernel.h #include sys/malloc.h #include sys/systm.h #include sys/uio.h #include machine/stdarg.h #else /* _KERNEL */ #include ctype.h #include errno.h #include stdarg.h #include stdio.h #include stdlib.h #include string.h #endif /* _KERNEL */ #include sys/sbuf.h I have not checked for other .c files with similar issues for sys/sbuf.h usage during buildworld. The problem details: /head/sys/kern/subr_sbuf.c -r280193 added: #define SBUF_NULINCLUDED(s) ((s)-s_flags SBUF_INCLUDENUL) and head (20150325 r280598) contains it. 
But the SBUF_INCLUDENUL reference blocks buildworld (for at least /usr/local/bin/powerpc64-portbld-freebsd11.0-gcc (powerpc64-xtoolchain=gcc) use): /usr/local/bin/powerpc64-portbld-freebsd11.0-gcc -fpic -DPIC -O2 -pipe -std=gnu99 -fstack-protector -Wsystem-headers -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes -Wmissing-pro totypes -Wpointer-arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow -Wunused-parameter -Wcast-align -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls -Wold-style-definition -Wno-pointer-sign -c /usr/src/lib/libsbuf/../../sys/kern/subr_sbuf.c -o subr_sbuf.So ... /usr/src/lib/libsbuf/../../sys/kern/subr_sbuf.c:73:45: error: 'SBUF_INCLUDENUL' undeclared (first use in this function) #define SBUF_NULINCLUDED(s) ((s)-s_flags SBUF_INCLUDENUL) ^ Looking to see where SBUF_INCLUDENUL usage and definitions might be in /usr/src for -r280598 ... # pwd /usr/src # find . \( -type d -name .svn -prune \) -or \( -type f -exec grep SBUF_INCLUDENUL {} \; -print \) | more .It Dv SBUF_INCLUDENUL ./share/man/man9/sbuf.9 #define SBUF_INCLUDENUL 0x0002 /* nulterm byte is counted in len */ ./sys/sys/sbuf.h sbuf_clear_flags(sbuf, SBUF_INCLUDENUL); ./sys/vm/uma_core.c SBUF_INCLUDENUL); ./sys/netinet/tcp_hostcache.c sbuf_clear_flags(sbuf, SBUF_INCLUDENUL); ./sys/kern/kern_malloc.c SBUF_INCLUDENUL); ./sys/kern/kern_cons.c sbuf_clear_flags(sb, SBUF_INCLUDENUL); sbuf_clear_flags(sb, SBUF_INCLUDENUL); ./sys/kern/kern_descrip.c sbuf_clear_flags(sb, SBUF_INCLUDENUL); sbuf_clear_flags(sb, SBUF_INCLUDENUL); sbuf_clear_flags(sb, SBUF_INCLUDENUL); sbuf_clear_flags(sb, SBUF_INCLUDENUL); sbuf_clear_flags(sb, SBUF_INCLUDENUL); ./sys/kern/kern_proc.c sbuf_new(sb, NULL, 256, SBUF_AUTOEXTEND | SBUF_INCLUDENUL); ./sys/kern/kern_et.c sbuf_new(sb, NULL, 128, SBUF_AUTOEXTEND | SBUF_INCLUDENUL); ./sys/kern/kern_fail.c s = sbuf_new(s, buf, length, SBUF_FIXEDLEN | SBUF_INCLUDENUL); ./sys/kern/kern_sysctl.c #define SBUF_NULINCLUDED(s) ((s)-s_flags 
SBUF_INCLUDENUL) ./sys/kern/subr_sbuf.c

Looking at the list of includes in /head/sys/kern/subr_sbuf.c for -r280193 shows:

#include <sys/cdefs.h>
__FBSDID("$FreeBSD: head/sys/kern/subr_sbuf.c 280193 2015-03-17 21:00:31Z ian $");

#include <sys/param.h>

#ifdef _KERNEL
#include <sys/ctype.h>
#include <sys/errno.h>
#include <sys/kernel.h>
#include <sys/malloc.h>
#include <sys/systm.h>
#include <sys/uio.h>
#include <machine/stdarg.h>
#else /* _KERNEL */
#include <ctype.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#endif /* _KERNEL */

#include <sys/sbuf.h>

That there was no complaint about sbuf.h being missing suggests that a sys/sbuf.h was found but did not contain a SBUF_INCLUDENUL definition: so a different one than the above find/grep reported. Using a find to
11.0-CURRENT: BIO_F_DGRAM_SCTP_WRITE, /head/crypto/openssl/crypto/bio/bio_err.c -r280297 vs. updating to head snapshot -r280598
Basic context: # freebsd-version -ku; uname -apKU 11.0-CURRENT 11.0-CURRENT FreeBSD FBSDG5C0 11.0-CURRENT FreeBSD 11.0-CURRENT #1 r279514M: Sat Mar 21 05:15:23 PDT 2015 root@FBSDG5C0:/usr/obj/usr/srcC/sys/GENERIC64vtsc-NODEBUG powerpc powerpc64 1100062 1100062 The problem: Summary of the details that are listed later. All of the following exist: /usr/src/crypto/openssl/crypto/bio/bio.h /usr/obj/usr/src/tmp/usr/include/openssl/bio.h /usr/include/openssl/bio.h The first two can be newer than the last during buildworld. The buildworld compile of /head/crypto/openssl/crypto/bio/bio_err.c from an updated /usr/src can/does end up using the last instead of one of the first two, at least for the powerpc64-xtoolchain-gcc style of buildworld activity that I am trying. The recent addition of BIO_F_DGRAM_SCTP_WRITE ends up with its definition missing because of this: during the build /usr/include/openssl/bio.h ends up being the file included and the compile fails from the missing additional definition. Either the #include paths in /head/crypto/openssl/crypto/bio/bio_err.c or the command line arguments should force the /usr/obj/usr/src/tmp/usr/include/openssl/bio.h (or /usr/src/crypto/openssl/crypto/bio/bio.h ) vintage file to be found. The bio.h relevant includes are shown below... #include stdio.h #include openssl/err.h #include openssl/bio.h More than bio.h might have such issues since there is an openssl/err.h include path in /head/crypto/openssl/crypto/bio/bio_err.c . I have not checked for other .c files with similar issues for openssl/... usage during buildworld. The problem details: /head/crypto/openssl/crypto/bio/bio_err.c -r280297 added: {ERR_FUNC(BIO_F_DGRAM_SCTP_WRITE), DGRAM_SCTP_WRITE}, and head (20150325 r280598) contains it. 
But the BIO_F_DGRAM_SCTP_WRITE reference blocks buildworld (for at least /usr/local/bin/powerpc64-portbld-freebsd11.0-gcc (powerpc64-xtoolchain=gcc) use): /usr/local/bin/powerpc64-portbld-freebsd11.0-gcc -fpic -DPIC -O2 -pipe -DTERMIOS -DANSI_SOURCE -I/usr/src/secure/lib/libcrypto/../../../crypto/openssl -I/usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto -I/usr/obj/usr/src/secure/lib/libcrypto -DOPENSSL_THREADS -DDSO_DLFCN -DHAVE_DLFCN_H -I/usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/asn1 -I/usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/evp -I/usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/modes -std=gnu89 -fstack-protector -Wno-pointer-sign -c /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/bio/bio_err.c -o bio_err.So In file included from /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/bio/bio_err.c:63:0: /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/bio/bio_err.c:99:15: error: 'BIO_F_DGRAM_SCTP_WRITE' undeclared here (not in a function) {ERR_FUNC(BIO_F_DGRAM_SCTP_WRITE), DGRAM_SCTP_WRITE}, ^ Looking to see where usage and definitions might be in /usr/src for -r280598 ... # pwd /usr/src $ find . \( -type d -name .svn -prune \) -or \( -type f -exec grep BIO_F_DGRAM_SCTP_WRITE {} \; -print \) | more # define BIO_F_DGRAM_SCTP_WRITE 133 ./crypto/openssl/crypto/bio/bio.h BIOerr(BIO_F_DGRAM_SCTP_WRITE, ERR_R_MALLOC_ERROR); ./crypto/openssl/crypto/bio/bss_dgram.c {ERR_FUNC(BIO_F_DGRAM_SCTP_WRITE), DGRAM_SCTP_WRITE}, ./crypto/openssl/crypto/bio/bio_err.c And looking at the list of includes in /head/crypto/openssl/crypto/bio/bio_err.c -r280297 shows: #include stdio.h #include openssl/err.h #include openssl/bio.h That there was no complaint about bio.h being missing suggests that a openssl/bio.h was found but did not contain a BIO_F_DGRAM_SCTP_WRITE definition: so a different one than a copy of what the above find/grep reported. 
Using a find to report other bio.h files shows:

# find / \( -type d -name .svn -prune \) -or \( -type f -name bio.h -print \) | more
/usr/src/crypto/openssl/crypto/bio/bio.h
/usr/src/sys/sys/bio.h
/usr/obj/usr/src/tmp/usr/include/openssl/bio.h
/usr/include/openssl/bio.h
/usr/include/sys/bio.h

(Ignoring .../sys/bio.h as distinct by content and by path prefix...)

The diff of /usr/obj/usr/src/tmp/usr/include/openssl/bio.h and /usr/include/openssl/bio.h shows the problem if the wrong file is found and used:

diff -w /usr/src/crypto/openssl/crypto/bio/bio.h /usr/include/openssl/bio.h | less
...
797,798c775
< /*
< * The following lines are auto generated by the script mkerr.pl. Any changes
---
> /* The following lines are auto generated by the script mkerr.pl. Any changes
832d808
< # define BIO_F_DGRAM_SCTP_WRITE 133

Context details:

make -j 8 CROSS_TOOLCHAIN=powerpc64-gcc WITHOUT_CLANG_BOOTSTRAP= WITHOUT_CLANG= WITHOUT_CLANG_IS_CC= \
WITHOUT_LLDB= \
WITH_GCC_BOOTSTRAP= WITH_GCC= WITHOUT_GNUCXX= \
WITHOUT_BOOT= WITHOUT_LIB32= \
buildworld
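The failure mode reported here comes down to search order: for #include <openssl/bio.h> the compiler takes the first directory in its search list that contains the file, whether or not that copy is current. The following is a minimal sketch of that first-match rule in C, not the compiler's actual implementation; struct header_file, resolve_header() and the defines_macro flag are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One header copy known to exist on disk (illustrative model only). */
struct header_file {
	const char *dir;	/* directory that contains the header */
	const char *name;	/* name as written in the #include */
	int defines_macro;	/* 1 if this copy has the new definition */
};

/*
 * Walk the search directories in order and return the first copy of
 * `name` that exists, or NULL if no directory has it.  Later copies
 * are never looked at, even when they are newer.
 */
const struct header_file *
resolve_header(const char *const *dirs, size_t ndirs,
    const struct header_file *files, size_t nfiles, const char *name)
{
	assert(dirs != NULL && files != NULL && name != NULL);
	for (size_t d = 0; d < ndirs; d++) {
		for (size_t f = 0; f < nfiles; f++) {
			if (strcmp(files[f].dir, dirs[d]) == 0 &&
			    strcmp(files[f].name, name) == 0)
				return (&files[f]);
		}
	}
	return (NULL);
}
```

With /usr/include ahead of /usr/obj/usr/src/tmp/usr/include in the list, the stale copy wins and the new macro appears undeclared; with the order swapped, the freshly staged copy is found first.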
Re: [Call for testers] DRM device-independent code update to Linux 3.8 (take #2)
On 26 mar 2015, at 22:00, Jakob Alvermark wrote:
On Tue, March 24, 2015 00:29, Hans Petter Selasky wrote:

Hi,

Without the attached kernel patch(es), Xorg starts consuming a lot of CPU and becomes very unresponsive and unusable. Using ktrace reveals that Xorg is issuing DRM_IOCTL_MODE_GETCONNECTOR over and over again for no apparent reason. It doesn't happen when using a simple window manager like blackbox. I was not able to use XFCE4 (9-stable userland) with an 11-current kernel at all after the latest DRM2 kernel updates. It worked fine before the update. I'm not sure what is causing it.

Going through the new DRM2 code revealed that a mode sorting function did not take all parameters, such as interlaced or not, into account, causing the mode list to be reshuffled every time a new mode scan was done. Not sure if Xorg cares about this though.

I got the same problem with XFCE4, Xorg at 100% CPU. Applied the patch and it works again. Interestingly, xrandr now lists a lot more available modes than before the DRM code update. I thought it was my cheap TV that only supported 1080i, but it turns out that now I can use 1080p@60Hz!

Jakob

___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Mbuf leak in if_lagg.c
On Thursday 26 March 2015, 23:10:49, Andrey V. Elsukov wrote:
On 26.03.2015 22:42, Andrey V. Elsukov wrote:

If lp_detaching is non-zero, the mbuf pointer is set to NULL without calling m_freem() on it. Can you look at this?

Hi, what do you think about this patch? lp_detaching can be non-zero in case of parent interface departure, so I don't see the reason to call ETHER_BPF_MTAP() in this case.

Now I see the reason: to capture all received packets before interface departure. A new patch is attached.

Sounds good to me :)

--
Alexandre Martins
STORMSHIELD
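The leak pattern discussed in this thread (dropping the only pointer to a buffer without freeing it) can be illustrated in userland C. This is only a sketch under stated assumptions: the attached patch is not reproduced in the thread, and struct mock_mbuf, mock_m_freem(), input_leaky() and input_fixed() are hypothetical stand-ins, not the if_lagg.c code or the kernel's m_freem(9).

```c
#include <stdlib.h>

/* Stand-in for the kernel's struct mbuf; only what the sketch needs. */
struct mock_mbuf {
	int payload;
};

int mock_frees;		/* counts calls to mock_m_freem() */

/* Stand-in for m_freem(9): release the buffer and count the call. */
void
mock_m_freem(struct mock_mbuf *m)
{
	free(m);
	mock_frees++;
}

/* Leaky variant: while detaching, the only reference is overwritten. */
struct mock_mbuf *
input_leaky(struct mock_mbuf *m, int detaching)
{
	if (detaching)
		m = NULL;	/* the buffer is never freed */
	return (m);
}

/* Fixed variant: free the buffer before dropping the reference. */
struct mock_mbuf *
input_fixed(struct mock_mbuf *m, int detaching)
{
	if (detaching) {
		mock_m_freem(m);
		m = NULL;
	}
	return (m);
}
```

Both variants return NULL when detaching, so callers cannot tell them apart; only the free counter (or, in a kernel, the mbuf statistics) shows the difference.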
Re: Early use of log() does not end up in kernel msg buffer
On 03/26/2015 23:20, Eric Badger wrote: Using log(9) when no process is reading the log results in the message going only to the console (contrast with printf(9), which goes to the console and to the kernel message buffer in this case). I believe it is truer to the semantics of logging for messages to *always* go to the message buffer (where they can eventually be collected and in fact put into a logfile). I therefore propose the attached patch, which sends log(9) to the message buffer always, and to the console only if no one has yet opened the log. This makes sense to me. Since I'm new here, I'll wait for others to comment. Eric
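The routing change proposed above can be written down as a small decision function. This is a sketch, not the attached patch: the enum, log_dest_current() and log_dest_proposed() are hypothetical names, and the "log is open" branch of the current behaviour is an assumption, since the message only describes what happens when no process is reading the log.

```c
#include <stdbool.h>

/* Possible destinations for a log(9) message, as bit flags. */
enum log_dest {
	DEST_NONE    = 0,
	DEST_MSGBUF  = 1 << 0,	/* kernel message buffer */
	DEST_CONSOLE = 1 << 1	/* system console */
};

/*
 * Behaviour described as current: with no reader, console only.
 * ASSUMPTION: with a reader, the message goes to the message buffer.
 */
int
log_dest_current(bool log_open)
{
	return (log_open ? DEST_MSGBUF : DEST_CONSOLE);
}

/*
 * Proposed behaviour: always the message buffer, plus the console
 * only while no one has opened the log yet.
 */
int
log_dest_proposed(bool log_open)
{
	int dest = DEST_MSGBUF;

	if (!log_open)
		dest |= DEST_CONSOLE;
	return (dest);
}
```

The only case that changes is the early one (log not yet opened), where the proposal adds the message buffer to the console output.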
11.0-CURRENT: DTLS1_VERSION_MAJOR, /head/crypto/openssl/ssl/ssl_asn1.c -r280297 vs. updating to head snapshot -r280598
/head/crypto/openssl/ssl/ssl_asn1.c has a similar issue to /head/crypto/openssl/crypto/bio/bio_err.c but relative to:

/usr/local/bin/powerpc64-portbld-freebsd11.0-gcc -fpic -DPIC -O2 -pipe -DTERMIOS -DANSI_SOURCE -I/usr/src/secure/lib/libssl/../../../crypto/openssl -I/usr/src/secure/lib/libssl/../../../crypto/openssl/crypto -I/usr/obj/usr/src/secure/lib/libssl -DOPENSSL_THREADS -DDSO_DLFCN -DHAVE_DLFCN_H -std=gnu99 -fstack-protector -Wno-pointer-sign -c /usr/src/secure/lib/libssl/../../../crypto/openssl/ssl/ssl_asn1.c -o ssl_asn1.So
/usr/src/secure/lib/libssl/../../../crypto/openssl/ssl/ssl_asn1.c: In function 'd2i_SSL_SESSION':
/usr/src/secure/lib/libssl/../../../crypto/openssl/ssl/ssl_asn1.c:425:34: error: 'DTLS1_VERSION_MAJOR' undeclared (first use in this function)
|| (ssl_version >> 8) == DTLS1_VERSION_MAJOR
^

# pwd
/usr/src
# find . \( -type d -name .svn -prune \) -or \( -type f -exec grep DTLS1_VERSION_MAJOR {} \; -print \) | more
# define DTLS1_VERSION_MAJOR 0xFE
./crypto/openssl/ssl/dtls1.h
|| (ssl_version >> 8) == DTLS1_VERSION_MAJOR
./crypto/openssl/ssl/ssl_asn1.c

# find / \( -type d -name .svn -prune \) -or \( -type f -name dtls1.h -print \) | more
/usr/src/crypto/openssl/ssl/dtls1.h
/usr/obj/usr/src/tmp/usr/include/openssl/dtls1.h
/usr/include/openssl/dtls1.h

#include <stdio.h>
#include <stdlib.h>
#include "ssl_locl.h"
#include <openssl/asn1_mac.h>
#include <openssl/objects.h>
#include <openssl/x509.h>

So the crypto/openssl/ssl/dtls1.h or /usr/obj/usr/src/tmp/usr/include/openssl/dtls1.h file is not directly included at all. Finding where it is included... Omitting most Makefile lines...

# find . \( -type d -name .svn -prune \) -or \( -type f -exec grep dtls1\.h {} \; -print \) | more
INCS= dtls1.h kssl.h srtp.h ssl.h ssl2.h ssl23.h ssl3.h tls1.h
./secure/lib/libssl/Makefile
...
./crypto/openssl/apps/Makefile
/* ssl/dtls1.h */
./crypto/openssl/ssl/dtls1.h
...
./crypto/openssl/ssl/Makefile # include openssl/dtls1.h /* Datagram TLS */ ./crypto/openssl/ssl/ssl.h crypto/openssl/ssl/ssl.h is not directly included. (I will stop here as far as the include sequence goes since a dtls1.h was found.) The openssl/dtls1.h reference is finding /usr/include/openssl/dtls1.h (old) instead of /usr/obj/usr/src/tmp/usr/include/openssl/dtls1.h or /usr/src/crypto/openssl/ssl/dtls1.h (new): # diff -w /usr/include/openssl/dtls1.h /usr/obj/usr/src/tmp/usr/include/openssl/dtls1.h | more 87a88 # define DTLS1_VERSION_MAJOR 0xFE 123,129c124,128 ... === Mark Millard markmi at dsl-only.net On 2015-Mar-27, at 02:44 AM, Mark Millard markmi at dsl-only.net wrote: Basic context: # freebsd-version -ku; uname -apKU 11.0-CURRENT 11.0-CURRENT FreeBSD FBSDG5C0 11.0-CURRENT FreeBSD 11.0-CURRENT #1 r279514M: Sat Mar 21 05:15:23 PDT 2015 root@FBSDG5C0:/usr/obj/usr/srcC/sys/GENERIC64vtsc-NODEBUG powerpc powerpc64 1100062 1100062 The problem: Summary of the details that are listed later. All of the following exist: /usr/src/crypto/openssl/crypto/bio/bio.h /usr/obj/usr/src/tmp/usr/include/openssl/bio.h /usr/include/openssl/bio.h The first two can be newer than the last during buildworld. The buildworld compile of /head/crypto/openssl/crypto/bio/bio_err.c from an updated /usr/src can/does end up using the last instead of one of the first two, at least for the powerpc64-xtoolchain-gcc style of buildworld activity that I am trying. The recent addition of BIO_F_DGRAM_SCTP_WRITE ends up with its definition missing because of this: during the build /usr/include/openssl/bio.h ends up being the file included and the compile fails from the missing additional definition. Either the #include paths in /head/crypto/openssl/crypto/bio/bio_err.c or the command line arguments should force the /usr/obj/usr/src/tmp/usr/include/openssl/bio.h (or /usr/src/crypto/openssl/crypto/bio/bio.h ) vintage file to be found. The bio.h relevant includes are shown below... 
#include stdio.h #include openssl/err.h #include openssl/bio.h More than bio.h might have such issues since there is an openssl/err.h include path in /head/crypto/openssl/crypto/bio/bio_err.c . I have not checked for other .c files with similar issues for openssl/... usage during buildworld. The problem details: /head/crypto/openssl/crypto/bio/bio_err.c -r280297 added: {ERR_FUNC(BIO_F_DGRAM_SCTP_WRITE), DGRAM_SCTP_WRITE}, and head (20150325 r280598) contains it. But the BIO_F_DGRAM_SCTP_WRITE reference blocks buildworld (for at least /usr/local/bin/powerpc64-portbld-freebsd11.0-gcc (powerpc64-xtoolchain=gcc) use): /usr/local/bin/powerpc64-portbld-freebsd11.0-gcc -fpic -DPIC -O2 -pipe -DTERMIOS -DANSI_SOURCE -I/usr/src/secure/lib/libcrypto/../../../crypto/openssl -I/usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto -I/usr/obj/usr/src/secure/lib/libcrypto -DOPENSSL_THREADS
[PATCH] Adding backlight support for the i915 driver.
This patch exposes the backlight support via a sysctl:

set the backlight to 10%:
# sysctl hw.dri.0.i915_backlight=10
hw.dri.0.i915_backlight: 25 -> 10

set the backlight to 50%:
# sysctl hw.dri.0.i915_backlight=50
hw.dri.0.i915_backlight: 10 -> 50

decrease the current backlight value:
# sysctl hw.dri.0.i915_backlight=-1000
hw.dri.0.i915_backlight: 50 -> 43

increment the current backlight value:
# sysctl hw.dri.0.i915_backlight=1000
hw.dri.0.i915_backlight: 43 -> 51
# sysctl hw.dri.0.i915_backlight=1000
hw.dri.0.i915_backlight: 51 -> 60

I have been running this patch for about a week without issue. The patch can be found at: https://github.com/maurizio-emmex/i915_backlight_freebsd

I thank Elizabeth Myers, elizabeth at interlinked dot me, for the idea of adding the backlight support for the i915 driver and for the original patch.

Regards,
Maurizio
locking issue between igmp and route code in current?
Hi,

I noticed the following checkins to route.c in current, and was wondering if they have any bearing on the deadlock documented below:

Revision 274589 (http://svnweb.freebsd.org/base?view=revision&revision=274589)
Modified Sun Nov 16 18:15:23 2014 UTC (4 months, 1 week ago) by melifaro
File length: 46990 byte(s)
Diff to previous 274585
Revert r274585: rte lock is properly destroyed in uma dtor callback.
Pointed by: glebius

Revision 274585 (http://svnweb.freebsd.org/base?view=revision&revision=274585)
Modified Sun Nov 16 14:56:31 2014 UTC (4 months, 1 week ago) by melifaro
File length: 47013 byte(s)
Diff to previous 274187
Make witness happy: destroy rte lock before free.
MFC after: 2 weeks

lock order reversal:
1st 0xf80003d62190 if_addr_lock (if_addr_lock) @ /u/lars/sandbox/builds/current_10032015/sys/netinet/igmp.c:1714
2nd 0xf800090d7be0 radix node head (radix node head) @ /u/lars/sandbox/builds/current_10032015/sys/net/route.c:415
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe0043faf3f0
witness_checkorder() at witness_checkorder+0xbe7/frame 0xfe0043faf480
__rw_rlock() at __rw_rlock+0x5a/frame 0xfe0043faf520
rtalloc1_fib() at rtalloc1_fib+0x60/frame 0xfe0043faf5d0
rtalloc_ign_fib() at rtalloc_ign_fib+0x98/frame 0xfe0043faf610
flowtable_lookup_common() at flowtable_lookup_common+0x1e6/frame 0xfe0043faf6f0
flowtable_lookup() at flowtable_lookup+0x10f/frame 0xfe0043faf750
ip_output() at ip_output+0x87/frame 0xfe0043faf840
igmp_intr() at igmp_intr+0x2ed/frame 0xfe0043faf8c0
netisr_dispatch_src() at netisr_dispatch_src+0x61/frame 0xfe0043faf930
igmp_v1v2_queue_report() at igmp_v1v2_queue_report+0x14b/frame 0xfe0043faf980
igmp_fasttimo() at igmp_fasttimo+0x381/frame 0xfe0043fafa30
pffasttimo() at pffasttimo+0x54/frame 0xfe0043fafa60
softclock_call_cc() at softclock_call_cc+0x165/frame 0xfe0043fafb20
softclock() at softclock+0x3d/frame 0xfe0043fafb40
intr_event_execute_handlers() at intr_event_execute_handlers+0xb1/frame 0xfe0043fafb70
ithread_loop() at ithread_loop+0x9c/frame 0xfe0043fafbb0
fork_exit() at fork_exit+0x71/frame 0xfe0043fafbf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe0043fafbf0

This was on a current build as of March 10 - svn 279869

Lars
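A "lock order reversal" report like the one above means that two locks were taken in one order here (if_addr_lock, then radix node head) after the opposite order had been observed somewhere else. The following is a minimal sketch of that kind of bookkeeping in C. It is not WITNESS's actual algorithm, and struct order_checker, checker_init(), checker_acquire() and checker_release_all() are hypothetical names; the report does not say which code path established the opposite order, so the usage below simply assumes one exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_LOCKS 8

/* order_seen[a][b] is true once lock a was held while b was acquired. */
struct order_checker {
	bool order_seen[MAX_LOCKS][MAX_LOCKS];
	int held[MAX_LOCKS];	/* locks currently held, in order */
	size_t nheld;
};

void
checker_init(struct order_checker *oc)
{
	memset(oc, 0, sizeof(*oc));
}

/*
 * Record the acquisition of `lock` while the locks in held[] are held.
 * Returns true if the opposite order was recorded earlier for any of
 * them, i.e. a lock order reversal.
 */
bool
checker_acquire(struct order_checker *oc, int lock)
{
	bool reversal = false;

	assert(lock >= 0 && lock < MAX_LOCKS && oc->nheld < MAX_LOCKS);
	for (size_t i = 0; i < oc->nheld; i++) {
		int h = oc->held[i];

		if (h == lock)
			continue;	/* recursion is out of scope here */
		if (oc->order_seen[lock][h])
			reversal = true;
		oc->order_seen[h][lock] = true;
	}
	oc->held[oc->nheld++] = lock;
	return (reversal);
}

/* Drop all held locks; recorded orders are kept. */
void
checker_release_all(struct order_checker *oc)
{
	oc->nheld = 0;
}
```

If one path acquires lock 1 and then lock 0, and a later path acquires lock 0 and then lock 1, the second checker_acquire() call on the later path returns true. A reversal is a potential deadlock, not proof that one occurred.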
Re: 11.0-CURRENT: SCTP_MAX_CWND, lib/libc/net/sctp_sys_calls.c -r279859 vs. updating to head snapshot -r280598
On Fri, 2015-03-27 at 10:17 -0500, Michael Tuexen wrote: On 26 Mar 2015, at 21:36, Mark Millard mar...@dsl-only.net wrote: Basic context: # freebsd-version -ku; uname -apKU 11.0-CURRENT 11.0-CURRENT FreeBSD FBSDG5C0 11.0-CURRENT FreeBSD 11.0-CURRENT #1 r279514M: Sat Mar 21 05:15:23 PDT 2015 root@FBSDG5C0:/usr/obj/usr/srcC/sys/GENERIC64vtsc-NODEBUG powerpc powerpc64 1100062 1100062 The problem: Summary of the details that are listed later. Both of the following exist: /usr/src/sys/netinet/sctp.h /usr/include/netinet/sctp.h The first can be newer than the 2nd during buildworld. The buildworld compile of /head/lib/libc/net/sctp_sys_calls.c from an updated /usr/src can/does end up using the second instead of the first, at least for the powerpc64-xtoolchain-gcc style of buildworld activity that I am trying. The recent addition of SCTP_MAX_CWND ends up with its definition missing because of this: during the build /usr/include/netinet/sctp.h ends up being the file included and the compile fails from the missing additional definition. Either the #include paths in /head/lib/libc/net/sctp_sys_calls.c or the command line arguments should force the /usr/src/sys/netinet/sctp.h vintage file to be found. The 3 netinet/ relevant includes are shown below... ... #include netinet/in.h #include arpa/inet.h #include netinet/sctp_uio.h #include netinet/sctp.h More than sctp.h might have such issues since there are 3 netinet/ include paths in /head/lib/libc/net/sctp_sys_calls.c . I have not checked for other .c files with similar issues for netinet/... usage during buildworld. I guess there is something wrong with the build system / Makefiles such that the entries in the search path for include files are in the wrong order. I don't think this is related to the concrete patch you are referring to. It only exposes the problem. As I see, you experience similar problems in other situations to. Maybe someone knowing the build system has to look into it. 
And it seems to be somewhat platform specific, since I have not observed this problem when testing the build on amd64 and arm.

Best regards
Michael

This and the other similar reports on current@ appear to be problems with the xtoolchain ports, not the base build system, and probably should have been reported to the port's maintainer, or on ports@. Or perhaps it's some sort of usage error, I don't know anything about the xtoolchain stuff. In any case, there doesn't seem to be anything wrong with the base build using the supported build mechanisms.

-- Ian
Re: [PATCH] Adding backlight support for the i915 driver.
On 03/27/15 16:01, Ranjan1018 . wrote:

This patch exposes the backlight support via a sysctl:

set the backlight to 10%:
# sysctl hw.dri.0.i915_backlight=10
hw.dri.0.i915_backlight: 25 -> 10

set the backlight to 50%:
# sysctl hw.dri.0.i915_backlight=50
hw.dri.0.i915_backlight: 10 -> 50

decrease the current backlight value:
# sysctl hw.dri.0.i915_backlight=-1000
hw.dri.0.i915_backlight: 50 -> 43

increment the current backlight value:
# sysctl hw.dri.0.i915_backlight=1000
hw.dri.0.i915_backlight: 43 -> 51
# sysctl hw.dri.0.i915_backlight=1000
hw.dri.0.i915_backlight: 51 -> 60

I have been running this patch for about a week without issue. The patch can be found at: https://github.com/maurizio-emmex/i915_backlight_freebsd

I thank Elizabeth Myers, elizabeth at interlinked dot me, for the idea of adding the backlight support for the i915 driver and for the original patch.

Regards,
Maurizio

Maybe you want to use CTLFLAG_RWTUN so that it can also be set from /boot/loader.conf?

--HPS
Re: 11.0-CURRENT: SCTP_MAX_CWND, lib/libc/net/sctp_sys_calls.c -r279859 vs. updating to head snapshot -r280598
On 26 Mar 2015, at 21:36, Mark Millard mar...@dsl-only.net wrote: Basic context: # freebsd-version -ku; uname -apKU 11.0-CURRENT 11.0-CURRENT FreeBSD FBSDG5C0 11.0-CURRENT FreeBSD 11.0-CURRENT #1 r279514M: Sat Mar 21 05:15:23 PDT 2015 root@FBSDG5C0:/usr/obj/usr/srcC/sys/GENERIC64vtsc-NODEBUG powerpc powerpc64 1100062 1100062 The problem: Summary of the details that are listed later. Both of the following exist: /usr/src/sys/netinet/sctp.h /usr/include/netinet/sctp.h The first can be newer than the 2nd during buildworld. The buildworld compile of /head/lib/libc/net/sctp_sys_calls.c from an updated /usr/src can/does end up using the second instead of the first, at least for the powerpc64-xtoolchain-gcc style of buildworld activity that I am trying. The recent addition of SCTP_MAX_CWND ends up with its definition missing because of this: during the build /usr/include/netinet/sctp.h ends up being the file included and the compile fails from the missing additional definition. Either the #include paths in /head/lib/libc/net/sctp_sys_calls.c or the command line arguments should force the /usr/src/sys/netinet/sctp.h vintage file to be found. The 3 netinet/ relevant includes are shown below... ... #include netinet/in.h #include arpa/inet.h #include netinet/sctp_uio.h #include netinet/sctp.h More than sctp.h might have such issues since there are 3 netinet/ include paths in /head/lib/libc/net/sctp_sys_calls.c . I have not checked for other .c files with similar issues for netinet/... usage during buildworld. I guess there is something wrong with the build system / Makefiles such that the entries in the search path for include files are in the wrong order. I don't think this is related to the concrete patch you are referring to. It only exposes the problem. As I see, you experience similar problems in other situations to. Maybe someone knowing the build system has to look into it. 
And it seems to be somewhat platform specific, since I have not observed this problem when testing the build on amd64 and arm. Best regards Michael The problem details: /head/lib/libc/net/sctp_sys_calls.c -r279859 added: case SCTP_MAX_CWND: ((struct sctp_assoc_value *)arg)-assoc_id = id; break; and head (20150325 r280598) contains it. But the SCTP_MAX_CWND reference blocks buildworld (for at least /usr/local/bin/powerpc64-portbld-freebsd11.0-gcc (powerpc64-xtoolchain=gcc) use): /usr/local/bin/powerpc64-portbld-freebsd11.0-gcc -fpic -DPIC -O2 -pipe -I/usr/src/lib/libc/include -I/usr/src/lib/libc/../../include -I/usr/src/lib/libc/powerpc64 -DNLS -D__DBINTERFACE_PRIVATE -I/usr/src/lib/libc/../../contrib/gdtoa -I/usr/src/lib/libc/../../contrib/libc-vis -DINET6 -I/usr/obj/usr/src/lib/libc -I/usr/src/lib/libc/resolv -D_ACL_PRIVATE -DPOSIX_MISTAKE -I/usr/src/lib/libc/../libmd -I/usr/src/lib/libc/../../contrib/jemalloc/include -DMALLOC_PRODUCTION -I/usr/src/lib/libc/../../contrib/tzcode/stdtime -I/usr/src/lib/libc/stdtime -I/usr/src/lib/libc/locale -DBROKEN_DES -DPORTMAP -DDES_BUILTIN -I/usr/src/lib/libc/rpc -DYP -DNS_CACHING -DSYMBOL_VERSIONING -DSYSCALL_COMPAT -std=gnu99 -fstack-protector -Wsystem-headers -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /usr/src/lib/libc/net/sctp_sys_calls.c -o sctp_sys_calls.So /usr/src/lib/libc/net/sctp_sys_calls.c: In function 'sctp_opt_info': /usr/src/lib/libc/net/sctp_sys_calls.c:386:7: error: 'SCTP_MAX_CWND' undeclared (first use in this function) case SCTP_MAX_CWND: ^ Looking to see where usage and definitions might be in /usr/src for -r280598 ... # pwd /usr/src $ find . 
\( -type d -name .svn -prune \) -or \( -type f -exec grep SCTP_MAX_CWND {} \; -print \) | more case SCTP_MAX_CWND: ./lib/libc/net/sctp_sys_calls.c case SCTP_MAX_CWND: case SCTP_MAX_CWND: ./sys/netinet/sctp_usrreq.c #define SCTP_MAX_CWND 0x0032 ./sys/netinet/sctp.h And looking at the list of includes in /head/lib/libc/net/sctp_sys_calls.c for -r279859 shows: #include sys/cdefs.h __FBSDID($FreeBSD$); #include stdio.h #include string.h #include errno.h #include stdlib.h #include unistd.h #include sys/types.h #include sys/socket.h #include sys/errno.h #include sys/syscall.h #include sys/uio.h #include netinet/in.h #include arpa/inet.h #include netinet/sctp_uio.h #include netinet/sctp.h That there was no complaint about sctp.h being missing suggests that a netinet/sctp.h was found but did not contain a SCTP_MAX_CWND definition: so a different one than the above find/grep reported. Using a find to report other sctp.h files shows: # find / \( -type d -name .svn -prune \) -or \( -type f -name sctp.h -print \) | more /usr/src/sys/netinet/sctp.h /usr/include/netinet/sctp.h The diff of those shows the problem if the wrong file is found and used: # diff
SSE in libthr
In a nutshell:

Clang emits SSE instructions on amd64 in the common path of pthread_mutex_unlock. This reduces performance by a non-trivial amount. I'd like to disable SSE in libthr.

In more detail:

In libthr/thread/thr_mutex.c, we find the following:

    #define MUTEX_INIT_LINK(m) do {             \
            (m)->m_qe.tqe_prev = NULL;          \
            (m)->m_qe.tqe_next = NULL;          \
    } while (0)

In 9.1, clang 3.1 emits two ordinary mov instructions:

    movq   $0x0,0x8(%rax)
    movq   $0x0,(%rax)

Since 10.0 and clang 3.3, clang emits these SSE instructions:

    xorps  %xmm0,%xmm0
    movups %xmm0,(%rax)

Although these look harmless enough, using the FPU can reduce performance by incurring extra overhead due to context-switching the FPU state.

As I mentioned, this code is used in the common path of pthread_mutex_unlock. I have a simple test program that creates four threads, all contending for a single mutex, and measures the total number of lock acquisitions over several seconds. When libthr is built with SSE, as is current, I get around 53 million locks in 5 seconds. Without SSE, I get around 60 million (13% more). DTrace shows around 790,000 calls to fpudna versus 10 calls. There could be other factors involved, but I presume that the FPU context switches account for most of the change in performance.

Even when I add some SSE usage in the application--incidentally, these same instructions--building libthr without SSE improves performance from 53.5 million to 55.8 million (4.3%).

In the real-world application where I first noticed this, performance improves by 3-5%.

I would appreciate your thoughts and feedback. The proposed patch is below.

Eric

Index: base/head/lib/libthr/arch/amd64/Makefile.inc
===
--- base/head/lib/libthr/arch/amd64/Makefile.inc (revision 280703)
+++ base/head/lib/libthr/arch/amd64/Makefile.inc (working copy)
@@ -1,3 +1,8 @@
 #$FreeBSD$
 
 SRCS+= _umtx_op_err.S
+
+# Using SSE incurs extra overhead per context switch,
+# which measurably impacts performance when the application
+# does not otherwise use FP/SSE.
+CFLAGS+=-mno-sse

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
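For readers who want to reproduce the measurement, here is a minimal sketch of the kind of test program described above: several threads contend for one mutex for a fixed time, and the total number of acquisitions is returned. It is not Eric's actual program, which was not posted; the names `run_contention_test`, `worker`, and `MAX_THREADS` are made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

#define MAX_THREADS 64          /* arbitrary limit for this sketch */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool stop;
static unsigned long acquisitions;      /* protected by lock */

/* Lock and unlock the single shared mutex until told to stop. */
static void *
worker(void *arg)
{
    (void)arg;
    while (!atomic_load_explicit(&stop, memory_order_relaxed)) {
        pthread_mutex_lock(&lock);
        acquisitions++;
        pthread_mutex_unlock(&lock);
    }
    return (NULL);
}

/*
 * Run nthreads contending workers for roughly `seconds` and return the
 * total number of lock acquisitions they completed.
 */
unsigned long
run_contention_test(int nthreads, double seconds)
{
    pthread_t tids[MAX_THREADS];
    struct timespec ts;
    int i, started = 0;

    assert(nthreads >= 1 && nthreads <= MAX_THREADS && seconds >= 0.0);

    acquisitions = 0;
    atomic_store(&stop, false);
    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], NULL, worker, NULL) != 0)
            break;              /* measure with the threads we got */
        started++;
    }

    ts.tv_sec = (time_t)seconds;
    ts.tv_nsec = (long)((seconds - (double)ts.tv_sec) * 1e9);
    nanosleep(&ts, NULL);       /* a sketch: ignores early wakeups */

    atomic_store(&stop, true);
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return (acquisitions);      /* safe to read: all workers joined */
}
```

Calling `run_contention_test(4, 5.0)` once against a libthr built with SSE and once without would correspond to the comparison described in the message; counting fpudna calls at the same time needs DTrace and is outside this sketch.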
Re: SSE in libthr
Wow. I remember seeing this in the work application - all packet pushing in userland, but there are locks being acquired. I was wondering what exactly was triggering the FPU save/restore code. Now I know. Yes, if there are no other objections, I'd love to see this in -HEAD and stable/10. -adrian
Re: SSE in libthr
On Fri, 27 Mar 2015, Eric van Gyzen wrote:

> In a nutshell:
>
> Clang emits SSE instructions on amd64 in the common path of pthread_mutex_unlock. This reduces performance by a non-trivial amount. I'd like to disable SSE in libthr.

This makes sense to me.

-- 
DE
locking issue between igmp and route code in current?
Hi,

I noticed the following check-ins to route.c in current, and was wondering if they have any bearing on the deadlock documented below:

Revision 274589 (http://svnweb.freebsd.org/base?view=revision&revision=274589)
Modified Sun Nov 16 18:15:23 2014 UTC by melifaro

    Revert r274585: rte lock is properly destroyed in uma dtor callback.
    Pointed by: glebius

Revision 274585 (http://svnweb.freebsd.org/base?view=revision&revision=274585)
Modified Sun Nov 16 14:56:31 2014 UTC by melifaro

    Make witness happy: destroy rte lock before free.
    MFC after: 2 weeks

lock order reversal:
 1st 0xf80003d62190 if_addr_lock (if_addr_lock) @ /u/lars/sandbox/builds/current_10032015/sys/netinet/igmp.c:1714
 2nd 0xf800090d7be0 radix node head (radix node head) @ /u/lars/sandbox/builds/current_10032015/sys/net/route.c:415
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe0043faf3f0
witness_checkorder() at witness_checkorder+0xbe7/frame 0xfe0043faf480
__rw_rlock() at __rw_rlock+0x5a/frame 0xfe0043faf520
rtalloc1_fib() at rtalloc1_fib+0x60/frame 0xfe0043faf5d0
rtalloc_ign_fib() at rtalloc_ign_fib+0x98/frame 0xfe0043faf610
flowtable_lookup_common() at flowtable_lookup_common+0x1e6/frame 0xfe0043faf6f0
flowtable_lookup() at flowtable_lookup+0x10f/frame 0xfe0043faf750
ip_output() at ip_output+0x87/frame 0xfe0043faf840
igmp_intr() at igmp_intr+0x2ed/frame 0xfe0043faf8c0
netisr_dispatch_src() at netisr_dispatch_src+0x61/frame 0xfe0043faf930
igmp_v1v2_queue_report() at igmp_v1v2_queue_report+0x14b/frame 0xfe0043faf980
igmp_fasttimo() at igmp_fasttimo+0x381/frame 0xfe0043fafa30
pffasttimo() at pffasttimo+0x54/frame 0xfe0043fafa60
softclock_call_cc() at softclock_call_cc+0x165/frame 0xfe0043fafb20
softclock() at softclock+0x3d/frame 0xfe0043fafb40
intr_event_execute_handlers() at intr_event_execute_handlers+0xb1/frame 0xfe0043fafb70
ithread_loop() at ithread_loop+0x9c/frame 0xfe0043fafbb0
fork_exit() at fork_exit+0x71/frame 0xfe0043fafbf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe0043fafbf0

This was on a current build as of March 10 - svn 279869

Lars
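To make the "lock order reversal" report easier to read: the kernel's WITNESS code remembers the order in which lock classes have been acquired and complains when a later acquisition contradicts that order. In the trace above, if_addr_lock is already held (1st) when the radix node head lock is taken (2nd). The following is a much simplified user-space sketch of that idea, not the real WITNESS implementation; `order_checker`, `oc_acquire`, and `oc_release` are hypothetical names, and only direct (non-transitive) orderings are tracked.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NCLASSES 8      /* hypothetical number of lock classes */

struct order_checker {
    /* before[a][b] is true once a was held while b was acquired. */
    bool before[NCLASSES][NCLASSES];
    int  held[NCLASSES];        /* classes currently held, in order */
    int  nheld;
};

void
oc_init(struct order_checker *oc)
{
    memset(oc, 0, sizeof(*oc));
}

/*
 * Record that lock class `cls` is being acquired.  Returns true if this
 * acquisition reverses an order that was recorded earlier.
 */
bool
oc_acquire(struct order_checker *oc, int cls)
{
    bool reversal = false;

    assert(cls >= 0 && cls < NCLASSES);
    for (int i = 0; i < oc->nheld; i++) {
        int h = oc->held[i];

        if (h == cls)
            continue;           /* ignore recursion in this sketch */
        if (oc->before[cls][h])
            reversal = true;    /* cls -> h was seen before */
        else
            oc->before[h][cls] = true;
    }
    if (oc->nheld < NCLASSES)
        oc->held[oc->nheld++] = cls;
    return (reversal);
}

/* Record that lock class `cls` has been released. */
void
oc_release(struct order_checker *oc, int cls)
{
    for (int i = oc->nheld - 1; i >= 0; i--) {
        if (oc->held[i] == cls) {
            memmove(&oc->held[i], &oc->held[i + 1],
                (size_t)(oc->nheld - i - 1) * sizeof(oc->held[0]));
            oc->nheld--;
            return;
        }
    }
}
```

If one code path takes class 0 and then class 1, and a later path takes class 1 and then class 0, the second `oc_acquire` on the later path returns true. Whether the reversal reported above can actually deadlock depends on the lock types and paths involved, which this sketch does not model.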
Re: SSE in libthr
On Mar 27, 2015, at 12:26, Eric van Gyzen <vangy...@freebsd.org> wrote:

> In a nutshell:
>
> Clang emits SSE instructions on amd64 in the common path of pthread_mutex_unlock. This reduces performance by a non-trivial amount. I'd like to disable SSE in libthr.
>
> In more detail:
>
> In libthr/thread/thr_mutex.c, we find the following:
>
>     #define MUTEX_INIT_LINK(m) do {             \
>             (m)->m_qe.tqe_prev = NULL;          \
>             (m)->m_qe.tqe_next = NULL;          \
>     } while (0)
>
> In 9.1, clang 3.1 emits two ordinary mov instructions:
>
>     movq   $0x0,0x8(%rax)
>     movq   $0x0,(%rax)
>
> Since 10.0 and clang 3.3, clang emits these SSE instructions:
>
>     xorps  %xmm0,%xmm0
>     movups %xmm0,(%rax)
>
> Although these look harmless enough, using the FPU can reduce performance by incurring extra overhead due to context-switching the FPU state.
>
> As I mentioned, this code is used in the common path of pthread_mutex_unlock. I have a simple test program that creates four threads, all contending for a single mutex, and measures the total number of lock acquisitions over several seconds. When libthr is built with SSE, as is current, I get around 53 million locks in 5 seconds. Without SSE, I get around 60 million (13% more). DTrace shows around 790,000 calls to fpudna versus 10 calls. There could be other factors involved, but I presume that the FPU context switches account for most of the change in performance.
>
> Even when I add some SSE usage in the application--incidentally, these same instructions--building libthr without SSE improves performance from 53.5 million to 55.8 million (4.3%).
>
> In the real-world application where I first noticed this, performance improves by 3-5%.
>
> I would appreciate your thoughts and feedback. The proposed patch is below.
>
> Eric
>
> Index: base/head/lib/libthr/arch/amd64/Makefile.inc
> ===
> --- base/head/lib/libthr/arch/amd64/Makefile.inc (revision 280703)
> +++ base/head/lib/libthr/arch/amd64/Makefile.inc (working copy)
> @@ -1,3 +1,8 @@
>  #$FreeBSD$
>  
>  SRCS+=_umtx_op_err.S
> +
> +# Using SSE incurs extra overhead per context switch,
> +# which measurably impacts performance when the application
> +# does not otherwise use FP/SSE.
> +CFLAGS+=-mno-sse

Good catch! Regarding your patch, I think we should disable even more, if possible. How about:

    CFLAGS+=-mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3

-- 
Rui Paulo
Re: SSE in libthr
On Fri, Mar 27, 2015 at 03:26:17PM -0400, Eric van Gyzen wrote:

> In a nutshell:
>
> Clang emits SSE instructions on amd64 in the common path of pthread_mutex_unlock. This reduces performance by a non-trivial amount. I'd like to disable SSE in libthr.

How about saving and restoring the FPU/SSE state eagerly instead of the current CR0.TS-based lazy method? There is overhead associated with #NM exception handling (fpudna) which is not worth it if FPU/SSE are used often. This would apply to userland threads only; kernel threads normally do not use FPU/SSE and handle the FPU/SSE state manually if they do.

There is performance improvement potential in using SSE for optimizing string functions, for example. Even a simple SSE2 strlen easily outperforms the already optimized lib/libc/string/strlen.c in a microbenchmark, and many other string functions are slow byte-at-a-time implementations.

-- 
Jilles Tjoelker
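As an illustration of what "a simple SSE2 strlen" can look like, here is a minimal sketch using the standard `<emmintrin.h>` intrinsics. It is not the code Jilles benchmarked (that was not posted) and not FreeBSD's libc implementation; `strlen_sketch` is a made-up name. It relies on the GCC/Clang builtin `__builtin_ctz` and falls back to a plain byte loop when SSE2 is not enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

size_t
strlen_sketch(const char *s)
{
    const char *p = s;

    assert(s != NULL);
#if defined(__SSE2__)
    /*
     * Walk bytewise up to a 16-byte boundary.  Aligned 16-byte loads
     * never cross a page boundary, so reading a few bytes past the
     * terminator cannot fault (memory checkers may still object).
     */
    while (((uintptr_t)p & 15) != 0) {
        if (*p == '\0')
            return ((size_t)(p - s));
        p++;
    }

    const __m128i zero = _mm_setzero_si128();
    for (;;) {
        /* Compare 16 bytes against zero at once. */
        __m128i chunk = _mm_load_si128((const __m128i *)p);
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));

        if (mask != 0) {
            /* Lowest set bit marks the first NUL byte. */
            return ((size_t)(p - s) +
                (size_t)__builtin_ctz((unsigned)mask));
        }
        p += 16;
    }
#else
    /* Portable fallback: one byte at a time. */
    while (*p != '\0')
        p++;
    return ((size_t)(p - s));
#endif
}
```

The point relevant to this thread is that any thread calling such a routine touches the XMM registers, so under the lazy CR0.TS scheme it takes the #NM trap after a context switch.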
Re: SSE in libthr
On 03/27/2015 16:49, Rui Paulo wrote:

> Regarding your patch, I think we should disable even more, if possible. How about:
>
>     CFLAGS+=-mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3

Yes, I was considering copying all of the similar flags that we use in the kernel. That seems wise. According to comments in sys/conf/kern.mk, only no-mmx and no-sse would be necessary, as they imply the others.

dim@ raised the possibility of CPUTYPE=foo on i386, so I would also apply this change to i386.

An updated patch is below.

Eric

Index: base/head/lib/libthr/arch/amd64/Makefile.inc
===
--- base/head/lib/libthr/arch/amd64/Makefile.inc (revision 280703)
+++ base/head/lib/libthr/arch/amd64/Makefile.inc (working copy)
@@ -1,3 +1,8 @@
 #$FreeBSD$
 
 SRCS+=_umtx_op_err.S
+
+# Using SSE incurs extra overhead per context switch,
+# which measurably impacts performance when the application
+# does not otherwise use FP/SSE.
+CFLAGS+=-mno-sse -mno-mmx
Index: base/head/lib/libthr/arch/i386/Makefile.inc
===
--- base/head/lib/libthr/arch/i386/Makefile.inc (revision 280703)
+++ base/head/lib/libthr/arch/i386/Makefile.inc (working copy)
@@ -1,3 +1,8 @@
 # $FreeBSD$
 
 SRCS+=_umtx_op_err.S
+
+# Using SSE incurs extra overhead per context switch,
+# which measurably impacts performance when the application
+# does not otherwise use FP/SSE.
+CFLAGS+=-mno-sse -mno-mmx
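One way to check that flags like these actually reach a given source file is to look at the compiler's predefined macros: GCC and Clang define `__SSE__` and `__MMX__` only when the corresponding code generation is enabled. The sketch below is an illustration under that assumption and is not part of the proposed patch; the function names are hypothetical.

```c
#include <assert.h>

/* 1 if this translation unit was compiled with SSE enabled, else 0. */
int
built_with_sse(void)
{
#ifdef __SSE__
    return (1);
#else
    return (0);
#endif
}

/* 1 if this translation unit was compiled with MMX enabled, else 0. */
int
built_with_mmx(void)
{
#ifdef __MMX__
    return (1);
#else
    return (0);
#endif
}

/*
 * Debug-build check for code that is expected to be compiled with
 * -mno-sse -mno-mmx.  Compiled out when NDEBUG is defined.
 */
void
assert_no_simd_codegen(void)
{
    assert(!built_with_sse() && !built_with_mmx());
}
```

Disassembling the resulting object and searching for `%xmm` registers, as in the original report, remains the more direct check.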
Re: SSE in libthr
On Fri, Mar 27, 2015 at 01:49:03PM -0700, Rui Paulo wrote:

> On Mar 27, 2015, at 12:26, Eric van Gyzen <vangy...@freebsd.org> wrote:
>
>> In a nutshell:
>>
>> Clang emits SSE instructions on amd64 in the common path of pthread_mutex_unlock. This reduces performance by a non-trivial amount. I'd like to disable SSE in libthr.
>>
>> In more detail:
>>
>> In libthr/thread/thr_mutex.c, we find the following:
>>
>>     #define MUTEX_INIT_LINK(m) do {             \
>>             (m)->m_qe.tqe_prev = NULL;          \
>>             (m)->m_qe.tqe_next = NULL;          \
>>     } while (0)
>>
>> In 9.1, clang 3.1 emits two ordinary mov instructions:
>>
>>     movq   $0x0,0x8(%rax)
>>     movq   $0x0,(%rax)
>>
>> Since 10.0 and clang 3.3, clang emits these SSE instructions:
>>
>>     xorps  %xmm0,%xmm0
>>     movups %xmm0,(%rax)
>>
>> Although these look harmless enough, using the FPU can reduce performance by incurring extra overhead due to context-switching the FPU state.
>>
>> As I mentioned, this code is used in the common path of pthread_mutex_unlock. I have a simple test program that creates four threads, all contending for a single mutex, and measures the total number of lock acquisitions over several seconds. When libthr is built with SSE, as is current, I get around 53 million locks in 5 seconds. Without SSE, I get around 60 million (13% more). DTrace shows around 790,000 calls to fpudna versus 10 calls. There could be other factors involved, but I presume that the FPU context switches account for most of the change in performance.
>>
>> Even when I add some SSE usage in the application--incidentally, these same instructions--building libthr without SSE improves performance from 53.5 million to 55.8 million (4.3%).
>>
>> In the real-world application where I first noticed this, performance improves by 3-5%.
>>
>> I would appreciate your thoughts and feedback. The proposed patch is below.
>>
>> Eric
>>
>> Index: base/head/lib/libthr/arch/amd64/Makefile.inc
>> ===
>> --- base/head/lib/libthr/arch/amd64/Makefile.inc (revision 280703)
>> +++ base/head/lib/libthr/arch/amd64/Makefile.inc (working copy)
>> @@ -1,3 +1,8 @@
>>  #$FreeBSD$
>>  
>>  SRCS+= _umtx_op_err.S
>> +
>> +# Using SSE incurs extra overhead per context switch,
>> +# which measurably impacts performance when the application
>> +# does not otherwise use FP/SSE.
>> +CFLAGS+=-mno-sse
>
> Good catch! Regarding your patch, I think we should disable even more, if possible. How about:
>
>     CFLAGS+=-mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3

I think so. Also, this should be done for libc as well, both on i386 and amd64. I am not sure, should compiler-rt be included into the set?
Re: SSE in libthr
hi, please don't try to microoptimise crap like strlen(). The TL;DR for performant high-throughput code is: if strlen() or memcpy() is the thing that's costing you the most, you're doing it wrong. -adrian
Re: SSE in libthr
On Fri, Mar 27, 2015 at 4:36 PM, Adrian Chadd <adr...@freebsd.org> wrote:

> hi,
>
> please don't try to microoptimise crap like strlen().
>
> The TL;DR for performant high-throughput code is: if strlen() or memcpy() is the thing that's costing you the most, you're doing it wrong.
>
> -adrian

I respectfully disagree. A well-optimized libc will benefit _every_single_program_ that uses strlen. That includes Apache, Samba, Memcached, Quake, and basically every single program that every single FreeBSD user uses. There's no reason that 3rd party software maintainers should have to rewrite basic libc functions in order to get decent performance on FreeBSD. And the downsides are so small!

In 2015, we should assume by default that most userland software is using SIMD instructions. As Eric noticed, Clang emits them freely. What's the point to lazily saving the SSE registers on context switches if essentially all programs compiled from Ports will be using those registers anyway? I agree with Jilles; I think we should always save the SSE registers for userland programs.

-Alan
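The lazy-versus-eager argument can be put into a toy cost model. The sketch below is only an illustration of the reasoning, under the simplifying assumption that a timeslice that touches FPU/SSE state pays for one #NM trap plus a save and a restore under the lazy scheme, while the eager scheme pays for a save and a restore on every switch. The structure, function names, and model are hypothetical, and no measured costs are implied.

```c
#include <assert.h>

/* Costs in any consistent time unit (for example, CPU cycles). */
struct fpu_costs {
    double save;        /* saving FPU/SSE state on switch-out */
    double restore;     /* restoring FPU/SSE state */
    double trap;        /* extra cost of the #NM trap (fpudna) */
};

/*
 * Lazy scheme: only timeslices that use FPU/SSE (fraction p_use)
 * pay, but they also pay for the trap.
 */
double
lazy_cost_per_switch(const struct fpu_costs *c, double p_use)
{
    assert(p_use >= 0.0 && p_use <= 1.0);
    return (p_use * (c->save + c->restore + c->trap));
}

/* Eager scheme: every switch pays for a save and a restore. */
double
eager_cost_per_switch(const struct fpu_costs *c)
{
    return (c->save + c->restore);
}

/*
 * Fraction of FPU/SSE-using timeslices above which the eager scheme
 * is cheaper in this model.
 */
double
break_even_fraction(const struct fpu_costs *c)
{
    double total = c->save + c->restore + c->trap;

    assert(c->save >= 0.0 && c->restore >= 0.0 && c->trap >= 0.0);
    assert(total > 0.0);
    return ((c->save + c->restore) / total);
}
```

In this model, the more expensive the trap is relative to the save and restore, the lower the break-even fraction, which is the shape of the argument made above: if nearly every thread ends up using SSE, the lazy scheme mostly adds trap overhead.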
locking issue between igmp and route code in current?
Hi,

I realized that I hadn’t copied the other half of the locking issue mentioned earlier.

Lars

Mon Mar 23 12:42:15 CDT 2015

lock order reversal:
 1st 0xf80003d62190 if_addr_lock (if_addr_lock) @ /u/lars/sandbox/builds/current_10032015/sys/netinet/igmp.c:1714
 2nd 0x80e387b0 ifnet_rw (ifnet_rw) @ /u/lars/sandbox/builds/current_10032015/sys/net/if.c:243
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe0043faf6f0
witness_checkorder() at witness_checkorder+0xbe7/frame 0xfe0043faf780
__rw_rlock() at __rw_rlock+0x5a/frame 0xfe0043faf820
ifnet_byindex() at ifnet_byindex+0x22/frame 0xfe0043faf840
igmp_intr() at igmp_intr+0x1d/frame 0xfe0043faf8c0
netisr_dispatch_src() at netisr_dispatch_src+0x61/frame 0xfe0043faf930
igmp_v1v2_queue_report() at igmp_v1v2_queue_report+0x14b/frame 0xfe0043faf980
igmp_fasttimo() at igmp_fasttimo+0x381/frame 0xfe0043fafa30
pffasttimo() at pffasttimo+0x54/frame 0xfe0043fafa60
softclock_call_cc() at softclock_call_cc+0x165/frame 0xfe0043fafb20
softclock() at softclock+0x3d/frame 0xfe0043fafb40
intr_event_execute_handlers() at intr_event_execute_handlers+0xb1/frame 0xfe0043fafb70
ithread_loop() at ithread_loop+0x9c/frame 0xfe0043fafbb0
fork_exit() at fork_exit+0x71/frame 0xfe0043fafbf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe0043fafbf0
--- trap 0, rip = 0, rsp = 0xfe0043fafcb0, rbp = 0 ---

lock order reversal:
 1st 0xf80003d62190 if_addr_lock (if_addr_lock) @ /u/lars/sandbox/builds/current_10032015/sys/netinet/igmp.c:1714
 2nd 0xf800090d7be0 radix node head (radix node head) @ /u/lars/sandbox/builds/current_10032015/sys/net/route.c:415
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe0043faf3f0
witness_checkorder() at witness_checkorder+0xbe7/frame 0xfe0043faf480
__rw_rlock() at __rw_rlock+0x5a/frame 0xfe0043faf520
rtalloc1_fib() at rtalloc1_fib+0x60/frame 0xfe0043faf5d0
rtalloc_ign_fib() at rtalloc_ign_fib+0x98/frame 0xfe0043faf610
flowtable_lookup_common() at flowtable_lookup_common+0x1e6/frame 0xfe0043faf6f0
flowtable_lookup() at flowtable_lookup+0x10f/frame 0xfe0043faf750
ip_output() at ip_output+0x87/frame 0xfe0043faf840
igmp_intr() at igmp_intr+0x2ed/frame 0xfe0043faf8c0
netisr_dispatch_src() at netisr_dispatch_src+0x61/frame 0xfe0043faf930
igmp_v1v2_queue_report() at igmp_v1v2_queue_report+0x14b/frame 0xfe0043faf980
igmp_fasttimo() at igmp_fasttimo+0x381/frame 0xfe0043fafa30
pffasttimo() at pffasttimo+0x54/frame 0xfe0043fafa60
softclock_call_cc() at softclock_call_cc+0x165/frame 0xfe0043fafb20
softclock() at softclock+0x3d/frame 0xfe0043fafb40
intr_event_execute_handlers() at intr_event_execute_handlers+0xb1/frame 0xfe0043fafb70
ithread_loop() at ithread_loop+0x9c/frame 0xfe0043fafbb0
fork_exit() at fork_exit+0x71/frame 0xfe0043fafbf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe0043fafbf0
--- trap 0, rip = 0, rsp = 0xfe0043fafcb0, rbp = 0 ---

panic: deadlkres: possible deadlock detected for 0xf8018245d000, blocked for 1802208 ticks
cpuid = 16
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00cac40a80
vpanic() at vpanic+0x187/frame 0xfe00cac40b00
panic() at panic+0x43/frame 0xfe00cac40b60
deadlkres() at deadlkres+0x2fc/frame 0xfe00cac40bb0
fork_exit() at fork_exit+0x71/frame 0xfe00cac40bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe00cac40bf0
--- trap 0, rip = 0, rsp = 0xfe00cac40cb0, rbp = 0 ---
KDB: enter: panic
Re: SSE in libthr
On 27 March 2015 at 16:03, Alan Somers <asom...@freebsd.org> wrote:

> On Fri, Mar 27, 2015 at 4:36 PM, Adrian Chadd <adr...@freebsd.org> wrote:
>
>> hi,
>>
>> please don't try to microoptimise crap like strlen().
>>
>> The TL;DR for performant high-throughput code is: if strlen() or memcpy() is the thing that's costing you the most, you're doing it wrong.
>>
>> -adrian
>
> I respectfully disagree. A well-optimized libc will benefit _every_single_program_ that uses strlen. That includes Apache, Samba, Memcached, Quake, and basically every single program that every single FreeBSD user uses. There's no reason that 3rd party software maintainers should have to rewrite basic libc functions in order to get decent performance on FreeBSD. And the downsides are so small!
>
> In 2015, we should assume by default that most userland software is using SIMD instructions. As Eric noticed, Clang emits them freely. What's the point to lazily saving the SSE registers on context switches if essentially all programs compiled from Ports will be using those registers anyway? I agree with Jilles; I think we should always save the SSE registers for userland programs.

That's fine, but those benchmarks and improvements also have to take into account the environment that these programs are running in, and all of the other things that are going on with it.

Fixing strlen() to use SSE2 is great, but if the gains are offset by fpu save/restore when doing fine grain locking that's blocking under real world workloads, what's the benefit? What about if the system is context switching over a million times a second? These are real life things I see servers running all of the above software /do/.

One only knows with benchmarking, not microbenchmarking. Microbenchmarks are great. They serve a purpose, which is to show how the heck the current silicon I'm running on runs some code that I've cleverly crafted to hopefully run well.

I'm totally for saving/restoring SSE registers for userland programs. But that's not where that kind of make-stuff-fast work should stop. If it does, and that's where your benchmarking for the real world stops, then you're doing it wrong.

Everything is a toss-up. For this userland based netmap packet pushing app, SSE may be nice for some instructions, but know what else screws things? The fact that the default scheduler policy is terrible and crap gets scheduled /everywhere/ under any appreciable amount of load. That the context switch rate is high, the interrupt rate is also high, and with a little locking going on, I see fpu save/restore occur for a non-insignificant fraction of CPU. Optimising strlen() or memcpy() is great, but when my system context switches a million times a second, we're never going to reach the steady state that these CPUs can really crank out real work at under those conditions.

So, cool. Please keep poking at that stuff. But if you stop short of making the system actually /be able to take advantage of them under load/, I respectfully ask for a nice knob I can use to turn them off. :)

-adrian

(Know where the slowdowns for memcached are? Hint - not strlen or memcpy. Yes, I've been down that rabbit hole recently. Know what /I/ have? 1 million UDP transactions a second working on 16 core sandybridge systems. Know what I didn't optimise? memcpy or strlen. The network stack locking and pthreads overhead is what sucks.)
Re: SSE in libthr
Possibly related information.

Recently, I tried to build world/kernel (head, r280410, amd64) with a CPUTYPE setting in make.conf. The real CPU is sandybridge (corei7-avx).

Running in a VirtualBox VM, installworld fails with CPUTYPE?=corei7-avx, while with CPUTYPE?=corei7 everything goes OK.

 *Rebooting after installkernel and etcupdate -p goes OK, but rebooting after the failed installworld causes even /bin/sh to fail to start (the kernel starts OK).

Yes, it would be a problem (or limitation) of VirtualBox and NOT of FreeBSD, as a memstick image built from /usr/obj with CPUTYPE?=corei7-avx runs OK on real hardware.

This should mean some AVX instructions are generated by clang 3.6.0 for userland, and VirtualBox doesn't like them.

On Fri, 27 Mar 2015 15:26:17 -0400 Eric van Gyzen <vangy...@freebsd.org> wrote:

> In a nutshell:
>
> Clang emits SSE instructions on amd64 in the common path of pthread_mutex_unlock. This reduces performance by a non-trivial amount. I'd like to disable SSE in libthr.
>
> In more detail:
>
> In libthr/thread/thr_mutex.c, we find the following:
>
>     #define MUTEX_INIT_LINK(m) do {             \
>             (m)->m_qe.tqe_prev = NULL;          \
>             (m)->m_qe.tqe_next = NULL;          \
>     } while (0)
>
> In 9.1, clang 3.1 emits two ordinary mov instructions:
>
>     movq   $0x0,0x8(%rax)
>     movq   $0x0,(%rax)
>
> Since 10.0 and clang 3.3, clang emits these SSE instructions:
>
>     xorps  %xmm0,%xmm0
>     movups %xmm0,(%rax)
>
> Although these look harmless enough, using the FPU can reduce performance by incurring extra overhead due to context-switching the FPU state.
>
> As I mentioned, this code is used in the common path of pthread_mutex_unlock. I have a simple test program that creates four threads, all contending for a single mutex, and measures the total number of lock acquisitions over several seconds. When libthr is built with SSE, as is current, I get around 53 million locks in 5 seconds. Without SSE, I get around 60 million (13% more). DTrace shows around 790,000 calls to fpudna versus 10 calls. There could be other factors involved, but I presume that the FPU context switches account for most of the change in performance.
>
> Even when I add some SSE usage in the application--incidentally, these same instructions--building libthr without SSE improves performance from 53.5 million to 55.8 million (4.3%).
>
> In the real-world application where I first noticed this, performance improves by 3-5%.
>
> I would appreciate your thoughts and feedback. The proposed patch is below.
>
> Eric
>
> Index: base/head/lib/libthr/arch/amd64/Makefile.inc
> ===
> --- base/head/lib/libthr/arch/amd64/Makefile.inc (revision 280703)
> +++ base/head/lib/libthr/arch/amd64/Makefile.inc (working copy)
> @@ -1,3 +1,8 @@
>  #$FreeBSD$
>  
>  SRCS+= _umtx_op_err.S
> +
> +# Using SSE incurs extra overhead per context switch,
> +# which measurably impacts performance when the application
> +# does not otherwise use FP/SSE.
> +CFLAGS+=-mno-sse

-- 
Tomoaki AOKI <junch...@dec.sakura.ne.jp>