Re: stable/11 -r307797 on BPi-M3 (cortex-a7): truss gets segmentation fault for handling unknown system call
[I finally re-established the crochet-build based failure context.
Unfortunately truss just dies in a new place.]

On 2016-Oct-28, at 7:29 AM, John Baldwin wrote:

> On Tuesday, October 25, 2016 11:40:38 AM Mark Millard wrote:
>> [The following has been reported in:
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213778 .]
>>
>> In trying to build lang/gcc6 xgcc's cc1 got some SIGSYS examples. In trying
>> to track things down I ran into truss getting a SIGSEGV when it tries to
>> handle the situation. . .
>>
>> In truss's enter_syscall there is (from a live gdb on truss, after the
>> segmentation fault):
>>
>> 380         t->cs.name = sysdecode_syscallname(t->proc->abi->abi,
>>                 t->cs.number);
>> 381         if (t->cs.name == NULL)
>> (gdb)
>> 382                 fprintf(info->outfile, "-- UNKNOWN %s SYSCALL %d --\n",
>> 383                     t->proc->abi->type, t->cs.number);
>> 384
>> 385         sc = get_syscall(t->cs.name, narg);
>> 386         t->cs.nargs = sc->nargs;
>> 387         assert(sc->nargs <= nitems(t->cs.s_args));
>> 388
>> 389         t->cs.sc = sc;
>>
>> (gdb) print *t
>> $2 = {entries = {le_next = 0x0, le_prev = 0x20617070}, proc = 0x20617060,
>>   tid = 100150, in_syscall = 1, cs = {sc = 0x0, name = 0x0,
>>   number = 580828064, args = 0x2061b0c0, nargs = 0, s_args = 0x2061b0ec},
>>   before = {tv_sec = 1477418265, tv_nsec = 492342263},
>>   after = {tv_sec = 1477418265, tv_nsec = 492496630}}
>>
>> (gdb) print sc
>> $3 = (struct syscall *) 0x0
>>
>> So line 386 listed above gets a segmentation fault for sc->nargs when
>> t->cs.name is a NULL pointer: sc ends up NULL.
>>
>> Looking at the two things that the fprintf on lines 382 and 383 would report:
>>
>> (gdb) print t->proc->abi->type
>> $4 = 0x10166 "FreeBSD ELF32"
>>
>> (gdb) print t->cs.number
>> $5 = 580828064
>>
>> (gdb) print narg
>> $6 = 0
>>
>> (that last is for context for the get_syscall arguments).
>>
>> FYI: 580828064 = 0x229EBBA0
>
> I have a patchset I have tested some in a git branch that I believe fixes
> handling of unknown system calls. Please try this:
>
> https://github.com/freebsd/freebsd/compare/master...bsdjhb:truss_unknown
>
> (Add .diff to get a diff you can apply with patch)
>
> --
> John Baldwin

[Watch out for inlining consequences in how gdb presents things. Also, I
extracted from my explorations and changed the presentation order to
eliminate junk.]

Summary:

st->syscalls ends up NULL because reallocf refuses a huge allocation:
t->cs.number == 580828064, which would make for a huge offset in
st->syscalls[number]. new_count * sizeof(st->syscalls[0]) would be rather
large (new_count == number + 1).

reallocf's result needs to be tested, and/or reasonable-value checks on
t->cs.number (a.k.a. number) need to be made, with unreasonable values
handled some other way.

The supporting details:

root@bananapi-m3:/usr/obj/portswork/usr/ports/lang/gcc6/work/.build/armv6-portbld-freebsd11.0/libgcc # gdb truss
GNU gdb 6.1.1 [FreeBSD]
. . .
(gdb) run -faeH -o truss.log /usr/obj/portswork/usr/ports/lang/gcc6/work/.build/./gcc/xgcc -B/usr/obj/portswork/usr/ports/lang/gcc6/work/.build/./gcc/ -B/usr/local/armv6-portbld-freebsd11.0/bin/ -B/usr/local/armv6-portbld-freebsd11.0/lib/ -isystem /usr/local/armv6-portbld-freebsd11.0/include -isystem /usr/local/armv6-portbld-freebsd11.0/sys-include -O2 -pipe -mcpu=cortex-a7 -DLIBICONV_PLUG -g -fno-strict-aliasing -O2 -O2 -pipe -mcpu=cortex-a7 -DLIBICONV_PLUG -g -fno-strict-aliasing -DIN_GCC -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -fPIC -pthread -fno-inline -fomit-frame-pointer -g -DIN_LIBGCC2 -fbuilding-libgcc -fno-stack-protector -fPIC -pthread -fno-inline -fomit-frame-pointer -I. -I. -I../.././gcc -I/usr/obj/portswork/usr/ports/lang/gcc6/work/gcc-6.2.0/libgcc -I/usr/obj/portswork/usr/ports/lang/gcc6/work/gcc-6.2.0/libgcc/. -I/usr/obj/portswork/usr/ports/lang/gcc6/work/gcc-6.2.0/libgcc/../gcc -I/usr/obj/portswork/usr/ports/lang/gcc6/work/gcc-6.2.0/libgcc/../include -DHAVE_CC_TLS -o _muldi3.o -MT _muldi3.o -MD -MP -MF _muldi3.dep -DL_muldi3 -c /usr/obj/portswork/usr/ports/lang/gcc6/work/gcc-6.2.0/libgcc/libgcc2.c -fvisibility=hidden -DHIDE_EXPORTS
Starting program: /usr/bin/truss -faeH -o truss.log . . . .

Program received signal SIGSEGV, Segmentation fault.
0x20241ebc in memset () from /lib/libc.so.7
Current language: auto; currently minimal
(gdb) bt
#0 0x20241ebc in memset () from /lib/libc.so.7
#1 0xaec8 in get_syscall (t=, number=580828064, nargs=0) at /usr/src/usr.bin/truss/syscalls.c:956
#2 0xab8c in enter_syscall (info=0x20612000, t=0x2061b0a0, pl=) at /usr/src/usr.bin/truss/setup.c:380
#3 0xa798 in eventloop (info=) at /usr/src/usr.bin/truss/setup.c:664
#4 0x98d4 in $a.6 () at /usr/src/usr.bin/truss/main.c:207
#5 0x98d4 in $a.6 () at /usr/src/usr.bin/truss/main.c:
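The two remedies named in the summary (range-check the system call number, and test the result of the reallocation) might be combined roughly as in the sketch below. This is not the truss code and not the patch under test: struct syscall_cache, cache_reserve and the MAX_SYSCALL_NUMBER limit are hypothetical names chosen for illustration, and plain realloc plus free stands in for the BSD-specific reallocf so the sketch also compiles elsewhere.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sanity limit; the real bound would depend on the ABI. */
#define MAX_SYSCALL_NUMBER 4096u

/* Hypothetical stand-in for a per-ABI table indexed by syscall number. */
struct syscall_cache {
	void	**entries;	/* one slot per system call number */
	size_t	  count;	/* number of slots currently allocated */
};

/*
 * Make sure entries[number] is a valid slot.
 * Returns 0 on success, -1 if the number is implausible or the
 * allocation fails.  On allocation failure the old array is released,
 * which mirrors what reallocf does, and the cache is reset to empty.
 */
int
cache_reserve(struct syscall_cache *c, unsigned int number)
{
	size_t new_count, i;
	void **p;

	/* Reject values such as 580828064 before sizing any allocation. */
	if (number > MAX_SYSCALL_NUMBER)
		return (-1);
	if (number < c->count)
		return (0);

	new_count = (size_t)number + 1;
	p = realloc(c->entries, new_count * sizeof(c->entries[0]));
	if (p == NULL) {
		free(c->entries);
		c->entries = NULL;
		c->count = 0;
		return (-1);
	}

	/* Clear only the newly added slots. */
	for (i = c->count; i < new_count; i++)
		p[i] = NULL;
	c->entries = p;
	c->count = new_count;
	return (0);
}
```

A caller would check the return value and fall back to "unknown system call" reporting instead of indexing the table when cache_reserve returns -1.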
Re: stable/11 -r307797 on BPi-M3 (cortex-a7): truss gets segmentation fault for handling unknown system call
On 2016-Oct-28, at 4:02 PM, Mark Millard wrote:

> On 2016-Oct-28, at 7:29 AM, John Baldwin wrote:
>
>> On Tuesday, October 25, 2016 11:40:38 AM Mark Millard wrote:
>>> [The following has been reported in:
>>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213778 .]
>>>
>>> In trying to build lang/gcc6 xgcc's cc1 got some SIGSYS examples. In trying
>>> to track things down I ran into truss getting a SIGSEGV when it tries to
>>> handle the situation. . .
>>>
>>> In truss's enter_syscall there is (from a live gdb on truss, after the
>>> segmentation fault):
>>>
>>> 380         t->cs.name = sysdecode_syscallname(t->proc->abi->abi,
>>>                 t->cs.number);
>>> 381         if (t->cs.name == NULL)
>>> (gdb)
>>> 382                 fprintf(info->outfile, "-- UNKNOWN %s SYSCALL %d --\n",
>>> 383                     t->proc->abi->type, t->cs.number);
>>> 384
>>> 385         sc = get_syscall(t->cs.name, narg);
>>> 386         t->cs.nargs = sc->nargs;
>>> 387         assert(sc->nargs <= nitems(t->cs.s_args));
>>> 388
>>> 389         t->cs.sc = sc;
>>>
>>> (gdb) print *t
>>> $2 = {entries = {le_next = 0x0, le_prev = 0x20617070}, proc = 0x20617060,
>>>   tid = 100150, in_syscall = 1, cs = {sc = 0x0, name = 0x0,
>>>   number = 580828064, args = 0x2061b0c0, nargs = 0, s_args = 0x2061b0ec},
>>>   before = {tv_sec = 1477418265, tv_nsec = 492342263},
>>>   after = {tv_sec = 1477418265, tv_nsec = 492496630}}
>>>
>>> (gdb) print sc
>>> $3 = (struct syscall *) 0x0
>>>
>>> So line 386 listed above gets a segmentation fault for sc->nargs when
>>> t->cs.name is a NULL pointer: sc ends up NULL.
>>>
>>> Looking at the two things that the fprintf on lines 382 and 383 would
>>> report:
>>>
>>> (gdb) print t->proc->abi->type
>>> $4 = 0x10166 "FreeBSD ELF32"
>>>
>>> (gdb) print t->cs.number
>>> $5 = 580828064
>>>
>>> (gdb) print narg
>>> $6 = 0
>>>
>>> (that last is for context for the get_syscall arguments).
>>>
>>> FYI: 580828064 = 0x229EBBA0
>>
>> I have a patchset I have tested some in a git branch that I believe fixes
>> handling of unknown system calls. Please try this:
>>
>> https://github.com/freebsd/freebsd/compare/master...bsdjhb:truss_unknown
>>
>> (Add .diff to get a diff you can apply with patch)
>>
>> --
>> John Baldwin
>
> Sorry it took so long to try the build. . .
>
> I got a compile failure for use of bool in my stable/11 context for the
> BPI-M3 build that the truss problem was discovered with (quoting the build
> log below):
>
>> --- main.o ---
>> cc -target armv6-gnueabihf-freebsd11.0
>> --sysroot=/usr/local/src/crochet/work/obj/arm.armv6/usr/src/tmp
>> -B/usr/local/src/crochet/work/obj/arm.armv6/usr/src/tmp/usr/bin -O -pipe
>> -I/usr/src/usr.bin/truss -I. -I/usr/src/usr.bin/truss/../../sys -g -MD
>> -MF.depend.main.o -MTmain.o -std=gnu99 -Wsystem-headers -Wall
>> -Wno-format-y2k -W
>> -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes
>> -Wpointer-arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow
>> -Wunused-parameter -Wcast-align -Wchar-subscripts -Winline -Wnested-externs
>> -Wredundant-decls -Wold-style-definition -Wno-pointer-sign
>> -Wmissing-variable-declarations -Wthread-safety -Wno-empty-body
>> -Wno-string-plus-int -Wno-unused-const-variable -Qunused-arguments -c
>> /usr/src/usr.bin/truss/main.c -o main.o
>> In file included from /usr/src/usr.bin/truss/main.c:53:
>> /usr/src/usr.bin/truss/syscall.h:75:2: error: unknown type name 'bool'
>>         bool unknown; /* Uknown system call */
>>         ^
>> 1 error generated.
>> *** [main.o] Error code 1
>>
>> make[4]: stopped in /usr/src/usr.bin/truss
>> 1 error
>
> In C99 bool is a macro from <stdbool.h> and _Bool is the C99 type itself. So
> apparently <stdbool.h> (or an equivalent) was not directly or indirectly
> included. (The macros true and false and __bool_true_false_are_defined are
> also from <stdbool.h>.)
>
> Which way do you want the C99 typing to be handled for this: native C99 with
> no <stdbool.h> required? Use <stdbool.h>?
>
> Side note:
>
> I'll see about getting my normal stable/11 build environment going for the
> BPI-M3 instead of using the crochet from my first-time build for the target.
>
> ===
> Mark Millard
> markmi at dsl-only.net

[Once I got back to this test yet again I arbitrarily added a
#include <stdbool.h> to allow truss to build during buildworld.]

The way I normally build (instead of crochet) did not get the original cc1
problem in its original form. So as of yet I've not managed to reproduce the
test case accurately: back to crochet.

I will note that in my "with debug symbols" and -mcpu=cortex-a7 style of
buildworld buildkernel type of boot context I got an explicit message in the
xgcc/cc1 test that may indicate the original problem for cc1:

/usr/obj/portswork/usr/ports/lang/gcc6/work/.build/./gcc/cc1: Undefined symbol "__aeabi_uidiv"

[bugzilla https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213785 ]

At this stage in trying to bootstrap lang/gcc6 as the first gcc compiler on the
Re: stable/11 -r307797 on BPi-M3 (cortex-a7): truss gets segmentation fault for handling unknown system call
On 2016-Oct-28, at 7:29 AM, John Baldwin wrote:

> On Tuesday, October 25, 2016 11:40:38 AM Mark Millard wrote:
>> [The following has been reported in:
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213778 .]
>>
>> In trying to build lang/gcc6 xgcc's cc1 got some SIGSYS examples. In trying
>> to track things down I ran into truss getting a SIGSEGV when it tries to
>> handle the situation. . .
>>
>> In truss's enter_syscall there is (from a live gdb on truss, after the
>> segmentation fault):
>>
>> 380         t->cs.name = sysdecode_syscallname(t->proc->abi->abi,
>>                 t->cs.number);
>> 381         if (t->cs.name == NULL)
>> (gdb)
>> 382                 fprintf(info->outfile, "-- UNKNOWN %s SYSCALL %d --\n",
>> 383                     t->proc->abi->type, t->cs.number);
>> 384
>> 385         sc = get_syscall(t->cs.name, narg);
>> 386         t->cs.nargs = sc->nargs;
>> 387         assert(sc->nargs <= nitems(t->cs.s_args));
>> 388
>> 389         t->cs.sc = sc;
>>
>> (gdb) print *t
>> $2 = {entries = {le_next = 0x0, le_prev = 0x20617070}, proc = 0x20617060,
>>   tid = 100150, in_syscall = 1, cs = {sc = 0x0, name = 0x0,
>>   number = 580828064, args = 0x2061b0c0, nargs = 0, s_args = 0x2061b0ec},
>>   before = {tv_sec = 1477418265, tv_nsec = 492342263},
>>   after = {tv_sec = 1477418265, tv_nsec = 492496630}}
>>
>> (gdb) print sc
>> $3 = (struct syscall *) 0x0
>>
>> So line 386 listed above gets a segmentation fault for sc->nargs when
>> t->cs.name is a NULL pointer: sc ends up NULL.
>>
>> Looking at the two things that the fprintf on lines 382 and 383 would report:
>>
>> (gdb) print t->proc->abi->type
>> $4 = 0x10166 "FreeBSD ELF32"
>>
>> (gdb) print t->cs.number
>> $5 = 580828064
>>
>> (gdb) print narg
>> $6 = 0
>>
>> (that last is for context for the get_syscall arguments).
>>
>> FYI: 580828064 = 0x229EBBA0
>
> I have a patchset I have tested some in a git branch that I believe fixes
> handling of unknown system calls. Please try this:
>
> https://github.com/freebsd/freebsd/compare/master...bsdjhb:truss_unknown
>
> (Add .diff to get a diff you can apply with patch)
>
> --
> John Baldwin

Sorry it took so long to try the build. . .

I got a compile failure for use of bool in my stable/11 context for the
BPI-M3 build that the truss problem was discovered with (quoting the build
log below):

> --- main.o ---
> cc -target armv6-gnueabihf-freebsd11.0
> --sysroot=/usr/local/src/crochet/work/obj/arm.armv6/usr/src/tmp
> -B/usr/local/src/crochet/work/obj/arm.armv6/usr/src/tmp/usr/bin -O -pipe
> -I/usr/src/usr.bin/truss -I. -I/usr/src/usr.bin/truss/../../sys -g -MD
> -MF.depend.main.o -MTmain.o -std=gnu99 -Wsystem-headers -Wall
> -Wno-format-y2k -W
> -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes
> -Wpointer-arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow
> -Wunused-parameter -Wcast-align -Wchar-subscripts -Winline -Wnested-externs
> -Wredundant-decls -Wold-style-definition -Wno-pointer-sign
> -Wmissing-variable-declarations -Wthread-safety -Wno-empty-body
> -Wno-string-plus-int -Wno-unused-const-variable -Qunused-arguments -c
> /usr/src/usr.bin/truss/main.c -o main.o
> In file included from /usr/src/usr.bin/truss/main.c:53:
> /usr/src/usr.bin/truss/syscall.h:75:2: error: unknown type name 'bool'
>         bool unknown; /* Uknown system call */
>         ^
> 1 error generated.
> *** [main.o] Error code 1
>
> make[4]: stopped in /usr/src/usr.bin/truss
> 1 error

In C99 bool is a macro from <stdbool.h> and _Bool is the C99 type itself. So
apparently <stdbool.h> (or an equivalent) was not directly or indirectly
included. (The macros true and false and __bool_true_false_are_defined are
also from <stdbool.h>.)

Which way do you want the C99 typing to be handled for this: native C99 with
no <stdbool.h> required? Use <stdbool.h>?

Side note:

I'll see about getting my normal stable/11 build environment going for the
BPI-M3 instead of using the crochet from my first-time build for the target.

===
Mark Millard
markmi at dsl-only.net

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
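For reference, the two options in the question above can be shown side by side in a small sketch. Only the field name unknown comes from the quoted build error; struct syscall_info, the decoded field and count_unknown are hypothetical names made up for this illustration, not the layout of truss's syscall.h.

```c
#include <stddef.h>
#include <stdbool.h>	/* supplies the macros bool, true and false */

/* Hypothetical record; not the actual struct from truss's syscall.h. */
struct syscall_info {
	const char	*name;
	bool		 unknown;	/* compiles only because <stdbool.h> is included */
	_Bool		 decoded;	/* C99 keyword type: no header needed */
};

/* Count the entries flagged as unknown system calls. */
size_t
count_unknown(const struct syscall_info *v, size_t n)
{
	size_t i, total;

	total = 0;
	for (i = 0; i < n; i++) {
		if (v[i].unknown)
			total++;
	}
	return (total);
}
```

Removing the #include <stdbool.h> line under -std=gnu99 should reproduce the "unknown type name 'bool'" diagnostic for the unknown field, while the _Bool field would still compile.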
Re: stable/11 -r307797 on BPi-M3 (cortex-a7): truss gets segmentation fault for handling unknown system call
On Tuesday, October 25, 2016 11:40:38 AM Mark Millard wrote:
> [The following has been reported in:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213778 .]
>
> In trying to build lang/gcc6 xgcc's cc1 got some SIGSYS examples. In trying
> to track things down I ran into truss getting a SIGSEGV when it tries to
> handle the situation. . .
>
> In truss's enter_syscall there is (from a live gdb on truss, after the
> segmentation fault):
>
> 380         t->cs.name = sysdecode_syscallname(t->proc->abi->abi,
>                 t->cs.number);
> 381         if (t->cs.name == NULL)
> (gdb)
> 382                 fprintf(info->outfile, "-- UNKNOWN %s SYSCALL %d --\n",
> 383                     t->proc->abi->type, t->cs.number);
> 384
> 385         sc = get_syscall(t->cs.name, narg);
> 386         t->cs.nargs = sc->nargs;
> 387         assert(sc->nargs <= nitems(t->cs.s_args));
> 388
> 389         t->cs.sc = sc;
>
> (gdb) print *t
> $2 = {entries = {le_next = 0x0, le_prev = 0x20617070}, proc = 0x20617060,
>   tid = 100150, in_syscall = 1, cs = {sc = 0x0, name = 0x0,
>   number = 580828064, args = 0x2061b0c0, nargs = 0, s_args = 0x2061b0ec},
>   before = {tv_sec = 1477418265, tv_nsec = 492342263},
>   after = {tv_sec = 1477418265, tv_nsec = 492496630}}
>
> (gdb) print sc
> $3 = (struct syscall *) 0x0
>
> So line 386 listed above gets a segmentation fault for sc->nargs when
> t->cs.name is a NULL pointer: sc ends up NULL.
>
> Looking at the two things that the fprintf on lines 382 and 383 would report:
>
> (gdb) print t->proc->abi->type
> $4 = 0x10166 "FreeBSD ELF32"
>
> (gdb) print t->cs.number
> $5 = 580828064
>
> (gdb) print narg
> $6 = 0
>
> (that last is for context for the get_syscall arguments).
>
> FYI: 580828064 = 0x229EBBA0

I have a patchset I have tested some in a git branch that I believe fixes
handling of unknown system calls. Please try this:

https://github.com/freebsd/freebsd/compare/master...bsdjhb:truss_unknown

(Add .diff to get a diff you can apply with patch)

--
John Baldwin
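As a rough illustration of the failure mode in the quoted report (a lookup that yields NULL for an unrecognized call, followed by an unconditional sc->nargs), a guarded version could look like the sketch below. It is not the code in the truss_unknown branch: struct syscall_desc, the three-entry table, lookup_syscall and syscall_nargs are all hypothetical and exist only to show the NULL check.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor; truss's real struct syscall has more fields. */
struct syscall_desc {
	const char	*name;
	unsigned int	 nargs;
};

/* Tiny illustrative table, not the real decode table. */
static const struct syscall_desc known[] = {
	{ "read", 3 },
	{ "write", 3 },
	{ "close", 1 },
};

/* Return the descriptor for name, or NULL if name is NULL or not known. */
const struct syscall_desc *
lookup_syscall(const char *name)
{
	size_t i;

	if (name == NULL)
		return (NULL);
	for (i = 0; i < sizeof(known) / sizeof(known[0]); i++) {
		if (strcmp(known[i].name, name) == 0)
			return (&known[i]);
	}
	return (NULL);
}

/*
 * Argument count for name, or fallback when the call is unknown.
 * The NULL test is the step missing before line 386 in the listing.
 */
unsigned int
syscall_nargs(const char *name, unsigned int fallback)
{
	const struct syscall_desc *sc;

	sc = lookup_syscall(name);
	return (sc != NULL ? sc->nargs : fallback);
}
```

With this shape, an unknown number such as 580828064 (for which the name lookup gave NULL) results in the fallback count rather than a dereference of a NULL descriptor.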