So... I've expanded the definition of plusXX and put an audittstack call between each pair of lines. In other words, basically:

I plusXX(I n, I m, X* x, X* y, X* z, JJ jt){
  X u; X v;
  if (n-1==0) {
    I i= (I)(m)-1;
    for (; i>=0; --i){
      u= (*x++);
      v= (*y++);
      *z++= jtxplus(jt,(u),(v));
    }
  } else if (n-1<0){
    I i= (I)(m)-1;
    for (; i>=0; --i){
      u= (*x++);
      {
        I i= -2-(I)(n);
        for (; i>=0; --i){
          v= (*y++);
          *z++= jtxplus(jt,(u),(v)); // this update of *z is "the" problem
        }
      }
    }
  } else {
    I i= (I)(m)-1;
    for (; i>=0; --i){
      v= (*y++);
      {
        I i= (I)(n)-1;
        for (; i>=0; --i){
          u= (*x++);
          *z++= jtxplus(jt,(u),(v));
        }
      }
    }
  };
  I rc= jt->jerr;
  jt->jerr= 0;
  return rc?rc:256;
}

Unfortunately, I'm a bit lost here - I do not know what I am looking at.

The line in jtva2 which called plusXX looks like this:

#3 0x00007ffff29e7bec in jtva2 (jt=0x7ffff13e8200, a=0x555555768900, w=0x5555556d1600, self=0x7ffff3a28600 <primtab+4480>, allranks=131072) at ../../../../jsrc/va2.c:749
749 {I lrc=((AHDR2FN*)aadocv->f)(n,m,av,wv,zv,jt); // run one section. Result of 0 means error

n is -50
m is 1
av, wv and zv are pointers into memory somewhere.

The call to jtva2 looked like this:

#4 0x00007ffff29e45fb in jtatomic2 (jt=0x7ffff13e8200, a=0x555555769080, w=0x5555556d1600, self=0x7ffff3a28600 <primtab+4480>) at ../../../../jsrc/va2.c:1276
1276 z=jtva2(jtinplace,a,w,self,(awr<<RANK2TX)+selfranks); // execute the verb
(gdb) p *a
$10 = {kchain = {k = 56, chain = 0x38, globalst = 0x38, locpath = 0x38}, flag = 0, mback = {m = 93824992547648, back = 0x5555555a1340, jobpyx = 0x5555555a1340, zaploc = 0x5555555a1340, aarg = 0x5555555a1340}, tproxy = {t = 4, proxychain = 0x4}, c = 1, n = 1, r = 0 '\000', filler = 0 '\000', h = 445, origin = 0, lock = 0, s = {-100}}
(gdb) p *w
$11 = {kchain = {k = 72, chain = 0x48, globalst = 0x48, locpath = 0x48}, flag = 64, mback = {m = 93824992547688, back = 0x5555555a1368, jobpyx = 0x5555555a1368, zaploc = 0x5555555a1368, aarg = 0x5555555a1368}, tproxy = {t = 64, proxychain = 0x40}, c = -9223372036854775807, n = 49, r = 2 '\002', filler = 0 '\000', h = 664, origin = 0, lock = 0, s = {7}}
(gdb) p self
$12 = (A) 0x7ffff3a28600 <primtab+4480>
(gdb) p *self
$13 = {kchain = {k = 56, chain = 0x38, globalst = 0x38, locpath = 0x38}, flag = 134217728, mback = {m = 0, back = 0x0, jobpyx = 0x0, zaploc = 0x0, aarg = 0x0}, tproxy = {t = 134217728, proxychain = 0x8000000}, c = 4611686018427387904, n = 0, r = 0 '\000', filler = 0 '\000', h = 0, origin = 0, lock = 0, s = {0}}
(gdb) p awr
$14 = 2

So it should be adding -100 to a 7 by 7 matrix of extended integers.

And, for what it's worth, here's av, wv and zv in jtva2 (for comparison):

(gdb) p av
$15 = (C *) 0x555555768938 "p\022tUUU"
(gdb) p wv
$16 = (C *) 0x5555556d1648 "Ц}UUU"
(gdb) p zv
$17 = (C *) 0x5555556cfc48 "\300\017|UUU"

And the point of failure in plusXX (where I've added audittstack calls between every line) looks like this:

#2 0x00007ffff2e70879 in plusXX (n=-50, m=1, x=0x555555768940, y=0x5555556d17a0, z=0x5555556cfda0, jt=0x7ffff13e8200) at ../../../../jsrc/vx.c:368
368 {if(MEMAUDIT&2)audittstack(jt);}
...

There's a lot going on in jtva2, and there's a lot going on in the memory management, but I'm wondering if maybe this is storing its results to a bogus location. If that's the case, I am not at all sure that the details I am providing here are going to be all that useful for isolating the problem.

But I guess I could use a few more hints about where I should be looking...

Thanks,

--
Raul

On Sat, Nov 25, 2023 at 10:42 AM Raul Miller <rauldmil...@gmail.com> wrote:
>
> Well... I discovered and fixed one problem (an off by 2 error with AC(z)).
>
> Ironically, fixing this problem had no effect on the problem I've been
> trying to isolate.
>
> Which seems strange...
>
> --
> Raul
>
> On Sat, Nov 25, 2023 at 8:43 AM Raul Miller <rauldmil...@gmail.com> wrote:
> >
> > I think I figured out the problem, given Henry's hints and what I'm seeing.
> >
> > No need for anyone else to spend time on this.
> >
> > Thanks,
> >
> > --
> >
> > Raul
> >
> > On Fri, Nov 24, 2023 at 10:12 PM Bill Heagy <whe...@bell.net> wrote:
> > >
> > > Probably not helpful. This produced the right answer, but complained a
> > > lot. I just recompiled with gcc, which produced the wrong answer, but
> > > didn't complain.
> > >
> > > Bill H.
> > >
> > > On 11/24/23 22:01, Bill Heagy wrote:
> > > > Does this help:
> > > >
> > > > I've compiled with sanitize:
> > > > CC='clang -g -fsanitize=address -fno-omit-frame-pointer
> > > > -fsanitize-recover=address' jplatform=linux j64x=j64avx2
> > > > make2/build_all.sh
> > > >
> > > >
> > > > $ USE_OPENMPI=2 ASAN_OPTIONS=halt_on_error=0 ./jlibrary/bin/jconsole
> > > > [various complaints starting up]
> > > > .......
> > > > 0j30 ": (*%) 11 c. 665142606648569600281099799288x
> > > > =================================================================
> > > > ==136587==ERROR: AddressSanitizer: global-buffer-overflow on address
> > > > 0x7eff18c54180 at pc 0x7eff18611c52 bp 0x7ffc21b2e3f0 sp 0x7ffc21b2e3e8
> > > > READ of size 8 at 0x7eff18c54180 thread T0
> > > > #0 0x7eff18611c51 in mvc
> > > > /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/u.c:345:8
> > > > #1 0x7eff18571535 in jtfmte
> > > > /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/f2.c:224:94
> > > > #2 0x7eff18571535 in jtfmt1
> > > > /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/f2.c:244:13
> > > > #3 0x7eff1856d943 in jtth2a
> > > > /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/f2.c:287:3
> > > > #4 0x7eff1856d943 in jtthorn2
> > > > /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/f2.c:390:3
> > > > #5 0x7eff185c964f in jtparsea
> > > > /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/p.c:751:10
> > > > #6 0x7eff185c817b in jtparse
> > > > /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/p.c:290:4
> > > > #7 0x7eff185d6afb in jtimmex
> > > > /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/px.c:54:28
> > > > #8 0x7eff1858f47c in jtimmexexecct
> > > > /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/io.c:386:2
> > > > #9 0x7eff1858f47c in jdo
> > > > /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/io.c:422:111
> > > > #10 0x7eff1858ee62 in JDo
> > > > /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/io.c:527:9
> > > > #11 0x563c3aa02bff in main
> > > > /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/jconsole.c:393:28
> > > > #12 0x7eff1fe286c9 in __libc_start_call_main
> > > > csu/../sysdeps/nptl/libc_start_call_main.h:58:16
> > > > #13 0x7eff1fe28784 in __libc_start_main
> > > > csu/../csu/libc-start.c:360:3
> > > > #14 0x563c3a92b440 in _start
> > > > (/home/wheagy/git/jsource/jlibrary/bin/jconsole+0x23440) (BuildId:
> > > > 5cef6dfdd9af34a072711de8a792780461088bc1)
> > > >
> > > > 0x7eff18c54180 is located 32 bytes before global variable '.str.13'
> > > > defined in
> > > > '/home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/f2.c:104'
> > > > (0x7eff18c541a0) of size 2
> > > > '.str.13' is ascii string '*'
> > > > 0x7eff18c54182 is located 0 bytes after global variable '.str.12'
> > > > defined in
> > > > '/home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/f2.c:98'
> > > > (0x7eff18c54180) of size 2
> > > > '.str.12' is ascii string '0'
> > > > SUMMARY: AddressSanitizer: global-buffer-overflow
> > > > /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/u.c:345:8
> > > > in mvc
> > > > Shadow bytes around the buggy address:
> > > > 0x7eff18c53f00: 00 00 00 02 f9 f9 f9 f9 00 00 00 00 00 00 00 00
> > > > 0x7eff18c53f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > > 0x7eff18c54000: 00 05 f9 f9 00 07 f9 f9 00 06 f9 f9 04 f9 f9 f9
> > > > 0x7eff18c54080: 03 f9 f9 f9 02 f9 f9 f9 04 f9 f9 f9 03 f9 f9 f9
> > > > 0x7eff18c54100: 05 f9 f9 f9 04 f9 f9 f9 03 f9 f9 f9 02 f9 f9 f9
> > > > =>0x7eff18c54180:[02]f9 f9 f9 02 f9 f9 f9 00 00 00 f9 f9 f9 f9 f9
> > > > 0x7eff18c54200: 00 00 06 f9 f9 f9 f9 f9 03 f9 f9 f9 00 00 00 00
> > > > 0x7eff18c54280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > > 0x7eff18c54300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > > 0x7eff18c54380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > > 0x7eff18c54400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > > Shadow byte legend (one shadow byte represents 8 application bytes):
> > > > Addressable: 00
> > > > Partially addressable: 01 02 03 04 05 06 07
> > > > Heap left redzone: fa
> > > > Freed heap region: fd
> > > > Stack left redzone: f1
> > > > Stack mid redzone: f2
> > > > Stack right redzone: f3
> > > > Stack after return: f5
> > > > Stack use after scope: f8
> > > > Global redzone: f9
> > > > Global init order: f6
> > > > Poisoned by user: f7
> > > > Container overflow: fc
> > > > Array cookie: ac
> > > > Intra object redzone: bb
> > > > ASan internal: fe
> > > > Left alloca redzone: ca
> > > > Right alloca redzone: cb
> > > > 1.000000000000000000000000000000
> > > >
> > > >
> > > > On 11/24/23 21:41, Raul Miller wrote:
> > > >> So... near as I can tell, the problem occurs inside plusXX which
> > > >> resides between jtva2 and jtxplus.
> > > >>
> > > >> Specifically, an audittstack in jtva2 like this:
> > > >> {if(MEMAUDIT&2)audittstack(jt);}
> > > >> {I lrc=((AHDR2FN*)aadocv->f)(n,m,av,wv,zv,jt); // run one
> > > >> section. Result of 0 means error
> > > >> {if(MEMAUDIT&2)audittstack(jt);}
> > > >>
> > > >> and in jtplusx like this:
> > > >>
> > > >> XF2(jtxplus){ // a+w
> > > >> {if(MEMAUDIT&2)audittstack(jt);}
> > > >>
> > > >> gives me a segfault with a stack trace on that first line of jtxplus,
> > > >> with plusXX between jtva2 and jtxplus, and of course with the above
> > > >> lrc= line on the stack for jtva2 (but only after the 150 seconds of
> > > >> scripting to trigger the problem).
> > > >>
> > > > ----------------------------------------------------------------------
> > > > For information about J forums see http://www.jsoftware.com/forums.htm