So... I've expanded the definition of plusXX and put an audittstack
call between each line. Leaving the audit calls out, the expansion is
basically:

I plusXX(I n, I m, X* x, X* y, X* z, JJ jt){
  X u; X v;
  if (n-1==0) {
    I i= (I)(m)-1;
    for (; i>=0; --i){
      u= (*x++);
      v= (*y++);
      *z++= jtxplus(jt,(u),(v));
    }
  } else if (n-1<0){
    I i= (I)(m)-1;
    for (; i>=0; --i){
      u= (*x++);
      {
        I i= -2-(I)(n);
        for (; i>=0; --i){
          v= (*y++);
          *z++= jtxplus(jt,(u),(v)); // this update of *z is "the" problem
        }
      }
    }
  } else {
    I i= (I)(m)-1;
    for (; i>=0; --i){
      v= (*y++);
      {
        I i= (I)(n)-1;
        for (; i>=0; --i){
          u= (*x++);
          *z++= jtxplus(jt,(u),(v));
        }
      }
    }
  };
  I rc= jt->jerr;
  jt->jerr= 0;
  return rc?rc:256;
}

Unfortunately, I'm a bit lost here - I do not know what I am looking at.

The line in jtva2 which called plusXX looks like this:

#3  0x00007ffff29e7bec in jtva2 (jt=0x7ffff13e8200, a=0x555555768900, w=0x5555556d1600,
    self=0x7ffff3a28600 <primtab+4480>, allranks=131072) at ../../../../jsrc/va2.c:749
749             {I lrc=((AHDR2FN*)aadocv->f)(n,m,av,wv,zv,jt);    // run one section.  Result of 0 means error

n is -50
m is 1
av, wv and zv are pointers into memory somewhere.
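
As a sanity check on those two values, the loop bounds of the n<0 branch in the expansion above can be run on their own. This is only a sketch: the typedef for I and the name count_calls_negative_n are stand-ins that do not appear in the J source, and the jtxplus call is replaced by a counter.

```c
typedef long long I;   /* assumption: stand-in for J's 64-bit signed I */

/* Count how many times the n<0 branch of the expanded plusXX would call
   jtxplus. The loop bounds are copied from that branch; the loop body is
   replaced by a counter. The function name is hypothetical. */
I count_calls_negative_n(I n, I m) {
    I calls = 0;
    for (I i = m - 1; i >= 0; --i) {         /* outer loop: one pass per u */
        for (I j = -2 - n; j >= 0; --j) {    /* inner loop: one pass per v */
            ++calls;
        }
    }
    return calls;
}
```

With n = -50 and m = 1 the inner loop runs from 48 down to 0, so count_calls_negative_n(-50, 1) is 49: one u combined with 49 values of v.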

The call to jtva2 looked like this:

#4  0x00007ffff29e45fb in jtatomic2 (jt=0x7ffff13e8200, a=0x555555769080, w=0x5555556d1600,
    self=0x7ffff3a28600 <primtab+4480>) at ../../../../jsrc/va2.c:1276
1276      z=jtva2(jtinplace,a,w,self,(awr<<RANK2TX)+selfranks);  // execute the verb
(gdb) p *a
$10 = {kchain = {k = 56, chain = 0x38, globalst = 0x38, locpath = 0x38}, flag = 0, mback = {m = 93824992547648,
    back = 0x5555555a1340, jobpyx = 0x5555555a1340, zaploc = 0x5555555a1340, aarg = 0x5555555a1340}, tproxy = {t = 4,
    proxychain = 0x4}, c = 1, n = 1, r = 0 '\000', filler = 0 '\000', h = 445, origin = 0, lock = 0, s = {-100}}
(gdb) p *w
$11 = {kchain = {k = 72, chain = 0x48, globalst = 0x48, locpath = 0x48}, flag = 64, mback = {m = 93824992547688,
    back = 0x5555555a1368, jobpyx = 0x5555555a1368, zaploc = 0x5555555a1368, aarg = 0x5555555a1368}, tproxy = {t = 64,
    proxychain = 0x40}, c = -9223372036854775807, n = 49, r = 2 '\002', filler = 0 '\000', h = 664, origin = 0,
  lock = 0, s = {7}}
(gdb) p self
$12 = (A) 0x7ffff3a28600 <primtab+4480>
(gdb) p *self
$13 = {kchain = {k = 56, chain = 0x38, globalst = 0x38, locpath = 0x38}, flag = 134217728, mback = {m = 0, back = 0x0,
    jobpyx = 0x0, zaploc = 0x0, aarg = 0x0}, tproxy = {t = 134217728, proxychain = 0x8000000},
  c = 4611686018427387904, n = 0, r = 0 '\000', filler = 0 '\000', h = 0, origin = 0, lock = 0, s = {0}}
(gdb) p awr
$14 = 2

So it should be adding -100 to a 7 by 7 matrix of extended integers.

And, for what it's worth, here's av, wv and zv in jtva2 (for comparison):
(gdb) p av
$15 = (C *) 0x555555768938 "p\022tUUU"
(gdb) p wv
$16 = (C *) 0x5555556d1648 "Ц}UUU"
(gdb) p zv
$17 = (C *) 0x5555556cfc48 "\300\017|UUU"

and the failing frame in plusXX (where I've added audittstack calls
between every line) looks like this:
#2  0x00007ffff2e70879 in plusXX (n=-50, m=1, x=0x555555768940, y=0x5555556d17a0, z=0x5555556cfda0, jt=0x7ffff13e8200)
    at ../../../../jsrc/vx.c:368
368             {if(MEMAUDIT&2)audittstack(jt);}
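
Comparing x, y and z in this frame with av, wv and zv from jtva2 shows how far the loop had progressed when the audit failed. The sketch below does that subtraction. It assumes each X element occupies 8 bytes (X taken to be a pointer on this 64-bit build, which is consistent with x having moved by exactly 8 bytes), and the name elements_advanced is hypothetical.

```c
#include <stdint.h>

/* Number of 8-byte elements between a starting address and a later one.
   The 8-byte element size is an assumption about sizeof(X) on this build;
   the function name is hypothetical. */
int64_t elements_advanced(uint64_t start, uint64_t now) {
    return (int64_t)(now - start) / 8;
}
```

Plugging in the addresses from the gdb output: x has advanced 1 element (0x555555768938 to 0x555555768940), while y and z have each advanced 43 (0x158 bytes). If the values gdb shows for this frame are current, the audit fails after about 43 of the 49 inner iterations, with reads and stores still in step with each other.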

...

There's a lot going on in jtva2, and a lot going on in the memory
management, but I'm wondering whether this code is storing its results
to a bogus location. If that's the case, I'm not at all sure that the
details I'm providing here will be much use for isolating the problem.

But I guess I could use a few more hints about where I should be looking...

Thanks,

-- 
Raul

On Sat, Nov 25, 2023 at 10:42 AM Raul Miller <rauldmil...@gmail.com> wrote:
>
> Well... I discovered and fixed one problem (an off by 2 error with AC(z)).
>
> Ironically, fixing this problem had no effect on the problem I've been
> trying to isolate.
>
> Which seems strange...
>
> --
> Raul
>
> On Sat, Nov 25, 2023 at 8:43 AM Raul Miller <rauldmil...@gmail.com> wrote:
> >
> > I think I figured out the problem, given Henry's hints and what I'm seeing.
> >
> > No need for anyone else to spend time on this.
> >
> > Thanks,
> >
> > --
> > Raul
> >
> > On Fri, Nov 24, 2023 at 10:12 PM Bill Heagy <whe...@bell.net> wrote:
> > >
> > > Probably not helpful.  This produced the right answer, but complained a
> > > lot.  I just recompiled with gcc, which produced the wrong answer, but
> > > didn't complain.
> > >
> > > Bill H.
> > >
> > > On 11/24/23 22:01, Bill Heagy wrote:
> > > > Does this help:
> > > >
> > > > I've compiled with sanitize:
> > > > > CC='clang -g -fsanitize=address -fno-omit-frame-pointer -fsanitize-recover=address' jplatform=linux j64x=j64avx2 make2/build_all.sh
> > > >
> > > >
> > > > $ USE_OPENMPI=2 ASAN_OPTIONS=halt_on_error=0 ./jlibrary/bin/jconsole
> > > > [various complaints starting up]
> > > > .......
> > > >     0j30 ": (*%) 11 c. 665142606648569600281099799288x
> > > > =================================================================
> > > > > ==136587==ERROR: AddressSanitizer: global-buffer-overflow on address 0x7eff18c54180 at pc 0x7eff18611c52 bp 0x7ffc21b2e3f0 sp 0x7ffc21b2e3e8
> > > > > READ of size 8 at 0x7eff18c54180 thread T0
> > > > >      #0 0x7eff18611c51 in mvc /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/u.c:345:8
> > > > >      #1 0x7eff18571535 in jtfmte /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/f2.c:224:94
> > > > >      #2 0x7eff18571535 in jtfmt1 /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/f2.c:244:13
> > > > >      #3 0x7eff1856d943 in jtth2a /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/f2.c:287:3
> > > > >      #4 0x7eff1856d943 in jtthorn2 /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/f2.c:390:3
> > > > >      #5 0x7eff185c964f in jtparsea /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/p.c:751:10
> > > > >      #6 0x7eff185c817b in jtparse /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/p.c:290:4
> > > > >      #7 0x7eff185d6afb in jtimmex /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/px.c:54:28
> > > > >      #8 0x7eff1858f47c in jtimmexexecct /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/io.c:386:2
> > > > >      #9 0x7eff1858f47c in jdo /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/io.c:422:111
> > > > >      #10 0x7eff1858ee62 in JDo /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/io.c:527:9
> > > > >      #11 0x563c3aa02bff in main /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/jconsole.c:393:28
> > > > >      #12 0x7eff1fe286c9 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
> > > > >      #13 0x7eff1fe28784 in __libc_start_main csu/../csu/libc-start.c:360:3
> > > > >      #14 0x563c3a92b440 in _start (/home/wheagy/git/jsource/jlibrary/bin/jconsole+0x23440) (BuildId: 5cef6dfdd9af34a072711de8a792780461088bc1)
> > > >
> > > > > 0x7eff18c54180 is located 32 bytes before global variable '.str.13' defined in '/home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/f2.c:104' (0x7eff18c541a0) of size 2
> > > > >    '.str.13' is ascii string '*'
> > > > > 0x7eff18c54182 is located 0 bytes after global variable '.str.12' defined in '/home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/f2.c:98' (0x7eff18c54180) of size 2
> > > > >    '.str.12' is ascii string '0'
> > > > > SUMMARY: AddressSanitizer: global-buffer-overflow /home/wheagy/git/jsource/make2/obj/linux/j64avx2/../../../../jsrc/u.c:345:8 in mvc
> > > > Shadow bytes around the buggy address:
> > > >    0x7eff18c53f00: 00 00 00 02 f9 f9 f9 f9 00 00 00 00 00 00 00 00
> > > >    0x7eff18c53f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > >    0x7eff18c54000: 00 05 f9 f9 00 07 f9 f9 00 06 f9 f9 04 f9 f9 f9
> > > >    0x7eff18c54080: 03 f9 f9 f9 02 f9 f9 f9 04 f9 f9 f9 03 f9 f9 f9
> > > >    0x7eff18c54100: 05 f9 f9 f9 04 f9 f9 f9 03 f9 f9 f9 02 f9 f9 f9
> > > > =>0x7eff18c54180:[02]f9 f9 f9 02 f9 f9 f9 00 00 00 f9 f9 f9 f9 f9
> > > >    0x7eff18c54200: 00 00 06 f9 f9 f9 f9 f9 03 f9 f9 f9 00 00 00 00
> > > >    0x7eff18c54280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > >    0x7eff18c54300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > >    0x7eff18c54380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > >    0x7eff18c54400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > > Shadow byte legend (one shadow byte represents 8 application bytes):
> > > >    Addressable:           00
> > > >    Partially addressable: 01 02 03 04 05 06 07
> > > >    Heap left redzone:       fa
> > > >    Freed heap region:       fd
> > > >    Stack left redzone:      f1
> > > >    Stack mid redzone:       f2
> > > >    Stack right redzone:     f3
> > > >    Stack after return:      f5
> > > >    Stack use after scope:   f8
> > > >    Global redzone:          f9
> > > >    Global init order:       f6
> > > >    Poisoned by user:        f7
> > > >    Container overflow:      fc
> > > >    Array cookie:            ac
> > > >    Intra object redzone:    bb
> > > >    ASan internal:           fe
> > > >    Left alloca redzone:     ca
> > > >    Right alloca redzone:    cb
> > > > 1.000000000000000000000000000000
> > > >
> > > >
> > > > On 11/24/23 21:41, Raul Miller wrote:
> > > >> So... near as I can tell, the problem occurs inside plusXX which
> > > >> resides between jtva2 and jtxplus.
> > > >>
> > > >> Specifically, an audittstack in jtva2 like this:
> > > >> {if(MEMAUDIT&2)audittstack(jt);}
> > > >>          {I lrc=((AHDR2FN*)aadocv->f)(n,m,av,wv,zv,jt);    // run one
> > > >> section.  Result of 0 means error
> > > >> {if(MEMAUDIT&2)audittstack(jt);}
> > > >>
> > > > >> and in jtxplus like this:
> > > >>
> > > >> XF2(jtxplus){ // a+w
> > > >> {if(MEMAUDIT&2)audittstack(jt);}
> > > >>
> > > >> gives me a segfault with a stack trace on that first line of jtxplus,
> > > >> with plusXX between jtva2 and jtxplus, and of course with the above
> > > >> lrc= line on the stack for jtva2 (but only after the 150 seconds of
> > > >> scripting to trigger the problem).
> > > >>
> > > > ----------------------------------------------------------------------
> > > > For information about J forums see http://www.jsoftware.com/forums.htm
> > > ----------------------------------------------------------------------
> > > For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to