The segfault on 1x was due to bad debugging code which I had
introduced. Yes, I have removed that.
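
For anyone reading this in the archive, here is a minimal sketch of the
idea discussed in the quoted messages below: choosing the deallocator per
block rather than per container, so that an XNUM or RAT whose members came
from different allocators is still freed correctly. This is an
illustration only. Every name in it (block_t, owner_t, block_new,
block_free, free_all, the counters) is a hypothetical stand-in and is not
taken from m.c or ja.h, and plain malloc/free stands in for both the J
allocator and the libgmp allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins: none of these names come from the J sources. */
typedef enum { OWNER_J, OWNER_GMP } owner_t;

typedef struct {
    owner_t owner;          /* which allocator produced this block */
    size_t nlimbs;          /* number of limbs in the value */
    unsigned long *limbs;   /* the limb data itself */
} block_t;

/* Counters so that the dispatch is observable in a test. */
int j_frees = 0;
int gmp_frees = 0;

/* Both "deallocators" use free() here; in a real system they would differ. */
void j_free(block_t *b)   { free(b->limbs); free(b); j_frees++; }
void gmp_free(block_t *b) { free(b->limbs); free(b); gmp_frees++; }

/* Allocate a zeroed block and record which allocator owns it. */
block_t *block_new(owner_t owner, size_t nlimbs) {
    block_t *b = malloc(sizeof *b);
    if (!b) return NULL;
    b->limbs = calloc(nlimbs ? nlimbs : 1, sizeof *b->limbs);
    if (!b->limbs) { free(b); return NULL; }
    b->owner = owner;
    b->nlimbs = nlimbs;
    return b;
}

/* Per-block dispatch: the decision is made from the block's own tag,
   never from the type of the container that holds it. */
void block_free(block_t *b) {
    if (!b) return;
    assert(b->owner == OWNER_J || b->owner == OWNER_GMP);
    if (b->owner == OWNER_GMP) gmp_free(b);
    else j_free(b);
}

/* Free a possibly heterogeneous array of blocks, clearing each slot so
   that a second pass cannot free the same block twice. */
void free_all(block_t **elems, size_t n) {
    for (size_t i = 0; i < n; i++) {
        block_free(elems[i]);
        elems[i] = NULL;
    }
}
```

The point of the sketch is only that free_all makes no assumption that
all elements share one allocator. Whether the real code has a
container-level shortcut somewhere is exactly the open question in this
thread.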

Thanks,

-- 
Raul

On Mon, Nov 27, 2023 at 9:25 AM bill lam <[email protected]> wrote:
>
> Have you fixed the segfault on 1x when adding the following line?
> if(unlikely(ISGMP(w))) SEGFAULT; // do not free libgmp managed memory here
>
>
> On Mon, Nov 6, 2023 at 21:36 Raul Miller <[email protected]> wrote:
>
> > So...
> >
> > It turns out that it's fairly easy to find a simple example of this
> > problem.
> >
> > In m.c, add:
> > if(unlikely(ISGMP(w))) SEGFAULT; // do not free libgmp managed memory here
> >
> > as the first line of jtmf. This will crash on i.10x
> >
> > That said, replacing the SEGFAULT with {gmpfree(w);R;} (or, since ja.h
> > isn't available here, with its definition, changing x to w) still
> > results in a crash; it just takes longer.
> >
> > This behavior roughly matches an earlier suspicion (that the decision
> > to use the gmp deallocator vs the j deallocator was being bypassed -
> > that there was an assumption that all members of an XNUM (or a RAT)
> > were constructed using the same allocator), but I don't know my way
> > around the memory management code to see where I should be looking to
> > find this decision.
> >
> > That said, also, I was expecting a segfault in the gmp deallocator,
> > not in m.c, but that corresponding SEGFAULT doesn't seem to trigger.
> >
> > So... do you have any suggestions on how I should approach looking in
> > m.c to find the assumptions about homogeneous XNUMs that I'm breaking?
> >
> > (My current level of ignorance on this subject feels like the kind of
> > architectural detail ignorance that routinely trips people up in many,
> > many "enterprise" contexts. I can presumably work through it by
> > performing inspection and experiments, but it seems more fruitful to
> > just ask.)
> >
> > Thanks,
> >
> > --
> > Raul
> >
> >
> > On Fri, Nov 3, 2023 at 6:22 PM Raul Miller <[email protected]> wrote:
> > >
> > > This sounds like a good plan.
> > >
> > > Unfortunately, my machine is crashing (and sometimes failing to
> > > reboot) just doing step 1.
> > >
> > > So I'm not even sure that the problem I'm encountering is a problem in
> > > my changes - it might be a problem in my machine. (That said, my code
> > > is still the prime suspect.)
> > >
> > > So... I've pushed a copy of my changes to a new branch (gmp-redo0).
> > > This is partially to guard against a complete loss of my machine, and
> > > partially to give someone else a chance of looking at the problem.
> > >
> > > I've not given up, but I have expanded the scope of my concerns, which
> > > is going to slow me down.
> > >
> > > FYI,
> > >
> > > --
> > > Raul
> > >
> > > On Fri, Nov 3, 2023 at 1:41 PM Henry Rich <[email protected]> wrote:
> > > >
> > > > You have an unknown memory corruption running the test suite. This
> > > > is how I debug those:
> > > >
> > > > 1. RECHO ddall to see where it crashes.
> > > > 2. Run the scripts before the crash to see if you can crash with a
> > > >    shorter run.
> > > > 3. When you have the crash as small as you can, set MEMAUDIT to 1d
> > > >    and see if you get an audit failure.
> > > > 4. Once you get an audit failure you want to increase the frequency
> > > >    of audits. This is where 6!:5 (1) comes in. Once you execute
> > > >    that, it audits the free pool very frequently. That slows things
> > > >    down so you want to set it as close to the actual error as
> > > >    possible.
> > > > 5. When you have found the first failure, it will be soon after the
> > > >    errant code. Add calls to auditmemchains liberally until you have
> > > >    isolated the error line.
> > > > 6. If at any point you need to know what J sentence is executing,
> > > >    set TRACKINFO to 1 and look at the names track* in any routine
> > > >    that defines them.
> > > >
> > > > hhr
> > > >
> > > > On Fri, Nov 3, 2023, 5:15 PM Raul Miller <[email protected]> wrote:
> > > >
> > > > > Can you give me some hints on using 6!:5?
> > > > >
> > > > > I don't see any comments on jtpeekdata, and I haven't visualized what
> > > > > I'd be looking for, nor when, for that matter.
> > > > >
> > > > > I got the 0xdeadbeef stuff running MEMAUDIT=0xd.
> > > > >
> > > > > I was running MEMAUDIT=0xff overnight, to see if I could catch the
> > > > > problem earlier, but my machine rebooted. I don't know if that was
> > > > > windows update or if that was some other issue. I'll give it another
> > > > > shot.
> > > > >
> > > > > Basically, though, this isn't a problem which I've figured out a good
> > > > > way of triggering, so it's slow going.
> > > > >
> > > > > --
> > > > > Raul
> > > > >
> > > > >
> > > > > > On Fri, Nov 3, 2023 at 6:18 AM Henry Rich <[email protected]> wrote:
> > > > > >
> > > > > > Set MEMAUDIT and 6!:5 to see where the free chain is corrupted.
> > > > > >
> > > > > > hhr
> > > > > >
> > > > > > > On Thu, Nov 2, 2023, 3:32 PM Raul Miller <[email protected]> wrote:
> > > > > >
> > > > > > > This is turning out to be more difficult than I had anticipated.
> > > > > > >
> > > > > > > > I've modified jtxplus to use mpn_add (and mpn_sub and a macro
> > > > > > > > workalike for the inlined mpn_neg, which in turn uses mpn_com -
> > > > > > > > necessary because the mpn_ family of routines works on unsigned
> > > > > > > > limb sequences). And, it *mostly* works.
> > > > > > > >
> > > > > > > > However, when running script/testga.sh, I encounter a double
> > > > > > > > free problem.
> > > > > > > >
> > > > > > > > My current best guess is that somewhere I'm relying on a
> > > > > > > > container test (XNUM/RAT) instead of relying on the ISGMP()
> > > > > > > > test. But I looked through m.c and I'm not seeing anything
> > > > > > > > there that looks plausible.
> > > > > > > >
> > > > > > > > I did notice that the frgmp() macro is not referenced anywhere,
> > > > > > > > and I used the corresponding fr() macro in my implementation
> > > > > > > > rather than mf() - but if that's an issue, I need a better
> > > > > > > > understanding of this part of the internal api.
> > > > > > > >
> > > > > > > > So... anyways... before I dig this hole too deep, I figure I
> > > > > > > > should ask for advice on how to proceed.
> > > > > > >
> > > > > > > Thoughts?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > --
> > > > > > > Raul
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > On Wed, Oct 25, 2023 at 3:12 AM Henry Rich <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > OK, I see now.  The implementation using low-level GMP
> > > > > > > > > would be more parallel with Roger's original version,
> > > > > > > > > right?
> > > > > > > > >
> > > > > > > > > That would end up being simpler than having reserve memory,
> > > > > > > > > as well as stabler.
> > > > > > > > >
> > > > > > > > > Since you currently mark blocks that are to be freed by
> > > > > > > > > GMP, you could make this change piecemeal, right?  When you
> > > > > > > > > rewrite addition to use the low-level routines, you mark
> > > > > > > > > the blocks allocated by addition as J not GMP, and
> > > > > > > > > everything else follows automatically.
> > > > > > > > >
> > > > > > > > > That's a great idea.
> > > > > > > >
> > > > > > > > hhr
> > > > > > > >
> > > > > > > > On 10/24/2023 8:34 PM, Raul Miller wrote:
> > > > > > > > > > libzahl is not thread safe, and even if it was, it's not
> > > > > > > > > > clear to me that it adequately supports enough
> > > > > > > > > > architectures.
> > > > > > > > > >
> > > > > > > > > > Meanwhile, libgmp's problems are addressable. I just have
> > > > > > > > > > to use a different part of its API.
> > > > > > > > >
> > > > > > > > > (Also, on windows, we're using mpir rather than libgmp.)
> > > > > > > > >
> > > > > > > > > > (J currently uses parts of the libgmp high level API,
> > > > > > > > > > which performs memory allocations within the libgmp
> > > > > > > > > > library routines, using callbacks whose implementation I
> > > > > > > > > > supply. But libgmp also exposes the low level routines
> > > > > > > > > > used to build those high level routines, and those low
> > > > > > > > > > level routines do not perform memory allocation, which
> > > > > > > > > > means that we can manage the memory outside of the API.)
> > > > > > > > > >
> > > > > > > > > > ((The problem with libgmp's high level API is that if a
> > > > > > > > > > memory allocation fails, it exits the program. So we came
> > > > > > > > > > up with a workaround which reserves a memory pool, and
> > > > > > > > > > limits arguments to certain routines, so memory
> > > > > > > > > > allocations will succeed even under low memory
> > > > > > > > > > conditions. That's not ideal, but it has been "good
> > > > > > > > > > enough, so far". But libgmp supports another approach.
> > > > > > > > > > It's a little more work, but not an excessive amount of
> > > > > > > > > > work.))
> > > > > > > > >
> > > > > > > > > I hope this makes sense,
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > >
> > ----------------------------------------------------------------------
> > > > > > > > For information about J forums see
> > > > > http://www.jsoftware.com/forums.htm
> > > > > > >
> > ----------------------------------------------------------------------
> > > > > > > For information about J forums see
> > http://www.jsoftware.com/forums.htm
> > > > > > >
> > > > > >
> > ----------------------------------------------------------------------
> > > > > > For information about J forums see
> > http://www.jsoftware.com/forums.htm
> > > > >
> > ----------------------------------------------------------------------
> > > > > For information about J forums see
> > http://www.jsoftware.com/forums.htm
> > > > >
> > > > ----------------------------------------------------------------------
> > > > For information about J forums see http://www.jsoftware.com/forums.htm
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm
> >
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm