Thanks for the update, Greg. Threading could certainly be a cause if we
are missing some kind of synchronization. I know that we have
workspace.startBuilding() and workspace.startIdleState() as ways of
ensuring threads are under control. We may be missing one of those calls
somewhere before emitting JS.

As for GC, I recall that reducing JVM memory wasn't necessarily enough for
me to reproduce the other GC-related bug I mentioned, strange as that
seems. I remember also adding System.gc() calls in various places (though I
don't remember exactly where), and I think that's what finally allowed me
to reproduce the issue semi-reliably.
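
For illustration only, here is a minimal sketch of the general pattern under
discussion: a cache whose entries are held through SoftReferences and are
re-resolved when a lookup finds a cleared entry. Everything in it is
hypothetical (the class SoftCacheSketch, its methods, and the resolver); none
of it is taken from ASScopeCache or the compiler source. Because System.gc()
gives no guarantee about when soft references are cleared, the sketch calls
Reference.clear() directly as a deterministic stand-in for a collection.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical illustration; these names do not come from the Royale compiler.
class SoftCacheSketch {
    private final Map<String, SoftReference<String>> cache = new ConcurrentHashMap<>();
    private final Function<String, String> resolver;
    // Counts how often the (assumed expensive) resolution actually runs.
    final AtomicInteger resolveCount = new AtomicInteger();

    SoftCacheSketch(Function<String, String> resolver) {
        this.resolver = resolver;
    }

    String lookup(String name) {
        SoftReference<String> ref = cache.get(name);
        // Keep a strong local reference so the value cannot vanish mid-method.
        String value = (ref == null) ? null : ref.get();
        if (value == null) {
            // Entry is missing or was cleared: redo the resolution work.
            // Two threads may both reach this point for the same name; they
            // duplicate work but store equivalent results.
            resolveCount.incrementAndGet();
            value = resolver.apply(name);
            cache.put(name, new SoftReference<>(value));
        }
        return value;
    }

    // Deterministic stand-in for a garbage collection clearing soft references.
    void simulateCollection() {
        for (SoftReference<String> ref : cache.values()) {
            ref.clear();
        }
    }

    public static void main(String[] args) {
        SoftCacheSketch sketch = new SoftCacheSketch(n -> "Type<" + n + ">");
        System.out.println(sketch.lookup("foo"));
        sketch.simulateCollection();
        System.out.println(sketch.lookup("foo"));
        System.out.println("resolutions: " + sketch.resolveCount.get());
    }
}
```

The point of the sketch is that a cleared entry is harmless as long as every
reader treats a null referent as "resolve again". A code path that reads the
referent without that fallback would see missing data only after a collection,
which would be consistent with a bug that is rare and hard to reproduce.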

--
Josh Tynjala
Bowler Hat LLC
https://bowlerhat.dev/


On Mon, May 11, 2026 at 10:15 PM Greg Dove <[email protected]> wrote:

> Hey Josh,
>
> I am actively looking into this again. I am less convinced that it is GC
> related (I reduced memory allocation to low levels) and perhaps it is more
> to do with threads/race-conditions. But it's very difficult to be sure, I
> spent today adding logging and trying to repro, but did not repro the bug
> all day. I will keep on this tomorrow trying to find the right conditions
> to force it to occur. If I can figure out what those are, I will share them
> with you.
>
> -Greg
>
>
> On Tue, May 5, 2026 at 9:06 AM Greg Dove <[email protected]> wrote:
>
> > Wake up brain (self talk):
> > "and then not wrong for subsequent output" <- should be of course "and
> > then wrong for subsequent output".
> >
> > On Tue, May 5, 2026 at 9:05 AM Greg Dove <[email protected]> wrote:
> >
> >> Thanks for looking into this, Josh.
> >>
> >>  "If it isn't too difficult to reproduce"
> >> Quick comments, just in case it helps:
> >>
> >> It was not something I could repro for debugging purposes in the
> >> compiler. It was still 'rare' in practice - max 2-3 times per day that I
> >> observed, sometimes only once a day - and not manifesting in the same
> >> code - although perhaps that is simply because code can change a lot
> >> between compiler runs - and "awareness" was based on the app not starting
> >> up correctly or noticeable runtime errors. I did not check this: perhaps
> >> it is happening more often than I think but with no side effects. This
> >> could happen if it sometimes outputs a typed method as instance.method()
> >> where type resolution worked and elsewhere alongside as
> >> instance['method']() where it did not. The problem might not simply get
> >> noticed in this case, but this is pure speculation, I have not checked
> >> for this.
> >>
> >> I did not try reducing heap allocation or anything to try to create
> >> conditions for it to perhaps happen more often if it is memory/GC
> >> related.
> >>
> >> I see notes like this in the code:
> >> // If we get this far, then we did not find a cached entry
> >> // It is possible for 2+ threads to get in here for the same name.
> >> // This is intentional - the worst that happens is that we duplicate the
> >> // resolution work
> >> // the benefit is that we avoid any sort of locking, which was proving
> >> // expensive (time wise, and memory wise).
> >>
> >> When you see the code that was problematic output, you can see the same
> >> name lookup inside a js method that is obviously correctly resolved
> >> (anecdotally it seems to be more often 'correct' the first time) and then
> >> not wrong for subsequent output, in nearby code, so I assume it might be
> >> related to some unsynchronized state or failure to do that 'duplicate'
> >> resolution work, where the various parts were being processed in
> >> parallel...
> >>
> >> Anyway, good luck, please let me know if you have anything you think I
> >> could do to help.
> >>
> >>
> >>
> >> On Tue, May 5, 2026 at 6:29 AM Harbs <[email protected]> wrote:
> >>
> >>> Sure. I’ll be in touch off list.
> >>>
> >>> > On May 4, 2026, at 9:18 PM, Josh Tynjala <[email protected]> wrote:
> >>> >
> >>> > Would you be willing to give me access to the project? If it isn't too
> >>> > difficult to reproduce, I may be able to figure out what's going on and
> >>> > how to restore the missing typing data, similar to my other fix. My
> >>> > feeling is that the original Adobe devs intended for occasional garbage
> >>> > collection to occur to stay within memory limits, but that the data
> >>> > would be restorable, if needed later. I think that they simply missed
> >>> > some places where it might need to be restored because it happens pretty
> >>> > rarely. Or maybe our newer JS emitter isn't properly accounting for that
> >>> > possibility.
> >>> >
> >>> > --
> >>> > Josh Tynjala
> >>> > Bowler Hat LLC
> >>> > https://bowlerhat.dev/
> >>> >
> >>> >
> >>> > On Mon, May 4, 2026 at 10:37 AM Harbs <[email protected]> wrote:
> >>> >
> >>> >>> You've tested that this issue still
> >>> >>> reproduces using a compiler built from the latest source code?
> >>> >>
> >>> >> This was reproduced by a number of devs all working on the same project.
> >>> >> And yes, it was with recent builds.
> >>> >>
> >>> >> I don’t think I personally have seen it (I have a lot of memory on my
> >>> >> machine), but it seems to have gotten worse recently. I don’t know if
> >>> >> something changed in the compiler or it’s due to the increased project
> >>> >> size.
> >>> >>
> >>> >> This was with variables — not functions.
> >>> >>
> >>> >> Harbs
> >>> >>
> >>> >>> On May 4, 2026, at 6:54 PM, Josh Tynjala <[email protected]> wrote:
> >>> >>>
> >>> >>> This issue may be the same one:
> >>> >>>
> >>> >>> https://github.com/apache/royale-compiler/issues/182
> >>> >>>
> >>> >>> I also encountered and fixed an issue related to weak references a
> >>> >>> little over a year ago. Function bodies were getting garbage collected,
> >>> >>> and I needed to clear out some stale definitions that were causing
> >>> >>> missing classes in generated ASDoc output and some similar issues with
> >>> >>> the -watch compiler option.
> >>> >>>
> >>> >>>
> >>> >>> https://github.com/apache/royale-compiler/commit/35eed62f13519c659e6346d26cca3f44afe3170f
> >>> >>>
> >>> >>> This fix does not appear to have made it into a release yet. You're not
> >>> >>> using an older compiler build, right? You've tested that this issue
> >>> >>> still reproduces using a compiler built from the latest source code?
> >>> >>>
> >>> >>> --
> >>> >>> Josh Tynjala
> >>> >>> Bowler Hat LLC
> >>> >>> https://bowlerhat.dev/
> >>> >>>
> >>> >>>
> >>> >>> On Sun, May 3, 2026 at 9:40 PM Greg Dove <[email protected]> wrote:
> >>> >>>
> >>> >>>> Compiler issues - (Josh, please?)
> >>> >>>>
> >>> >>>> We have a medium-sized project that has begun encountering
> >>> >>>> occasional/rare (but at least daily during normal workloads)
> >>> >>>> compilation issues that appear to be related to name/type resolution.
> >>> >>>> There can be code within a method output where the name resolves
> >>> >>>> correctly to its type in one part of the method's js output and
> >>> >>>> elsewhere within the same js method output as if it was
> >>> >>>> Object/untyped. This is most obvious with XML or XMLList instances
> >>> >>>> (because of .child('prop') vs ['prop'] differences). I've also seen it
> >>> >>>> get confused between local variables and instance properties in some
> >>> >>>> cases, which I believe is a manifestation of the same thing. In other
> >>> >>>> words, different compilation runs with the exact same settings are not
> >>> >>>> completely deterministic, because sometimes they can provide different
> >>> >>>> output. It is very difficult to repro, because it feels so random. But
> >>> >>>> it has been something that appears to be more frequent as the codebase
> >>> >>>> grows (when all other settings remain the same). This led me to
> >>> >>>> consider that it could be GC-related, and I recently removed the
> >>> >>>> SoftReferences inside ASScopeCache, as a prime suspect.
> >>> >>>>
> >>> >>>> After doing this, I have not seen the problem since (so far - after
> >>> >>>> 1.5 days)
> >>> >>>>
> >>> >>>> The ASScopeCache instances themselves are weakly held (inside
> >>> >>>> CompilerProject). So the internal maps inside each of these instances
> >>> >>>> being weakly held as well seems to be the problem; the internal maps
> >>> >>>> can perhaps get into a partially cleared state between threads.
> >>> >>>>
> >>> >>>> I did some memory profiling with and without this change for removing
> >>> >>>> the SoftReferences inside ASScopeCache - but it was quite limited (just
> >>> >>>> testing with compiling the one project). The memory usage was not much
> >>> >>>> different on a typical run (approx 1Mb difference for a compilation
> >>> >>>> with around 1000 .as and .mxml files combined, alongside a bunch of
> >>> >>>> local swcs). There was possibly a small speed up without the
> >>> >>>> SoftReferences, but I did not test enough to be sure.
> >>> >>>> But so far it seems there is not a big impact on memory with omitting
> >>> >>>> these. If it introduces consistency I'm kinda keen to get it in there -
> >>> >>>> I know others have definitely seen this problem too.
> >>> >>>> And for Josh in particular: I think your compiler experience dwarfs the
> >>> >>>> rest of us and wanted to get your feedback instead of just jumping in
> >>> >>>> with this one. One option could also be to make this change as a
> >>> >>>> compiler option, with the new non-weak references being the default,
> >>> >>>> but with the ability to switch to the older behaviour via the option if
> >>> >>>> that was considered important as well... look forward to hearing your
> >>> >>>> thoughts.
> >>> >>>>
> >>> >>
> >>> >>
> >>>
> >>>
>
