Hey Josh,

I am actively looking into this again. I am now less convinced that it is
GC-related (I reduced memory allocation to low levels), and it may have more
to do with threads/race conditions. But it's very difficult to be sure: I
spent today adding logging and trying to repro, and the bug did not occur
all day. I will keep at it tomorrow, trying to find the right conditions
to force it to occur. If I can figure out what those are, I will share them
with you.
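To make the two suspects (GC clearing vs. thread timing) concrete, here is a
minimal Java sketch of a softly-referenced name cache. Everything in it
(SoftCacheSketch, lookupFragile, lookup, simulateGcClear, the resolver
function) is a hypothetical stand-in for illustration, not the actual
ASScopeCache code, and it is not confirmed that the compiler follows either
pattern. It only shows how a check-then-dereference lookup can hand back null
when a referent is cleared at the wrong moment, and how reading the reference
once and re-resolving on a miss avoids that.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical stand-in for a scope cache; NOT the real ASScopeCache.
class SoftCacheSketch {
    private final ConcurrentMap<String, SoftReference<String>> cache =
            new ConcurrentHashMap<>();
    // Assumption: resolution is a pure function of the name, so it can be redone.
    private final Function<String, String> resolver;

    SoftCacheSketch(Function<String, String> resolver) {
        this.resolver = resolver;
    }

    // Fragile pattern: the referent is checked, then dereferenced a second time.
    // If the GC clears it between the two get() calls, this returns null.
    public String lookupFragile(String name) {
        SoftReference<String> ref = cache.get(name);
        if (ref != null && ref.get() != null) {
            return ref.get();
        }
        return resolveAndStore(name);
    }

    // Safer pattern: dereference once into a strong local, re-resolve on a miss.
    // Two threads may both miss and duplicate the work, but neither sees null.
    public String lookup(String name) {
        SoftReference<String> ref = cache.get(name);
        String cached = (ref != null) ? ref.get() : null;
        if (cached != null) {
            return cached;
        }
        return resolveAndStore(name);
    }

    private String resolveAndStore(String name) {
        String resolved = resolver.apply(name);
        cache.put(name, new SoftReference<>(resolved));
        return resolved;
    }

    // Test helper: clears the referent by hand to mimic what the GC may do.
    void simulateGcClear(String name) {
        SoftReference<String> ref = cache.get(name);
        if (ref != null) {
            ref.clear();
        }
    }

    public static void main(String[] args) {
        SoftCacheSketch sketch = new SoftCacheSketch(n -> "type:" + n);
        System.out.println(sketch.lookup("foo"));
        sketch.simulateGcClear("foo");
        System.out.println(sketch.lookup("foo"));
    }
}
```

If the real lookup path resembles lookupFragile anywhere, the failure would
depend on both GC timing and thread scheduling, which might explain why
lowering the heap alone did not make it reproduce.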

-Greg


On Tue, May 5, 2026 at 9:06 AM Greg Dove <[email protected]> wrote:

> Wake up brain (self talk):
> "and then not wrong for subsequent output" <- should be of course "and
> then wrong for subsequent output".
>
> On Tue, May 5, 2026 at 9:05 AM Greg Dove <[email protected]> wrote:
>
>> Thanks for looking into this, Josh.
>>
>>  "If it isn't too difficult to reproduce"
>> Quick comments, just in case it helps:
>>
>> It was not something I could repro for debugging purposes in the
>> compiler. It was still 'rare' in practice (max 2-3 times per day that I
>> observed, sometimes only once a day) and did not manifest in the same code
>> each time, although perhaps that is simply because code can change a lot
>> between compiler runs. My "awareness" of it was based on the app not
>> starting up correctly or on noticeable runtime errors. I did not check
>> this, but perhaps it is happening more often than I think, with no side
>> effects. This could happen if it sometimes outputs a typed method as
>> instance.method() where type resolution worked and elsewhere alongside as
>> instance['method']() where it did not. The problem might simply not get
>> noticed in this case, but this is pure speculation; I have not checked
>> for it.
>>
>> I did not try reducing heap allocation or anything to try to create
>> conditions for it to perhaps happen more often if it is memory/GC related.
>>
>> I see notes like this in the code:
>> // If we get this far, then we did not find a cached entry
>> // It is possible for 2+ threads to get in here for the same name.
>> // This is intentional - the worst that happens is that we duplicate the resolution work
>> // the benefit is that we avoid any sort of locking, which was proving expensive (time wise,
>> // and memory wise).
>>
>> When you see the code that was problematic output, you can see the same
>> name lookup inside a js method that is obviously correctly resolved
>> (anecdotally it seems to be more often 'correct' the first time) and then
>> not wrong for subsequent output, in nearby code, so I assume it might be
>> related to some unsynchronized state or failure to do that 'duplicate'
>> resolution work, where the various parts were being processed in parallel...
>>
>> Anyway, good luck, please let me know if you have anything you think I
>> could do to help.
>>
>>
>>
>> On Tue, May 5, 2026 at 6:29 AM Harbs <[email protected]> wrote:
>>
>>> Sure. I’ll be in touch off list.
>>>
>>> > On May 4, 2026, at 9:18 PM, Josh Tynjala <[email protected]> wrote:
>>> >
>>> > Would you be willing to give me access to the project? If it isn't too
>>> > difficult to reproduce, I may be able to figure out what's going on and
>>> > how to restore the missing typing data, similar to my other fix. My
>>> > feeling is that the original Adobe devs intended for occasional garbage
>>> > collection to occur to stay within memory limits, but that the data
>>> > would be restorable, if needed later. I think that they simply missed
>>> > some places where it might need to be restored because it happens pretty
>>> > rarely. Or maybe our newer JS emitter isn't properly accounting for that
>>> > possibility.
>>> >
>>> > --
>>> > Josh Tynjala
>>> > Bowler Hat LLC
>>> > https://bowlerhat.dev/
>>> >
>>> >
>>> > On Mon, May 4, 2026 at 10:37 AM Harbs <[email protected]> wrote:
>>> >
>>> >>> You've tested that this issue still
>>> >>> reproduces using a compiler built from the latest source code?
>>> >>
>>> >> This was reproduced by a number of devs all working on the same
>>> >> project. And yes, it was with recent builds.
>>> >>
>>> >> I don’t think I personally have seen it (I have a lot of memory on my
>>> >> machine), but it seems to have gotten worse recently. I don’t know if
>>> >> something changed in the compiler or it’s due to the increased project
>>> >> size.
>>> >>
>>> >> This was with variables — not functions.
>>> >>
>>> >> Harbs
>>> >>
>>> >>> On May 4, 2026, at 6:54 PM, Josh Tynjala <[email protected]> wrote:
>>> >>>
>>> >>> This issue may be the same one:
>>> >>>
>>> >>> https://github.com/apache/royale-compiler/issues/182
>>> >>>
>>> >>> I also encountered and fixed an issue related to weak references a
>>> >>> little over a year ago. Function bodies were getting garbage collected,
>>> >>> and I needed to clear out some stale definitions that were causing
>>> >>> missing classes in generated ASDoc output and some similar issues with
>>> >>> the -watch compiler option.
>>> >>>
>>> >>>
>>> >>> https://github.com/apache/royale-compiler/commit/35eed62f13519c659e6346d26cca3f44afe3170f
>>> >>>
>>> >>> This fix does not appear to have made it into a release yet. You're
>>> >>> not using an older compiler build, right? You've tested that this
>>> >>> issue still reproduces using a compiler built from the latest source
>>> >>> code?
>>> >>>
>>> >>> --
>>> >>> Josh Tynjala
>>> >>> Bowler Hat LLC
>>> >>> https://bowlerhat.dev/
>>> >>>
>>> >>>
>>> >>> On Sun, May 3, 2026 at 9:40 PM Greg Dove <[email protected]> wrote:
>>> >>>
>>> >>>> Compiler issues - (Josh, please?)
>>> >>>>
>>> >>>> We have a medium-sized project that has begun encountering
>>> >>>> occasional/rare (but at least daily during normal workloads)
>>> >>>> compilation issues that appear to be related to name/type resolution.
>>> >>>> There can be code within a method where the name resolves correctly
>>> >>>> to its type in one part of the method's js output, and elsewhere
>>> >>>> within the same js method it is output as if it was Object/untyped.
>>> >>>> This is most obvious with XML or XMLList instances (because of
>>> >>>> .child('prop') vs ['prop'] differences). I've also seen it get
>>> >>>> confused between local variables and instance properties in some
>>> >>>> cases, which I believe is a manifestation of the same thing. In other
>>> >>>> words, different compilation runs with the exact same settings are
>>> >>>> not completely deterministic, because sometimes they can produce
>>> >>>> different output. It is very difficult to repro, because it feels so
>>> >>>> random. But it has been something that appears to be more frequent as
>>> >>>> the codebase grows (when all other settings remain the same). This
>>> >>>> led me to consider that it could be GC-related, and I recently
>>> >>>> removed the SoftReferences inside ASScopeCache, as a prime suspect.
>>> >>>>
>>> >>>> After doing this, I have not seen the problem since (so far - after
>>> >>>> 1.5 days).
>>> >>>>
>>> >>>> The ASScopeCache instances themselves are weakly held (inside
>>> >>>> CompilerProject). So the internal maps inside each of these instances
>>> >>>> being weakly held as well seems to be the problem: the internal maps
>>> >>>> can perhaps get into a partially cleared state between threads.
>>> >>>>
>>> >>>> I did some memory profiling with and without this change for removing
>>> >>>> the SoftReferences inside ASScopeCache, but it was quite limited (just
>>> >>>> testing with compiling the one project). The memory usage was not much
>>> >>>> different on a typical run (approx 1 MB difference for a compilation
>>> >>>> with around 1000 .as and .mxml files combined, alongside a bunch of
>>> >>>> local swcs). There was possibly a small speed-up without the
>>> >>>> SoftReferences, but I did not test enough to be sure.
>>> >>>> But so far it seems there is not a big impact on memory with omitting
>>> >>>> these. If it introduces consistency I'm kinda keen to get it in there -
>>> >>>> I know others have definitely seen this problem too.
>>> >>>> And for Josh in particular: I think your compiler experience dwarfs
>>> >>>> that of the rest of us, and I wanted to get your feedback instead of
>>> >>>> just jumping in with this one. One option could also be to make this
>>> >>>> change a compiler option, with the new non-weak references being the
>>> >>>> default, but with the ability to switch to the older behaviour via the
>>> >>>> option if that was considered important as well... look forward to
>>> >>>> hearing your thoughts.
>>> >>>>
>>> >>
>>> >>
>>>
>>>
