Hey Josh, I am actively looking into this again. I am now less convinced that it is GC-related (I reduced memory allocation to low levels) and think it may have more to do with threads/race conditions. But it's very difficult to be sure: I spent today adding logging and trying to repro, but the bug did not occur all day. I will keep at it tomorrow, trying to find the right conditions to force it to occur. If I can figure out what those are, I will share them with you.
-Greg

On Tue, May 5, 2026 at 9:06 AM Greg Dove <[email protected]> wrote:

> Wake up brain (self talk):
> "and then not wrong for subsequent output" <- should be of course "and
> then wrong for subsequent output".
>
> On Tue, May 5, 2026 at 9:05 AM Greg Dove <[email protected]> wrote:
>
>> Thanks for looking into this, Josh.
>>
>> "If it isn't too difficult to reproduce"
>> Quick comments, just in case it helps:
>>
>> It was not something I could repro for debugging purposes in the
>> compiler. It was still 'rare' in practice - max 2-3 times per day that I
>> observed, sometimes only once a day - and not manifesting in the same code
>> - although perhaps that is simply because code can change a lot between
>> compiler runs - and "awareness" was based on the app not starting up
>> correctly or noticeable runtime errors. I did not check this: perhaps it is
>> happening more often than I think but with no side effects. This could
>> happen if it sometimes outputs a typed method as instance.method() where
>> type resolution worked and elsewhere alongside as instance['method']()
>> where it did not. The problem might not simply get noticed in this case,
>> but this is pure speculation, I have not checked for this.
>>
>> I did not try reducing heap allocation or anything to try to create
>> conditions for it to perhaps happen more often if it is memory/GC related.
>>
>> I see notes like this in the code:
>> // If we get this far, then we did not find a cached entry
>> // It is possible for 2+ threads to get in here for the same name.
>> // This is intentional - the worst that happens is that we duplicate the resolution work
>> // the benefit is that we avoid any sort of locking, which was proving expensive (time wise,
>> // and memory wise).
>>
>> When you see the code that was problematic output, you can see the same
>> name lookup inside a js method that is obviously correctly resolved
>> (anecdotally it seems to be more often 'correct' the first time) and then
>> not wrong for subsequent output, in nearby code, so I assume it might be
>> related to some unsynchronized state or failure to do that 'duplicate'
>> resolution work, where the various parts were being processed in parallel...
>>
>> Anyway, good luck, please let me know if you have anything you think I
>> could do to help.
>>
>>
>>
>> On Tue, May 5, 2026 at 6:29 AM Harbs <[email protected]> wrote:
>>
>>> Sure. I’ll be in touch off list.
>>>
>>> > On May 4, 2026, at 9:18 PM, Josh Tynjala <[email protected]> wrote:
>>> >
>>> > Would you be willing to give me access to the project? If it isn't too
>>> > difficult to reproduce, I may be able to figure out what's going on and how
>>> > to restore the missing typing data, similar to my other fix. My feeling is
>>> > that the original Adobe devs intended for occasional garbage collection to
>>> > occur to stay within memory limits, but that the data would be restorable,
>>> > if needed later. I think that they simply missed some places where it might
>>> > need to be restored because it happens pretty rarely. Or maybe our newer JS
>>> > emitter isn't properly accounting for that possibility.
>>> >
>>> > --
>>> > Josh Tynjala
>>> > Bowler Hat LLC
>>> > https://bowlerhat.dev/
>>> >
>>> >
>>> > On Mon, May 4, 2026 at 10:37 AM Harbs <[email protected]> wrote:
>>> >
>>> >>> You've tested that this issue still
>>> >>> reproduces using a compiler built from the latest source code?
>>> >>
>>> >> This was reproduced by a number of devs all working on the same project.
>>> >> And yes, it was with recent builds.
>>> >>
>>> >> I don’t think I personally have seen it (I have a lot of memory on my
>>> >> machine), but it seems to have gotten worse recently.
>>> >> I don’t know if something changed in the compiler or it’s due to the
>>> >> increased project size.
>>> >>
>>> >> This was with variables - not functions.
>>> >>
>>> >> Harbs
>>> >>
>>> >>> On May 4, 2026, at 6:54 PM, Josh Tynjala <[email protected]> wrote:
>>> >>>
>>> >>> This issue may be the same one:
>>> >>>
>>> >>> https://github.com/apache/royale-compiler/issues/182
>>> >>>
>>> >>> I also encountered and fixed an issue related to weak references a little over
>>> >>> a year ago. Function bodies were getting garbage collected, and I needed to
>>> >>> clear out some stale definitions that were causing missing classes in
>>> >>> generated ASDoc output and some similar issues with the -watch compiler
>>> >>> option.
>>> >>>
>>> >>> https://github.com/apache/royale-compiler/commit/35eed62f13519c659e6346d26cca3f44afe3170f
>>> >>>
>>> >>> This fix does not appear to have made it into a release yet. You're not
>>> >>> using an older compiler build, right? You've tested that this issue still
>>> >>> reproduces using a compiler built from the latest source code?
>>> >>>
>>> >>> --
>>> >>> Josh Tynjala
>>> >>> Bowler Hat LLC
>>> >>> https://bowlerhat.dev/
>>> >>>
>>> >>>
>>> >>> On Sun, May 3, 2026 at 9:40 PM Greg Dove <[email protected]> wrote:
>>> >>>
>>> >>>> Compiler issues - (Josh, please?)
>>> >>>>
>>> >>>> We have a medium-sized project that has begun encountering occasional/rare
>>> >>>> (but at least daily during normal workloads) compilation issues that appear
>>> >>>> to be related to name/type resolution. There can be code within a method
>>> >>>> output where the name resolves correctly to its type in one part of the
>>> >>>> method's js output and elsewhere within the same js method output as if it
>>> >>>> was Object/untyped. This is most obvious with XML or XMLList instances
>>> >>>> (because of .child('prop') vs ['prop'] differences).
>>> >>>> I've also seen it get
>>> >>>> confused between local variables and instance properties in some cases,
>>> >>>> which I believe is a manifestation of the same thing. In other words,
>>> >>>> different compilation runs with the exact same settings are not
>>> >>>> completely deterministic, because sometimes they can provide different
>>> >>>> output. It is very difficult to repro, because it feels so random. But it
>>> >>>> has been something that appears to be more frequent as the codebase grows
>>> >>>> (when all other settings remain the same). This led me to consider that it
>>> >>>> could be GC-related, and I recently removed the SoftReferences inside
>>> >>>> ASScopeCache, as a prime suspect.
>>> >>>>
>>> >>>> After doing this, I have not seen the problem since (so far - after 1.5
>>> >>>> days)
>>> >>>>
>>> >>>> The ASScopeCache instances themselves are weakly held (inside
>>> >>>> CompilerProject). So the internal maps inside each of these instances being
>>> >>>> weakly held as well seems to be the problem, the internal maps can perhaps
>>> >>>> get into a partially cleared state between threads.
>>> >>>>
>>> >>>> I did some memory profiling with and without this change for removing the
>>> >>>> SoftReferences inside ASScopeCache - but it was quite limited (just testing
>>> >>>> with compiling the one project). The memory usage was not much different on
>>> >>>> a typical run (approx 1Mb difference for a compilation with around 1000 .as
>>> >>>> and .mxml files combined, alongside a bunch of local swcs). There was
>>> >>>> possibly a small speed up without the SoftReferences, but I did not test
>>> >>>> enough to be sure.
>>> >>>> But so far it seems there is not a big impact on memory with omitting
>>> >>>> these.
>>> >>>> If it introduces consistency I'm kinda keen to get it in there - I
>>> >>>> know others have definitely seen this problem too.
>>> >>>> And for Josh in particular: I think your compiler experience dwarfs the
>>> >>>> rest of us and wanted to get your feedback instead of just jumping in with
>>> >>>> this one. One option could also be to make this change as a compiler
>>> >>>> option, with the new non-weak references being the default, but with the
>>> >>>> ability to switch to the older behaviour via the option if that was
>>> >>>> considered important as well... look forward to hearing your thoughts.
>>> >>>>
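
For anyone following the thread, here is a minimal Java sketch of the kind of cache being discussed: a lookup map held behind a SoftReference, with lock-free lookups that simply redo the resolution work on a miss. It is only an illustration under assumptions. The class SoftCacheSketch, its methods, and the String-to-String "resolution" are hypothetical and are not the actual ASScopeCache code. What it tries to show is that the soft reference should be dereferenced once per lookup into a strong local variable, so the map cannot be cleared between the "is it there?" check and its use, and that a cleared map leads to repeated work rather than a missing type.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch only; not the real ASScopeCache implementation. */
class SoftCacheSketch {
    // The whole map is softly reachable, so the GC may clear it under memory pressure.
    private volatile SoftReference<Map<String, String>> ref =
            new SoftReference<>(new ConcurrentHashMap<>());

    /** Simulates the GC clearing the soft reference (for demonstration only). */
    public void simulateGcClear() {
        ref.clear();
    }

    /** Returns a strongly held map, creating a fresh one if the old map was cleared. */
    private Map<String, String> liveMap() {
        Map<String, String> map = ref.get(); // dereference exactly once
        if (map == null) {
            // Two threads may both get here; one map wins and the other is dropped.
            // The cost is duplicated resolution work, not an incorrect result.
            map = new ConcurrentHashMap<>();
            ref = new SoftReference<>(map);
        }
        return map;
    }

    /** Looks up a name, redoing the resolution whenever no cached entry exists. */
    public String resolve(String name, Function<String, String> resolver) {
        Map<String, String> map = liveMap(); // strong local reference for the whole call
        return map.computeIfAbsent(name, resolver);
    }

    public static void main(String[] args) {
        SoftCacheSketch cache = new SoftCacheSketch();
        Function<String, String> resolver = n -> "XML"; // stand-in for real type resolution
        System.out.println(cache.resolve("myVar", resolver));
        cache.simulateGcClear(); // the cached entry is gone
        System.out.println(cache.resolve("myVar", resolver)); // resolved again, same answer
    }
}
```

In this sketch a cleared reference only costs a repeated resolution. A path that reads the reference twice, or that treats a cleared map as "name not found", could fall back to untyped output, which would be consistent with the symptoms described above, although that remains a guess rather than a confirmed cause.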
