Hi all,

Thanks for the discussion and pointers. I checked GitHub and didn't see an assigned issue/PR for this yet.

I agree with the consensus here (Kenn/Reuven/Robert/Byron) that this looks like a bug: we're memoizing DoFnInvoker bytecode generation, but the cache key is currently only the DoFn class. This appears to be missing pertinent inputs and can lead to reusing an invoker with the wrong cast target.

I'd like to volunteer to fix this. Plan:

1. Add a regression test that reproduces the collision/ClassCastException (e.g., reusing the same DoFn class in different contexts with different cast targets).
2. Update ByteBuddyDoFnInvokerFactory to key the cache on the DoFn class plus the cast target (as Byron suggested).
3. If the cast target isn't directly available at the caching boundary, I can explore using the stage name as a proxy as Robert suggested.
4. Submit a PR for review.

I'll open a GitHub issue to track this and link it back to this thread.

Best,
Elia LIU
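
P.S. To make step 2 concrete, here is a rough sketch of the composite-key idea. It is not the actual ByteBuddyDoFnInvokerFactory code: InvokerCacheKey and InvokerCache are hypothetical names, and it assumes the cast target can be represented as a Class<?>, which may not match what is available at the real caching boundary.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch only; none of these types exist in Beam.
final class InvokerCacheKey {
  private final Class<?> fnClass;    // the DoFn class (the only key component today)
  private final Class<?> castTarget; // assumption: cast target expressed as a Class<?>

  InvokerCacheKey(Class<?> fnClass, Class<?> castTarget) {
    this.fnClass = Objects.requireNonNull(fnClass);
    this.castTarget = Objects.requireNonNull(castTarget);
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof InvokerCacheKey)) {
      return false;
    }
    InvokerCacheKey other = (InvokerCacheKey) o;
    // Two keys match only if BOTH the DoFn class and the cast target match.
    return fnClass.equals(other.fnClass) && castTarget.equals(other.castTarget);
  }

  @Override
  public int hashCode() {
    return Objects.hash(fnClass, castTarget);
  }
}

// Memoizes generated values (stand-ins for generated invoker classes) per composite key.
final class InvokerCache<V> {
  private final Map<InvokerCacheKey, V> cache = new ConcurrentHashMap<>();

  V get(Class<?> fnClass, Class<?> castTarget, Function<InvokerCacheKey, V> generator) {
    // The generator runs at most once per distinct (fnClass, castTarget) pair.
    return cache.computeIfAbsent(new InvokerCacheKey(fnClass, castTarget), generator);
  }

  int size() {
    return cache.size();
  }
}
```

With a key like this, the same DoFn class used with two different cast targets would get two separate cache entries instead of colliding on one. If the cast target turns out not to be available where the cache lives, the second key component could be swapped for the stage name, per step 3.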
