Hi all,

Thanks for the discussion and pointers. I checked GitHub and didn't
see an assigned issue/PR for this yet.

I agree with the consensus here (Kenn/Reuven/Robert/Byron) that this
looks like a bug: we’re memoizing DoFnInvoker bytecode generation, but
the cache key is currently only the DoFn class. The key leaves out
other inputs that affect the generated code, so an invoker generated
for one cast target can be reused where a different one is needed.

I’d like to volunteer to fix this.

Plan:

1. Add a regression test that reproduces the
collision/ClassCastException (e.g., reusing the same DoFn class in
different contexts with different cast targets).

2. Update ByteBuddyDoFnInvokerFactory to key the cache on the DoFn
class plus the cast target (as Byron suggested).

3. If the cast target isn’t directly available at the caching
boundary, I can explore using the stage name as a proxy, as Robert
suggested.

4. Submit a PR for review.
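
For concreteness, here is a rough sketch of the idea behind step 2:
key the memoized result on the pair (DoFn class, cast target) instead
of the DoFn class alone. This is not the actual Beam code. Every name
below (InvokerCacheSketch, CacheKey, getOrGenerate) is hypothetical,
the cast target is modeled as a plain Class<?> as an assumption, and
a String stands in for the generated invoker.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a composite cache key; not a Beam API. */
final class InvokerCacheSketch {

  /** Key made of the DoFn class plus the type the generated code casts to. */
  static final class CacheKey {
    private final Class<?> fnClass;
    private final Class<?> castTarget; // assumption: target is a Class<?>

    CacheKey(Class<?> fnClass, Class<?> castTarget) {
      this.fnClass = Objects.requireNonNull(fnClass);
      this.castTarget = Objects.requireNonNull(castTarget);
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof CacheKey)) {
        return false;
      }
      CacheKey other = (CacheKey) o;
      return fnClass.equals(other.fnClass) && castTarget.equals(other.castTarget);
    }

    @Override
    public int hashCode() {
      return Objects.hash(fnClass, castTarget);
    }
  }

  // A String stands in for the generated invoker in this sketch.
  private final Map<CacheKey, String> cache = new ConcurrentHashMap<>();
  private final AtomicInteger generated = new AtomicInteger();

  /** Returns the cached value for this pair, generating it on first use. */
  String getOrGenerate(Class<?> fnClass, Class<?> castTarget) {
    return cache.computeIfAbsent(
        new CacheKey(fnClass, castTarget),
        key -> {
          generated.incrementAndGet(); // counts real generations, not cache hits
          return fnClass.getName() + " cast to " + castTarget.getName();
        });
  }

  /** Number of times the generator actually ran. */
  int generatedCount() {
    return generated.get();
  }
}
```

With a key like this, the same DoFn class used with two different
cast targets produces two separate entries, while repeated lookups
for the same pair still hit the cache. If the cast target turns out
not to be available at the caching boundary, the second field of the
key is where a proxy such as the stage name would go.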

I’ll open a GitHub issue to track this and link it back to this thread.

Best,
Elia LIU
