Eliaaazzz commented on code in PR #37355:
URL: https://github.com/apache/beam/pull/37355#discussion_r2723431507
##########
sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/reflect/DoFnInvokersTest.java:
##########
@@ -1382,11 +1385,18 @@ public void process() {}
@Test
public void testStableName() {
DoFnInvoker<Void, Void> invoker = DoFnInvokers.invokerFor(new StableNameTestDoFn());
+ // The invoker class name includes a hash of the type descriptors to support
+ // different generic instantiations of the same DoFn class.
+ // Format: <DoFn class name>$<DoFnInvoker>$<type hash>
+ TypeDescriptor<Void> voidType = new StableNameTestDoFn().getInputTypeDescriptor();
+ String expectedTypeSuffix =
Review Comment:
@kennknowles Thanks for the review! I have updated the code to use
ToStringHelper and Objects.hash as suggested, and extracted the suffix logic to
clean up the tests.
Regarding the cache collision in the fallback case:
You are exactly right: if type lookup fails, different generic instantiations
like MyDoFn<String> and MyDoFn<Integer> will map to the same CacheKey.
However, type lookup failure is often an expected side effect of
deserialization in a distributed environment. When a DoFn is serialized and
sent to workers, specific generic type information can be lost due to Java's
type erasure or classloader limitations on the worker side. This is an
inherent constraint we must handle.
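
To make the erasure point concrete, here is a minimal sketch in plain Java reflection. It is not Beam code: Fn, GenericFn, StringFn, and inputTypeOf are hypothetical names, and the lookup is much simpler than what TypeDescriptor does. It only shows that an instance created as new GenericFn<String>() carries no recoverable type argument, so a lookup has nothing better than Object to fall back to.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

public class ErasureSketch {
  // Hypothetical stand-in for a DoFn-like base class (not the Beam class).
  abstract static class Fn<InputT> {}

  // Generic subclass: its type argument is erased on plain instances.
  static class GenericFn<T> extends Fn<T> {}

  // Concrete subclass: the type argument is recorded in the class file.
  static class StringFn extends Fn<String> {}

  /**
   * Hypothetical lookup: reads the first type argument of the direct
   * superclass and falls back to Object when it is not a concrete class.
   */
  static Class<?> inputTypeOf(Fn<?> fn) {
    Type superType = fn.getClass().getGenericSuperclass();
    if (superType instanceof ParameterizedType) {
      Type arg = ((ParameterizedType) superType).getActualTypeArguments()[0];
      if (arg instanceof Class) {
        return (Class<?>) arg;
      }
    }
    // The argument is an unresolved type variable: nothing to recover.
    return Object.class;
  }

  public static void main(String[] args) {
    System.out.println(inputTypeOf(new StringFn()));
    System.out.println(inputTypeOf(new GenericFn<String>()));
    System.out.println(inputTypeOf(new GenericFn<Integer>()));
  }
}
```

In this sketch both GenericFn instances resolve to Object, which is the situation the fallback path has to cope with.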
This collision is safe and intentional for two reasons:
1. Class Isolation: The cache key still includes fnClass, so different DoFn
   implementations will never share an invoker.
2. Erasure Compatibility: When we fall back to Object, we generate a "Raw
   Invoker". Due to Java type erasure, the underlying method in the bytecode
   acts as processElement(Object). Thus, a single shared "Raw Invoker" is
   perfectly compatible with any generic instantiation of that class.
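
The two reasons above can be sketched as follows. This is an illustration under stated assumptions, not the code in the PR: the CacheKey fields, the keyFor helper, and the FnA/FnB classes are hypothetical, and a null type is used here to model a failed type lookup.

```java
import java.util.Objects;

public class InvokerCacheSketch {
  // Hypothetical stand-ins for two unrelated DoFn classes.
  static class FnA<T> {}
  static class FnB<T> {}

  /** Hypothetical cache key: the DoFn class plus the resolved element types. */
  static final class CacheKey {
    final Class<?> fnClass;
    final Class<?> inputType;
    final Class<?> outputType;

    CacheKey(Class<?> fnClass, Class<?> inputType, Class<?> outputType) {
      this.fnClass = fnClass;
      this.inputType = inputType;
      this.outputType = outputType;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof CacheKey)) {
        return false;
      }
      CacheKey that = (CacheKey) o;
      return fnClass.equals(that.fnClass)
          && inputType.equals(that.inputType)
          && outputType.equals(that.outputType);
    }

    @Override
    public int hashCode() {
      return Objects.hash(fnClass, inputType, outputType);
    }
  }

  /** Builds a key; a null type models a failed lookup and falls back to Object. */
  static CacheKey keyFor(Class<?> fnClass, Class<?> inputType, Class<?> outputType) {
    return new CacheKey(
        fnClass,
        inputType == null ? Object.class : inputType,
        outputType == null ? Object.class : outputType);
  }

  public static void main(String[] args) {
    // Same class, lookup failed twice: the keys collide (shared raw invoker).
    System.out.println(keyFor(FnA.class, null, null).equals(keyFor(FnA.class, null, null)));
    // Different classes, lookup failed: fnClass keeps the keys apart.
    System.out.println(keyFor(FnA.class, null, null).equals(keyFor(FnB.class, null, null)));
    // Same class, lookup succeeded with different types: distinct keys.
    System.out.println(
        keyFor(FnA.class, String.class, String.class)
            .equals(keyFor(FnA.class, Integer.class, Integer.class)));
  }
}
```

Under these assumptions the only keys that collide are fallback keys for the same fnClass, which is the case where one raw invoker is shared.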
I have added a comment in the catch block explaining that this fallback is
deliberate: it covers cases where type information is lost during
deserialization, so the system remains functional even when reflection is
limited.