Eliaaazzz commented on code in PR #37355:
URL: https://github.com/apache/beam/pull/37355#discussion_r2723431507


##########
sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/reflect/DoFnInvokersTest.java:
##########
@@ -1382,11 +1385,18 @@ public void process() {}
   @Test
   public void testStableName() {
     DoFnInvoker<Void, Void> invoker = DoFnInvokers.invokerFor(new StableNameTestDoFn());
+    // The invoker class name includes a hash of the type descriptors to support
+    // different generic instantiations of the same DoFn class.
+    // Format: <DoFn class name>$<DoFnInvoker>$<type hash>
+    TypeDescriptor<Void> voidType = new StableNameTestDoFn().getInputTypeDescriptor();
+    String expectedTypeSuffix =

Review Comment:
   @kennknowles Thanks for the review! I have updated the code to use
   ToStringHelper and Objects.hash as suggested, and extracted the suffix
   logic to clean up the tests.
   
   Regarding the cache collision in the fallback case:
   
   You are exactly right: if type lookup fails, different generic
   instantiations such as MyDoFn<String> and MyDoFn<Integer> will map to the
   same CacheKey.
   
   However, type lookup failure is often an unavoidable side effect of
   deserialization in a distributed environment. When a DoFn is serialized and
   sent to workers, its specific generic type information can be lost to Java's
   type erasure or to classloader limitations on the worker side. This is an
   inherent constraint that we have to handle.
   
   This collision is safe and intentional for two reasons:
   
   1. Class isolation: the cache key still includes fnClass, so different DoFn
      implementations never share an invoker.
   
   2. Erasure compatibility: when we fall back to Object, we generate a "raw
      invoker". Because of Java type erasure, the underlying method in the
      bytecode is effectively processElement(Object), so a single shared raw
      invoker is compatible with any generic instantiation of that class.
   
   I have added a comment in the catch block noting that this fallback is
   deliberate: it covers cases where type information is lost during
   deserialization, so the system keeps working even when reflection is
   limited.
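
To make the two reasons above concrete, here is a minimal, self-contained sketch. It is not the code in this PR: RawInvokerSketch, MyFn, OtherFn, CacheKey, keyFor, erasedProcessElement and invokeShared are all hypothetical names, plain Class objects stand in for Beam's TypeDescriptor, and a null argument models a failed type lookup. It only illustrates that fallback keys collide per class, that different classes stay isolated, and that one erased method handle serves every instantiation.

```java
import java.lang.reflect.Method;
import java.util.Objects;

class RawInvokerSketch {

  /** Hypothetical generic fn standing in for a DoFn; not a Beam class. */
  public static class MyFn<T> {
    public int calls = 0;

    public void processElement(T element) {
      calls++;
    }
  }

  /** A second hypothetical fn class, used to show class isolation. */
  public static class OtherFn<T> {
    public void processElement(T element) {}
  }

  /** Hypothetical cache key: the fn class plus the resolved element type. */
  public static final class CacheKey {
    private final Class<?> fnClass;
    private final Class<?> elementType;

    CacheKey(Class<?> fnClass, Class<?> elementType) {
      this.fnClass = fnClass;
      this.elementType = elementType;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof CacheKey)) {
        return false;
      }
      CacheKey other = (CacheKey) o;
      return fnClass.equals(other.fnClass) && elementType.equals(other.elementType);
    }

    @Override
    public int hashCode() {
      return Objects.hash(fnClass, elementType);
    }
  }

  /** Models the fallback: a null type means lookup failed, so Object is used. */
  public static CacheKey keyFor(Object fn, Class<?> resolvedTypeOrNull) {
    Class<?> type = resolvedTypeOrNull == null ? Object.class : resolvedTypeOrNull;
    return new CacheKey(fn.getClass(), type);
  }

  /** After erasure, the method is found under the signature processElement(Object). */
  public static Method erasedProcessElement() {
    try {
      return MyFn.class.getMethod("processElement", Object.class);
    } catch (NoSuchMethodException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Calls the one shared Method handle on any instantiation of MyFn. */
  public static void invokeShared(MyFn<?> fn, Object element) {
    try {
      erasedProcessElement().invoke(fn, element);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    MyFn<String> strings = new MyFn<>();
    MyFn<Integer> ints = new MyFn<>();

    // Fallback case: do both instantiations collapse to the same key?
    System.out.println(keyFor(strings, null).equals(keyFor(ints, null)));
    // Class isolation: does a different fn class share that key?
    System.out.println(keyFor(strings, null).equals(keyFor(new OtherFn<String>(), null)));
    // Erasure: the same reflective handle is used for both instantiations.
    invokeShared(strings, "a");
    invokeShared(ints, 1);
    System.out.println(strings.calls + ints.calls);
  }
}
```

When the types do resolve (for example keyFor(strings, String.class) and keyFor(ints, Integer.class)), the keys differ, which corresponds to the non-fallback path where each generic instantiation gets its own invoker.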


