ddebowczyk92 commented on code in PR #28590:
URL: https://github.com/apache/beam/pull/28590#discussion_r1337361747


##########
sdks/java/io/hadoop-common/src/main/java/org/apache/beam/sdk/io/hadoop/SerializableConfiguration.java:
##########
@@ -54,8 +58,15 @@ public Configuration get() {
 
   @Override
   public void writeExternal(ObjectOutput out) throws IOException {
+    if (serializationCache == null) {

Review Comment:
   I think it's hard to identify every place where `Configuration` could be 
modified after the serialization cache is initialized. A safe assumption is 
that any call to `SerializableConfiguration#get()` may result in the returned 
`Configuration` being modified, which would leave `serializationCache` 
holding stale data. To address this concern, I updated the PR so that the 
cache is invalidated at the beginning of the 
`SerializableConfiguration#get()` method. 
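
   For illustration, here is a minimal sketch of the invalidate-on-`get()` idea. It is not the code from the PR: `CachedConfigHolder`, `encodeCount`, and `roundTrip` are hypothetical names, and `java.util.Properties` stands in for Hadoop's `Configuration` so the example runs without Hadoop on the classpath. Thread safety is deliberately left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

/**
 * Hypothetical stand-in for SerializableConfiguration.
 * Assumption: java.util.Properties replaces Hadoop's Configuration.
 */
class CachedConfigHolder implements Externalizable {
  private transient Properties conf;
  private transient byte[] serializationCache;
  // Counts how often the (potentially expensive) encoding step actually ran.
  int encodeCount = 0;

  // Externalizable requires a public no-arg constructor.
  public CachedConfigHolder() {
    this(new Properties());
  }

  CachedConfigHolder(Properties conf) {
    this.conf = conf;
  }

  public Properties get() {
    // The caller may mutate the returned object, so drop any cached bytes.
    serializationCache = null;
    return conf;
  }

  @Override
  public void writeExternal(ObjectOutput out) throws IOException {
    if (serializationCache == null) {
      // Encode once and reuse the bytes until get() invalidates them.
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      conf.store(buffer, null);
      serializationCache = buffer.toByteArray();
      encodeCount++;
    }
    out.writeInt(serializationCache.length);
    out.write(serializationCache);
  }

  @Override
  public void readExternal(ObjectInput in) throws IOException {
    byte[] bytes = new byte[in.readInt()];
    in.readFully(bytes);
    conf = new Properties();
    conf.load(new ByteArrayInputStream(bytes));
  }

  /** Helper for the example only: serializes and deserializes a holder. */
  static CachedConfigHolder roundTrip(CachedConfigHolder holder) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(holder);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (CachedConfigHolder) in.readObject();
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

   With this shape, serializing the same holder repeatedly reuses the cached bytes, while any call to `get()` forces the next `writeExternal` to re-encode. The trade-off is that read-only callers of `get()` also invalidate the cache, which costs an extra encoding pass but never serializes stale data.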
   
   I tested this change on the same job I described in 
https://github.com/apache/beam/issues/28589 and the job still works.
   
   
   
![serializable_configuration_sample](https://github.com/apache/beam/assets/8567235/99cb04d4-2de9-4746-a07d-69fbbe2157bc)
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

