kennknowles opened a new issue, #18430:
URL: https://github.com/apache/beam/issues/18430

   https://github.com/apache/beam/pull/3654 introduces 
SerializablePipelineOptions, which on deserialization calls 
FileSystems.setDefaultPipelineOptions.
   
   This is problematic and racy when the same process uses 
SerializablePipelineOptions instances carrying different filesystem-related 
options: whichever instance is deserialized last overwrites the process-wide 
defaults.
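
   To make the hazard concrete, here is a minimal, self-contained sketch. It 
does not use Beam; `GlobalFileSystemDefaults` and `SideEffectingOptions` are 
hypothetical stand-ins for FileSystems and SerializablePipelineOptions, and the 
bucket names are invented. It only illustrates the pattern of a deserialization 
hook that writes to process-wide state.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a process-wide registry (not the Beam FileSystems class).
final class GlobalFileSystemDefaults {
  private static volatile String defaultRoot = "unset";

  private GlobalFileSystemDefaults() {}

  static void set(String root) {
    defaultRoot = root;
  }

  static String get() {
    return defaultRoot;
  }
}

// Hypothetical options holder that mutates global state as a side effect of deserialization.
final class SideEffectingOptions implements Serializable {
  private static final long serialVersionUID = 1L;
  final String filesystemRoot;

  SideEffectingOptions(String filesystemRoot) {
    this.filesystemRoot = filesystemRoot;
  }

  // Java serialization hook: runs every time an instance is deserialized.
  private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
    in.defaultReadObject();
    GlobalFileSystemDefaults.set(filesystemRoot);
  }
}

final class DeserializationSideEffectDemo {
  // Serializes and immediately deserializes a value, triggering readObject.
  static <T extends Serializable> T roundTrip(T value)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      @SuppressWarnings("unchecked")
      T copy = (T) in.readObject();
      return copy;
    }
  }

  // Deserializes one options object per root, in order, and reports the global default.
  static String defaultAfterDeserializing(String... roots)
      throws IOException, ClassNotFoundException {
    for (String root : roots) {
      roundTrip(new SideEffectingOptions(root));
    }
    return GlobalFileSystemDefaults.get();
  }

  public static void main(String[] args) throws Exception {
    System.out.println(defaultAfterDeserializing("gs://bucket-a", "s3://bucket-b"));
  }
}
```

   In this single-threaded sketch the last deserialized instance wins. With 
several threads deserializing different options concurrently, which value ends 
up as the default would depend on scheduling.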
   
   The PR does this because the Flink and Apex runners were already doing it 
in their respective SerializablePipelineOptions-like classes (which the PR 
removes); Spark was not, but probably should have been.
   
   I believe this is done for the sake of having the proper filesystem options 
automatically available on workers in all places where any kind of 
PipelineOptions is used. Instead, all three runners should pick a better place 
to initialize their workers, and explicitly call 
FileSystems.setDefaultPipelineOptions there.
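
   As a rough illustration of the explicit alternative, the sketch below moves 
the global write into a single worker start-up hook. All names here 
(`WorkerFileSystemDefaults`, `PlainOptions`, `WorkerBootstrap`) are 
hypothetical and are not part of Beam or any runner; in a real runner the hook 
would call FileSystems.setDefaultPipelineOptions instead. Rejecting a 
conflicting second initialization is an extra assumption of this sketch, not 
something the issue proposes.

```java
import java.io.Serializable;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical process-wide defaults, written only from the worker start-up hook.
final class WorkerFileSystemDefaults {
  private static final AtomicReference<String> ROOT = new AtomicReference<>();

  private WorkerFileSystemDefaults() {}

  // Idempotent for the same value; fails loudly on a conflicting value
  // instead of silently overwriting (an assumption of this sketch).
  static void initialize(String root) {
    if (!ROOT.compareAndSet(null, root) && !root.equals(ROOT.get())) {
      throw new IllegalStateException(
          "Filesystem defaults already initialized with " + ROOT.get());
    }
  }

  static String get() {
    return ROOT.get();
  }
}

// Hypothetical options holder with no deserialization side effects.
final class PlainOptions implements Serializable {
  private static final long serialVersionUID = 1L;
  final String filesystemRoot;

  PlainOptions(String filesystemRoot) {
    this.filesystemRoot = filesystemRoot;
  }
}

// Hypothetical worker entry point: the one place that touches global state.
final class WorkerBootstrap {
  private WorkerBootstrap() {}

  static void start(PlainOptions options) {
    WorkerFileSystemDefaults.initialize(options.filesystemRoot);
  }
}
```

   With this shape, deserializing an options object anywhere else in the 
process has no effect on the defaults; only the start-up hook does.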
   
   It would be even better if FileSystems.setDefaultPipelineOptions didn't 
exist at all, but that's a topic for a separate JIRA.
   
   CC'ing runner contributors [~aljoscha] [~aviemzur] [~thw]
   
   Imported from Jira 
[BEAM-2712](https://issues.apache.org/jira/browse/BEAM-2712). Original Jira may 
contain additional context.
   Reported by: jkff.


