[ https://issues.apache.org/jira/browse/BEAM-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16167598#comment-16167598 ]
Aljoscha Krettek commented on BEAM-2712:
----------------------------------------

[~jkff] See my comment on BEAM-2948. I think for Flink we can resolve this by calling FileSystems.setDefaultPipelineOptions in the {{setup()}} method of the stream operator, because this is invoked before any user code/state is touched.

> SerializablePipelineOptions should not call FileSystems.setDefaultPipelineOptions.
> ----------------------------------------------------------------------------------
>
>                 Key: BEAM-2712
>                 URL: https://issues.apache.org/jira/browse/BEAM-2712
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-apex, runner-core, runner-flink, runner-spark
>            Reporter: Eugene Kirpichov
>
> https://github.com/apache/beam/pull/3654 introduces SerializablePipelineOptions, which on deserialization calls FileSystems.setDefaultPipelineOptions.
> This is obviously problematic and racy in case the same process uses SerializablePipelineOptions with different filesystem-related options in them.
> The reason the PR does this is that the Flink and Apex runners were already doing it in their respective SerializablePipelineOptions-like classes (being removed in the PR); Spark wasn't, but probably should have.
> I believe this is done for the sake of having the proper filesystem options automatically available on workers in all places where any kind of PipelineOptions are used. Instead, all 3 runners should pick a better place to initialize their workers, and explicitly call FileSystems.setDefaultPipelineOptions there.
> It would be even better if FileSystems.setDefaultPipelineOptions didn't exist at all, but that's a topic for a separate JIRA.
> CC'ing runner contributors [~aljoscha] [~aviemzur] [~thw]
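
To make the problem described in the issue concrete, here is a minimal sketch in plain Java. All class names in it (GlobalDefaults, SideEffectingOptions, PlainOptions, Worker) are hypothetical stand-ins invented for illustration; none of them are Beam, Flink, Apex or Spark APIs, and the real SerializablePipelineOptions may differ in detail. The sketch only assumes what the issue states: a process-wide default that gets overwritten whenever an options object is deserialized, versus a default that is set once in an explicit worker setup step.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo driver; not part of any runner.
final class SetupDemo {
    // Serializes and deserializes a value in memory, as shipping it to a worker would.
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Pattern from the issue: every deserialization silently resets the global default.
        roundTrip(new SideEffectingOptions("gs"));
        roundTrip(new SideEffectingOptions("hdfs"));
        System.out.println(GlobalDefaults.getDefault()); // prints "hdfs": last one wins

        // Suggested pattern: deserialization is inert, setup() sets the default explicitly.
        GlobalDefaults.setDefault("unset");
        PlainOptions options = roundTrip(new PlainOptions("gs"));
        roundTrip(new PlainOptions("hdfs"));
        System.out.println(GlobalDefaults.getDefault()); // prints "unset"
        new Worker(options).setup();
        System.out.println(GlobalDefaults.getDefault()); // prints "gs"
    }
}

// Hypothetical stand-in for a process-wide registry of default options.
final class GlobalDefaults {
    private static volatile String defaultScheme = "unset";
    static void setDefault(String scheme) { defaultScheme = scheme; }
    static String getDefault() { return defaultScheme; }
}

// Problematic: readObject mutates global state as a hidden side effect.
final class SideEffectingOptions implements Serializable {
    private static final long serialVersionUID = 1L;
    final String scheme;
    SideEffectingOptions(String scheme) { this.scheme = scheme; }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        GlobalDefaults.setDefault(scheme);
    }
}

// Plain holder: deserializing it touches nothing outside the object.
final class PlainOptions implements Serializable {
    private static final long serialVersionUID = 1L;
    final String scheme;
    PlainOptions(String scheme) { this.scheme = scheme; }
}

// Hypothetical worker whose setup() runs once, before any user code.
final class Worker {
    private final PlainOptions options;
    Worker(PlainOptions options) { this.options = options; }
    void setup() { GlobalDefaults.setDefault(options.scheme); }
}
```

In the first half, whichever options object happens to be deserialized last decides the process-wide default, which is the race the issue describes. In the second half, the default changes only at the one explicit call in setup(), which is roughly the role the comment proposes for the stream operator's {{setup()}} method in the Flink runner.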