[
https://issues.apache.org/jira/browse/FLINK-15533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012822#comment-17012822
]
Kostas Kloudas commented on FLINK-15533:
----------------------------------------
I think I get the problem. Before 1.10 you could only use either the {{CLI}}
(i.e. the {{ContextEnv}}) or the {{RemoteStreamEnv}}, and in both cases, at
some point during job submission, we were resetting the parallelism to a
non-negative value. Now we do not. I pushed a change with the fix to the
branch above.
Could you try it out and let me know if it works in all cases, i.e. even when
you do not set the parallelism?
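
To make the "resetting the parallelism" step concrete, here is a minimal
sketch of the idea, not the actual change in the branch. The class and method
names are hypothetical and the fallback value is an assumption; only the -1
sentinel matches Flink's ExecutionConfig.PARALLELISM_DEFAULT.

```java
// Illustrative sketch only: these names are hypothetical, not Flink's code.
final class ParallelismSketch {

    // Same sentinel value as Flink's ExecutionConfig.PARALLELISM_DEFAULT.
    static final int PARALLELISM_DEFAULT = -1;

    // Returns the parallelism a job would be submitted with: the user's
    // value if one was set, otherwise the fallback (assumed to come from
    // the client or cluster configuration).
    static int resolveParallelism(int configured, int fallback) {
        if (fallback < 1) {
            throw new IllegalArgumentException("fallback must be at least 1");
        }
        return configured == PARALLELISM_DEFAULT ? fallback : configured;
    }

    public static void main(String[] args) {
        // Parallelism left unset: the fallback is used.
        System.out.println(resolveParallelism(PARALLELISM_DEFAULT, 4));
        // Parallelism set explicitly: the user's value is kept.
        System.out.println(resolveParallelism(2, 4));
    }
}
```

Without a step like this, a job whose parallelism was never set would carry
the -1 sentinel through submission, which is the case I am asking you to test.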
> Writing DataStream as text file fails due to output path already exists
> -----------------------------------------------------------------------
>
> Key: FLINK-15533
> URL: https://issues.apache.org/jira/browse/FLINK-15533
> Project: Flink
> Issue Type: Bug
> Components: Client / Job Submission
> Affects Versions: 1.10.0
> Reporter: Rui Li
> Assignee: Kostas Kloudas
> Priority: Blocker
> Fix For: 1.10.0
>
>
> The following program reproduces the issue.
> {code}
> Configuration configuration = GlobalConfiguration.loadConfiguration();
> configuration.set(DeploymentOptions.TARGET, RemoteExecutor.NAME);
> StreamExecutionEnvironment streamEnv =
>     new StreamExecutionEnvironment(configuration);
> DataStream<Integer> dataStream = streamEnv.fromCollection(Arrays.asList(1, 2, 3));
> dataStream.writeAsText("hdfs://localhost:8020/tmp/output");
> streamEnv.execute();
> {code}
> The job will fail with the following error, even though the output path
> doesn't exist before job submission:
> {noformat}
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.fs.FileAlreadyExistsException):
> /tmp/output already exists as a directory
> {noformat}