[ 
https://issues.apache.org/jira/browse/FLINK-14590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hequn Cheng reassigned FLINK-14590:
-----------------------------------

    Assignee: Wei Zhong  (was: Hequn Cheng)

> Unify the working directory of Java process and Python process when 
> submitting python jobs via "flink run -py"
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-14590
>                 URL: https://issues.apache.org/jira/browse/FLINK-14590
>             Project: Flink
>          Issue Type: Bug
>          Components: API / Python
>            Reporter: Wei Zhong
>            Assignee: Wei Zhong
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Assume we are in a flink directory with the following structure:
> {code:java}
> flink/
>       bin/
>           flink
>           pyflink-shell.sh
>           python-gateway-server.sh
>           ...
>       bad_case/
>                word_count.py
>                data.txt
>       lib/...
>       opt/...{code}
> And word_count.py contains the following piece of code:
> {code:java}
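>     from pyflink.datastream import StreamExecutionEnvironment
>     from pyflink.table import StreamTableEnvironment, TableConfig
>
>     t_config = TableConfig()
>     env = StreamExecutionEnvironment.get_execution_environment()
>     t_env = StreamTableEnvironment.create(env, t_config)
>     # Relative path passed to the Java process: resolved against the Java working directory.
>     env._j_stream_execution_environment.registerCachedFile("data", "bad_case/data.txt")
>     # Relative path opened natively in Python: resolved against the Python working directory.
>     with open("bad_case/data.txt", "r") as f:
>         content = f.read()
>     elements = [(word, 1) for word in content.split(" ")]
>     t_env.from_elements(elements, ["word", "count"]){code}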
> Then we enter the "flink" directory and run:
> {code:java}
> bin/flink run -py bad_case/word_count.py
> {code}
> The program fails at the line "with open("bad_case/data.txt", "r") as f:".
> This is because the working directory of the Java process is the current 
> directory, but the working directory of the Python process is a temporary 
> directory.
> So a relative path works when it is passed to the Java process through an 
> API call. But a relative path used elsewhere, such as in native file access, 
> fails, because the working directory of the Python process has been changed 
> to a temporary directory that is unknown to users.
> I think this will confuse users, especially after we support dependency 
> management. It would be great to unify the working directories of the Java 
> process and the Python process.
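
The mismatch described above can be reproduced without Flink. The following is only a minimal sketch, not the fix proposed for this issue: the helper read_relative, the function demo, and the temporary directories are hypothetical names introduced for illustration. One directory stands in for the directory where "flink run" was launched, the other for the temporary working directory of the Python process.

```python
import os
import tempfile


def read_relative(path, base_dir):
    """Read `path` resolved against an explicit base directory, not the cwd."""
    with open(os.path.join(base_dir, path), "r") as f:
        return f.read()


def demo():
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as launch_dir, \
            tempfile.TemporaryDirectory() as other_dir:
        # Hypothetical layout mirroring the report: <launch_dir>/bad_case/data.txt
        os.makedirs(os.path.join(launch_dir, "bad_case"))
        with open(os.path.join(launch_dir, "bad_case", "data.txt"), "w") as f:
            f.write("hello flink hello")

        # Simulate the Python process running in a different working directory.
        os.chdir(other_dir)
        try:
            try:
                open("bad_case/data.txt", "r").close()
                cwd_relative_works = True
            except FileNotFoundError:
                # The relative path is resolved against other_dir, so it is not found.
                cwd_relative_works = False
            # Resolving against the launch directory succeeds regardless of the cwd.
            content = read_relative("bad_case/data.txt", launch_dir)
        finally:
            # Restore the cwd before the temporary directories are removed.
            os.chdir(original_cwd)
    return cwd_relative_works, content
```

Calling demo() shows that the plain relative open fails once the working directory differs, while the same path resolved against an explicitly supplied base directory is still readable. How a job would learn the launch directory is left open here; unifying the working directories, as the issue asks, would make such a helper unnecessary.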



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
