[ 
https://issues.apache.org/jira/browse/FLINK-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-3203:
----------------------------------
      Labels: DistributedCache OGS auto-deprioritized-major 
auto-deprioritized-minor  (was: DistributedCache OGS auto-deprioritized-major 
stale-minor)
    Priority: Not a Priority  (was: Minor)

This issue was labeled "stale-minor" 7 days ago and has not received any 
updates so it is being deprioritized. If this ticket is actually Minor, please 
raise the priority and ask a committer to assign you the issue or revive the 
public discussion.


> DistributedCache crashing when run in OGS
> -----------------------------------------
>
>                 Key: FLINK-3203
>                 URL: https://issues.apache.org/jira/browse/FLINK-3203
>             Project: Flink
>          Issue Type: Bug
>          Components: API / DataSet
>    Affects Versions: 0.10.0
>         Environment: Rocks 6.1 SP1, CentOS release 6.7 
> (2.6.32-573.7.1.el6.x86_64), java/oraclejdk/1.8.0_45, Python 2.6.6
>            Reporter: Omar Alvarez
>            Priority: Not a Priority
>              Labels: DistributedCache, OGS, auto-deprioritized-major, 
> auto-deprioritized-minor
>
> When trying to execute the Python example without HDFS, the FlatMap fails 
> with the following error:
> {code:title=PyExample|borderStyle=solid}
> 01/05/2016 13:09:38     Job execution switched to status RUNNING.
> 01/05/2016 13:09:38     DataSource (ValueSource)(1/1) switched to SCHEDULED
> 01/05/2016 13:09:38     DataSource (ValueSource)(1/1) switched to DEPLOYING
> 01/05/2016 13:09:38     DataSource (ValueSource)(1/1) switched to RUNNING
> 01/05/2016 13:09:38     MapPartition (PythonFlatMap -> PythonCombine)(1/1) switched to SCHEDULED
> 01/05/2016 13:09:38     MapPartition (PythonFlatMap -> PythonCombine)(1/1) switched to DEPLOYING
> 01/05/2016 13:09:38     DataSource (ValueSource)(1/1) switched to FINISHED
> 01/05/2016 13:09:38     MapPartition (PythonFlatMap -> PythonCombine)(1/1) switched to RUNNING
> 01/05/2016 13:09:38     MapPartition (PythonFlatMap -> PythonCombine)(1/1) switched to FAILED
> java.lang.Exception: The user defined 'open()' method caused an exception: An error occurred while copying the file.
>         at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:484)
>         at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:354)
>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: An error occurred while copying the file.
>         at org.apache.flink.api.common.cache.DistributedCache.getFile(DistributedCache.java:78)
>         at org.apache.flink.languagebinding.api.java.python.streaming.PythonStreamer.startPython(PythonStreamer.java:68)
>         at org.apache.flink.languagebinding.api.java.python.streaming.PythonStreamer.setupProcess(PythonStreamer.java:58)
>         at org.apache.flink.languagebinding.api.java.common.streaming.Streamer.open(Streamer.java:67)
>         at org.apache.flink.languagebinding.api.java.python.functions.PythonMapPartition.open(PythonMapPartition.java:47)
>         at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>         at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:480)
>         ... 3 more
> Caused by: java.io.FileNotFoundException: File file:/tmp/flink does not exist or the user running Flink ('omar.alvarez') has insufficient permissions to access it.
>         at org.apache.flink.core.fs.local.LocalFileSystem.getFileStatus(LocalFileSystem.java:107)
>         at org.apache.flink.runtime.filecache.FileCache.copy(FileCache.java:242)
>         at org.apache.flink.runtime.filecache.FileCache$CopyProcess.call(FileCache.java:322)
>         at org.apache.flink.runtime.filecache.FileCache$CopyProcess.call(FileCache.java:306)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         ... 1 more
> {code}
> It is important to mention that I am using modified Flink cluster launch 
> scripts to run on the OGS engine. The modified scripts and a use case can be 
> found at https://github.com/omaralvarez/flink-OGS-GE.
> The same example in the Java API works correctly when not using the 
> DistributedCache, and the user has sufficient permissions to write the file. 
> If I use interactive nodes instead of the qsub command to run the example, it 
> does not fail.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
