[ https://issues.apache.org/jira/browse/FLINK-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Flink Jira Bot updated FLINK-3203:
----------------------------------
Labels: DistributedCache OGS auto-deprioritized-major
auto-deprioritized-minor (was: DistributedCache OGS auto-deprioritized-major
stale-minor)
Priority: Not a Priority (was: Minor)
This issue was labeled "stale-minor" 7 days ago and has not received any
updates, so it is being deprioritized. If this ticket is actually Minor, please
raise the priority and either ask a committer to assign the issue to you or
revive the public discussion.
> DistributedCache crashing when run in OGS
> -----------------------------------------
>
> Key: FLINK-3203
> URL: https://issues.apache.org/jira/browse/FLINK-3203
> Project: Flink
> Issue Type: Bug
> Components: API / DataSet
> Affects Versions: 0.10.0
> Environment: Rocks 6.1 SP1, CentOS release 6.7
> (2.6.32-573.7.1.el6.x86_64), java/oraclejdk/1.8.0_45, Python 2.6.6
> Reporter: Omar Alvarez
> Priority: Not a Priority
> Labels: DistributedCache, OGS, auto-deprioritized-major,
> auto-deprioritized-minor
>
> When trying to execute the Python example without HDFS, the FlatMap fails
> with the following error:
> {code:title=PyExample|borderStyle=solid}
> 01/05/2016 13:09:38 Job execution switched to status RUNNING.
> 01/05/2016 13:09:38 DataSource (ValueSource)(1/1) switched to SCHEDULED
> 01/05/2016 13:09:38 DataSource (ValueSource)(1/1) switched to DEPLOYING
> 01/05/2016 13:09:38 DataSource (ValueSource)(1/1) switched to RUNNING
> 01/05/2016 13:09:38 MapPartition (PythonFlatMap -> PythonCombine)(1/1) switched to SCHEDULED
> 01/05/2016 13:09:38 MapPartition (PythonFlatMap -> PythonCombine)(1/1) switched to DEPLOYING
> 01/05/2016 13:09:38 DataSource (ValueSource)(1/1) switched to FINISHED
> 01/05/2016 13:09:38 MapPartition (PythonFlatMap -> PythonCombine)(1/1) switched to RUNNING
> 01/05/2016 13:09:38 MapPartition (PythonFlatMap -> PythonCombine)(1/1) switched to FAILED
> java.lang.Exception: The user defined 'open()' method caused an exception: An error occurred while copying the file.
> at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:484)
> at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:354)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: An error occurred while copying the file.
> at org.apache.flink.api.common.cache.DistributedCache.getFile(DistributedCache.java:78)
> at org.apache.flink.languagebinding.api.java.python.streaming.PythonStreamer.startPython(PythonStreamer.java:68)
> at org.apache.flink.languagebinding.api.java.python.streaming.PythonStreamer.setupProcess(PythonStreamer.java:58)
> at org.apache.flink.languagebinding.api.java.common.streaming.Streamer.open(Streamer.java:67)
> at org.apache.flink.languagebinding.api.java.python.functions.PythonMapPartition.open(PythonMapPartition.java:47)
> at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
> at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:480)
> ... 3 more
> Caused by: java.io.FileNotFoundException: File file:/tmp/flink does not exist or the user running Flink ('omar.alvarez') has insufficient permissions to access it.
> at org.apache.flink.core.fs.local.LocalFileSystem.getFileStatus(LocalFileSystem.java:107)
> at org.apache.flink.runtime.filecache.FileCache.copy(FileCache.java:242)
> at org.apache.flink.runtime.filecache.FileCache$CopyProcess.call(FileCache.java:322)
> at org.apache.flink.runtime.filecache.FileCache$CopyProcess.call(FileCache.java:306)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> ... 1 more
> {code}
> Note that I am using modified Flink cluster launch scripts in order to run
> on the OGS engine. The modified scripts and a usage example can be found at
> https://github.com/omaralvarez/flink-OGS-GE.
> The same example in the Java API works correctly when not using the
> DistributedCache, and the user has sufficient permissions to write the file.
> If I run the example on interactive nodes instead of submitting it with the
> qsub command, it does not fail.
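
Since the report says the job fails under qsub but not on interactive nodes, one way to narrow this down might be to compare what each environment sees at the path named in the FileNotFoundException. The following is only a minimal diagnostic sketch and is not part of Flink or of the reporter's scripts: the helper name describe_path is hypothetical, and /tmp/flink is simply the path taken from the trace above.

```python
import os
import socket


def describe_path(path):
    """Report whether `path` exists and is accessible to the current user."""
    return {
        "host": socket.gethostname(),
        # USER may be unset under some batch schedulers, hence the fallback.
        "user": os.environ.get("USER", "unknown"),
        "path": path,
        "exists": os.path.exists(path),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    # /tmp/flink is the path named in the FileNotFoundException above.
    for key, value in sorted(describe_path("/tmp/flink").items()):
        print("%s: %s" % (key, value))
```

Running the script once on an interactive node and once as a batch job submitted with qsub, then comparing the two outputs, would show whether the path is missing or inaccessible only on the batch execution host. It does not by itself explain why the copy fails.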
--
This message was sent by Atlassian Jira
(v8.20.1#820001)