[
https://issues.apache.org/jira/browse/FLINK-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14259110#comment-14259110
]
Fabian Hueske commented on FLINK-1341:
--------------------------------------
Data in HDFS is usually replicated, i.e., each piece of data is stored multiple
times (3 by default) on different nodes in the cluster. Flink tries to assign
read operations to local nodes. If some piece of data cannot be read locally by
any node, it is read remotely, i.e., the data is sent over the network. This is
similar for all systems that read data from HDFS, such as Hadoop and Spark.
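To illustrate the idea, here is a minimal sketch of locality-aware assignment.
It is not Flink's actual split assigner; the function name `assign_splits`, the
node names, and the least-loaded tie-breaking rule are assumptions made up for
this example. Each block goes to a TaskManager that holds a replica if one
exists, and otherwise to any TaskManager, which then reads it over the network.

```python
# Hypothetical sketch, not Flink's real implementation.
def assign_splits(split_replicas, task_manager_hosts):
    """Map each split to a (host, is_local) pair.

    split_replicas: dict of split id -> list of hosts holding a replica.
    task_manager_hosts: list of hosts that run a TaskManager.
    """
    load = {host: 0 for host in task_manager_hosts}
    assignment = {}
    for split, replicas in split_replicas.items():
        # Prefer TaskManagers that hold a replica of this split.
        local = [h for h in task_manager_hosts if h in replicas]
        # Fall back to a remote read if no TaskManager has the data.
        candidates = local if local else task_manager_hosts
        # Among the candidates, pick the least-loaded host (assumed policy).
        host = min(candidates, key=lambda h: load[h])
        load[host] += 1
        assignment[split] = (host, bool(local))
    return assignment


# Made-up replica placement with replication factor 3.
replicas = {
    "block-0": ["node1", "node5", "node9"],
    "block-1": ["node5", "node6", "node7"],
    "block-2": ["node2", "node3", "node10"],
}
# TaskManagers run on only 5 of the worker nodes; node5 is not among them.
task_managers = ["node1", "node2", "node3", "node4", "node8"]
result = assign_splits(replicas, task_managers)
```

In this made-up placement, "block-1" has no replica on any TaskManager node, so
it is the only block that is read remotely. The other two blocks are read
locally, even though both "block-0" and "block-1" have a replica on node5.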
> In Hadoop cluster mode, Worker node has data but not TaskManager
> ----------------------------------------------------------------
>
> Key: FLINK-1341
> URL: https://issues.apache.org/jira/browse/FLINK-1341
> Project: Flink
> Issue Type: Wish
> Components: JobManager, TaskManager
> Environment: Hadoop 2.0, YARN cluster
> Reporter: Solaimurugan.V
>
> In my Hadoop 2.0 cluster setup, which has 12 nodes, 11 are Workers. I'm trying
> to set up Flink in cluster mode on top of the existing setup. I have decided to
> run 5 TaskManagers on the 11 Workers. What happens if I submit a Flink job and
> the data exists on node 5, but no TaskManager is running on node 5?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)