Did you intentionally post to the mailing list?

I'm investigating the issue.
So far, I found that the hostname has never been passed to the input split
assigner. I guess this issue was introduced by the recent jobmanager
changes.
Secondly, Flink is using the fully qualified hostname, whereas HDFS is
using the short hostname only, so the string comparison never matches.
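
To illustrate the mismatch, here is a minimal sketch, assuming the comparison
is done on short hostnames on both sides. The names HostnameMatch, shortName
and sameHost are made up for illustration; they are not Flink or HDFS API,
and the actual fix in the input split assigner may look different.

```java
import java.util.Locale;

final class HostnameMatch {

    // Hypothetical helper: reduces "node1.cluster.example.org" to "node1".
    static String shortName(String host) {
        String h = host.trim().toLowerCase(Locale.ROOT);
        // Leave IPv4 literals untouched; cutting at the first dot would break them.
        if (h.matches("\\d{1,3}(\\.\\d{1,3}){3}")) {
            return h;
        }
        int dot = h.indexOf('.');
        return dot > 0 ? h.substring(0, dot) : h;
    }

    // Compares two hostnames after reducing both to their short form.
    static boolean sameHost(String a, String b) {
        return shortName(a).equals(shortName(b));
    }

    public static void main(String[] args) {
        // Fully qualified name (as Flink reports it) vs. short name (as HDFS reports it).
        System.out.println(sameHost("node1.cluster.example.org", "node1"));
        System.out.println(sameHost("node1.cluster.example.org", "node2"));
    }
}
```

A plain String.equals on the raw names would return false for the first pair,
which is the behaviour I am seeing. Note that cutting at the first dot assumes
short hostnames are unique within the cluster.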

I wouldn't cancel the release because we are at a point where it is faster
to vote on a bugfix release.
The issue is not a show stopper for using Flink. It's just slow on large
datasets.

On Fri, Oct 17, 2014 at 11:58 AM, Fabian Hueske <[email protected]> wrote:

> This is a critical issue and sounds a bit like a release blocker for 0.7 to
> me.
>
> Other opinions?
>
> 2014-10-17 11:25 GMT+02:00 Robert Metzger (JIRA) <[email protected]>:
>
> > Robert Metzger created FLINK-1170:
> > -------------------------------------
> >
> >              Summary: Localization of InputSplits is not working properly
> >                  Key: FLINK-1170
> >                  URL: https://issues.apache.org/jira/browse/FLINK-1170
> >              Project: Flink
> >           Issue Type: Bug
> >           Components: Distributed Runtime
> >             Reporter: Robert Metzger
> >             Assignee: Robert Metzger
> >
> >
> > While running some benchmarks, I found that Flink is not properly
> > assigning the InputSplits.
> >
> > On my testing cluster, ALL splits were assigned to remote HDFS DataNodes,
> > which causes a lot of network I/O.
