Hello,

I have a process that works fine with Flink 0.8.1, but I decided to test it
against 0.9.0-milestone-1.  I have 12 task managers across 3 machines, so
it's a small setup.

The process fails with the message below.  It appears to be attempting a
shuffle in response to my join request.  I checked all 3 machines and there
are no issues with the hostname on any of them, but the host being reported
as "localhost" makes me wonder whether I've missed something obvious.
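
A minimal sketch of the kind of hostname check that could be run on each
machine, assuming the relevant question is whether the local hostname
resolves to a loopback address.  The class name HostnameCheck and the helper
isLoopback are made up for illustration; the sketch only reports what the JVM
resolves, and whether the task managers use the same lookup to pick the
address they advertise is an assumption, not something the trace confirms.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic, not part of Flink.
class HostnameCheck {

    // Returns true if the given IP literal or hostname resolves to a
    // loopback address (127.0.0.0/8 or ::1). Returns false if it cannot
    // be resolved at all.
    static boolean isLoopback(String host) {
        try {
            return InetAddress.getByName(host).isLoopbackAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        // What the JVM believes this machine's name and address are.
        InetAddress local = InetAddress.getLocalHost();
        System.out.println("hostname: " + local.getHostName());
        System.out.println("address:  " + local.getHostAddress());

        if (isLoopback(local.getHostAddress())) {
            // Other machines cannot connect to an address like 127.0.0.1,
            // so a loopback result here would be worth investigating.
            System.out.println("WARNING: local hostname resolves to a loopback address");
        }
    }
}
```

If this prints a loopback address on any of the machines, the hostname
mapping on that machine (for example in /etc/hosts) would be the first place
to look.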

I also noticed the same exception in one of the Travis CI builds, so I'm
hoping the cause is something simple.

06/23/2015 05:03:00     Join (Join at run(Job.java:137))(11/12) switched to RUNNING
06/23/2015 05:03:00     Join (Join at run(Job.java:176))(9/12) switched to RUNNING
06/23/2015 05:03:00     Join (Join at run(Job.java:176))(12/12) switched to RUNNING
06/23/2015 05:03:00     Join (Join at run(Job.java:137))(12/12) switched to FAILED
java.lang.Exception: The data preparation for task 'Join (Join at run(Job.java:137))' , caused an error: Connecting the channel failed: Connection refused: localhost/127.0.0.1:46229
        at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:472)
        at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
        at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Connecting the channel failed: Connection refused: localhost/127.0.0.1:46229
        at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.waitForChannel(PartitionRequestClientFactory.java:193)
        at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.access$000(PartitionRequestClientFactory.java:129)
        at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:65)
        at org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:57)
        at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:106)
        at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:305)
        at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:328)
        at org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:76)
        at org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:34)
        at org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:59)
        at org.apache.flink.runtime.operators.hash.MutableHashTable.buildInitialTable(MutableHashTable.java:696)
        at org.apache.flink.runtime.operators.hash.MutableHashTable.open(MutableHashTable.java:440)
        at org.apache.flink.runtime.operators.hash.NonReusingBuildSecondHashMatchIterator.open(NonReusingBuildSecondHashMatchIterator.java:85)
        at org.apache.flink.runtime.operators.MatchDriver.prepare(MatchDriver.java:160)
        at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:466)
        ... 3 more

Thanks
