Fabian Hueske created FLINK-1343:
------------------------------------

             Summary: Branching Join Program Deadlocks
                 Key: FLINK-1343
                 URL: https://issues.apache.org/jira/browse/FLINK-1343
             Project: Flink
          Issue Type: Bug
          Components: Optimizer
    Affects Versions: 0.8, 0.9
            Reporter: Fabian Hueske
            Assignee: Fabian Hueske

The following program deadlocks. It reads its data from a single non-parallel data source, branches twice, and combines the branches with two joins.

{code:java}
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple1;

public class DeadlockProgram {

  public static void main(String[] args) throws Exception {
    final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    // Single non-parallel source; both branches of the deadlocking variant read from it.
    DataSet<Long> longs = env.generateSequence(0, 1000000L).setParallelism(1);
    // Second, independent source; only used by the working variant below.
    DataSet<Long> longs2 = env.generateSequence(0, 1000000L).setParallelism(1);

    DataSet<Tuple1<Long>> longT1 = longs.map(new TupleWrapper());
    DataSet<Tuple1<Long>> longT2 = longT1.project(0);
    DataSet<Tuple1<Long>> longT3 = longs.map(new TupleWrapper()); // deadlocks
    // DataSet<Tuple1<Long>> longT3 = longs2.map(new TupleWrapper()); // works

    longT2.join(longT3).where(0).equalTo(0).projectFirst(0)
        .join(longT1).where(0).equalTo(0).projectFirst(0)
        .print();

    env.execute();
  }

  // Wraps each Long in a Tuple1 so it can be used with position-based join keys.
  public static class TupleWrapper implements MapFunction<Long, Tuple1<Long>> {
    @Override
    public Tuple1<Long> map(Long l) throws Exception {
      return new Tuple1<Long>(l);
    }
  }
}
{code}

The program executes correctly if one of the branches reads its data from a second data source (see the inline comment), or if the single data source uses the default parallelism instead of a parallelism of 1.