Hi! I have just pushed a big patch that reworks the JobManager's job and scheduling classes. It fixes some scalability and robustness issues, simplifies the task hierarchies, and prepares the code for some of the planned next features (incremental/interactive jobs).
The pull request is https://github.com/apache/incubator-flink/pull/122

The following changes will affect developers who work against the lower-level APIs (like the streaming parts):

- No more distinction between input/intermediate/output tasks.

- Intermediate data sets have a data structure now. This implies that some methods change slightly (more in name than in meaning). In the future, data sets can be consumed many times, but for now the network stack supports only one consumer.

- Conceptually, receivers now attach senders as inputs (and grab their outgoing data streams), rather than senders forwarding to receivers. This means that the wiring of JobGraphs is now the other way around.

- No more distinction between in-memory and network channels. Channels have always been in-memory automatically when sender and receiver are co-located. The flag was purely a scheduler hint, which is obsolete now (see below).

Most importantly:

- The scheduling is a bit different now. Instead of instance sharing, we now have SlotSharing Groups, which give you a way to share resources across tasks. They behave more dynamically, which is important for more dynamic environments and for clusters that have fewer task slots than the parallelism of some tasks.

- For cases that need strict co-location of tasks, we now have CoLocationConstraints. The Batch API uses them to ensure that head, tail, and tasks inside a closed-loop iteration are co-located.

Stephan
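
To illustrate the reversed wiring and the single-consumer restriction described above, here is a minimal sketch. All class and method names in it (WiringSketch, Vertex, DataSet, connectInput, getOrCreateOutput) are hypothetical stand-ins chosen for illustration and are not taken from the pull request; the sketch also assumes one output data set per vertex to keep things short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; NOT the actual Flink classes.
class WiringSketch {

    /** An intermediate data set: produced by one vertex, consumed by others. */
    static class DataSet {
        final Vertex producer;
        final List<Vertex> consumers = new ArrayList<>();

        DataSet(Vertex producer) {
            this.producer = producer;
        }
    }

    /** A single vertex type; no input/intermediate/output subclasses. */
    static class Vertex {
        final String name;
        final List<DataSet> inputs = new ArrayList<>();
        private DataSet output; // assumption: one output per vertex, created lazily

        Vertex(String name) {
            this.name = name;
        }

        DataSet getOrCreateOutput() {
            if (output == null) {
                output = new DataSet(this);
            }
            return output;
        }

        /** The receiver attaches the sender as an input, not the other way around. */
        void connectInput(Vertex sender) {
            DataSet dataSet = sender.getOrCreateOutput();
            if (!dataSet.consumers.isEmpty()) {
                // Mirrors the "only one consumer for now" restriction.
                throw new IllegalStateException(
                        "data set of '" + sender.name + "' already has a consumer");
            }
            dataSet.consumers.add(this);
            inputs.add(dataSet);
        }
    }

    public static void main(String[] args) {
        Vertex source = new Vertex("source");
        Vertex mapper = new Vertex("mapper");
        Vertex sink = new Vertex("sink");

        // Wiring runs from receiver to sender.
        mapper.connectInput(source);
        sink.connectInput(mapper);

        System.out.println(sink.inputs.get(0).producer.name); // prints "mapper"
    }
}
```

In this sketch, lifting the single-consumer check in connectInput would be the only change needed to let a data set be consumed many times.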
