Stephan Ewen created FLINK-972:
----------------------------------
Summary: Run Flink on Tez
Key: FLINK-972
URL: https://issues.apache.org/jira/browse/FLINK-972
Project: Flink
Issue Type: New Feature
Components: New Components
Affects Versions: 0.6-incubating
Reporter: Stephan Ewen
The current status is:
- A prototype by Filip Haase that explores how Tez and Flink classes can
interoperate is available at
https://github.com/filiphaase/incubator-tez/tree/stratosphere-input-output-proto2
- There is a version that runs "WordCount" in Tez, using the Flink input
formats, output formats, and UDFs.
Next steps towards generic support for Flink programs are:
1) Integrate the Flink Memory Manager with Tez. This means defining how much
memory of each container Flink may allocate for its internal algorithms. In
Flink's core, users can set the amount of memory explicitly or define it
relative to the heap size; 0.7*heap_size is used if nothing else is specified.
2) Create a version of the Flink task context (PactTaskContext) for Tez. This
would allow the Flink runtime operators to run on a Tez processor.
3) Integrate Flink "ship strategies" (partitioning, replication,
redistribution, etc.) with the way Tez parameterizes connections.
4) Integrate the Flink sorting/caching with Tez. This should be simple: once the
memory manager is there, these classes should work out of the box.
5) Create a component that creates the Tez DAG from a Flink "OptimizedPlan". We
currently have a component that creates a "Job Graph" (Flink's DAG) from an
OptimizedPlan. It is the last step of the "pre-flight phase" before the job is
given to the master to be scheduled. We need an equivalent component that
creates a Tez DAG.
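
To make steps 1 and 3 more concrete, here is a minimal Java sketch. It is not
taken from the Flink or Tez code base: the class name, method names, and both
enums are hypothetical. The only value taken from this issue is the default
fraction of 0.7. The TezMovement enum is a stand-in that mirrors the names of
the Tez edge data movement types, and the mapping from ship strategies to those
types is one plausible choice, not a confirmed design. In particular, range
partitioning would likely need a custom partitioner on the Tez side.

```java
// Hypothetical sketch: none of these names come from the Flink or Tez code base.
final class FlinkOnTezSketch {

    /** Default share of the heap used when nothing else is specified (from the issue text). */
    public static final double DEFAULT_MEMORY_FRACTION = 0.7;

    /**
     * Step 1: bytes of one container's heap that Flink's memory manager may allocate.
     * An explicit amount (> 0) takes precedence; otherwise a fraction of the heap is used.
     */
    public static long managedMemoryBytes(long heapBytes, long explicitBytes, double fraction) {
        if (heapBytes <= 0) {
            throw new IllegalArgumentException("heapBytes must be positive");
        }
        if (explicitBytes > 0) {
            if (explicitBytes > heapBytes) {
                throw new IllegalArgumentException("explicit amount exceeds the container heap");
            }
            return explicitBytes;
        }
        // Written as a negated range check so that NaN is rejected as well.
        if (!(fraction > 0.0 && fraction <= 1.0)) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        return (long) (heapBytes * fraction);
    }

    /** Simplified stand-in for Flink's ship strategies (assumed subset). */
    public enum ShipStrategy { FORWARD, PARTITION_HASH, PARTITION_RANGE, BROADCAST }

    /** Stand-in mirroring the names of Tez's edge data movement types. */
    public enum TezMovement { ONE_TO_ONE, SCATTER_GATHER, BROADCAST }

    /** Step 3: one plausible mapping (an assumption, not a confirmed design). */
    public static TezMovement toTezMovement(ShipStrategy strategy) {
        switch (strategy) {
            case FORWARD:
                return TezMovement.ONE_TO_ONE;
            case BROADCAST:
                return TezMovement.BROADCAST;
            case PARTITION_HASH:
            case PARTITION_RANGE:
                return TezMovement.SCATTER_GATHER;
            default:
                throw new IllegalArgumentException("unmapped strategy: " + strategy);
        }
    }

    public static void main(String[] args) {
        long heap = 1024L * 1024L * 1024L; // a 1 GiB container heap
        System.out.println(managedMemoryBytes(heap, 0L, DEFAULT_MEMORY_FRACTION));
        System.out.println(toTezMovement(ShipStrategy.PARTITION_HASH));
    }
}
```

The explicit amount is checked against the container heap so that a
misconfigured value fails early instead of surfacing later as an out-of-memory
error inside a Tez container.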
--
This message was sent by Atlassian JIRA
(v6.2#6252)