陈梓立 created FLINK-10320:
---------------------------

             Summary: Introduce JobMaster schedule micro-benchmark
                 Key: FLINK-10320
                 URL: https://issues.apache.org/jira/browse/FLINK-10320
             Project: Flink
          Issue Type: Improvement
          Components: Tests
            Reporter: 陈梓立
            Assignee: 陈梓立


Based on the benchmarks in {{org.apache.flink.streaming.runtime.io.benchmark}} and the 
[flink-benchmark|https://github.com/dataArtisans/flink-benchmarks] repo, I propose 
to introduce another micro-benchmark that focuses on {{JobMaster}} scheduling 
performance.

h3. Target
Benchmark the time from {{JobMaster}} startup (receiving the {{JobGraph}} and 
initializing) until all tasks are RUNNING. Technically, we use a bounded stream 
and the TM finishes tasks as soon as they arrive, so the interval we actually 
measure ends when all tasks are FINISHED.
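
A minimal sketch of the measured interval, using only the JDK. Every name in it ({{ScheduleIntervalSketch}}, {{measureNanos}}, the thread pool standing in for a TM) is hypothetical and not a Flink class; it only illustrates "start the clock at submission, stop when the last task reports FINISHED".

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; none of these names are Flink classes. */
final class ScheduleIntervalSketch {

    /** Nanoseconds from "submission" until every simulated task has reported FINISHED. */
    static long measureNanos(int taskCount) {
        // Assumption: a small thread pool stands in for the task manager.
        ExecutorService fakeTaskManager = Executors.newFixedThreadPool(4);
        CountDownLatch allFinished = new CountDownLatch(taskCount);
        try {
            long start = System.nanoTime();
            for (int i = 0; i < taskCount; i++) {
                // A simulated task does no work: it is FINISHED as soon as it arrives.
                fakeTaskManager.execute(allFinished::countDown);
            }
            if (!allFinished.await(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
            return System.nanoTime() - start;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } finally {
            fakeTaskManager.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("elapsed ns: " + measureNanos(1_000));
    }
}
```

In the real benchmark the timed region would be handed to JMH rather than measured by hand with {{System.nanoTime()}}.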

h3. Cases
1. A JobGraph that covers EAGER + PIPELINED edges
2. A JobGraph that covers LAZY_FROM_SOURCES + PIPELINED edges
3. A JobGraph that covers LAZY_FROM_SOURCES + BLOCKING edges
PS: maybe also benchmark the case where the source gets its data from {{InputSplit}}?
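
One possible way to enumerate the three cases as benchmark parameters is sketched below. The enums are local stand-ins named after the modes listed above, not Flink's own types, and {{ScheduleCasesSketch}} and {{describe()}} are hypothetical names.

```java
/** Hypothetical sketch; the enums below are local stand-ins, not Flink types. */
final class ScheduleCasesSketch {

    enum ScheduleMode { EAGER, LAZY_FROM_SOURCES }

    enum EdgeType { PIPELINED, BLOCKING }

    /** The three schedule-mode / edge-type combinations listed in the proposal. */
    enum BenchmarkCase {
        EAGER_PIPELINED(ScheduleMode.EAGER, EdgeType.PIPELINED),
        LAZY_PIPELINED(ScheduleMode.LAZY_FROM_SOURCES, EdgeType.PIPELINED),
        LAZY_BLOCKING(ScheduleMode.LAZY_FROM_SOURCES, EdgeType.BLOCKING);

        final ScheduleMode scheduleMode;
        final EdgeType edgeType;

        BenchmarkCase(ScheduleMode scheduleMode, EdgeType edgeType) {
            this.scheduleMode = scheduleMode;
            this.edgeType = edgeType;
        }

        /** Human-readable label in the same form as the list above. */
        String describe() {
            return scheduleMode + " + " + edgeType + " edges";
        }
    }

    public static void main(String[] args) {
        for (BenchmarkCase c : BenchmarkCase.values()) {
            System.out.println(c.name() + ": " + c.describe());
        }
    }
}
```

With JMH, the enum constant names could presumably be supplied through a parameter field so that each case is reported separately.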

h3. Implementation
Following the flink-benchmark repo, the benchmark is ultimately run with JMH, so 
the whole test suite is split across two repos. The testing environment could 
live in the main repo, perhaps under 
flink-runtime/src/test/java/org/apache/flink/runtime/jobmaster/benchmark.
To measure the performance of {{JobMaster}} scheduling, we need to simulate an 
environment that:
1. has a real {{JobMaster}}
2. has a mock/testing {{ResourceManager}} that has infinite resources and 
reacts immediately.
3. has one (or many?) mock/testing {{TaskExecutor}} that deploys and finishes 
tasks immediately.
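
The shape of such an environment can be sketched with plain JDK futures. All classes below ({{ImmediateResourceManager}}, {{ImmediateTaskExecutor}}, {{scheduleAll}}) are hypothetical stand-ins that share no API with Flink's real {{ResourceManager}} or {{TaskExecutor}}, and the real {{JobMaster}} is replaced here by a trivial loop purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch; none of these classes are Flink classes. */
final class SchedulingEnvironmentSketch {

    /** Stand-in for a testing ResourceManager: unlimited slots, granted immediately. */
    static final class ImmediateResourceManager {
        private int nextSlotId = 0;

        CompletableFuture<Integer> requestSlot() {
            // The future is already complete, so the caller never waits for a slot.
            return CompletableFuture.completedFuture(nextSlotId++);
        }
    }

    /** Stand-in for a testing TaskExecutor: a deployed task is FINISHED at once. */
    static final class ImmediateTaskExecutor {
        CompletableFuture<String> deploy(int slotId, String taskName) {
            return CompletableFuture.completedFuture(taskName + "@slot-" + slotId + ":FINISHED");
        }
    }

    /** Trivial placeholder for the scheduler: one slot request and one deployment per task. */
    static List<String> scheduleAll(List<String> taskNames) {
        ImmediateResourceManager resourceManager = new ImmediateResourceManager();
        ImmediateTaskExecutor taskExecutor = new ImmediateTaskExecutor();
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (String name : taskNames) {
            pending.add(resourceManager.requestSlot()
                    .thenCompose(slot -> taskExecutor.deploy(slot, name)));
        }
        List<String> finished = new ArrayList<>();
        for (CompletableFuture<String> future : pending) {
            finished.add(future.join());
        }
        return finished;
    }

    public static void main(String[] args) {
        System.out.println(scheduleAll(Arrays.asList("source", "map", "sink")));
    }
}
```

Because both stand-ins answer with already-completed futures, any time spent in a benchmark built this way would come from the scheduling side rather than from resource allocation or task execution, which is the point of the proposed setup.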

[~trohrm...@apache.org] [~GJL] [~pnowojski] could you please review this 
proposal to help clarify the goal and concrete details? Thanks in advance.

Any suggestions are welcome.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
