[ 
https://issues.apache.org/jira/browse/FLINK-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712716#comment-14712716
 ] 

Maximilian Michels commented on FLINK-1195:
-------------------------------------------

The cloud performance testing tool called Yoka should cover your use cases: 
https://github.com/mxm/yoka/ It might not be exactly what you were looking for, 
but its API can still be adapted. 

You can take a look at the "nightly" configuration (it runs every three days): 
https://github.com/mxm/yoka/blob/master/runs/nightly.py

I know you created Peel, which has some overlap with Yoka. The main differences 
between the two are the way you write experiments, the visualization of results, 
and Yoka's support for cloud infrastructure (currently only GCE, but I'm 
planning to make it more generic in the future by using the libcloud Python 
module). 
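
To give an idea of the libcloud part, here is a rough sketch of what 
provider-agnostic provisioning could look like (hypothetical code, not part of 
Yoka yet; the credentials, machine type, image, and the provision_cluster 
helper are all placeholders):

{code}
# Hypothetical sketch: provisioning benchmark nodes via apache-libcloud.
# Switching Provider.GCE to e.g. Provider.EC2 would be the only change needed
# to target a different cloud; everything below is placeholder configuration.
from libcloud.compute.types import Provider
from libcloud.compute.providers import get_driver

def provision_cluster(num_nodes):
    cls = get_driver(Provider.GCE)
    driver = cls('service-account@example.iam.gserviceaccount.com',
                 'key.json', project='my-benchmark-project',
                 datacenter='europe-west1-b')
    # Pick a machine size and base image for the benchmark nodes.
    size = [s for s in driver.list_sizes() if s.name == 'n1-standard-4'][0]
    image = [i for i in driver.list_images() if 'debian' in i.name][0]
    return [driver.create_node('bench-%d' % i, size, image)
            for i in range(num_nodes)]
{code}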

> Improvement of benchmarking infrastructure
> ------------------------------------------
>
>                 Key: FLINK-1195
>                 URL: https://issues.apache.org/jira/browse/FLINK-1195
>             Project: Flink
>          Issue Type: Wish
>            Reporter: Till Rohrmann
>            Assignee: Alexander Alexandrov
>
> I noticed while running my ALS benchmarks that we still have some potential 
> to improve our benchmarking infrastructure. The current state is that we 
> execute the benchmark jobs by writing a script with a single set of 
> parameters. The runtime is then manually retrieved from the web interface of 
> Flink and Spark, respectively.
> I think we need the following extensions:
> * Automatic runtime retrieval and storage in a file
> * Repeated execution of jobs to gather some "advanced" statistics such as 
> mean and standard deviation of the runtimes
> * Support for value sets for the individual parameters
> The automatic runtime retrieval would allow us to execute several benchmarks 
> consecutively, without having to look up the runtimes in the logs or in the web 
> interface (which, by the way, only stores the runtimes of the last 5 jobs).
> What I mean by value sets is that it would be nice to specify a set of 
> parameter values for which the benchmark is run, without having to write a 
> separate benchmark script for every single parameter combination. I believe 
> this feature would come in very handy when we want to look at the runtime 
> behaviour of Flink for different input sizes or degrees of parallelism, for 
> example. To illustrate what I mean:
> {code}
> INPUTSIZE = 1000, 2000, 4000, 8000
> DOP = 1, 2, 4, 8
> OUTPUT=benchmarkResults
> repetitions=10
> command=benchmark.jar -p $DOP $INPUTSIZE 
> {code} 
> Something like that would execute the benchmark job with (DOP=1, 
> INPUTSIZE=1000), (DOP=2, INPUTSIZE=2000), ..., 10 times each, calculate 
> runtime statistics for each parameter combination, and store the results in 
> the file benchmarkResults.
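> To make this concrete, here is a rough (hypothetical) Python sketch of such a 
> driver; it pairs the value sets positionally as in the example above and 
> measures wall-clock time as a stand-in for the automatic runtime retrieval:
> {code}
> # Hypothetical driver sketch: runs each parameter pair `REPETITIONS` times
> # and writes mean/stddev of the wall-clock runtimes to the output file.
> # Retrieving the runtime from Flink's web interface or logs instead of
> # timing the command would be a drop-in replacement for the timed section;
> # itertools.product(DOP, INPUTSIZE) would give the full cross product.
> import subprocess, time
> from statistics import mean, stdev
>
> INPUTSIZE = [1000, 2000, 4000, 8000]
> DOP = [1, 2, 4, 8]
> OUTPUT = "benchmarkResults"
> REPETITIONS = 10
>
> with open(OUTPUT, "w") as out:
>     for dop, size in zip(DOP, INPUTSIZE):
>         runtimes = []
>         for _ in range(REPETITIONS):
>             start = time.time()
>             subprocess.check_call("benchmark.jar -p %d %d" % (dop, size),
>                                   shell=True)
>             runtimes.append(time.time() - start)
>         out.write("DOP=%d INPUTSIZE=%d mean=%.2fs stddev=%.2fs\n"
>                   % (dop, size, mean(runtimes), stdev(runtimes)))
> {code}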
> I believe that spending some effort now will pay off in the long run, because 
> we will be benchmarking Flink continuously. What do you guys think?



