[ 
https://issues.apache.org/jira/browse/FLINK-18033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17124737#comment-17124737
 ] 

Robert Metzger edited comment on FLINK-18033 at 6/3/20, 8:13 AM:
-----------------------------------------------------------------

Thanks a lot for opening this ticket. I agree that the e2e tests are running 
for quite a while, and I'm pretty sure that there's a lot of room for 
improvement in the scripts.

I see two ways forward, which are orthogonal to each other:
A) Investigate the slowest e2e tests and potential inefficiencies in the common 
scripts (cluster startup and shutdown etc.). The goal should not be to optimize 
the last few seconds, but rather the cases where we are wasting minutes of time 
waiting.
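
As a starting point for (A), the slowest tests could be found by timing each script and sorting the results. This is only a minimal sketch: `time_test` and `slowest_first` are hypothetical helpers, not part of the existing e2e scripts, and they measure whole seconds of wall-clock time only.

```bash
#!/usr/bin/env bash
# Sketch: record wall-clock duration per test, then list the slowest first.
# time_test and slowest_first are hypothetical helpers (assumptions).

# Runs the given command and prints "<seconds><TAB><command>".
time_test() {
    local start end
    start=$(date +%s)
    "$@" > /dev/null 2>&1 || true   # only the duration matters here
    end=$(date +%s)
    printf '%d\t%s\n' "$((end - start))" "$*"
}

# Sorts "<seconds><TAB><command>" lines numerically, slowest first.
slowest_first() {
    sort -rn
}

# Example (not executed here):
#   { time_test ./test_tpch.sh; time_test ./test_quickstarts.sh java; } | slowest_first
```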

B) Improve the tooling to execute the e2e tests by adding an "e2e test scheduler".
We have the following CI environments: 
1. Flink: Servers providing an environment to run tests on Docker; VMs from 
Azure to run tests on 'bare metal'
2. Personal Azure accounts: only VMs from Azure
At the moment, we cannot execute the e2e tests requiring Docker or Kubernetes 
in the Docker environment (this needs more research).

In the past, we've manually split the e2e tests into separate scripts. This has 
led to "code duplication" and perceived complexity in the e2e tests.

Given these constraints, I imagine the following design for an "e2e test 
scheduler".
1. Registration of tests:
Tests can have the following properties: 
- bare_metal = require a bare metal environment (Azure VM)
- light = have a resource footprint below 1GB of memory and low CPU usage (< 10%)
- heavy = have a resource footprint of up to 8GB of memory and high CPU usage

{code:bash}
register_test properties=bare_metal test_docker_embedded_job.sh dummy-fs
register_test properties=bare_metal test_kubernetes_embedded_job.sh
register_test properties=light test_quickstarts.sh java
register_test properties=heavy test_tpch.sh
{code}
(Note: we could determine whether a test is light or heavy using a tool like 
psrecord: https://github.com/astrofrog/psrecord)
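
Once peak memory and CPU usage of a test have been measured (for example with psrecord), the thresholds above could be turned into a property automatically. The helper below is a sketch only: `classify_test` and its two integer arguments are hypothetical, and treating "below 1GB but high CPU" as heavy is an assumption the proposal does not spell out.

```bash
#!/usr/bin/env bash
# Sketch: map measured peak usage to a scheduler property.
# classify_test is a hypothetical helper (assumption); inputs are integers.
#   $1 = peak memory in MB, $2 = peak CPU usage in percent

classify_test() {
    local peak_mem_mb=$1 peak_cpu_percent=$2
    if [ "$peak_mem_mb" -lt 1024 ] && [ "$peak_cpu_percent" -lt 10 ]; then
        echo light                      # below 1GB and below 10% CPU
    elif [ "$peak_mem_mb" -le 8192 ]; then
        echo heavy                      # up to 8GB, any CPU usage
    else
        echo "exceeds heavy limit (${peak_mem_mb} MB)" >&2
        return 1
    fi
}
```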

2. Execution:
In each environment, we run tests through a call like:
{code:bash}
execute_test properties=bare_metal  # on the Azure VMs
# On the dockerized environment for Flink, otherwise on an Azure VM. Run 8 tests concurrently.
execute_test properties=light parallelism=8
# On the dockerized environment for Flink, otherwise on an Azure VM. Run 1 test at a time.
execute_test properties=heavy parallelism=1
{code}
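
To make the idea more concrete, here is a minimal sketch of how `register_test` and `execute_test` might work together. The registry file, the argument parsing, and the use of `xargs -P` for concurrency are all assumptions for illustration. Choosing between the Docker environment and an Azure VM is not modeled.

```bash
#!/usr/bin/env bash
# Sketch of the proposed scheduler. The registry file and the parsing of
# "properties=" / "parallelism=" are assumptions, not an existing Flink API.

REGISTRY=$(mktemp)

# register_test properties=<prop> <command> [args...]
register_test() {
    local prop=${1#properties=}
    shift
    # One line per test: "<property><TAB><command line>"
    printf '%s\t%s\n' "$prop" "$*" >> "$REGISTRY"
}

# execute_test properties=<prop> [parallelism=<n>]
execute_test() {
    local prop=${1#properties=}
    local par=1
    case ${2:-} in
        parallelism=*) par=${2#parallelism=} ;;
    esac
    # Select the commands registered under $prop and run up to $par at once.
    # xargs returns non-zero if any of the tests fails.
    awk -F '\t' -v p="$prop" '$1 == p { print $2 }' "$REGISTRY" \
        | xargs -I{} -P "$par" sh -c '{}'
}
```

With the registrations from above in place, `execute_test properties=light parallelism=8` would run only the tests registered as light, at most 8 at a time.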

For now, the Java e2e tests are executed outside of the scheduler.


> Improve e2e test execution time
> -------------------------------
>
>                 Key: FLINK-18033
>                 URL: https://issues.apache.org/jira/browse/FLINK-18033
>             Project: Flink
>          Issue Type: Improvement
>          Components: Build System / Azure Pipelines, Test Infrastructure, 
> Tests
>            Reporter: Chesnay Schepler
>            Priority: Major
>
> Running all e2e tests currently requires ~3.5h, and this time is growing.
> We should look into ways to bring this time down to improve feedback times.


