[jira] [Commented] (FLINK-18433) From the end-to-end performance test results, 1.11 has a regression

2020-06-30 Thread Aihua Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148365#comment-17148365
 ] 

Aihua Li commented on FLINK-18433:
--

I think this change should be mentioned in the 1.11 release notes.

> From the end-to-end performance test results, 1.11 has a regression
> ---
>
> Key: FLINK-18433
> URL: https://issues.apache.org/jira/browse/FLINK-18433
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core, API / DataStream
>Affects Versions: 1.11.0
> Environment: 3 machines
> [|https://github.com/Li-Aihua/flink/blob/test_suite_for_basic_operations_1.11/flink-end-to-end-perf-tests/flink-basic-operations/src/main/java/org/apache/flink/basic/operations/PerformanceTestJob.java]
>Reporter: Aihua Li
>Priority: Major
> Attachments: flink_11.log.gz
>
>
>  
> I ran end-to-end performance tests comparing release-1.10 and release-1.11. 
> The results were as follows:
> |scenarioName|release-1.10|release-1.11|change|
> |OneInput_Broadcast_LazyFromSource_ExactlyOnce_10_rocksdb|46.175|43.8133|-5.11%|
> |OneInput_Rescale_LazyFromSource_ExactlyOnce_100_heap|211.835|200.355|-5.42%|
> |OneInput_Rebalance_LazyFromSource_ExactlyOnce_1024_rocksdb|1721.041667|1618.32|-5.97%|
> |OneInput_KeyBy_LazyFromSource_ExactlyOnce_10_heap|46|43.615|-5.18%|
> |OneInput_Broadcast_Eager_ExactlyOnce_100_rocksdb|212.105|199.688|-5.85%|
> |OneInput_Rescale_Eager_ExactlyOnce_1024_heap|1754.64|1600.12|-8.81%|
> |OneInput_Rebalance_Eager_ExactlyOnce_10_rocksdb|45.9167|43.0983|-6.14%|
> |OneInput_KeyBy_Eager_ExactlyOnce_100_heap|212.0816667|200.727|-5.35%|
> |OneInput_Broadcast_LazyFromSource_AtLeastOnce_1024_rocksdb|1718.245|1614.381667|-6.04%|
> |OneInput_Rescale_LazyFromSource_AtLeastOnce_10_heap|46.12|43.5517|-5.57%|
> |OneInput_Rebalance_LazyFromSource_AtLeastOnce_100_rocksdb|212.038|200.388|-5.49%|
> |OneInput_KeyBy_LazyFromSource_AtLeastOnce_1024_heap|1762.048333|1606.408333|-8.83%|
> |OneInput_Broadcast_Eager_AtLeastOnce_10_rocksdb|46.0583|43.4967|-5.56%|
> |OneInput_Rescale_Eager_AtLeastOnce_100_heap|212.233|201.188|-5.20%|
> |OneInput_Rebalance_Eager_AtLeastOnce_1024_rocksdb|1720.66|1616.85|-6.03%|
> |OneInput_KeyBy_Eager_AtLeastOnce_10_heap|46.14|43.6233|-5.45%|
> |TwoInputs_Broadcast_LazyFromSource_ExactlyOnce_100_rocksdb|156.918|152.957|-2.52%|
> |TwoInputs_Rescale_LazyFromSource_ExactlyOnce_1024_heap|1415.511667|1300.1|-8.15%|
> |TwoInputs_Rebalance_LazyFromSource_ExactlyOnce_10_rocksdb|34.2967|34.1667|-0.38%|
> |TwoInputs_KeyBy_LazyFromSource_ExactlyOnce_100_heap|158.353|151.848|-4.11%|
> |TwoInputs_Broadcast_Eager_ExactlyOnce_1024_rocksdb|1373.406667|1300.056667|-5.34%|
> |TwoInputs_Rescale_Eager_ExactlyOnce_10_heap|34.5717|32.0967|-7.16%|
> |TwoInputs_Rebalance_Eager_ExactlyOnce_100_rocksdb|158.655|147.44|-7.07%|
> |TwoInputs_KeyBy_Eager_ExactlyOnce_1024_heap|1356.611667|1292.386667|-4.73%|
> |TwoInputs_Broadcast_LazyFromSource_AtLeastOnce_10_rocksdb|34.01|33.205|-2.37%|
> |TwoInputs_Rescale_LazyFromSource_AtLeastOnce_100_heap|149.588|145.997|-2.40%|
> |TwoInputs_Rebalance_LazyFromSource_AtLeastOnce_1024_rocksdb|1359.74|1299.156667|-4.46%|
> |TwoInputs_KeyBy_LazyFromSource_AtLeastOnce_10_heap|34.025|29.6833|-12.76%|
> |TwoInputs_Broadcast_Eager_AtLeastOnce_100_rocksdb|157.303|151.4616667|-3.71%|
> |TwoInputs_Rescale_Eager_AtLeastOnce_1024_heap|1368.74|1293.238333|-5.52%|
> |TwoInputs_Rebalance_Eager_AtLeastOnce_10_rocksdb|34.325|33.285|-3.03%|
> |TwoInputs_KeyBy_Eager_AtLeastOnce_100_heap|162.5116667|134.375|-17.31%|
> The results show that 1.11 has a performance regression, mostly around 5% 
> and at most 17%. This needs to be checked.
> The test code:
> flink-1.10.0: 
> [https://github.com/Li-Aihua/flink/blob/test_suite_for_basic_operations/flink-end-to-end-perf-tests/flink-basic-operations/src/main/java/org/apache/flink/basic/operations/PerformanceTestJob.java]
> flink-1.11.0: 
> [https://github.com/Li-Aihua/flink/blob/test_suite_for_basic_operations_1.11/flink-end-to-end-perf-tests/flink-basic-operations/src/main/java/org/apache/flink/basic/operations/PerformanceTestJob.java]
> The submit command looks like this:
> bin/flink run -d -m 192.168.39.246:8081 -c 
> org.apache.flink.basic.operations.PerformanceTestJob 
> /home/admin/flink-basic-operations_2.11-1.10-SNAPSHOT.jar --topologyName 
> OneInput --LogicalAttributesofEdges Broadcast --ScheduleMode LazyFromSource 
> --CheckpointMode ExactlyOnce --recordSize 10 --stateBackend rocksdb
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-18433) From the end-to-end performance test results, 1.11 has a regression

2020-06-30 Thread Aihua Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148331#comment-17148331
 ] 

Aihua Li edited comment on FLINK-18433 at 6/30/20, 6:09 AM:


I found the reason:

There were three machines in the performance comparison environment: one master 
and two workers, configured in the conf/slaves file. In version 1.11, the 
conf/slaves file was renamed to conf/workers, and its default value is 
"localhost". As a result, the 1.10 test ran with one JM and two TMs, and the 
test job's containers were scheduled onto the other machines, whereas the 1.11 
test ran with only one JM and one TM, both on the same machine.
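
A pre-flight check along the following lines could catch this kind of mismatch 
before a benchmark run. It is only a sketch: the function names and the idea of 
comparing against an expected TaskManager count are assumptions for 
illustration, and the file names (conf/workers for 1.11, conf/slaves for 1.10) 
are taken from the explanation above.

```python
from pathlib import Path
from typing import List


def read_worker_hosts(conf_dir: str) -> List[str]:
    """Return the hosts listed in conf/workers, falling back to conf/slaves.

    conf/workers is checked first because, per the comment above, that is the
    file 1.11 uses; conf/slaves is the 1.10 name.
    """
    for name in ("workers", "slaves"):
        path = Path(conf_dir) / name
        if path.is_file():
            # One host per line; blank lines are ignored.
            return [ln.strip() for ln in path.read_text().splitlines() if ln.strip()]
    return []


def has_expected_workers(conf_dir: str, expected: int) -> bool:
    """True if the number of configured worker hosts matches `expected`."""
    return len(read_worker_hosts(conf_dir)) == expected
```

With the setup described above, has_expected_workers(conf_dir, 2) would hold 
for the 1.10 installation and fail for a 1.11 installation that still has the 
default conf/workers.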

I confirmed this change with @Xintong Song and reran the tests on a single 
machine to verify it. The results are:
 
|scenarioName|release-1.10|release-1.11|change|
|OneInput_Broadcast_LazyFromSource_ExactlyOnce_10_rocksdb|18.1767|17.7583|-2.30%|
|OneInput_Rescale_LazyFromSource_ExactlyOnce_100_heap|168.875|163.777|-3.02%|
|OneInput_Rebalance_LazyFromSource_ExactlyOnce_1024_rocksdb|630.715|644.938|2.26%|
|OneInput_KeyBy_LazyFromSource_ExactlyOnce_10_heap|37.6083|36.6117|-2.65%|
|OneInput_Broadcast_Eager_ExactlyOnce_100_rocksdb|81.4017|79.07|-2.86%|
|OneInput_Rescale_Eager_ExactlyOnce_1024_heap|1426.50|1398.851667|-1.94%|
|OneInput_Rebalance_Eager_ExactlyOnce_10_rocksdb|18.825|18.0783|-3.97%|
|OneInput_KeyBy_Eager_ExactlyOnce_100_heap|168.903|160.275|-5.11%|
|OneInput_Broadcast_LazyFromSource_AtLeastOnce_1024_rocksdb|630.1816667|654.7516667|3.90%|
|OneInput_Rescale_LazyFromSource_AtLeastOnce_10_heap|38.3033|36.1567|-5.60%|
|OneInput_Rebalance_LazyFromSource_AtLeastOnce_100_rocksdb|80.87|80.1767|-0.86%|
|OneInput_KeyBy_LazyFromSource_AtLeastOnce_1024_heap|1300.06|1394.201667|7.24%|
|OneInput_Broadcast_Eager_AtLeastOnce_10_rocksdb|18.6033|18.055|-2.95%|
|OneInput_Rescale_Eager_AtLeastOnce_100_heap|168.563|190.008|12.72%|
|OneInput_Rebalance_Eager_AtLeastOnce_1024_rocksdb|625.843|651.003|4.02%|
|OneInput_KeyBy_Eager_AtLeastOnce_10_heap|37.9617|35.8567|-5.55%|
|TwoInputs_Broadcast_LazyFromSource_ExactlyOnce_100_rocksdb|140.3716667|138.8116667|-1.11%|
|TwoInputs_Rescale_LazyFromSource_ExactlyOnce_1024_heap|1231.29|1230.145|-0.09%|
|TwoInputs_Rebalance_LazyFromSource_ExactlyOnce_10_rocksdb|32.4667|31.6883|-2.40%|
|TwoInputs_KeyBy_LazyFromSource_ExactlyOnce_100_heap|141.22|131.478|-6.90%|
|TwoInputs_Broadcast_Eager_ExactlyOnce_1024_rocksdb|1007.375|1092.865|8.49%|
|TwoInputs_Rescale_Eager_ExactlyOnce_10_heap|33.2533|31.0167|-6.73%|
|TwoInputs_Rebalance_Eager_ExactlyOnce_100_rocksdb|141.065|137.26|-2.70%|
|TwoInputs_KeyBy_Eager_ExactlyOnce_1024_heap|1233.316667|1222.988333|-0.84%|
|TwoInputs_Broadcast_LazyFromSource_AtLeastOnce_10_rocksdb|32.1167|31.6817|-1.35%|
|TwoInputs_Rescale_LazyFromSource_AtLeastOnce_100_heap|144.908|136.645|-5.70%|
|TwoInputs_Rebalance_LazyFromSource_AtLeastOnce_1024_rocksdb|1005.598333|1090.656667|8.46%|
|TwoInputs_KeyBy_LazyFromSource_AtLeastOnce_10_heap|32.84|31.2483|-4.85%|
|TwoInputs_Broadcast_Eager_AtLeastOnce_100_rocksdb|141.53|137.675|-2.72%|
|TwoInputs_Rescale_Eager_AtLeastOnce_1024_heap|1260.055|1183.28|-6.09%|
|TwoInputs_Rebalance_Eager_AtLeastOnce_10_rocksdb|31.9767|31.5567|-1.31%|
|TwoInputs_KeyBy_Eager_AtLeastOnce_100_heap|143.79|132.5716667|-7.80%|

From these results, 1.11 is better than 1.10 in some scenarios and worse in 
others, and the differences are small. I think this is normal. What do you 
think?



[jira] [Commented] (FLINK-18433) From the end-to-end performance test results, 1.11 has a regression

2020-06-29 Thread Aihua Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148331#comment-17148331
 ] 

Aihua Li commented on FLINK-18433:
--

I found the reason:

There were three machines in the performance comparison environment: one master 
and two workers, configured in the conf/slaves file. In version 1.11, the 
conf/slaves file was renamed to conf/workers, and its default value is 
"localhost". As a result, the 1.10 test ran with one JM and two TMs, and the 
test job's containers were scheduled onto the other machines, whereas the 1.11 
test ran with only one JM and one TM, both on the same machine.

I confirmed this change with @Xintong Song and reran the tests on a single 
machine to verify it. The results are:
|release-1.10|release-1.11|change|
|18.1767|17.7583|-2.30%|
|168.875|163.777|-3.02%|
|630.715|644.938|2.26%|
|37.6083|36.6117|-2.65%|
|81.4017|79.07|-2.86%|
|1426.50|1398.851667|-1.94%|
|18.825|18.0783|-3.97%|
|168.903|160.275|-5.11%|
|630.1816667|654.7516667|3.90%|
|38.3033|36.1567|-5.60%|
|80.87|80.1767|-0.86%|
|1300.06|1394.201667|7.24%|
|18.6033|18.055|-2.95%|
|168.563|190.008|12.72%|
|625.843|651.003|4.02%|
|37.9617|35.8567|-5.55%|
|140.3716667|138.8116667|-1.11%|
|1231.29|1230.145|-0.09%|
|32.4667|31.6883|-2.40%|
|141.22|131.478|-6.90%|
|1007.375|1092.865|8.49%|
|33.2533|31.0167|-6.73%|
|141.065|137.26|-2.70%|
|1233.316667|1222.988333|-0.84%|
|32.1167|31.6817|-1.35%|
|144.908|136.645|-5.70%|
|1005.598333|1090.656667|8.46%|
|32.84|31.2483|-4.85%|
|141.53|137.675|-2.72%|
|1260.055|1183.28|-6.09%|
|31.9767|31.5567|-1.31%|
|143.79|132.5716667|-7.80%|

From these results, 1.11 is better than 1.10 in some scenarios and worse in 
others, and the differences are small. I think this is normal. What do you 
think?


[jira] [Commented] (FLINK-18433) From the end-to-end performance test results, 1.11 has a regression

2020-06-27 Thread Aihua Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147221#comment-17147221
 ] 

Aihua Li commented on FLINK-18433:
--

Thanks for all the analysis above. Here are some notes on the test:
1. The last commit included in the test package is 
e13146f80114266aa34c9fe9f3dc27e87f7a7649. [~liyu], you can check whether your 
PR is included.
2. Result acquisition: the compared figure is the job's TPS. To avoid the 
impact of resource allocation, the TPS is read through the REST 
API (jobs/$jobId/vertices/$verticeId/subtasks/metrics?get=*numBuffersOutPerSecond)
 starting 2 minutes after the job is submitted (by which time the job has 
actually been scheduled). Each job yields 10 TPS samples taken 10s apart, which 
are then averaged. Each scenario submits 5 jobs, and the average of their TPS 
values is the final result.
3. Parallelism is 1: I will adjust the number of machines to 1, run the tests 
again, and then update the data.
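
The sampling scheme in note 2 can be sketched roughly as follows. This is not 
the actual test harness: read_tps is a hypothetical callable standing in for 
the REST query, and the function and parameter names are assumptions chosen 
for illustration. Only the numbers (2-minute warm-up, 10 samples, 10s apart, 5 
jobs per scenario) come from the note above.

```python
import time
from statistics import mean
from typing import Callable, List


def job_tps(read_tps: Callable[[], float],
            samples: int = 10,
            interval_s: float = 10.0,
            warmup_s: float = 120.0,
            sleep: Callable[[float], None] = time.sleep) -> float:
    """Average `samples` TPS readings taken `interval_s` apart after a warm-up.

    `read_tps` is a hypothetical hook that returns one TPS reading, for
    example by querying the metrics endpoint mentioned in note 2.
    """
    sleep(warmup_s)  # wait until the job has actually been scheduled
    readings: List[float] = []
    for i in range(samples):
        if i > 0:
            sleep(interval_s)  # spacing between consecutive samples
        readings.append(read_tps())
    return mean(readings)


def scenario_tps(per_job_tps: List[float]) -> float:
    """Final result for one scenario: the mean over its submitted jobs."""
    return mean(per_job_tps)
```

Passing sleep as a parameter keeps the sketch testable without waiting; a real 
run would leave it at the default.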


[jira] [Updated] (FLINK-18433) From the end-to-end performance test results, 1.11 has a regression

2020-06-24 Thread Aihua Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Li updated FLINK-18433:
-
Summary: From the end-to-end performance test results, 1.11 has a 
regression  (was: From the end-to-end performance test results, 1.11 has a )



[jira] [Reopened] (FLINK-18116) Manually test E2E performance on Flink 1.11

2020-06-24 Thread Aihua Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Li reopened FLINK-18116:
--

I ran end-to-end performance tests comparing release-1.10 and release-1.11 and 
found that 1.11 has a regression. I filed a bug, 
https://issues.apache.org/jira/browse/FLINK-18433, to record the details.

> Manually test E2E performance on Flink 1.11
> ---
>
> Key: FLINK-18116
> URL: https://issues.apache.org/jira/browse/FLINK-18116
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Core, API / DataStream, API / State Processor, 
> Build System, Client / Job Submission
>Affects Versions: 1.11.0
>Reporter: Aihua Li
>Assignee: Aihua Li
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.11.0
>
>
> The main goal is to verify that performance is not lower than in version 1.10 
> by checking end-to-end performance test metrics such as QPS and latency.





[jira] [Updated] (FLINK-18433) From the end-to-end performance test results, 1.11 has a

2020-06-24 Thread Aihua Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Li updated FLINK-18433:
-
Summary: From the end-to-end performance test results, 1.11 has a   (was: 
1.11 has a regression)



[jira] [Created] (FLINK-18433) 1.11 has a regression

2020-06-24 Thread Aihua Li (Jira)
Aihua Li created FLINK-18433:


 Summary: 1.11 has a regression
 Key: FLINK-18433
 URL: https://issues.apache.org/jira/browse/FLINK-18433
 Project: Flink
  Issue Type: Bug
  Components: API / Core, API / DataStream
Affects Versions: 1.11.1
 Environment: 3 machines

[|https://github.com/Li-Aihua/flink/blob/test_suite_for_basic_operations_1.11/flink-end-to-end-perf-tests/flink-basic-operations/src/main/java/org/apache/flink/basic/operations/PerformanceTestJob.java]
Reporter: Aihua Li


 

I ran end-to-end performance tests comparing release-1.10 and release-1.11. 
The results were as follows:
|scenarioName|release-1.10|release-1.11|change|
|OneInput_Broadcast_LazyFromSource_ExactlyOnce_10_rocksdb|46.175|43.8133|-5.11%|
|OneInput_Rescale_LazyFromSource_ExactlyOnce_100_heap|211.835|200.355|-5.42%|
|OneInput_Rebalance_LazyFromSource_ExactlyOnce_1024_rocksdb|1721.041667|1618.32|-5.97%|
|OneInput_KeyBy_LazyFromSource_ExactlyOnce_10_heap|46|43.615|-5.18%|
|OneInput_Broadcast_Eager_ExactlyOnce_100_rocksdb|212.105|199.688|-5.85%|
|OneInput_Rescale_Eager_ExactlyOnce_1024_heap|1754.64|1600.12|-8.81%|
|OneInput_Rebalance_Eager_ExactlyOnce_10_rocksdb|45.9167|43.0983|-6.14%|
|OneInput_KeyBy_Eager_ExactlyOnce_100_heap|212.0816667|200.727|-5.35%|
|OneInput_Broadcast_LazyFromSource_AtLeastOnce_1024_rocksdb|1718.245|1614.381667|-6.04%|
|OneInput_Rescale_LazyFromSource_AtLeastOnce_10_heap|46.12|43.5517|-5.57%|
|OneInput_Rebalance_LazyFromSource_AtLeastOnce_100_rocksdb|212.038|200.388|-5.49%|
|OneInput_KeyBy_LazyFromSource_AtLeastOnce_1024_heap|1762.048333|1606.408333|-8.83%|
|OneInput_Broadcast_Eager_AtLeastOnce_10_rocksdb|46.0583|43.4967|-5.56%|
|OneInput_Rescale_Eager_AtLeastOnce_100_heap|212.233|201.188|-5.20%|
|OneInput_Rebalance_Eager_AtLeastOnce_1024_rocksdb|1720.66|1616.85|-6.03%|
|OneInput_KeyBy_Eager_AtLeastOnce_10_heap|46.14|43.6233|-5.45%|
|TwoInputs_Broadcast_LazyFromSource_ExactlyOnce_100_rocksdb|156.918|152.957|-2.52%|
|TwoInputs_Rescale_LazyFromSource_ExactlyOnce_1024_heap|1415.511667|1300.1|-8.15%|
|TwoInputs_Rebalance_LazyFromSource_ExactlyOnce_10_rocksdb|34.2967|34.1667|-0.38%|
|TwoInputs_KeyBy_LazyFromSource_ExactlyOnce_100_heap|158.353|151.848|-4.11%|
|TwoInputs_Broadcast_Eager_ExactlyOnce_1024_rocksdb|1373.406667|1300.056667|-5.34%|
|TwoInputs_Rescale_Eager_ExactlyOnce_10_heap|34.5717|32.0967|-7.16%|
|TwoInputs_Rebalance_Eager_ExactlyOnce_100_rocksdb|158.655|147.44|-7.07%|
|TwoInputs_KeyBy_Eager_ExactlyOnce_1024_heap|1356.611667|1292.386667|-4.73%|
|TwoInputs_Broadcast_LazyFromSource_AtLeastOnce_10_rocksdb|34.01|33.205|-2.37%|
|TwoInputs_Rescale_LazyFromSource_AtLeastOnce_100_heap|149.588|145.997|-2.40%|
|TwoInputs_Rebalance_LazyFromSource_AtLeastOnce_1024_rocksdb|1359.74|1299.156667|-4.46%|
|TwoInputs_KeyBy_LazyFromSource_AtLeastOnce_10_heap|34.025|29.6833|-12.76%|
|TwoInputs_Broadcast_Eager_AtLeastOnce_100_rocksdb|157.303|151.4616667|-3.71%|
|TwoInputs_Rescale_Eager_AtLeastOnce_1024_heap|1368.74|1293.238333|-5.52%|
|TwoInputs_Rebalance_Eager_AtLeastOnce_10_rocksdb|34.325|33.285|-3.03%|
|TwoInputs_KeyBy_Eager_AtLeastOnce_100_heap|162.5116667|134.375|-17.31%|

The results show that 1.11 regresses in every scenario, typically by about 
5%, with a maximum regression of about 17%. This needs to be investigated.
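
The last column of the table appears to be the relative change of the release-1.11 value against the release-1.10 value. A minimal sketch of that calculation follows, using three rows copied from the table; the function name `regression_pct` is illustrative and not part of the test suite, and the formula is inferred from the numbers rather than stated in the report.

```python
def regression_pct(baseline: float, candidate: float) -> float:
    """Relative change of candidate vs. baseline in percent (negative = lower)."""
    return (candidate - baseline) / baseline * 100.0


# Rows copied from the table: (scenarioName, release-1.10, release-1.11).
rows = [
    ("OneInput_Broadcast_LazyFromSource_ExactlyOnce_10_rocksdb", 46.175, 43.8133),
    ("TwoInputs_Rebalance_LazyFromSource_ExactlyOnce_10_rocksdb", 34.2967, 34.1667),
    ("TwoInputs_KeyBy_Eager_AtLeastOnce_100_heap", 162.5116667, 134.375),
]

for name, v110, v111 in rows:
    print(f"{name}: {regression_pct(v110, v111):.2f}%")
```

For these three rows the sketch reproduces the table's -5.11%, -0.38% and -17.31%, the last two being the smallest and largest regressions in the table.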

The test code:

flink-1.10.0: 
[https://github.com/Li-Aihua/flink/blob/test_suite_for_basic_operations/flink-end-to-end-perf-tests/flink-basic-operations/src/main/java/org/apache/flink/basic/operations/PerformanceTestJob.java]

flink-1.11.0: 
[https://github.com/Li-Aihua/flink/blob/test_suite_for_basic_operations_1.11/flink-end-to-end-perf-tests/flink-basic-operations/src/main/java/org/apache/flink/basic/operations/PerformanceTestJob.java]

The submit command looks like this:

bin/flink run -d -m 192.168.39.246:8081 -c 
org.apache.flink.basic.operations.PerformanceTestJob 
/home/admin/flink-basic-operations_2.11-1.10-SNAPSHOT.jar --topologyName 
OneInput --LogicalAttributesofEdges Broadcast --ScheduleMode LazyFromSource 
--CheckpointMode ExactlyOnce --recordSize 10 --stateBackend rocksdb
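
The job arguments in this command line up, in order, with the underscore-separated fields of the scenario names in the results table. Assuming that correspondence holds for every row (it is inferred from this single example, not documented in the report), a scenario name can be expanded into job arguments as sketched below; `scenario_to_args` is a hypothetical helper, not part of the test suite.

```python
from typing import List

# Flag names taken from the command above, in scenario-name field order.
FLAGS = [
    "--topologyName",
    "--LogicalAttributesofEdges",
    "--ScheduleMode",
    "--CheckpointMode",
    "--recordSize",
    "--stateBackend",
]


def scenario_to_args(scenario_name: str) -> List[str]:
    """Split a scenario name on '_' and pair each field with its flag."""
    parts = scenario_name.split("_")
    if len(parts) != len(FLAGS):
        raise ValueError(
            f"expected {len(FLAGS)} fields, got {len(parts)}: {scenario_name!r}"
        )
    args: List[str] = []
    for flag, value in zip(FLAGS, parts):
        args.extend([flag, value])
    return args


print(" ".join(scenario_to_args(
    "OneInput_Broadcast_LazyFromSource_ExactlyOnce_10_rocksdb")))
```

The printed arguments match the tail of the command shown above.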

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Issue Comment Deleted] (FLINK-18116) Manually test E2E performance on Flink 1.11

2020-06-24 Thread Aihua Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Li updated FLINK-18116:
-
Comment: was deleted

(was: I mainly ran the stability test developed by Ali, which simulates 
abnormal online conditions (such as network interruption, full disk, the JM/AM 
process being killed, a TM throwing an exception, etc.) to check whether a 
Flink job can recover automatically. The test ran for 5 hours and simulated 
multiple combinations of abnormal scenarios; the Flink job returned to normal 
and checkpoints could be created. The test passed.)

> Manually test E2E performance on Flink 1.11
> ---
>
> Key: FLINK-18116
> URL: https://issues.apache.org/jira/browse/FLINK-18116
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Core, API / DataStream, API / State Processor, 
> Build System, Client / Job Submission
>Affects Versions: 1.11.0
>Reporter: Aihua Li
>Assignee: Aihua Li
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.11.0
>
>
> This mainly verifies that performance is no worse than in version 1.10 by 
> checking end-to-end performance test metrics such as QPS and latency.





[jira] [Closed] (FLINK-18115) Manually test fault-tolerance stability on Flink 1.11

2020-06-24 Thread Aihua Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Li closed FLINK-18115.

Resolution: Done

I mainly ran the stability test developed by Ali, which simulates abnormal 
online conditions (such as network interruption, full disk, the JM/AM process 
being killed, a TM throwing an exception, etc.) to check whether a Flink job 
can recover automatically. The test ran for 5 hours and simulated multiple 
combinations of abnormal scenarios; the Flink job returned to normal and 
checkpoints could be created. The test passed.

> Manually test fault-tolerance stability on Flink 1.11
> -
>
> Key: FLINK-18115
> URL: https://issues.apache.org/jira/browse/FLINK-18115
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Core, API / State Processor, Build System, Client 
> / Job Submission
>Affects Versions: 1.11.0
>Reporter: Aihua Li
>Assignee: Aihua Li
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.11.0
>
>
> It mainly checks that a Flink job can recover from various abnormal 
> situations, including disk full, network interruption, ZooKeeper connection 
> failure, RPC message timeout, etc. 
> If the job cannot be recovered, the test fails.





[jira] [Closed] (FLINK-18116) Manually test E2E performance on Flink 1.11

2020-06-24 Thread Aihua Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Li closed FLINK-18116.

Resolution: Done

I mainly ran the stability test developed by Ali, which simulates abnormal 
online conditions (such as network interruption, full disk, the JM/AM process 
being killed, a TM throwing an exception, etc.) to check whether a Flink job 
can recover automatically. The test ran for 5 hours and simulated multiple 
combinations of abnormal scenarios; the Flink job returned to normal and 
checkpoints could be created. The test passed.

> Manually test E2E performance on Flink 1.11
> ---
>
> Key: FLINK-18116
> URL: https://issues.apache.org/jira/browse/FLINK-18116
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Core, API / DataStream, API / State Processor, 
> Build System, Client / Job Submission
>Affects Versions: 1.11.0
>Reporter: Aihua Li
>Assignee: Aihua Li
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.11.0
>
>
> This mainly verifies that performance is no worse than in version 1.10 by 
> checking end-to-end performance test metrics such as QPS and latency.





[jira] [Updated] (FLINK-18115) Manually test fault-tolerance stability on Flink 1.11

2020-06-04 Thread Aihua Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Li updated FLINK-18115:
-
Summary: Manually test fault-tolerance stability on Flink 1.11  (was: 
StabilityTest)

> Manually test fault-tolerance stability on Flink 1.11
> -
>
> Key: FLINK-18115
> URL: https://issues.apache.org/jira/browse/FLINK-18115
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Core, API / State Processor, Build System, Client 
> / Job Submission
>Affects Versions: 1.11.0
>Reporter: Aihua Li
>Assignee: Aihua Li
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.11.0
>
>
> It mainly checks that a Flink job can recover from various abnormal 
> situations, including disk full, network interruption, ZooKeeper connection 
> failure, RPC message timeout, etc. 
> If the job cannot be recovered, the test fails.





[jira] [Updated] (FLINK-18115) StabilityTest

2020-06-04 Thread Aihua Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Li updated FLINK-18115:
-
Parent: FLINK-18088
Issue Type: Sub-task  (was: Test)

> StabilityTest
> -
>
> Key: FLINK-18115
> URL: https://issues.apache.org/jira/browse/FLINK-18115
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Core, API / State Processor, Build System, Client 
> / Job Submission
>Affects Versions: 1.10.0
>Reporter: Aihua Li
>Priority: Major
>  Labels: release-testing
> Fix For: 1.11.0
>
>
> It mainly checks that a Flink job can recover from various abnormal 
> situations, including disk full, network interruption, ZooKeeper connection 
> failure, RPC message timeout, etc. 
> If the job cannot be recovered, the test fails.





[jira] [Updated] (FLINK-18116) E2E performance test

2020-06-04 Thread Aihua Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Li updated FLINK-18116:
-
Parent: FLINK-18088
Issue Type: Sub-task  (was: Test)

> E2E performance test
> 
>
> Key: FLINK-18116
> URL: https://issues.apache.org/jira/browse/FLINK-18116
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Core, API / DataStream, API / State Processor, 
> Build System, Client / Job Submission
>Affects Versions: 1.11.0
>Reporter: Aihua Li
>Priority: Major
>  Labels: release-testing
> Fix For: 1.11.0
>
>
> This mainly verifies that performance is no worse than in version 1.10 by 
> checking end-to-end performance test metrics such as QPS and latency.





[jira] [Created] (FLINK-18116) E2E performance test

2020-06-04 Thread Aihua Li (Jira)
Aihua Li created FLINK-18116:


 Summary: E2E performance test
 Key: FLINK-18116
 URL: https://issues.apache.org/jira/browse/FLINK-18116
 Project: Flink
  Issue Type: Test
  Components: API / Core, API / DataStream, API / State Processor, 
Build System, Client / Job Submission
Affects Versions: 1.11.0
Reporter: Aihua Li
 Fix For: 1.11.0


This mainly verifies that performance is no worse than in version 1.10 by 
checking end-to-end performance test metrics such as QPS and latency.





[jira] [Updated] (FLINK-18115) StabilityTest

2020-06-04 Thread Aihua Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-18115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Li updated FLINK-18115:
-
Labels: release-testing  (was: )

> StabilityTest
> -
>
> Key: FLINK-18115
> URL: https://issues.apache.org/jira/browse/FLINK-18115
> Project: Flink
>  Issue Type: Test
>  Components: API / Core, API / State Processor, Build System, Client 
> / Job Submission
>Affects Versions: 1.10.0
>Reporter: Aihua Li
>Priority: Major
>  Labels: release-testing
> Fix For: 1.11.0
>
>
> It mainly checks that a Flink job can recover from various abnormal 
> situations, including disk full, network interruption, ZooKeeper connection 
> failure, RPC message timeout, etc. 
> If the job cannot be recovered, the test fails.





[jira] [Created] (FLINK-18115) StabilityTest

2020-06-04 Thread Aihua Li (Jira)
Aihua Li created FLINK-18115:


 Summary: StabilityTest
 Key: FLINK-18115
 URL: https://issues.apache.org/jira/browse/FLINK-18115
 Project: Flink
  Issue Type: Test
  Components: API / Core, API / State Processor, Build System, Client / 
Job Submission
Affects Versions: 1.10.0
Reporter: Aihua Li
 Fix For: 1.11.0


It mainly checks that a Flink job can recover from various abnormal situations, 
including disk full, network interruption, ZooKeeper connection failure, RPC 
message timeout, etc. 
If the job cannot be recovered, the test fails.





[jira] [Commented] (FLINK-17907) flink-table-api-java: Compilation failure

2020-05-27 Thread Aihua Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17117491#comment-17117491
 ] 

Aihua Li commented on FLINK-17907:
--

After I updated the JDK to "Java(TM) SE Runtime Environment (build 
1.8.0_131-b11)", the failure disappeared.

[~jark] could you help me close this bug? Thanks.

> flink-table-api-java: Compilation failure
> -
>
> Key: FLINK-17907
> URL: https://issues.apache.org/jira/browse/FLINK-17907
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.11.0
> Environment: local env
>Reporter: Aihua Li
>Priority: Blocker
> Fix For: 1.11.0
>
>
> When I run the command "mvn clean install -B -U -DskipTests 
> -Dcheckstyle.skip=true -Drat.ignoreErrors -Dmaven.javadoc.skip" on the 
> "master" and "release-1.11" branches to install Flink in my local 
> environment, I hit this failure:
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.8.0:compile 
> (default-compile) on project flink-table-api-java: Compilation failure
> [ERROR] 
> flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/operations/utils/AggregateOperationFactory.java:[550,53]
>  unreported exception X; must be caught or declared to be thrown
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> [ERROR]
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR] mvn  -rf :flink-table-api-java





[jira] [Commented] (FLINK-17907) flink-table-api-java: Compilation failure

2020-05-25 Thread Aihua Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17115770#comment-17115770
 ] 

Aihua Li commented on FLINK-17907:
--

Java version: openjdk version "1.8.0_242"

Maven version: Apache Maven 3.2.5 (12a6b3acb947671f09b81f49094c53f426d8cea1; 
2014-12-15T01:29:23+08:00)

> flink-table-api-java: Compilation failure
> -
>
> Key: FLINK-17907
> URL: https://issues.apache.org/jira/browse/FLINK-17907
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.11.0
> Environment: local env
>Reporter: Aihua Li
>Priority: Blocker
> Fix For: 1.11.0
>
>
> When I run the command "mvn clean install -B -U -DskipTests 
> -Dcheckstyle.skip=true -Drat.ignoreErrors -Dmaven.javadoc.skip" on the 
> "master" and "release-1.11" branches to install Flink in my local 
> environment, I hit this failure:
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.8.0:compile 
> (default-compile) on project flink-table-api-java: Compilation failure
> [ERROR] 
> flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/operations/utils/AggregateOperationFactory.java:[550,53]
>  unreported exception X; must be caught or declared to be thrown
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> [ERROR]
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR] mvn  -rf :flink-table-api-java





[jira] [Created] (FLINK-17907) flink-table-api-java: Compilation failure

2020-05-24 Thread Aihua Li (Jira)
Aihua Li created FLINK-17907:


 Summary: flink-table-api-java: Compilation failure
 Key: FLINK-17907
 URL: https://issues.apache.org/jira/browse/FLINK-17907
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Affects Versions: 1.11.0
 Environment: local env
Reporter: Aihua Li
 Fix For: 1.11.0


When I run the command "mvn clean install -B -U -DskipTests 
-Dcheckstyle.skip=true -Drat.ignoreErrors -Dmaven.javadoc.skip" on the 
"master" and "release-1.11" branches to install Flink in my local environment, 
I hit this failure:

[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.8.0:compile (default-compile) 
on project flink-table-api-java: Compilation failure
[ERROR] 
flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/operations/utils/AggregateOperationFactory.java:[550,53]
 unreported exception X; must be caught or declared to be thrown
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn  -rf :flink-table-api-java
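
When triaging a log like this, it can help to pull out the source file, line and column that the compiler reported. The following is only a sketch: the regular expression assumes the `path.java:[line,column]` form seen in the output above, and `find_error_locations` is a hypothetical helper, not a Maven or Flink utility.

```python
import re

# Matches locations of the form "<path>.java:[<line>,<column>]" as seen above.
LOCATION = re.compile(r"(?P<path>\S+\.java):\[(?P<line>\d+),(?P<column>\d+)\]")


def find_error_locations(log_text):
    """Return (path, line, column) for each compiler location in the log."""
    return [
        (m.group("path"), int(m.group("line")), int(m.group("column")))
        for m in LOCATION.finditer(log_text)
    ]


# Excerpt copied from the build output in this report.
log = """[ERROR]
flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/operations/utils/AggregateOperationFactory.java:[550,53]
 unreported exception X; must be caught or declared to be thrown
"""

for path, line, column in find_error_locations(log):
    print(f"{path} line {line}, column {column}")
```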


