[
https://issues.apache.org/jira/browse/FLINK-18433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17147409#comment-17147409
]
Arvid Heise commented on FLINK-18433:
-------------------------------------
[~trohrmann], I used the local executor with explicit Xmx configuration, so I'm
bypassing all the TM/JM memory setup code. In the end, most values should be
default values.
{noformat}
2020-06-26 14:53:07,199 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils [] - The
configuration option Key: 'taskmanager.cpu.cores' , default: null (fallback
keys: []) required for local execution is not set, setting it to its default
value 1.7976931348623157E308
2020-06-26 14:53:07,201 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils [] - The
configuration option Key: 'taskmanager.memory.task.heap.size' , default: null
(fallback keys: []) required for local execution is not set, setting it to its
default value 9223372036854775807 bytes
2020-06-26 14:53:07,201 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils [] - The
configuration option Key: 'taskmanager.memory.task.off-heap.size' , default: 0
bytes (fallback keys: []) required for local execution is not set, setting it
to its default value 9223372036854775807 bytes
2020-06-26 14:53:07,202 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils [] - The
configuration option Key: 'taskmanager.memory.network.min' , default: 64 mb
(fallback keys: [{key=taskmanager.network.memory.min, isDeprecated=true}])
required for local execution is not set, setting it to its default value 64 mb
2020-06-26 14:53:07,202 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils [] - The
configuration option Key: 'taskmanager.memory.network.max' , default: 1 gb
(fallback keys: [{key=taskmanager.network.memory.max, isDeprecated=true}])
required for local execution is not set, setting it to its default value 64 mb
2020-06-26 14:53:07,202 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils [] - The
configuration option Key: 'taskmanager.memory.managed.size' , default: null
(fallback keys: [{key=taskmanager.memory.size, isDeprecated=true}]) required
for local execution is not set, setting it to its default value 128 mb{noformat}
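Side note on the log above: the values reported for 'taskmanager.cpu.cores' and for the task heap/off-heap sizes coincide with the JVM's Double.MAX_VALUE and Long.MAX_VALUE, so they read as "effectively unbounded" rather than real limits. A minimal sketch to confirm the numbers; the class and method names are hypothetical and this is not Flink code:

```java
// Sketch only: prints the JVM numeric maxima that match the values in the log.
class LocalExecutionDefaults {

    // Matches the value logged for 'taskmanager.cpu.cores'.
    static String cpuCoresDefault() {
        return Double.toString(Double.MAX_VALUE);
    }

    // Matches the byte value logged for the task heap and off-heap sizes.
    static String memoryDefault() {
        return Long.toString(Long.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println("cpu cores: " + cpuCoresDefault());
        System.out.println("memory (bytes): " + memoryDefault());
    }
}
```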
Does anyone know how the TPS metric is calculated? Would slower deployment
affect it, or is it only the records/s for the last second? [~Aihua], could you
publish the raw measurements? I'd like to see the spread, and the timeline may
also help us.
We can rule out the weird cancellation behavior as the cause, though (it should
still be investigated), since it seems [~Aihua] did not cancel the job before
taking the metric.
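As a sanity check until the raw measurements are available: the last column of the table in the issue description appears to be the relative change, (1.11 - 1.10) / 1.10. A minimal sketch under that assumption, with hypothetical class and method names and two rows copied from the table:

```java
// Sketch only: recomputes the relative change for two rows of the results table.
class RegressionCheck {

    /** Relative change of newValue versus baseline, in percent. */
    static double relativeChangePercent(double baseline, double newValue) {
        return (newValue - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // OneInput_Broadcast_LazyFromSource_ExactlyOnce_10_rocksdb
        System.out.printf("%.2f%%%n", relativeChangePercent(46.175, 43.81333333));
        // TwoInputs_KeyBy_Eager_AtLeastOnce_100_heap
        System.out.printf("%.2f%%%n", relativeChangePercent(162.5116667, 134.375));
    }
}
```

This only reproduces the averages in the table; it says nothing about the spread of the individual runs.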
> From the end-to-end performance test results, 1.11 has a regression
> -------------------------------------------------------------------
>
> Key: FLINK-18433
> URL: https://issues.apache.org/jira/browse/FLINK-18433
> Project: Flink
> Issue Type: Bug
> Components: API / Core, API / DataStream
> Affects Versions: 1.11.0
> Environment: 3 machines
> [|https://github.com/Li-Aihua/flink/blob/test_suite_for_basic_operations_1.11/flink-end-to-end-perf-tests/flink-basic-operations/src/main/java/org/apache/flink/basic/operations/PerformanceTestJob.java]
> Reporter: Aihua Li
> Priority: Major
> Attachments: flink_11.log.gz
>
>
>
> I ran end-to-end performance tests on Release-1.10 and Release-1.11.
> The results were as follows:
> |scenarioName|release-1.10|release-1.11|change|
> |OneInput_Broadcast_LazyFromSource_ExactlyOnce_10_rocksdb|46.175|43.81333333|-5.11%|
> |OneInput_Rescale_LazyFromSource_ExactlyOnce_100_heap|211.835|200.355|-5.42%|
> |OneInput_Rebalance_LazyFromSource_ExactlyOnce_1024_rocksdb|1721.041667|1618.323333|-5.97%|
> |OneInput_KeyBy_LazyFromSource_ExactlyOnce_10_heap|46|43.615|-5.18%|
> |OneInput_Broadcast_Eager_ExactlyOnce_100_rocksdb|212.105|199.6883333|-5.85%|
> |OneInput_Rescale_Eager_ExactlyOnce_1024_heap|1754.64|1600.123333|-8.81%|
> |OneInput_Rebalance_Eager_ExactlyOnce_10_rocksdb|45.91666667|43.09833333|-6.14%|
> |OneInput_KeyBy_Eager_ExactlyOnce_100_heap|212.0816667|200.7266667|-5.35%|
> |OneInput_Broadcast_LazyFromSource_AtLeastOnce_1024_rocksdb|1718.245|1614.381667|-6.04%|
> |OneInput_Rescale_LazyFromSource_AtLeastOnce_10_heap|46.12|43.55166667|-5.57%|
> |OneInput_Rebalance_LazyFromSource_AtLeastOnce_100_rocksdb|212.0383333|200.3883333|-5.49%|
> |OneInput_KeyBy_LazyFromSource_AtLeastOnce_1024_heap|1762.048333|1606.408333|-8.83%|
> |OneInput_Broadcast_Eager_AtLeastOnce_10_rocksdb|46.05833333|43.49666667|-5.56%|
> |OneInput_Rescale_Eager_AtLeastOnce_100_heap|212.2333333|201.1883333|-5.20%|
> |OneInput_Rebalance_Eager_AtLeastOnce_1024_rocksdb|1720.663333|1616.85|-6.03%|
> |OneInput_KeyBy_Eager_AtLeastOnce_10_heap|46.14|43.62333333|-5.45%|
> |TwoInputs_Broadcast_LazyFromSource_ExactlyOnce_100_rocksdb|156.9183333|152.9566667|-2.52%|
> |TwoInputs_Rescale_LazyFromSource_ExactlyOnce_1024_heap|1415.511667|1300.1|-8.15%|
> |TwoInputs_Rebalance_LazyFromSource_ExactlyOnce_10_rocksdb|34.29666667|34.16666667|-0.38%|
> |TwoInputs_KeyBy_LazyFromSource_ExactlyOnce_100_heap|158.3533333|151.8483333|-4.11%|
> |TwoInputs_Broadcast_Eager_ExactlyOnce_1024_rocksdb|1373.406667|1300.056667|-5.34%|
> |TwoInputs_Rescale_Eager_ExactlyOnce_10_heap|34.57166667|32.09666667|-7.16%|
> |TwoInputs_Rebalance_Eager_ExactlyOnce_100_rocksdb|158.655|147.44|-7.07%|
> |TwoInputs_KeyBy_Eager_ExactlyOnce_1024_heap|1356.611667|1292.386667|-4.73%|
> |TwoInputs_Broadcast_LazyFromSource_AtLeastOnce_10_rocksdb|34.01|33.205|-2.37%|
> |TwoInputs_Rescale_LazyFromSource_AtLeastOnce_100_heap|149.5883333|145.9966667|-2.40%|
> |TwoInputs_Rebalance_LazyFromSource_AtLeastOnce_1024_rocksdb|1359.74|1299.156667|-4.46%|
> |TwoInputs_KeyBy_LazyFromSource_AtLeastOnce_10_heap|34.025|29.68333333|-12.76%|
> |TwoInputs_Broadcast_Eager_AtLeastOnce_100_rocksdb|157.3033333|151.4616667|-3.71%|
> |TwoInputs_Rescale_Eager_AtLeastOnce_1024_heap|1368.74|1293.238333|-5.52%|
> |TwoInputs_Rebalance_Eager_AtLeastOnce_10_rocksdb|34.325|33.285|-3.03%|
> |TwoInputs_KeyBy_Eager_AtLeastOnce_100_heap|162.5116667|134.375|-17.31%|
> The results show a performance regression in 1.11, mostly around 5%, with a
> maximum of about 17%. This needs to be checked.
> The test code:
> flink-1.10.0:
> [https://github.com/Li-Aihua/flink/blob/test_suite_for_basic_operations/flink-end-to-end-perf-tests/flink-basic-operations/src/main/java/org/apache/flink/basic/operations/PerformanceTestJob.java]
> flink-1.11.0:
> [https://github.com/Li-Aihua/flink/blob/test_suite_for_basic_operations_1.11/flink-end-to-end-perf-tests/flink-basic-operations/src/main/java/org/apache/flink/basic/operations/PerformanceTestJob.java]
> The submit command looks like this:
> bin/flink run -d -m 192.168.39.246:8081 -c
> org.apache.flink.basic.operations.PerformanceTestJob
> /home/admin/flink-basic-operations_2.11-1.10-SNAPSHOT.jar --topologyName
> OneInput --LogicalAttributesofEdges Broadcast --ScheduleMode LazyFromSource
> --CheckpointMode ExactlyOnce --recordSize 10 --stateBackend rocksdb
>