[ https://issues.apache.org/jira/browse/FLINK-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16993382#comment-16993382 ]
Roman Grebennikov commented on FLINK-15171:
-------------------------------------------

[~pnowojski] [~roman_khachatryan], while running before/after benchmarks with async-profiler, I noticed that the amount of GC activity produced by serializerTuple differs considerably: the new code puts noticeably higher pressure on the GC (about +10%) due to intermediate buffer allocations in StringValue.writeString.

All the benchmarks I did were run on a Ryzen 7 2700 with 8 physical cores (16 with HT), but the Hetzner benchmarking box with an i7 7700 has only 4 (8 with HT).

Before PR: [^dec05.svg]
After PR: [^dec11.svg]

As the GC threads run concurrently, on the 8-core box they did not interfere with the main benchmark code (most probably they were scheduled on the idle cores). On the 4-core box, however, they started to interfere significantly with the benchmark threads, as there are far fewer CPU resources available.

I did yet another round of improvements in the serialization code to avoid the additional allocations in StringValue.writeString by adding a static ThreadLocal buffer for short strings; the same trick was applied to StringValue.readString (it looks a bit ugly, though, as thread-locals are dangerous). In any case, this seems to fix the issue with the excessive GC pressure. I will open a PR today once I have finished all the benchmarks to validate the results.

> Performance regression in serialisation benchmarks
> --------------------------------------------------
>
>                 Key: FLINK-15171
>                 URL: https://issues.apache.org/jira/browse/FLINK-15171
>             Project: Flink
>          Issue Type: Bug
>          Components: API / Type Serialization System, Benchmarks
>    Affects Versions: 1.10.0
>            Reporter: Piotr Nowojski
>            Assignee: Roman Khachatryan
>            Priority: Blocker
>             Fix For: 1.10.0
>
>         Attachments: dec05.svg, dec11.svg
>
>
> There is quite a significant performance regression in serialisation benchmarks
> in the commit range 2ecf7ca..9320f34 (which includes FLINK-14346).
> http://codespeed.dak8s.net:8000/timeline/?ben=serializerTuple&env=2
> http://codespeed.dak8s.net:8000/timeline/?ben=serializerRow&env=2
> http://codespeed.dak8s.net:8000/timeline/?ben=serializerPojo&env=2
> It coincides with the performance improvement for heavy strings:
> http://codespeed.dak8s.net:8000/timeline/?ben=serializerHeavyString&env=2
> It might be caused by some accidental change in the benchmarking code
> (changing parallelism in one benchmark is carried over to the next one?) or in
> the code itself.
> CC [~rgrebennikov] [~AHeise]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
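
For illustration, the static ThreadLocal buffer for short strings mentioned in the comment can be sketched roughly as follows. This is a minimal sketch and not the actual StringValue code or the announced PR: the class name ThreadLocalBufferSketch, the 64-character threshold, and the simplified 7-bit variable-length encoding are all assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch: reuse a per-thread scratch buffer for short strings
// so that the hot path performs no intermediate allocation.
final class ThreadLocalBufferSketch {
    // Assumed cut-off; the real threshold would be chosen by benchmarking.
    private static final int SHORT_STRING_MAX_CHARS = 64;

    // Worst case: 5 bytes for the length prefix plus 3 bytes per char.
    private static final ThreadLocal<byte[]> SHORT_BYTES =
            ThreadLocal.withInitial(() -> new byte[5 + 3 * SHORT_STRING_MAX_CHARS]);
    private static final ThreadLocal<char[]> SHORT_CHARS =
            ThreadLocal.withInitial(() -> new char[SHORT_STRING_MAX_CHARS]);

    public static void writeString(String s, DataOutput out) throws IOException {
        int len = s.length();
        // Short strings borrow the per-thread buffer; long ones still allocate.
        byte[] buf = len <= SHORT_STRING_MAX_CHARS ? SHORT_BYTES.get() : new byte[5 + 3 * len];
        int pos = putVarInt(buf, 0, len);
        for (int i = 0; i < len; i++) {
            pos = putVarInt(buf, pos, s.charAt(i));
        }
        out.write(buf, 0, pos); // one bulk write instead of many single-byte writes
    }

    public static String readString(DataInput in) throws IOException {
        int len = getVarInt(in);
        char[] chars = len <= SHORT_STRING_MAX_CHARS ? SHORT_CHARS.get() : new char[len];
        for (int i = 0; i < len; i++) {
            chars[i] = (char) getVarInt(in);
        }
        // The String constructor copies, so the shared buffer never escapes.
        return new String(chars, 0, len);
    }

    // Encodes a non-negative int in 7-bit groups, low group first.
    private static int putVarInt(byte[] buf, int pos, int value) {
        while (value >= 0x80) {
            buf[pos++] = (byte) (value | 0x80); // low 7 bits plus continuation bit
            value >>>= 7;
        }
        buf[pos++] = (byte) value;
        return pos;
    }

    private static int getVarInt(DataInput in) throws IOException {
        int value = 0;
        int shift = 0;
        int b;
        while (((b = in.readUnsignedByte()) & 0x80) != 0) {
            value |= (b & 0x7F) << shift;
            shift += 7;
        }
        return value | (b << shift);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        writeString("serializerTuple", new DataOutputStream(bytes));
        String back = readString(
                new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        System.out.println(back + " (" + bytes.size() + " bytes)");
    }
}
```

In this sketch the shared buffers are only safe because they never leave the method that borrowed them; each thread keeps its own copy alive for as long as the thread lives, which is one reason thread-locals need care.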