Guys, During internal testing, we've found a critical bug with discovery (cluster falls apart if two nodes segmented sequentially). This problem is not reproduced in 2.8.1. I think we should fix it before release. Under investigation now. I'll let you know when we get something.
чт, 17 сент. 2020 г. в 00:51, Andrey Gura <ag...@apache.org>: > > So what do you think? Should we proceed with a 'hacked' version of the > message factory in 2.9 and go for the runtime message generation in later > release, or keep the code clean and fix the regression in the next releases? > > Andrey, can you take a look at my change? I think it is fairly > straightforward and does not change the semantics, just skips the factory > closures for certain messages. > > IMHO 2.5% isn't too much especially because it isn't actual for all > workloads (I didn't get any significant drops during benchmarking). So > I prefer the runtime generation in later releases. > > On Mon, Sep 14, 2020 at 12:41 PM Alexey Goncharuk > <alexey.goncha...@gmail.com> wrote: > > > > Alexey, Andrey, Igniters, > > > > So what do you think? Should we proceed with a 'hacked' version of the > message factory in 2.9 and go for the runtime message generation in later > release, or keep the code clean and fix the regression in the next releases? > > Andrey, can you take a look at my change? I think it is fairly > straightforward and does not change the semantics, just skips the factory > closures for certain messages. > > > > Personally, I would prefer fixing the regression given that we also > introduced tracing in this release. > > > > > > > > пт, 11 сент. 2020 г. в 12:09, Alex Plehanov <plehanov.a...@gmail.com>: > >> > >> Alexey, > >> > >> We've benchmarked by yardstick commits 130376741bf vs ed52559eb95 and > the performance of ed52559eb95 is better for about 2.5% on average on our > environment (about the same results we got benchmarking 65c30ec6 vs > 0606f03d). We've made 24 runs for each commit of > IgnitePutTxImplicitBenchmark (we got maximum drop for 2.9 on this > benchmark), 200 seconds warmup, 300 seconds benchmark, 6 servers, 5 clients > 50 threads each. Yardstick results for this configuration: > >> Commit 130376741bf: avg TPS=164096, avg latency=9173464 ns > >> Commit ed52559eb95: avg TPS=168283, avg latency=8945908 ns > >> > >> пт, 11 сент. 2020 г. в 09:51, Artem Budnikov < > a.budnikov.ign...@gmail.com>: > >>> > >>> Hi Everyone, > >>> > >>> I posted an instruction on how to publish the docs on > ignite.apache.org/docs [1]. When you finish with Ignite 2.9, you can > update the docs by following the instruction. Unfortunately, I won't be > able to spend any time on this project any longer. You can send your pull > requests and questions about the documentation to Denis Magda. > >>> > >>> -Artem > >>> > >>> [1] : > https://cwiki.apache.org/confluence/display/IGNITE/How+to+Document > >>> > >>> On Thu, Sep 10, 2020 at 2:45 PM Alexey Goncharuk < > alexey.goncha...@gmail.com> wrote: > >>>> > >>>> Alexey, > >>>> > >>>> I've tried to play with message factories locally, but unfortunately, > I > >>>> cannot spot the difference between old and new implementation in > >>>> distributed benchmarks. I pushed an implementation of > MessageFactoryImpl > >>>> with the old switch statement to the ignite-2.9-revert-12568 branch > >>>> (discussed this with Andrey Gura, the change should be compatible > with the > >>>> new metrics as we still use the register() mechanics). > >>>> > >>>> Can you check if this change makes any difference performance-wise in > your > >>>> environment? If yes, we can go with runtime code generation in the > long > >>>> term: register classes and generate a dynamic message factory with a > switch > >>>> statement once all messages are registered (not in 2.9 though, > obviously). > >>>> > >>>> ср, 9 сент. 2020 г. в 14:53, Alex Plehanov <plehanov.a...@gmail.com>: > >>>> > >>>> > Hello guys, > >>>> > > >>>> > I've tried to optimize tracing implementation (ticket [1]), it > reduced the > >>>> > drop, but not completely removed it. > >>>> > Ivan Rakov, Alexander Lapin, can you please review the patch? > >>>> > Ivan Artiukhov, can you please benchmark the patch [2] against 2.8.1 > >>>> > release on your environment? > >>>> > With this patch on our environment, it's about a 3% drop left, it's > close > >>>> > to measurement error and I think such a drop is not a showstopper. > Guys, > >>>> > WDYT? > >>>> > > >>>> > Also, I found that compatibility is broken for JDBC thin driver > between 2.8 > >>>> > and 2.9 versions (ticket [3]). I think it's a blocker and should be > >>>> > fixed in 2.9. I've prepared the patch. > >>>> > Taras Ledkov, can you please review this patch? > >>>> > > >>>> > And one more ticket I propose to include into 2.9 [4] (NIO message > >>>> > send problem in some circumstances). I will cherry-pick it if there > is no > >>>> > objection. > >>>> > > >>>> > [1] https://issues.apache.org/jira/browse/IGNITE-13411 > >>>> > [2] https://github.com/apache/ignite/pull/8223 > >>>> > [3] https://issues.apache.org/jira/browse/IGNITE-13414 > >>>> > [4] https://issues.apache.org/jira/browse/IGNITE-13361 > >>>> > > >>>> > пн, 7 сент. 2020 г. в 14:14, Maxim Muzafarov <mmu...@apache.org>: > >>>> > > >>>> > > Alexey, > >>>> > > > >>>> > > I propose to include [1] issue to the 2.9 release. Since this > issue is > >>>> > > related to the new master key change functionality which haven't > been > >>>> > > released yet I think it will be safe to cherry-pick commit to the > >>>> > > release branch. > >>>> > > > >>>> > > [1] https://issues.apache.org/jira/browse/IGNITE-13390 > >>>> > > > >>>> > > On Tue, 1 Sep 2020 at 12:13, Nikolay Izhikov <nizhi...@apache.org > > > >>>> > wrote: > >>>> > > > > >>>> > > > Hello, Igniters. > >>>> > > > > >>>> > > > Alexey, please, include one more Python thin client fix [1] > into the > >>>> > 2.9 > >>>> > > release > >>>> > > > It fixes kinda major issue - "Python client returns fields in > wrong > >>>> > > order since the 2 row when fields_count>10" > >>>> > > > > >>>> > > > [1] https://issues.apache.org/jira/browse/IGNITE-12809 > >>>> > > > [2] > >>>> > > > >>>> > > https://github.com/apache/ignite/commit/38025ee4167f05eaa2d6a2c5c2ab70c83a462cfc > >>>> > > > > >>>> > > > > 31 авг. 2020 г., в 19:23, Alexey Goncharuk < > >>>> > alexey.goncha...@gmail.com> > >>>> > > написал(а): > >>>> > > > > > >>>> > > > > Alexey, thanks, got it. I am not sure we can optimize > anything out of > >>>> > > the > >>>> > > > > message factory with suppliers (at least I have no ideas > right now), > >>>> > so > >>>> > > > > most likely the only move here is to switch back to the switch > >>>> > approach > >>>> > > > > somehow preserving the metrics part. Probably, inlining the > Ignite > >>>> > > messages > >>>> > > > > to the IgniteMessageFactoryImpl should do the trick. Let me > explore > >>>> > the > >>>> > > > > code a bit. > >>>> > > > > > >>>> > > > > P.S. I am surprised by the impact this part makes for the > >>>> > performance. > >>>> > > > > Message creation is indeed on the hot path, but a single > virtual call > >>>> > > > > should not make that much of a difference given the amount of > other > >>>> > > work > >>>> > > > > happening during the message processing. > >>>> > > > > > >>>> > > > > пн, 31 авг. 2020 г. в 18:33, Alex Plehanov < > plehanov.a...@gmail.com > >>>> > >: > >>>> > > > > > >>>> > > > >> Alexey, sorry, I wrongly interpreted our benchmark results. > >>>> > Actually, > >>>> > > we > >>>> > > > >> were looking for a drop using bi-sect in the range between > e6a7f93 > >>>> > > (first > >>>> > > > >> commit in the 2.9 branch after 2.8 branch cut) and 6592dfa5 > (last > >>>> > > commit in > >>>> > > > >> the 2.9 branch). And we found these two problematic commits. > >>>> > > > >> > >>>> > > > >> Perhaps only IGNITE-13060 (Tracing) is responsible for a drop > >>>> > between > >>>> > > > >> 2.8.1 and 2.9 (we have benchmarked 2.8.1 vs 2.9 with reverted > >>>> > > IGNITE-13060 > >>>> > > > >> now and performance looks the same) > >>>> > > > >> > >>>> > > > >> Ticket IGNITE-12568 (MessageFactory refactoring) is not > related to > >>>> > > drop > >>>> > > > >> between 2.8.1 and 2.9, but still has some performance > problem, and > >>>> > we > >>>> > > can > >>>> > > > >> win back IGNITE-13060 drop by this ticket. > >>>> > > > >> > >>>> > > > >> Do we need more investigation on IGNITE-13060 or we leave it > as is? > >>>> > > > >> > >>>> > > > >> What should we do with IGNITE-12568 (MessageFactory > refactoring)? > >>>> > > > >> > >>>> > > > >> пн, 31 авг. 2020 г. в 13:25, Alexey Goncharuk < > >>>> > > alexey.goncha...@gmail.com > >>>> > > > >>> : > >>>> > > > >> > >>>> > > > >>> Alexey, > >>>> > > > >>> > >>>> > > > >>> While investigating, I found that IGNITE-12568 has an > incorrect fix > >>>> > > > >>> version and is actually present in ignite-2.8.1 branch [1], > so it > >>>> > > cannot be > >>>> > > > >>> the source of the drop against 2.8.1. > >>>> > > > >>> > >>>> > > > >>> P.S. Looks like we need to enforce a more accurate work > with fix > >>>> > > versions > >>>> > > > >>> or develop some sort of tooling to verify the fix versions. > >>>> > > > >>> > >>>> > > > >>> --AG > >>>> > > > >>> > >>>> > > > >>> [1] > >>>> > > > >>> > >>>> > > > >>>> > > https://github.com/apache/ignite/commit/3e492bd23851856bbd0385c6a419892d0bba2a34 > >>>> > > > >>> > >>>> > > > >>> пн, 31 авг. 2020 г. в 12:42, Alexey Goncharuk < > >>>> > > alexey.goncha...@gmail.com > >>>> > > > >>>> : > >>>> > > > >>> > >>>> > > > >>>> пт, 28 авг. 2020 г. в 11:16, Alex Plehanov < > >>>> > plehanov.a...@gmail.com > >>>> > > >: > >>>> > > > >>>> > >>>> > > > >>>>> Guys, > >>>> > > > >>>>> > >>>> > > > >>>>> We have benchmarked 2.9 without IGNITE-13060 and > IGNITE-12568 > >>>> > > (reverted > >>>> > > > >>>>> it > >>>> > > > >>>>> locally) and got the same performance as on 2.8.1 > >>>> > > > >>>>> > >>>> > > > >>>>> IGNITE-13060 (Tracing) - some code was added to hot > paths, to > >>>> > trace > >>>> > > > >>>>> these > >>>> > > > >>>>> hot paths, it's clear why we have performance drop here. > >>>> > > > >>>>> > >>>> > > > >>>>> IGNITE-12568 (MessageFactory refactoring) - switch/case > block was > >>>> > > > >>>>> refactored to an array of message suppliers. The message > factory > >>>> > > is on > >>>> > > > >>>>> the > >>>> > > > >>>>> hot path, which explains why this commit has an impact on > total > >>>> > > > >>>>> performance. > >>>> > > > >>>>> I've checked JIT assembly output, done some JMH > microbenchmarks, > >>>> > > and > >>>> > > > >>>>> found > >>>> > > > >>>>> that old implementation of MessageFactory.create() about > 30-35% > >>>> > > faster > >>>> > > > >>>>> than > >>>> > > > >>>>> the new one. The reason - approach with switch/case can > >>>> > effectively > >>>> > > > >>>>> inline > >>>> > > > >>>>> message creation code, but with an array of suppliers > relatively > >>>> > > heavy > >>>> > > > >>>>> "invokeinterface" cannot be skipped. I've tried to > rewrite the > >>>> > code > >>>> > > > >>>>> using > >>>> > > > >>>>> an abstract class for suppliers instead of an interface > (to > >>>> > > > >>>>> replace "invokeinterface" with the "invokevirtual"), but > it gives > >>>> > > back > >>>> > > > >>>>> only > >>>> > > > >>>>> 10% of method performance and in this case, code looks > ugly > >>>> > > (lambdas > >>>> > > > >>>>> can't > >>>> > > > >>>>> be used). Currently, I can't find any more ways to > optimize the > >>>> > > current > >>>> > > > >>>>> approach (except return to the switch/case block). Andrey > Gura, > >>>> > as > >>>> > > the > >>>> > > > >>>>> author of IGNITE-12568, maybe you have some ideas about > >>>> > > optimization? > >>>> > > > >>>>> > >>>> > > > >>>>> Perhaps we should revert IGNITE-12568, but there are some > metrics > >>>> > > > >>>>> already > >>>> > > > >>>>> created, which can't be rewritten using old message > factory > >>>> > > > >>>>> implementation > >>>> > > > >>>>> (IGNITE-12756). Guys, WDYT? > >>>> > > > >>>>> > >>>> > > > >>>> > >>>> > > > >>>> Alexey, > >>>> > > > >>>> > >>>> > > > >>>> I see that IGNITE-12756 (metrics improvements) is already > released > >>>> > > in > >>>> > > > >>>> Ignite 2.8.1 while IGNITE-12568 (message factory) is only > present > >>>> > > in Ignite > >>>> > > > >>>> 2.9. Let's revert both IGNITE-12568 and whichever new > metrics > >>>> > > created for > >>>> > > > >>>> 2.9 that depend on the new message factory to unblock the > release > >>>> > > and deal > >>>> > > > >>>> with the optimizations in 2.10? > >>>> > > > >>>> > >>>> > > > >>> > >>>> > > > > >>>> > > > >>>> > >