Denis, For a reproducer, please see CacheAsyncContinuationExecutorTest.java in the linked PoC [1]
[1] https://github.com/apache/ignite/pull/8870/files#diff-c788c12013622093df07d8f628a6e8c5fb0c15007f0787f2d459dbb3e377fc5aR54 On Thu, Mar 18, 2021 at 1:58 AM Raymond Wilson <raymond_wil...@trimble.com> wrote: > We implemented the ContinueWith() suggestion from Pavel, and it works well > so far in testing, though we do not have data to support if there is or is > not a performance penalty in our use case.. > > To lend another vote to the 'Don't do continuations on the striped thread > pool' line of thinking: Deadlocking is an issue as is starvation. In some > ways starvation is more insidious because by the time things stop working > the cause and effect distance may be large. > > I appreciate the documentation does make statements about not performing > cache operations in a continuation due to deadlock possibilities, but that > statement does not reveal why this is an issue. It is less a case of a > async cache operation being followed by some other cache operation (an > immediate issue), and more a general case of the continuation of > application logic using a striped pool thread in a way that might mean that > thread is never given back - it's now just a piece of the application > infrastructure until some other application activity schedules away from > that thread (eg: by ContinueWith or some other async operation in the > application code that releases the thread). To be fair, beyond structures > like ContinueWith(), it is not obvious how that continuation thread should > be handed back. This will be the same behaviour for dedicated continuation > pools, but with far less risk in the absence of ContinueWith() constructs. > > In the .Net world this is becoming more of an issue as fewer .Net use cases > outside of UI bother with synchronization contexts by default. > > Raymond. > > > On Thu, Mar 18, 2021 at 9:56 AM Valentin Kulichenko < > valentin.kuliche...@gmail.com> wrote: > > > Hi Denis, > > > > I think Pavel's main point is that behavior is unpredictable. For > example, > > AFAIK, putAsync can be executed in the main thread instead of the striped > > pool thread if the operation is local. The listener can also be executed > in > > the main thread - this happens if the future is completed prior to > listener > > invocation (this is actually quite possible in the unit test environment > > causing the test to pass). Finally, I'm pretty sure there are many cases > > when a deadlock does not occur right away, but instead it will reveal > > itself under high load due to thread starvation. The latter type of > issues > > is the most dangerous because they are often reproduced only in > production. > > Finally, there are performance considerations as well - cache operations > > and listeners share the same fixed-sized pool which can have negative > > effects. > > > > I'm OK with the change. Although, it might be better to introduce a new > > fixed-sized pool instead of ForkJoinPool for listeners, simply because > this > > is the approach taken throughout the project. But this is up to a debate. > > > > -Val > > > > On Wed, Mar 17, 2021 at 11:31 AM Denis Garus <garus....@gmail.com> > wrote: > > > > > Pavel, > > > I tried this: > > > > > > @Test > > > public void test() throws Exception { > > > IgniteCache<Integer, String> cache = > > > startGrid().getOrCreateCache("test_cache"); > > > > > > cache.putAsync(1, "one").listen(f -> cache.replace(1, "two")); > > > > > > assertEquals("two", cache.get(1)); > > > } > > > > > > and this test is green. > > > I believe that an user can make listener that leads to deadlock, but > > > the example in the IEP does not reflect this. > > > > > > > > > > > > ср, 17 мар. 2021 г. в 17:36, Вячеслав Коптилин < > slava.kopti...@gmail.com > > >: > > > > > > > Hi Pavel, > > > > > > > > > Not a good excuse really. We have a usability problem, you have to > > > admit > > > > it. > > > > Fair enough. I agree that this is a usability issue, but I have > doubts > > > that > > > > the proposed approach to overcome it is the best one. > > > > > > > > > Documentation won't help - no one is going to read the Javadoc for > a > > > > trivial method like putAsync > > > > That is sad... However, I don't think that this is a strong argument > > > here. > > > > > > > > This is just my opinion. Let's see what other community members have > to > > > > say. > > > > > > > > Thanks, > > > > S. > > > > > > > > > > > > ср, 17 мар. 2021 г. в 17:01, Pavel Tupitsyn <ptupit...@apache.org>: > > > > > > > > > > the user should use the right API > > > > > > > > > > Not a good excuse really. We have a usability problem, you have to > > > admit > > > > > it. > > > > > "The brakes did not work on your car - too bad, you should have > known > > > > that > > > > > on Sundays only your left foot is allowed on the pedal" > > > > > > > > > > This particular use case is too intricate. > > > > > Even when you know about that, it is difficult to decide what can > run > > > on > > > > > the striped pool, > > > > > and what can't. It is too easy to forget. > > > > > And most people don't know, even among Ignite developers. > > > > > > > > > > Documentation won't help - no one is going to read the Javadoc for > a > > > > > trivial method like putAsync. > > > > > > > > > > > > > > > So I propose to have a safe default. > > > > > Then document the performance tuning opportunity on [1]. > > > > > > > > > > Think about how many users abandon a product because it > mysteriously > > > > > crashes and hangs. > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > https://ignite.apache.org/docs/latest/perf-and-troubleshooting/general-perf-tips > > > > > > > > > > > > > > > On Wed, Mar 17, 2021 at 4:21 PM Вячеслав Коптилин < > > > > > slava.kopti...@gmail.com> > > > > > wrote: > > > > > > > > > > > Hi Pavel, > > > > > > > > > > > > Well, I think that the user should use the right API instead of > > > > > introducing > > > > > > uncontested overhead for everyone. > > > > > > For instance, the code that is provided by IEP can changed as > > > follows: > > > > > > > > > > > > IgniteFuture fut = cache.putAsync(1, 1); > > > > > > fut.listenAync(f -> { > > > > > > // Executes on Striped pool and deadlocks. > > > > > > cache.replace(1, 2); > > > > > > }, ForkJoinPool.commonPool()); > > > > > > > > > > > > Of course, it does not mean that this fact should not be properly > > > > > > documented. > > > > > > Perhaps, I am missing something. > > > > > > > > > > > > Thanks, > > > > > > S. > > > > > > > > > > > > ср, 17 мар. 2021 г. в 16:01, Pavel Tupitsyn < > ptupit...@apache.org > > >: > > > > > > > > > > > > > Slava, > > > > > > > > > > > > > > Your suggestion is to keep things as is and discard the IEP, > > right? > > > > > > > > > > > > > > > this can lead to significant overhead > > > > > > > Yes, there is some overhead, but the cost of accidentally > > starving > > > > the > > > > > > > striped pool is worse, > > > > > > > not to mention the deadlocks. > > > > > > > > > > > > > > I believe that we should favor correctness over performance in > > any > > > > > case. > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 17, 2021 at 3:34 PM Вячеслав Коптилин < > > > > > > > slava.kopti...@gmail.com> > > > > > > > wrote: > > > > > > > > > > > > > > > Well, the specified method already exists :) > > > > > > > > > > > > > > > > /** > > > > > > > > * Registers listener closure to be asynchronously > notified > > > > > > whenever > > > > > > > > future completes. > > > > > > > > * Closure will be processed in specified executor. > > > > > > > > * > > > > > > > > * @param lsnr Listener closure to register. Cannot be > > {@code > > > > > > null}. > > > > > > > > * @param exec Executor to run listener. Cannot be {@code > > > > null}. > > > > > > > > */ > > > > > > > > public void listenAsync(IgniteInClosure<? super > > > > IgniteFuture<V>> > > > > > > > lsnr, > > > > > > > > Executor exec); > > > > > > > > > > > > > > > > Thanks, > > > > > > > > S. > > > > > > > > > > > > > > > > ср, 17 мар. 2021 г. в 15:25, Вячеслав Коптилин < > > > > > > slava.kopti...@gmail.com > > > > > > > >: > > > > > > > > > > > > > > > > > Hello Pavel, > > > > > > > > > > > > > > > > > > I took a look at your IEP and pool request. I have the > > > following > > > > > > > > concerns. > > > > > > > > > First of all, this change breaks the contract of > > > > > > > > IgniteFuture#listen(lsnr) > > > > > > > > > > > > > > > > > > /** > > > > > > > > > * Registers listener closure to be asynchronously > > notified > > > > > > > whenever > > > > > > > > > future completes. > > > > > > > > > * Closure will be processed in thread that completes > > this > > > > > future > > > > > > > or > > > > > > > > > (if future already > > > > > > > > > * completed) immediately in current thread. > > > > > > > > > * > > > > > > > > > * @param lsnr Listener closure to register. Cannot be > > > {@code > > > > > > > null}. > > > > > > > > > */ > > > > > > > > > public void listen(IgniteInClosure<? super > > IgniteFuture<V>> > > > > > > lsnr); > > > > > > > > > > > > > > > > > > In your pull request, the listener is always called > from > > a > > > > > > > specified > > > > > > > > > thread pool (which is fork-join by default) > > > > > > > > > even though the future is already completed at the > moment > > > the > > > > > > > listen > > > > > > > > > method is called. > > > > > > > > > In my opinion, this can lead to significant overhead - > > > > > submission > > > > > > > > > requires acquiring a lock and notifying a pool thread. > > > > > > > > > > > > > > > > > > It seems to me, that we should not change the current > > > > behavior. > > > > > > > > > However, thread pool executor can be added as an optional > > > > parameter > > > > > > of > > > > > > > > > listen() method as follows: > > > > > > > > > > > > > > > > > > public void listen(IgniteInClosure<? super > > > > IgniteFuture<V>> > > > > > > > lsnr, > > > > > > > > > Executor exec); > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > S. > > > > > > > > > > > > > > > > > > пн, 15 мар. 2021 г. в 19:24, Pavel Tupitsyn < > > > > ptupit...@apache.org > > > > > >: > > > > > > > > > > > > > > > > > >> Igniters, > > > > > > > > >> > > > > > > > > >> Please review the IEP [1] and let me know your thoughts. > > > > > > > > >> > > > > > > > > >> [1] > > > > > > > > >> > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-70%3A+Async+Continuation+Executor > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > <http://www.trimble.com/> > Raymond Wilson > Solution Architect, Civil Construction Software Systems (CCSS) > 11 Birmingham Drive | Christchurch, New Zealand > raymond_wil...@trimble.com > > < > https://worksos.trimble.com/?utm_source=Trimble&utm_medium=emailsign&utm_campaign=Launch > > >