This looks like a mistake. However, we are going to drop IGFS, so a fix is
unlikely.

The recommended practical approach is to increase the number of threads in
the system thread pool to a large value.
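
To illustrate why a larger pool works around the hang, here is a minimal
sketch in plain java.util.concurrent, not Ignite code. It assumes the
pattern described in the quoted message below: each outer task occupies a
pool thread and then waits on an inner task submitted to the same pool.
The class and method names (PoolStarvationDemo, completes) are hypothetical.
In Ignite itself, the corresponding setting should be
IgniteConfiguration.setSystemThreadPoolSize(int).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class PoolStarvationDemo {
    /**
     * Runs `tasks` outer tasks on a fixed pool of `poolSize` threads. Each outer
     * task submits an inner task to the same pool and blocks on its result.
     * Returns true if every outer task finishes within `timeoutMs`, false otherwise.
     */
    static boolean completes(int poolSize, int tasks, long timeoutMs) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        // Released once as many outer tasks are running as the pool can hold at
        // once, so the outcome does not depend on thread scheduling.
        CountDownLatch occupied = new CountDownLatch(Math.min(poolSize, tasks));
        try {
            List<Future<String>> outer = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final int id = i;
                outer.add(pool.submit(() -> {
                    occupied.countDown();
                    occupied.await();
                    // Stand-in for the file write: needs another thread of the same pool.
                    Future<String> inner = pool.submit(() -> "written-" + id);
                    return inner.get();
                }));
            }
            for (Future<String> f : outer) {
                try {
                    f.get(timeoutMs, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    return false; // an outer task is stuck waiting for a free thread
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
            return true;
        } finally {
            pool.shutdownNow(); // interrupt any blocked workers so the JVM can exit
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("pool=2, tasks=2: completes=" + completes(2, 2, 500));
        System.out.println("pool=4, tasks=2: completes=" + completes(4, 2, 2000));
    }
}
```

Under these assumptions the first call should stall (both threads are held
by outer tasks, so the inner tasks never run) and the second should finish.
A larger pool only raises the number of concurrent tasks needed to hit the
problem; it does not remove the dependency between threads of one pool.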

Ilya Kasnacheev

Tue, 27 Aug 2019 at 00:34, Chris Software <softwarechri...@gmail.com>:

> Hello,
> I am working on a project and we have run into two related problems while
> doing MapReduce on the Ignite File System (IGFS) cache.
> We were originally on Ignite 2.6 but upgraded to 2.7.5 in an unsuccessful
> bid to resolve the problem.
> We have a deadlock in our map-reduce process and have reproduced it at
> https://github.com/csteppp/ignite-deadlock-issue in
> https://github.com/csteppp/ignite-deadlock-issue/blob/master/src/test/java/testignite/MapReduceIgniteTest.java.
> Basically, if you run the test (mvn test) it will deadlock and hang.  We
> have two IgfsTasks created and have set the SYS threadpool to size 2 for
> demonstration purposes.  Each IgfsTask sleeps and then writes to a file.
> This causes a deadlock because:
> 1.  The IgfsTask is run in the SYS pool.
> 2.  The Igfs write action uses a separate thread in the SYS pool
> 3.  Then if there are no empty threads available, the whole system hangs.
> First, shouldn't executeAsync execute the task in the PUBLIC pool?  Using
> the SYS pool seems unnecessarily risky, as we found it actually *locks up
> an entire cluster of many Ignite nodes* when it deadlocks.  How do I get
> it to use the PUBLIC pool?  Also, since it is using the SYS pool, it
> actually seems to execute this on the client.  This is not obvious in this
> test, but in my real cluster of 30 nodes, the client seems to be doing this
> work, which is a problem.
> Second, is it bad form to open a file within a map-reduce?  Even using the
> public pool will not solve the inherent deadlock here--that one thread is
> depending on another thread in the same thread pool.  That's an inherent
> risk.  In our real process we open the file because we are performing file
> transformations in the IgfsTask, and then writing the results out to temp
> files in the cluster.  In the end, we collate all the temp files.  Is there
> a better approach, or a safe way to open a file and write to it from within
> a reduce?
> Thank you for your time!
> Chris
