Hi,

First of all, why do you need to interact with the caches through JNI? We 
could probably recommend another approach that is simpler and safer.

Second, it’s hard to tell why ignite1 gets into a deadlock without the 
following:
- logs from all the nodes;
- thread dumps from all the nodes;
- the configuration you use.
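
For the thread dumps, running `jstack <pid>` from the JDK against each node's 
process is usually enough. If it is easier to capture them from inside the 
JVM, below is a minimal sketch that uses only the standard 
java.lang.management API. The class name ThreadDumps is a placeholder and 
nothing in it is Ignite-specific. Note that it only reports JVM-level lock 
deadlocks, so a stuck cache transaction may not show up in the last line.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Arrays;

public class ThreadDumps {
    /** Returns a text dump of all live threads in the current JVM. */
    public static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();

        // Ask for lock details only when this JVM is able to report them.
        ThreadInfo[] infos = mx.dumpAllThreads(
            mx.isObjectMonitorUsageSupported(),
            mx.isSynchronizerUsageSupported());

        for (ThreadInfo ti : infos) {
            sb.append('"').append(ti.getThreadName()).append('"')
              .append(" id=").append(ti.getThreadId())
              .append(" state=").append(ti.getThreadState());
            if (ti.getLockName() != null)
                sb.append(" waiting on ").append(ti.getLockName());
            if (ti.getLockOwnerName() != null)
                sb.append(" owned by \"").append(ti.getLockOwnerName()).append('"');
            sb.append('\n');

            // Print every frame; ThreadInfo.toString() truncates the stack.
            for (StackTraceElement e : ti.getStackTrace())
                sb.append("    at ").append(e).append('\n');
            sb.append('\n');
        }

        // Both calls return null when no deadlocked threads are found.
        long[] deadlocked = mx.isSynchronizerUsageSupported()
            ? mx.findDeadlockedThreads()
            : mx.findMonitorDeadlockedThreads();
        sb.append("Deadlocked thread ids: ")
          .append(deadlocked == null ? "none" : Arrays.toString(deadlocked))
          .append('\n');

        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dump());
    }
}
```

Take the dumps while the node is hanging, ideally two or three a few seconds 
apart, so it is visible which threads are not making progress.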

Please share this info if you need assistance.

Finally, I would recommend switching to Ignite 1.5, or waiting a week or so 
for Ignite 1.6.

—
Denis

> On May 15, 2016, at 3:59 PM, John <[email protected]> wrote:
> 
> 
> Hi.
> 
> I have 2 ignite instances that use IgniteCache to store some cache values.
> The cache is configured with replication on, so both instances have the same 
> data. 
> 
> Since I am running JNI code to get the cache values, it sometimes (on rare 
> occasions) crashes, which in turn kills the ignite instance. I have an 
> external script that starts the failed ignite instance as soon as it crashes.
> 
> I was expecting the non crashed ignite instance (ignite1) to quickly update 
> the crashed instance (ignite2) and both to continue working as usual. 
> 
> This was exactly what was going on for a few days, until one time, ignite2 
> has crashed, and ignite1 seems to get into a deadlock. As soon as ignite2 got 
> back up, it failed to recognize ignite1, and failed to replicate from it. Any 
> client connections to ignite instances stopped working as well.
> 
> I am seeing this error in the log:
> 
> Failed to wait for initial partition map exchange. Possible reasons are: 
>   ^-- Transactions in deadlock.
>   ^-- Long running transactions (ignore if this is the case).
>   ^-- Unreleased explicit locks.
> 
> and also:
> 
> Local node has detected failed nodes and started cluster-wide procedure. To 
> speed up failure detection please see 'Failure Detection' section under 
> javadoc for 'TcpDiscoverySpi'
> 
> 
> I am using ignite v1.4
> Any suggestions or ideas will be highly appreciated.
> 
> Thanks!
> 
