Re: Failed to wait for initial partition map exchange

2016-08-01 Thread Alexey Goncharuk
The ticket is created: https://issues.apache.org/jira/browse/IGNITE-3616
2016-07-15 1:51 GMT+03:00 Alexey Goncharuk:
> Alexey, I like the idea in general, but killing non-responsive nodes seems
>> a bit drastic to me. How about this approach:
>>
>> - print out

Re: Failed to wait for initial partition map exchange

2016-07-14 Thread Alexey Goncharuk
>
> Alexey, I like the idea in general, but killing non-responsive nodes seems
> a bit drastic to me. How about this approach:
>
> - print out IDs/IPs of non-responsive nodes at all times
> - introduce a certain kill timeout for non-responsive nodes (-1 means
> disabled)
> - the timeout should be
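To make the quoted proposal concrete, here is a minimal sketch of the first two bullets: always being able to list non-responsive nodes, and a kill timeout where -1 means disabled. The class NonResponsiveNodeTracker, its method names, and the millisecond unit are all assumptions made for illustration; none of them come from Ignite or from IGNITE-3616, and the sketch tracks node IDs only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of the Ignite API. */
class NonResponsiveNodeTracker {
    /** Kill timeout in milliseconds (assumed unit); a negative value such as -1 disables dropping. */
    private final long killTimeoutMs;

    /** Node ID -> timestamp (ms) at which the node was first seen as non-responsive. */
    private final Map<String, Long> waitingSince = new LinkedHashMap<>();

    NonResponsiveNodeTracker(long killTimeoutMs) {
        this.killTimeoutMs = killTimeoutMs;
    }

    /** Records a node as non-responsive; the first recorded timestamp is kept. */
    void markNonResponsive(String nodeId, long nowMs) {
        waitingSince.putIfAbsent(nodeId, nowMs);
    }

    /** Clears a node once it has responded. */
    void markResponded(String nodeId) {
        waitingSince.remove(nodeId);
    }

    /** Nodes to report in the log at all times, regardless of the timeout setting. */
    List<String> nonResponsiveNodes() {
        return new ArrayList<>(waitingSince.keySet());
    }

    /** Nodes whose wait has reached the kill timeout; always empty when the timeout is disabled. */
    List<String> nodesToDrop(long nowMs) {
        List<String> result = new ArrayList<>();
        if (killTimeoutMs < 0)
            return result; // -1 means disabled: report only, never drop.
        for (Map.Entry<String, Long> e : waitingSince.entrySet()) {
            if (nowMs - e.getValue() >= killTimeoutMs)
                result.add(e.getKey());
        }
        return result;
    }
}
```

With a timeout of 1000 ms, a node marked at time 0 is returned by nodesToDrop(1200), while a node marked at time 600 is only reported. With -1, nodes are reported but never dropped.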

Re: Failed to wait for initial partition map exchange

2016-07-14 Thread Dmitriy Setrakyan
On Fri, Jul 15, 2016 at 12:02 AM, Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:
> This is a cross-post from the user list.
>
> We have faced this issue many times before, and many users have
> complained about the whole cluster freezing. We can protect a cluster from
> such a situation

Re: Failed to wait for initial partition map exchange

2016-07-14 Thread Alexey Goncharuk
This is a cross-post from the user list. We have faced this issue many times before, and many users have complained about the whole cluster freezing. We can protect a cluster from such a situation simply by dropping non-responsive nodes from the cluster. Of course, we need to get to the bottom