Re: Question about rebuilding caches, timeouts etc.

Ognen Duzlevski Tue, 12 May 2015 05:07:25 -0700

Dmitriy, nope - it is not in the spam and I still have not seen it :)

On Tue, May 12, 2015 at 2:52 AM, Dmitriy Setrakyan <[email protected]> wrote:


> Ognen,
>
> I see your post on the user list:
>
> http://apache-ignite-users.70518.x6.nabble.com/Node-failing-with-weird-errors-td301.html
>
> I also got an email. Are you sure it didn't end up in your spam folder?
>
> D.
>
> On Tue, May 12, 2015 at 12:21 AM, Ognen Duzlevski <[email protected]> wrote:
>
> > Hello all,
> >
> > I tried posting to the user list but I am still not seeing the email after
> > an hour.
> >
> > I have a 5 node cluster where I "lost" a node temporarily (Amazon reported
> > a hardware error so my Ops guys shut down the instance and brought a new
> > one back up).
> >
> > I ran the same ignite.sh configuration on the new node, expecting it to
> > join the cluster - however, I am seeing the following in the logs (see
> > below). In addition, I cannot access the caches anymore from my code -
> > connecting to a cache via getOrCreateCache() just hangs and eventually
> > times out. The cluster still has 4 members so I am not quite sure what is
> > going on. To add to this - I can cache -scan the caches from visor and all
> > the information is still there, however, inaccessible from my code (with
> > client mode on or off, doesn't matter). I am baffled.
> >
> > Thanks!
> > Ognen
> >
> >
> >
> > [23:10:38,923][WARNING][grid-timeout-worker-#33%null%][GridDhtPartitionsExchangeFuture]
> > Retrying preload partition exchange due to timeout [done=false,
> > dummy=false, exchId=GridDhtPartitionExchangeId
> > [topVer=AffinityTopologyVersion [topVer=1848, minorTopVer=0],
> > nodeId=9e648fd3, evt=NODE_JOINED], rcvdIds=[], rmtIds=[e5b581b3, bd33def3,
> > f7cc4da6, 8eda3172, efef2202], remaining=[e5b581b3, bd33def3, f7cc4da6,
> > 8eda3172, efef2202], init=true, initFut=true, ready=true, replied=false,
> > added=true, oldest=e5b581b3, oldestOrder=1, evtLatch=0, locNodeOrder=1848,
> > locNodeId=9e648fd3-7c84-4261-93e0-916275a0a9ae]
> > [23:10:53,699][WARNING][main][GridCachePartitionExchangeManager] Failed to
> > wait for initial partition map exchange. Possible reasons are:
> >   ^-- Transactions in deadlock.
> >   ^-- Long running transactions (ignore if this is the case).
> >   ^-- Unreleased explicit locks.
> >
> >
> > [23:10:53,926][WARNING][grid-timeout-worker-#33%null%][GridDhtPartitionsExchangeFuture]
> > Retrying preload partition exchange due to timeout [done=false,
> > dummy=false, exchId=GridDhtPartitionExchangeId
> > [topVer=AffinityTopologyVersion [topVer=1848, minorTopVer=0],
> > nodeId=9e648fd3, evt=NODE_JOINED], rcvdIds=[], rmtIds=[e5b581b3, bd33def3,
> > f7cc4da6, 8eda3172, efef2202], remaining=[e5b581b3, bd33def3, f7cc4da6,
> > 8eda3172, efef2202], init=true, initFut=true, ready=true, replied=false,
> > added=true, oldest=e5b581b3, oldestOrder=1, evtLatch=0, locNodeOrder=1848,
> > locNodeId=9e648fd3-7c84-4261-93e0-916275a0a9ae]
> >
> >
> > [23:11:08,929][WARNING][grid-timeout-worker-#33%null%][GridDhtPartitionsExchangeFuture]
> > Retrying preload partition exchange due to timeout [done=false,
> > dummy=false, exchId=GridDhtPartitionExchangeId
> > [topVer=AffinityTopologyVersion [topVer=1848, minorTopVer=0],
> > nodeId=9e648fd3, evt=NODE_JOINED], rcvdIds=[], rmtIds=[e5b581b3, bd33def3,
> > f7cc4da6, 8eda3172, efef2202], remaining=[e5b581b3, bd33def3, f7cc4da6,
> > 8eda3172, efef2202], init=true, initFut=true, ready=true, replied=false,
> > added=true, oldest=e5b581b3, oldestOrder=1, evtLatch=0, locNodeOrder=1848,
> > locNodeId=9e648fd3-7c84-4261-93e0-916275a0a9ae]
> > [....]
> > [repeated many, many times]
> > [....]
> >
>
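
For anyone reading the repeated warnings above, here is a minimal sketch (not part of Ignite, just a log-reading aid) that pulls the ID lists out of one "Retrying preload partition exchange" line. The helper names id_list and exchange_summary are made up for illustration, and reading rcvdIds as "nodes that have replied" and remaining as "nodes still being waited on" is an assumption based only on the field names in the log.

```python
import re

# One warning copied from the log above, joined back onto a single line.
LOG_LINE = (
    "[23:10:38,923][WARNING][grid-timeout-worker-#33%null%]"
    "[GridDhtPartitionsExchangeFuture] Retrying preload partition exchange "
    "due to timeout [done=false, dummy=false, "
    "exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion "
    "[topVer=1848, minorTopVer=0], nodeId=9e648fd3, evt=NODE_JOINED], "
    "rcvdIds=[], rmtIds=[e5b581b3, bd33def3, f7cc4da6, 8eda3172, efef2202], "
    "remaining=[e5b581b3, bd33def3, f7cc4da6, 8eda3172, efef2202], "
    "init=true, initFut=true, ready=true, replied=false, added=true, "
    "oldest=e5b581b3, oldestOrder=1, evtLatch=0, locNodeOrder=1848, "
    "locNodeId=9e648fd3-7c84-4261-93e0-916275a0a9ae]"
)


def id_list(line, field):
    """Return the comma-separated IDs inside `field=[...]`, or [] if absent."""
    match = re.search(r"\b" + re.escape(field) + r"=\[([^\]]*)\]", line)
    if match is None:
        return []
    return [item.strip() for item in match.group(1).split(",") if item.strip()]


def exchange_summary(line):
    """Collect the three ID lists of one exchange-timeout warning (hypothetical helper)."""
    return {name: id_list(line, name) for name in ("rcvdIds", "rmtIds", "remaining")}


if __name__ == "__main__":
    summary = exchange_summary(LOG_LINE)
    print("replied:  ", summary["rcvdIds"])
    print("remote:   ", summary["rmtIds"])
    print("remaining:", summary["remaining"])
```

Run against the line above, the rcvdIds list comes back empty while rmtIds and remaining hold the same five IDs, so under the stated assumption none of the remote nodes had answered the joining node's exchange request by the time of each retry.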
