Dmitriy,

nope - it is not in the spam and I still have not seen it :)

On Tue, May 12, 2015 at 2:52 AM, Dmitriy Setrakyan <[email protected]> wrote:

> Ognen,
>
> I see your post on the user list:
>
> http://apache-ignite-users.70518.x6.nabble.com/Node-failing-with-weird-errors-td301.html
>
> I also got an email. Are you sure it didn't end up in your spam folder?
>
> D.
>
> On Tue, May 12, 2015 at 12:21 AM, Ognen Duzlevski <[email protected]> wrote:
>
> > Hello all,
> >
> > I tried posting to the user list but I am still not seeing the email
> > after an hour.
> >
> > I have a 5 node cluster where I "lost" a node temporarily (Amazon
> > reported a hardware error so my Ops guys shut down the instance and
> > brought a new one back up).
> >
> > I ran the same ignite.sh configuration on the new node, expecting it to
> > join the cluster - however, I am seeing the following in the logs (see
> > below). In addition, I cannot access the caches anymore from my code -
> > connecting to a cache via getOrCreateCache() just hangs and eventually
> > times out. The cluster still has 4 members so I am not quite sure what is
> > going on. To add to this - I can cache -scan the caches from visor and
> > all the information is still there, however, inaccessible from my code
> > (with client mode on or off, doesn't matter). I am baffled.
> >
> > Thanks!
> > Ognen
> >
> > [23:10:38,923][WARNING][grid-timeout-worker-#33%null%][GridDhtPartitionsExchangeFuture]
> > Retrying preload partition exchange due to timeout [done=false,
> > dummy=false, exchId=GridDhtPartitionExchangeId
> > [topVer=AffinityTopologyVersion [topVer=1848, minorTopVer=0],
> > nodeId=9e648fd3, evt=NODE_JOINED], rcvdIds=[], rmtIds=[e5b581b3, bd33def3,
> > f7cc4da6, 8eda3172, efef2202], remaining=[e5b581b3, bd33def3, f7cc4da6,
> > 8eda3172, efef2202], init=true, initFut=true, ready=true, replied=false,
> > added=true, oldest=e5b581b3, oldestOrder=1, evtLatch=0, locNodeOrder=1848,
> > locNodeId=9e648fd3-7c84-4261-93e0-916275a0a9ae]
> > [23:10:53,699][WARNING][main][GridCachePartitionExchangeManager] Failed to
> > wait for initial partition map exchange. Possible reasons are:
> >   ^-- Transactions in deadlock.
> >   ^-- Long running transactions (ignore if this is the case).
> >   ^-- Unreleased explicit locks.
> >
> > [23:10:53,926][WARNING][grid-timeout-worker-#33%null%][GridDhtPartitionsExchangeFuture]
> > Retrying preload partition exchange due to timeout [done=false,
> > dummy=false, exchId=GridDhtPartitionExchangeId
> > [topVer=AffinityTopologyVersion [topVer=1848, minorTopVer=0],
> > nodeId=9e648fd3, evt=NODE_JOINED], rcvdIds=[], rmtIds=[e5b581b3, bd33def3,
> > f7cc4da6, 8eda3172, efef2202], remaining=[e5b581b3, bd33def3, f7cc4da6,
> > 8eda3172, efef2202], init=true, initFut=true, ready=true, replied=false,
> > added=true, oldest=e5b581b3, oldestOrder=1, evtLatch=0, locNodeOrder=1848,
> > locNodeId=9e648fd3-7c84-4261-93e0-916275a0a9ae]
> >
> > [23:11:08,929][WARNING][grid-timeout-worker-#33%null%][GridDhtPartitionsExchangeFuture]
> > Retrying preload partition exchange due to timeout [done=false,
> > dummy=false, exchId=GridDhtPartitionExchangeId
> > [topVer=AffinityTopologyVersion [topVer=1848, minorTopVer=0],
> > nodeId=9e648fd3, evt=NODE_JOINED], rcvdIds=[], rmtIds=[e5b581b3, bd33def3,
> > f7cc4da6, 8eda3172, efef2202], remaining=[e5b581b3, bd33def3, f7cc4da6,
> > 8eda3172, efef2202], init=true, initFut=true, ready=true, replied=false,
> > added=true, oldest=e5b581b3, oldestOrder=1, evtLatch=0, locNodeOrder=1848,
> > locNodeId=9e648fd3-7c84-4261-93e0-916275a0a9ae]
> > [....]
> > [repeated many, many times]
> > [....]
