Dmitriy, I don’t see why the result of a simple query such as “select count(*) from t;” should differ while rebalancing is in progress or after a cluster restart. Ignite’s SQL engine claims to be fault-tolerant and to return a consistent result set at all times unless a partition loss has happened. There is no partition loss here, so it seems we caught a bug.
Vladimir O., please chime in.

— Denis

> On Oct 26, 2017, at 3:34 PM, Dmitry Pavlov <dpavlov....@gmail.com> wrote:
>
> Hi Denis,
>
> It seems to me that this is not a bug for my scenario, because the data was
> not loaded within the same transaction using a transactional cache. In this
> case it is OK that cache data is rebalanced according to partition update
> counters, isn't it?
>
> I suppose the data was not lost in this case; it was just not completely
> transferred to the second node.
>
> Sincerely,
>
> Thu, Oct 26, 2017, 21:09 Denis Magda <dma...@apache.org>:
> + dev list
>
> This scenario has to be handled automatically by Ignite. Seems like a bug.
> Please refer to the initial description of the issue. Alex G, please have a
> look:
>
> To reproduce:
> 1. Create a replicated cache with multiple indexed types and some indexes.
> 2. Start the first server node.
> 3. Insert data into the cache (1,000,000 entries).
> 4. Start the second server node.
>
> At this point everything seems OK; judging by SQL queries (count(*)), the
> data is apparently rebalanced successfully.
>
> 5. Stop the server nodes.
> 6. Restart the server nodes.
> 7. SQL queries (count(*)) now return fewer rows.
>
> —
> Denis
>
> > On Oct 23, 2017, at 5:11 AM, Dmitry Pavlov <dpavlov....@gmail.com> wrote:
> >
> > Hi,
> >
> > I wrote code that executes the described scenario. The results are as
> > follows:
> > If I do not give the cluster enough time to completely rebalance the
> > partitions, the newly launched node will not have all the data, and
> > count(*) on it returns a smaller number: the number of records that have
> > been transferred to the node so far. I guess GridDhtPartitionDemandMessage
> > entries can be found in the Ignite debug log at this moment.
> >
> > If I wait for a sufficient amount of time, or explicitly wait on the newly
> > joined node via
> > ignite2.cache(CACHE).rebalance().get();
> > then all results are correct.
> >
> > About your question on what happens if one cluster node crashes in the
> > middle of the rebalance process:
> > In this case the normal failover scenario is started, and the data is
> > rebalanced within the cluster. If the nodes have enough WAL records to
> > represent the history from the crash point, only the recent changes (the
> > delta) are sent over the network. If there is not enough history to
> > rebalance using only the most recent changes, the partition is rebalanced
> > from scratch to the new node.
> >
> > Sincerely,
> > Pavlov Dmitry
> >
> >
> > Sat, Oct 21, 2017, 2:07 Manu <maxn...@hotmail.com>:
> > Hi,
> >
> > after the restart the data seems not to be consistent.
> >
> > We had waited until the rebalance was fully completed before restarting the
> > cluster, to check that durable memory data rebalancing works correctly and
> > SQL queries still work.
> > Another question (not this case): what happens if one cluster node crashes
> > in the middle of the rebalance process?
> >
> > Thanks!
> >
> >
> >
> > --
> > Sent from: http://apache-ignite-users.70518.x6.nabble.com/
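For anyone following the workaround discussed above, here is a minimal sketch of the "wait for rebalance before stopping" step. This is not a fix for the reported bug, just the wait Dmitry describes; the cache name "CACHE" is illustrative, and the snippet assumes a second server node joining a cluster that already holds the data (it needs a running Ignite cluster, so it is not runnable standalone):

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

public class RebalanceWaitSketch {
    public static void main(String[] args) {
        // Start the second server node with default configuration
        // (assumed; in practice pass your IgniteConfiguration).
        try (Ignite ignite2 = Ignition.start()) {
            IgniteCache<Integer, Object> cache = ignite2.cache("CACHE");

            // Block until this node has received its copy of the
            // replicated cache. Stopping the cluster before this
            // future completes is what leaves the node with
            // incomplete data in the scenario above.
            cache.rebalance().get();

            // Only now is it safe to shut the nodes down and expect
            // count(*) to match after restart.
        }
    }
}
```

Note that `rebalance()` returns an `IgniteFuture`, so the `get()` call is what actually blocks; without it the node may be stopped mid-transfer.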