Re: Ignite Backup doubts

2018-06-08 Thread dkarachentsev
Hi,

1. By default, get() reads from a backup if the node on which it is invoked
is an affinity node. In other words, if the current node holds a backup copy,
Ignite prefers to read the local backup rather than request the primary node
over the network. This can be changed by setting
CacheConfiguration.setReadFromBackup(false) [1].
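
As a rough illustration of the setting above, here is a minimal sketch of a
cache configuration that disables reads from backups. The cache name
"myCache" and the Long/String key and value types are hypothetical; the
backup count of 2 is taken from the original question.

```java
import org.apache.ignite.configuration.CacheConfiguration;

public final class ReadFromPrimaryOnly {
    private ReadFromPrimaryOnly() {}

    /** Builds a cache configuration whose reads are not served from backup copies. */
    public static CacheConfiguration<Long, String> cacheConfig() {
        // "myCache" and the key/value types are placeholders for illustration.
        CacheConfiguration<Long, String> cfg = new CacheConfiguration<>("myCache");

        // Two backup copies per partition, as described in the question.
        cfg.setBackups(2);

        // Reading from backups is enabled by default; false sends reads to the primary.
        cfg.setReadFromBackup(false);

        return cfg;
    }
}
```

The trade-off is an extra network hop for reads issued on a node that holds
only a backup copy of the key.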

2. It depends on the operations you call. If you use get(), the request goes
to the primary node only (or is served from a local backup, as described in
1). If you run an SQL query by primary key or affinity key, it goes to the
primary node too. In other cases the SQL query is executed on all nodes,
because Ignite doesn't know beforehand which nodes hold data that satisfies
your query.

3. The optimal configuration depends heavily on your cluster size and
hardware resources. In your case you have three nodes and 2 backups, which
means each node keeps the full dataset, and if two of the three nodes fail
you don't lose data. But if you have more data than the memory available on
one node, then it's better either to reduce the number of backups or to
increase the number of nodes. IMO the best backup configuration is one that
allows you to lose 20-30% of nodes without losing data.
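
The arithmetic behind this can be sketched as follows. This is only a
back-of-the-envelope helper, not part of Ignite: the class and method names
are hypothetical, and it assumes partitions are spread evenly and that a
node holds at most one copy of any partition.

```java
public final class BackupSizing {
    private BackupSizing() {}

    /** Copies of each partition: one primary plus backups, capped at the node count. */
    public static int copies(int nodes, int backups) {
        return Math.min(backups + 1, nodes);
    }

    /** Approximate data stored per node, assuming an even partition distribution. */
    public static long perNodeBytes(long datasetBytes, int nodes, int backups) {
        return datasetBytes * copies(nodes, backups) / nodes;
    }

    /** Number of simultaneous node failures that cannot cause data loss. */
    public static int toleratedFailures(int nodes, int backups) {
        return copies(nodes, backups) - 1;
    }

    /** Backups needed to survive losing the given percentage of nodes, rounded up. */
    public static int backupsForLossPercent(int nodes, int percent) {
        return (nodes * percent + 99) / 100;
    }

    public static void main(String[] args) {
        long dataset = 90L * 1024 * 1024 * 1024; // hypothetical 90 GiB dataset

        // The setup from the question: 3 nodes, 2 backups.
        System.out.println("bytes per node: " + perNodeBytes(dataset, 3, 2));
        System.out.println("tolerated failures: " + toleratedFailures(3, 2));

        // The 20-30% rule of thumb applied to a hypothetical 10-node cluster.
        System.out.println("backups for 20%: " + backupsForLossPercent(10, 20));
        System.out.println("backups for 30%: " + backupsForLossPercent(10, 30));
    }
}
```

With 3 nodes and 2 backups the per-node figure equals the whole dataset,
which is why reducing backups or adding nodes is the way out when the data
doesn't fit in one node's memory.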

4. On node failure, the affinity function re-maps partitions between the live
nodes, and the partitions are rebalanced so that the number of backups is
restored. The behavior is more sophisticated if you use persistence, because
baseline topology will try to avoid rebalancing [2].

[1]
https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/configuration/CacheConfiguration.html#setReadFromBackup-boolean-

[2] https://apacheignite.readme.io/docs/baseline-topology

Thanks!
-Dmitry



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Ignite Backup doubts

2018-06-08 Thread the_palakkaran
Hi,

I have 3 nodes. On-heap caching is enabled for my caches, and persistence is
enabled as well. I have configured 2 backups for each cache.

As I understand it, I can use RendezvousAffinityFunction to exclude
neighbors, so that the backup of a cache on Node-1 on Machine-1 will be
stored on a machine other than Machine-1.

Because of this, at any point in time a single machine can hold at least a
primary copy of its own and a backup copy belonging to some other node.

My doubts are:

(1) Will my cache.get(key) be run on both the primary and the backups on the
machine? If so, can I limit this?

(2) Also, every time I do a get operation or query, will it be executed on
every node?

(3) Is there an optimal backup configuration for a given number N of nodes?

(4) When a node in the cluster goes down, will new copies of the primaries
and backups that were on that node be created on the other available nodes?


