Hi!

I have a question about data consistency in a cluster: is there any mechanism
for checking that a cache is in a consistent state (no lost data/partitions)?

For example, I have a cluster with N nodes and a long-running compute job that
calculates monthly revenue over a huge amount of data. The data is a transaction
log stored in the cache "transactions" <UUID, Float> (where the key is the
transaction id and the value is the transaction amount). Cache backups = 2.

First case.

1. We load a huge amount of data into the cache "transactions"
2. All is fine
3. We run a simple compute job that sums the values in the transaction log ( Float::sum )
4. The compute job runs on all nodes except n1 and n2
5. We lose node n1, then node n2, which stored the backup of n1
6. Ignite detects that nodes n1 and n2 are down and rebalances the data
7. The compute job completes

Second case.

1. We load a huge amount of data into the cache "transactions"
2. All is fine
3. We lose node n1, then node n2, which stored the backup of n1
4. Ignite detects that nodes n1 and n2 are down and rebalances the data
5. We run a simple compute job that sums the values in the transaction log ( (a,b) -> a+b )

So, the questions:
- What result can we get from the compute job in the first case?
- In the second case we run the compute job on a cache that is not in a
consistent state (the partitions that belonged to nodes n1 and n2 were lost),
but the Ignite cache keeps working and lets us run the ComputeJob without
telling us that some data was lost, doesn't it?
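
To make the second case concrete, here is a minimal sketch of the situation I
am worried about. It is a toy model in plain Java and does not use the Ignite
API at all: the class PartitionLossSketch, its methods, the partition layout
and the amounts are all hypothetical. It also assumes that each partition has
only two copies, so that losing two nodes can actually lose a partition.

```java
import java.util.*;

// Toy model only: plain Java collections, no Ignite API involved.
// All names, the partition layout and the amounts are made up for illustration.
final class PartitionLossSketch {

    /** Returns the partitions whose every owner (primary or backup) has failed. */
    static SortedSet<Integer> lostPartitions(Map<Integer, Set<String>> owners,
                                             Set<String> failedNodes) {
        SortedSet<Integer> lost = new TreeSet<>();
        for (Map.Entry<Integer, Set<String>> e : owners.entrySet()) {
            if (failedNodes.containsAll(e.getValue())) {
                lost.add(e.getKey());
            }
        }
        return lost;
    }

    /** Sums the amounts of all partitions that are not in the lost set. */
    static float sumSurviving(Map<Integer, List<Float>> amounts, Set<Integer> lost) {
        float sum = 0f;
        for (Map.Entry<Integer, List<Float>> e : amounts.entrySet()) {
            if (!lost.contains(e.getKey())) {
                for (float v : e.getValue()) {
                    sum += v;
                }
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Assumption: two copies per partition (one primary, one backup).
        Map<Integer, Set<String>> owners = new HashMap<>();
        owners.put(0, Set.of("n1", "n2"));
        owners.put(1, Set.of("n2", "n3"));
        owners.put(2, Set.of("n3", "n4"));

        Map<Integer, List<Float>> amounts = new HashMap<>();
        amounts.put(0, List.of(10f, 20f));
        amounts.put(1, List.of(5f));
        amounts.put(2, List.of(1f, 2f));

        SortedSet<Integer> lost = lostPartitions(owners, Set.of("n1", "n2"));
        System.out.println("lost partitions: " + lost);
        System.out.println("sum over surviving partitions: " + sumSurviving(amounts, lost));
        System.out.println("sum over all data: " + sumSurviving(amounts, Set.of()));
    }
}
```

In this toy layout partition 0 lives only on n1 and n2, so after both fail the
job sums 8.0 instead of 38.0 and nothing in the result shows that it is
partial. What I would like to know is whether Ignite itself exposes a check
like lostPartitions above.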


With best regards
Alisher Alimov
alimovalis...@gmail.com



