Andrey,

I did not catch why Yakov's suggestion of processing the data on a
per-partition basis does not work for you. I assume the cache is not updated
concurrently during this processing; otherwise the task does not make sense,
since there are no full cache snapshots in Ignite (yet).

To summarize what has been posted in this thread so far (Sergi, pls correct
me if I am wrong):
1) A SQL query issued from a single node is guaranteed to return consistent
results on a changing topology (local should be set to false). Currently
there is no 'native' way to parallelize SQL query result processing, such
as limiting a SQL query to a single partition.
2) A Scan query issued for a single partition is guaranteed to return
consistent results on a changing topology (local should be set to false);
see the sketch after this list. Note that setting local=true and issuing the
query locally is not guaranteed to return consistent results.
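
For reference, a minimal sketch of such a per-partition Scan query, assuming
a cache named "myCache" with Integer keys and String values (hypothetical
names, adjust to your model):

import javax.cache.Cache;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.QueryCursor;
import org.apache.ignite.cache.query.ScanQuery;

public class PartitionScanExample {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start();

        IgniteCache<Integer, String> cache = ignite.getOrCreateCache("myCache");

        int part = 0; // the partition to scan

        // Restrict the scan to a single partition. Local is false by default,
        // so the query follows the partition even if it has been rebalanced away.
        ScanQuery<Integer, String> qry = new ScanQuery<Integer, String>().setPartition(part);

        try (QueryCursor<Cache.Entry<Integer, String>> cur = cache.query(qry)) {
            for (Cache.Entry<Integer, String> e : cur)
                System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}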

Having said that, I see no problem with the following approach (a rough
sketch follows below):
1) Create a compute task that will create a job for each partition
2) Map each partition's job to the partition's primary node
3) Execute a per-partition Scan query inside the job.

In a worst-case scenario, if a job arrives on a node that has already lost
ownership of the partition, the data will be fetched from a remote node.
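
A rough sketch of steps 1)-3) above, again assuming the hypothetical
"myCache" cache from the snippet earlier; the per-entry processing is just a
counter placeholder:

import javax.cache.Cache;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.affinity.Affinity;
import org.apache.ignite.cache.query.QueryCursor;
import org.apache.ignite.cache.query.ScanQuery;
import org.apache.ignite.cluster.ClusterNode;

public class PerPartitionComputeExample {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start();

        ignite.getOrCreateCache("myCache");

        Affinity<Integer> aff = ignite.affinity("myCache");

        // One job per partition, mapped to the partition's current primary node.
        for (int p = 0; p < aff.partitions(); p++) {
            final int part = p;

            ClusterNode primary = aff.mapPartitionToNode(part);

            Long cnt = ignite.compute(ignite.cluster().forNode(primary)).call(() -> {
                IgniteCache<Integer, String> cache = Ignition.localIgnite().cache("myCache");

                // Per-partition Scan query executed inside the job.
                ScanQuery<Integer, String> qry = new ScanQuery<Integer, String>().setPartition(part);

                long n = 0;

                try (QueryCursor<Cache.Entry<Integer, String>> cur = cache.query(qry)) {
                    for (Cache.Entry<Integer, String> e : cur)
                        n++; // replace with the actual per-entry processing
                }

                return n;
            });

            System.out.println("Partition " + part + ": " + cnt + " entries");
        }
    }
}

Because each job is mapped to the primary node at submission time, the scan
normally runs against local data; if the partition has moved by the time the
job executes, the query (local=false) transparently reads it from the new
owner, which is exactly the worst case described above.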

You can open a ticket to fail a per-partition Scan query if the local flag is
set to true and the partition has been moved - in this case the 'wrong' jobs
could be failed over to the correct nodes.
