Hi Andrey,
Please see below.
On 12/13/2015 9:08 AM, Andrey Kornev wrote:
Yakov,
If partitions do not migrate while they are being iterated over,
wouldn't it then suffice to simply execute a single ScanQuery with its
isLocal flag set to true? My reasoning here is that the scan would
create an iterator for all affinity partitions, thus preventing their
migration. If that's not the case, how would a local ScanQuery behave
in the presence of topology changes?
From what I see in the code, setting ScanQuery's isLocal to 'true' lets
you iterate over all the partitions that belonged to a node at the time
the query was started. None of those partitions will be moved to
another node until the query's iterator is closed.
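For illustration, here is a minimal sketch of such a local scan
(assuming an IgniteCache<Integer, String> named 'cache'; the
per-entry processing method is hypothetical):

import javax.cache.Cache;
import org.apache.ignite.cache.query.QueryCursor;
import org.apache.ignite.cache.query.ScanQuery;

ScanQuery<Integer, String> qry = new ScanQuery<>();
qry.setLocal(true); // scan only the partitions local to this node

// Closing the cursor is what releases the partitions for rebalancing,
// so use try-with-resources.
try (QueryCursor<Cache.Entry<Integer, String>> cur = cache.query(qry)) {
    for (Cache.Entry<Integer, String> e : cur)
        process(e); // hypothetical per-entry processing
}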
However, here I see the following issue. Imagine that your cluster has
two nodes, you decide to iterate over all the local partitions of both
nodes, and the execution sequence looks like this:
1) A ScanQuery with isLocal=true starts executing on *node A*. All of
node A's partitions are pinned and won't be moved.
2) *Node B* receives the same compute job with the ScanQuery in it.
However, because of the OS scheduler, the thread in charge of starting
the query is blocked for some time. So the iterator over the local
partitions is not ready yet, and those partitions are not pinned.
3) A third node, *node C*, joins the topology. Partitions owned by
*node B* may be rebalanced between *node A* and *node C*.
4) Partitions that are rebalanced from *node B* to *node A* won't be
visited by your code, because node A's iterator was already built,
while node B's iterator is constructed only after the rebalancing.
This issue can't happen when you specify the partitions explicitly
using Yakov's approach below: in the worst case, in a situation like
the one above, the data of a just-rebalanced partition will simply be
delivered over the network to the node that was the partition's owner
at the time you calculated the partition owners.
Also, what's the best way to handle topology changes while using a
SqlQuery rather than a ScanQuery? Basically, it's the same use case,
only instead of scanning the entire partition I'd like to first filter
the cache entries using a query.
SqlQueries will work transparently for you and are guaranteed to
return a full and consistent result set even if the topology changes
while a query is in progress.
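For instance, a minimal sketch, assuming a hypothetical Person class
with an indexed 'age' field stored in 'personCache':

import javax.cache.Cache;
import org.apache.ignite.cache.query.QueryCursor;
import org.apache.ignite.cache.query.SqlQuery;

// Filter entries with SQL first; Ignite keeps the result set complete
// and consistent even if nodes join or leave during the query.
SqlQuery<Integer, Person> qry = new SqlQuery<>(Person.class, "age > ?");
qry.setArgs(30);

try (QueryCursor<Cache.Entry<Integer, Person>> cur = personCache.query(qry)) {
    for (Cache.Entry<Integer, Person> e : cur)
        process(e.getValue()); // hypothetical per-entry processing
}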
--
Denis
Thanks
Andrey
_____________________________
From: Yakov Zhdanov <[email protected]>
Sent: Friday, December 11, 2015 10:55 AM
Subject: RE: Computation on NodeEntries
To: <[email protected] <mailto:[email protected]>>
A partition will not migrate while a local or remote iterator over it
is not finished/closed.
On Dec 11, 2015 21:05, "Andrey Kornev" <[email protected]> wrote:
Great suggestion! Thank you, Yakov!
Just one more question. :) Let's say the scan job is running on node
A and processing partition 42. At the same time, a new node B
joins and partition 42 needs to be moved to this node. What will
happen to my scan query that is still running on node A and
iterating over the partition's entries? Would it complete
processing the entire partition despite the change of ownership?
Or, would the query terminate at some arbitrary point once the
partition ownership transfer has completed?
Thanks a lot!
Andrey
------------------------------------------------------------------------
Date: Fri, 11 Dec 2015 16:06:16 +0300
Subject: Re: Computation on NodeEntries
From: [email protected] <mailto:[email protected]>
To: [email protected] <mailto:[email protected]>
Guys, I would do the following:
1. Map all my partitions to
nodes: org.apache.ignite.cache.affinity.Affinity#mapPartitionsToNodes
2. Send a job (with its list of partitions) to each node using the map
returned in step 1
3. The job may look like this:
new IgniteRunnable() {
    @Override public void run() {
        for (Integer part : parts) {
            // Scan one partition at a time and close the cursor when
            // done, so the partition can be rebalanced afterwards.
            try (QueryCursor<Cache.Entry<Object, Object>> cur =
                     cache.query(new ScanQuery<>(part))) {
                for (Cache.Entry<Object, Object> e : cur) {
                    // do the stuff...
                }
            }
        }
    }
};
This may result in network calls in some worst cases when the topology
changes under your feet, but even in that case it should work.
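To make steps 1 and 2 concrete, here is a rough sketch of the driver
side (assuming an Ignite instance 'ignite', a cache named "myCache",
and a hypothetical ScanJob class wrapping the runnable above):

import java.util.*;
import org.apache.ignite.cache.affinity.Affinity;
import org.apache.ignite.cluster.ClusterNode;

Affinity<Object> aff = ignite.affinity("myCache");

// Step 1: map every partition to its current owner node.
List<Integer> allParts = new ArrayList<>();
for (int p = 0; p < aff.partitions(); p++)
    allParts.add(p);

Map<Integer, ClusterNode> partToNode = aff.mapPartitionsToNodes(allParts);

// Invert the mapping: node -> list of its partitions.
Map<ClusterNode, List<Integer>> nodeToParts = new HashMap<>();
for (Map.Entry<Integer, ClusterNode> e : partToNode.entrySet()) {
    List<Integer> nodeParts = nodeToParts.get(e.getValue());
    if (nodeParts == null)
        nodeToParts.put(e.getValue(), nodeParts = new ArrayList<Integer>());
    nodeParts.add(e.getKey());
}

// Step 2: send each node a job that scans only its own partitions.
for (Map.Entry<ClusterNode, List<Integer>> e : nodeToParts.entrySet())
    ignite.compute(ignite.cluster().forNode(e.getKey()))
        .run(new ScanJob(e.getValue())); // hypothetical job class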
--Yakov
2015-12-11 2:13 GMT+03:00 Andrey Kornev <[email protected]>:
Dmitriy,
Given the approach you suggested below, what would be your
recommendation for dealing with cluster topology changes while
the iteration is in progress? An obvious one I can think of is to
- somehow detect the change,
- cancel the tasks on all the nodes,
- wait until the rebalancing is finished, and
- restart the computation.
Are there any other ways? Ideally, I'd like to have "exactly-once"
execution semantics.
Thanks
Andrey