Re: Prohibit stateful affinity (FairAffinityFunction)

2017-04-11 Thread Yakov Zhdanov
Guys, after some thought I would say that even distribution is more
important for an affinity function than rebalancing traffic (which should
also be kept to a minimum). Even distribution gives even load on a stable
topology, while rebalancing is more of a disaster scenario. Grids should
spend far more time in a stable state than in failure recovery, and
rebalancing can be configured to have as little impact on the system as
possible.

Dmitry, M. Griggs fixed keys distribution over partitions, but not
partitions over nodes. This change is in ignite-4828, I reviewed it and
will merge it today.

Taras, your numbers also look very suspicious - do you really have 26
partitions migrated on a 64-node topology when 1 node leaves? I will review
your changes one more time and provide comments here.
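
For reference, a minimal sketch of how such a number could be sanity-checked for pure rendezvous (highest-random-weight) hashing. It is not the code from IGNITE-3018: the MD5-based weight, the 1024-partition count, the node names and the primary-only assignment (no backups, no balancer) are all assumptions made for illustration.

```python
import hashlib

PARTITIONS = 1024  # assumed partition count, chosen for illustration


def weight(node, partition):
    """Deterministic pseudo-random weight for a (node, partition) pair."""
    digest = hashlib.md5(f"{node}:{partition}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def primaries(nodes):
    """Pure rendezvous: each partition goes to the node with the top weight."""
    return [max(nodes, key=lambda n: weight(n, p)) for p in range(PARTITIONS)]


def migrated(nodes, leaving):
    """Partitions whose primary changes when `leaving` drops out."""
    before = primaries(nodes)
    after = primaries([n for n in nodes if n != leaving])
    return [p for p in range(PARTITIONS) if before[p] != after[p]]


nodes = [f"node-{i}" for i in range(64)]  # hypothetical node ids
owned_by_leaver = [p for p, n in enumerate(primaries(nodes)) if n == "node-0"]
moved = migrated(nodes, "node-0")
print(len(moved), "of", PARTITIONS, "primaries moved")
```

Without a balancer, the only primaries that move are the ones the departed node owned, so the expected count is roughly partitions / nodes. Any balancing step on top of this can move more.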

--Yakov

2017-04-10 18:12 GMT+03:00 Taras Ledkov :

> I updated the issue [1] with the table of the average count of migrated
> primary partitions when one node leaves.
>
> [1]. https://issues.apache.org/jira/browse/IGNITE-3018?focusedCommentId=15963015=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15963015
>
>
>
> On 10.04.2017 18:00, Sergi Vladykin wrote:
>
>> Absolutely agree, lets get some numbers on RendezvousAffinity with both
>> variants: useBalancer enabled and disabled. Taras, can you provide them?
>>
>> Anyways at the moment we need to make a decision on what will get into
>> 2.0.
>> I'm for dropping (or hiding) all the suspicious stuff and adding it back
>> if
>> we fix it. Thus I'm going to move FairAffinity into private package now.
>>
>> Sergi
>>

Re: Prohibit stateful affinity (FairAffinityFunction)

2017-04-10 Thread Taras Ledkov
I updated the issue [1] with the table of the average count of migrated 
primary partitions when one node leaves.


[1]. 
https://issues.apache.org/jira/browse/IGNITE-3018?focusedCommentId=15963015=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15963015



On 10.04.2017 18:00, Sergi Vladykin wrote:

> Absolutely agree, lets get some numbers on RendezvousAffinity with both
> variants: useBalancer enabled and disabled. Taras, can you provide them?
>
> Anyways at the moment we need to make a decision on what will get into 2.0.
> I'm for dropping (or hiding) all the suspicious stuff and adding it back if
> we fix it. Thus I'm going to move FairAffinity into private package now.
>
> Sergi


Re: Prohibit stateful affinity (FairAffinityFunction)

2017-04-10 Thread Dmitriy Setrakyan
Guys,

To my knowledge FairAffinity, which is the most balanced distribution
possible, works just fine whenever the caches are configured on startup. I
think we should keep it, but throw an exception whenever a cache is started
dynamically (after the system start) with FairAffinity configured. Am I
missing something here?

As for RendezvousAffinity, I don't like that we would start migrating extra
partitions. To my knowledge, Michael Grigs implemented a close-to-even
partition distribution with a much better hash function. Do we really need
to improve it even more?
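
One way to put a number on "close to even" is to count partitions per node and compare the spread with the ideal share. The sketch below does that for plain rendezvous hashing. The MD5-based weight, 1024 partitions and 64 hypothetical node ids are assumptions for illustration, not the hash function discussed in this thread.

```python
import hashlib

PARTITIONS = 1024  # assumed partition count, chosen for illustration


def weight(node, partition):
    """Deterministic pseudo-random weight for a (node, partition) pair."""
    digest = hashlib.md5(f"{node}:{partition}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def partitions_per_node(nodes):
    """Count how many primary partitions each node owns under rendezvous."""
    counts = {n: 0 for n in nodes}
    for p in range(PARTITIONS):
        owner = max(nodes, key=lambda n: weight(n, p))
        counts[owner] += 1
    return counts


nodes = [f"node-{i}" for i in range(64)]  # hypothetical node ids
counts = partitions_per_node(nodes)
ideal = PARTITIONS / len(nodes)
print("ideal:", ideal, "min:", min(counts.values()), "max:", max(counts.values()))
```

The gap between min/max and the ideal share is the steady-state imbalance that a balancer, or a perfectly fair function, would remove.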

D.

On Mon, Apr 10, 2017 at 8:00 AM, Sergi Vladykin 
wrote:

> Absolutely agree, lets get some numbers on RendezvousAffinity with both
> variants: useBalancer enabled and disabled. Taras, can you provide them?
>
> Anyways at the moment we need to make a decision on what will get into 2.0.
> I'm for dropping (or hiding) all the suspicious stuff and adding it back if
> we fix it. Thus I'm going to move FairAffinity into private package now.
>
> Sergi
>
> 2017-04-10 16:55 GMT+03:00 Vladimir Ozerov :
>
> > Sergi,
> >
> > AFAIK the only reason why RendezvousAffinity is used by default is that
> > behavior on rebalance is no less important than steady state performance,
> > especially on large deployments and cloud environments, when nodes
> > constantly joins and leaves topology. Let's stop guessing and discuss the
> > numbers - how many partitions reassignments happen with new
> > RendezvousAffinity flavor? I haven't seen any results so far.
> >
> > On Mon, Apr 10, 2017 at 4:39 PM, Andrey Gura  wrote:
> >
> > > Guys,
> > >
> > > It seems that both mentioned problem have the same root cause: each
> > > cache has personal affinity function instance and it leads to
> > > perfromance problem (we retry the same calcualtions for each cache)
> > > and problem related with fact that FailAffinityFunction is statefull
> > > (some co-located cache has different assignment if it was started on
> > > different topology).
> > >
> > > Obvious solution is the same affinity for some cache set. As result
> > > all caches from one set will use the same assignment that will be
> > > calculated exactly once and will not depend on cache start topology.
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Mon, Apr 10, 2017 at 4:05 PM, Sergi Vladykin
> > >  wrote:
> > > > As for default value for useBalancer flag, I agree with Yakov, it
> must
> > be
> > > > enabled by default. Because performance in steady state is usually
> more
> > > > important than performance of rebalancing. For edge cases it can be
> > > > disabled.
> > > >
> > > > Sergi
> > > >
> > > > 2017-04-10 15:04 GMT+03:00 Sergi Vladykin  >:
> > > >
> > > >> If the RendezvousAffinity with enabled useBalancer is not much worse
> > > than
> > > >> FairAffinity, I see no reason to keep the latter.
> > > >>
> > > >> Sergi
> > > >>
> > > >> 2017-04-10 13:00 GMT+03:00 Vladimir Ozerov :
> > > >>
> > > >>> Guys,
> > > >>>
> > > >>> We should not have it enabled by default because as Taras
> mentioned:
> > > "but
> > > >>> in this case there is not guarantee that a partition doesn't move
> > from
> > > one
> > > >>> node to another when node leave topology". Let's avoid any rush
> here.
> > > >>> There
> > > >>> is nothing terribly wrong with FairAffinity. It is not enabled by
> > > default
> > > >>> and at the very least we can always mark it as deprecated. It is
> > > better to
> > > >>> test rigorously rendezvous affinity first in terms of partition
> > > >>> distribution and partition migration and decide whether results are
> > > >>> acceptable.
> > > >>>
> > > >>> On Mon, Apr 10, 2017 at 12:43 PM, Yakov Zhdanov <
> yzhda...@apache.org
> > >
> > > >>> wrote:
> > > >>>
> > > >>> > We should have it enabled by default.
> > > >>> >
> > > >>> > --Yakov
> > > >>> >
> > > >>> > 2017-04-10 12:42 GMT+03:00 Sergi Vladykin <
> > sergi.vlady...@gmail.com
> > > >:
> > > >>> >
> > > >>> > > Why wouldn't we have useBalancer always enabled?
> > > >>> > >
> > > >>> > > Sergi
> > > >>> > >
> > > >>> > > 2017-04-10 12:31 GMT+03:00 Taras Ledkov  >:
> > > >>> > >
> > > >>> > > > Folks,
> > > >>> > > >
> > > >>> > > > I worked on issue https://issues.apache.org/
> > > jira/browse/IGNITE-3018
> > > >>> > that
> > > >>> > > > is related to performance of Rendezvous AF.
> > > >>> > > >
> > > >>> > > > But Wang/Jenkins hash integer hash distribution is worse then
> > > MD5.
> > > >>> So,
> > > >>> > i
> > > >>> > > > try to use simple partition balancer close
> > > >>> > > > to Fair AF for Rendezvous AF.
> > > >>> > > >
> > > >>> > > > Take a look at the heatmaps of distributions at the issue.
> > e.g.:
> > > >>> > > > - Compare of current Rendezvous AF and new Rendezvous AF
> based
> > of
> > > >>> > > > Wang/Jenkins hash: https://issues.apache.org/jira
> > > >>> > > > 

Re: Prohibit stateful affinity (FairAffinityFunction)

2017-04-10 Thread Sergi Vladykin
Absolutely agree, let's get some numbers on RendezvousAffinity with both
variants: useBalancer enabled and disabled. Taras, can you provide them?

Anyway, at the moment we need to decide what will get into 2.0.
I'm for dropping (or hiding) all the suspicious stuff and adding it back if
we fix it. Thus I'm going to move FairAffinity into a private package now.

Sergi

2017-04-10 16:55 GMT+03:00 Vladimir Ozerov :

> Sergi,
>
> AFAIK the only reason why RendezvousAffinity is used by default is that
> behavior on rebalance is no less important than steady state performance,
> especially on large deployments and cloud environments, when nodes
> constantly joins and leaves topology. Let's stop guessing and discuss the
> numbers - how many partitions reassignments happen with new
> RendezvousAffinity flavor? I haven't seen any results so far.
>
> On Mon, Apr 10, 2017 at 4:39 PM, Andrey Gura  wrote:
>
> > Guys,
> >
> > It seems that both mentioned problem have the same root cause: each
> > cache has personal affinity function instance and it leads to
> > perfromance problem (we retry the same calcualtions for each cache)
> > and problem related with fact that FailAffinityFunction is statefull
> > (some co-located cache has different assignment if it was started on
> > different topology).
> >
> > Obvious solution is the same affinity for some cache set. As result
> > all caches from one set will use the same assignment that will be
> > calculated exactly once and will not depend on cache start topology.
> >
> >
> >
> >
> >
> >
> > On Mon, Apr 10, 2017 at 4:05 PM, Sergi Vladykin
> >  wrote:
> > > As for default value for useBalancer flag, I agree with Yakov, it must
> be
> > > enabled by default. Because performance in steady state is usually more
> > > important than performance of rebalancing. For edge cases it can be
> > > disabled.
> > >
> > > Sergi
> > >
> > > 2017-04-10 15:04 GMT+03:00 Sergi Vladykin :
> > >
> > >> If the RendezvousAffinity with enabled useBalancer is not much worse
> > than
> > >> FairAffinity, I see no reason to keep the latter.
> > >>
> > >> Sergi
> > >>
> > >> 2017-04-10 13:00 GMT+03:00 Vladimir Ozerov :
> > >>
> > >>> Guys,
> > >>>
> > >>> We should not have it enabled by default because as Taras mentioned:
> > "but
> > >>> in this case there is not guarantee that a partition doesn't move
> from
> > one
> > >>> node to another when node leave topology". Let's avoid any rush here.
> > >>> There
> > >>> is nothing terribly wrong with FairAffinity. It is not enabled by
> > default
> > >>> and at the very least we can always mark it as deprecated. It is
> > better to
> > >>> test rigorously rendezvous affinity first in terms of partition
> > >>> distribution and partition migration and decide whether results are
> > >>> acceptable.
> > >>>
> > >>> On Mon, Apr 10, 2017 at 12:43 PM, Yakov Zhdanov  >
> > >>> wrote:
> > >>>
> > >>> > We should have it enabled by default.
> > >>> >
> > >>> > --Yakov
> > >>> >
> > >>> > 2017-04-10 12:42 GMT+03:00 Sergi Vladykin <
> sergi.vlady...@gmail.com
> > >:
> > >>> >
> > >>> > > Why wouldn't we have useBalancer always enabled?
> > >>> > >
> > >>> > > Sergi
> > >>> > >
> > >>> > > 2017-04-10 12:31 GMT+03:00 Taras Ledkov :
> > >>> > >
> > >>> > > > Folks,
> > >>> > > >
> > >>> > > > I worked on issue https://issues.apache.org/
> > jira/browse/IGNITE-3018
> > >>> > that
> > >>> > > > is related to performance of Rendezvous AF.
> > >>> > > >
> > >>> > > > But Wang/Jenkins hash integer hash distribution is worse then
> > MD5.
> > >>> So,
> > >>> > i
> > >>> > > > try to use simple partition balancer close
> > >>> > > > to Fair AF for Rendezvous AF.
> > >>> > > >
> > >>> > > > Take a look at the heatmaps of distributions at the issue.
> e.g.:
> > >>> > > > - Compare of current Rendezvous AF and new Rendezvous AF based
> of
> > >>> > > > Wang/Jenkins hash: https://issues.apache.org/jira
> > >>> > > > /secure/attachment/12858701/004.png
> > >>> > > > - Compare of current Rendezvous AF and new Rendezvous AF based
> of
> > >>> > > > Wang/Jenkins hash with partition balancer:
> > >>> > > https://issues.apache.org/jira
> > >>> > > > /secure/attachment/12858690/balanced.004.png
> > >>> > > >
> > >>> > > > When the balancer is enabled the distribution of partitions by
> > nodes
> > >>> > > looks
> > >>> > > > like close to even distribution
> > >>> > > > but in this case there is not guarantee that a partition
> doesn't
> > >>> move
> > >>> > > from
> > >>> > > > one node to another
> > >>> > > > when node leave topology.
> > >>> > > > It is not guarantee but we try to minimize it because sorted
> > array
> > >>> of
> > >>> > > > nodes is used (like in for pure-Rendezvous AF).
> > >>> > > >
> > >>> > > > I think we can use new fast Rendezvous AF and use 'useBalancer'
> > flag
> > >>> > > > instead of Fair 

Re: Prohibit stateful affinity (FairAffinityFunction)

2017-04-10 Thread Vladimir Ozerov
Sergi,

AFAIK the only reason why RendezvousAffinity is used by default is that
behavior on rebalance is no less important than steady-state performance,
especially in large deployments and cloud environments, where nodes
constantly join and leave the topology. Let's stop guessing and discuss the
numbers - how many partition reassignments happen with the new
RendezvousAffinity flavor? I haven't seen any results so far.

On Mon, Apr 10, 2017 at 4:39 PM, Andrey Gura  wrote:

> Guys,
>
> It seems that both mentioned problem have the same root cause: each
> cache has personal affinity function instance and it leads to
> perfromance problem (we retry the same calcualtions for each cache)
> and problem related with fact that FailAffinityFunction is statefull
> (some co-located cache has different assignment if it was started on
> different topology).
>
> Obvious solution is the same affinity for some cache set. As result
> all caches from one set will use the same assignment that will be
> calculated exactly once and will not depend on cache start topology.
>
>
>
>
>
>
> On Mon, Apr 10, 2017 at 4:05 PM, Sergi Vladykin
>  wrote:
> > As for default value for useBalancer flag, I agree with Yakov, it must be
> > enabled by default. Because performance in steady state is usually more
> > important than performance of rebalancing. For edge cases it can be
> > disabled.
> >
> > Sergi
> >
> > 2017-04-10 15:04 GMT+03:00 Sergi Vladykin :
> >
> >> If the RendezvousAffinity with enabled useBalancer is not much worse
> than
> >> FairAffinity, I see no reason to keep the latter.
> >>
> >> Sergi
> >>
> >> 2017-04-10 13:00 GMT+03:00 Vladimir Ozerov :
> >>
> >>> Guys,
> >>>
> >>> We should not have it enabled by default because as Taras mentioned:
> "but
> >>> in this case there is not guarantee that a partition doesn't move from
> one
> >>> node to another when node leave topology". Let's avoid any rush here.
> >>> There
> >>> is nothing terribly wrong with FairAffinity. It is not enabled by
> default
> >>> and at the very least we can always mark it as deprecated. It is
> better to
> >>> test rigorously rendezvous affinity first in terms of partition
> >>> distribution and partition migration and decide whether results are
> >>> acceptable.
> >>>
> >>> On Mon, Apr 10, 2017 at 12:43 PM, Yakov Zhdanov 
> >>> wrote:
> >>>
> >>> > We should have it enabled by default.
> >>> >
> >>> > --Yakov
> >>> >
> >>> > 2017-04-10 12:42 GMT+03:00 Sergi Vladykin  >:
> >>> >
> >>> > > Why wouldn't we have useBalancer always enabled?
> >>> > >
> >>> > > Sergi
> >>> > >
> >>> > > 2017-04-10 12:31 GMT+03:00 Taras Ledkov :
> >>> > >
> >>> > > > Folks,
> >>> > > >
> >>> > > > I worked on issue https://issues.apache.org/
> jira/browse/IGNITE-3018
> >>> > that
> >>> > > > is related to performance of Rendezvous AF.
> >>> > > >
> >>> > > > But Wang/Jenkins hash integer hash distribution is worse then
> MD5.
> >>> So,
> >>> > i
> >>> > > > try to use simple partition balancer close
> >>> > > > to Fair AF for Rendezvous AF.
> >>> > > >
> >>> > > > Take a look at the heatmaps of distributions at the issue. e.g.:
> >>> > > > - Compare of current Rendezvous AF and new Rendezvous AF based of
> >>> > > > Wang/Jenkins hash: https://issues.apache.org/jira
> >>> > > > /secure/attachment/12858701/004.png
> >>> > > > - Compare of current Rendezvous AF and new Rendezvous AF based of
> >>> > > > Wang/Jenkins hash with partition balancer:
> >>> > > https://issues.apache.org/jira
> >>> > > > /secure/attachment/12858690/balanced.004.png
> >>> > > >
> >>> > > > When the balancer is enabled the distribution of partitions by
> nodes
> >>> > > looks
> >>> > > > like close to even distribution
> >>> > > > but in this case there is not guarantee that a partition doesn't
> >>> move
> >>> > > from
> >>> > > > one node to another
> >>> > > > when node leave topology.
> >>> > > > It is not guarantee but we try to minimize it because sorted
> array
> >>> of
> >>> > > > nodes is used (like in for pure-Rendezvous AF).
> >>> > > >
> >>> > > > I think we can use new fast Rendezvous AF and use 'useBalancer'
> flag
> >>> > > > instead of Fair AF.
> >>> > > >
> >>> > > > On 09.04.2017 14:12, Valentin Kulichenko wrote:
> >>> > > >
> >>> > > >> What is the replacement for FairAffinityFunction?
> >>> > > >>
> >>> > > >> Generally I agree. If FairAffinityFunction can't be changed to
> >>> provide
> >>> > > >> consistent mapping, it should be dropped.
> >>> > > >>
> >>> > > >> -Val
> >>> > > >>
> >>> > > >> On Sun, Apr 9, 2017 at 3:50 AM, Sergi Vladykin <
> >>> > > sergi.vlady...@gmail.com
> >>> > > >> > wrote:
> >>> > > >>
> >>> > > >> Guys,
> >>> > > >>
> >>> > > >> It appeared that our FairAffinityFunction can assign the
> same
> >>> > > >> partitions to
> >>> > > >> different nodes for 

Re: Prohibit stateful affinity (FairAffinityFunction)

2017-04-10 Thread Andrey Gura
Guys,

It seems that both mentioned problems have the same root cause: each
cache has its own affinity function instance. This leads to a
performance problem (we repeat the same calculations for each cache)
and to a problem caused by the fact that FairAffinityFunction is stateful
(co-located caches can get different assignments if they were started on
different topologies).

The obvious solution is to share the same affinity across a set of caches.
As a result, all caches in one set will use the same assignment, which will
be calculated exactly once and will not depend on the cache start topology.
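
A toy sketch of both points follows. ToyFairAffinity and AffinityGroup are hypothetical names invented for this illustration and are not Ignite classes. The toy function only mimics the stateful behaviour (keep the previous assignment, move as little as possible), not the real FairAffinityFunction algorithm.

```python
class ToyFairAffinity:
    """Toy stateful function: remembers its previous assignment and moves
    only as many partitions as needed to keep nodes under an even cap."""

    def __init__(self, partitions):
        self.partitions = partitions
        self.prev = None  # state carried from one topology to the next

    def assign(self, nodes):
        if self.prev is None:
            # First call: plain round-robin over the current topology.
            result = [nodes[p % len(nodes)] for p in range(self.partitions)]
        else:
            cap = -(-self.partitions // len(nodes))  # ceil(partitions / nodes)
            counts = {n: 0 for n in nodes}
            result = []
            for owner in self.prev:
                # Keep the old owner while it is alive and under the cap.
                if owner in counts and counts[owner] < cap:
                    counts[owner] += 1
                    result.append(owner)
                else:
                    result.append(None)
            for p, owner in enumerate(result):
                if owner is None:
                    # Hand freed partitions to the least loaded node.
                    target = min(nodes, key=lambda n: counts[n])
                    counts[target] += 1
                    result[p] = target
        self.prev = result
        return list(result)


class AffinityGroup:
    """Hypothetical shared holder: one function instance per cache set."""

    def __init__(self, partitions):
        self.fn = ToyFairAffinity(partitions)
        self.current = None
        self.computations = 0

    def on_topology_change(self, nodes):
        self.current = self.fn.assign(nodes)  # computed once for all caches
        self.computations += 1

    def assignment_for(self, cache_name):
        return self.current


# Per-cache instances: cache1 lived through {A,B} -> {A,B,C},
# cache2 was started directly on {A,B,C}.
fair1 = ToyFairAffinity(8)
fair1.assign(["A", "B"])
cache1 = fair1.assign(["A", "B", "C"])
cache2 = ToyFairAffinity(8).assign(["A", "B", "C"])
print("per-cache instances agree:", cache1 == cache2)

# Shared instance: both caches read the same assignment.
group = AffinityGroup(8)
group.on_topology_change(["A", "B"])
group.on_topology_change(["A", "B", "C"])
shared1 = group.assignment_for("cache1")
shared2 = group.assignment_for("cache2")
print("shared instance agrees:", shared1 == shared2)
```

With per-cache instances the two caches end up with different owners for some partitions, which is exactly the broken collocation. With the shared holder the assignment is computed once per topology change, regardless of how many caches use it.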






On Mon, Apr 10, 2017 at 4:05 PM, Sergi Vladykin
 wrote:
> As for default value for useBalancer flag, I agree with Yakov, it must be
> enabled by default. Because performance in steady state is usually more
> important than performance of rebalancing. For edge cases it can be
> disabled.
>
> Sergi
>
> 2017-04-10 15:04 GMT+03:00 Sergi Vladykin :
>
>> If the RendezvousAffinity with enabled useBalancer is not much worse than
>> FairAffinity, I see no reason to keep the latter.
>>
>> Sergi
>>
>> 2017-04-10 13:00 GMT+03:00 Vladimir Ozerov :
>>
>>> Guys,
>>>
>>> We should not have it enabled by default because as Taras mentioned: "but
>>> in this case there is not guarantee that a partition doesn't move from one
>>> node to another when node leave topology". Let's avoid any rush here.
>>> There
>>> is nothing terribly wrong with FairAffinity. It is not enabled by default
>>> and at the very least we can always mark it as deprecated. It is better to
>>> test rigorously rendezvous affinity first in terms of partition
>>> distribution and partition migration and decide whether results are
>>> acceptable.
>>>
>>> On Mon, Apr 10, 2017 at 12:43 PM, Yakov Zhdanov 
>>> wrote:
>>>
>>> > We should have it enabled by default.
>>> >
>>> > --Yakov
>>> >
>>> > 2017-04-10 12:42 GMT+03:00 Sergi Vladykin :
>>> >
>>> > > Why wouldn't we have useBalancer always enabled?
>>> > >
>>> > > Sergi
>>> > >
>>> > > 2017-04-10 12:31 GMT+03:00 Taras Ledkov :
>>> > >
>>> > > > Folks,
>>> > > >
>>> > > > I worked on issue https://issues.apache.org/jira/browse/IGNITE-3018
>>> > that
>>> > > > is related to performance of Rendezvous AF.
>>> > > >
>>> > > > But Wang/Jenkins hash integer hash distribution is worse then MD5.
>>> So,
>>> > i
>>> > > > try to use simple partition balancer close
>>> > > > to Fair AF for Rendezvous AF.
>>> > > >
>>> > > > Take a look at the heatmaps of distributions at the issue. e.g.:
>>> > > > - Compare of current Rendezvous AF and new Rendezvous AF based of
>>> > > > Wang/Jenkins hash: https://issues.apache.org/jira
>>> > > > /secure/attachment/12858701/004.png
>>> > > > - Compare of current Rendezvous AF and new Rendezvous AF based of
>>> > > > Wang/Jenkins hash with partition balancer:
>>> > > https://issues.apache.org/jira
>>> > > > /secure/attachment/12858690/balanced.004.png
>>> > > >
>>> > > > When the balancer is enabled the distribution of partitions by nodes
>>> > > looks
>>> > > > like close to even distribution
>>> > > > but in this case there is not guarantee that a partition doesn't
>>> move
>>> > > from
>>> > > > one node to another
>>> > > > when node leave topology.
>>> > > > It is not guarantee but we try to minimize it because sorted array
>>> of
>>> > > > nodes is used (like in for pure-Rendezvous AF).
>>> > > >
>>> > > > I think we can use new fast Rendezvous AF and use 'useBalancer' flag
>>> > > > instead of Fair AF.
>>> > > >
>>> > > > On 09.04.2017 14:12, Valentin Kulichenko wrote:
>>> > > >
>>> > > >> What is the replacement for FairAffinityFunction?
>>> > > >>
>>> > > >> Generally I agree. If FairAffinityFunction can't be changed to
>>> provide
>>> > > >> consistent mapping, it should be dropped.
>>> > > >>
>>> > > >> -Val
>>> > > >>
>>> > > >> On Sun, Apr 9, 2017 at 3:50 AM, Sergi Vladykin <
>>> > > sergi.vlady...@gmail.com
>>> > > >> > wrote:
>>> > > >>
>>> > > >> Guys,
>>> > > >>
>>> > > >> It appeared that our FairAffinityFunction can assign the same
>>> > > >> partitions to
>>> > > >> different nodes for different caches.
>>> > > >>
>>> > > >> It basically means that there is no collocation between the
>>> caches
>>> > > >> at all
>>> > > >> even if they have the same affinity.
>>> > > >>
>>> > > >> As a result all SQL joins will not work (even collocated ones),
>>> > > other
>>> > > >> operations that rely on cache collocation will be either
>>> broken or
>>> > > >> work
>>> > > >> slower, than expected.
>>> > > >>
>>> > > >> All this stuff is really non-obvious. And I see no reason why
>>> we
>>> > > >> should
>>> > > >> allow that. I suggest to prohibit this behavior and drop
>>> > > >> FairAffinityFunction before 2.0. We have to clearly document
>>> that
>>> > > >> the same
>>> > > 

Re: Prohibit stateful affinity (FairAffinityFunction)

2017-04-10 Thread Sergi Vladykin
As for the default value of the useBalancer flag, I agree with Yakov: it must
be enabled by default, because performance in steady state is usually more
important than performance of rebalancing. For edge cases it can be
disabled.

Sergi

2017-04-10 15:04 GMT+03:00 Sergi Vladykin :

> If the RendezvousAffinity with enabled useBalancer is not much worse than
> FairAffinity, I see no reason to keep the latter.
>
> Sergi
>
> 2017-04-10 13:00 GMT+03:00 Vladimir Ozerov :
>
>> Guys,
>>
>> We should not have it enabled by default because as Taras mentioned: "but
>> in this case there is not guarantee that a partition doesn't move from one
>> node to another when node leave topology". Let's avoid any rush here.
>> There
>> is nothing terribly wrong with FairAffinity. It is not enabled by default
>> and at the very least we can always mark it as deprecated. It is better to
>> test rigorously rendezvous affinity first in terms of partition
>> distribution and partition migration and decide whether results are
>> acceptable.
>>
>> On Mon, Apr 10, 2017 at 12:43 PM, Yakov Zhdanov 
>> wrote:
>>
>> > We should have it enabled by default.
>> >
>> > --Yakov
>> >
>> > 2017-04-10 12:42 GMT+03:00 Sergi Vladykin :
>> >
>> > > Why wouldn't we have useBalancer always enabled?
>> > >
>> > > Sergi
>> > >


Re: Prohibit stateful affinity (FairAffinityFunction)

2017-04-10 Thread Sergi Vladykin
If RendezvousAffinity with useBalancer enabled is not much worse than
FairAffinity, I see no reason to keep the latter.

Sergi

2017-04-10 13:00 GMT+03:00 Vladimir Ozerov :

> Guys,
>
> We should not have useBalancer enabled by default because, as Taras
> mentioned, "in this case there is no guarantee that a partition doesn't
> move from one node to another when a node leaves the topology". Let's
> avoid any rush here. There is nothing terribly wrong with FairAffinity.
> It is not enabled by default, and at the very least we can always mark it
> as deprecated. It is better to rigorously test rendezvous affinity first,
> in terms of both partition distribution and partition migration, and then
> decide whether the results are acceptable.


Re: Prohibit stateful affinity (FairAffinityFunction)

2017-04-10 Thread Vladimir Ozerov
Guys,

We should not have useBalancer enabled by default because, as Taras
mentioned, "in this case there is no guarantee that a partition doesn't
move from one node to another when a node leaves the topology". Let's
avoid any rush here. There is nothing terribly wrong with FairAffinity.
It is not enabled by default, and at the very least we can always mark it
as deprecated. It is better to rigorously test rendezvous affinity first,
in terms of both partition distribution and partition migration, and then
decide whether the results are acceptable.
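
The two metrics named above (partition distribution and partition
migration) can be measured with a small harness. The following is a minimal
Python sketch, not Ignite code: `rendezvous` is a plain
highest-random-weight assignment using MD5, `modulo` is a deliberately naive
baseline added only for contrast, and the node names, the partition count,
and the `evaluate` helper are all assumptions made for illustration.

```python
import hashlib
from collections import Counter

PARTS = 1024  # assumed partition count, for illustration only


def rendezvous(nodes, parts=PARTS):
    """Primary node per partition: the node with the highest hash score."""
    def score(node, part):
        digest = hashlib.md5(f"{node}:{part}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    ordered = sorted(nodes)
    return [max(ordered, key=lambda n: score(n, p)) for p in range(parts)]


def modulo(nodes, parts=PARTS):
    """Naive baseline: perfectly even, but sensitive to topology changes."""
    ordered = sorted(nodes)
    return [ordered[p % len(ordered)] for p in range(parts)]


def evaluate(assign, nodes):
    """Return (min load, max load, average migrated primaries per node leave)."""
    base = assign(nodes)
    loads = Counter(base)
    migrated = []
    for gone in nodes:
        after = assign([n for n in nodes if n != gone])
        migrated.append(sum(1 for b, a in zip(base, after) if b != a))
    return min(loads.values()), max(loads.values()), sum(migrated) / len(migrated)


nodes = [f"node-{i:02d}" for i in range(16)]
for fn in (rendezvous, modulo):
    lo, hi, avg = evaluate(fn, nodes)
    print(f"{fn.__name__}: min={lo} max={hi} avg_migrated={avg:.1f}")
```

Under these assumptions, the rendezvous variant migrates only the
partitions owned by the leaving node, while the modulo baseline is perfectly
even but moves most partitions. A balancer sits somewhere between the two,
which is why both numbers need to be reported together.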

On Mon, Apr 10, 2017 at 12:43 PM, Yakov Zhdanov wrote:

> We should have it enabled by default.
>
> --Yakov


Re: Prohibit stateful affinity (FairAffinityFunction)

2017-04-10 Thread Sergi Vladykin
Why wouldn't we have useBalancer always enabled?

Sergi

2017-04-10 12:31 GMT+03:00 Taras Ledkov :

> Folks,
>
> I worked on issue https://issues.apache.org/jira/browse/IGNITE-3018, which
> is related to the performance of Rendezvous AF.
>
> However, the distribution of the Wang/Jenkins integer hash is worse than
> that of MD5. So I tried using a simple partition balancer, close to the one
> in Fair AF, for Rendezvous AF.
>
> Take a look at the heatmaps of the distributions attached to the issue, e.g.:
> - Comparison of the current Rendezvous AF and the new Rendezvous AF based on
> the Wang/Jenkins hash:
> https://issues.apache.org/jira/secure/attachment/12858701/004.png
> - Comparison of the current Rendezvous AF and the new Rendezvous AF based on
> the Wang/Jenkins hash with the partition balancer:
> https://issues.apache.org/jira/secure/attachment/12858690/balanced.004.png
>
> When the balancer is enabled, the distribution of partitions across nodes
> looks close to even, but in this case there is no guarantee that a
> partition doesn't move from one node to another when a node leaves the
> topology. It is not guaranteed, but we try to minimize such moves because a
> sorted array of nodes is used (as in pure Rendezvous AF).
>
> I think we can use the new fast Rendezvous AF with the 'useBalancer' flag
> instead of Fair AF.


Re: Prohibit stateful affinity (FairAffinityFunction)

2017-04-10 Thread Yakov Zhdanov
We should have it enabled by default.

--Yakov

2017-04-10 12:42 GMT+03:00 Sergi Vladykin :

> Why wouldn't we have useBalancer always enabled?
>
> Sergi


Re: Prohibit stateful affinity (FairAffinityFunction)

2017-04-10 Thread Taras Ledkov

Folks,

I worked on issue https://issues.apache.org/jira/browse/IGNITE-3018, which
is related to the performance of Rendezvous AF.

However, the distribution of the Wang/Jenkins integer hash is worse than
that of MD5. So I tried using a simple partition balancer, close to the one
in Fair AF, for Rendezvous AF.

Take a look at the heatmaps of the distributions attached to the issue, e.g.:
- Comparison of the current Rendezvous AF and the new Rendezvous AF based on
the Wang/Jenkins hash:
https://issues.apache.org/jira/secure/attachment/12858701/004.png
- Comparison of the current Rendezvous AF and the new Rendezvous AF based on
the Wang/Jenkins hash with the partition balancer:
https://issues.apache.org/jira/secure/attachment/12858690/balanced.004.png

When the balancer is enabled, the distribution of partitions across nodes
looks close to even, but in this case there is no guarantee that a
partition doesn't move from one node to another when a node leaves the
topology. It is not guaranteed, but we try to minimize such moves because a
sorted array of nodes is used (as in pure Rendezvous AF).

I think we can use the new fast Rendezvous AF with the 'useBalancer' flag
instead of Fair AF.
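
The trade-off described above can be sketched in a few lines of Python.
This is not the code from IGNITE-3018: the balancer here is a hypothetical
one that simply caps each node at ceil(partitions / nodes) and falls back to
the next node in rendezvous order, and the `use_balancer` parameter, the MD5
scoring, and the node names are assumptions made for illustration.

```python
import hashlib
from collections import Counter
from math import ceil


def score(node, part):
    digest = hashlib.md5(f"{node}:{part}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def ranked(nodes, part):
    """Nodes ordered by descending rendezvous score for this partition."""
    return sorted(sorted(nodes), key=lambda n: score(n, part), reverse=True)


def assign(nodes, parts, use_balancer=False):
    cap = ceil(parts / len(nodes))  # hypothetical per-node limit
    load = Counter()
    result = []
    for p in range(parts):
        order = ranked(nodes, p)
        if use_balancer:
            # Take the best-ranked node that still has room.
            owner = next(n for n in order if load[n] < cap)
        else:
            owner = order[0]
        load[owner] += 1
        result.append(owner)
    return result


def moved_between_survivors(before, after, gone):
    """Partitions that changed owner although their owner did not leave."""
    return sum(1 for b, a in zip(before, after) if b != gone and b != a)


nodes = [f"node-{i}" for i in range(8)]
survivors = [n for n in nodes if n != "node-3"]
for flag in (False, True):
    before = assign(nodes, 1024, flag)
    after = assign(survivors, 1024, flag)
    loads = Counter(before)
    print(flag, min(loads.values()), max(loads.values()),
          moved_between_survivors(before, after, "node-3"))
```

Without the cap, a partition changes owner only if its owner left. With the
cap, the load is even by construction, but the owner of a partition depends
on how full the other nodes already are, so partitions may also move between
surviving nodes.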


On 09.04.2017 14:12, Valentin Kulichenko wrote:

What is the replacement for FairAffinityFunction?

Generally I agree. If FairAffinityFunction can't be changed to provide 
consistent mapping, it should be dropped.


-Val

On Sun, Apr 9, 2017 at 3:50 AM, Sergi Vladykin wrote:

Guys,

It turned out that our FairAffinityFunction can assign the same partitions
to different nodes for different caches.

This basically means that there is no collocation between the caches at
all, even if they have the same affinity.

As a result, SQL joins will not work (even collocated ones), and other
operations that rely on cache collocation will either be broken or work
slower than expected.

None of this is obvious, and I see no reason why we should allow it. I
suggest we prohibit this behavior and drop FairAffinityFunction before 2.0.
We have to clearly document that the same affinity function must provide
the same partition assignments for all caches.

Also, I know that Taras Ledkov was working on a decent stateless
replacement for FairAffinity, so we should not lose anything here.

Thoughts?

Sergi
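
The collocation problem raised in the message above can be illustrated with
a toy example. The Python sketch below is not FairAffinityFunction:
`stateful_fair` is a hypothetical balancer that starts from the previous
assignment and levels it, and `stateless` is a plain rendezvous assignment
using MD5. The node names and the partition count are assumptions. Cache A
is created on three nodes and then sees a fourth node join, while cache B is
created after the fourth node has joined.

```python
import hashlib

PARTS = 8  # small partition count, for illustration only


def stateful_fair(prev, nodes, parts=PARTS):
    """Toy stateful balancer: keeps the previous assignment and levels it."""
    if prev is None:
        return {p: nodes[p % len(nodes)] for p in range(parts)}
    live = set(nodes)
    assignment = {p: n for p, n in prev.items() if n in live}
    load = {n: 0 for n in nodes}
    for n in assignment.values():
        load[n] += 1
    # Give partitions of departed nodes to the least loaded node.
    for p in range(parts):
        if p not in assignment:
            target = min(nodes, key=lambda n: load[n])
            assignment[p] = target
            load[target] += 1
    # Move partitions from the most to the least loaded node until level.
    while max(load.values()) - min(load.values()) > 1:
        src = max(nodes, key=lambda n: load[n])
        dst = min(nodes, key=lambda n: load[n])
        part = max(p for p, n in assignment.items() if n == src)
        assignment[part] = dst
        load[src] -= 1
        load[dst] += 1
    return assignment


def stateless(nodes, parts=PARTS):
    """Rendezvous assignment: depends only on the current topology."""
    def score(node, part):
        digest = hashlib.md5(f"{node}:{part}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return {p: max(sorted(nodes), key=lambda n: score(n, p)) for p in range(parts)}


three = ["n1", "n2", "n3"]
four = three + ["n4"]

cache_a = stateful_fair(stateful_fair(None, three), four)  # saw the join
cache_b = stateful_fair(None, four)                        # created later
mismatch = [p for p in range(PARTS) if cache_a[p] != cache_b[p]]
print("partitions that are not collocated:", mismatch)
print("stateless assignments equal:", stateless(four) == stateless(four))
```

Both caches run on the same four nodes with the same function, yet the
stateful variant places some partitions on different nodes because the two
caches have different histories. A stateless function cannot diverge this
way, since its result depends only on the current topology.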




--
Taras Ledkov
Mail-To: tled...@gridgain.com