RE: Question about data distribution

Stanislav Lukyanov Fri, 26 Jan 2018 07:48:05 -0800

Hi,

How many data entries do you have? 
Are the IDs that you use for affinity mapping evenly distributed among the entries?


Can you show the code that you use to define the affinity mapping and your 
cache configuration?

Also, what exactly do you mean by "node gets X IDs to work with"?
Do you mean that a node stores X IDs? How do you check that?

Data distribution across partitions (and, consequently, across nodes) is based
on hashing, so it is only probabilistically guaranteed to be fairly even,
provided that the IDs themselves are evenly distributed and that the data set
is large enough. With only a handful of distinct IDs, a lopsided split is
entirely possible.
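
To illustrate the point about hashing, here is a minimal standalone Java sketch
(it does not use Ignite itself). The sequential integer IDs, the mixing function
and the partition-to-node rule (partition % nodes) are assumptions made purely
for illustration; Ignite's real affinity function assigns partitions to nodes
differently. Only the partition count of 1024 mirrors Ignite's default.

```java
public class DistributionSketch {
    // Ignite's RendezvousAffinityFunction uses 1024 partitions by default.
    static final int PARTITIONS = 1024;

    // Stand-in hash mixer (the MurmurHash3 32-bit finalizer).
    // Assumption: this is NOT the hash Ignite applies internally.
    static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Map an affinity ID to a partition in the range [0, PARTITIONS).
    static int partition(int id) {
        return Math.floorMod(mix(id), PARTITIONS);
    }

    // Simplified assumption: partitions simply alternate between nodes.
    static int node(int partition, int nodes) {
        return partition % nodes;
    }

    // Count how many of the IDs 0..idCount-1 end up on each node.
    static int[] idsPerNode(int idCount, int nodes) {
        int[] counts = new int[nodes];
        for (int id = 0; id < idCount; id++) {
            counts[node(partition(id), nodes)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Compare a handful of IDs with progressively larger ID sets.
        for (int n : new int[] {10, 1_000, 100_000}) {
            int[] c = idsPerNode(n, 2);
            System.out.printf("%d IDs -> node0=%d, node1=%d%n", n, c[0], c[1]);
        }
    }
}
```

With around 10 IDs nothing forces the two counts to be close to each other,
whereas with many distinct IDs the shares should approach 50/50.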

Thanks,
Stan

From: svonn
Sent: 26 January 2018, 18:01
To: [email protected]
Subject: Question about data distribution

Hi!

I have two server nodes and I've set up an AffinityKey mapping via some ID.
I'm streaming data from Kafka to Ignite, and in my test data about 5 minutes'
worth of data belongs to one ID before the data for the next ID starts (real
data will mostly arrive in parallel). The cache I'm streaming the data into
has an expiration policy of about 30 minutes.

I've noticed that the data seems to get very unevenly distributed.
One node sometimes gets 9 IDs to work with, while the other one only works
on a single ID.
Is that due to the fact that they aren't arriving simultaneously? Can this
behaviour be adjusted?

Best regards
svonn





--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
