Re: Replica data distributing between racks
On Wed, May 4, 2011 at 10:09 AM, Konstantin Naryshkin wrote:
> The way that I understand it (and that seems to be consistent with what
> was said in this discussion) is that each DC has its own data space.
> Using your simplified 0-9 system:
>
>   Token  DC1   DC2
>   0      D1R1  D2R2
>   1      D1R1  D2R1
>   2      D1R1  D2R1
>   3      D1R1  D2R1
>   4      D1R1  D2R1
>   5      D1R2  D2R1
>   6      D1R2  D2R2
>   7      D1R2  D2R2
>   8      D1R2  D2R2
>   9      D1R2  D2R2
>
> Each node is responsible for half of the ring in its own DC.

Okay, that makes sense from a primary distribution perspective, but how do
the nodes magically know where to send the data? When using NTS, if there
are two nodes with overlapping tokens, does NTS choose the "closest" node
to place the primary on? If that is the case, then it makes sense.

As far as the replication distribution goes: with a replica going to each
data center ({DC1:1,DC2:1}), does NTS take the token and find the "closest"
node in the opposite data center? So for token 7 in D1 replicating to D2,
it will look for a node with a token range closest to that? In this
scenario it would go to D2R2? That makes sense as far as why the
replication was hot spotting before, where my tokens were N,M,O,P where N
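The placement rule Konstantin describes can be sanity-checked with a short
script. A minimal sketch, not Cassandra source: it puts DC1's nodes at
tokens 0 and 5 and DC2's at 1 and 6 (the "+1 bump" layout discussed in this
thread), and uses the ownership convention the table reads with, i.e. a
node owns keys from its own token up to the next token in its DC. Cassandra
internally assigns a node the range (previous token, token], but the
balance works out the same.

    dcs = {
        "DC1": {0: "D1R1", 5: "D1R2"},
        "DC2": {1: "D2R1", 6: "D2R2"},
    }

    def owner(key, ring):
        # The node with the largest token <= key; wrap to the highest
        # token if the key precedes every token in this DC.
        tokens = sorted(ring)
        eligible = [t for t in tokens if t <= key]
        return ring[eligible[-1] if eligible else tokens[-1]]

    print("Token  DC1   DC2")
    for key in range(10):
        print(f"{key:<6} {owner(key, dcs['DC1']):<5} {owner(key, dcs['DC2'])}")

Running it prints exactly the table above, including token 7 landing on
D2R2 in the second data center.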
Re: Replica data distributing between racks
The way that I understand it (and that seems to be consistent with what was
said in this discussion) is that each DC has its own data space. Using your
simplified 0-9 system:

  Token  DC1   DC2
  0      D1R1  D2R2
  1      D1R1  D2R1
  2      D1R1  D2R1
  3      D1R1  D2R1
  4      D1R1  D2R1
  5      D1R2  D2R1
  6      D1R2  D2R2
  7      D1R2  D2R2
  8      D1R2  D2R2
  9      D1R2  D2R2

Each node is responsible for half of the ring in its own DC.

----- Original Message -----
From: "Eric tamme"
To: user@cassandra.apache.org
Sent: Wednesday, May 4, 2011 1:58:19 PM
Subject: Re: Replica data distributing between racks

> Jonathan is suggesting the approach Jeremiah was using.
>
> Calculate the tokens for the nodes in each DC independently, and then add
> 1 to the tokens if there are two nodes with the same tokens.
>
> In your case with 2 DCs with 2 nodes each:
>
> In DC 1
> node 1 = 0
> node 2 = 85070591730234615865843651857942052864
>
> In DC 2
> node 1 = 1
> node 2 = 85070591730234615865843651857942052865
>
> This will evenly distribute the keys in each DC, which is what the
> NetworkTopologyStrategy is trying to do.

Okay - I appreciate the direct solution, but I am still really confused. I
think I am missing something conceptual here... it just isn't "clicking".

If I have 4 nodes in two data centers, each in its own rack: DC1R1, DC1R2,
DC2R1, DC2R2, with tokens:

DC1R1: N
DC1R2: M
DC2R1: N+1
DC2R2: M+1

who is responsible for what in primary distribution and in replication? Is
DC1R2 responsible for M to M+1 (aka 1 token, M)??? That doesn't make any
sense... or am I supposed to make primary distribution uneven so that the
uneven replication then balances it?

I am trying to conceptualize this... I drew up a graph of the range
responsibility based on this token assignment, using a simplified token
range of 0-9: http://dl.dropbox.com/u/19254184/tokens.jpg

I must be missing something, I just don't know what. Please, if someone can
explain or point me to resources that clearly explain this.

Thanks for everyone's time
-Eric
Re: Replica data distributing between racks
> Jonathan is suggesting the approach Jeremiah was using.
>
> Calculate the tokens for the nodes in each DC independently, and then add
> 1 to the tokens if there are two nodes with the same tokens.
>
> In your case with 2 DCs with 2 nodes each:
>
> In DC 1
> node 1 = 0
> node 2 = 85070591730234615865843651857942052864
>
> In DC 2
> node 1 = 1
> node 2 = 85070591730234615865843651857942052865
>
> This will evenly distribute the keys in each DC, which is what the
> NetworkTopologyStrategy is trying to do.

Okay - I appreciate the direct solution, but I am still really confused. I
think I am missing something conceptual here... it just isn't "clicking".

If I have 4 nodes in two data centers, each in its own rack: DC1R1, DC1R2,
DC2R1, DC2R2, with tokens:

DC1R1: N
DC1R2: M
DC2R1: N+1
DC2R2: M+1

who is responsible for what in primary distribution and in replication? Is
DC1R2 responsible for M to M+1 (aka 1 token, M)??? That doesn't make any
sense... or am I supposed to make primary distribution uneven so that the
uneven replication then balances it?

I am trying to conceptualize this... I drew up a graph of the range
responsibility based on this token assignment, using a simplified token
range of 0-9: http://dl.dropbox.com/u/19254184/tokens.jpg

I must be missing something, I just don't know what. Please, if someone can
explain or point me to resources that clearly explain this.

Thanks for everyone's time
-Eric
Re: Replica data distributing between racks
Eric,

Jonathan is suggesting the approach Jeremiah was using.

Calculate the tokens for the nodes in each DC independently, and then add 1
to the tokens if there are two nodes with the same tokens.

In your case with 2 DCs with 2 nodes each:

In DC 1
node 1 = 0
node 2 = 85070591730234615865843651857942052864

In DC 2
node 1 = 1
node 2 = 85070591730234615865843651857942052865

This will evenly distribute the keys in each DC, which is what the
NetworkTopologyStrategy is trying to do.

You can make this change using nodetool move. Hope that helps.

Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 4 May 2011, at 08:20, Eric tamme wrote:

> On Tue, May 3, 2011 at 4:08 PM, Jonathan Ellis wrote:
>> On Tue, May 3, 2011 at 2:46 PM, aaron morton wrote:
>>> Jonathan,
>>> I think you are saying each DC should have its own (logical) token
>>> ring.
>>
>> Right. (Only with NTS, although you'd usually end up with a similar
>> effect if you alternate DC locations for nodes in a ONTS cluster.)
>>
>>> But currently two endpoints cannot have the same token regardless of
>>> the DC they are in.
>>
>> Also right.
>>
>>> Or should people just bump the tokens in extra DCs to avoid the
>>> collision?
>>
>> Yes.
>
> I am sorry, but I do not understand fully. I would appreciate it if
> someone could explain with more verbosity for me.
>
> I do not understand why data insertion is even, but replication is not.
>
> I do not understand how to solve the problem. What does "bumping" tokens
> entail - is that going to change my insertion distribution? I had no idea
> you could create different logical token rings... and I am not sure what
> that exactly means, or that I even want to do it. Is there a clear
> solution to "fixing" the problem I laid out, and getting replication data
> evenly distributed between racks in each DC?
>
> Sorry again for needing more verbosity - I am learning as I go with this
> stuff. I appreciate everyone's help.
>
> -Eric
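The token arithmetic Aaron describes is mechanical enough to script. A
minimal sketch, assuming the usual balanced RandomPartitioner formula
(i * 2**127 / node_count) with the DC's index as the collision bump; the
numbers it prints match Aaron's:

    RING = 2 ** 127  # RandomPartitioner token space

    def tokens_for_dc(node_count, dc_offset):
        # Evenly spaced tokens for one DC's logical ring, bumped by
        # dc_offset so no two nodes in the cluster share a token.
        return [i * RING // node_count + dc_offset for i in range(node_count)]

    for offset, dc in enumerate(["DC 1", "DC 2"]):
        for n, token in enumerate(tokens_for_dc(2, offset), start=1):
            print(f"{dc} node {n} = {token}")

Each resulting token would then be applied with nodetool move, as Aaron
notes.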
Re: Replica data distributing between racks
On Tue, May 3, 2011 at 4:08 PM, Jonathan Ellis wrote:
> On Tue, May 3, 2011 at 2:46 PM, aaron morton wrote:
>> Jonathan,
>> I think you are saying each DC should have its own (logical) token
>> ring.
>
> Right. (Only with NTS, although you'd usually end up with a similar
> effect if you alternate DC locations for nodes in a ONTS cluster.)
>
>> But currently two endpoints cannot have the same token regardless of
>> the DC they are in.
>
> Also right.
>
>> Or should people just bump the tokens in extra DCs to avoid the
>> collision?
>
> Yes.

I am sorry, but I do not understand fully. I would appreciate it if someone
could explain with more verbosity for me.

I do not understand why data insertion is even, but replication is not.

I do not understand how to solve the problem. What does "bumping" tokens
entail - is that going to change my insertion distribution? I had no idea
you could create different logical token rings... and I am not sure what
that exactly means, or that I even want to do it. Is there a clear solution
to "fixing" the problem I laid out, and getting replication data evenly
distributed between racks in each DC?

Sorry again for needing more verbosity - I am learning as I go with this
stuff. I appreciate everyone's help.

-Eric
Re: Replica data distributing between racks
On Tue, May 3, 2011 at 2:46 PM, aaron morton wrote:
> Jonathan,
> I think you are saying each DC should have its own (logical) token
> ring.

Right. (Only with NTS, although you'd usually end up with a similar effect
if you alternate DC locations for nodes in a ONTS cluster.)

> But currently two endpoints cannot have the same token regardless of
> the DC they are in.

Also right.

> Or should people just bump the tokens in extra DCs to avoid the
> collision?

Yes.

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
Re: Replica data distributing between racks
Jonathan,
I think you are saying each DC should have its own (logical) token ring,
which makes sense as the only way to balance the load in each DC. I think
most people (including me) assumed there was a single token ring for the
entire cluster.

But currently two endpoints cannot have the same token regardless of the
DC they are in. Or should people just bump the tokens in extra DCs to
avoid the collision?

Cheers
Aaron

On 4 May 2011, at 03:03, Eric tamme wrote:

> On Tue, May 3, 2011 at 10:13 AM, Jonathan Ellis wrote:
>> Right, when you are computing balanced RP tokens for NTS you need to
>> compute the tokens for each DC independently.
>
> I am confused ... sorry. Are you saying that ... I need to change how my
> keys are calculated to fix this problem? Or are you talking about the
> implementation of how replication selects a token?
>
> -Eric
Re: Replica data distributing between racks
On Tue, May 3, 2011 at 10:13 AM, Jonathan Ellis wrote:
> Right, when you are computing balanced RP tokens for NTS you need to
> compute the tokens for each DC independently.

I am confused ... sorry. Are you saying that ... I need to change how my
keys are calculated to fix this problem? Or are you talking about the
implementation of how replication selects a token?

-Eric
RE: Replica data distributing between racks
So we are currently running a 10 node ring in one DC, and we are going to
be adding 5 more nodes in another DC. To keep the rings in each DC
balanced, should I really calculate the tokens independently and just make
sure none of them are the same? Something like:

DC1 (RF 5):
 1: 0
 2: 17014118346046923173168730371588410572
 3: 34028236692093846346337460743176821144
 4: 51042355038140769519506191114765231716
 5: 68056473384187692692674921486353642288
 6: 85070591730234615865843651857942052860
 7: 102084710076281539039012382229530463432
 8: 119098828422328462212181112601118874004
 9: 136112946768375385385349842972707284576
10: 153127065114422308558518573344295695148

DC2 (RF 3):
 1: 1 (one off from DC1 node 1)
 2: 34028236692093846346337460743176821145 (one off from DC1 node 3)
 3: 68056473384187692692674921486353642290 (two off from DC1 node 5)
 4: 102084710076281539039012382229530463435 (three off from DC1 node 7)
 5: 136112946768375385385349842972707284580 (four off from DC1 node 9)

Originally I was thinking I should spread the DC2 nodes evenly in between
every other DC1 node. Or does it not matter where they are with respect to
the DC1 nodes, as long as they fall somewhere after every other DC1 node,
so that the order is DC1-1, DC2-1, DC1-2, DC1-3, DC2-2, DC1-4, DC1-5...?
(A sketch checking the balance of these assignments follows this message.)

-----Original Message-----
From: Jonathan Ellis [mailto:jbel...@gmail.com]
Sent: Tuesday, May 03, 2011 9:14 AM
To: user@cassandra.apache.org
Subject: Re: Replica data distributing between racks

Right, when you are computing balanced RP tokens for NTS you need to
compute the tokens for each DC independently.

On Tue, May 3, 2011 at 6:23 AM, aaron morton wrote:
> I've been digging into this and was able to reproduce something; not
> sure if it's a fault, and I can't work on it any more tonight.
>
> To reproduce:
> - 2 node cluster on my MacBook
> - set the tokens as if they were nodes 3 and 4 in a 4 node cluster, e.g.
> node 1 with 85070591730234615865843651857942052864 and node 2 with
> 127605887595351923798765477786913079296
> - set cassandra-topology.properties to put the nodes in DC1 on RAC1 and
> RAC2
> - create a keyspace using NTS and strategy_options = [{DC1:1}]
>
> Inserted 10 rows; they were distributed as
> - node 1 - 9 rows
> - node 2 - 1 row
>
> I *think* the problem has to do with TokenMetadata.firstTokenIndex(). It
> often says the closest token to a key is node 1 because in effect...
>
> - node 1 is responsible for 0 to 85070591730234615865843651857942052864
> - node 2 is responsible for 85070591730234615865843651857942052864 to
> 127605887595351923798765477786913079296
> - AND node 1 does the wrap around from
> 127605887595351923798765477786913079296 to 0, as keys that would insert
> past the last token in the ring array wrap to 0 because insertMin is
> false.
>
> Thoughts ?
>
> Aaron
>
> On 3 May 2011, at 10:29, Eric tamme wrote:
>
>> On Mon, May 2, 2011 at 5:59 PM, aaron morton wrote:
>>> My bad, I missed the way TokenMetadata.ringIterator() and
>>> firstTokenIndex() work.
>>>
>>> Eric, can you show the output from nodetool ring ?
>>
>> Sorry if the previous paste was way too unformatted, here is a
>> pastie.org link with nicer formatting of nodetool ring output than
>> plain text email allows.
>>
>> http://pastie.org/private/50khpakpffjhsmgf66oetg

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
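Here is the sketch referenced above. It computes each node's share of its
own DC's logical ring from the proposed assignments (the bump values here
are illustrative; any small unique bump avoids collisions):

    RING = 2 ** 127

    dc1 = [i * RING // 10 for i in range(10)]          # the ten DC1 tokens
    dc2 = [t + k + 1 for k, t in enumerate(dc1[::2])]  # every other DC1 token, bumped

    def shares(tokens):
        # Fraction of the DC-local ring each node covers; a node with
        # token T owns (previous token, T], wrapping at the ring edge.
        ts = sorted(tokens)
        return [((t - ts[i - 1]) % RING) / RING for i, t in enumerate(ts)]

    print([f"{s:.3f}" for s in shares(dc1)])  # ten nodes at ~0.100 each
    print([f"{s:.3f}" for s in shares(dc2)])  # five nodes at ~0.200 each

If this reading of NTS is right, balance within a DC depends only on the
spacing of that DC's own tokens, so where the DC2 tokens fall relative to
DC1's should not matter for distribution.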
Re: Replica data distributing between racks
Right, when you are computing balanced RP tokens for NTS you need to
compute the tokens for each DC independently.

On Tue, May 3, 2011 at 6:23 AM, aaron morton wrote:
> I've been digging into this and was able to reproduce something; not
> sure if it's a fault, and I can't work on it any more tonight.
>
> To reproduce:
> - 2 node cluster on my MacBook
> - set the tokens as if they were nodes 3 and 4 in a 4 node cluster, e.g.
> node 1 with 85070591730234615865843651857942052864 and node 2 with
> 127605887595351923798765477786913079296
> - set cassandra-topology.properties to put the nodes in DC1 on RAC1 and
> RAC2
> - create a keyspace using NTS and strategy_options = [{DC1:1}]
>
> Inserted 10 rows; they were distributed as
> - node 1 - 9 rows
> - node 2 - 1 row
>
> I *think* the problem has to do with TokenMetadata.firstTokenIndex(). It
> often says the closest token to a key is node 1 because in effect...
>
> - node 1 is responsible for 0 to 85070591730234615865843651857942052864
> - node 2 is responsible for 85070591730234615865843651857942052864 to
> 127605887595351923798765477786913079296
> - AND node 1 does the wrap around from
> 127605887595351923798765477786913079296 to 0, as keys that would insert
> past the last token in the ring array wrap to 0 because insertMin is
> false.
>
> Thoughts ?
>
> Aaron
>
> On 3 May 2011, at 10:29, Eric tamme wrote:
>
>> On Mon, May 2, 2011 at 5:59 PM, aaron morton wrote:
>>> My bad, I missed the way TokenMetadata.ringIterator() and
>>> firstTokenIndex() work.
>>>
>>> Eric, can you show the output from nodetool ring ?
>>
>> Sorry if the previous paste was way too unformatted, here is a
>> pastie.org link with nicer formatting of nodetool ring output than
>> plain text email allows.
>>
>> http://pastie.org/private/50khpakpffjhsmgf66oetg

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
Re: Replica data distributing between racks
I've been digging into this and was able to reproduce something; not sure
if it's a fault, and I can't work on it any more tonight.

To reproduce:
- 2 node cluster on my MacBook
- set the tokens as if they were nodes 3 and 4 in a 4 node cluster, e.g.
node 1 with 85070591730234615865843651857942052864 and node 2 with
127605887595351923798765477786913079296
- set cassandra-topology.properties to put the nodes in DC1 on RAC1 and RAC2
- create a keyspace using NTS and strategy_options = [{DC1:1}]

Inserted 10 rows; they were distributed as
- node 1 - 9 rows
- node 2 - 1 row

I *think* the problem has to do with TokenMetadata.firstTokenIndex(). It
often says the closest token to a key is node 1 because in effect...

- node 1 is responsible for 0 to 85070591730234615865843651857942052864
- node 2 is responsible for 85070591730234615865843651857942052864 to
127605887595351923798765477786913079296
- AND node 1 does the wrap around from
127605887595351923798765477786913079296 to 0, as keys that would insert
past the last token in the ring array wrap to 0 because insertMin is false.

(See the sketch after this message for the skew these ranges imply.)

Thoughts ?

Aaron

On 3 May 2011, at 10:29, Eric tamme wrote:

> On Mon, May 2, 2011 at 5:59 PM, aaron morton wrote:
>> My bad, I missed the way TokenMetadata.ringIterator() and
>> firstTokenIndex() work.
>>
>> Eric, can you show the output from nodetool ring ?
>
> Sorry if the previous paste was way too unformatted, here is a pastie.org
> link with nicer formatting of nodetool ring output than plain text email
> allows.
>
> http://pastie.org/private/50khpakpffjhsmgf66oetg
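The skew implied by the ranges in Aaron's message is easy to simulate. A
rough sketch, not the actual TokenMetadata logic, assuming keys hash
uniformly over the RandomPartitioner space: node 1 ends up covering three
quarters of the full ring, which with only 10 rows can easily show up as
9 to 1.

    import random

    RING = 2 ** 127
    NODE1 = RING // 2      # 85070591730234615865843651857942052864
    NODE2 = 3 * RING // 4  # 127605887595351923798765477786913079296

    random.seed(42)
    counts = {NODE1: 0, NODE2: 0}
    for _ in range(100_000):
        key = random.randrange(RING)
        # Node 2 owns (NODE1, NODE2]; node 1 owns everything else,
        # including the wrap-around from NODE2 back to 0 described above.
        counts[NODE2 if NODE1 < key <= NODE2 else NODE1] += 1

    print(f"node 1: {counts[NODE1] / 1000:.1f}%")  # ~75%
    print(f"node 2: {counts[NODE2] / 1000:.1f}%")  # ~25%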
Re: Replica data distributing between racks
On Mon, May 2, 2011 at 5:59 PM, aaron morton wrote:
> My bad, I missed the way TokenMetadata.ringIterator() and
> firstTokenIndex() work.
>
> Eric, can you show the output from nodetool ring ?

Sorry if the previous paste was way too unformatted, here is a pastie.org
link with nicer formatting of nodetool ring output than plain text email
allows.

http://pastie.org/private/50khpakpffjhsmgf66oetg
Re: Replica data distributing between racks
On Mon, May 2, 2011 at 5:59 PM, aaron morton wrote:
> My bad, I missed the way TokenMetadata.ringIterator() and
> firstTokenIndex() work.
>
> Eric, can you show the output from nodetool ring ?

Here is output from nodetool ring - IP addresses changed, obviously.

Address             Status State   Load       Owns    Token
                                                      127605887595351923798765477786913079296
:0::111:0:0:0:      Up     Normal  195.28 GB  25.00%  0
:0::111:0:0:0:aaab  Up     Normal  47.12 GB   25.00%  42535295865117307932921825928971026432
:0::112:0:0:0:      Up     Normal  189.96 GB  25.00%  85070591730234615865843651857942052864
:0::112:0:0:0:aaab  Up     Normal  42.82 GB   25.00%  127605887595351923798765477786913079296
Re: Replica data distributing between racks
My bad, I missed the way TokenMetadata.ringIterator() and firstTokenIndex()
work.

Eric, can you show the output from nodetool ring ?

Aaron

On 3 May 2011, at 07:30, Eric tamme wrote:

> On Mon, May 2, 2011 at 3:22 PM, Jonathan Ellis wrote:
>> On Mon, May 2, 2011 at 2:18 PM, aaron morton wrote:
>>> When the NTS selects replicas in a DC it orders the tokens available
>>> in the DC, then (in the first pass) iterates through them placing a
>>> replica in each unique rack. e.g. if the RF in each DC was 2, the
>>> replicas would be put on 2 unique racks if possible. So the lowest
>>> token in the DC will *always* get a write.
>>
>> It's supposed to start w/ the node closest to the token in each DC, so
>> that shouldn't be the case unless you are using BOP/OPP instead of RP.
>
> I am using the RandomPartitioner as shown below:
>
> Cluster Information:
>   Snitch: org.apache.cassandra.locator.PropertyFileSnitch
>   Partitioner: org.apache.cassandra.dht.RandomPartitioner
>
> So as far as "closeness"... how does that get factored in when using a
> PropertyFileSnitch? Is one rack closer than the other? In reality, for
> each data center there are two nodes in the same rack on the same
> switch, but I set the topology file up to have 2 racks per data center
> specifically so I would get distribution.
>
> -Eric
Re: Replica data distributing between racks
On Mon, May 2, 2011 at 3:22 PM, Jonathan Ellis wrote:
> On Mon, May 2, 2011 at 2:18 PM, aaron morton wrote:
>> When the NTS selects replicas in a DC it orders the tokens available in
>> the DC, then (in the first pass) iterates through them placing a
>> replica in each unique rack. e.g. if the RF in each DC was 2, the
>> replicas would be put on 2 unique racks if possible. So the lowest
>> token in the DC will *always* get a write.
>
> It's supposed to start w/ the node closest to the token in each DC, so
> that shouldn't be the case unless you are using BOP/OPP instead of RP.

I am using the RandomPartitioner as shown below:

Cluster Information:
  Snitch: org.apache.cassandra.locator.PropertyFileSnitch
  Partitioner: org.apache.cassandra.dht.RandomPartitioner

So as far as "closeness"... how does that get factored in when using a
PropertyFileSnitch? Is one rack closer than the other? In reality, for
each data center there are two nodes in the same rack on the same switch,
but I set the topology file up to have 2 racks per data center
specifically so I would get distribution.

-Eric
Re: Replica data distributing between racks
On Mon, May 2, 2011 at 2:18 PM, aaron morton wrote:
> When the NTS selects replicas in a DC it orders the tokens available in
> the DC, then (in the first pass) iterates through them placing a replica
> in each unique rack. e.g. if the RF in each DC was 2, the replicas would
> be put on 2 unique racks if possible. So the lowest token in the DC will
> *always* get a write.

It's supposed to start w/ the node closest to the token in each DC, so
that shouldn't be the case unless you are using BOP/OPP instead of RP.

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
Re: Replica data distributing between racks
That appears to be working correctly, but does not sound great.

When the NTS selects replicas in a DC it orders the tokens available in
the DC, then (in the first pass) iterates through them placing a replica
in each unique rack. e.g. if the RF in each DC was 2, the replicas would
be put on 2 unique racks if possible. So the lowest token in the DC will
*always* get a write. (A sketch of this first pass follows at the end of
this message.)

It's not possible to load balance between the racks as there is no state
shared between requests. A possible alternative would be to find the
nearest token to the key and start allocating replicas from there. But as
each DC contains only a part (say half) of the token range, the likelihood
is that half of the keys would match to either end of the DC's range, so
it would not be a great solution.

I think what you are trying to achieve is not possible. Do you have the
capacity to run RF 2 in each DC? That would at least even things out.

Aaron

On 3 May 2011, at 06:40, Eric tamme wrote:

> I am experiencing an issue where replication is not being distributed
> between racks when using PropertyFileSnitch in conjunction with
> NetworkTopologyStrategy.
>
> I am running 0.7.3 from a tar.gz on cassandra.apache.org.
>
> I have 4 nodes, 2 data centers, and 2 racks in each data center. Each
> rack has 1 node.
>
> I have even token distribution so that each node gets 25%:
>
> 0
> 42535295865117307932921825928971026432
> 85070591730234615865843651857942052864
> 127605887595351923798765477786913079296
>
> My cassandra-topology.properties is as follows:
>
> # Cassandra Node IP=Data Center:Rack
> \:0\:\:\:\:fffe=NY1:RAC1
> \:0\:\:\:\:=NY1:RAC2
>
> \:0\:\:\:\:fffe=LA1:RAC1
> \:0\:\:\:\:=LA1:RAC2
>
> # default for unknown nodes
> default=NY1:RAC1
>
> My keyspace replication strategy is as follows:
>
> Keyspace: SipTrace:
>   Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
>     Options: [LA1:1,NY1:1]
>
> So each data center should get 1 copy of the data, and this does happen.
> The problem is that the replicated copies get pinned to the first host
> configured in the properties file, from what I can discern, and DO NOT
> distribute between racks. So I have 2 nodes that have a 4 to 1 ratio of
> data compared to the other 2 nodes. This is a problem!
>
> Can anyone please tell me if I have misconfigured this? Or how I can get
> replica data to distribute evenly between racks within a data center? I
> was led to believe that Cassandra will try to distribute replica data
> between racks automatically under this setup.
>
> Thank you for your help in advance!
>
> -Eric
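Here is the sketch referenced in Aaron's reply. It is one reading of the
first-pass behaviour he describes, not the actual NetworkTopologyStrategy
source: the DC's nodes are walked in token order for every key, taking one
replica per unique rack, so with one replica per DC the lowest token always
wins.

    # Eric's setup within one DC: (token, rack); tokens illustrative.
    DC_NODES = [
        (0, "RAC1"),
        (85070591730234615865843651857942052864, "RAC2"),
    ]

    def place_replicas(rf):
        # Walk nodes in token order (the same order for every key),
        # placing one replica on each rack not yet used.
        replicas, seen_racks = [], set()
        for token, rack in sorted(DC_NODES):
            if rack not in seen_racks:
                replicas.append(token)
                seen_racks.add(rack)
            if len(replicas) == rf:
                break
        return replicas

    print(place_replicas(1))  # always [0]: replicas pin to one node
    print(place_replicas(2))  # both racks used, which is why RF 2 evens out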