Re: Cassandra Pig with network topology and data centers.

Jeremy Hanna Fri, 29 Jul 2011 18:05:18 -0700

fwiw - https://issues.apache.org/jira/browse/CASSANDRA-2970


thoughts? (please post on the ticket)

On Jul 29, 2011, at 7:08 PM, Ryan King wrote:

> It'd be great if we had different settings for inter- and intra-DC read 
> repair.
> 
> -ryan
> 
> On Fri, Jul 29, 2011 at 5:06 PM, Jake Luciani <jak...@gmail.com> wrote:
>> Yes, it's read repair. You can lower the read repair chance to tune this.
>> 
>> 
>> 
>> On Jul 29, 2011, at 6:31 PM, Aaron Griffith <aaron.c.griff...@gmail.com> 
>> wrote:
>> 
>>> I currently have a nine-node Cassandra cluster set up as follows:
>>> 
>>> DC1: Six nodes
>>> DC2: Three nodes
>>> 
>>> The tokens alternate between the two datacenters.
>>> 
>>> I have Hadoop installed as tasktracker/datanodes on the
>>> three Cassandra nodes in DC2.
>>> 
>>> There is another non-Cassandra node that is used as the Hadoop
>>> namenode / jobtracker.
>>> 
>>> When running Pig scripts pointed at a node in DC2 with LOCAL_QUORUM as the
>>> read consistency level, I am seeing network and CPU spikes on the nodes in
>>> DC1.  I was not expecting any impact on those nodes when LOCAL_QUORUM is used.
>>> 
>>> Can read repair be causing the traffic/CPU spikes?
>>> 
>>> The replication factor for DC1 is 5, and for DC2 it is 1.
>>> 
>>> When looking at the map tasks I am seeing input splits for computers in
>>> both data centers.  I am not sure what this means.  My thought is
>>> that it should only be getting data from the nodes in DC2.
>>> 
>>> Thanks
>>> 
>>> Aaron
>>> 
>>
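
For anyone following along, here is a rough back-of-the-envelope sketch of why
lowering the read repair chance should reduce the load on DC1. It is not
Cassandra code. It assumes that a read selected for read repair contacts every
replica of the row regardless of data center, and the function names are made
up for illustration. The replication factors are the ones from the thread.

```python
# Rough model only, not Cassandra code. All names below are illustrative.

def local_quorum(rf):
    """Replicas that must answer a LOCAL_QUORUM read in a DC with this RF."""
    return rf // 2 + 1


def expected_remote_requests(read_repair_chance, remote_rf):
    """Average remote-DC replicas contacted per read.

    Assumption: when read repair is triggered for a read, every replica
    of the row is queried, including those in the other data center.
    """
    return read_repair_chance * remote_rf


# Replication factors from the thread: DC1 = 5, DC2 = 1.
RF_DC1, RF_DC2 = 5, 1

print("LOCAL_QUORUM in DC2 needs", local_quorum(RF_DC2), "replica(s)")
for chance in (1.0, 0.5, 0.1, 0.0):
    remote = expected_remote_requests(chance, RF_DC1)
    print("read_repair_chance =", chance, "->", remote,
          "DC1 requests per read on average")
```

Under that assumption, the consistency level only decides how many replies the
coordinator waits for, while the read repair chance decides how often the
DC1 replicas are contacted at all.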
