fwiw - https://issues.apache.org/jira/browse/CASSANDRA-2970
thoughts? (please post on the ticket)

On Jul 29, 2011, at 7:08 PM, Ryan King wrote:

> It'd be great if we had different settings for inter- and intra-DC read
> repair.
>
> -ryan
>
> On Fri, Jul 29, 2011 at 5:06 PM, Jake Luciani <jak...@gmail.com> wrote:
>> Yes, it's read repair. You can lower the read repair chance to tune this.
>>
>> On Jul 29, 2011, at 6:31 PM, Aaron Griffith <aaron.c.griff...@gmail.com>
>> wrote:
>>
>>> I currently have a 9-node Cassandra cluster set up as follows:
>>>
>>> DC1: Six nodes
>>> DC2: Three nodes
>>>
>>> The tokens alternate between the two datacenters.
>>>
>>> I have Hadoop installed as tasktracker/datanodes on the
>>> three Cassandra nodes in DC2.
>>>
>>> There is another non-Cassandra node that is used as the Hadoop namenode /
>>> job tracker.
>>>
>>> When running Pig scripts pointed to a node in DC2 using LOCAL_QUORUM as read
>>> consistency, I am seeing network and CPU spikes on the nodes in DC1. I was
>>> not expecting any impact on those nodes when local quorum is used.
>>>
>>> Can read repair be causing the traffic/CPU spikes?
>>>
>>> The replication setting for DC1 is 5, and for DC2 is 1.
>>>
>>> When looking at the map tasks I am seeing input splits for computers in
>>> both data centers. I am not sure what this means. My thought is
>>> that it should only be getting data from the nodes in DC2.
>>>
>>> Thanks
>>>
>>> Aaron
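
For anyone following along, here is a rough back-of-the-envelope sketch of why LOCAL_QUORUM reads in DC2 can still load DC1. It rests on a simplified model and is not taken from the Cassandra source: it assumes that whenever read repair fires for a read, every replica in every datacenter is contacted, and that the consistency level only decides how many responses the coordinator waits for. The function names are made up for illustration; the replication numbers are the ones from Aaron's mail.

```python
def local_quorum(rf_local: int) -> int:
    """Replicas in the local DC that must answer a LOCAL_QUORUM read."""
    return rf_local // 2 + 1


def expected_remote_replica_reads(read_repair_chance: float, rf_remote: int) -> float:
    """Average number of remote-DC replicas touched per read.

    Assumption (simplified model): a read that triggers read repair
    contacts all rf_remote replicas in the other datacenter.
    """
    return read_repair_chance * rf_remote


if __name__ == "__main__":
    # Replication settings from the original mail: DC1 = 5, DC2 = 1.
    rf = {"DC1": 5, "DC2": 1}

    print("LOCAL_QUORUM in DC2 waits for", local_quorum(rf["DC2"]), "replica(s)")

    # Compare a read repair chance of 1.0 with a lowered value of 0.1.
    for chance in (1.0, 0.1):
        remote = expected_remote_replica_reads(chance, rf["DC1"])
        print(f"read_repair_chance={chance}: ~{remote} DC1 replica reads per read")
```

Under that model, lowering the read repair chance scales the DC1 load down proportionally, which matches Jake's suggestion. It does not separate inter-DC from intra-DC repair, which is the gap Ryan points out.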