It'd be great if we had different settings for inter- and intra-DC read repair.
-ryan

On Fri, Jul 29, 2011 at 5:06 PM, Jake Luciani <jak...@gmail.com> wrote:
> Yes, it's read repair. You can lower the read repair chance to tune this.
>
>
>
> On Jul 29, 2011, at 6:31 PM, Aaron Griffith <aaron.c.griff...@gmail.com>
> wrote:
>
>> I currently have a nine-node Cassandra cluster set up as follows:
>>
>> DC1: six nodes
>> DC2: three nodes
>>
>> The tokens alternate between the two datacenters.
>>
>> I have Hadoop installed as tasktrackers/datanodes on the
>> three Cassandra nodes in DC2.
>>
>> There is another, non-Cassandra node that is used as the Hadoop namenode /
>> job tracker.
>>
>> When running Pig scripts pointed at a node in DC2 with LOCAL_QUORUM read
>> consistency, I am seeing network and CPU spikes on the nodes in DC1. I was
>> not expecting any impact on those nodes when LOCAL_QUORUM is used.
>>
>> Could read repair be causing the traffic/CPU spikes?
>>
>> The replication factor for DC1 is 5, and for DC2 is 1.
>>
>> When looking at the map tasks, I am seeing input splits for machines in
>> both datacenters. I am not sure what this means. My thought is
>> that it should only be getting data from the nodes in DC2.
>>
>> Thanks
>>
>> Aaron
>>
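For reference, Jake's suggestion can be applied per column family. A minimal sketch using cassandra-cli (the column family name `MyCF` and the value 0.1 are placeholders, not from the thread):

```
update column family MyCF with read_repair_chance = 0.1;
```

With read_repair_chance lowered, only that fraction of reads triggers a background digest comparison against replicas in other datacenters, which should reduce the cross-DC traffic observed on DC1 during LOCAL_QUORUM reads from DC2.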