If this description is accurate, then it sounds like my only available workaround would be to not use multiget() and instead issue multiple get() calls to random nodes so that I can hit the other replicas.
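Roughly what I have in mind is the sketch below. To be clear, this is just an illustration, not a real API: NodeClient is a hypothetical stand-in for one open connection per node (in 0.6 that would be a Thrift Cassandra.Client), and the thread-pool fan-out is mine.

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ScatteredGets {
    // Hypothetical stand-in for a connection to one Cassandra node.
    interface NodeClient {
        byte[] get(String key) throws Exception;
    }

    private final List<NodeClient> nodes;  // one connection per node
    private final ExecutorService pool = Executors.newFixedThreadPool(16);
    private final Random random = new Random();

    ScatteredGets(List<NodeClient> nodes) {
        this.nodes = nodes;
    }

    // Fan a multiget out as one get() per key, each sent to a randomly
    // chosen coordinator, instead of one multiget_slice() to a single node.
    Map<String, byte[]> scatteredGet(List<String> keys)
            throws InterruptedException, ExecutionException {
        Map<String, Future<byte[]>> pending = new HashMap<String, Future<byte[]>>();
        for (final String key : keys) {
            // Pick a random coordinator so "closest replica" isn't
            // evaluated from the same node for every key.
            final NodeClient node = nodes.get(random.nextInt(nodes.size()));
            pending.put(key, pool.submit(new Callable<byte[]>() {
                public byte[] call() throws Exception {
                    return node.get(key);
                }
            }));
        }
        Map<String, byte[]> results = new HashMap<String, byte[]>();
        for (Map.Entry<String, Future<byte[]>> e : pending.entrySet()) {
            results.put(e.getKey(), e.getValue().get());  // blocks per key
        }
        return results;
    }
}

One caveat I can see: randomizing the coordinator only helps if the snitch ranks replicas differently from different coordinators (e.g. when the coordinator is itself a replica); otherwise every coordinator may still pick the same "closest" node.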
Edmond

On Thu, May 27, 2010 at 2:36 PM, Cagatay Kavukcuoglu
<caga...@kavukcuoglu.org> wrote:
> I think this is because, as an optimization, Cassandra sends a read
> request only to the closest replica and sends digest requests to the
> other replicas for read repair. The same replica is probably getting
> chosen as the closest for all of your read requests. It might be a
> useful improvement to choose a random node among equally close nodes
> (if there's more than one). That would spread read load better across
> replicas in a single data center, where a lot of nodes are equidistant
> from each other.
>
> CK.
>
> On Thu, May 27, 2010 at 3:59 PM, Edmond Lau <edm...@ooyala.com> wrote:
>> Occasionally, one of my six nodes gets a very high
>> MESSAGE-DESERIALIZER-POOL pending count (over 100K). When that
>> happens, it usually also has a fairly high ROW-READ-STAGE pending
>> count, around 4K. All other nodes have very low load and no pending
>> tasks. From reading other threads, this is usually a symptom of GC
>> occurring.
>>
>> When this scenario happens, my multiget_slice() queries across ~128
>> keys at consistency level ONE typically fail to return within 30
>> seconds, even though they normally return in under 50 ms. I would
>> have expected that at consistency level ONE, Cassandra could bypass
>> the locked-up node: my understanding is that the coordinator issues
>> parallel lookups for a key and waits only for the first of the three
>> replicas to respond.
>>
>> I'm using the random partitioner, a replication factor of 3 with
>> rack-aware placement, and machines with 32 GB of RAM and 12 GB
>> allocated to the Java heap. I've set my Cassandra RPC timeout to 30
>> seconds.
>>
>> Does anyone have thoughts about why this might happen?
>>
>> Edmond
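For reference, the tie-breaking Cagatay suggests might look something like the sketch below. Endpoint and proximity here are made-up stand-ins for whatever the snitch actually exposes, not Cassandra's real types.

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

public class ReplicaSelection {
    // Hypothetical scored replica: an address plus the "distance" the
    // snitch assigns it (lower = closer).
    static class Endpoint {
        final String address;
        final int proximity;
        Endpoint(String address, int proximity) {
            this.address = address;
            this.proximity = proximity;
        }
    }

    // Sort replicas by proximity, then shuffle within each run of
    // equally close replicas, so the node chosen for the full data read
    // rotates among equidistant candidates instead of always being the
    // same one.
    static List<Endpoint> rankWithTieShuffle(List<Endpoint> replicas,
                                             Random rnd) {
        List<Endpoint> ranked = new ArrayList<Endpoint>(replicas);
        Collections.sort(ranked, new Comparator<Endpoint>() {
            public int compare(Endpoint a, Endpoint b) {
                return a.proximity - b.proximity;
            }
        });
        int start = 0;
        for (int i = 1; i <= ranked.size(); i++) {
            if (i == ranked.size()
                    || ranked.get(i).proximity != ranked.get(start).proximity) {
                // Shuffle this group of equally close replicas in place.
                Collections.shuffle(ranked.subList(start, i), rnd);
                start = i;
            }
        }
        // Element 0 gets the data read; the rest get digest reads.
        return ranked;
    }
}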