There might be something I'm missing ;) On cluster B, as you said, never more than 50% of your handlers are used. Your Ganglia metrics show that there is activity (num ops is increasing), which is correct.
Can you please confirm what you think is wrong from your charts?

Thanks,

JM

2013/12/9 Federico Gaule <[email protected]>

> Hi JM,
>
> Cluster B is only receiving replication data (writes), but handlers are waiting most of the time (never 50% of them are used). As I have read, the RPC queue is only used when handlers are all waiting; does it count for replication as well?
>
> Thanks!
>
> 2013/12/9 Jean-Marc Spaggiari <[email protected]>
>
> > Hi,
> >
> > When you say that B doesn't get any read/write operation, does it mean you stopped the replication? Or is B still getting the write operations from A because of the replication? If so, that's why your RPC queue is used...
> >
> > JM
> >
> > 2013/12/9 Federico Gaule <[email protected]>
> >
> > > Not much information in RS logs (DEBUG level set to org.apache.hadoop.hbase). Here is a sample of one regionserver showing increasing rpc.metrics.RpcQueueTime_num_ops and rpc.metrics.RpcQueueTime_avg_time activity:
> > >
> > > 2013-12-09 08:09:10,699 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=23.14 MB, free=2.73 GB, max=2.75 GB, blocks=0, accesses=122442151, hits=122168501, hitRatio=99.77%, , cachingAccesses=122192927, cachingHits=122162378, cachingHitsRatio=99.97%, , evictions=0, evicted=6768, evictedPerRun=Infinity
> > > 2013-12-09 08:09:11,396 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1
> > > 2013-12-09 08:09:14,979 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 2
> > > 2013-12-09 08:09:16,016 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1
> > > ...
> > > 2013-12-09 08:14:07,659 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1
> > > 2013-12-09 08:14:08,713 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 3
> > > 2013-12-09 08:14:10,699 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=23.14 MB, free=2.73 GB, max=2.75 GB, blocks=0, accesses=122442151, hits=122168501, hitRatio=99.77%, , cachingAccesses=122192927, cachingHits=122162378, cachingHitsRatio=99.97%, , evictions=0, evicted=6768, evictedPerRun=Infinity
> > > 2013-12-09 08:14:12,711 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1
> > > 2013-12-09 08:14:14,778 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 3
> > > ...
> > > 2013-12-09 08:15:09,199 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 3
> > > 2013-12-09 08:15:12,243 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 2
> > > 2013-12-09 08:15:22,086 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 2
> > >
> > > Thanks
> > >
> > > 2013/12/7 Bharath Vissapragada <[email protected]>
> > >
> > > > I'd look into the RS logs to see what's happening there. Difficult to guess from the given information!
> > > >
> > > > On Sat, Dec 7, 2013 at 8:52 PM, Federico Gaule <[email protected]> wrote:
> > > >
> > > > > Any clue?
> > > > > On Dec 5, 2013 9:49 AM, "Federico Gaule" <[email protected]> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I have 2 clusters, Master (a) - Slave (b) replication.
> > > > > > B doesn't have client writes or reads; all handlers (100) are waiting, but rpc.metrics.RpcQueueTime_num_ops and rpc.metrics.RpcQueueTime_avg_time report RPC calls being queued.
> > > > > > There are some screenshots below to show Ganglia metrics. How is this behaviour explained? I have looked for metrics specifications but can't find much information.
> > > > > >
> > > > > > Handlers
> > > > > > http://i42.tinypic.com/242ssoz.png
> > > > > >
> > > > > > NumOps
> > > > > > http://tinypic.com/r/of2c8k/5
> > > > > >
> > > > > > AvgTime
> > > > > > http://tinypic.com/r/2lsvg5w/5
> > > > > >
> > > > > > Cheers
> > > >
> > > > --
> > > > Bharath Vissapragada
> > > > <http://www.cloudera.com>
> > >
> > > --
> > > Ing. Federico Gaule
> > > Technical Lead - PAM <[email protected]>
> > > Av. Corrientes 746 - Piso 9 - C.A.B.A. (C1043AAU)
> > > tel. +54 (11) 4894-3500
> > > Despegar.com, Inc.
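
One rough way to check whether the replication traffic alone could account for the rising num_ops is to count the ReplicationSink lines in the regionserver log quoted in the thread. The sketch below is only an illustration: the function name `summarize_sink_activity` and the `SAMPLE` list are made up for this example, the regular expression is written against the log format shown in the thread, and it assumes (not confirmed in the thread) that each "Total replicated: N" line corresponds to one replication batch received over RPC.

```python
import re
from datetime import datetime

# Matches ReplicationSink lines in the format quoted in the thread, e.g.
# "2013-12-09 08:09:11,396 INFO ...ReplicationSink: Total replicated: 1"
SINK_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} INFO\s+"
    r"\S*ReplicationSink: Total replicated: (\d+)"
)


def summarize_sink_activity(lines):
    """Count sink batches and replicated edits in an iterable of log lines.

    Assumption: one "Total replicated" line equals one incoming batch.
    Lines that do not match (for example LruBlockCache stats) are ignored.
    """
    timestamps = []
    edits = 0
    for line in lines:
        match = SINK_LINE.match(line.strip())
        if match is None:
            continue
        # Milliseconds are dropped; second resolution is enough for a rate.
        timestamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
        edits += int(match.group(2))
    span = (max(timestamps) - min(timestamps)).total_seconds() if timestamps else 0.0
    return {"batches": len(timestamps), "edits": edits, "span_seconds": span}


# A few lines copied from the log excerpt in the thread (cache stats shortened).
SAMPLE = [
    "2013-12-09 08:09:10,699 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=23.14 MB",
    "2013-12-09 08:09:11,396 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1",
    "2013-12-09 08:09:14,979 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 2",
    "2013-12-09 08:09:16,016 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1",
]

if __name__ == "__main__":
    print(summarize_sink_activity(SAMPLE))
```

Running the same function over a full regionserver log (for instance `summarize_sink_activity(open(path))`, with `path` pointing at your own log file) gives a batch count that can be compared with the growth of rpc.metrics.RpcQueueTime_num_ops over the same window. If the two grow at a similar rate, the queue metric is most likely just reflecting replication calls.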
