It's hbase.regionserver.metahandler.count. Not sure it causes the issue you're facing, though. What's your HBase version?
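For reference, these handler pools are sized in hbase-site.xml. A minimal sketch with illustrative values (the priority property name is the one mentioned above; only the replication default of 3 is confirmed later in this thread):

```xml
<!-- hbase-site.xml sketch: values are illustrative, not recommendations -->
<property>
  <name>hbase.regionserver.metahandler.count</name>
  <value>10</value> <!-- "PRI IPC" priority handlers -->
</property>
<property>
  <name>hbase.regionserver.replication.handler.count</name>
  <value>3</value>  <!-- "REPL IPC" handlers; default is 3 in 0.94 -->
</property>
```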
On Tue, Dec 10, 2013 at 1:21 PM, Federico Gaule <[email protected]> wrote:

There is another set of handlers we haven't customized, "PRI IPC" (priority?). What are those handlers used for, and what is the property used to increase their number? hbase.regionserver.custom.priority.handler.count?

Thanks!

2013/12/10 Federico Gaule <[email protected]>:

I've increased hbase.regionserver.replication.handler.count 10x (to 30), but nothing has changed. rpc.metrics.RpcQueueTime_avg_time still shows activity :(

Mon Dec 09 14:04:10 EST 2013  REPL IPC Server handler 29 on 60000  WAITING (since 16hrs, 58mins, 56sec ago)  Waiting for a call (since 16hrs, 58mins, 56sec ago)
Mon Dec 09 14:04:10 EST 2013  REPL IPC Server handler 28 on 60000  WAITING (since 16hrs, 58mins, 56sec ago)  Waiting for a call (since 16hrs, 58mins, 56sec ago)
Mon Dec 09 14:04:10 EST 2013  REPL IPC Server handler 27 on 60000  WAITING (since 16hrs, 58mins, 56sec ago)  Waiting for a call (since 16hrs, 58mins, 56sec ago)
Mon Dec 09 14:04:10 EST 2013  REPL IPC Server handler 26 on 60000  WAITING (since 16hrs, 58mins, 56sec ago)  Waiting for a call (since 16hrs, 58mins, 56sec ago)
...
Mon Dec 09 14:04:10 EST 2013  REPL IPC Server handler 2 on 60000  WAITING (since 16hrs, 58mins, 56sec ago)  Waiting for a call (since 16hrs, 58mins, 56sec ago)
Mon Dec 09 14:04:10 EST 2013  REPL IPC Server handler 1 on 60000  WAITING (since 16hrs, 58mins, 56sec ago)  Waiting for a call (since 16hrs, 58mins, 56sec ago)
Mon Dec 09 14:04:10 EST 2013  REPL IPC Server handler 0 on 60000  WAITING (since 16hrs, 58mins, 56sec ago)  Waiting for a call (since 16hrs, 58mins, 56sec ago)

Thanks JM

2013/12/9 Jean-Marc Spaggiari <[email protected]>:

Yes, the default value is 3 in 0.94.14. If you have not changed it, then it's still 3.
conf.getInt("hbase.regionserver.replication.handler.count", 3);

Keep us posted on the results.

JM

2013/12/9 Federico Gaule <[email protected]>:

Default value for hbase.regionserver.replication.handler.count (I can't find what the default is; is it 3?). I'll try increasing that property.

Fri Dec 06 12:44:12 EST 2013  REPL IPC Server handler 2 on 60020  WAITING (since 8sec ago)  Waiting for a call (since 8sec ago)
Fri Dec 06 12:44:12 EST 2013  REPL IPC Server handler 1 on 60020  WAITING (since 8sec ago)  Waiting for a call (since 8sec ago)
Fri Dec 06 12:44:12 EST 2013  REPL IPC Server handler 0 on 60020  WAITING (since 2sec ago)  Waiting for a call (since 2sec ago)

Thanks JM

2013/12/9 Jean-Marc Spaggiari <[email protected]>:

For replication, the handlers used on the slave cluster are configured by hbase.regionserver.replication.handler.count. What value do you have for this property?

JM

2013/12/9 Federico Gaule <[email protected]>:

Here is a thread saying what I think it should be (http://grokbase.com/t/hbase/user/13bmndq53k/average-rpc-queue-time):

"The RpcQueueTime metrics are a measurement of how long individual calls stay in this queued state. If your handlers were never 100% occupied, this value would be 0. An average of 3 hours is concerning; it basically means that when a call comes into the RegionServer it takes on average 3 hours to start processing, because handlers are all occupied for that amount of time."

Is that correct?
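The behaviour described in that quote (queue time stays near zero while free handlers exist, and climbs once every handler is occupied) can be sketched outside HBase with plain java.util.concurrent. This is an illustrative model, not HBase's actual RPC scheduler: a fixed thread pool plays the role of the handler threads, and "queue time" is the delay between submitting a call and a handler starting it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

public class QueueTimeDemo {
    // Run `calls` tasks of `workMs` each on `handlers` threads and return the
    // average time (ms) a task sat in the queue before a handler picked it up.
    static long avgQueueMs(int handlers, int calls, long workMs) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(handlers);
        AtomicLong totalWaitMs = new AtomicLong();
        CountDownLatch done = new CountDownLatch(calls);
        for (int i = 0; i < calls; i++) {
            final long enqueuedAt = System.currentTimeMillis();
            pool.execute(() -> {
                // Queue time: how long this call waited for a free handler.
                totalWaitMs.addAndGet(System.currentTimeMillis() - enqueuedAt);
                try {
                    Thread.sleep(workMs); // simulate the call being processed
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                done.countDown();
            });
        }
        done.await();
        pool.shutdown();
        return totalWaitMs.get() / calls;
    }

    public static void main(String[] args) throws InterruptedException {
        // Handlers never all busy: queue time stays near 0.
        System.out.println("free handlers:     avg queue ms = " + avgQueueMs(8, 4, 100));
        // Handlers saturated: later calls must wait, so the average climbs.
        System.out.println("handlers all busy: avg queue ms = " + avgQueueMs(2, 8, 100));
    }
}
```

With 8 calls of ~100 ms on 2 handlers, the later calls wait one or more full work periods, so the average is well above zero, which matches the "nonzero RpcQueueTime means handlers were 100% occupied" reading above.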
2013/12/9 Federico Gaule <[email protected]>:

Correct me if I'm wrong, but queues should be used only when handlers are all busy, shouldn't they? If that's true, I don't get why there is activity related to queues.

Maybe I'm missing some piece of knowledge about when HBase uses queues :)

Thanks

2013/12/9 Jean-Marc Spaggiari <[email protected]>:

There might be something I'm missing ;)

On cluster B, as you said, never more than 50% of your handlers are used. Your Ganglia metrics are showing that there is activity (num ops is increasing), which is correct.

Can you please confirm what you think is wrong from your charts?

Thanks,

JM

2013/12/9 Federico Gaule <[email protected]>:

Hi JM,
Cluster B is only receiving replication data (writes), but the handlers are waiting most of the time (never more than 50% of them are used). As I have read, the RPC queue is only used when all handlers are busy; does that hold for replication as well?

Thanks!

2013/12/9 Jean-Marc Spaggiari <[email protected]>:

Hi,

When you say that B doesn't get any read/write operation, does it mean you stopped the replication?
Or is B still getting the write operations from A because of the replication? If so, that's why your RPC queue is used...

JM

2013/12/9 Federico Gaule <[email protected]>:

Not much information in the RS logs (DEBUG level set on org.apache.hadoop.hbase). Here is a sample from one regionserver showing increasing rpc.metrics.RpcQueueTime_num_ops and rpc.metrics.RpcQueueTime_avg_time activity:

2013-12-09 08:09:10,699 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=23.14 MB, free=2.73 GB, max=2.75 GB, blocks=0, accesses=122442151, hits=122168501, hitRatio=99.77%, , cachingAccesses=122192927, cachingHits=122162378, cachingHitsRatio=99.97%, , evictions=0, evicted=6768, evictedPerRun=Infinity
2013-12-09 08:09:11,396 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1
2013-12-09 08:09:14,979 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 2
2013-12-09 08:09:16,016 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1
...
2013-12-09 08:14:07,659 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1
2013-12-09 08:14:08,713 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 3
2013-12-09 08:14:10,699 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=23.14 MB, free=2.73 GB, max=2.75 GB, blocks=0, accesses=122442151, hits=122168501, hitRatio=99.77%, , cachingAccesses=122192927, cachingHits=122162378, cachingHitsRatio=99.97%, , evictions=0, evicted=6768, evictedPerRun=Infinity
2013-12-09 08:14:12,711 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1
2013-12-09 08:14:14,778 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 3
...
2013-12-09 08:15:09,199 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 3
2013-12-09 08:15:12,243 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 2
2013-12-09 08:15:22,086 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 2

Thanks

2013/12/7 Bharath Vissapragada <[email protected]>:

I'd look into the RS logs to see what's happening there. Difficult to guess from the given information!

On Sat, Dec 7, 2013 at 8:52 PM, Federico Gaule <[email protected]> wrote:

Any clue?

On Dec 5, 2013 9:49 a.m., "Federico Gaule" <[email protected]> wrote:

Hi,

I have 2 clusters, with Master (A) - Slave (B) replication.
B has no client writes or reads; all handlers (100) are waiting, but rpc.metrics.RpcQueueTime_num_ops and rpc.metrics.RpcQueueTime_avg_time report that RPC calls are being queued.
There are some screenshots below showing the Ganglia metrics.
How is this behaviour explained? I have looked for metrics specifications but can't find much information.

Handlers: http://i42.tinypic.com/242ssoz.png
NumOps: http://tinypic.com/r/of2c8k/5
AvgTime: http://tinypic.com/r/2lsvg5w/5

Cheers

--
Bharath Vissapragada
<http://www.cloudera.com>

--
Ing. Federico Gaule
Technical Lead - PAM <[email protected]>
Av. Corrientes 746 - Piso 9 - C.A.B.A. (C1043AAU)
tel. +54 (11) 4894-3500
Despegar.com, Inc. - The best price for your trip.

This message is confidential and may contain information protected by professional privilege. If you have received this e-mail in error, please notify us immediately by replying to this e-mail and then delete it from your system. The contents of this message must not be copied or disclosed to anyone.
