[ 
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753807#comment-13753807
 ] 

stack commented on HBASE-7525:
------------------------------

I tried it.  It is great.  Nice utility.  But usage needs cleanup because 
otherwise users will be confused on how to use it.

Did it always just run if you did not provide an argument: i.e. if I do this:

 ./hbase-0.95.3-SNAPSHOT/bin/hbase --config /home/stack/conf_hbase 
org.apache.hadoop.hbase.tool.Canary

... it goes off and does something; default looks to go and get from all 
regions.

You add 2013-08-29 09:32:16,463 DEBUG [main] tool.Canary: runCount=2.  What 
does it mean?

Here is current usage:


{code}
Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts] [table/regionserver 
1 [table/regionserver 2...]]
 where [opts] are:
   -help          Show this help and exit.
   -regionserver  replace the table argument to regionserver,
      which means to enable regionserver mode
   -daemon        Continuous check at defined intervals.
   -interval <N>  Interval between checks (sec)
   -e             Use region/regionserver as regular expression
      which means the region/regionserver is regular expression pattern
   -f <B>         stop whole program if first error occurs, default is true
   -t <N>         timeout for a check, default is 600000 (milisecs)
{code}

First, the formatting is off in the above.

So, if I supply -regionserver, then the last argument is a regionserver 
hostname rather than a table it seems? (I tried it and it seems so)

I do not know how to read 'table/regionserver 1'.  What is the '1'?

Would it be clearer if the -regionserver option took arguments as in 
-regionserver=rs1,rs2,rs3 etc.?

How to interpret this then:

Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary -regionserver=rs1 table1

Would above only get regions from table1 on rs1?  If no regions from table1 
then it would print out there are none?  Or if you pass a table1 when you have 
a -regionserver option specified, you could just fail with "Cannot pass a 
tablename when using the -regionserver option" -- that'd probably be simplest.
                
> A canary monitoring program specifically for regionserver
> ---------------------------------------------------------
>
>                 Key: HBASE-7525
>                 URL: https://issues.apache.org/jira/browse/HBASE-7525
>             Project: HBase
>          Issue Type: New Feature
>          Components: monitoring
>    Affects Versions: 0.94.0
>            Reporter: takeshi.miao
>            Priority: Critical
>             Fix For: 0.98.0
>
>         Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch, 
> HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-0.95-v6.patch, 
> HBASE-7525-trunk-v2.patch, HBASE-7525-v0.patch, RegionServerCanary.java
>
>
> *Motivation*
> This ticket is to provide a canary monitoring tool specifically for 
> HRegionserver, details as follows
> 1. This tool is required by operation team due to they thought that the 
> canary for each region of a HBase is too many for them, so I implemented this 
> coarse-granular one based on the original o.a.h.h.tool.Canary for them
> 2. And this tool is implemented by multi-threading, which means the each Get 
> request sent by a thread. the reason I use this way is due to we suffered the 
> region server hung issue by now the root cause is still not clear. so this 
> tool can help operation team to detect hung region server if any.
> *example*
> 1. the tool docs
> ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -help
> Usage: [opts] [regionServerName 1 [regionServrName 2...]]
>  regionServerName - FQDN serverName, can use linux command:hostname -f to 
> check your serverName
>  where [-opts] are:
>    -help Show this help and exit.
>    -e    Use regionServerName as regular expression
>       which means the regionServerName is regular expression pattern
>    -f <B>         stop whole program if first error occurs, default is true
>    -t <N>         timeout for a check, default is 600000 (milisecs)
>    -daemon        Continuous check at defined intervals.
>    -interval <N>  Interval between checks (sec)
> 2. Will send a request to each regionserver in a HBase cluster
> ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary
> 3. Will send a request to a regionserver by given name
> ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary rs1.domainname
> 4. Will send a request to regionserver(s) by given regular-expression
> /opt/trend/circus-opstool/bin/hbase-canary-monitor-each-regionserver.sh -e 
> rs1.domainname.pattern
> // another example
> ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -e 
> tw-poc-tm-puppet-hdn[0-9]\{1,2\}.client.tw.trendnet.org
> 5. Will send a request to a regionserver and also set a timeout limit for 
> this test
> // query regionserver:rs1.domainname with timeout limit 10sec
> // -f false, means that will not exit this program even test failed
> ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -f false -t 10000 
> rs1.domainname
> // echo "1" if timeout
> echo "$?"
> 6. Will run as daemon mode, which means it will send request to each 
> regionserver periodically
> ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -daemon

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to