[
https://issues.apache.org/jira/browse/HBASE-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754458#comment-13754458
]
takeshi.miao commented on HBASE-7525:
-------------------------------------
Dear [~stack],
Here are the answers to your questions.
{quote}
./hbase-0.95.3-SNAPSHOT/bin/hbase --config /home/stack/conf_hbase
org.apache.hadoop.hbase.tool.Canary
... it goes off and does something; default looks to go and get from all
regions.
{quote}
Yes, its default behavior matches the old one: it monitors all regions.
bq. You add 2013-08-29 09:32:16,463 DEBUG [main] tool.Canary: runCount=2. What
does it mean ?
It is an internal DEBUG message that counts how many loops this monitor
instance has completed. It helps the user check whether the monitor instance is
behaving as expected.
The following are the questions you asked about the _'-regionserver'_ option:
{quote}
{code}
Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts] [table/regionserver
1 [table/regionserver 2...]]
...
{code}
{quote}
{quote}
Would it be clearer if the -regionserver option took arguments as in
-regionserver=rs1,rs2,rs3 etc.?
How to interpret this then:
Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary -regionserver=rs1 table1
Would above only get regions from table1 on rs1? If no regions from table1 then
it would print out there are none?
{quote}
The _'-regionserver'_ option (regionserver mode) is mutually exclusive with the
default mode (region mode), which means the user can choose either the default
mode or regionserver mode, but not both.
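To make the exclusivity concrete, here is a minimal sketch of how the positional arguments could be read under each mode. The class `CanaryArgs`, its `Mode` enum and the `parse` method are hypothetical names used only for illustration. They are not taken from the patch, and options that take a value are not handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; this is not the option parser from the patch. */
public class CanaryArgs {
    enum Mode { REGION, REGIONSERVER }

    final Mode mode;
    // Table names in REGION mode, regionserver names in REGIONSERVER mode.
    final List<String> targets;

    CanaryArgs(Mode mode, List<String> targets) {
        this.mode = mode;
        this.targets = targets;
    }

    static CanaryArgs parse(String[] args) {
        Mode mode = Mode.REGION; // region mode is the default
        List<String> targets = new ArrayList<>();
        for (String arg : args) {
            if (arg.equals("-regionserver")) {
                // The flag switches the meaning of every positional argument.
                mode = Mode.REGIONSERVER;
            } else if (!arg.startsWith("-")) {
                targets.add(arg);
            }
            // Any other option is ignored in this sketch.
        }
        return new CanaryArgs(mode, targets);
    }

    public static void main(String[] args) {
        CanaryArgs parsed = parse(new String[] {"-regionserver", "rs1.domainname"});
        System.out.println(parsed.mode + " " + parsed.targets);
    }
}
```

With this reading, `-regionserver rs1 table1` would treat both `rs1` and `table1` as regionserver names, which is why the usage text needs to be explicit about the mode.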
bq. I do not know how to read 'table/regionserver 1'. What is the '1'?
So it seems the usage output is confusing. I would like to change it to the
following. What do you think?
{code}
Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts] [table|regionserver
[table|regionserver ...]]
...
{code}
{quote}
Or if you pass a table1 when you have a -regionserver option specified, you
could just fail with "Cannot pass a tablename when using the -regionserver
option" – that'd probably be simplest.
{quote}
Yes, this is a good suggestion, but for now I would rather not check whether the
passed arguments are table names in HBase, because I would first need to create
an HBaseAdmin instance to get the table list and then compare it with the
passed arguments.
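For reference, the check itself would be small once the table list is available. The sketch below assumes the table names are already in hand as a plain `Set<String>`. In the real tool that set would have to be fetched through HBaseAdmin, which is the cost described above. `RegionServerArgCheck` and `rejectTableNames` are hypothetical names, not part of the patch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the fail-fast check; not part of the patch. */
public class RegionServerArgCheck {

    /**
     * Throws if any positional argument matches a known table name.
     * knownTables is an assumption: in the real tool it would come from HBaseAdmin.
     */
    static void rejectTableNames(List<String> targets, Set<String> knownTables) {
        for (String target : targets) {
            if (knownTables.contains(target)) {
                throw new IllegalArgumentException(
                    "Cannot pass a tablename when using the -regionserver option: " + target);
            }
        }
    }

    public static void main(String[] args) {
        Set<String> knownTables = new HashSet<>(Arrays.asList("table1", "table2"));
        try {
            rejectTableNames(Arrays.asList("rs1.domainname", "table1"), knownTables);
        } catch (IllegalArgumentException e) {
            System.err.println(e.getMessage());
        }
    }
}
```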
What do you think about making the usage output for the -regionserver option
more precise instead? For example:
{code}
...
-regionserver replace the table argument to regionserver,
which means to enable regionserver mode, instead of region mode (default)
...
{code}
Either way is OK with me.
I will upload the new patches after we confirm which way to go. Thanks for
your questions and suggestions :)
> A canary monitoring program specifically for regionserver
> ---------------------------------------------------------
>
> Key: HBASE-7525
> URL: https://issues.apache.org/jira/browse/HBASE-7525
> Project: HBase
> Issue Type: New Feature
> Components: monitoring
> Affects Versions: 0.94.0
> Reporter: takeshi.miao
> Priority: Critical
> Fix For: 0.98.0
>
> Attachments: HBASE-7525-0.95-v0.patch, HBASE-7525-0.95-v1.patch,
> HBASE-7525-0.95-v3.patch, HBASE-7525-0.95-v4.patch, HBASE-7525-0.95-v6.patch,
> HBASE-7525-trunk-v2.patch, HBASE-7525-v0.patch, RegionServerCanary.java
>
>
> *Motivation*
> This ticket is to provide a canary monitoring tool specifically for
> HRegionserver, details as follows
> 1. This tool is required by operation team due to they thought that the
> canary for each region of a HBase is too many for them, so I implemented this
> coarse-granular one based on the original o.a.h.h.tool.Canary for them
> 2. And this tool is implemented by multi-threading, which means the each Get
> request sent by a thread. the reason I use this way is due to we suffered the
> region server hung issue by now the root cause is still not clear. so this
> tool can help operation team to detect hung region server if any.
> *example*
> 1. the tool docs
> ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -help
> Usage: [opts] [regionServerName 1 [regionServrName 2...]]
> regionServerName - FQDN serverName, can use linux command:hostname -f to
> check your serverName
> where [-opts] are:
> -help Show this help and exit.
> -e Use regionServerName as regular expression
> which means the regionServerName is regular expression pattern
> -f <B> stop whole program if first error occurs, default is true
> -t <N> timeout for a check, default is 600000 (milisecs)
> -daemon Continuous check at defined intervals.
> -interval <N> Interval between checks (sec)
> 2. Will send a request to each regionserver in a HBase cluster
> ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary
> 3. Will send a request to a regionserver by given name
> ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary rs1.domainname
> 4. Will send a request to regionserver(s) by given regular-expression
> /opt/trend/circus-opstool/bin/hbase-canary-monitor-each-regionserver.sh -e
> rs1.domainname.pattern
> // another example
> ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -e
> tw-poc-tm-puppet-hdn[0-9]\{1,2\}.client.tw.trendnet.org
> 5. Will send a request to a regionserver and also set a timeout limit for
> this test
> // query regionserver:rs1.domainname with timeout limit 10sec
> // -f false, means that will not exit this program even test failed
> ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -f false -t 10000
> rs1.domainname
> // echo "1" if timeout
> echo "$?"
> 6. Will run as daemon mode, which means it will send request to each
> regionserver periodically
> ./bin/hbase org.apache.hadoop.hbase.tool.RegionServerCanary -daemon