That email was just informational. Below are the details on my cluster - let me 
know if more is needed. 

I have 2 HBase clusters set up:
-       production: 6-node cluster, 32G RAM, 8 processors
-       dev: 3-node cluster, 16G RAM, 4 processors

1. I installed Hadoop 0.20.2 and HBase 0.20.3 on both of these clusters
successfully.
2. After that I loaded 2G+ files into HDFS and an HBase table.
        An example HBase table looks like this:
                {NAME => 'TABLE', FAMILIES => [{NAME => 'data', VERSIONS => '100',
                 COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536',
                 IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
3. I started Stargate on one server and successfully accessed HBase for reading
from a 3rd-party application.
        It took 600 seconds on the dev cluster and 250 on production to read
0.5M records from HBase via Stargate.
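        For reference, those timings work out to the following read throughput
(a quick shell calculation on the numbers quoted above):

```shell
#!/bin/sh
# Throughput implied by the timings above: 0.5M records read via Stargate.
rows=500000
prod_secs=250   # production cluster
dev_secs=600    # dev cluster
echo "production: $((rows / prod_secs)) rows/s"
echo "dev: $((rows / dev_secs)) rows/s"
```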
4. Later, to boost read performance, it was suggested that upgrading to
HBase 0.20.6 would help. I did that on production (without running the migrate
script) and restarted Stargate; everything ran fine, though I did not see a
bump in performance.

5. Eventually, I had to move from production to the dev cluster because of some
resource issues at our end. The dev cluster was on 0.20.3 at this time. As I
loaded more files into HBase (<10 versions of <1G files) and converted my app
to use HBase more heavily (via more Stargate clients), performance started
degrading. I decided it was time to upgrade the dev cluster to 0.20.6 as well.
(I did not run the migrate script here either; I missed this step in the doc.)

6. When HBase 0.20.6 came back up on the dev cluster (with an increased block
cache (0.6) and region server handler count (75)), pointing to the same rootdir,
I noticed that some tables were missing. I could see mentions of them in the
logs, but not when I ran 'list' in the shell. I recovered those tables using
the add_table.rb script.
        a. Is there a way to check the health of all HBase tables in the
cluster after an upgrade, or even periodically, to make sure that everything is
healthy?
        b. I would like to be able to force this error again and have a health
check report to me that some tables were lost. Currently, I only found out
because I had so little data that it was easy to tell.
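        In the absence of a built-in check, one rough approach I am considering
is to diff the tables the shell currently reports against a known-good list
captured right after a successful load. The file paths and the exact `hbase
shell` invocation below are assumptions and may differ per version and site:

```shell
#!/bin/sh
# Sketch of a periodic table-health check. The known-good list is a
# sorted file of table names saved when the cluster was last healthy.

report_missing_tables() {
    # args: expected_file actual_file; prints names present only in expected
    comm -23 "$1" "$2"
}

# Capture the live table list (run on the master; requires a running cluster):
#   echo list | "$HBASE_HOME/bin/hbase" shell 2>/dev/null | sort > /tmp/actual_tables.txt

if [ -f /tmp/expected_tables.txt ] && [ -f /tmp/actual_tables.txt ]; then
    MISSING=$(report_missing_tables /tmp/expected_tables.txt /tmp/actual_tables.txt)
    if [ -n "$MISSING" ]; then
        echo "ALERT: tables missing since last check:"
        echo "$MISSING"
    fi
fi
```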

7. Here are the issues I face after this upgrade:
        a. When I run stop-hbase.sh, it does not stop my regionservers on the
other boxes.
        b. start-hbase.sh does start them fine.
        c. Or is it that stopping the regionservers simply isn't reported, but
they do get stopped (I see that happening on the production cluster)?
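        As a workaround while stop-hbase.sh misbehaves, I could stop the
regionservers explicitly with the per-daemon scripts. The $HBASE_HOME path and
script names below are assumptions based on the usual tarball layout, and the
snippet is guarded so it is a no-op where HBase is not installed:

```shell
#!/bin/sh
# Stop regionservers explicitly when stop-hbase.sh does not reach them.
if [ -n "$HBASE_HOME" ] && [ -x "$HBASE_HOME/bin/hbase-daemons.sh" ]; then
    # runs "hbase-daemon.sh stop regionserver" on every host in conf/regionservers
    "$HBASE_HOME/bin/hbase-daemons.sh" stop regionserver
else
    echo "HBASE_HOME not set or daemons script not found; nothing to stop"
fi
```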
        
8. I started Stargate on the upgraded 0.20.6 dev cluster.
        a. Earlier, when I sent a URL looking for a data row that did not
exist, the return value was NULL; now I get an XML/HTML body with HTTP error
404/405. Everything works as expected for an existing data row.
        b. This works okay on the production cluster after its upgrade; it's
the dev cluster that gives this error.
        c. Examples:
        On production cluster:
                        :~ hadoop$curl http://localhost:8080/version
                                Stargate 0.0.1 [JVM: Sun Microsystems Inc. 
1.6.0_20-16.3-b01] [OS: SunOS 5.10 x86] [Server: jetty/6.1.14] [Jersey: 
1.1.0-ea]
                        :~ hadoop$curl http://localhost:8080/verison
                        :~ hadoop$curl http://localhost:8080/version/cluster
                                0.20.6

        On dev cluster:
                        :~ hadoop$curl http://localhost:8080/version
                        Stargate 1.0 [JVM: Sun Microsystems Inc. 
1.6.0_20-16.3-b01] [OS: SunOS 5.10 x86] [Server: jetty/6.1.14] [Jersey: 1.1.5.1]
:~ hadoop$curl http://localhost:8080/verison
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
<title>Error 405 METHOD_NOT_ALLOWED</title>
</head>
<body><h2>HTTP ERROR: 405</h2><pre>METHOD_NOT_ALLOWED</pre>
<p>RequestURI=/verson</p><p><i><small><a 
href="http://jetty.mortbay.org/";>Powered by Jetty://</a></small></i></p><br/>

</body>
</html>

9. Therefore, I thought I should try downgrading to 0.20.3, basically starting
HBase from the old dir I still have on the dev cluster, since Stargate was
working as desired before the upgrade. I changed all my classpaths to point to
the old dir and restarted HBase and Stargate from the hbase-0.20.3 dir.
        a. But I don't think that really works. It recognizes 0.20.6 somehow:
my hbase shell kept reporting 0.20.6, and
                the Stargate URL "curl http://localhost:8080/version/cluster"
reports 0.20.6 as well.
        b. I am not sure there is any such thing as downgrading HBase.

10. Now I am pointing back to 0.20.6 (running everything out of there). I
still get the same HTTP error as above.
        Below is another error, HTTP 404 this time, with 0.20.6:
hadoop$curl http://localhost:8080/<table_name>/75

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
<title>Error 404 NOT_FOUND</title>
</head>
<body><h2>HTTP ERROR: 404</h2><pre>NOT_FOUND</pre>
<p>RequestURI=/VRS/75</p><p><i><small><a 
href="http://jetty.mortbay.org/";>Powered by Jetty://</a></small></i></p><br/>

</body>
</html>
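        Regardless of which behaviour is correct, I suppose a client can be
made robust to both the old NULL body and the new 404/405 responses by checking
the HTTP status code rather than parsing the body. A sketch, assuming a
Stargate endpoint on localhost:8080 (the table and row key are placeholders,
not my real names):

```shell
#!/bin/sh
# Map Stargate HTTP status codes to outcomes instead of parsing bodies.
classify_status() {
    case "$1" in
        200) echo "row found" ;;
        404) echo "row absent" ;;
        405) echo "method not allowed - check the request path/verb" ;;
        *)   echo "unexpected status: $1" ;;
    esac
}

# Example use against a live cluster (placeholders for table/row):
#   STATUS=$(curl -s -o /dev/null -w '%{http_code}' \
#       "http://localhost:8080/mytable/some-row-key")
#   classify_status "$STATUS"
```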


That was a long email. Please let me know if further clarifications are needed.

Thank you, 
-Avani 

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Stack
Sent: Tuesday, August 31, 2010 12:24 PM
To: [email protected]
Subject: Re: HBase table lost on upgrade

On Tue, Aug 31, 2010 at 12:14 PM, Sharma, Avani <[email protected]> wrote:
> Thanks, Stack. Well, I was able to get the basic hbase cluster to run, but 
> now that I am trying to boost read performance, I am running into stuff that 
> is either not working or I cannot easily find solutions to on the net.
>

This mail that you've just written above gives us nothing to go on.
You want to boost read performance saying nothing about what current
performance, datasize, hardware, nor schema looks like.

St.Ack
