You can copy HBaseFsck.java from trunk and compile it against 0.20.6.

On Wed, Sep 8, 2010 at 3:43 PM, Sharma, Avani <[email protected]> wrote:
> Right.
>
> Anyway, where can I get this file from? Any pointers?
> I can't find it at
> src/main/java/org/apache/hadoop/hbase/client/HBaseFsck.java in 0.20.6.
>
> -----Original Message-----
> From: Ted Yu [mailto:[email protected]]
> Sent: Wednesday, September 08, 2010 3:09 PM
> To: [email protected]
> Subject: Re: HBase table lost on upgrade
>
> master.jsp shows tables, not regions.
> I personally haven't encountered the problem you're facing.
>
> On Wed, Sep 8, 2010 at 2:36 PM, Sharma, Avani <[email protected]> wrote:
>
> > Ted,
> > I did look at that thread. It seems I need to modify the code in that
> > file? Could you point me to the exact steps to get it and compile it?
> >
> > Did you get past the issue of regions being added to the catalog but
> > not showing up in master.jsp?
> >
> > On Sep 4, 2010, at 9:24 PM, Ted Yu <[email protected]> wrote:
> >
> > > The tool Stack mentioned is hbck. If you want to port it to 0.20, see
> > > the email thread entitled:
> > > "compiling HBaseFsck.java for 0.20.5"
> > > You should try reducing the number of tables in your system, possibly
> > > through HBASE-2473.
> > >
> > > Cheers
> > >
> > > On Thu, Sep 2, 2010 at 11:41 AM, Sharma, Avani <[email protected]>
> > > wrote:
> > >
> > >> -----Original Message-----
> > >> From: [email protected] [mailto:[email protected]] On Behalf Of
> > >> Stack
> > >> Sent: Wednesday, September 01, 2010 10:45 PM
> > >> To: [email protected]
> > >> Subject: Re: HBase table lost on upgrade
> > >>
> > >> On Wed, Sep 1, 2010 at 5:49 PM, Sharma, Avani <[email protected]>
> > >> wrote:
> > >>> That email was just informational. Below are the details on my
> > >>> cluster - let me know if more is needed.
> > >>>
> > >>> I have 2 HBase clusters set up:
> > >>> - for production, a 6-node cluster, 32G RAM, 8 processors
> > >>> - for dev, a 3-node cluster, 16G RAM, 4 processors
> > >>>
> > >>> 1. I installed Hadoop 0.20.2 and HBase 0.20.3 on both these
> > >>> clusters successfully.
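[Editor's note: the porting steps discussed above, sketched as pseudo-shell. The trunk URL, jar names, and the exact amount of back-porting needed are assumptions (that is what the "compiling HBaseFsck.java for 0.20.5" thread covers); note the file lives under util/, not client/:

```
# Pseudo-shell sketch - paths and jar names are illustrative, not verified
# 1. Fetch HBaseFsck.java from trunk (package org.apache.hadoop.hbase.util):
svn cat http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java > HBaseFsck.java
# 2. Back-port any trunk-only API calls, then compile against the 0.20.6 jars:
javac -cp hbase-0.20.6.jar:hadoop-0.20.2-core.jar HBaseFsck.java
# 3. Run it with the cluster conf directory on the classpath:
java -cp .:hbase-0.20.6.jar:hadoop-0.20.2-core.jar:conf org.apache.hadoop.hbase.util.HBaseFsck
```
]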
> > >> Why not the latest stable version, 0.20.6?
> > >>
> > >> This was a couple of months ago.
> > >>
> > >>> 2. After that I loaded 2G+ of files into HDFS and an HBase table.
> > >>
> > >> What's this mean? Each of the .5M cells was 2G in size, or the total
> > >> size was 2G?
> > >>
> > >> The total file size is 2G. Cells are on the order of hundreds of
> > >> bytes.
> > >>
> > >>> An example HBase table looks like this:
> > >>> {NAME => 'TABLE', FAMILIES => [{NAME => 'data', VERSIONS => '100',
> > >>> COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536',
> > >>> IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
> > >>
> > >> That looks fine.
> > >>
> > >>> 3. I started Stargate on one server and accessed HBase for reading
> > >>> from another 3rd-party application successfully.
> > >>> It took 600 seconds on the dev cluster and 250 on production to
> > >>> read .5M records from HBase via Stargate.
> > >>
> > >> That doesn't sound so good.
> > >>
> > >>> 4. Later, to boost read performance, it was suggested that
> > >>> upgrading to HBase 0.20.6 would be helpful. I did that on production
> > >>> (w/o running the migrate script) and re-started Stargate and
> > >>> everything was running fine, though I did not see a bump in
> > >>> performance.
> > >>>
> > >>> 5. Eventually, I had to move to the dev cluster from production
> > >>> because of some resource issues at our end. The dev cluster had
> > >>> 0.20.3 at this time. As I started loading more files into HBase
> > >>> (<10 versions of <1G files) and converting my app to use HBase more
> > >>> heavily (via more Stargate clients), the performance started
> > >>> degrading. I decided it was time to upgrade the dev cluster as well
> > >>> to 0.20.6. (I did not run the migrate script here either; I missed
> > >>> this step in the doc.)
> > >>
> > >> What kind of perf are you looking for from REST?
> > >>
> > >> Do you have to use REST?
> > >> All is base64'd, so it's safe to transport.
> > >>
> > >> I also have the Java API code (for testing purposes) and that gave
> > >> similar performance results (520 seconds on dev and 250 on the
> > >> production cluster). Is there a way to flush the cache before we run
> > >> the next experiment? I suspect that the first lookup always takes
> > >> longer and then the later ones perform better.
> > >>
> > >> I need something that can integrate with C++ - libcurl and Stargate
> > >> were the easiest to start with. I could look at Thrift or anything
> > >> else the HBase gurus think might be a better fit performance-wise.
> > >>
> > >>> 6. When HBase 0.20.6 came back up on the dev cluster (with
> > >>> increased block cache (.6) and region server handler count (75)),
> > >>> pointing to the same rootdir, I noticed that some tables were
> > >>> missing. I could see a mention of them in the logs, but not when I
> > >>> did 'list' in the shell. I recovered those tables using the
> > >>> add_table.rb script.
> > >>
> > >> How did you shut down this cluster? Did you reboot machines? Was
> > >> your HDFS homed on /tmp? What is going on on your systems? Are they
> > >> swapping? Did you give HBase more than its default memory? You read
> > >> the requirements and made sure ulimit and xceivers had been upped on
> > >> these machines?
> > >>
> > >> Did not reboot machines. HDFS and HBase do not store data/logs in
> > >> /tmp. They are not swapping.
> > >> HBase heap size is 2G. I have upped the xceivers now on your
> > >> recommendation. Do I need to restart HDFS after making this change
> > >> in hdfs-site.xml?
> > >> ulimit -n
> > >> 2048
> > >>
> > >>> a. Is there a way to check the health of all HBase tables in the
> > >>> cluster after an upgrade, or even periodically, to make sure that
> > >>> everything is healthy?
> > >>> b.
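[Editor's note: the xceivers bump goes in hdfs-site.xml on every datanode, and yes, the datanodes need a restart to pick it up. The historical property name really is misspelled "xcievers"; the value 4096 below is a commonly suggested number, not one from this thread:

```xml
<!-- hdfs-site.xml on each datanode; requires a datanode restart -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>
```
]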
> > >>> I would like to be able to force this error again and check the
> > >>> health of HBase, and I want it to report to me that some tables
> > >>> were lost. Currently, I only found out because I had very little
> > >>> data and it was easy to tell.
> > >>
> > >> In trunk there is such a tool. In 0.20.x, run a count against your
> > >> table. See the hbase shell. Type help to see how.
> > >>
> > >> What tool are you talking about here - it wasn't clear? Count
> > >> against which table? I want HBase to check all tables, and I don't
> > >> know how many tables I have since there are too many - is that
> > >> possible?
> > >>
> > >>> 7. Here are the issues I face after this upgrade:
> > >>> a. When I run stop-hbase.sh, it does not stop my regionservers
> > >>> on other boxes.
> > >>
> > >> Why not? What's going on on those machines? If you tail the logs on
> > >> the hosts that won't go down and/or on the master, what do they say?
> > >> Tail the logs. Should give you (us) a clue.
> > >>
> > >> They do go down with some errors in the log, but don't report it on
> > >> the terminal.
> > >> http://pastebin.com/0hYwaffL regionserver log
> > >>
> > >>> b. It does start them using start-hbase.sh.
> > >>> c. Is it that stopping regionservers is not reported, but it
> > >>> does stop them (I see that happening on the production cluster)?
> > >>
> > >>> 8. I started Stargate in the upgraded 0.20.6 on the dev cluster.
> > >>> a. Earlier, when I sent a URL to look for a data row that did
> > >>> not exist, the return value was NULL; now I get an XML body stating
> > >>> HTTP error 404/405. Everything works as expected for an existing
> > >>> data row.
> > >>
> > >> The latter sounds RESTy. What would you expect of it? The null?
> > >>
> > >> Yes, it should send NULL like it does on the production server. Is
> > >> there anyone else you can point to who would have used REST? This is
> > >> the main showstopper for me currently.
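[Editor's note: for the periodic "did I lose a table?" question, one low-tech option is to diff Stargate's table listing against a list of tables you expect to exist. A minimal sketch, assuming Stargate's root resource returns its usual `<TableList>` XML; the HTTP fetch itself is left out, only the parse/diff is shown:

```python
import xml.etree.ElementTree as ET

def table_names(tablelist_xml):
    """Parse the XML returned by GET / on Stargate into a list of table names."""
    root = ET.fromstring(tablelist_xml)
    return [t.get("name") for t in root.findall("table")]

def missing_tables(expected, tablelist_xml):
    """Return the expected tables that Stargate no longer reports."""
    return sorted(set(expected) - set(table_names(tablelist_xml)))

# Example with a canned response (real code would GET the Stargate root URL):
sample = '<TableList><table name="TABLE"/><table name="other"/></TableList>'
print(missing_tables(["TABLE", "other", "lost_table"], sample))  # ['lost_table']
```

Run from cron, a non-empty result would have flagged the lost tables before the shell `list` did.]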
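[Editor's note: on the 404-for-a-missing-row change, a REST client can translate 404 back into "no value" itself rather than depending on the server returning an empty body. A client-side sketch; the helper and its status handling are illustrative, not Stargate API, and the base64 decode reflects the "all is base64'd" transport noted above:

```python
import base64

def cell_value(status, body_b64):
    """Map a Stargate-style response to a value: 404 means 'row not found'."""
    if status == 404:
        return None          # treat "not found" like the old NULL behaviour
    if status != 200:
        raise IOError("unexpected HTTP status: %d" % status)
    return base64.b64decode(body_b64)  # cell bytes arrive base64-encoded

# A found row: the payload arrives base64-encoded...
print(cell_value(200, base64.b64encode(b"hello")))  # b'hello'
# ...and a missing row becomes None instead of an error:
print(cell_value(404, None))  # None
```

The same mapping is easy to express in the C++/libcurl client mentioned above: check the HTTP status before touching the body.]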
