Hi St.Ack,
Thanks for the additional info.
A few additional comments inserted below.
Rick
On Jul 23, 2008, at 11:46 AM, stack wrote:
Rick:
Thanks for the great feedback.
Other comments interspersed:
Rick Hangartner wrote:
...
2) In both approaches, when we tried to do the data migration from
hbase-0.1.3 to hbase-0.2.0 we first got migration failures due to
"unrecovered region server logs". Following the 'Redo Logs'
comments in the "http://wiki.apache.org/hadoop/Hbase/HowToMigrate"
doc, starting afresh with a new copy of our virtualized system
each time, we tried these methods of getting rid of those logs and the fatal error:
a) deleting just the log files in the "/hbase" directory
Did this not work? How'd you do it?
We used "bin/hadoop dfs -rm" command with a file spec, but it's
possible we got more than the log files in that delete (I can't
remember the file spec we used, but it was more than just "/hbase").
Unfortunately, this was the one case we didn't get a chance to repeat
to absolutely confirm.
b) deleting the entire contents of the "/hbase" directory (which
means we lost our data, but we are just investigating the upgrade
path, after all)
c) deleting the "/hbase" directory entirely and creating a new "/
hbase" directory.
I should also note that we would need to repeat approach a) to be
100% certain of our results for that case. (We've already repeated
approaches b) and c) and have just run out of time for these
upgrade tests because we need to get to other things.) Rough command
equivalents of the three approaches are sketched below.
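A minimal sketch (the log_* name pattern for the region server log
directories in (a) is a guess on my part, so list /hbase first to see
what is actually there):

# see the actual layout before deleting anything
bin/hadoop dfs -ls /hbase
# (a) remove only the region server log directories (name pattern assumed)
bin/hadoop dfs -rmr '/hbase/log_*'
# (b) remove the contents of /hbase but keep the directory itself
bin/hadoop dfs -rmr '/hbase/*'
# (c) remove /hbase entirely and recreate it empty
bin/hadoop dfs -rmr /hbase
bin/hadoop dfs -mkdir /hbase
# note: (b) and (c) also remove the hbase.version file, which is what
# the FATAL below points at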
In all cases, the migrate then failed as:
[EMAIL PROTECTED]:~/hbase$ bin/hbase migrate upgrade
08/07/22 18:03:16 INFO util.Migrate: Verifying that file system is
available...
08/07/22 18:03:16 INFO util.Migrate: Verifying that HBase is not
running...
08/07/22 18:03:17 INFO ipc.Client: Retrying connect to server:
savory1/10.0.0.45:60000. Already tried 1 time(s).
...
08/07/22 18:03:26 INFO ipc.Client: Retrying connect to server:
savory1/10.0.0.45:60000. Already tried 10 time(s).
08/07/22 18:03:27 INFO util.Migrate: Starting upgrade
08/07/22 18:03:27 FATAL util.Migrate: Upgrade failed
java.io.IOException: Install 0.1.x of hbase and run its migration
first
at org.apache.hadoop.hbase.util.Migrate.run(Migrate.java:181)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.hbase.util.Migrate.main(Migrate.java:446)
This would seem to indicate that the hbase.version file is missing
from under the /hbase directory.
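A quick way to confirm (a sketch, assuming the file sits directly under
/hbase):

# hbase.version should appear in this listing if it survived the delete
bin/hadoop dfs -ls /hbase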
As you can see, we were then in a bit of a Catch-22. Re-installing
HBase-0.1.x would also have required re-installing Hadoop-0.16.4 (we
tried reinstalling HBase-0.1.x without doing that!), so there was no
way to proceed. Attempting to start up HBase-0.2.0 just resulted in an
error message that we needed to do the migrate.
Hmm. I suppose this error message is kinda useless. If you've
already made the commitment to 0.17.x, you can't really go back.
Luo Ning made a patch so you can run 0.1.3 hbase on 0.17.1 hadoop: https://issues.apache.org/jira/browse/HBASE-749.
Maybe this is what we should be recommending folks do in the
migration doc and in the message the migrate tool emits?
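Roughly, that route would look like the following (a sketch only: the
patch file name, the ant target, and the 0.1 migrate invocation are all
assumptions, so take the actual attachment and instructions from the
HBASE-749 issue):

# apply Luo Ning's patch to an hbase-0.1.3 source tree (file name is a guess)
cd hbase-0.1.3
patch -p0 < HBASE-749.patch
# rebuild; target name assumed from the stock ant build
ant jar
# swap in the hadoop-0.17.1 jars as the patch requires (details depend on
# the patch), then run the 0.1.x migration before moving to 0.2.0
# (invocation assumed to mirror the 0.2.0 one shown above)
bin/hbase migrate upgrade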
It certainly seems to me that being able to run 0.1.3 on 0.17.1 hadoop
is a good option to have if one has any migration problems. And maybe
it is just smart for people to run a migrate on 0.1.3 running on 0.17.1
as part of the upgrade procedure, even if there is no apparent reason
to do so, provided it doesn't involve a tremendous amount of extra work
during an upgrade.
Even if it did involve a lot of work, it would only have to be done
once, and I know I'd do it just to be safe. But then I've always been
of the school that it's always best to "measure twice, cut once" when
real data is involved.
3) Since this was just a test, we then blew away the disk used by
Hadoop and re-built the namenode per a standard new Hadoop install.
Hadoop-0.17.1 and HBase-0.2.0 then started up just fine. We only ran a
few tests with the new HBase command-line shell, in some of the ways we
used the old HQL shell for sanity checks, and everything seems
copacetic. (The rebuild and sanity-check steps are sketched below.)
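A sketch (paths assume you are in the respective hadoop and hbase
install directories, 'mytable' is a placeholder, and the shell command
names are from its help, so check there if the RC differs):

# rebuild HDFS from scratch (destroys any remaining data, which was the point here)
bin/hadoop namenode -format
bin/start-dfs.sh
# then, from the hbase-0.2.0 install
bin/start-hbase.sh
bin/hbase shell
list                 # shows whatever tables exist
describe 'mytable'   # check the schema looks right
scan 'mytable'       # eyeball a few rows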
A few other comments:
- The new shell takes a bit of getting used to, but seems quite
functional (we're not the biggest Ruby fans, but hey, someone took
this on and upgraded the shell so we just say: Thanks!)
Smile.
- We really like how timestamps have become first-class objects in
the HBase-0.2.0 API (a rough sketch of what timestamped operations look
like in the new shell follows below). However, we were in the middle of
developing some code under HBase-0.1.3 with workarounds for timestamps
not being first-class objects, and we will have to decide whether we
should back up and re-develop for HBase-0.2.0 (we know we should), or
plunge ahead with what we were doing under HBase-0.1.3 only to discard
it in the near future because of the other advantages of HBase-0.2.0.
Is there anything we should consider in making this decision, perhaps
about the timing of any bug fixes and an official release of
HBase-0.2.0 (HBase-0.2.1?)?
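A sketch (the table, column, and timestamp values are placeholders, and
the exact option syntax is an assumption based on the shell's help):

bin/hbase shell
# write a cell at an explicit timestamp (the trailing argument)
put 'mytable', 'row1', 'family:qualifier', 'somevalue', 1216900000000
# read that cell back at the same timestamp
get 'mytable', 'row1', {COLUMN => 'family:qualifier', TIMESTAMP => 1216900000000}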
We're sick of looking at 0.1.x hbase. Can you factor that into your
decision regarding hbase 0.2 or 0.1?
That's actually pretty much exactly what I was hoping to hear --- that
going forward with Hbase-0.2.0 is where your interests lie ...
Joking aside, a stable offering is our #1 priority ahead of all else,
whether new features, performance, etc. In some ways, I'd guess
0.2.0 will probably be less stable than 0.1.3, being new, but in
others it will be more so (e.g. it has region balancing, so no more
will you start a cluster and see one node carrying 2 regions and
its partner 200). It's hard to say. The best thing to do is what you're
doing: testing the 0.2.0 release candidate. With enough folks
banging on it, it's possible that it will not only have more features
but also be as stable as, if not more so than, our current 0.1.3.
I think working with the 0.2.0 release candidate(s), as you suggest,
best fits our plan right now. And I re-visited our table design today
with the intent of coming up with a re-design that can be implemented
in either 0.1.3 or 0.2.0 so we can go either way --- particularly since
we can run 0.1.3 on hadoop-0.17.1.
Thanks again for providing the community with a really good system.
Rick
Thanks again for the feedback.
St.Ack