Hi St.Ack,

Thanks for the additional info.

A few additional comments inserted below.

Rick

On Jul 23, 2008, at 11:46 AM, stack wrote:

Rick:

Thanks for the great feedback.

Other comments interspersed:

Rick Hangartner wrote:
...
2) In both approaches, when we tried to do the data migration from hbase-0.1.3 to hbase-0.2.0, we first got migration failures due to "unrecovered region server logs". Following the 'Redo Logs' comments in the "http://wiki.apache.org/hadoop/Hbase/HowToMigrate" doc, and starting afresh with a new copy of our virtualized system each time, we tried these methods of getting rid of those logs and the fatal error:

  a) deleting just the log files in the "/hbase" directory

Did this not work?  How'd you do it?

We used "bin/hadoop dfs -rm" command with a file spec, but it's possible we got more than the log files in that delete (I can't remember the file spec we used, but it was more than just "/hbase"). Unfortunately, this was the one case we didn't get a chance to repeat to absolutely confirm.




  b) deleting the entire contents of the "/hbase" directory (which means we lost our data, but we are just investigating the upgrade path, after all)

  c) deleting the "/hbase" directory entirely and creating a new "/hbase" directory.

I should also note that we would need to repeat approach a) to be 100% certain of our results for that case. (We've already repeated approaches b) and c), and have just run out of time for these upgrade tests because we need to get to other things.)
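For the record, what we ran for b) and c) was essentially the following (the exact invocations may be slightly off, but these are the standard FsShell commands; note that both variants also delete the hbase.version file at the top of the hbase root, which turns out to matter below):

  bin/hadoop dfs -rmr '/hbase/*'   # b) empty out the hbase root but keep the directory itself
  bin/hadoop dfs -rmr /hbase       # c) remove the directory entirely...
  bin/hadoop dfs -mkdir /hbase     #    ...and recreate it empty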

In all cases, the migrate then failed as follows:

[EMAIL PROTECTED]:~/hbase$ bin/hbase migrate upgrade
08/07/22 18:03:16 INFO util.Migrate: Verifying that file system is available...
08/07/22 18:03:16 INFO util.Migrate: Verifying that HBase is not running...
08/07/22 18:03:17 INFO ipc.Client: Retrying connect to server: savory1/10.0.0.45:60000. Already tried 1 time(s).
...
08/07/22 18:03:26 INFO ipc.Client: Retrying connect to server: savory1/10.0.0.45:60000. Already tried 10 time(s).
08/07/22 18:03:27 INFO util.Migrate: Starting upgrade
08/07/22 18:03:27 FATAL util.Migrate: Upgrade failed
java.io.IOException: Install 0.1.x of hbase and run its migration first
      at org.apache.hadoop.hbase.util.Migrate.run(Migrate.java:181)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
      at org.apache.hadoop.hbase.util.Migrate.main(Migrate.java:446)

This would seem to indicate that the hbase.version file is missing from under the /hbase directory.
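You can confirm with something like the following (hbase.version is a small file the migrate tool reads at the top of the hbase root; the -cat output has a couple of non-printing bytes in front, but the version string should be visible):

  bin/hadoop dfs -ls /hbase/hbase.version    # is the version file there at all?
  bin/hadoop dfs -cat /hbase/hbase.version   # if it is, the 0.1.x version string should show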


As you can see, we were then in a bit of a Catch-22. Re-installing HBase-0.1.x would also have required re-installing Hadoop-0.16.4 (we tried reinstalling HBase-0.1.x without doing that!), so there was no way to proceed. Attempting to start up HBase-0.2.0 just produced an error message saying we needed to do the migrate.

Hmm. I suppose this error message is kinda useless. If you've already made the commitment to 0.17.x, you can't really go back.

Luo Ning made a patch so you can run 0.1.3 hbase on 0.17.1 hadoop: https://issues.apache.org/jira/browse/HBASE-749 . Maybe this is what we should be recommending folks do in the migration doc and in the message the migrate tool emits?

It certainly seems to me this is a good option to have if one hits any migration problems. It may even be smart to run a migrate on 0.1.3-on-0.17.1 as part of the upgrade procedure, even when there is no apparent reason to, so long as it doesn't add a tremendous amount of extra work to an upgrade (a sketch of the sequence is below).

Even if it did involve a lot of work, it would only be once, and I know I'd do it just to be safe. But then I've always been of the "measure twice, cut once" school when real data is involved.
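To make that concrete, the belt-and-suspenders sequence would look something like the following (the migrate invocations are my recollection of the HowToMigrate doc, so double-check them against your install):

  # on hbase-0.1.3 patched per HBASE-749, running against hadoop-0.17.1:
  bin/hbase migrate check      # report anything the migration would trip over
  bin/hbase migrate upgrade    # complete any outstanding 0.1.x migration work
  # then install hbase-0.2.0 and run its own migration:
  bin/hbase migrate upgrade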




3) Since this was just a test, we then blew away the disk used by Hadoop and re-built the namenode per a standard Hadoop new install. Hadoop-0.17.1 and HBase-0.2.0 then started up just fine. We only ran a few tests with the new HBase command-line shell, in some of the ways we used the old HQL shell for sanity checks, and everything seems copacetic.

A few other comments:

- The new shell takes a bit of getting used to, but seems quite functional (we're not the biggest Ruby fans, but hey, someone took this on and upgraded the shell so we just say: Thanks!)
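For anyone else making the jump from HQL, the flavor of the new shell is roughly the following (going from the shell's own help output as I remember it, so run 'help' at the prompt to confirm the syntax; 'webdata' and 'contents' are made-up names):

  create 'webdata', {NAME => 'contents'}            # roughly HQL's CREATE TABLE
  put 'webdata', 'row1', 'contents:', 'some value'  # columns are now 'family:qualifier'
  get 'webdata', 'row1'
  scan 'webdata'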

Smile.

- We really like how timestamps have become first-class objects in the HBase-0.2.0 API. However, we were in the middle of developing some code under HBase-0.1.3 with workarounds for timestamps not being first-class objects, and we will have to decide whether we should back up and re-develop for HBase-0.2.0 (we know we should), or plunge ahead with what we were doing under HBase-0.1.3 only to discard it in the near future because of the other advantages of HBase-0.2.0. Is there anything we should consider in making this decision, perhaps about the timing of any bug fixes and an official release of HBase-0.2.0 (HBase-0.2.1?)?
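For instance, in the new shell the timestamp rides along explicitly, which is the sort of thing we had to fake under 0.1.3 (again going from the shell help as I remember it, and the values are made up):

  put 'webdata', 'row1', 'contents:', 'older value', 1216857600000            # write at an explicit timestamp
  get 'webdata', 'row1', {COLUMN => 'contents:', TIMESTAMP => 1216857600000}  # read back that exact version
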
We're sick of looking at 0.1.x hbase. Can you factor that into your decision regarding hbase 0.2 or 0.1?

That's actually pretty much exactly what I was hoping to hear --- that going forward with Hbase-0.2.0 is where your interests lie ...



Joking aside, a stable offering is our #1 priority, ahead of all else, whether new features, performance, etc. In some ways I'd guess 0.2.0 will probably be less stable than 0.1.3, being new, but in others it will be more so (e.g. it has region balancing, so no more will you start a cluster and see one node carrying 2 regions and its partner 200). It's hard to say. Best thing to do is what you're doing: testing the 0.2.0 release candidate. With enough folks banging on it, it's possible that it will not only have more features but also be as stable, if not more so, than our current 0.1.3.

I think working with 0.2.0 release candidate(s) as you suggest best fits our plan right now. And I re-visited our table design today with the intent of coming up with a re-design that can be implemented in 0.1.3 or 0.2.0 so we can go either way --- particularly since we can run 0.1.3 on hadoop-0.17.1.

Thanks again for providing the community with a really good system.

Rick



Thanks again for the feedback.

St.Ack
