[ https://issues.apache.org/jira/browse/HBASE-4058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062215#comment-13062215 ]
stack commented on HBASE-4058:
------------------------------
Dan Harvey, who is still on 0.20.x, had a similar issue this month. He added
four new servers to his cluster. These new servers were not resolving
properly. What we were seeing, I believe, is that on startup these new servers
would be assigned their portion of the regions on checkin. Then the
basescanner would run (it's 0.20.x HBase) and it would not recognize the
address the new servers were writing to .META., so it would think the regions
unassigned and assign them elsewhere. So we had double-assignment, and
at the same time there were splits and compactions running. His .META. had holes
and overlaps.
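To make "holes and overlaps" concrete: within a table, regions sorted by start key should form an unbroken chain in which each region's end key equals the next region's start key, with an empty key marking the start of the first region and the end of the last. The following is only a minimal sketch of that check in Python, not what hbck or add_table.rb actually does; the function name and the (start_key, end_key) tuple layout are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (start_key, end_key). An empty key (b"") means
# "start of table" as a start key and "end of table" as an end key.
Region = Tuple[bytes, bytes]


def find_holes_and_overlaps(regions: List[Region]):
    """Return (holes, overlaps) as lists of (from_key, to_key) ranges."""
    holes, overlaps = [], []
    ordered = sorted(regions, key=lambda r: r[0])
    if not ordered:
        return holes, overlaps

    # The first region should start at the beginning of the table.
    if ordered[0][0] != b"":
        holes.append((b"", ordered[0][0]))

    covered_to = ordered[0][1]  # furthest end key covered so far
    for start, end in ordered[1:]:
        if covered_to == b"":
            # An earlier region already runs to the end of the table,
            # so this whole region overlaps it.
            overlaps.append((start, end))
            continue
        if start > covered_to:
            holes.append((covered_to, start))
        elif start < covered_to:
            overlap_end = covered_to if end == b"" else min(end, covered_to)
            overlaps.append((start, overlap_end))
        if end == b"" or end > covered_to:
            covered_to = end

    # The last region should run to the end of the table.
    if covered_to != b"":
        holes.append((covered_to, b""))
    return holes, overlaps
```

For example, given regions ["", "b"), ["b", "d"), ["c", "f"), ["g", ""), the sketch reports one overlap between "c" and "d" and one hole between "f" and "g".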
In his case, not all tables were honked, just the big ones. I wonder if an
improved add_table.rb would work in this case; i.e. do the same rewrite of the
.META. content for a single table based off the content in the filesystem
rather than trying to fix up the .META. table.
Let me try adding add_table.rb to hbck. Let me add the option of running per table
and then a global option that restores all tables.
Dan sent me the .META. dir content. It looks like this:
{code}
-rw-r--r--@ 1 Stack staff 0 Jul 7 08:26 281906331022358506
-rw-r--r--@ 1 Stack staff 94283152 Jul 7 08:26 5233066973300534672
-rw-r--r--@ 1 Stack staff 0 Jul 7 08:26 6803125877105432645
-rw-r--r--@ 1 Stack staff 0 Jul 7 08:26 8650632001596730954
{code}
i.e. three zero-length files. I wonder how these were written (I asked him for
a dir listing from the actual cluster).
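Spotting the empty files in a local copy of such a directory is easy to script. A minimal sketch, assuming the directory has been copied to the local filesystem (the function name is hypothetical and this does not talk to HDFS):

```python
import os


def zero_length_files(directory):
    """Return the sorted names of regular files in `directory` whose size is 0 bytes."""
    return sorted(
        entry.name
        for entry in os.scandir(directory)
        if entry.is_file() and entry.stat().st_size == 0
    )
```

Run against a copy of the listing above, this would return the three file names whose size column is 0.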
> Extend TestHBaseFsck with a complete .META. recovery scenario
> -------------------------------------------------------------
>
> Key: HBASE-4058
> URL: https://issues.apache.org/jira/browse/HBASE-4058
> Project: HBase
> Issue Type: Improvement
> Reporter: Andrew Purtell
> Fix For: 0.92.0
>
>
> We should have a unit test that launches a minicluster and constructs a few
> tables, then deletes META files on disk, then bounces the master, then
> recovers the result with HBCK. Perhaps it is possible to extend TestHBaseFsck
> to do this.