[
https://issues.apache.org/jira/browse/HBASE-22976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201441#comment-17201441
]
Martin Braun commented on HBASE-22976:
--------------------------------------
Is there a workaround for reading in WAL files from recovered.edits? How can I
achieve this?
I have an issue with HBase 2.2.5 (and hadoop-2.8.5): after a full-disk event I
have 38 inconsistencies. When I run
hbase --internal-classpath hbck
I get a bunch of these errors:
ERROR: Orphan region in HDFS: Unable to load .regioninfo from table
tt_ix_bizStep_inserting in hdfs dir
hdfs://localhost:9000/hbase/data/default/tt_ix_bizStep_inserting/8a1acb499bf454b072daeee5960daa73!
It may be an invalid format or version file. Treating as an orphaned regiondir.
ERROR: Orphan region in HDFS: Unable to load .regioninfo from table
tt_ix_bizStep_inserting in hdfs dir
hdfs://localhost:9000/hbase/data/default/tt_ix_bizStep_inserting/8f64025b68958ebddeb812297facdfc6!
It may be an invalid format or version file. Treating as an orphaned regiondir.
When looking into these directories I see that there is indeed no .regioninfo
file:
hdfs dfs -ls -R
hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0
drwxr-xr-x - jenkins supergroup 0 2020-09-21 11:23
hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits
-rw-r--r-- 3 jenkins supergroup 74133 2020-09-21 11:11
hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000285
-rw-r--r-- 3 jenkins supergroup 74413 2020-09-16 19:03
hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000286
-rw-r--r-- 3 jenkins supergroup 74693 2020-09-16 19:05
hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000287
-rw-r--r-- 3 jenkins supergroup 79427 2020-09-16 18:27
hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000305
So I now have a bunch of recovered.edits WAL files that I would like to replay,
but how?
The WALPlayer is not able to replay recovered.edits files. The source code at
http://hbase.apache.org/2.2/devapidocs/src-html/org/apache/hadoop/hbase/mapreduce/WALInputFormat.html
seems to expect a timestamp coded into the filename, which it compares against
the end time:
      long fileStartTime = Long.parseLong(name.substring(idx+1));
      if (fileStartTime <= endTime) {
        LOG.info("Found: " + file);
        result.add(file);
      }
    } catch (NumberFormatException x) {
      idx = 0;
But the files in recovered.edits are named differently (just a number, like
00000000000000195).
Would renaming the files help? If so, with which end time?
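To make the naming question concrete, here is a minimal, self-contained sketch
of the filename check as I read it from the excerpt above. It is not the HBase
code itself: it assumes that idx is the position of the last '.' in the file
name (the excerpt does not show this), and the class name, the method names and
the sample WAL file name are hypothetical.

```java
import java.util.OptionalLong;

// Hypothetical stand-in for the filename check; NOT the real WALInputFormat.
class WalNameCheck {

    // Returns the numeric suffix after the last '.', if there is one.
    // Assumption: idx in the quoted excerpt is name.lastIndexOf('.').
    static OptionalLong parseStartTime(String name) {
        int idx = name.lastIndexOf('.');
        if (idx <= 0) {
            // No '.' in the name (or only a leading one): nothing to parse.
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(name.substring(idx + 1)));
        } catch (NumberFormatException e) {
            // Suffix is not a number, so the name does not look like a WAL.
            return OptionalLong.empty();
        }
    }

    // Mirrors the "fileStartTime <= endTime" comparison from the excerpt.
    static boolean wouldBeSelected(String name, long endTime) {
        OptionalLong start = parseStartTime(name);
        return start.isPresent() && start.getAsLong() <= endTime;
    }

    public static void main(String[] args) {
        long endTime = Long.MAX_VALUE; // i.e. no upper bound on the time range
        String walLike = "wal.1600000001234";       // hypothetical WAL-style name
        String recovered = "0000000000000000285";   // name from recovered.edits
        System.out.println(walLike + " selected: " + wouldBeSelected(walLike, endTime));
        System.out.println(recovered + " selected: " + wouldBeSelected(recovered, endTime));
    }
}
```

Under these assumptions, a recovered.edits name is skipped only because it has
no '.<number>' suffix. Whether a renamed file would then be replayed correctly
by the rest of WALPlayer is a separate question that this sketch does not
answer.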
> [HBCK2] Add RecoveredEditsPlayer
> --------------------------------
>
> Key: HBASE-22976
> URL: https://issues.apache.org/jira/browse/HBASE-22976
> Project: HBase
> Issue Type: Sub-task
> Components: hbck2
> Reporter: Michael Stack
> Priority: Major
>
> We need a recovered edits player. Messing w/ the 'adoption service' --
> tooling to adopt orphan regions and hfiles -- I've been manufacturing damaged
> clusters by moving stuff around under the running cluster. No reason to think
> that an hbase couldn't lose accounting of a whole region if a cataclysm. If
> so, region will have stuff like the '.regioninfo', dirs per column family w/
> store files but it could too have a 'recovered_edits' directory with content
> in it. We have a WALPlayer for errant WALs. We have the FSHLog tool which can
> read recovered_edits content for debugging data loss. Missing is a
> RecoveredEditsPlayer.
> I took a look at extending the WALPlayer since it has a bunch of nice options
> and it can run at bulk. Ideally, it would just digest recovered edits content
> if passed an option or recovered edits directories. On first glance, it
> didn't seem like an easy integration.... Would be worth taking a look again.
> Would be good if we could avoid making a new, distinct tool, just for
> Recovered Edits.
> The bulkload tool expects hfiles in column family directories. Recovered
> edits files are not hfiles and the files are x-columnfamily so this is not
> the way to go though a bulkload-like tool that moved the recovered edits
> files under the appropriate region dir and asked the region reopen would be a
> possibility (Would need the bulk load complete trick of splitting input if
> the region boundaries in the live cluster do not align w/ those of the errant
> recovered edits files).