[jira] [Commented] (HBASE-22976) [HBCK2] Add RecoveredEditsPlayer

Martin Braun (Jira) Thu, 24 Sep 2020 03:43:39 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-22976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201441#comment-17201441
 ]


Martin Braun commented on HBASE-22976:
--------------------------------------

Is there a workaround for reading in WAL files from recovered.edits? How can I 
achieve this?

I have an issue with HBase 2.2.5 (and hadoop-2.8.5): after a full-disk event I 
have 38 inconsistencies. When I run

hbase --internal-classpath hbck

I get a bunch of these errors: 
 
ERROR: Orphan region in HDFS: Unable to load .regioninfo from table 
tt_ix_bizStep_inserting in hdfs dir 
hdfs://localhost:9000/hbase/data/default/tt_ix_bizStep_inserting/8a1acb499bf454b072daeee5960daa73!
 It may be an invalid format or version file. Treating as an orphaned regiondir.
ERROR: Orphan region in HDFS: Unable to load .regioninfo from table 
tt_ix_bizStep_inserting in hdfs dir 
hdfs://localhost:9000/hbase/data/default/tt_ix_bizStep_inserting/8f64025b68958ebddeb812297facdfc6!
 It may be an invalid format or version file. Treating as an orphaned regiondir.


When looking into these directories I see that there is indeed no .regioninfo 
file:

hdfs dfs -ls -R 
hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0

drwxr-xr-x - jenkins supergroup 0 2020-09-21 11:23 
hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits
-rw-r--r-- 3 jenkins supergroup 74133 2020-09-21 11:11 
hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000285
-rw-r--r-- 3 jenkins supergroup 74413 2020-09-16 19:03 
hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000286
-rw-r--r-- 3 jenkins supergroup 74693 2020-09-16 19:05 
hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000287
-rw-r--r-- 3 jenkins supergroup 79427 2020-09-16 18:27 
hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000305

 

So I now have a bunch of recovered.edits WAL files that I would like to 
replay, but how?

The WALPlayer is not able to replay recovered.edits files. The source code at
http://hbase.apache.org/2.2/devapidocs/src-html/org/apache/hadoop/hbase/mapreduce/WALInputFormat.html

seems to expect a timestamp encoded in the file name, which it compares 
against the end time:

    long fileStartTime = Long.parseLong(name.substring(idx+1));
    if (fileStartTime <= endTime) {
      LOG.info("Found: " + file);
      result.add(file);
    }
  } catch (NumberFormatException x) {
    idx = 0;

But the files in recovered.edits are named differently (just numbers like 
00000000000000195).

Would renaming the files help? If so, with which end time?
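
To illustrate why the names do not match, here is a minimal sketch of that 
check in isolation. It assumes that idx is the position of the last '.' in the 
full path, as it appears to be in the linked source. The class name 
WalNameCheck, the method accepted, and the example WAL path are made up for 
illustration and are not part of HBase.

```java
class WalNameCheck {
    /**
     * Mirrors the quoted check: take the text after the last '.', parse it
     * as a long, and accept the file only if that value is <= endTime.
     * Returns false when there is no usable suffix or it is not a number.
     */
    static boolean accepted(String name, long endTime) {
        // Assumption: idx is the index of the last '.' in the path.
        int idx = name.lastIndexOf('.');
        if (idx <= 0) {
            return false;
        }
        try {
            long fileStartTime = Long.parseLong(name.substring(idx + 1));
            return fileStartTime <= endTime;
        } catch (NumberFormatException x) {
            // The suffix is not a timestamp, so the file is skipped.
            return false;
        }
    }

    public static void main(String[] args) {
        long endTime = Long.MAX_VALUE;

        // Hypothetical WAL path whose name ends in ".<timestamp>".
        String wal = "/hbase/WALs/example-host/example-wal.1600900000000";

        // Path taken from the directory listing in this comment.
        String edits = "/hbase/data/default/tt_ix_parent_item/"
            + "ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000285";

        System.out.println(accepted(wal, endTime));   // prints true
        System.out.println(accepted(edits, endTime)); // prints false
    }
}
```

Under this check the recovered.edits path is rejected because the text after 
the last '.' is "edits/0000000000000000285", which does not parse as a number. 
Whether a renamed file would then be replayed correctly by the WALPlayer is a 
separate question that this sketch does not answer.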

 

 

> [HBCK2] Add RecoveredEditsPlayer
> --------------------------------
>
>                 Key: HBASE-22976
>                 URL: https://issues.apache.org/jira/browse/HBASE-22976
>             Project: HBase
>          Issue Type: Sub-task
>          Components: hbck2
>            Reporter: Michael Stack
>            Priority: Major
>
> We need a recovered edits player. Messing w/ the 'adoption service' -- 
> tooling to adopt orphan regions and hfiles -- I've been manufacturing damaged 
> clusters by moving stuff around under the running cluster. No reason to think 
> that an hbase couldn't lose accounting of a whole region if a cataclysm. If 
> so, region will have stuff like the '.regioninfo', dirs per column family w/ 
> store files but it could too have a 'recovered_edits' directory with content 
> in it. We have a WALPlayer for errant WALs. We have the FSHLog tool which can 
> read recovered_edits content for debugging data loss. Missing is a 
> RecoveredEditsPlayer.
> I took a look at extending the WALPlayer since it has a bunch of nice options 
> and it can run at bulk. Ideally, it would just digest recovered edits content 
> if passed an option or recovered edits directories. On first glance, it 
> didn't seem like an easy integration.... Would be worth taking a look again. 
> Would be good if we could avoid making a new, distinct tool, just for 
> Recovered Edits.
> The bulkload tool expects hfiles in column family directories. Recovered 
> edits files are not hfiles and the files are x-columnfamily so this is not 
> the way to go though a bulkload-like tool that moved the recovered edits 
> files under the appropriate region dir and asked the region reopen would be a 
> possibility (Would need the bulk load complete trick of splitting input if 
> the region boundaries in the live cluster do not align w/ those of the errant 
> recovered edits files).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
