Let me check back in my notes for what step we're missing.

On Thu, Sep 24, 2020 at 5:32 AM Martin Braun <[email protected]> wrote:
>
> Hello all,
>
> I dug a bit deeper into this:
>
> the WALPlayer is not able to replay recovered.edits files; the source code
> http://hbase.apache.org/2.2/devapidocs/src-html/org/apache/hadoop/hbase/mapreduce/WALInputFormat.html
> seems to expect an end time coded into the filename:
>
>   long fileStartTime = Long.parseLong(name.substring(idx+1));
>   if (fileStartTime <= endTime) {
>     LOG.info("Found: " + file);
>     result.add(file);
>   }
> } catch (NumberFormatException x) {
>   idx = 0;
>
> But the files in recovered.edits are named differently (just numbers, like
> 00000000000000195).
>
> I have also found this issue:
>
> https://issues.apache.org/jira/browse/HBASE-22976
> [HBCK2] Add RecoveredEditsPlayer
>
> But what can I do now to fix this and replay the WAL files in the
> recovered edits?
>
> Any ideas?
>
> best,
> Martin
>
>
> On 22. Sep 2020, at 18:38, Sean Busbey <[email protected]> wrote:
> >
> > hurm. following the instructions from the reference guide works for
> > me. Is there a specific reason you're passing the
> > '--internal-classpath' flag? Do other hadoop jobs work?
> >
> > what if you submit it as a proper MR job? unfortunately the ref guide
> > is thin on explaining this atm, but it looks like:
> >
> > HADOOP_CLASSPATH="${HBASE_CONF_DIR}:$("${HBASE_HOME}/bin/hbase" mapredcp)" \
> >   yarn jar "${HBASE_HOME}/lib/shaded-clients/hbase-shaded-mapreduce-2.2.5.jar" \
> >   WALPlayer some/path/to/wals/ 'some:example'
> >
> > On Tue, Sep 22, 2020 at 10:24 AM Martin Braun <[email protected]> wrote:
> >>
> >> Hello Sean,
> >>
> >> thank you for your quick response!
> >>
> >> Replaying the WAL files would be OK; however, I am struggling with the
> >> WALPlayer:
> >>
> >> hbase --internal-classpath org.apache.hadoop.hbase.mapreduce.WALPlayer \
> >>   hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits \
> >>   tt_ix_parent_item
> >>
> >> Exception in thread "main" java.lang.NoClassDefFoundError: org/codehaus/jackson/map/JsonMappingException
> >>     at org.apache.hadoop.mapreduce.Job.getJobSubmitter(Job.java:1325)
> >>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1336)
> >>     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1359)
> >>     at org.apache.hadoop.hbase.mapreduce.WALPlayer.run(WALPlayer.java:428)
> >>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> >>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> >>     at org.apache.hadoop.hbase.mapreduce.WALPlayer.main(WALPlayer.java:417)
> >> Caused by: java.lang.ClassNotFoundException: org.codehaus.jackson.map.JsonMappingException
> >>     at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
> >>     at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
> >>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
> >>     at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
> >>     ... 7 more
> >>
> >> Could you provide some hints how to use the WALPlayer correctly?
> >> [Ed.: see the addendum at the end of this thread for a possible
> >> classpath workaround.]
> >>
> >> best,
> >> Martin
> >>
> >>> On 22. Sep 2020, at 16:52, Sean Busbey <[email protected]> wrote:
> >>>
> >>> bulk loading stuff works with hfiles. recovered.edits files are
> >>> formatted the same as WAL files rather than as HFiles. for wal files
> >>> you can use the wal replayer to ensure those edits are all present in
> >>> the table.
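[Editor's note: a concrete rendering of the sideline-and-replay approach
Sean describes in the next paragraph, combined with Martin's finding at
the top of the thread that WALInputFormat only accepts files whose name
carries a numeric timestamp after the last dot. This is a hypothetical,
untested sketch: the sideline directory and the fake-timestamp rename are
editorial inventions, the source path and table name come from Martin's
messages, and whether WALPlayer can then actually read recovered.edits
files is exactly the open question of this thread.

  # Recovered.edits dir from Martin's listing, plus an assumed scratch dir.
  SRC=hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits
  DST=hdfs://localhost:9000/user/jenkins/sidelined-wals
  TS=$(date +%s000)   # fake start time: epoch seconds padded to look like millis

  hdfs dfs -mkdir -p "$DST"
  # Copy each edits file aside under a WAL-style name with a ".<timestamp>"
  # suffix, so that Long.parseLong(name.substring(idx+1)) in WALInputFormat
  # parses cleanly and the file passes the (default) time-range filter.
  for f in $(hdfs dfs -ls "$SRC" | grep -o 'hdfs://[^ ]*'); do
    hdfs dfs -cp "$f" "$DST/$(basename "$f").$TS"
  done

  # Replay the sidelined files into the table.
  hbase org.apache.hadoop.hbase.mapreduce.WALPlayer "$DST" tt_ix_parent_item

The caveat from Sean's paragraph below still applies: this plays edits
that may already be in the table, so it is only safe if replaying an edit
twice is harmless for the use case.]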
> >>>
> >>> IIRC there is an unknown sequence of events that can result in the
> >>> recovered edits sticking around for a region after they've already
> >>> been recovered. Presuming your use case can tolerate the same edit
> >>> being played multiple times (basically, if you do not mess about with
> >>> cell-level timestamps or keep multiple versions around), then it
> >>> should be fine to sideline those edits and then replay them using the
> >>> wal player.
> >>>
> >>> If your use case isn't fine with that, then you can use the wal pretty
> >>> printer to examine the edits that are there and check that the
> >>> cells are already in the table in a current region. [Ed.: a sketch of
> >>> such an invocation appears after the quoted message below.]
> >>>
> >>> sounds like we should update the troubleshooting tips to include some
> >>> coverage of stray recovered.edits files.
> >>>
> >>> On Tue, Sep 22, 2020 at 8:58 AM Martin Braun <[email protected]> wrote:
> >>>>
> >>>> Hello all,
> >>>>
> >>>> I have an issue with hbase 2.2.5 (and hadoop-2.8.5): after a full-disk
> >>>> event I have 38 inconsistencies. When I do a
> >>>>
> >>>> hbase --internal-classpath hbck
> >>>>
> >>>> I get a bunch of these errors:
> >>>>
> >>>> ERROR: Orphan region in HDFS: Unable to load .regioninfo from table
> >>>> tt_ix_bizStep_inserting in hdfs dir
> >>>> hdfs://localhost:9000/hbase/data/default/tt_ix_bizStep_inserting/8a1acb499bf454b072daeee5960daa73!
> >>>> It may be an invalid format or version file. Treating as an orphaned
> >>>> regiondir.
> >>>> ERROR: Orphan region in HDFS: Unable to load .regioninfo from table
> >>>> tt_ix_bizStep_inserting in hdfs dir
> >>>> hdfs://localhost:9000/hbase/data/default/tt_ix_bizStep_inserting/8f64025b68958ebddeb812297facdfc6!
> >>>> It may be an invalid format or version file. Treating as an orphaned
> >>>> regiondir.
> >>>>
> >>>> When looking into these directories I see that there is indeed no
> >>>> .regioninfo file:
> >>>>
> >>>> hdfs dfs -ls -R hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0
> >>>>
> >>>> drwxr-xr-x  - jenkins supergroup     0 2020-09-21 11:23 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits
> >>>> -rw-r--r--  3 jenkins supergroup 74133 2020-09-21 11:11 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000285
> >>>> -rw-r--r--  3 jenkins supergroup 74413 2020-09-16 19:03 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000286
> >>>> -rw-r--r--  3 jenkins supergroup 74693 2020-09-16 19:05 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000287
> >>>> -rw-r--r--  3 jenkins supergroup 79427 2020-09-16 18:27 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000305
> >>>>
> >>>> The hbck2 manual from hbase-operator-tools tells me, for Orphan Data,
> >>>> to read http://hbase.apache.org/book.html#arch.bulk.load.complete.strays,
> >>>> chapter "72.4.1. 'Adopting' Stray Data".
> >>>>
> >>>> However it seems that this is a different case: a completebulkload on
> >>>> the named directories seems to do nothing…
> >>>>
> >>>> A scan 'hbase:meta', {COLUMN=>'info:regioninfo'} does not show any
> >>>> errors.
> >>>>
> >>>> How can I resolve these inconsistencies of the missing .regioninfo?
> >>>>
> >>>> TIA
> >>>>
> >>>> best,
> >>>> Martin
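[Editor's note: for the verification route Sean suggests above, the WAL
pretty printer can be pointed straight at one of the recovered.edits
files just listed, since (per Sean) they share the WAL file format. A
minimal sketch, reusing a path from Martin's listing; exact flags and
output vary by HBase version:

  # Dump the edits (row, column, timestamp per cell) in one
  # recovered.edits file so they can be compared against the live table.
  hbase org.apache.hadoop.hbase.wal.WALPrettyPrinter \
    hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000285

Each printed cell can then be checked with a get against
tt_ix_parent_item in the HBase shell.]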
> >>>>
> >>>
> >>> --
> >>> Sean
> >>
> >> [email protected]
> >> T: +49 6227 3984255
> >> F: +49 6227 3984254
> >> ZFabrik Software GmbH & Co. KG
> >> Lammstrasse 2, 69190 Walldorf
> >>
> >> Commercial register: Amtsgericht Mannheim HRA 702598
> >> General partner: ZFabrik Verwaltungs GmbH, registered office Walldorf
> >> Managing directors: Dr. H. Blohm and Udo Offermann
> >> Commercial register: Amtsgericht Mannheim HRB 723699
> >
> >
> > --
> > Sean
-- Sean
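
[Editor's addendum: on the NoClassDefFoundError in the Sep 22, 10:24
message: the org/codehaus/jackson classes are evidently missing from the
classpath that the local --internal-classpath run assembles, but Hadoop
2.8.5 itself ships them (jackson-core-asl / jackson-mapper-asl). A
hypothetical, untested workaround is to hand those jars to the hbase
launcher via HBASE_CLASSPATH; the jar location below is an assumption
based on a stock Hadoop layout, and <wals>/<table> are placeholders:

  # Assumed layout: the codehaus jackson jars live under
  # $HADOOP_HOME/share/hadoop/common/lib (e.g. jackson-mapper-asl-1.9.13.jar).
  export HBASE_CLASSPATH=$(ls "$HADOOP_HOME"/share/hadoop/common/lib/jackson-*-asl-*.jar | tr '\n' ':')
  hbase --internal-classpath org.apache.hadoop.hbase.mapreduce.WALPlayer <wals> <table>

Submitting the job through yarn jar with the shaded mapreduce artifact,
as Sean shows earlier in the thread, avoids the classpath issue
altogether and is probably the better first move.]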
