Hello Sean, thanks, I was able to run the WALPlayer as a map reduce job like this:
yarn jar "hbase-2.2.5/lib/shaded-clients/hbase-shaded-mapreduce-2.2.5.jar" WALPlayer "/hbase/data/default/tt_ix_parent_item/60de2aa0715e35657f5dd0e67bf07de8/recovered.edits/" "tt_ix_parent_item" However the johistory logs don’t show that WALPlayer has done anything, there were 0 maps and 0 reduces , and no counters… Runtime is also only 6s. In the full logs I can see the typical starting and ending job logs, but nothing else... Even if I provide a WALFile itself like “/hbase/data/default/tt_ix_parent_item/60de2aa0715e35657f5dd0e67bf07de8/recovered.edits/0000000000000000023" the job seems to do nothing. The path itself seems to be correct, because I get an exception if I provide an incorrect path... The WAL files under the recovered.edits path are not empty and about 20 to 50 KB How/Where can I see that the WALFiles get really replayed? I have 38 of these recovered.edits. What would be the next steps after a successful replay? best, Martin > On 22. Sep 2020, at 18:38, Sean Busbey <[email protected]> wrote: > > hurm. following the instructions from the reference guide works for > me. Is there a specific reason you're passing the > '--internal-classpath' flag? Do other hadoop jobs work? > > what if you submit it as a proper MR job? unfortunately the ref guide > is thin on explaining this atm, but it looks like: > > HADOOP_CLASSPATH="${HBASE_CONF_DIR}:$("${HBASE_HOME}/bin/hbase" > mapredcp)" yarn jar > "${HBASE_HOME}/lib/shaded-clients/hbase-shaded-mapreduce-2.2.5.jar" > WALPlayer some/path/to/wals/ 'some:example' > > On Tue, Sep 22, 2020 at 10:24 AM Martin Braun <[email protected]> wrote: >> >> Hello Sean, >> >> thank you for you quick response! >> >> Replaying the wal files would be OK- however I am struggling using the >> WALPlayer: >> >> >> hbase --internal-classpath org.apache.hadoop.hbase.mapreduce.WALPlayer >> hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits >> tt_ix_parent_item >> Exception in thread "main" java.lang.NoClassDefFoundError: >> org/codehaus/jackson/map/JsonMappingException >> at org.apache.hadoop.mapreduce.Job.getJobSubmitter(Job.java:1325) >> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1336) >> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1359) >> at org.apache.hadoop.hbase.mapreduce.WALPlayer.run(WALPlayer.java:428) >> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) >> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) >> at >> org.apache.hadoop.hbase.mapreduce.WALPlayer.main(WALPlayer.java:417) >> Caused by: java.lang.ClassNotFoundException: >> org.codehaus.jackson.map.JsonMappingException >> at java.net.URLClassLoader.findClass(URLClassLoader.java:382) >> at java.lang.ClassLoader.loadClass(ClassLoader.java:418) >> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) >> at java.lang.ClassLoader.loadClass(ClassLoader.java:351) >> ... 7 more >> >> Could you provide some hints how to use the WALPlayer correctly? >> >> >> >> best, >> Martin >> >>> On 22. Sep 2020, at 16:52, Sean Busbey <[email protected]> wrote: >>> >>> bulk loading stuff works with hfiles. recovered.edits files are >>> formatted the same as WAL files rather than as HFiles. for wal files >>> you can use the wal replayer to ensure those edits are all present in >>> the table. >>> >>> IIRC there is an unknown sequence of events that can result in the >>> recovered edits sticking around for a region after they've already >>> been recovered. 
>>> Presuming your use case will work for having the same
>>> edit played multiple times (basically if you do not mess about with
>>> cell level timestamps or keeping multiple versions around), then it
>>> should be fine to sideline those edits and then replay them using the
>>> wal player.
>>>
>>> If your use case isn't fine with that, then you can use the wal pretty
>>> printer to examine the edits that are there and check to ensure the
>>> cells are already in the table in a current region.
>>>
>>> sounds like we should update the troubleshooting tips to include some
>>> coverage of stray recovered.edits files.
>>>
>>> On Tue, Sep 22, 2020 at 8:58 AM Martin Braun <[email protected]> wrote:
>>>>
>>>> Hello all,
>>>>
>>>> I have an issue with hbase 2.2.5 (and hadoop-2.8.5): after a full disk
>>>> event I have 38 inconsistencies. When I do a
>>>>
>>>> hbase --internal-classpath hbck
>>>>
>>>> I get a bunch of these errors:
>>>>
>>>> ERROR: Orphan region in HDFS: Unable to load .regioninfo from table
>>>> tt_ix_bizStep_inserting in hdfs dir
>>>> hdfs://localhost:9000/hbase/data/default/tt_ix_bizStep_inserting/8a1acb499bf454b072daeee5960daa73!
>>>> It may be an invalid format or version file. Treating as an orphaned
>>>> regiondir.
>>>> ERROR: Orphan region in HDFS: Unable to load .regioninfo from table
>>>> tt_ix_bizStep_inserting in hdfs dir
>>>> hdfs://localhost:9000/hbase/data/default/tt_ix_bizStep_inserting/8f64025b68958ebddeb812297facdfc6!
>>>> It may be an invalid format or version file. Treating as an orphaned
>>>> regiondir.
>>>>
>>>> When looking into these directories I see that there is indeed no
>>>> .regioninfo file:
>>>>
>>>> hdfs dfs -ls -R hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0
>>>>
>>>> drwxr-xr-x   - jenkins supergroup      0 2020-09-21 11:23 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits
>>>> -rw-r--r--   3 jenkins supergroup  74133 2020-09-21 11:11 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000285
>>>> -rw-r--r--   3 jenkins supergroup  74413 2020-09-16 19:03 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000286
>>>> -rw-r--r--   3 jenkins supergroup  74693 2020-09-16 19:05 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000287
>>>> -rw-r--r--   3 jenkins supergroup  79427 2020-09-16 18:27 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000305
>>>>
>>>> The hbck2 manual from hbase-operator-tools tells me, for Orphan Data,
>>>> to read http://hbase.apache.org/book.html#arch.bulk.load.complete.strays,
>>>> chapter "72.4.1. 'Adopting' Stray Data".
>>>>
>>>> However, it seems that this is another case: a completebulkload on the named
>>>> directories seems to do nothing...
>>>>
>>>> A scan 'hbase:meta', {COLUMN=>'info:regioninfo'} does not show any errors.
>>>>
>>>> How can I resolve these inconsistencies of the missing .regioninfo?
>>>>
>>>> TIA
>>>>
>>>> best,
>>>> Martin
>>>
>>> --
>>> Sean
>
> --
> Sean
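
Regarding the question above about how to see whether the WAL files really get replayed: one low-tech check is to dump the recovered.edits files with the wal pretty printer Sean mentions and then spot-check the table afterwards. A minimal sketch, assuming the WALPrettyPrinter class name used by HBase 2.x and the paths quoted above; the row key in the shell commands is only a placeholder:

    # dump the edits in one recovered.edits file (these files use the WAL format)
    hbase org.apache.hadoop.hbase.wal.WALPrettyPrinter \
      /hbase/data/default/tt_ix_parent_item/60de2aa0715e35657f5dd0e67bf07de8/recovered.edits/0000000000000000023

    # after a WALPlayer run, spot-check from the hbase shell that the rows
    # shown by the pretty printer are present in the table:
    #   get 'tt_ix_parent_item', '<row key taken from the printer output>'
    #   count 'tt_ix_parent_item'

If WALPlayer actually read the files, the job's "Map input records" counter in the jobhistory UI should be non-zero; a run with 0 maps generally means the job's input format produced no splits for the supplied path.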

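And for Sean's suggestion to sideline the stray edits and then replay them with the wal player, a rough sketch of what that could look like. The /hbase/sideline path is just an invented example name, and the same steps would have to be repeated for each of the 38 region directories:

    # move the stray recovered.edits out of the region dir into a sideline
    # location first (/hbase/sideline is an arbitrary example path)
    hdfs dfs -mkdir -p /hbase/sideline/tt_ix_parent_item/60de2aa0715e35657f5dd0e67bf07de8
    hdfs dfs -mv /hbase/data/default/tt_ix_parent_item/60de2aa0715e35657f5dd0e67bf07de8/recovered.edits \
      /hbase/sideline/tt_ix_parent_item/60de2aa0715e35657f5dd0e67bf07de8/

    # then replay from the sidelined copy with the same MR invocation as above
    HADOOP_CLASSPATH="${HBASE_CONF_DIR}:$("${HBASE_HOME}/bin/hbase" mapredcp)" \
      yarn jar "${HBASE_HOME}/lib/shaded-clients/hbase-shaded-mapreduce-2.2.5.jar" \
      WALPlayer /hbase/sideline/tt_ix_parent_item/60de2aa0715e35657f5dd0e67bf07de8/recovered.edits \
      tt_ix_parent_item

Moving the recovered.edits out of the region directory, rather than copying them, should also keep a later region open from trying to apply the same edits again.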