Do you have rows in your .META. table where the info:regioninfo column is missing, Geoff? Hack check_meta.rb to emit each row before it goes to deserialize the HRegionInfo, so you can find the problem row. If a row has no info:regioninfo, delete it... or change check_meta.rb to use getHRegionInfoOrNull instead of getHRegionInfo and, if it returns null, just move past the null row.

St.Ack
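For concreteness, a minimal JRuby sketch of that hack might look like the following. It assumes HBase 0.90-era client classes and is not the stock check_meta.rb: it prints each .META. row key before attempting deserialization, and uses Writables.getHRegionInfoOrNull so a row with a missing or unparseable info:regioninfo is flagged and skipped instead of triggering the NPE shown below.

  # Illustrative sketch only; class and constant names assume an
  # HBase 0.90-era client API. Verify against your version.
  include Java
  import org.apache.hadoop.hbase.HBaseConfiguration
  import org.apache.hadoop.hbase.HConstants
  import org.apache.hadoop.hbase.client.HTable
  import org.apache.hadoop.hbase.client.Scan
  import org.apache.hadoop.hbase.util.Bytes
  import org.apache.hadoop.hbase.util.Writables

  config = HBaseConfiguration.create()
  meta = HTable.new(config, HConstants::META_TABLE_NAME)
  scan = Scan.new()
  # Scan the whole info family so rows missing info:regioninfo still appear.
  scan.addFamily(HConstants::CATALOG_FAMILY)
  scanner = meta.getScanner(scan)
  while (result = scanner.next())
    # Emit the row key BEFORE deserializing, so the problem row is
    # visible even when deserialization would blow up.
    puts Bytes.toStringBinary(result.getRow())
    bytes = result.getValue(HConstants::CATALOG_FAMILY,
                            HConstants::REGIONINFO_QUALIFIER)
    # getHRegionInfoOrNull returns nil instead of throwing.
    hri = bytes.nil? ? nil : Writables.getHRegionInfoOrNull(bytes)
    if hri.nil?
      puts "  -> no usable info:regioninfo; candidate for deletion"
      next
    end
  end
  scanner.close()

Run it the same way as the other scripts (hbase org.jruby.Main <script>.rb). If a flagged row really has no info:regioninfo, deleting the row from the hbase shell with deleteall '.META.', '<row>' should be one way to act on the advice above.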
On Thu, Aug 11, 2011 at 10:24 PM, Geoff Hendrey <[email protected]> wrote:
> Thanks,
>
> check_meta.rb stack traces with NPE...
>
> [hroot@doop10 bin]$ hbase org.jruby.Main check_meta.rb
> Writables.java:75:in `org.apache.hadoop.hbase.util.Writables.getWritable': java.lang.NullPointerException: null (NativeException)
>     from Writables.java:119:in `org.apache.hadoop.hbase.util.Writables.getHRegionInfo'
>     from null:-1:in `sun.reflect.GeneratedMethodAccessor6.invoke'
>     from DelegatingMethodAccessorImpl.java:43:in `sun.reflect.DelegatingMethodAccessorImpl.invoke'
>     from Method.java:616:in `java.lang.reflect.Method.invoke'
>     from JavaMethod.java:196:in `org.jruby.javasupport.JavaMethod.invokeWithExceptionHandling'
>     from JavaMethod.java:182:in `org.jruby.javasupport.JavaMethod.invoke_static'
>     from JavaClass.java:371:in `org.jruby.javasupport.JavaClass$StaticMethodInvoker.execute'
>     from SimpleCallbackMethod.java:81:in `org.jruby.internal.runtime.methods.SimpleCallbackMethod.call'
>     ... 16 levels...
>     from Main.java:183:in `org.jruby.Main.runInterpreter'
>     from Main.java:120:in `org.jruby.Main.run'
>     from Main.java:95:in `org.jruby.Main.main'
> Complete Java stackTrace
> java.lang.NullPointerException
>     at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:75)
>     at org.apache.hadoop.hbase.util.Writables.getHRegionInfo(Writables.java:119)
>     at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:616)
>     at org.jruby.javasupport.JavaMethod.invokeWithExceptionHandling(JavaMethod.java:196)
>     at org.jruby.javasupport.JavaMethod.invoke_static(JavaMethod.java:182)
>     at org.jruby.javasupport.JavaClass$StaticMethodInvoker.execute(JavaClass.java:371)
>     at org.jruby.internal.runtime.methods.SimpleCallbackMethod.call(SimpleCallbackMethod.java:81)
>     at org.jruby.evaluator.EvaluationState.callNode(EvaluationState.java:571)
>     at org.jruby.evaluator.EvaluationState.evalInternal(EvaluationState.java:207)
>     at org.jruby.evaluator.EvaluationState.localAsgnNode(EvaluationState.java:1254)
>     at org.jruby.evaluator.EvaluationState.evalInternal(EvaluationState.java:286)
>     at org.jruby.evaluator.EvaluationState.blockNode(EvaluationState.java:533)
>     at org.jruby.evaluator.EvaluationState.evalInternal(EvaluationState.java:201)
>     at org.jruby.evaluator.EvaluationState.whileNode(EvaluationState.java:1793)
>     at org.jruby.evaluator.EvaluationState.evalInternal(EvaluationState.java:387)
>     at org.jruby.evaluator.EvaluationState.blockNode(EvaluationState.java:533)
>     at org.jruby.evaluator.EvaluationState.evalInternal(EvaluationState.java:201)
>     at org.jruby.evaluator.EvaluationState.rootNode(EvaluationState.java:1628)
>     at org.jruby.evaluator.EvaluationState.evalInternal(EvaluationState.java:356)
>     at org.jruby.evaluator.EvaluationState.eval(EvaluationState.java:164)
>     at org.jruby.Ruby.eval(Ruby.java:278)
>     at org.jruby.Ruby.compileOrFallbackAndRun(Ruby.java:306)
>     at org.jruby.Main.runInterpreter(Main.java:238)
>     at org.jruby.Main.runInterpreter(Main.java:183)
>     at org.jruby.Main.run(Main.java:120)
>     at org.jruby.Main.main(Main.java:95)
>
> -----Original Message-----
> From: Jinsong Hu [mailto:[email protected]]
> Sent: Thursday, August 11, 2011 3:18 PM
> To: [email protected]
> Subject: Re: corrupt .logs block
>
> As I said, run "hbase org.jruby.Main add_table.rb <table_name>" first,
> then run "hbase org.jruby.Main check_meta.rb --fix", then restart HBase.
>
> It doesn't completely solve the problem for me, as hbck still complains,
> but at least it recovers all the data and I can do a full rowcount for
> the table.
>
> Jimmy.
>
> --------------------------------------------------
> From: "Geoff Hendrey" <[email protected]>
> Sent: Thursday, August 11, 2011 2:21 PM
> To: "Jinsong Hu" <[email protected]>; <[email protected]>
> Subject: RE: corrupt .logs block
>
>> Hey -
>>
>> Our table behaves fine until we try to run a mapreduce job that reads
>> and writes from the table. When we try to retrieve keys from the
>> afflicted regions, the job just hangs forever. It's interesting because
>> we never get timeouts of any sort. This is different from other
>> failures we've seen, in which we'd get expired leases. This is a
>> critical bug for us because it is preventing the launch of a product
>> databuild which I have to complete in the next week.
>>
>> Does anyone have any suggestions as to how I can bring the afflicted
>> regions online? Worst case, delete the regions?
>>
>> -geoff
>>
>> -----Original Message-----
>> From: Jinsong Hu [mailto:[email protected]]
>> Sent: Thursday, August 11, 2011 11:47 AM
>> To: [email protected]
>> Cc: Search
>> Subject: Re: corrupt .logs block
>>
>> I ran into the same issue. I tried check_meta.rb --fix and
>> add_table.rb, and still get the same hbck "inconsistent" table;
>> however, I am able to do a rowcount for the table and there is no
>> problem.
>>
>> Jimmy
>>
>> --------------------------------------------------
>> From: "Geoff Hendrey" <[email protected]>
>> Sent: Thursday, August 11, 2011 10:36 AM
>> To: <[email protected]>
>> Cc: "Search" <[email protected]>
>> Subject: RE: corrupt .logs block
>>
>>> So I deleted the corrupt .logs files. OK, fine, no more issue there.
>>> But a handful of regions in a very large table (2000+ regions) are
>>> offline (".META." says offline=true).
>>>
>>> How do I go about trying to get those regions online, and how come
>>> restarting HBase has no effect (the regions are still offline)?
>>>
>>> Tried 'hbck -fix', no effect. hbck simply lists the table as
>>> "inconsistent".
>>>
>>> Would appreciate any advice on how to resolve this.
>>>
>>> Thanks,
>>> geoff
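To see exactly which .META. rows carry offline=true, a variant of the scan sketched earlier can deserialize each info:regioninfo and report the offline flag. Again this is only an illustrative sketch against an assumed 0.90-era client API, not one of the stock scripts:

  # Illustrative sketch only: list .META. rows whose HRegionInfo is
  # marked offline=true. Assumes 0.90-era classes.
  include Java
  import org.apache.hadoop.hbase.HBaseConfiguration
  import org.apache.hadoop.hbase.HConstants
  import org.apache.hadoop.hbase.client.HTable
  import org.apache.hadoop.hbase.client.Scan
  import org.apache.hadoop.hbase.util.Writables

  config = HBaseConfiguration.create()
  meta = HTable.new(config, HConstants::META_TABLE_NAME)
  scanner = meta.getScanner(Scan.new())
  while (result = scanner.next())
    bytes = result.getValue(HConstants::CATALOG_FAMILY,
                            HConstants::REGIONINFO_QUALIFIER)
    next if bytes.nil?
    hri = Writables.getHRegionInfoOrNull(bytes)
    next if hri.nil?
    # Report any region whose serialized HRegionInfo says offline=true.
    puts "OFFLINE: #{hri.getRegionNameAsString()}" if hri.isOffline()
  end
  scanner.close()

The add_table.rb / check_meta.rb --fix sequence Jinsong describes above is the recovery path; this scan only shows which regions still need it.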
>>>
>>> -----Original Message-----
>>> From: [email protected] [mailto:[email protected]] On Behalf Of Stack
>>> Sent: Monday, August 08, 2011 4:25 PM
>>> To: [email protected]
>>> Subject: Re: corrupt .logs block
>>>
>>> Well, if it's a log that is no longer used, then you could just delete
>>> it. That'll get rid of the fsck complaint. (True, logs are not per
>>> table, so to be safe you'd need to flush all tables -- this would get
>>> all edits that the log could be carrying out of memory and into hfiles
>>> in the filesystem.)
>>>
>>> St.Ack
>>>
>>> On Mon, Aug 8, 2011 at 4:20 PM, Geoff Hendrey <[email protected]> wrote:
>>>> Ah. Thanks for that. No, I don't need the log anymore. I am aware of
>>>> how to flush a table from the hbase shell. But since "fsck /" tells
>>>> me a log file is corrupt, not which table the corruption pertains to,
>>>> does this mean I have to flush all my tables? (I have a lot of
>>>> tables.)
>>>>
>>>> -geoff
>>>>
>>>> -----Original Message-----
>>>> From: [email protected] [mailto:[email protected]] On Behalf Of Stack
>>>> Sent: Monday, August 08, 2011 4:09 PM
>>>> To: [email protected]
>>>> Subject: Re: corrupt .logs block
>>>>
>>>> On Sat, Aug 6, 2011 at 12:12 PM, Geoff Hendrey <[email protected]> wrote:
>>>>> I've got a corrupt HDFS block in a region server's ".logs" directory.
>>>>
>>>> You see this when you do hdfs fsck? Is the log still needed? You
>>>> could do a flush across the cluster and that should do away with your
>>>> dependency on this log.
>>>>
>>>> St.Ack
>>>
>>
>
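On the flush point: since a WAL under .logs is not tied to a single table, flushing every table forces any edits the corrupt log might still carry out to hfiles, after which the log can be deleted. A rough sketch of a cluster-wide flush, in the same assumed 0.90-era JRuby style as the sketches above:

  # Illustrative sketch only: request a flush of every user table.
  # Assumes 0.90-era classes; flush() is asynchronous here, so allow
  # the flushes to complete before removing the old log.
  include Java
  import org.apache.hadoop.hbase.HBaseConfiguration
  import org.apache.hadoop.hbase.client.HBaseAdmin

  config = HBaseConfiguration.create()
  admin = HBaseAdmin.new(config)
  admin.listTables().each do |descriptor|
    name = descriptor.getNameAsString()
    puts "flushing #{name}"
    admin.flush(name)
  end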
