That stack trace is really just a debug message left in the Hadoop code (not even HBase!). It's also surprising that we create a Configuration there, but that's another issue...
So there's something weird with that row, or maybe the following rows too? Could you start a scanner after that row and see if it completes? Then, when the scanner is stuck (I guess it fails with a SocketTimeoutException after 60 seconds?), did you try doing a jstack on the region server that's hosting the region? You could also try the HFile tool on that region and see what's going on with your data; look at 8.7.5.2.2 under http://hbase.apache.org/book.html#regions.arch

Hope this helps,

J-D

On Sat, Jan 14, 2012 at 5:59 AM, Joel Halbert <[email protected]> wrote:
> So in summary, using HBase 0.9.05, java 6 u30, standalone, any client,
> including the shell, gets stuck at ~ record 10k.
> If I run shell> count 'table' it stalls at the 10k count.
>
> If I run HBase at TRACE level I see this in the logs, repeating; could it
> be related?
>
> 2012-01-14 13:57:05,883 DEBUG org.apache.hadoop.ipc.HBaseServer: got #5
> 2012-01-14 13:57:05,884 DEBUG org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 2 on 40160: has #5 from 127.0.0.1:52866
> 2012-01-14 13:57:05,884 DEBUG org.apache.hadoop.ipc.HBaseServer: Served: close queueTime= 0 procesingTime= 0
> 2012-01-14 13:57:05,884 DEBUG org.apache.hadoop.ipc.HBaseServer: IPC Server Responder: responding to #5 from 127.0.0.1:52866
> 2012-01-14 13:57:05,884 DEBUG org.apache.hadoop.ipc.HBaseServer: IPC Server Responder: responding to #5 from 127.0.0.1:52866 Wrote 8 bytes.
> 2012-01-14 13:57:05,903 DEBUG org.apache.hadoop.ipc.HBaseServer: got #6
> 2012-01-14 13:57:05,904 DEBUG org.apache.hadoop.conf.Configuration:
> java.io.IOException: config()
>     at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:211)
>     at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:198)
>     at org.apache.hadoop.hbase.client.Scan.createForName(Scan.java:504)
>     at org.apache.hadoop.hbase.client.Scan.readFields(Scan.java:524)
>     at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:555)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation.readFields(HBaseRPC.java:127)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:978)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:946)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:522)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:316)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
>
> On 14/01/12 11:28, Joel Halbert wrote:
>> This problem appears to be unrelated to my use of a scanner, or to the
>> client code.
>>
>> If in the hbase shell I run
>>
>>     count 'table'
>>
>> it also gets stuck, at around record number 10,000.
>>
>> Is this a corrupted table? Is there any way to repair it?
>>
>> On 13/01/12 23:03, Joel Halbert wrote:
>>> It always hangs waiting on the same record....
>>> On 13/01/12 22:48, Joel Halbert wrote:
>>>> Successfully got a few thousand results... nothing exceptional in the
>>>> hbase log:
>>>>
>>>> 2012-01-13 22:42:13,830 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
>>>> 2012-01-13 22:42:13,832 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
>>>> 2012-01-13 22:42:32,580 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRUStats: total=332.03 MB, free=61.32 MB, max=393.35 MB, blocks=1524, accesses=720942, hits=691565, hitRatio=95.92%, cachingAccesses=720938, cachingHits=691565, cachingHitsRatio=95.92%, evictions=149, evicted=27849, evictedPerRun=186.90603637695312
>>>> 2012-01-13 22:42:36,222 DEBUG org.apache.hadoop.hbase.master.LoadBalancer: Server information: localhost.localdomain,59902,1326492448413=15
>>>> 2012-01-13 22:42:36,223 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=1 regions=15 average=15.0 mostloaded=15 leastloaded=15
>>>> 2012-01-13 22:42:36,236 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 14 catalog row(s) and gc'd 0 unreferenced parent region(s)
>>>>
>>>> On 13/01/12 22:46, T Vinod Gupta wrote:
>>>>> did u get any scan results at all?
>>>>> check your region server and master hbase logs for any warnings..
>>>>>
>>>>> also, just fyi - the standalone version of hbase is not super stable. i
>>>>> have had many similar problems in the past. the distributed mode is
>>>>> much more robust.
>>>>>
>>>>> thanks
>>>>>
>>>>> On Fri, Jan 13, 2012 at 2:36 PM, Joel Halbert <[email protected]> wrote:
>>>>>
>>>>>> I have a standalone instance of HBase (single instance, on localhost).
>>>>>>
>>>>>> After reading a few thousand records using a scanner my thread is
>>>>>> stuck waiting:
>>>>>>
>>>>>> "main" prio=10 tid=0x00000000016d4800 nid=0xf3a in Object.wait() [0x00007fbe96dc3000]
>>>>>>    java.lang.Thread.State: WAITING (on object monitor)
>>>>>>     at java.lang.Object.wait(Native Method)
>>>>>>     at java.lang.Object.wait(Object.java:503)
>>>>>>     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:757)
>>>>>>     - locked <0x00000007e2ba21d0> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>>>>>>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>>>>>>     at $Proxy4.next(Unknown Source)
>>>>>>     at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:79)
>>>>>>     at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:38)
>>>>>>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1019)
>>>>>>     at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:182)
>>>>>>     at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:95)
>>>>>>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:649)
>>>>>>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:703)
>>>>>>     - locked <0x00000007906dfcf8> (a java.lang.Object)
>>>>>>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:594)
>>>>>>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:559)
>>>>>>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionLocation(HConnectionManager.java:416)
>>>>>>     at org.apache.hadoop.hbase.client.ServerCallable.instantiateServer(ServerCallable.java:57)
>>>>>>     at org.apache.hadoop.hbase.client.ScannerCallable.instantiateServer(ScannerCallable.java:63)
>>>>>>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1018)
>>>>>>     at org.apache.hadoop.hbase.client.HTable$ClientScanner.nextScanner(HTable.java:1104)
>>>>>>     at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1196)
>>>>>>     at org.apache.hadoop.hbase.client.HTable$ClientScanner$1.hasNext(HTable.java:1256)
>>>>>>     at crawler.cache.PageCache.accept(PageCache.java:254)
>>>>>>
>>>>>> Concretely, it is stuck on the iterator.next method:
>>>>>>
>>>>>>     Scan scan = new Scan(Bytes.toBytes(hostnameTarget),
>>>>>>             Bytes.toBytes(hostnameTarget + (char) 127));
>>>>>>     scan.setMaxVersions(1);
>>>>>>     scan.setCaching(4);
>>>>>>     ResultScanner resscan = table.getScanner(scan);
>>>>>>     Iterator<Result> it = resscan.iterator();
>>>>>>     while (it.hasNext()) { // stuck here
>>>>>>
>>>>>> Any clues?
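To act on J-D's suggestion at the top of the thread (start a scanner just after the suspect row and see if the rest of the table is readable): the smallest row key that sorts strictly after a given key, in HBase's unsigned lexicographic order, is that key with a single 0x00 byte appended. A minimal plain-Java sketch; the class name, helper name, and sample key are mine, not from the thread:

```java
import java.util.Arrays;

public class ResumeScan {
    // Smallest row key that sorts strictly after 'row' in HBase's
    // unsigned lexicographic byte order: the same bytes plus a trailing 0x00.
    static byte[] rowAfter(byte[] row) {
        byte[] next = Arrays.copyOf(row, row.length + 1);
        next[row.length] = 0x00; // copyOf already zero-pads; explicit for clarity
        return next;
    }

    public static void main(String[] args) {
        byte[] stuck = "row-10000".getBytes();   // hypothetical stuck row key
        byte[] resume = rowAfter(stuck);
        System.out.println(resume.length - stuck.length); // 1
        System.out.println(resume[resume.length - 1]);    // 0
    }
}
```

Passing the result as the start row of a fresh scan (e.g. `new Scan(rowAfter(stuckRowKey))`) skips only the stuck row and nothing else; if that scan runs to completion, the problem is confined to that one row (or the region holding it).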

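One note on the scan range in the quoted snippet: the stop row `hostnameTarget + (char) 127` makes the scan cover the half-open range [hostnameTarget, hostnameTarget + 0x7F), which catches every key starting with the hostname as long as the bytes after the prefix stay below 0x7F (true for plain-ASCII URL paths). A self-contained illustration of that ordering, using the same unsigned byte comparison HBase applies to row keys; the class name and sample keys are mine:

```java
public class PrefixRange {
    // Unsigned lexicographic comparison: the order HBase sorts row keys in.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] start = "www.example.com".getBytes();
        byte[] stop  = ("www.example.com" + (char) 127).getBytes();
        byte[] inRange  = "www.example.com/page".getBytes();
        byte[] outRange = "www.example.net".getBytes();
        // A key is returned by the scan iff start <= key < stop.
        System.out.println(compare(start, inRange) <= 0 && compare(inRange, stop) < 0); // true
        System.out.println(compare(outRange, stop) < 0);                                // false
    }
}
```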