I am also getting stuck using 0.9.05. I am just doing a simple scan in
Java, and it sometimes hangs iterating in the scanner. I am also seeing
my 'hbase shell' get stuck while counting rows, and doing simple queries.
I'm not doing anything fancy.
P
On 1/17/12 5:13 PM, Jean-Daniel Cryans wrote:
That stack trace is really just a debug message left in the Hadoop
code (not even HBase!). Also it's surprising that we create a
Configuration there, but that's another issue...
So there's something weird with that row, or maybe the following rows
too? Could you start a scanner after that row and see if it completes?
Then when the scanner is stuck (I guess it fails with a
SocketTimeoutException after 60 seconds?), did you try running jstack
on the region server that's hosting the region? You could also try the
HFile tool on that region to see what's going on with your data; look
at 8.7.5.2.2 under http://hbase.apache.org/book.html#regions.arch
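For the "start a scanner after that row" test, here is a minimal sketch of how the start key could be built. It relies on two things: HBase compares row keys byte by byte, and a Scan start row is inclusive, so the smallest key that sorts strictly after a given row is that row plus one trailing zero byte. The names RowKeys and rowAfter are made up for illustration and are not part of the HBase API.

```java
import java.util.Arrays;

public class RowKeys {
    /**
     * Returns the smallest row key that sorts strictly after the given one
     * in byte-wise lexicographic order: the same bytes plus a trailing 0x00.
     * (Hypothetical helper, not an HBase method.)
     */
    public static byte[] rowAfter(byte[] row) {
        // Arrays.copyOf pads the extra slot with zero, which is the byte we want.
        return Arrays.copyOf(row, row.length + 1);
    }
}
```

The result would then be used as the start row, e.g. new Scan(RowKeys.rowAfter(stuckRow)), where stuckRow stands for whichever row key you want to skip past. That key is an assumption here, since the thread never shows it.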
Hope this helps,
J-D
On Sat, Jan 14, 2012 at 5:59 AM, Joel Halbert <[email protected]> wrote:
So in summary, using HBase 0.9.05, Java 6u30, standalone, any client,
including the shell, gets stuck at around record 10k.
If I run shell> count 'table' it stalls at the 10k count.
If I run HBase at TRACE level I see this in the logs, repeating. Could it be
related?
2012-01-14 13:57:05,883 DEBUG org.apache.hadoop.ipc.HBaseServer: got #5
2012-01-14 13:57:05,884 DEBUG org.apache.hadoop.ipc.HBaseServer: PRI IPC Server handler 2 on 40160: has #5 from 127.0.0.1:52866
2012-01-14 13:57:05,884 DEBUG org.apache.hadoop.ipc.HBaseServer: Served: close queueTime= 0 procesingTime= 0
2012-01-14 13:57:05,884 DEBUG org.apache.hadoop.ipc.HBaseServer: IPC Server Responder: responding to #5 from 127.0.0.1:52866
2012-01-14 13:57:05,884 DEBUG org.apache.hadoop.ipc.HBaseServer: IPC Server Responder: responding to #5 from 127.0.0.1:52866 Wrote 8 bytes.
2012-01-14 13:57:05,903 DEBUG org.apache.hadoop.ipc.HBaseServer: got #6
2012-01-14 13:57:05,904 DEBUG org.apache.hadoop.conf.Configuration: java.io.IOException: config()
at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:211)
at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:198)
at org.apache.hadoop.hbase.client.Scan.createForName(Scan.java:504)
at org.apache.hadoop.hbase.client.Scan.readFields(Scan.java:524)
at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:555)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation.readFields(HBaseRPC.java:127)
at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:978)
at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:946)
at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:522)
at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:316)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
On 14/01/12 11:28, Joel Halbert wrote:
This problem appears to be unrelated to my use of a scanner, or the client
code.
If in the hbase shell I run
count 'table'
it also gets stuck, at around record number 10,000.
Is this a corrupted table? Is there any way to repair it?
On 13/01/12 23:03, Joel Halbert wrote:
It always hangs waiting on the same record....
On 13/01/12 22:48, Joel Halbert wrote:
Successfully got a few thousand results....nothing exceptional in the
hbase log:
2012-01-13 22:42:13,830 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2012-01-13 22:42:13,832 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2012-01-13 22:42:32,580 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRUStats: total=332.03 MB, free=61.32 MB, max=393.35 MB, blocks=1524, accesses=720942, hits=691565, hitRatio=95.92%%, cachingAccesses=720938, cachingHits=691565, cachingHitsRatio=95.92%%, evictions=149, evicted=27849, evictedPerRun=186.90603637695312
2012-01-13 22:42:36,222 DEBUG org.apache.hadoop.hbase.master.LoadBalancer: Server information: localhost.localdomain,59902,1326492448413=15
2012-01-13 22:42:36,223 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=1 regions=15 average=15.0 mostloaded=15 leastloaded=15
2012-01-13 22:42:36,236 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 14 catalog row(s) and gc'd 0 unreferenced parent region(s)
On 13/01/12 22:46, T Vinod Gupta wrote:
did u get any scan results at all?
check your region server and master hbase logs for any warnings..
also, just fyi - the standalone version of hbase is not super stable. i
have had many similar problems in the past. the distributed mode is
much, much more robust.
thanks
On Fri, Jan 13, 2012 at 2:36 PM, Joel Halbert <[email protected]> wrote:
I have a standalone instance of HBase (single instance, on localhost).
After reading a few thousand records using a scanner, my thread is stuck
waiting:
"main" prio=10 tid=0x00000000016d4800 nid=0xf3a in Object.wait()
[0x00007fbe96dc3000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:503)
at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:757)
- locked <0x00000007e2ba21d0> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
at $Proxy4.next(Unknown Source)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:79)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:38)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1019)
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:182)
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:95)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:649)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:703)
- locked <0x00000007906dfcf8> (a java.lang.Object)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:594)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:559)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionLocation(HConnectionManager.java:416)
at org.apache.hadoop.hbase.client.ServerCallable.instantiateServer(ServerCallable.java:57)
at org.apache.hadoop.hbase.client.ScannerCallable.instantiateServer(ScannerCallable.java:63)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1018)
at org.apache.hadoop.hbase.client.HTable$ClientScanner.nextScanner(HTable.java:1104)
at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1196)
at org.apache.hadoop.hbase.client.HTable$ClientScanner$1.hasNext(HTable.java:1256)
at crawler.cache.PageCache.accept(PageCache.java:254)
Concretely, it is stuck on the iterator.next method:
Scan scan = new Scan(Bytes.toBytes(hostnameTarget),
Bytes.toBytes(hostnameTarget + (char) 127));
scan.setMaxVersions(1);
scan.setCaching(4);
ResultScanner resscan = table.getScanner(scan);
Iterator<Result> it = resscan.iterator();
while (it.hasNext()) { // stuck here
Any clues?
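Since the scan always hangs on the same record, one way to pin down where is to drive the iterator from a worker thread and give up when a single step takes too long, keeping the last item that came back. This is only a sketch in plain Java: StuckScanProbe, lastItemBeforeStall and the stallMillis parameter are invented names, nothing in it is HBase API, and it only reports where the stall happens, it does not fix it.

```java
import java.util.Iterator;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;

public class StuckScanProbe {
    /**
     * Steps through the iterator on a worker thread and returns the last item
     * read before the iterator either ended or took longer than stallMillis
     * on a single hasNext()/next() step. Returns null if nothing was read.
     * (Hypothetical helper for illustration.)
     */
    public static <T> T lastItemBeforeStall(Iterator<T> it, long stallMillis) {
        AtomicReference<T> last = new AtomicReference<>();
        // Daemon thread, so a step that never returns cannot keep the JVM alive.
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "scan-probe");
            t.setDaemon(true);
            return t;
        });
        // One step: advance by one item, or report that the iterator is done.
        Callable<Boolean> step = () -> {
            if (!it.hasNext()) {
                return false;
            }
            last.set(it.next());
            return true;
        };
        try {
            while (true) {
                Future<Boolean> f = worker.submit(step);
                try {
                    if (!f.get(stallMillis, TimeUnit.MILLISECONDS)) {
                        break; // iterator finished normally
                    }
                } catch (TimeoutException stalled) {
                    break; // this step hung; report what we have so far
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                } catch (ExecutionException e) {
                    throw new IllegalStateException("iterator failed", e.getCause());
                }
            }
            return last.get();
        } finally {
            worker.shutdownNow(); // interrupt the worker if it is still blocked
        }
    }
}
```

With the scanner above it might be called as StuckScanProbe.lastItemBeforeStall(resscan.iterator(), 10000), and the row key of the returned Result printed with Bytes.toStringBinary(result.getRow()). The 10-second limit is an arbitrary choice for the example.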