The client synchronizes so that only one thread does the actual META lookup, to reduce extra traffic on the META table. You can use the object IDs in the thread dump to find out which threads are blocked and which is/are the blocker(s). Once you poke at it, it's not too hard to figure out. Personally I use 'less' and its / search feature, which also highlights matches.
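To make that locking concrete, here is a minimal Java sketch of the pattern, with made-up names (RegionLocationCache, lookupInMeta); it is not the actual HConnectionManager code, just the shape of it: one thread holds the lock and performs the META lookup while the others wait, then read the result from the cache.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class RegionLocationCache {
    private final Map<String, String> cache = new ConcurrentHashMap<String, String>();
    private final Object metaLookupLock = new Object();    // threads that miss the cache queue up here

    public String locateRegion(String rowKey) {
        String cached = cache.get(rowKey);
        if (cached != null) {
            return cached;                                  // fast path: no META traffic at all
        }
        synchronized (metaLookupLock) {                     // only one thread proceeds at a time
            cached = cache.get(rowKey);                     // re-check: the blocker may have filled the cache
            if (cached != null) {
                return cached;
            }
            String location = lookupInMeta(rowKey);         // the single RPC against the .META. table
            cache.put(rowKey, location);
            return location;
        }
    }

    // Stand-in for the real RPC against .META.; returns a fake location.
    private String lookupInMeta(String rowKey) {
        return "regionserver-for-" + rowKey;
    }
}

In a thread dump of code like this, one thread would show as holding the monitor (the blocker) and the rest as waiting to lock the same object ID, which is exactly what to search for.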
During splits you can see this kind of behaviour. Does the client "unstick" itself and move on?

-ryan

On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang <[email protected]> wrote:
> 1. The .META. table seems OK.
> I can read my data table (one thread for reading).
> I can use the hbase shell to scan my data table.
> And I can use 1~4 threads to put more data into my data table.
>
> Before this issue happened, about 800 million entities (columns) had been
> put into the table successfully, and there are 253 regions for this table.
>
> 2. There is nothing strange in the logs (INFO level).
> 3. All clients use HBaseConfiguration.create() for a new Configuration
> instance.
>
> 4. The 8+ client threads are running on a single machine and a single JVM.
>
> 5. It seems all 8+ threads are blocked in the same location, waiting on a
> call to return.
>
> Currently there is no more clue, and I am digging for more.
>
> On Fri, Jan 28, 2011 at 12:02 PM, Stack <[email protected]> wrote:
>
>> That's a lookup on the .META. table. Is the region hosting .META. OK?
>> Anything in its logs? Do your clients share a Configuration instance,
>> or do you make a new one each time you make an HTable? Is your
>> threaded client running on a single machine? Can we see the full stack
>> trace? Are all 8+ threads blocked in the same location waiting on a
>> call to return?
>>
>> St.Ack
>>
>> On Wed, Jan 26, 2011 at 9:19 AM, Schubert Zhang <[email protected]> wrote:
>> > "Thread-Opr0" is the client thread putting data into hbase; it is
>> > waiting.
>> >
>> > "Thread-Opr0-EventThread" daemon prio=10 tid=0x00002aaafc7a8000 nid=0xe08
>> > waiting on condition [0x000000004383f000]
>> >    java.lang.Thread.State: WAITING (parking)
>> >         at sun.misc.Unsafe.park(Native Method)
>> >         - parking to wait for <0x00002aaab632ae50> (a
>> > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>> >         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>> >         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>> >         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>> >         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
>> >
>> > "Thread-Opr0-SendThread(nd1-rack2-cloud:2181)" daemon prio=10
>> > tid=0x00002aaafc7a6800 nid=0xe07 runnable [0x000000004373e000]
>> >    java.lang.Thread.State: RUNNABLE
>> >         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>> >         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
>> >         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
>> >         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
>> >         - locked <0x00002aaab6304410> (a sun.nio.ch.Util$1)
>> >         - locked <0x00002aaab6304428> (a java.util.Collections$UnmodifiableSet)
>> >         - locked <0x00002aaab632abd0> (a sun.nio.ch.EPollSelectorImpl)
>> >         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
>> >         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)
>> >
>> > "Thread-Opr0" prio=10 tid=0x00002aab0402a000 nid=0xdf2 in Object.wait()
>> > [0x000000004262d000]
>> >    java.lang.Thread.State: WAITING (on object monitor)
>> >         at java.lang.Object.wait(Native Method)
>> >         - waiting on <0x00002aaab04302d0> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>> >         at java.lang.Object.wait(Object.java:485)
>> >         at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:739)
>> >         - locked <0x00002aaab04302d0> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>> >         at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>> >         at $Proxy0.getClosestRowBefore(Unknown Source)
>> >         at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:517)
>> >         at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:515)
>> >         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1000)
>> >         at org.apache.hadoop.hbase.client.HTable.getRowOrBefore(HTable.java:514)
>> >         at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:133)
>> >         at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:95)
>> >         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:645)
>> >         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:699)
>> >         - locked <0x00002aaab6294660> (a java.lang.Object)
>> >         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:590)
>> >         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1114)
>> >         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1234)
>> >         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:819)
>> >         at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:675)
>> >         at org.apache.hadoop.hbase.client.HTable.put(HTable.java:660)
>> >         at com.bigdata.bench.hbase.HBaseWriter$Operator.operateTo(Unknown Source)
>> >         at com.bigdata.bench.hbase.HBaseWriter$Operator.run(Unknown Source)
>> >
>> > On Thu, Jan 27, 2011 at 12:06 AM, Schubert Zhang <[email protected]> wrote:
>> >
>> >> Even though I cannot put more data into the table, I can read the
>> >> existing data.
>> >>
>> >> And when I stop and re-start HBase, I still cannot put more data.
>> >>
>> >> hbase(main):031:0> status 'simple'
>> >> 8 live servers
>> >>     nd5-rack2-cloud:60020 1296057544120
>> >>         requests=0, regions=32, usedHeap=130, maxHeap=8973
>> >>     nd8-rack2-cloud:60020 1296057544350
>> >>         requests=0, regions=31, usedHeap=128, maxHeap=8983
>> >>     nd2-rack2-cloud:60020 1296057543346
>> >>         requests=0, regions=32, usedHeap=130, maxHeap=8973
>> >>     nd3-rack2-cloud:60020 1296057544224
>> >>         requests=0, regions=32, usedHeap=133, maxHeap=8973
>> >>     nd6-rack2-cloud:60020 1296057544482
>> >>         requests=0, regions=32, usedHeap=130, maxHeap=8983
>> >>     nd9-rack2-cloud:60020 1296057544565
>> >>         requests=174, regions=32, usedHeap=180, maxHeap=8983
>> >>     nd7-rack2-cloud:60020 1296057544617
>> >>         requests=0, regions=32, usedHeap=126, maxHeap=8983
>> >>     nd4-rack2-cloud:60020 1296057544138
>> >>         requests=0, regions=32, usedHeap=126, maxHeap=8973
>> >> 0 dead servers
>> >> Aggregate load: 174, regions: 255
>> >>
>> >> On Wed, Jan 26, 2011 at 11:58 PM, Schubert Zhang <[email protected]> wrote:
>> >>
>> >>> I am using 0.90.0 (8 RS + 1 Master), and the HDFS is CDH3b3.
>> >>>
>> >>> During the first hours of running, I put many entities (tens of millions,
>> >>> each about 200 bytes), and it worked well.
>> >>>
>> >>> But then the client could not put more data.
>> >>>
>> >>> I checked all the HBase log files and found nothing abnormal; I will
>> >>> continue to check this issue.
>> >>>
>> >>> It seems related to ZooKeeper......
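Regarding Stack's question above about sharing a Configuration instance: as I understand the 0.90 client, HTable instances built from the same Configuration reuse one underlying HConnection (and its region cache), while calling HBaseConfiguration.create() in every thread gives each thread its own connection and its own META prefetching. Below is a rough sketch of the shared-Configuration setup; the table name, column family, and thread naming are made up for illustration, not taken from the original poster's code.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class SharedConfWriter {
    // One Configuration created once and shared by all writer threads.
    private static final Configuration SHARED_CONF = HBaseConfiguration.create();

    public static void main(String[] args) {
        for (int i = 0; i < 8; i++) {
            final int id = i;
            Thread t = new Thread(new Runnable() {
                public void run() {
                    try {
                        // HTable itself is not thread-safe, so each thread gets its
                        // own instance, but all instances reuse SHARED_CONF.
                        HTable table = new HTable(SHARED_CONF, "testtable");
                        Put put = new Put(Bytes.toBytes("row-" + id));
                        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"),
                                Bytes.toBytes("value-" + id));
                        table.put(put);
                        table.close();
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }, "Thread-Opr" + id);
            t.start();
        }
    }
}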
