The client synchronizes so that only one thread does the actual META lookup, to reduce extra traffic on the META table. You can use the object IDs in the thread dump to find out which threads are blocked and which is/are the blocker(s). Once you poke at it, it's not too hard to figure out. Personally I use 'less' and its / search feature, which also highlights matches.
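To make that locking concrete, here is a minimal Java sketch of the pattern, with made-up names (RegionLocationCache, lookupInMeta); it is not the actual HConnectionManager code, just the shape of it: one thread holds the lock and performs the META lookup while the others wait, then read the result from the cache.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class RegionLocationCache {
    private final Map<String, String> cache = new ConcurrentHashMap<String, String>();
    private final Object metaLookupLock = new Object();    // threads that miss the cache queue up here

    public String locateRegion(String rowKey) {
        String cached = cache.get(rowKey);
        if (cached != null) {
            return cached;                                  // fast path: no META traffic at all
        }
        synchronized (metaLookupLock) {                     // only one thread proceeds at a time
            cached = cache.get(rowKey);                     // re-check: the blocker may have filled the cache
            if (cached != null) {
                return cached;
            }
            String location = lookupInMeta(rowKey);         // the single RPC against the .META. table
            cache.put(rowKey, location);
            return location;
        }
    }

    // Stand-in for the real RPC against .META.; returns a fake location.
    private String lookupInMeta(String rowKey) {
        return "regionserver-for-" + rowKey;
    }
}

In a thread dump of code like this, one thread would show as holding the monitor (the blocker) and the rest as waiting to lock the same object ID, which is exactly what to search for.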
During splits you can see this kind of behaviour. Does the client "unstick" itself and move on?

-ryan

On Thu, Jan 27, 2011 at 10:33 PM, Schubert Zhang <[email protected]> wrote:
> 1. The .META. table seems OK.
> I can read my data table (one thread for reading).
> I can use the hbase shell to scan my data table.
> And I can use 1~4 threads to put more data into my data table.
>
> Before this issue happened, about 800 million entities (columns) had been
> put into the table successfully, and there are 253 regions for this table.
>
> 2. There is nothing strange in the logs (INFO level).
> 3. All clients use HBaseConfiguration.create() for a new Configuration
> instance.
>
> 4. The 8+ client threads are running on a single machine and a single JVM.
>
> 5. It seems all 8+ threads are blocked in the same location, waiting on a
> call to return.
>
> Currently there is no more clue, and I am digging for more.
>
> On Fri, Jan 28, 2011 at 12:02 PM, Stack <[email protected]> wrote:
>
>> That's a lookup on the .META. table. Is the region hosting .META. OK?
>> Anything in its logs? Do your clients share a Configuration instance,
>> or do you make a new one each time you make an HTable? Is your
>> threaded client running on a single machine? Can we see the full stack
>> trace? Are all 8+ threads blocked in the same location waiting on a
>> call to return?
>>
>> St.Ack
>>
>> On Wed, Jan 26, 2011 at 9:19 AM, Schubert Zhang <[email protected]> wrote:
>> > "Thread-Opr0" is the client thread putting data into hbase; it is
>> > waiting.
>> >
>> > "Thread-Opr0-EventThread" daemon prio=10 tid=0x00002aaafc7a8000 nid=0xe08
>> > waiting on condition [0x000000004383f000]
>> >    java.lang.Thread.State: WAITING (parking)
>> >         at sun.misc.Unsafe.park(Native Method)
>> >         - parking to wait for <0x00002aaab632ae50> (a
>> > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>> >         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>> >         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>> >         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>> >         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
>> >
>> > "Thread-Opr0-SendThread(nd1-rack2-cloud:2181)" daemon prio=10
>> > tid=0x00002aaafc7a6800 nid=0xe07 runnable [0x000000004373e000]
>> >    java.lang.Thread.State: RUNNABLE
>> >         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>> >         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
>> >         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
>> >         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
>> >         - locked <0x00002aaab6304410> (a sun.nio.ch.Util$1)
>> >         - locked <0x00002aaab6304428> (a java.util.Collections$UnmodifiableSet)
>> >         - locked <0x00002aaab632abd0> (a sun.nio.ch.EPollSelectorImpl)
>> >         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
>> >         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1107)
>> >
>> > "Thread-Opr0" prio=10 tid=0x00002aab0402a000 nid=0xdf2 in Object.wait()
>> > [0x000000004262d000]
>> >    java.lang.Thread.State: WAITING (on object monitor)
>> >         at java.lang.Object.wait(Native Method)
>> >         - waiting on <0x00002aaab04302d0> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>> >         at java.lang.Object.wait(Object.java:485)
>> >         at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:739)
>> >         - locked <0x00002aaab04302d0> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
>> >         at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>> >         at $Proxy0.getClosestRowBefore(Unknown Source)
>> >         at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:517)
>> >         at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:515)
>> >         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1000)
>> >         at org.apache.hadoop.hbase.client.HTable.getRowOrBefore(HTable.java:514)
>> >         at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:133)
>> >         at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:95)
>> >         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:645)
>> >         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:699)
>> >         - locked <0x00002aaab6294660> (a java.lang.Object)
>> >         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:590)
>> >         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1114)
>> >         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1234)
>> >         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:819)
>> >         at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:675)
>> >         at org.apache.hadoop.hbase.client.HTable.put(HTable.java:660)
>> >         at com.bigdata.bench.hbase.HBaseWriter$Operator.operateTo(Unknown Source)
>> >         at com.bigdata.bench.hbase.HBaseWriter$Operator.run(Unknown Source)
>> >
>> > On Thu, Jan 27, 2011 at 12:06 AM, Schubert Zhang <[email protected]> wrote:
>> >
>> >> Even though I cannot put more data into the table, I can read the
>> >> existing data.
>> >>
>> >> And when I stop and re-start HBase, I still cannot put more data.
>> >>
>> >> hbase(main):031:0> status 'simple'
>> >> 8 live servers
>> >>     nd5-rack2-cloud:60020 1296057544120
>> >>         requests=0, regions=32, usedHeap=130, maxHeap=8973
>> >>     nd8-rack2-cloud:60020 1296057544350
>> >>         requests=0, regions=31, usedHeap=128, maxHeap=8983
>> >>     nd2-rack2-cloud:60020 1296057543346
>> >>         requests=0, regions=32, usedHeap=130, maxHeap=8973
>> >>     nd3-rack2-cloud:60020 1296057544224
>> >>         requests=0, regions=32, usedHeap=133, maxHeap=8973
>> >>     nd6-rack2-cloud:60020 1296057544482
>> >>         requests=0, regions=32, usedHeap=130, maxHeap=8983
>> >>     nd9-rack2-cloud:60020 1296057544565
>> >>         requests=174, regions=32, usedHeap=180, maxHeap=8983
>> >>     nd7-rack2-cloud:60020 1296057544617
>> >>         requests=0, regions=32, usedHeap=126, maxHeap=8983
>> >>     nd4-rack2-cloud:60020 1296057544138
>> >>         requests=0, regions=32, usedHeap=126, maxHeap=8973
>> >> 0 dead servers
>> >> Aggregate load: 174, regions: 255
>> >>
>> >> On Wed, Jan 26, 2011 at 11:58 PM, Schubert Zhang <[email protected]> wrote:
>> >>
>> >>> I am using 0.90.0 (8 RS + 1 Master), and the HDFS is CDH3b3.
>> >>>
>> >>> During the first hours of running, I put many entities (tens of millions,
>> >>> each about 200 bytes), and it worked well.
>> >>>
>> >>> But then the client could not put more data.
>> >>>
>> >>> I checked all the HBase log files and found nothing abnormal; I will
>> >>> continue to check this issue.
>> >>>
>> >>> It seems related to ZooKeeper......
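Regarding Stack's question above about sharing a Configuration instance: as I understand the 0.90 client, HTable instances built from the same Configuration reuse one underlying HConnection (and its region cache), while calling HBaseConfiguration.create() in every thread gives each thread its own connection and its own META prefetching. Below is a rough sketch of the shared-Configuration setup; the table name, column family, and thread naming are made up for illustration, not taken from the original poster's code.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class SharedConfWriter {
    // One Configuration created once and shared by all writer threads.
    private static final Configuration SHARED_CONF = HBaseConfiguration.create();

    public static void main(String[] args) {
        for (int i = 0; i < 8; i++) {
            final int id = i;
            Thread t = new Thread(new Runnable() {
                public void run() {
                    try {
                        // HTable itself is not thread-safe, so each thread gets its
                        // own instance, but all instances reuse SHARED_CONF.
                        HTable table = new HTable(SHARED_CONF, "testtable");
                        Put put = new Put(Bytes.toBytes("row-" + id));
                        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"),
                                Bytes.toBytes("value-" + id));
                        table.put(put);
                        table.close();
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }, "Thread-Opr" + id);
            t.start();
        }
    }
}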
