I missed the fact that there is already a patch that addresses the single HTable used for index updates by switching to a pool instead. See https://issues.apache.org/jira/browse/HBASE-2493. Unfortunately it did not make it into 0.20.4.
St.Ack
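For readers following along, here is a rough sketch of what the pool-based approach discussed in HBASE-2493 amounts to, assuming the 0.20-era HTablePool API (getTable/putTable); this is an illustration of the idea, not the actual patch:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.HTablePool;
import org.apache.hadoop.hbase.client.Put;

public class IndexWriterSketch {
  // Pool of HTables for the index table, so concurrent handler threads
  // are not serialized on one shared, synchronized HTable instance.
  private final HTablePool pool = new HTablePool(new HBaseConfiguration(), 50);

  public void writeIndexEntry(byte[] indexTableName, Put indexPut)
      throws java.io.IOException {
    HTable indexTable = pool.getTable(indexTableName); // borrow a table
    try {
      indexTable.put(indexPut);                        // parallel-friendly write
    } finally {
      pool.putTable(indexTable);                       // return it to the pool
    }
  }
}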
On Fri, Apr 30, 2010 at 4:44 PM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
> On Fri, Apr 30, 2010 at 4:32 PM, Chris Tarnas <c...@email.com> wrote:
>> Thank you, it is nice to get this help.
>>
>> I definitely understand the overhead of writing the index, although it seems
>> much worse than just that overhead would indicate. If I understand you
>> correctly, that is because all inserts into an IndexedTable are synchronized
>> on one table? If that was switched to using an HTablePool it would no longer
>> be as severe a bottleneck (performance is about an order of magnitude better
>> without the indexing)?
>
> They are synchronized per region server, yes, and it _should_ be better
> with a pool since then you can do parallel inserts. Patching it
> doesn't seem hard, but maybe I'm missing some finer details since I
> usually don't work around that code.
>
>>
>> I'm also using thrift to connect and am wondering if that itself puts an
>> overall limit on scaling? It does seem that no matter how many more mappers
>> and servers I add, even without indexing, I am capped at about 5k rows/sec
>> total. I'm waiting a bit as the table grows so that it is split across more
>> regionservers, hopefully that will help, but as far as I can tell I am not
>> hitting any CPU or IO constraint during my tests.
>
> I don't understand the "I'm also using thrift" and "how many more
> mappers" part, you are using Thrift inside a map? Anyway, more
> clients won't help since there's a single mega-serialization of all
> the inserts to the index table per region server. It's normal not to
> see any CPU/mem/IO contention since, in this case, it's all about the
> speed at which you can process a single row insertion. The rest of the
> threads just wait...
>
>>
>> -chris
>>
>> On Apr 30, 2010, at 3:00 PM, Jean-Daniel Cryans wrote:
>>
>>> The contrib packages don't get as much love as core HBase, so they
>>> tend to be less performant and/or reliable and/or maintained and/or
>>> etc. In this case the issue doesn't seem that bad since it could just
>>> use an HTablePool, but using IndexedTables will definitely be slower
>>> than a straight insert since it writes to 2 tables (the main table and
>>> the index).
>>>
>>> J-D
>>>
>>> On Fri, Apr 30, 2010 at 2:53 PM, Chris Tarnas <c...@email.com> wrote:
>>>> It appears that for multiple simultaneous loads, using the IndexedTables
>>>> is probably not the best choice?
>>>>
>>>> -chris
>>>>
>>>> On Apr 30, 2010, at 2:39 PM, Jean-Daniel Cryans wrote:
>>>>
>>>>> Yeah, more handlers won't do it here since there are tons of calls
>>>>> waiting on a single synchronized method. I guess the IndexedRegion
>>>>> should use a pool of HTables instead of a single one in order to
>>>>> improve indexing throughput.
>>>>>
>>>>> J-D
>>>>>
>>>>> On Fri, Apr 30, 2010 at 2:26 PM, Chris Tarnas <c...@email.com> wrote:
>>>>>> Here is the thread dump:
>>>>>>
>>>>>> I cranked up the handlers to 300 just in case and ran 40 mappers that
>>>>>> loaded data via thrift. Each node runs its own thrift server. I saw an
>>>>>> average of 18 rows/sec/mapper with no node using more than 10% CPU and
>>>>>> no IO wait. It seems no matter how many mappers I throw at it, the
>>>>>> total number of rows/sec doesn't go much above 700 rows/second, which
>>>>>> seems very, very slow to me.
>>>>>>
>>>>>> Here is the thread dump from a node:
>>>>>>
>>>>>> http://pastebin.com/U3eLRdMV
>>>>>>
>>>>>> I do see quite a bit of waiting and some blocking in there, not sure
>>>>>> exactly how to interpret it all though.
>>>>>>
>>>>>> thanks for any help!
>>>>>> -chris
>>>>>>
>>>>>> On Apr 29, 2010, at 9:14 PM, Ryan Rawson wrote:
>>>>>>
>>>>>>> One thing to check: at the peak of your load, run jstack on one of
>>>>>>> the regionservers and look at the handler threads - if all of them
>>>>>>> are doing something you might be running into handler contention.
>>>>>>>
>>>>>>> It is basically IO bound, ultimately.
>>>>>>>
>>>>>>> -ryan
>>>>>>>
>>>>>>> On Thu, Apr 29, 2010 at 9:12 PM, Chris Tarnas <c...@email.com> wrote:
>>>>>>>> They are all at 100, but none of the regionservers are loaded - most
>>>>>>>> are less than 20% CPU. Is this all network latency?
>>>>>>>>
>>>>>>>> -chris
>>>>>>>>
>>>>>>>> On Apr 29, 2010, at 8:29 PM, Ryan Rawson <ryano...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Every insert on an indexed table would require at the very least an
>>>>>>>>> RPC to a different regionserver. If the regionservers are busy, your
>>>>>>>>> request could wait in the queue for a moment.
>>>>>>>>>
>>>>>>>>> One param to tune would be the handler thread count. Set it to 100
>>>>>>>>> at least.
>>>>>>>>>
>>>>>>>>> On Thu, Apr 29, 2010 at 2:16 AM, Chris Tarnas <c...@email.com> wrote:
>>>>>>>>>>
>>>>>>>>>> I just finished some testing with JDK 1.6 u17 - so far no performance
>>>>>>>>>> improvements from just changing that. Disabling LZO compression did
>>>>>>>>>> gain a little bit (up to about 30/sec from 25/sec per thread).
>>>>>>>>>> Turning off indexes helped the most - that brought me up to 115/sec
>>>>>>>>>> @ 2875 total rows a second. A single perl/thrift process can load at
>>>>>>>>>> over 350 rows/sec, so it's not scaling as well as I would have
>>>>>>>>>> expected, even without the indexes.
>>>>>>>>>>
>>>>>>>>>> Are the transactional indexes that costly? What is the bottleneck
>>>>>>>>>> there? CPU utilization and network packets went up when I disabled
>>>>>>>>>> the indexes, so I don't think those are the bottlenecks for the
>>>>>>>>>> indexes. I was even able to add another 15 insert processes (total
>>>>>>>>>> of 40) and only lost about 10% of per-process throughput. I probably
>>>>>>>>>> could go even higher; none of the nodes are above 60% CPU utilization
>>>>>>>>>> and IO wait was at most 3.5%.
>>>>>>>>>>
>>>>>>>>>> Each rowkey is unique, so there should not be any blocking on the row
>>>>>>>>>> locks. I'll do more indexed tests tomorrow.
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>> -chris
>>>>>>>>>>
>>>>>>>>>> On Apr 29, 2010, at 12:18 AM, Todd Lipcon wrote:
>>>>>>>>>>
>>>>>>>>>>> Definitely smells like JDK 1.6.0_18. Downgrade that back to 16 or 17
>>>>>>>>>>> and you should be good to go. _18 is a botched release if I ever saw
>>>>>>>>>>> one.
>>>>>>>>>>>
>>>>>>>>>>> -Todd
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 28, 2010 at 10:54 PM, Chris Tarnas <c...@email.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Stack,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for looking. I checked the ganglia charts: no server was at
>>>>>>>>>>>> more than ~20% CPU utilization at any time during the load test, and
>>>>>>>>>>>> swap was never used.
>>>>>>>>>>>> Network traffic was light - just running a count through hbase
>>>>>>>>>>>> shell generates much higher use. On the server hosting meta
>>>>>>>>>>>> specifically, it was at about 15-20% CPU, and IO wait never went
>>>>>>>>>>>> above 3%, usually staying down near 0.
>>>>>>>>>>>>
>>>>>>>>>>>> The load also died with a thrift timeout on every single node (each
>>>>>>>>>>>> node connecting to localhost for its thrift server). It looks like a
>>>>>>>>>>>> datanode just died and caused every thrift connection to time out -
>>>>>>>>>>>> I'll have to up that limit to handle a node death.
>>>>>>>>>>>>
>>>>>>>>>>>> Checking logs, this appears in the log of the regionserver hosting
>>>>>>>>>>>> meta; it looks like the dead datanode caused this error:
>>>>>>>>>>>>
>>>>>>>>>>>> 2010-04-29 01:01:38,948 WARN org.apache.hadoop.hdfs.DFSClient:
>>>>>>>>>>>> DFSOutputStream ResponseProcessor exception for block
>>>>>>>>>>>> blk_508630839844593817_11180 java.io.IOException: Bad response 1 for
>>>>>>>>>>>> block blk_508630839844593817_11180 from datanode 10.195.150.255:50010
>>>>>>>>>>>> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2423)
>>>>>>>>>>>>
>>>>>>>>>>>> The regionserver log on the dead node, 10.195.150.255, has some more
>>>>>>>>>>>> errors in it:
>>>>>>>>>>>>
>>>>>>>>>>>> http://pastebin.com/EFH9jz0w
>>>>>>>>>>>>
>>>>>>>>>>>> I found this in the .out file on the datanode:
>>>>>>>>>>>>
>>>>>>>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (16.0-b13 mixed mode
>>>>>>>>>>>> linux-amd64 )
>>>>>>>>>>>> # Problematic frame:
>>>>>>>>>>>> # V  [libjvm.so+0x62263c]
>>>>>>>>>>>> #
>>>>>>>>>>>> # An error report file with more information is saved as:
>>>>>>>>>>>> # /usr/local/hadoop-0.20.1/hs_err_pid1364.log
>>>>>>>>>>>> #
>>>>>>>>>>>> # If you would like to submit a bug report, please visit:
>>>>>>>>>>>> # http://java.sun.com/webapps/bugreport/crash.jsp
>>>>>>>>>>>> #
>>>>>>>>>>>>
>>>>>>>>>>>> There is not a single error in the datanode's log, though. Also of
>>>>>>>>>>>> note - this happened well into the test, so the node dying caused
>>>>>>>>>>>> the load to abort but not the prior poor performance. Looking
>>>>>>>>>>>> through the mailing list, it looks like Java 1.6.0_18 has a bad rep,
>>>>>>>>>>>> so I'll update the AMI (although I'm using the same JVM on other
>>>>>>>>>>>> servers in the office w/o issue, with decent single-node performance
>>>>>>>>>>>> and never dying...).
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for any help!
>>>>>>>>>>>> -chris
>>>>>>>>>>>>
>>>>>>>>>>>> On Apr 28, 2010, at 10:10 PM, Stack wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> What is load on the server hosting meta like? Higher than others?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Apr 28, 2010, at 8:42 PM, Chris Tarnas <c...@email.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi JG,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Speed is now down to 18 rows/sec/table per process.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here is a log from a regionserver that is serving two of the regions:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://pastebin.com/Hx5se0hz
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here is the GC log from the same server:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://pastebin.com/ChrRvxCx
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here is the master log:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://pastebin.com/L1Kn66qU
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The thrift server logs have nothing in them in the same time
>>>>>>>>>>>>>> period.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks in advance!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -chris
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Apr 28, 2010, at 7:32 PM, Jonathan Gray wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hey Chris,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That's a really significant slowdown. I can't think of anything
>>>>>>>>>>>>>>> obvious that would cause that in your setup.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Any chance of some regionserver and master logs from the time it
>>>>>>>>>>>>>>> was going slow? Is there any activity in the logs of the
>>>>>>>>>>>>>>> regionservers hosting the regions of the table being written to?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> JG
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>> From: Christopher Tarnas [mailto:c...@tarnas.org] On Behalf Of Chris Tarnas
>>>>>>>>>>>>>>>> Sent: Wednesday, April 28, 2010 6:27 PM
>>>>>>>>>>>>>>>> To: hbase-user@hadoop.apache.org
>>>>>>>>>>>>>>>> Subject: EC2 + Thrift inserts
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> First, thanks to all the HBase developers for producing this,
>>>>>>>>>>>>>>>> it's a great project and I'm glad to be able to use it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm looking for some help and hints here with insert performance.
>>>>>>>>>>>>>>>> I'm doing some benchmarking, testing how I can scale up using
>>>>>>>>>>>>>>>> HBase, not really looking at raw speed. The testing is happening
>>>>>>>>>>>>>>>> on EC2, using Andrew's scripts (thanks - those were very helpful)
>>>>>>>>>>>>>>>> to set things up, and with a slightly customized version of the
>>>>>>>>>>>>>>>> default AMIs (added my application modules). I'm using HBase
>>>>>>>>>>>>>>>> 0.20.3 and Hadoop 0.20.1. I've looked at the tips in the Wiki and
>>>>>>>>>>>>>>>> it looks like Andrew's scripts are already set up that way.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm inserting into HBase from a hadoop streaming job that runs
>>>>>>>>>>>>>>>> perl and uses the thrift gateway. I'm also using the Transactional
>>>>>>>>>>>>>>>> tables, so that alone could be the cause, but from what I can tell
>>>>>>>>>>>>>>>> I don't think so. LZO compression is also enabled for the column
>>>>>>>>>>>>>>>> families (much of the data is highly compressible). My cluster has
>>>>>>>>>>>>>>>> 7 nodes: 5 regionservers, 1 master and 1 zookeeper. The
>>>>>>>>>>>>>>>> regionservers and master are c1.xlarges.
>>>>>>>>>>>>>>>> Each regionserver has the tasktrackers that run the hadoop
>>>>>>>>>>>>>>>> streaming jobs, and each regionserver also runs its own thrift
>>>>>>>>>>>>>>>> server. Each mapper that does the load talks to the localhost's
>>>>>>>>>>>>>>>> thrift server.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The row keys are a fixed string + an incremental number, then
>>>>>>>>>>>>>>>> the order of the bytes is reversed, so runA123 becomes 321Anur.
>>>>>>>>>>>>>>>> I thought of using a murmur hash but was worried about
>>>>>>>>>>>>>>>> collisions.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> As I add more insert jobs, each job's throughput goes down. Way
>>>>>>>>>>>>>>>> down. I went from about 200 rows/sec/table per job with one job
>>>>>>>>>>>>>>>> to about 24 rows/sec/table per job with 25 running jobs. The
>>>>>>>>>>>>>>>> servers are mostly idle. I'm loading into two tables: one has
>>>>>>>>>>>>>>>> several indexes and I'm loading into three column families, the
>>>>>>>>>>>>>>>> other has no indexes and one column family. Both tables currently
>>>>>>>>>>>>>>>> have only two regions each.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The regionserver that serves the indexed table's regions is
>>>>>>>>>>>>>>>> using the most CPU but is 87% idle. The other servers are all at
>>>>>>>>>>>>>>>> ~90% idle. There is no IO wait. The perl processes are barely
>>>>>>>>>>>>>>>> ticking over. Java on the most "loaded" server is using about
>>>>>>>>>>>>>>>> 50-60% of one CPU.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Normally when I do a load in a pseudo-distributed hbase (my
>>>>>>>>>>>>>>>> development platform), perl's speed is the limiting factor and it
>>>>>>>>>>>>>>>> uses about 85% of a CPU. In this cluster the perl processes are
>>>>>>>>>>>>>>>> using only 5-10% of a CPU, as they are all waiting on thrift
>>>>>>>>>>>>>>>> (hbase). When I run only 1 process on the cluster, perl uses much
>>>>>>>>>>>>>>>> more of a CPU, maybe 70%.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Any tips or help in getting the speed/scalability up would be
>>>>>>>>>>>>>>>> great. Please let me know if you need any other info.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> As I send this - it looks like the main table has split again
>>>>>>>>>>>>>>>> and is being served by three regionservers. My performance is
>>>>>>>>>>>>>>>> going up a bit (now 35 rows/sec/table per process), but it still
>>>>>>>>>>>>>>>> seems like I'm not using the full potential of even the limited
>>>>>>>>>>>>>>>> EC2 system: no IO wait and lots of idle CPU.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> many thanks
>>>>>>>>>>>>>>>> -chris
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Todd Lipcon
>>>>>>>>>>> Software Engineer, Cloudera
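For readers following the row-key discussion in the original post above, here is a minimal sketch of the reversal scheme Chris describes (a fixed prefix plus an incrementing number, with the character order reversed so consecutive keys no longer sort next to each other and writes spread across regions). The helper name is hypothetical and not from the thread:

public class RowKeySketch {
    // Hypothetical helper: build a key from a fixed prefix and a sequence
    // number, then reverse the characters so sequential inserts do not all
    // land on a single "hot" region at the end of the key space.
    static String reversedKey(String prefix, long seq) {
        return new StringBuilder(prefix + seq).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(reversedKey("runA", 123)); // prints "321Anur"
    }
}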