We are generating the load from multiple machines, yes. Do you happen to know the name of the setting for the number of ThriftServer threads? I can't find anything that is obviously about that in the CDH manager.
- Dan

On Mar 1, 2013, at 1:46 PM, Varun Sharma wrote:

> Did you try running 30-40 proc(s) on one machine and another 30-40 proc(s)
> on another machine to see if that doubles the throughput?
>
> On Fri, Mar 1, 2013 at 10:46 AM, Varun Sharma <[email protected]> wrote:
>
>> Hi,
>>
>> I don't know how many worker threads you have at the thrift servers. Each
>> thread gets dedicated to a single connection and only serves that
>> connection. New connections get queued. Also, are you sure that you are not
>> saturating the client side making the calls?
>>
>> Varun
>>
>>
>> On Fri, Mar 1, 2013 at 9:33 AM, Jean-Daniel Cryans
>> <[email protected]> wrote:
>>
>>> The primary unit of load distribution in HBase is the region, make
>>> sure you have more than one. This is well documented in the manual
>>> http://hbase.apache.org/book/perf.writing.html
>>>
>>> J-D
>>>
>>> On Fri, Mar 1, 2013 at 4:17 AM, Dan Crosta <[email protected]> wrote:
>>>> We are using a 6-node HBase cluster with a Thrift Server on each of the
>>>> RegionServer nodes, and trying to evaluate maximum write throughput for our
>>>> use case (which involves many processes sending mutateRowsTs commands).
>>>> Somewhere between about 30 and 40 processes writing into the system we
>>>> cross the threshold where adding additional writers yields only very
>>>> limited returns to throughput, and I'm not sure why. We see that the CPU
>>>> and Disk on the DataNode/RegionServer/ThriftServer machines are not
>>>> saturated, nor is the NIC in those machines. I'm a little unsure where to
>>>> look next.
>>>>
>>>> A little more detail about our deployment:
>>>>
>>>> * CDH 4.1.2
>>>> * DataNode/RegionServer/ThriftServer class: EC2 m1.xlarge
>>>> ** RegionServer: 8GB heap
>>>> ** ThriftServer: 1GB heap
>>>> ** DataNode: 4GB heap
>>>> ** EC2 ephemeral (i.e. local, not EBS) volumes used for HDFS
>>>>
>>>> If there's any other information that I can provide, or any other
>>>> configuration or system settings I should look at, I'd appreciate the
>>>> pointers.
>>>>
>>>> Thanks,
>>>> - Dan
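
Varun's description of the Thrift server (one worker thread per connection, extra connections queued) suggests one possible reason for the plateau. The following is only a toy model of that idea, not a measurement and not HBase code: the function names, the worker-thread count, and the per-connection rate are all hypothetical values chosen for illustration.

```python
# Toy model of a thread-per-connection server: each worker thread serves
# exactly one connection, and any further connections wait in a queue.
# All numbers below are hypothetical; they are not HBase or CDH defaults.


def served_connections(writers: int, worker_threads: int) -> int:
    """Number of writer connections that currently hold a worker thread."""
    return min(writers, worker_threads)


def queued_connections(writers: int, worker_threads: int) -> int:
    """Number of writer connections waiting for a free worker thread."""
    return max(0, writers - worker_threads)


def estimated_throughput(writers: int, worker_threads: int,
                         rows_per_sec_per_connection: float) -> float:
    """Aggregate rows/sec if only served connections make progress."""
    return served_connections(writers, worker_threads) * rows_per_sec_per_connection


if __name__ == "__main__":
    WORKER_THREADS = 32   # hypothetical total worker threads
    RATE = 500.0          # hypothetical rows/sec for one served connection
    for writers in (10, 20, 30, 40, 50):
        print(writers,
              queued_connections(writers, WORKER_THREADS),
              estimated_throughput(writers, WORKER_THREADS, RATE))
```

In this model throughput grows linearly with the number of writers until every worker thread is busy, then stays flat while the queue grows. If the real cluster behaves this way, the plateau between 30 and 40 writers would point at the worker-thread limit rather than at CPU, disk, or network.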
