[
https://issues.apache.org/jira/browse/HBASE-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gary Helmling updated HBASE-3553:
---------------------------------
Attachment: benchmark_results.txt
I tested this with an 8-slave cluster on EC2 to confirm the performance
improvements. After adding multi-get support to YCSB, I see a throughput
increase of 30-50% and an average latency reduction of 22-34% when the thread
pool size is set correctly. More detailed results are attached for anyone
interested.
One thing to note: increasing the client YCSB threads from 16 to 32 actually
decreased the differential versus the single-threaded pool, so 8 worker
threads x 32 HTable instances was likely losing some performance to client
thread contention. For clusters with highly concurrent clients (like webapps),
it may be advantageous to tune the "hbase.htable.threads.max" value down from
its default of the number of region servers. A future improvement could be to
allow use of a shared, configurable thread pool as well.
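As a sketch of that tuning, assuming the 0.90-era client API (the table name
"usertable" and the cap of 4 threads are arbitrary values for illustration,
not recommendations):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class TunedClient {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Cap each HTable's worker pool below the default of
        // "# of region servers" to reduce contention when many
        // client threads share one JVM.
        conf.setInt("hbase.htable.threads.max", 4);
        HTable table = new HTable(conf, "usertable");
        // ... issue gets/puts ...
        table.close();
    }
}
```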
> number of active threads in HTable's ThreadPoolExecutor
> -------------------------------------------------------
>
> Key: HBASE-3553
> URL: https://issues.apache.org/jira/browse/HBASE-3553
> Project: HBase
> Issue Type: Improvement
> Components: client
> Affects Versions: 0.90.1
> Reporter: Himanshu Vashishtha
> Fix For: 0.90.2
>
> Attachments: ThreadPoolTester.java, benchmark_results.txt
>
>
> Using a ThreadPoolExecutor with corePoolSize = 0 and a LinkedBlockingQueue
> as the collection holding incoming runnable tasks has the effect of running
> only 1 thread, irrespective of the maximum pool size set from the
> hbase.htable.threads.max property (or the number of region servers). (This
> is what I infer from reading the source code of the ThreadPoolExecutor class
> in Java 1.6.)
> On a 3-node EC2 cluster, a full table scan over approximately 9M rows takes
> almost the same time with a sequential scanner (240 secs) as with a
> coprocessor (230 secs) that uses HTable's pool to submit a callable for each
> region.
> I wrote a test class that creates a similar thread pool and checks whether
> the pool size ever grows beyond 1. It confirms that the pool size remains 1
> even after executing 100 requests.
> It seems the desired behavior was to release all resources when the client
> is done reading, but this can also be achieved by setting
> allowCoreThreadTimeOut to true (after setting a positive corePoolSize).
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira