[
https://issues.apache.org/jira/browse/HBASE-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gary Helmling updated HBASE-3553:
---------------------------------
Attachment: benchmark_results.txt
I tested this with an 8-slave cluster on EC2 to confirm the performance
improvements. After adding multi-get support to YCSB, I see a throughput
increase of 30-50% and an average latency reduction of 22-34% when the thread
pool size is set correctly. More detailed results are attached for anyone
interested.
One thing to note: increasing the client YCSB threads from 16 to 32 actually
decreased the differential versus the single-threaded pool, so 8 worker
threads x 32 HTable instances was likely losing some performance to client
thread contention. For clusters with highly concurrent clients (like webapps),
it may be advantageous to tune the "hbase.htable.threads.max" value down from
its default of the number of region servers. A future improvement could be to
allow use of a shared, configurable thread pool as well.
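As a sketch of that tuning, assuming the 0.90-era client API (the table name
"usertable" and the cap of 4 threads are arbitrary values for illustration,
not recommendations):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class TunedClient {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Cap each HTable's worker pool below the default of
        // "# of region servers" to reduce contention when many
        // client threads share one JVM.
        conf.setInt("hbase.htable.threads.max", 4);
        HTable table = new HTable(conf, "usertable");
        // ... issue gets/puts ...
        table.close();
    }
}
```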
> number of active threads in HTable's ThreadPoolExecutor
> -------------------------------------------------------
>
> Key: HBASE-3553
> URL: https://issues.apache.org/jira/browse/HBASE-3553
> Project: HBase
> Issue Type: Improvement
> Components: client
> Affects Versions: 0.90.1
> Reporter: Himanshu Vashishtha
> Fix For: 0.90.2
>
> Attachments: ThreadPoolTester.java, benchmark_results.txt
>
>
> Using a ThreadPoolExecutor with corePoolSize = 0 and a LinkedBlockingQueue
> as the collection holding incoming runnable tasks has the effect of running
> only 1 thread, irrespective of the maximum pool size set from the
> hbase.htable.threads.max property (or the number of region servers). (This
> is what I infer from reading the source code of the ThreadPoolExecutor class
> in Java 1.6.)
> On a 3-node EC2 cluster, a full table scan over approximately 9M rows takes
> almost the same time with a sequential scanner (240 secs) as with a
> coprocessor (230 secs) that uses HTable's pool to submit a callable for each
> region.
> I wrote a test class that creates a similar thread pool and checks whether
> the pool size ever grows beyond 1. It confirms that the pool size remains 1
> even after executing 100 requests.
> It seems the desired behavior was to release all resources when the client
> is done reading, but this can also be achieved by setting
> allowCoreThreadTimeOut to true (after setting a positive corePoolSize).
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira