[
https://issues.apache.org/jira/browse/PHOENIX-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Junegunn Choi updated PHOENIX-2982:
-----------------------------------
    Attachment: cpu-util-without-patch.png
                cpu-util-with-patch.png
The attached images show the CPU utilization of the regionservers with and
without the patch.
> Keep the number of active handlers balanced across salt buckets
> ---------------------------------------------------------------
>
> Key: PHOENIX-2982
> URL: https://issues.apache.org/jira/browse/PHOENIX-2982
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Junegunn Choi
> Assignee: Junegunn Choi
> Attachments: cpu-util-with-patch.png, cpu-util-without-patch.png
>
>
> I'd like to discuss the idea of keeping the number of active handlers
> balanced across the salt buckets during a parallel scan by exposing the
> per-bucket counters to the JobManager queue.
> h4. Background
> I was testing Phoenix on a 10-node test cluster. While running a few
> full-scan queries simultaneously on a table whose {{SALT_BUCKETS}} is 100, I
> noticed that small queries such as {{SELECT * FROM T LIMIT 100}} were
> occasionally blocked for up to tens of seconds due to the exhaustion of
> regionserver handler threads.
> {{hbase.regionserver.handler.count}} was set to 100, which is much larger
> than the default of 30, so I didn't expect this to happen: with
> {{phoenix.query.threadPoolSize}} = 128, the expected load is roughly
> 128 threads / 10 nodes * 4 queries =~ 50 concurrent scans per server, well
> under the 100 handlers. But from periodic thread dumps, I could observe
> that the number of active handlers was often heavily skewed across the
> regionservers during execution.
> {noformat}
> # Obtained by periodic thread dumps
> # (key = regionserver ID, value = number of active handlers)
> 17:23:48: {1=>6, 2=>3, 3=>27, 4=>12, 5=>13, 6=>23, 7=>23, 8=>5, 9=>5, 10=>10}
> 17:24:18: {1=>8, 2=>6, 3=>26, 4=>3, 5=>13, 6=>41, 7=>11, 8=>5, 9=>8, 10=>5}
> 17:24:48: {1=>15, 3=>30, 4=>3, 5=>8, 6=>22, 7=>11, 8=>16, 9=>16, 10=>7}
> 17:25:18: {1=>6, 2=>12, 3=>37, 4=>6, 5=>4, 6=>2, 7=>21, 8=>10, 9=>24, 10=>5}
> 17:25:48: {1=>4, 2=>9, 3=>48, 4=>14, 5=>2, 6=>7, 7=>18, 8=>16, 9=>2, 10=>8}
> {noformat}
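> For reference, the counts above came from periodic {{jstack}} dumps. A
> rough counting sketch could look like the following; the thread-name
> pattern {{RpcServer.handler}} is an assumption for illustration and may
> differ across HBase versions:
> {code:java}
> import java.io.BufferedReader;
> import java.io.InputStreamReader;
>
> // Counts handler threads that are RUNNABLE in a thread dump read from
> // stdin. The "RpcServer.handler" name pattern is an assumption; adjust
> // it to the handler thread names of the HBase version in use.
> public class ActiveHandlerCount {
>     public static void main(String[] args) throws Exception {
>         BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
>         int active = 0;
>         String prev = "";
>         for (String line; (line = in.readLine()) != null; prev = line) {
>             // In a jstack dump, a thread's name line is immediately
>             // followed by its "java.lang.Thread.State:" line.
>             if (prev.contains("RpcServer.handler")
>                     && line.contains("java.lang.Thread.State: RUNNABLE")) {
>                 active++;
>             }
>         }
>         System.out.println(active);
>     }
> }
> {code}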
> Although {{ParallelIterators.submitWork}} shuffles the parallel scan tasks
> before submitting them to the {{ThreadPoolExecutor}}, there's currently no
> mechanism to prevent the skew from developing at runtime.
> h4. Suggestion
> Maintain an "active" counter for each salt bucket, and expose the counts to
> the JobManager queue via a specialized {{Producer}} implementation so that
> the queue can choose a scan for the least loaded bucket (see the sketch
> below).
> By doing so we can prevent the handler exhaustion problem described above
> and can expect more consistent utilization of cluster resources.
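> The sketch below is for illustration only; {{BucketLoadTracker}} and its
> methods are hypothetical names, not existing Phoenix APIs:
> {code:java}
> import java.util.List;
> import java.util.concurrent.atomic.AtomicIntegerArray;
> import java.util.function.ToIntFunction;
>
> // Hypothetical sketch, not the actual patch: per-bucket "active"
> // counters that a specialized producer consults to pick the next scan.
> class BucketLoadTracker {
>     private final AtomicIntegerArray active;
>
>     BucketLoadTracker(int saltBuckets) {
>         active = new AtomicIntegerArray(saltBuckets);
>     }
>
>     // Called when a scan against the bucket starts and finishes.
>     void onStart(int bucket)  { active.incrementAndGet(bucket); }
>     void onFinish(int bucket) { active.decrementAndGet(bucket); }
>
>     // Index of the pending scan whose salt bucket currently has the
>     // fewest active scans; the producer hands out this one next.
>     <T> int leastLoadedIndex(List<T> pending, ToIntFunction<T> bucketOf) {
>         int best = 0, bestLoad = Integer.MAX_VALUE;
>         for (int i = 0; i < pending.size(); i++) {
>             int load = active.get(bucketOf.applyAsInt(pending.get(i)));
>             if (load < bestLoad) { bestLoad = load; best = i; }
>         }
>         return best;
>     }
> }
> {code}
> The specialized {{Producer}} would call {{onStart}} when it hands a scan
> out and {{onFinish}} when the scan completes, so each selection reflects
> the current load on every bucket.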
> h4. Evaluation
> I measured the response times of a full-scan query with and without the
> patch. The plan for the query is as follows:
> {noformat}
> > explain select count(*) from image;
> +---------------------------------------------------------------------------------------------+
> | PLAN                                                                                        |
> +---------------------------------------------------------------------------------------------+
> | CLIENT 2700-CHUNK 1395337221 ROWS 817889379319 BYTES PARALLEL 100-WAY FULL SCAN OVER IMAGE |
> | SERVER FILTER BY FIRST KEY ONLY                                                             |
> | SERVER AGGREGATE INTO SINGLE ROW                                                            |
> +---------------------------------------------------------------------------------------------+
> {noformat}
> Result:
> - Without patch: 260 sec
> - With patch: 220 sec
> And with the patch, the CPU utilization of the regionservers was very
> stable during the run, as shown in the attached images.
> I'd like to hear what you think about this approach. Please let me know if
> there are any concerns.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)