This is an automated email from the ASF dual-hosted git repository.

boroknagyz pushed a commit to branch 2.x
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 82db0df1090b7c1f93fd6da5f7b89be3322a0a59
Author: Alex Rodoni <arod...@cloudera.com>
AuthorDate: Tue Jun 26 14:30:38 2018 -0700

    [DOCS] Clarification on admission control and DDL statements

    Removed the confusing example and paragraphs.

    Change-Id: I2e3e82bd34e88e7a13de1864aeb97f01023bc715
    Reviewed-on: http://gerrit.cloudera.org:8080/10829
    Reviewed-by: Tim Armstrong <tarmstr...@cloudera.com>
    Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
---
 docs/topics/impala_admission.xml | 146 ++++++++++++++++-----------------------
 1 file changed, 61 insertions(+), 85 deletions(-)

diff --git a/docs/topics/impala_admission.xml b/docs/topics/impala_admission.xml
index 5de246b..317fa80 100644
--- a/docs/topics/impala_admission.xml
+++ b/docs/topics/impala_admission.xml
@@ -51,6 +51,11 @@ under the License.
       not wait indefinitely, so that you can detect and correct <q>starvation</q> scenarios.
     </p>
     <p>
+      Queries, DML statements, and some DDL statements, including
+      <codeph>CREATE TABLE AS SELECT</codeph> and <codeph>COMPUTE
+      STATS</codeph> are affected by admission control.
+    </p>
+    <p>
       Enable this feature if your cluster is
       underutilized at some times and overutilized at others. Overutilization is indicated by performance
       bottlenecks and queries being cancelled due to out-of-memory conditions, when those same queries are
@@ -765,38 +770,42 @@ impala.admission-control.pool-queue-timeout-ms.<varname>queue_name</varname></ph
 <!-- End Config -->

   <concept id="admission_guidelines">
-
-    <title>Guidelines for Using Admission Control</title>
-    <prolog>
-      <metadata>
-        <data name="Category" value="Planning"/>
-        <data name="Category" value="Guidelines"/>
-        <data name="Category" value="Best Practices"/>
-      </metadata>
-    </prolog>
-
-    <conbody>
-
-      <p>
-        To see how admission control works for particular queries, examine the profile output for the query. This
-        information is available through the <codeph>PROFILE</codeph> statement in <cmdname>impala-shell</cmdname>
-        immediately after running a query in the shell, on the <uicontrol>queries</uicontrol> page of the Impala
-        debug web UI, or in the Impala log file (basic information at log level 1, more detailed information at log
-        level 2). The profile output contains details about the admission decision, such as whether the query was
-        queued or not and which resource pool it was assigned to. It also includes the estimated and actual memory
-        usage for the query, so you can fine-tune the configuration for the memory limits of the resource pools.
-      </p>
-
-      <p>
-        Remember that the limits imposed by admission control are <q>soft</q> limits.
-        The decentralized nature of this mechanism means that each Impala node makes its own decisions about whether
-        to allow queries to run immediately or to queue them. These decisions rely on information passed back and forth
-        between nodes by the statestore service. If a sudden surge in requests causes more queries than anticipated to run
-        concurrently, then throughput could decrease due to queries spilling to disk or contending for resources;
-        or queries could be cancelled if they exceed the <codeph>MEM_LIMIT</codeph> setting while running.
-      </p>
-
-<!--
+    <title>Guidelines for Using Admission Control</title>
+    <prolog>
+      <metadata>
+        <data name="Category" value="Planning"/>
+        <data name="Category" value="Guidelines"/>
+        <data name="Category" value="Best Practices"/>
+      </metadata>
+    </prolog>
+    <conbody>
+      <p>
+        To see how admission control works for particular queries, examine
+        the profile output for the query. This information is available
+        through the <codeph>PROFILE</codeph> statement in
+        <cmdname>impala-shell</cmdname> immediately after running a query in
+        the shell, on the <uicontrol>queries</uicontrol> page of the Impala
+        debug web UI, or in the Impala log file (basic information at log
+        level 1, more detailed information at log level 2). The profile output
+        contains details about the admission decision, such as whether the
+        query was queued or not and which resource pool it was assigned to. It
+        also includes the estimated and actual memory usage for the query, so
+        you can fine-tune the configuration for the memory limits of the
+        resource pools.
+      </p>
+      <p>
+        Remember that the limits imposed by admission control are
+        <q>soft</q> limits. The decentralized nature of this mechanism means
+        that each Impala node makes its own decisions about whether to allow
+        queries to run immediately or to queue them. These decisions rely on
+        information passed back and forth between nodes by the statestore
+        service. If a sudden surge in requests causes more queries than
+        anticipated to run concurrently, then throughput could decrease due to
+        queries spilling to disk or contending for resources; or queries could
+        be cancelled if they exceed the <codeph>MEM_LIMIT</codeph> setting
+        while running.
+      </p>
+      <!--
 <p>
   If you have trouble getting a query to run because its estimated memory usage is too high, you can
   override the estimate by setting the <codeph>MEM_LIMIT</codeph> query option in <cmdname>impala-shell</cmdname>,
@@ -806,58 +815,25 @@ impala.admission-control.pool-queue-timeout-ms.<varname>queue_name</varname></ph
   pre-allocated by the query.
 </p>
 -->
-
-      <p>
-        In <cmdname>impala-shell</cmdname>, you can also specify which resource pool to direct queries to by
-        setting the <codeph>REQUEST_POOL</codeph> query option.
-      </p>
-
-      <p>
-        The statements affected by the admission control feature are primarily queries, but also include statements
-        that write data such as <codeph>INSERT</codeph> and <codeph>CREATE TABLE AS SELECT</codeph>. Most write
-        operations in Impala are not resource-intensive, but inserting into a Parquet table can require substantial
-        memory due to buffering intermediate data before writing out each Parquet data block. See
-        <xref href="impala_parquet.xml#parquet_etl"/> for instructions about inserting data efficiently into
-        Parquet tables.
-      </p>
-
-      <p>
-        Although admission control does not scrutinize memory usage for other kinds of DDL statements, if a query
-        is queued due to a limit on concurrent queries or memory usage, subsequent statements in the same session
-        are also queued so that they are processed in the correct order:
-      </p>
-
-<codeblock>-- This query could be queued to avoid out-of-memory at times of heavy load.
-select * from huge_table join enormous_table using (id);
--- If so, this subsequent statement in the same session is also queued
--- until the previous statement completes.
-drop table huge_table;
-</codeblock>
-
-      <p>
-        If you set up different resource pools for different users and groups, consider reusing any classifications
-        you developed for use with Sentry security. See <xref href="impala_authorization.xml#authorization"/> for details.
-      </p>
-
-      <p>
-        For details about all the Fair Scheduler configuration settings, see
-        <xref keyref="FairScheduler">Fair Scheduler Configuration</xref>, in particular the tags such as <codeph>&lt;queue&gt;</codeph> and
-        <codeph>&lt;aclSubmitApps&gt;</codeph> to map users and groups to particular resource pools (queues).
-      </p>
-
-<!-- Wait a sec. We say admission control doesn't use RESERVATION_REQUEST_TIMEOUT at all.
-     What's the real story here? Matt did refer to some timeout option that was
-     available through the shell but not the DB-centric APIs.
-<p>
-  Because you cannot override query options such as
-  <codeph>RESERVATION_REQUEST_TIMEOUT</codeph>
-  in a JDBC or ODBC application, consider configuring timeout periods
-  on the application side to cancel queries that take
-  too long due to being queued during times of high load.
-</p>
--->
-    </conbody>
-  </concept>
+      <p>
+        In <cmdname>impala-shell</cmdname>, you can also specify which
+        resource pool to direct queries to by setting the
+        <codeph>REQUEST_POOL</codeph> query option.
+      </p>
+      <p>
+        If you set up different resource pools for different users and
+        groups, consider reusing any classifications you developed for use
+        with Sentry security. See <xref
+          href="impala_authorization.xml#authorization"/> for details.
+      </p>
+      <p>
+        For details about all the Fair Scheduler configuration settings, see
+        <xref keyref="FairScheduler">Fair Scheduler Configuration</xref>, in
+        particular the tags such as <codeph>&lt;queue&gt;</codeph> and
+        <codeph>&lt;aclSubmitApps&gt;</codeph> to map users and groups to
+        particular resource pools (queues).
+      </p>
+    </conbody>
+  </concept>
 </concept>
 </concept>
-