IMPALA-7789: [DOCS] Admission status in Impala Shell

Change-Id: I17d788eb716c6a2f7a144ee2d81bbe823f74d16a
Reviewed-on: http://gerrit.cloudera.org:8080/11895
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Tim Armstrong <[email protected]>
Reviewed-by: Bikramjeet Vig <[email protected]>

Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/cb312029
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/cb312029
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/cb312029

Branch: refs/heads/branch-3.1.0
Commit: cb3120295b5ad2fdcec28d99c1528d4f709f5076
Parents: 09dc763
Author: Alex Rodoni <[email protected]>
Authored: Tue Nov 6 17:24:31 2018 -0800
Committer: Zoltan Borok-Nagy <[email protected]>
Committed: Tue Nov 13 12:51:39 2018 +0100

----------------------------------------------------------------------
 docs/topics/impala_admission.xml     |  72 +++++++++----------
 docs/topics/impala_live_progress.xml |  26 +++----
 docs/topics/impala_live_summary.xml  | 115 +++++++++++++-----------------
 3 files changed, 100 insertions(+), 113 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/impala/blob/cb312029/docs/topics/impala_admission.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_admission.xml b/docs/topics/impala_admission.xml
index 8b114eb..1dc1512 100644
--- a/docs/topics/impala_admission.xml
+++ b/docs/topics/impala_admission.xml
@@ -783,42 +783,15 @@ impala.admission-control.pool-queue-timeout-ms.<varname>queue_name</varname></ph
 </metadata>
 </prolog>
 <conbody>
- <p>
- To see how admission control works for particular queries, examine
- the profile output for the query. This information is available
- through the <codeph>PROFILE</codeph> statement in
- <cmdname>impala-shell</cmdname> immediately after running a query in
- the shell, on the <uicontrol>queries</uicontrol> page of the Impala
- debug web UI, or in the Impala log file (basic information at log
- level 1, more detailed information at log level 2).
The profile output
- contains details about the admission decision, such as whether the
- query was queued or not and which resource pool it was assigned to. It
- also includes the estimated and actual memory usage for the query, so
- you can fine-tune the configuration for the memory limits of the
- resource pools.
- </p>
- <p>
- Remember that the limits imposed by admission control are
- <q>soft</q> limits. The decentralized nature of this mechanism means
- that each Impala node makes its own decisions about whether to allow
- queries to run immediately or to queue them. These decisions rely on
- information passed back and forth between nodes by the statestore
- service. If a sudden surge in requests causes more queries than
- anticipated to run concurrently, then throughput could decrease due to
- queries spilling to disk or contending for resources; or queries could
- be cancelled if they exceed the <codeph>MEM_LIMIT</codeph> setting
- while running.
- </p>
- <!--
- <p>
- If you have trouble getting a query to run because its estimated memory usage is too high, you can override
- the estimate by setting the <codeph>MEM_LIMIT</codeph> query option in <cmdname>impala-shell</cmdname>,
- then issuing the query through the shell in the same session. The <codeph>MEM_LIMIT</codeph> value is
- treated as the estimated amount of memory, overriding the estimate that Impala would generate based on
- table and column statistics. This value is used only for making admission control decisions, and is not
- pre-allocated by the query.
- </p>
--->
+ <p> The limits imposed by admission control are de-centrally managed
+ <q>soft</q> limits. Each Impala coordinator node makes its own
+ decisions about whether to allow queries to run immediately or to
+ queue them. These decisions rely on information passed back and forth
+ between nodes by the StateStore service.
If a sudden surge in requests
+ causes more queries than anticipated to run concurrently, then the
+ throughput could decrease due to queries spilling to disk or
+ contending for resources. Or queries could be cancelled if they exceed
+ the <codeph>MEM_LIMIT</codeph> setting while running. </p>
 <p>
 In <cmdname>impala-shell</cmdname>, you can also specify which
 resource pool to direct queries to by setting the
@@ -830,6 +803,33 @@ impala.admission-control.pool-queue-timeout-ms.<varname>queue_name</varname></ph
 with Sentry security. See <xref
 href="impala_authorization.xml#authorization"/> for details.
 </p>
+ <p> To see how admission control works for particular queries, examine
+ the profile output or the summary output for the query. <ul>
+ <li>Profile<p>The information is available through the
+ <codeph>PROFILE</codeph> statement in
+ <cmdname>impala-shell</cmdname> immediately after running a
+ query in the shell, on the <uicontrol>queries</uicontrol> page
+ of the Impala debug web UI, or in the Impala log file (basic
+ information at log level 1, more detailed information at log
+ level 2). </p><p>The profile output contains details about the
+ admission decision, such as whether the query was queued or not
+ and which resource pool it was assigned to. It also includes the
+ estimated and actual memory usage for the query, so you can
+ fine-tune the configuration for the memory limits of the
+ resource pools.
</p></li>
+ <li>Summary<p>Starting in <keyword keyref="impala31"/>, the
+ information is available in <cmdname>impala-shell</cmdname> when
+ the <codeph>LIVE_PROGRESS</codeph> or
+ <codeph>LIVE_SUMMARY</codeph> query option is set to
+ <codeph>TRUE</codeph>.</p><p>You can also start an
+ <codeph>impala-shell</codeph> session with the
+ <codeph>--live_progress</codeph> or
+ <codeph>--live_summary</codeph> flags to monitor all queries
+ in that <codeph>impala-shell</codeph> session.</p><p>The summary
+ output includes the queuing status consisting of whether the
+ query was queued and what was the latest queuing
+ reason.</p></li>
+ </ul></p>
 <p>
 For details about all the Fair Scheduler configuration settings, see
 <xref keyref="FairScheduler">Fair Scheduler Configuration</xref>, in

http://git-wip-us.apache.org/repos/asf/impala/blob/cb312029/docs/topics/impala_live_progress.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_live_progress.xml b/docs/topics/impala_live_progress.xml
index 0c91824..63297aa 100644
--- a/docs/topics/impala_live_progress.xml
+++ b/docs/topics/impala_live_progress.xml
@@ -37,19 +37,19 @@ under the License.
 <conbody>
- <p rev="2.3.0">
- <indexterm audience="hidden">LIVE_PROGRESS query option</indexterm>
- For queries submitted through the <cmdname>impala-shell</cmdname> command,
- displays an interactive progress bar showing roughly what percentage of
- processing has been completed. When the query finishes, the progress bar is erased
- from the <cmdname>impala-shell</cmdname> console output.
- </p>
-
- <p>
- </p>
-
- <p conref="../shared/impala_common.xml#common/type_boolean"/>
- <p conref="../shared/impala_common.xml#common/default_false_0"/>
+ <p rev="2.3.0"> When the <codeph>LIVE_PROGRESS</codeph> query option is set
+ to <codeph>TRUE</codeph>, Impala displays an interactive progress bar
+ showing roughly what percentage of processing has been completed for
+ queries submitted through the <cmdname>impala-shell</cmdname> command.
+ When the query finishes, the progress bar is erased from the
+ <cmdname>impala-shell</cmdname> console output. </p>
+ <p>Starting in <keyword keyref="impala31"/>, the summary output also
+ includes the queuing status consisting of whether the query was queued and
+ what was the latest queuing reason.</p>
+ <p><b>Type:</b>
+ <codeph>Boolean</codeph></p>
+ <p><b>Default:</b>
+ <codeph>FALSE (0)</codeph></p>
 <p conref="../shared/impala_common.xml#common/command_line_blurb"/>
 <p>

http://git-wip-us.apache.org/repos/asf/impala/blob/cb312029/docs/topics/impala_live_summary.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_live_summary.xml b/docs/topics/impala_live_summary.xml
index 94733d2..10ecae3 100644
--- a/docs/topics/impala_live_summary.xml
+++ b/docs/topics/impala_live_summary.xml
@@ -36,71 +36,59 @@ under the License.
 </prolog>
 <conbody>
-
- <p rev="2.3.0">
- <indexterm audience="hidden">LIVE_SUMMARY query option</indexterm>
- For queries submitted through the <cmdname>impala-shell</cmdname> command,
- displays the same output as the <codeph>SUMMARY</codeph> command,
- with the measurements updated in real time as the query progresses.
- When the query finishes, the final <codeph>SUMMARY</codeph> output remains
- visible in the <cmdname>impala-shell</cmdname> console output.
- </p>
-
- <p>
- </p>
-
- <p conref="../shared/impala_common.xml#common/type_boolean"/>
- <p conref="../shared/impala_common.xml#common/default_false_0"/>
-
+ <p rev="2.3.0"> When the <codeph>LIVE_SUMMARY</codeph> query option is set
+ to <codeph>TRUE</codeph>, Impala displays the same output as the
+ <codeph>SUMMARY</codeph> command for queries submitted through the
+ <cmdname>impala-shell</cmdname> command, with the measurements updated
+ in real time as the query progresses. When the query finishes, the final
+ <codeph>SUMMARY</codeph> output remains visible in the
+ <cmdname>impala-shell</cmdname> console output. </p>
+ <p>Starting in <keyword keyref="impala31"/>, the summary output also
+ includes the queuing status consisting of whether the query was queued and
+ what was the latest queuing reason.</p>
+ <p>the queuing status, whether the query was queued and what was the latest
+ queuing reason.</p>
+ <p><b>Type:</b>
+ <codeph>Boolean</codeph></p>
+ <p><b>Default:</b>
+ <codeph>FALSE (0)</codeph></p>
 <p conref="../shared/impala_common.xml#common/command_line_blurb"/>
- <p>
- You can enable this query option within <cmdname>impala-shell</cmdname>
+ <p> You can enable this query option within <cmdname>impala-shell</cmdname>
 by starting the shell with the <codeph>--live_summary</codeph>
- command-line option.
- You can still turn this setting off and on again within the shell through the
- <codeph>SET</codeph> command.
- </p>
-
+ command-line option. You can still turn this setting off and on again
+ within the shell through the <codeph>SET</codeph> command. </p>
 <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
- <p>
- The live summary output can be useful for evaluating long-running queries,
- to evaluate which phase of execution takes up the most time, or if some hosts
- take much longer than others for certain operations, dragging overall performance down.
- By making the information available in real time, this feature lets you decide what
- action to take even before you cancel a query that is taking much longer than normal.
- </p>
- <p>
- For example, you might see the HDFS scan phase taking a long time, and therefore revisit
- performance-related aspects of your schema design such as constructing a partitioned table,
- switching to the Parquet file format, running the <codeph>COMPUTE STATS</codeph> statement
- for the table, and so on.
- Or you might see a wide variation between the average and maximum times for all hosts to
- perform some phase of the query, and therefore investigate if one particular host
- needed more memory or was experiencing a network problem.
- </p>
+ <p> The live summary output can be useful for evaluating long-running
+ queries, to evaluate which phase of execution takes up the most time, or
+ if some hosts take much longer than others for certain operations,
+ dragging overall performance down. By making the information available in
+ real time, this feature lets you decide what action to take even before
+ you cancel a query that is taking much longer than normal. </p>
+ <p> For example, you might see the HDFS scan phase taking a long time, and
+ therefore revisit performance-related aspects of your schema design such
+ as constructing a partitioned table, switching to the Parquet file format,
+ running the <codeph>COMPUTE STATS</codeph> statement for the table, and so
+ on. Or you might see a wide variation between the average and maximum
+ times for all hosts to perform some phase of the query, and therefore
+ investigate if one particular host needed more memory or was experiencing
+ a network problem. </p>
 <p conref="../shared/impala_common.xml#common/live_reporting_details"/>
- <p>
- For a simple and concise way of tracking the progress of an interactive query, see
- <xref href="impala_live_progress.xml#live_progress"/>.
- </p>
-
+ <p> For a simple and concise way of tracking the progress of an interactive
+ query, see <xref href="impala_live_progress.xml#live_progress"/>. </p>
 <p conref="../shared/impala_common.xml#common/restrictions_blurb"/>
- <p conref="../shared/impala_common.xml#common/impala_shell_progress_reports_compute_stats_caveat"/>
- <p conref="../shared/impala_common.xml#common/impala_shell_progress_reports_shell_only_caveat"/>
-
+ <p
+ conref="../shared/impala_common.xml#common/impala_shell_progress_reports_compute_stats_caveat"/>
+ <p
+ conref="../shared/impala_common.xml#common/impala_shell_progress_reports_shell_only_caveat"/>
 <p conref="../shared/impala_common.xml#common/added_in_230"/>
- <p conref="../shared/impala_common.xml#common/example_blurb"/>
-
- <p>
- The following example shows a series of <codeph>LIVE_SUMMARY</codeph> reports that
- are displayed during the course of a query, showing how the numbers increase to
- show the progress of different phases of the distributed query. When you do the same
- in <cmdname>impala-shell</cmdname>, only a single report is displayed at any one time,
- with each update overwriting the previous numbers.
- </p>
-
-<codeblock><![CDATA[[localhost:21000] > set live_summary=true;
+ <p> The following example shows a series of <codeph>LIVE_SUMMARY</codeph>
+ reports that are displayed during the course of a query, showing how the
+ numbers increase to show the progress of different phases of the
+ distributed query. When you do the same in
+ <cmdname>impala-shell</cmdname>, only a single report is displayed at any
+ one time, with each update overwriting the previous numbers.
</p>
+ <codeblock><![CDATA[[localhost:21000] > set live_summary=true;
 LIVE_SUMMARY set to true
 [localhost:21000] > select count(*) from customer t1 cross join customer t2;
 +---------------------+--------+----------+----------+---------+------------+----------+---------------+-----------------------+
@@ -140,9 +128,8 @@ LIVE_SUMMARY set to true
 +---------------------+--------+----------+----------+---------+------------+----------+---------------+-----------------------+
 ]]>
 </codeblock>
-
-<!-- Keeping this sample output that illustrates a couple of glitches in the LIVE_SUMMARY display, hidden, to help filing JIRAs. -->
-<codeblock audience="hidden"><![CDATA[[
+ <!-- Keeping this sample output that illustrates a couple of glitches in the LIVE_SUMMARY display, hidden, to help filing JIRAs. -->
+ <codeblock audience="hidden"><![CDATA[[
 +---------------------+--------+----------+----------+---------+------------+----------+---------------+-----------------------+
 | Operator            | #Hosts | Avg Time | Max Time | #Rows   | Est. #Rows | Peak Mem | Est. Peak Mem | Detail                |
 +---------------------+--------+----------+----------+---------+------------+----------+---------------+-----------------------+
@@ -222,8 +209,8 @@ Query: select count(*) from customer t1 cross join customer t2
 | Operator            | #Hosts | Avg Time | Max Time | #Rows   | Est. #Rows | Peak Mem | Est. Peak Mem | Detail                |
 ]]>
 </codeblock>
-
- <p conref="../shared/impala_common.xml#common/live_progress_live_summary_asciinema"/>
-
+ <p
+ conref="../shared/impala_common.xml#common/live_progress_live_summary_asciinema"
+ />
 </conbody>
 </concept>
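
Illustration only, not part of the commit: the documentation changed above says an impala-shell session can be started with the --live_progress or --live_summary flags so that queuing status is reported for every query in that session. A minimal Python sketch of assembling such a command line follows. The helper name build_impala_shell_cmd and the default host value are hypothetical; the sketch assumes impala-shell is on PATH and uses only the -i, --live_progress, and --live_summary options.

```python
import shlex


def build_impala_shell_cmd(host="localhost:21000",
                           live_progress=False, live_summary=False):
    """Return an argv list for an interactive impala-shell session.

    Hypothetical helper for illustration; it only assembles arguments
    and does not start the shell.
    """
    cmd = ["impala-shell", "-i", host]
    if live_progress:
        # Progress bar that is erased when the query finishes.
        cmd.append("--live_progress")
    if live_summary:
        # SUMMARY output refreshed in real time while the query runs.
        cmd.append("--live_summary")
    return cmd


if __name__ == "__main__":
    # Print a copy-pasteable command line for a live-summary session.
    print(shlex.join(build_impala_shell_cmd(live_summary=True)))
```

Inside a running session, the same behavior can be toggled with the SET command, as in the "set live_summary=true;" example in the diff.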
