This is an automated email from the ASF dual-hosted git repository. tmarshall pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/impala.git
commit c4874d9a94d6e2fda7ead957dbb385fea3a5c0b1 Author: Alex Rodoni <[email protected]> AuthorDate: Thu Sep 26 16:14:51 2019 -0700 IMPALA-8826: [DOCS] Add docs for PLAN_ROOT_SINK and result spooling Change-Id: I78bfceb225d25078c54c1ed8f88ca250ef42dafe Reviewed-on: http://gerrit.cloudera.org:8080/14314 Reviewed-by: Sahil Takiar <[email protected]> Tested-by: Impala Public Jenkins <[email protected]> --- docs/impala.ditamap | 7 +- docs/impala_keydefs.ditamap | 3 + docs/shared/ImpalaVariables.xml | 11 +- docs/topics/impala_client.xml | 3 + docs/topics/impala_fetch_rows_timeout_ms.xml | 66 +++++++++++ docs/topics/impala_max_result_spooling_mem.xml | 59 ++++++++++ .../impala_max_spilled_result_spooling_mem.xml | 63 ++++++++++ docs/topics/impala_query_results_spooling.xml | 131 +++++++++++++++++++++ docs/topics/impala_spool_query_results.xml | 68 +++++++++++ 9 files changed, 405 insertions(+), 6 deletions(-) diff --git a/docs/impala.ditamap b/docs/impala.ditamap index 839e027..be4b76b 100644 --- a/docs/impala.ditamap +++ b/docs/impala.ditamap @@ -194,6 +194,7 @@ under the License. <topicref href="topics/impala_exec_single_node_rows_threshold.xml"/> <topicref href="topics/impala_exec_time_limit_s.xml"/> <topicref href="topics/impala_explain_level.xml"/> + <topicref href="topics/impala_fetch_rows_timeout_ms.xml"/> <topicref href="topics/impala_hbase_cache_blocks.xml"/> <topicref href="topics/impala_hbase_caching.xml"/> <topicref href="topics/impala_idle_session_timeout.xml"/> @@ -202,9 +203,11 @@ under the License. 
<topicref href="topics/impala_live_summary.xml"/> <topicref href="topics/impala_max_errors.xml"/> <topicref rev="3.1 IMPALA-6847" href="topics/impala_max_mem_estimate_for_admission.xml"/> - <topicref rev="2.10.0 IMPALA-3200" href="topics/impala_max_row_size.xml"/> <topicref rev="2.5.0" href="topics/impala_max_num_runtime_filters.xml"/> + <topicref href="topics/impala_max_result_spooling_mem.xml"/> + <topicref rev="2.10.0 IMPALA-3200" href="topics/impala_max_row_size.xml"/> <topicref href="topics/impala_max_scan_range_length.xml"/> + <topicref href="topics/impala_max_spilled_result_spooling_mem.xml"/> <topicref href="topics/impala_mem_limit.xml"/> <topicref rev="2.10.0 IMPALA-3200" href="topics/impala_min_spillable_buffer_size.xml"/> <topicref rev="2.8.0" href="topics/impala_mt_dop.xml"/> @@ -240,6 +243,7 @@ under the License. <!-- This option is for internal use only and might go away without ever being documented. --> <!-- <topicref href="topics/impala_seq_compression_mode.xml"/> --> <topicref href="topics/impala_shuffle_distinct_exprs.xml"/> + <topicref href="topics/impala_spool_query_results.xml"/> <topicref href="topics/impala_support_start_over.xml"/> <topicref href="topics/impala_sync_ddl.xml"/> <topicref href="topics/impala_thread_reservation_aggregate_limit.xml"/> @@ -329,6 +333,7 @@ under the License. </topicref> <topicref href="topics/impala_odbc.xml"/> <topicref href="topics/impala_jdbc.xml"/> + <topicref href="topics/impala_query_results_spooling.xml"/> </topicref> <topicref href="topics/impala_troubleshooting.xml"> <topicref href="topics/impala_webui.xml"/> diff --git a/docs/impala_keydefs.ditamap b/docs/impala_keydefs.ditamap index f72dbce..c71bdf2 100644 --- a/docs/impala_keydefs.ditamap +++ b/docs/impala_keydefs.ditamap @@ -10521,6 +10521,7 @@ under the License. 
<keydef href="https://issues.apache.org/jira/browse/IMPALA-9999" scope="external" format="html" keys="IMPALA-9999"/> <!-- Short form of mapping from Impala release to vendor-specific releases, for use in headings. --> + <keydef keys="impala34"><topicmeta><keywords><keyword>Impala 3.4</keyword></keywords></topicmeta></keydef> <keydef keys="impala33"><topicmeta><keywords><keyword>Impala 3.3</keyword></keywords></topicmeta></keydef> <keydef keys="impala32"><topicmeta><keywords><keyword>Impala 3.2</keyword></keywords></topicmeta></keydef> <keydef keys="impala31"><topicmeta><keywords><keyword>Impala 3.1</keyword></keywords></topicmeta></keydef> @@ -10585,6 +10586,7 @@ under the License. <keydef keys="impala132"><topicmeta><keywords><keyword>Impala 1.3.2</keyword></keywords></topicmeta></keydef> <keydef keys="impala130"><topicmeta><keywords><keyword>Impala 1.3.0</keyword></keywords></topicmeta></keydef> + <keydef keys="impala34_full"><topicmeta><keywords><keyword>Impala 3.4</keyword></keywords></topicmeta></keydef> <keydef keys="impala33_full"><topicmeta><keywords><keyword>Impala 3.3</keyword></keywords></topicmeta></keydef> <keydef keys="impala32_full"><topicmeta><keywords><keyword>Impala 3.2</keyword></keywords></topicmeta></keydef> <keydef keys="impala31_full"><topicmeta><keywords><keyword>Impala 3.1</keyword></keywords></topicmeta></keydef> @@ -10607,6 +10609,7 @@ under the License. 
<keydef keys="impala13_full"><topicmeta><keywords><keyword>Impala 1.3</keyword></keywords></topicmeta></keydef> <!-- Pointers to changelog pages --> + <keydef keys="changelog_34" href="https://impala.apache.org/docs/changelog-3.4.html" scope="external" format="html"/> <keydef keys="changelog_33" href="https://impala.apache.org/docs/changelog-3.3.html" scope="external" format="html"/> <keydef keys="changelog_32" href="https://impala.apache.org/docs/changelog-3.2.html" scope="external" format="html"/> <keydef keys="changelog_31" href="https://impala.apache.org/docs/changelog-3.1.html" scope="external" format="html"/> diff --git a/docs/shared/ImpalaVariables.xml b/docs/shared/ImpalaVariables.xml index 21c9ae1..5caf289 100644 --- a/docs/shared/ImpalaVariables.xml +++ b/docs/shared/ImpalaVariables.xml @@ -25,13 +25,13 @@ under the License. <prodinfo audience="PDF" id="prodinfo_for_html"> <prodname>Impala</prodname> <vrmlist> - <vrm version="Impala 3.3.x"/> + <vrm version="Impala 3.4.x"/> </vrmlist> </prodinfo> <prodinfo audience="HTML" id="prodinfo_for_pdf"> <prodname></prodname> <vrmlist> - <vrm version="Impala 3.3.x"/> + <vrm version="Impala 3.4.x"/> </vrmlist> </prodinfo> </metadata> @@ -42,6 +42,7 @@ under the License. The docs included with a distro can refer to the distro release number by editing the values here. <ul> + <li><ph id="impala34">Impala 3.4</ph></li> <li><ph id="impala33">Impala 3.3</ph></li> <li><ph id="impala32">Impala 3.2</ph></li> <li><ph id="impala31">Impala 3.1</ph></li> @@ -59,11 +60,11 @@ under the License. 
<li><ph id="impala13">Impala 1.3</ph></li> </ul> </p> - <p>Release Version Variable - <ph id="ReleaseVersion">Impala 3.3.x</ph></p> + <p>Release Version Variable - <ph id="ReleaseVersion">Impala 3.4.x</ph></p> <p>Banner for examples showing shell version -<ph id="ShellBanner">(Shell - build version: Impala Shell v3.3.x (<varname>hash</varname>) built on + build version: Impala Shell v3.4.x (<varname>hash</varname>) built on <varname>date</varname>)</ph></p> - <p>Banner for examples showing impalad version -<ph id="ImpaladBanner">Server version: impalad version 3.3.x (build + <p>Banner for examples showing impalad version -<ph id="ImpaladBanner">Server version: impalad version 3.4.x (build x.y.z)</ph></p> <data name="version-message" id="version-message"> <foreign> diff --git a/docs/topics/impala_client.xml b/docs/topics/impala_client.xml index a58b3e0..9d9b29d 100644 --- a/docs/topics/impala_client.xml +++ b/docs/topics/impala_client.xml @@ -21,6 +21,9 @@ under the License. <concept id="intro_client"> <title>Impala Client Access</title> + <titlealts audience="PDF"> + <navtitle>Client Access</navtitle> + </titlealts> <conbody> diff --git a/docs/topics/impala_fetch_rows_timeout_ms.xml b/docs/topics/impala_fetch_rows_timeout_ms.xml new file mode 100644 index 0000000..0da8e5da --- /dev/null +++ b/docs/topics/impala_fetch_rows_timeout_ms.xml @@ -0,0 +1,66 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. 
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> +<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> +<concept id="FETCH_ROWS_TIMEOUT_MS"> + <title>FETCH_ROWS_TIMEOUT_MS Query Option</title> + <titlealts audience="PDF"> + <navtitle>FETCH_ROWS_TIMEOUT_MS</navtitle> + </titlealts> + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="Impala Query Options"/> + <data name="Category" value="Querying"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> + </metadata> + </prolog> + <conbody> + <p>Use the <codeph>FETCH_ROWS_TIMEOUT_MS</codeph> query option to control + how long Impala waits for query results when clients fetch rows.</p> + <p> When this query option is set to <codeph>0</codeph>, fetch requests wait + indefinitely.</p> + <p>The timeout applies whether query result spooling is enabled or disabled:<ul> + <li>When result spooling is disabled (<codeph>SPOOL_QUERY_RESULTS = + FALSE</codeph>), the timeout controls how long a client waits for a + single row batch to be produced by the coordinator. </li> + <li>When result spooling is enabled (<codeph>SPOOL_QUERY_RESULTS = + TRUE</codeph>), a client can fetch multiple row batches at a time, + so this timeout controls the total time a client waits for row batches + to be produced.</li> + </ul></p> + <p>The timeout also applies to fetch requests issued against queries in the + 'RUNNING' state. A 'RUNNING' query has no rows available, so any fetch + request waits until the query transitions to the 'FINISHED' state and + then until all requested rows have been fetched.
A query in the 'FINISHED' state means + that the rows are available to be fetched.</p> + <p><b>Type:</b> + <codeph>INT</codeph></p> + <p><b>Default:</b> + <codeph>10000</codeph> (10 seconds)</p> + <p><b>Added in:</b> + <keyword keyref="impala34"/></p> + <p><b>Related information:</b> + <xref href="impala_max_result_spooling_mem.xml#MAX_RESULT_SPOOLING_MEM"/>, + <xref + href="impala_max_spilled_result_spooling_mem.xml#MAX_SPILLED_RESULT_SPOOLING_MEM" + />, <xref href="impala_spool_query_results.xml#SPOOL_QUERY_RESULTS"/></p> + </conbody> +</concept> diff --git a/docs/topics/impala_max_result_spooling_mem.xml b/docs/topics/impala_max_result_spooling_mem.xml new file mode 100644 index 0000000..4479047 --- /dev/null +++ b/docs/topics/impala_max_result_spooling_mem.xml @@ -0,0 +1,59 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. 
+--> +<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> +<concept id="MAX_RESULT_SPOOLING_MEM" rev="2.10.0 IMPALA-3200"> + <title>MAX_RESULT_SPOOLING_MEM Query Option</title> + <titlealts audience="PDF"> + <navtitle>MAX_RESULT_SPOOLING_MEM</navtitle> + </titlealts> + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="Impala Query Options"/> + <data name="Category" value="Querying"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> + </metadata> + </prolog> + <conbody> + <p>Use the <codeph>MAX_RESULT_SPOOLING_MEM</codeph> query option to set the + maximum amount of memory used when spooling query results. </p> + <p>If the amount of memory exceeds this value when spooling query results, + all memory will most likely be spilled to disk. </p> + <p>The <codeph>MAX_RESULT_SPOOLING_MEM</codeph> query option is applicable + only when query result spooling is enabled with the + <codeph>SPOOL_QUERY_RESULTS</codeph> query option set to + <codeph>TRUE</codeph>.</p> + <p>Setting the option to <codeph>0</codeph> or <codeph>-1</codeph> means the + memory is unbounded. 
</p> + <p>You cannot set this query option to values below <codeph>-1</codeph>.</p> + <p><b>Type:</b> + <codeph>INT</codeph></p> + <p><b>Default:</b> + <codeph> 100 * 1024 * 1024 (100 MB)</codeph></p> + <p><b>Added in:</b> + <keyword keyref="impala34"/></p> + <p><b>Related information:</b> + <xref href="impala_fetch_rows_timeout_ms.xml#FETCH_ROWS_TIMEOUT_MS"/>, + <xref + href="impala_max_spilled_result_spooling_mem.xml#MAX_SPILLED_RESULT_SPOOLING_MEM" + />, <xref href="impala_spool_query_results.xml#SPOOL_QUERY_RESULTS"/></p> + </conbody> +</concept> diff --git a/docs/topics/impala_max_spilled_result_spooling_mem.xml b/docs/topics/impala_max_spilled_result_spooling_mem.xml new file mode 100644 index 0000000..55fc36a --- /dev/null +++ b/docs/topics/impala_max_spilled_result_spooling_mem.xml @@ -0,0 +1,63 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. 
+--> +<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> +<concept id="MAX_SPILLED_RESULT_SPOOLING_MEM" rev="2.10.0 IMPALA-3200"> + <title>MAX_SPILLED_RESULT_SPOOLING_MEM Query Option</title> + <titlealts audience="PDF"> + <navtitle>MAX_SPILLED_RESULT_SPOOLING_MEM</navtitle> + </titlealts> + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="Impala Query Options"/> + <data name="Category" value="Querying"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> + </metadata> + </prolog> + <conbody> + <p>Use the <codeph>MAX_SPILLED_RESULT_SPOOLING_MEM</codeph> query option to + set the maximum amount of memory that can be spilled when spooling query + results. </p> + <p>If the amount of memory exceeds this value when spooling query results, + the coordinator fragment will block until the client has consumed enough + rows to free up more memory.</p> + <p>The <codeph>MAX_SPILLED_RESULT_SPOOLING_MEM</codeph> query option is + applicable only when query result spooling is enabled with the + <codeph>SPOOL_QUERY_RESULTS</codeph> query option set to + <codeph>TRUE</codeph>. </p> + <p>The value must be greater than or equal to the value of + <codeph>MAX_RESULT_SPOOLING_MEM</codeph>.</p> + <p>Setting the option to <codeph>0</codeph> or <codeph>-1</codeph> means the + memory is unbounded. 
</p> + <p>Values below <codeph>-1</codeph> are not allowed for this query + option.</p> + <p><b>Type:</b> + <codeph>INT</codeph></p> + <p><b>Default:</b><codeph> 1024 * 1024 * 1024 (1 GB)</codeph></p> + <p><b>Added in:</b> + <keyword keyref="impala34"/></p> + <p><b>Related information:</b> + <xref href="impala_fetch_rows_timeout_ms.xml#FETCH_ROWS_TIMEOUT_MS"/>, + <xref + href="impala_max_result_spooling_mem.xml#MAX_RESULT_SPOOLING_MEM" + />, <xref href="impala_spool_query_results.xml#SPOOL_QUERY_RESULTS"/></p> + </conbody> +</concept> diff --git a/docs/topics/impala_query_results_spooling.xml b/docs/topics/impala_query_results_spooling.xml new file mode 100644 index 0000000..a563844 --- /dev/null +++ b/docs/topics/impala_query_results_spooling.xml @@ -0,0 +1,131 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> +<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> +<concept id="data_sink"> + <title>Spooling Impala Query Results</title> + <conbody> + <p>In Impala, you can control how query results are materialized and + returned to clients, e.g.
impala-shell, Hue, JDBC apps.</p> + <ul> + <li>When query result spooling is disabled, Impala relies on clients to + fetch results to trigger the generation of more result row batches until + all the result rows have been produced. If a client issues a query + without fetching all the results, the query fragments continue to + consume the resources until the query is cancelled and unregistered, + potentially tying up resources and causing other queries to wait for an + extended period of time in admission control.<p>In this mode, Impala + materializes rows on demand: a row is created only when the client + requests it.</p></li> + <li>When query result spooling is enabled, result sets of queries are + eagerly fetched and spooled in the spooling location, either in memory + or on disk. <p>Once all result rows have been fetched and stored in the + spooling location, the resources are freed up. Incoming client fetches + can get the data from the spooled results.</p></li> + </ul> + <p>Result spooling is turned off by default, but can be enabled via the + <codeph>SPOOL_QUERY_RESULTS</codeph> query option.</p> + <section id="section_av4_hsy_2jb"> + <title>Admission Control and Result Spooling</title> + <p>Query result spooling collects and stores query results in memory that + is controlled by admission control. Use the following query options to + calibrate how much memory to use and when to spill to disk.<dl> + <dlentry> + <dt>MAX_RESULT_SPOOLING_MEM</dt> + <dd> + <p>The maximum amount of memory used when spooling query results. + If this value is exceeded when spooling results, all memory will + most likely be spilled to disk. Set to 100 MB by default. </p> + </dd> + </dlentry> + <dlentry> + <dt>MAX_SPILLED_RESULT_SPOOLING_MEM</dt> + <dd> + <p>The maximum amount of memory that can be spilled to disk when + spooling query results. Must be greater than or equal to + <codeph>MAX_RESULT_SPOOLING_MEM</codeph>.
If this value is + exceeded, the coordinator fragment will block until the client + has consumed enough rows to free up more memory. Set to 1 GB by + default.</p> + </dd> + </dlentry> + </dl></p> + </section> + <section id="section_oh2_fsy_2jb"> + <title>Fetch Timeout</title> + <p>Resources for a query are released when the query completes its + execution. To prevent clients from indefinitely waiting for query + results, use the <codeph>FETCH_ROWS_TIMEOUT_MS</codeph> query option to + set the timeout when clients fetch rows. The timeout applies whether query + result spooling is enabled or disabled:<ul> + <li>When result spooling is disabled (<codeph>SPOOL_QUERY_RESULTS = + FALSE</codeph>), the timeout controls how long a client waits for + a single row batch to be produced by the coordinator. </li> + <li>When result spooling is enabled (<codeph>SPOOL_QUERY_RESULTS = + TRUE</codeph>), a client can fetch multiple row batches at a time, + so this timeout controls the total time a client waits for row + batches to be produced.</li> + </ul></p> + </section> + <section id="section_ahm_bsy_2jb"> + <title>Explain Plans</title> + <p>Below is the part of the <codeph>EXPLAIN</codeph> plan output that + relates to result spooling.<codeblock>F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 +| Per-Host Resources: mem-estimate=4.02MB mem-reservation=4.00MB thread-reservation=1 +PLAN-ROOT SINK +| mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0</codeblock><ul> + <li>The <codeph>mem-estimate</codeph> for the <codeph>PLAN-ROOT + SINK</codeph> is an estimate of the amount of memory needed to + spool all the rows returned by the query.</li> + <li>The <codeph>mem-reservation</codeph> is the number and size of the + buffers necessary to spool the query results.
By default, the read + and write buffers are 2 MB in size each, which is why the default + reservation is 4 MB.</li> + </ul></p> + </section> + <section id="section_ovl_ksy_2jb"> + <title>PlanRootSink</title> + <p dir="ltr">In Impala, the <codeph>PlanRootSink</codeph> class controls + the passing of batches of rows to the clients and acts as a queue of + rows to be sent to clients.</p> + <p> + <ul> + <li> + <p>When result spooling is disabled, a single batch of rows is sent + to the <codeph>PlanRootSink</codeph>, and then the client must + consume that batch before another one can be sent.</p> + </li> + <li> + <p>When result spooling is enabled, multiple batches of rows can be + sent to the <codeph>PlanRootSink</codeph>, and multiple batches + can be consumed by the client.</p> + </li> + </ul> + </p> + </section> + <section> + <p><b>Related information:</b> + <xref href="impala_max_result_spooling_mem.xml#MAX_RESULT_SPOOLING_MEM" + />, <xref + href="impala_max_spilled_result_spooling_mem.xml#MAX_SPILLED_RESULT_SPOOLING_MEM" + />, <xref href="impala_spool_query_results.xml#SPOOL_QUERY_RESULTS" + /></p> + </section> + </conbody> +</concept> diff --git a/docs/topics/impala_spool_query_results.xml b/docs/topics/impala_spool_query_results.xml new file mode 100644 index 0000000..beea3fb --- /dev/null +++ b/docs/topics/impala_spool_query_results.xml @@ -0,0 +1,68 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License.
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> +<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> +<concept id="SPOOL_QUERY_RESULTS" rev="2.10.0 IMPALA-3200"> + <title>SPOOL_QUERY_RESULTS Query Option</title> + <titlealts audience="PDF"> + <navtitle>SPOOL_QUERY_RESULTS</navtitle> + </titlealts> + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="Impala Query Options"/> + <data name="Category" value="Querying"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> + </metadata> + </prolog> + <conbody> + <p>Use the <codeph>SPOOL_QUERY_RESULTS</codeph> query option to enable query + result spooling, which is disabled by default.</p> + <p>Query result spooling controls how rows are returned to the client. <ul> + <li>When query result spooling is disabled (<codeph>SPOOL_QUERY_RESULTS + = FALSE</codeph>), Impala relies on clients to fetch results to + trigger the generation of more result row batches until all the result + rows have been produced. If a client issues a query without fetching + all the results, the query fragments will continue to consume the + resources until the query is cancelled and unregistered, potentially + tying up resources and causing other queries to wait for an extended + period of time in admission control.</li> + <li>When query result spooling is enabled (<codeph>SPOOL_QUERY_RESULTS = + TRUE</codeph>), the result sets of queries are eagerly fetched and + spooled, either in memory or on disk.
<p>Once all result rows have + been fetched and stored in the spooling location, the resources are + freed up. Incoming client fetches can get the data from the spooled + results.</p></li> + </ul></p> + <p><b>Type:</b> + <codeph>BOOLEAN</codeph></p> + <p><b>Default:</b> + <codeph>FALSE</codeph></p> + <p><b>Added in:</b> + <keyword keyref="impala34"/></p> + <p><b>Related information:</b> + <xref + href="impala_default_spillable_buffer_size.xml#default_spillable_buffer_size" + />, <xref + href="impala_max_spilled_result_spooling_mem.xml#MAX_SPILLED_RESULT_SPOOLING_MEM" + />, <xref + href="impala_max_result_spooling_mem.xml#MAX_RESULT_SPOOLING_MEM"/></p> + </conbody> +</concept>
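The rules stated in the new query option topics can be summarized in a short sketch. This is a hypothetical Python model for illustration only, not Impala code: the class SpoolingOptions, its field names, and the helper is_unbounded are invented here. Only the defaults (100 MB, 1 GB, 10000 ms, spooling off) and the constraints (values below -1 rejected, 0 or -1 meaning unbounded, spilled limit at least the in-memory limit, timeout 0 meaning wait indefinitely) come from the topics in this commit. The topics do not say how the two memory limits are compared when one of them is unbounded, so the sketch assumes the comparison is skipped in that case.

```python
from dataclasses import dataclass

MB = 1024 * 1024
GB = 1024 * MB


def is_unbounded(limit: int) -> bool:
    """Per the option topics, 0 and -1 both mean the memory is unbounded."""
    return limit in (0, -1)


@dataclass(frozen=True)
class SpoolingOptions:
    # Hypothetical container; fields mirror the documented query options.
    spool_query_results: bool = False            # spooling is off by default
    max_result_spooling_mem: int = 100 * MB      # documented default: 100 MB
    max_spilled_result_spooling_mem: int = GB    # documented default: 1 GB
    fetch_rows_timeout_ms: int = 10000           # documented default: 10 seconds

    def validate(self) -> None:
        """Raise ValueError if a documented constraint is violated."""
        for name in ("max_result_spooling_mem",
                     "max_spilled_result_spooling_mem"):
            if getattr(self, name) < -1:
                raise ValueError(f"{name.upper()} cannot be below -1")
        mem = self.max_result_spooling_mem
        spilled = self.max_spilled_result_spooling_mem
        # Assumption: the ">=" rule is only checked when both limits are bounded.
        if not is_unbounded(mem) and not is_unbounded(spilled) and spilled < mem:
            raise ValueError(
                "MAX_SPILLED_RESULT_SPOOLING_MEM must be greater than or "
                "equal to MAX_RESULT_SPOOLING_MEM")

    def fetch_waits_indefinitely(self) -> bool:
        """A timeout of 0 means fetch requests wait indefinitely."""
        return self.fetch_rows_timeout_ms == 0
```

For example, SpoolingOptions(spool_query_results=True).validate() passes with the documented defaults, while SpoolingOptions(max_result_spooling_mem=2 * GB).validate() raises ValueError because the 1 GB spilled limit is then smaller than the in-memory limit.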
