Repository: impala
Updated Branches:
  refs/heads/master 77d07f806 -> 5f2f445e7
IMPALA-6553: [DOCS] load_catalog_in_background default change

Change-Id: I548b2d1532c12f8d3c795a940b7f980482ecf09b
Reviewed-on: http://gerrit.cloudera.org:8080/9389
Reviewed-by: John Russell <jruss...@cloudera.com>
Tested-by: Impala Public Jenkins

Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/3a1d802e
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/3a1d802e
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/3a1d802e

Branch: refs/heads/master
Commit: 3a1d802eada1b4ab77f5fe1a95fc6e8d2a02d0f4
Parents: 77d07f8
Author: Alex Rodoni <arod...@cloudera.com>
Authored: Wed Feb 21 17:20:28 2018 -0800
Committer: Impala Public Jenkins <impala-public-jenk...@gerrit.cloudera.org>
Committed: Tue Mar 6 00:48:00 2018 +0000

----------------------------------------------------------------------
 docs/shared/impala_common.xml              | 38 +++++++++++++++++++++----
 docs/topics/impala_invalidate_metadata.xml | 10 +++----
 2 files changed, 38 insertions(+), 10 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/impala/blob/3a1d802e/docs/shared/impala_common.xml
----------------------------------------------------------------------
diff --git a/docs/shared/impala_common.xml b/docs/shared/impala_common.xml
index 03892eb..4c5e57c 100644
--- a/docs/shared/impala_common.xml
+++ b/docs/shared/impala_common.xml
@@ -3443,10 +3443,39 @@ select * from header_line limit 10;
       </p>
 
       <p id="load_catalog_in_background">
-        By default, the metadata loading and caching on startup happens asynchronously, so Impala can begin
-        accepting requests promptly. To enable the original behavior, where Impala waited until all metadata was
-        loaded before accepting any requests, set the <cmdname>catalogd</cmdname> configuration option
-        <codeph>--load_catalog_in_background=false</codeph>.
+        Use <codeph>--load_catalog_in_background</codeph> option to control when
+        the metadata of a table is loaded.
+        <ul>
+          <li>
+            If set to <codeph>false</codeph>, the metadata of a table is
+            loaded when it is referenced for the first time. This means that the
+            first run of a particular query can be slower than subsequent runs.
+            Starting in Impala 2.2, the default for
+            <codeph>load_catalog_in_background</codeph> is
+            <codeph>false</codeph>.
+          </li>
+          <li>
+            If set to <codeph>true</codeph>, the catalog service attempts to
+            load metadata for a table even if no query needed that metadata. So
+            metadata will possibly be already loaded when the first query that
+            would need it is run. However, for the following reasons, we
+            recommend not to set the option to <codeph>true</codeph>.
+            <ul>
+              <li>
+                Background load can interfere with query-specific metadata
+                loading. This can happen on startup or after invalidating
+                metadata, with a duration depending on the amount of metadata,
+                and can lead to a seemingly random long running queries that are
+                difficult to diagnose.
+              </li>
+              <li>
+                Impala may load metadata for tables that are possibly never
+                used, potentially increasing catalog size and consequently memory
+                usage for both catalog service and Impala Daemon.
+              </li>
+            </ul>
+          </li>
+        </ul>
       </p>
 
       <ul id="catalogd_xrefs">
@@ -3458,7 +3487,6 @@ select * from header_line limit 10;
             <cmdname>catalogd</cmdname> daemon.
           </p>
         </li>
-
         <li>
           <p>
             The <codeph>REFRESH</codeph> and <codeph>INVALIDATE METADATA</codeph> statements are no longer needed

http://git-wip-us.apache.org/repos/asf/impala/blob/3a1d802e/docs/topics/impala_invalidate_metadata.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_invalidate_metadata.xml b/docs/topics/impala_invalidate_metadata.xml
index 4f63d34..ddd79d5 100644
--- a/docs/topics/impala_invalidate_metadata.xml
+++ b/docs/topics/impala_invalidate_metadata.xml
@@ -192,11 +192,11 @@ under the License.
       By default, the <codeph>INVALIDATE METADATA</codeph> command checks HDFS permissions of the underlying data
       files and directories, caching this information so that a statement can be cancelled immediately if for
       example the <codeph>impala</codeph> user does not have permission to write to the data directory for the
-      table. (This checking does not apply if you have set the <cmdname>catalogd</cmdname> configuration option
-      <codeph>--load_catalog_in_background=false</codeph>.) Impala reports any lack of write permissions as an
-      <codeph>INFO</codeph> message in the log file, in case that represents an oversight. If you change HDFS
-      permissions to make data readable or writeable by the Impala user, issue another <codeph>INVALIDATE
-      METADATA</codeph> to make Impala aware of the change.
+      table. (This checking does not apply when the <cmdname>catalogd</cmdname> configuration option
+      <codeph>--load_catalog_in_background</codeph> is set to <codeph>false</codeph>, which it is by default.)
+      Impala reports any lack of write permissions as an <codeph>INFO</codeph> message in the log file, in case
+      that represents an oversight. If you change HDFS permissions to make data readable or writeable by the Impala
+      user, issue another <codeph>INVALIDATE METADATA</codeph> to make Impala aware of the change.
     </p>
 
     <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>