IMPALA-6553: [DOCS] load_catalog_in_background default change

Change-Id: I548b2d1532c12f8d3c795a940b7f980482ecf09b
Reviewed-on: http://gerrit.cloudera.org:8080/9389
Reviewed-by: John Russell <jruss...@cloudera.com>
Tested-by: Impala Public Jenkins


Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/fc1578e1
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/fc1578e1
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/fc1578e1

Branch: refs/heads/2.x
Commit: fc1578e1a3a58be78c000c725170737decb7d50e
Parents: a73b8f8
Author: Alex Rodoni <arod...@cloudera.com>
Authored: Wed Feb 21 17:20:28 2018 -0800
Committer: Impala Public Jenkins <impala-public-jenk...@gerrit.cloudera.org>
Committed: Tue Mar 6 01:10:15 2018 +0000

----------------------------------------------------------------------
 docs/shared/impala_common.xml              | 38 +++++++++++++++++++++----
 docs/topics/impala_invalidate_metadata.xml | 10 +++----
 2 files changed, 38 insertions(+), 10 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/impala/blob/fc1578e1/docs/shared/impala_common.xml
----------------------------------------------------------------------
diff --git a/docs/shared/impala_common.xml b/docs/shared/impala_common.xml
index a052879..c64d18b 100644
--- a/docs/shared/impala_common.xml
+++ b/docs/shared/impala_common.xml
@@ -3443,10 +3443,39 @@ select * from header_line limit 10;
       </p>
 
       <p id="load_catalog_in_background">
-        By default, the metadata loading and caching on startup happens asynchronously, so Impala can begin
-        accepting requests promptly. To enable the original behavior, where Impala waited until all metadata was
-        loaded before accepting any requests, set the <cmdname>catalogd</cmdname> configuration option
-        <codeph>--load_catalog_in_background=false</codeph>.
+        Use <codeph>--load_catalog_in_background</codeph> option to control when
+        the metadata of a table is loaded.
+        <ul>
+          <li>
+            If set to <codeph>false</codeph>, the metadata of a table is
+            loaded when it is referenced for the first time. This means that the
+            first run of a particular query can be slower than subsequent runs.
+            Starting in Impala 2.2, the default for
+            <codeph>load_catalog_in_background</codeph> is
+            <codeph>false</codeph>.
+          </li>
+          <li>
+            If set to <codeph>true</codeph>, the catalog service attempts to
+            load metadata for a table even if no query needed that metadata. So
+            metadata will possibly be already loaded when the first query that
+            would need it is run. However, for the following reasons, we
+            recommend not to set the option to <codeph>true</codeph>.
+            <ul>
+              <li>
+                Background load can interfere with query-specific metadata
+                loading. This can happen on startup or after invalidating
+                metadata, with a duration depending on the amount of metadata,
+                and can lead to a seemingly random long running queries that are
+                difficult to diagnose.
+              </li>
+              <li>
+                Impala may load metadata for tables that are possibly never
+                used, potentially increasing catalog size and consequently memory
+                usage for both catalog service and Impala Daemon.
+              </li>
+            </ul>
+          </li>
+        </ul>
       </p>
 
       <ul id="catalogd_xrefs">
@@ -3458,7 +3487,6 @@ select * from header_line limit 10;
             <cmdname>catalogd</cmdname> daemon.
           </p>
         </li>
-
         <li>
           <p>
            The <codeph>REFRESH</codeph> and <codeph>INVALIDATE METADATA</codeph> statements are no longer needed

http://git-wip-us.apache.org/repos/asf/impala/blob/fc1578e1/docs/topics/impala_invalidate_metadata.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_invalidate_metadata.xml b/docs/topics/impala_invalidate_metadata.xml
index 4f63d34..ddd79d5 100644
--- a/docs/topics/impala_invalidate_metadata.xml
+++ b/docs/topics/impala_invalidate_metadata.xml
@@ -192,11 +192,11 @@ under the License.
      By default, the <codeph>INVALIDATE METADATA</codeph> command checks HDFS permissions of the underlying data
      files and directories, caching this information so that a statement can be cancelled immediately if for
      example the <codeph>impala</codeph> user does not have permission to write to the data directory for the
-      table. (This checking does not apply if you have set the <cmdname>catalogd</cmdname> configuration option
-      <codeph>--load_catalog_in_background=false</codeph>.) Impala reports any lack of write permissions as an
-      <codeph>INFO</codeph> message in the log file, in case that represents an oversight. If you change HDFS
-      permissions to make data readable or writeable by the Impala user, issue another <codeph>INVALIDATE
-      METADATA</codeph> to make Impala aware of the change.
+      table. (This checking does not apply when the <cmdname>catalogd</cmdname> configuration option
+      <codeph>--load_catalog_in_background</codeph> is set to <codeph>false</codeph>, which it is by default.)
+      Impala reports any lack of write permissions as an <codeph>INFO</codeph> message in the log file, in case
+      that represents an oversight. If you change HDFS permissions to make data readable or writeable by the Impala
+      user, issue another <codeph>INVALIDATE METADATA</codeph> to make Impala aware of the change.
     </p>
 
     <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
