Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/10339 )

Change subject: IMPALA-6987: [DOCS] Update when INVALIDATE METADATA is required
......................................................................


Patch Set 1:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml
File docs/topics/impala_invalidate_metadata.xml:

http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@47
PS1, Line 47: relatively
> replace: very
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@48
PS1, Line 48: in the common scenario of adding new data files to an existing table
> replace: whenever possible (link to INVALIDATE vs. REFRESH usage page, if e
Replaced with "whenever possible"
No link to add, though.


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@59
PS1, Line 59: By default
> replace: If there is no table specified
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@60
PS1, Line 60: Even for a single table, <codeph>INVALIDATE METADATA</codeph> is more expensive
: than <codeph>REFRESH</codeph>, so prefer <codeph>REFRESH</codeph> in the common case where you add new data
: files for an existing table.
> Same thing mentioned ~10 lines above - remove?
Removed


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@69
PS1, Line 69: Therefore, if some other entity modifies information used by Impala in the metastore
: that Impala and Hive share, the information cached by Impala must be updated. However, this does not mean
: that all metadata updates require an Impala update.
> This is vague, explicitly state when is manual invalidate needed (see L100-
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@74
PS1, Line 74: <note>
: <p conref="../shared/impala_common.xml#common/catalog_server_124"/>
: <p rev="1.2">
: In Impala 1.2 and higher, a dedicated daemon (<cmdname>catalogd</cmdname>) broadcasts DDL changes made
: through Impala to all Impala nodes. Formerly, after you created a database or table while connected to one
: Impala node, you needed to issue an <codeph>INVALIDATE METADATA</codeph> statement on another Impala node
: before accessing the new database or table from the other node. Now, newly created or altered objects are
: picked up automatically by all Impala nodes. You must still use the <codeph>INVALIDATE METADATA</codeph>
: technique after creating or altering objects through Hive. See
: <xref href="impala_components.xml#intro_catalogd"/> for more information on the catalog service.
: </p>
: <p>
: The <codeph>INVALIDATE METADATA</codeph> statement is new in Impala 1.1 and higher, and takes over some of
: the use cases of the Impala 1.0 <codeph>REFRESH</codeph> statement. Because <codeph>REFRESH</codeph> now
: requires a table name parameter, to flush the metadata for all tables at once, use the <codeph>INVALIDATE
: METADATA</codeph> statement.
: </p>
: <p conref="../shared/impala_common.xml#common/invalidate_then_refresh"/>
: </note>
> This section is very outdated and mixes up usage of old vs. new usage of RE
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@99
PS1, Line 99: instance
> Don't mention individual instances in this context, it's the service as a w
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@100
PS1, Line 100: required if a change is made from another <codeph>impalad</codeph>
: instance in your cluster, or through Hive and is distributed by
: <codeph>catalogd</codeph>.
> INVALIDATE METADATA is required when:
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@106
PS1, Line 106: same Impala node
> This is pre-1.2 information. No INVALIDATE is needed as long as the changes
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@127
PS1, Line 127: <codeph>INVALIDATE METADATA</codeph> causes the metadata for that table to be marked as stale, and reloaded
: the next time the table is referenced. For a huge table, that process could take a noticeable amount of time;
> Repeat of L44-47
The whole paragraph is replaced.


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@129
PS1, Line 129: thus you might prefer to use <codeph>REFRESH</codeph> where practical, to avoid an unpredictable delay later,
: for example if the next reference to the table is during a benchmark test.
> Key information missing: use REFRESH after invalidating a specific table to
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@137
PS1, Line 137: (such as SequenceFile or HBase tables)
> remove
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@139
PS1, Line 139: DESCRIBE
> If L129-130 is left in, use REFRESH to make recommendation consistent.
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@154
PS1, Line 154: <p conref="../shared/impala_common.xml#common/permissions_blurb"/>
: <p rev="">
: The user ID that the <cmdname>impalad</cmdname> daemon runs under,
: typically the <codeph>impala</codeph> user, must have execute
: permissions for all the relevant directories holding table data.
: (A table could have data spread across multiple directories,
: or in unexpected paths, if it uses partitioning or
: specifies a <codeph>LOCATION</codeph> attribute for
: individual partitions or the entire table.)
: Issues with permissions might not cause an immediate error for this statement,
: but subsequent statements such as <codeph>SELECT</codeph>
: or <codeph>SHOW TABLE STATS</codeph> could fail.
: </p>
> Not specific to INVALIDATE, remove.
Done


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@176
PS1, Line 176: Impala reports any lack of write permissions as an <codeph>INFO</codeph> message in the log file, in case
: that represents an oversight.
> Is this INVALIDATE specific at all?
Removed


http://gerrit.cloudera.org:8080/#/c/10339/1/docs/topics/impala_invalidate_metadata.xml@186
PS1, Line 186: The ability to specify <codeph>INVALIDATE METADATA
: <varname>table_name</varname></codeph> for a table created in Hive is a new capability in Impala 1.2.4. In
: earlier releases, that statement would have returned an error indicating an unknown table, requiring you to
: do <codeph>INVALIDATE METADATA</codeph> with no table name, a more expensive operation that reloaded metadata
: for all tables and databases.
> Remove, already mentioned above.
Done


--
To view, visit http://gerrit.cloudera.org:8080/10339
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2124e14900d0f82569c061cc46006447bb054b36
Gerrit-Change-Number: 10339
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Alex Rodoni <[email protected]>
Gerrit-Reviewer: Balazs Jeszenszky <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Comment-Date: Wed, 09 May 2018 23:00:24 +0000
Gerrit-HasComments: Yes
