Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13757 )
Change subject: [docs] update Hive Metastore integration and Impala integration docs ...................................................................... Patch Set 4: (59 comments) http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc File docs/hive_metastore.adoc: http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@92 PS2, Line 92: (as such, the Hive Metastore : will create directories for Kudu tables that will be empty) > Without knowing more about how the HMS works, this doesn't follow logically Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@93 PS2, Line 93: Though Kudu and the HMS : catalog could go out of sync from time to time (due to : link:https://issues.apache.org/jira/browse/KUDU-2475[KUDU-2475]). > Probably better as a WARNING, and also describe what user-facing things can Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@99 PS2, Line 99: ntry). > nit: remove Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@102 PS2, Line 102: Dropping > When dropping Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@102 PS2, Line 102: table > tables Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@103 PS2, Line 103: the data and the table truly are dropped > maybe: the table's data is dropped at its source? dropped in Kudu? Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@104 PS2, Line 104: Moreover, external table can be named differently : as the underlying Kudu table, while internal table always named the same as the Kudu : table. > How about: External tables may refer to tables by a name that is different Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@106 PS2, Line 106: with Hive Metastore integration, Kudu only automatically synchronizes : metadata for internal tables. > maybe: since external tables do not necessarily need to be synchonrized wit Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@108 PS2, Line 108: information. > information about table types in Impala. Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@143 PS2, Line 143: which `hive_metastore_sasl_enabled` > in which `--hive_metastore_sasl_enabled` Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@143 PS2, Line 143: (Kerberos) > nit: remove Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@144 PS2, Line 144: `hive_metastore_kerberos_principal` > `--hive_metastore_kerberos_principal` Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@144 PS2, Line 144: to true, `hive_metastore_kerberos_principal` must match the primary portion of : hive.metastore.kerberos.principal > backticks Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@154 PS2, Line 154: > or during the normal operation of a Kudu cluster. Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@166 PS2, Line 166: > How about "the Kudu HMS tooling". Or "`kudu hms` tool"? Same elsewhere Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@168 PS2, Line 168: > maybe: the Kudu admin user (commonly "kudu") Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@168 PS2, Line 168: > drop Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@170 PS2, Line 170: cluster is specified. : > done by configuring the user as a trusted user via the `--trusted_user_acl` Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@176 PS2, Line 176: user as trusted in the masters (via : > Maybe clarify what is deemed a Kudu table, e.g. It's actually just the storage handler. http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@204 PS2, Line 204: any automatically-fixable issues, for instance, by creating a table entry in : the HMS for each Kudu table that > How about, "Before enabling the Kudu-HMS integration" Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@206 PS2, Line 206: cuting it. When no automatic fix is : available, it will make suggestions on > remove parens and start a new sentence instead Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@210 PS2, Line 210: the reasons documente > Prepare for the Upgrade Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@215 PS2, Line 215: e Hive Metastores. It is disco > using the hms list tool? Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@218 PS2, Line 218: pgrading Existing Tab > Perform the Upgrade Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@228 PS2, Line 228: available, > IMO optional is pretty different from conditional. This must be done, and y Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@237 PS2, Line 237: > Same here, I don't think we should say the below steps are optional. Instea Done http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@291 PS2, Line 291: > nit: maybe link to hive_metastore.adoc#enabling-the-hive-metastore-integrat Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc File docs/hive_metastore.adoc: http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@102 PS4, Line 102: table > Kudu tables ? Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@102 PS4, Line 102: Dropping an internal table from : Impala, the data and the table truly are dropped > maybe rephrase as: Dropping an internal Kudu table from Impala also drops t rephrased http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@141 PS4, Line 141: match hive.metastore.sasl.enabled > Match that property from what source? Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@143 PS4, Line 143: hive_metastore_sasl_enabled > Is this hive.metastore.sasl.enabled property from some XML file or this is Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@144 PS4, Line 144: hive_metastore_kerberos_principal > --hive_metastore_kerberos_principal master's run-time flag ? Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@149 PS4, Line 149: needs to have read and write privileges on directories the HMS created for Kudu tables. > How about: needs to be able to access directories created by the HMS Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@151 PS4, Line 151: command > remove Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@153 PS4, Line 153: * Restart the Kudu master(s). > Does it make sense to mention that Impala also need to be restarted if usin Interesting, I don't think Impala should need to be restarted. Probably worth clarifying a bit. http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@159 PS4, Line 159: any > Drop 'any': I know there are some issues which cannot be fixed with those t Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@162 PS4, Line 162: `hms` tools > nit: The `hms` tools ? Ack http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@178 PS4, Line 178: read and write > nit: the read and the write ? Ack http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@179 PS4, Line 179: which can be done in the Hive Metastore using : `sentry.metastore.service.users` flag > nit: for clarity, move into a separate sentence, something like Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@185 PS4, Line 185: all > I'm not sure what 'all' means here, but the tool doesn't list anything for Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@206 PS4, Line 206: before > instead of Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@209 PS4, Line 209: The `kudu hms fix` tool will not automatically fix Impala external tables : for the reasons documented above > Then maybe it makes sense to provide detailed instructions w.r.t. Kudu exte I just noted that we recommend dropping and recreating them. How's that? http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@259 PS4, Line 259: does not report : any inconsistent catalog warnings > nit: does not report any catalog inconsistencies ? Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@277 PS4, Line 277: fix > Is this about 'kudu hms fix' or it's about 'kudu hms check', actually? I t Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@277 PS4, Line 277: . (Optional) If `kudu hms fix` tool reports inconsistent catalog, perform a dry-run of the : `kudu hms fix` tool to understand how the tool will attempt to address the automatically-fixable : issues. > How does the operator know which issues are fixable and which are not in ge Correct me if I'm wrong, but i think this may be referring to tables with the `impala::` prefix? At least looking through the code, it seems the check tool will only recommend manual renaming if the table name isn't Hive-compatible. http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@306 PS4, Line 306: --ignore_other_clusters=true > Here and elsehwhere: why to add the options which are set to the specified Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@309 PS4, Line 309: drop_orphan_hms_tables > nit: --drop_orphan_hms_tables Ack http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@311 PS4, Line 311: The workaround is to downgrade : the tables using `kudu hms downgrade` tool and then rerun `kudu hms fix` tool. > This assumes the new tables are using the new Kudu table storage handler. Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@314 PS4, Line 314: . Recreate external tables manually using Impala Shell. : : . Enable the Hive Metastore Integration as described in the section above. > Should these two be swapped? I don't think it matters; external table creation is ignored by Kudu, no? http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc File docs/kudu_impala_integration.adoc: http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@102 PS4, Line 102: Work > Working Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@104 PS4, Line 104: that ships with CDH 6.3 > This probably belongs in CDH docs, rather than upstream. Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@104 PS4, Line 104: 1.10 > nit: For consistency, 1.10.0? Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@104 PS4, Line 104: Impala integration > nit: the Impala integration Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@105 PS4, Line 105: work with the Hive Metastore integration to take the advantage of automatically : catalog synchronization between Kudu and the HMS > nit: maybe "can take advantage the automatic Kudu-HMS catalog synchronizati Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@106 PS4, Line 106: Since external table can map to : different Kudu tables with different names, and dropping an external table does not : drop the table from Kudu, only metadata of internal tables are automatically synchronized. > How about "Since there is no one-to-one mapping between Kudu tables and ext Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@109 PS4, Line 109: : NOTE: When Kudu's integration with the Hive Metastore is enabled, Impala should be : con > Probably also worth pointing to the Kudu-HMS docs here. Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@114 PS4, Line 114: : Without the Hive Metastore integration > Without the HMS integration enabled Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@129 PS4, Line 129: However, such a manual process can be skipped with the Hive Metastore integration, : which an internal table with the same name will be automatically created by Kudu. : To query the table, you can simply run `invalidate metadata` to ensure Impala : is picking up the latest metadata. > When the Kudu-HMS integration is enabled, internal table entries will be cr Done http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@746 PS4, Line 746: change table type from intenral : to external (and vice versa) is disallowed, due to potential to introduce catalog inconsistency : between Kudu and the HMS. > changing the table type is disallowed to avoid potentially introducing inco Done -- To view, visit http://gerrit.cloudera.org:8080/13757 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8726a39d56c4e9954f208700e99e7bcf2bbc290d Gerrit-Change-Number: 13757 Gerrit-PatchSet: 4 Gerrit-Owner: Hao Hao <[email protected]> Gerrit-Reviewer: Alex Rodoni <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Hao Hao <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Priyanka Chheda <[email protected]> Gerrit-Comment-Date: Tue, 02 Jul 2019 04:09:02 +0000 Gerrit-HasComments: Yes
