Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13757 )

Change subject: [docs] update Hive Metastore integration and Impala integration 
docs
......................................................................


Patch Set 4:

(59 comments)

http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc
File docs/hive_metastore.adoc:

http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@92
PS2, Line 92: (as such, the Hive Metastore
            : will create directories for Kudu tables that will be empty)
> Without knowing more about how the HMS works, this doesn't follow logically
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@93
PS2, Line 93: Though Kudu and the HMS
            : catalog could go out of sync from time to time (due to
            : link:https://issues.apache.org/jira/browse/KUDU-2475[KUDU-2475]).
> Probably better as a WARNING, and also describe what user-facing things can
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@99
PS2, Line 99: ntry).
> nit: remove
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@102
PS2, Line 102: Dropping
> When dropping
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@102
PS2, Line 102: table
> tables
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@103
PS2, Line 103: the data and the table truly are dropped
> maybe: the table's data is dropped at its source? dropped in Kudu?
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@104
PS2, Line 104: Moreover, external table can be named differently
             : as the underlying Kudu table, while internal table always named 
the same as the Kudu
             : table.
> How about: External tables may refer to tables by a name that is different
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@106
PS2, Line 106: with Hive Metastore integration, Kudu only automatically 
synchronizes
             : metadata for internal tables.
> maybe: since external tables do not necessarily need to be synchonrized wit
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@108
PS2, Line 108: information.
> information about table types in Impala.
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@143
PS2, Line 143: which `hive_metastore_sasl_enabled`
> in which `--hive_metastore_sasl_enabled`
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@143
PS2, Line 143: (Kerberos)
> nit: remove
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@144
PS2, Line 144: `hive_metastore_kerberos_principal`
> `--hive_metastore_kerberos_principal`
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@144
PS2, Line 144: to true, `hive_metastore_kerberos_principal` must match the 
primary portion of
             : hive.metastore.kerberos.principal
> backticks
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@154
PS2, Line 154:
> or during the normal operation of a Kudu cluster.
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@166
PS2, Line 166:
> How about "the Kudu HMS tooling". Or "`kudu hms` tool"? Same elsewhere
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@168
PS2, Line 168:
> maybe: the Kudu admin user (commonly "kudu")
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@168
PS2, Line 168:
> drop
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@170
PS2, Line 170:  cluster is specified.
             :
> done by configuring the user as a trusted user via the `--trusted_user_acl`
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@176
PS2, Line 176:  user as trusted in the masters (via
             :
> Maybe clarify what is deemed a Kudu table, e.g.
It's actually just the storage handler.


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@204
PS2, Line 204: any automatically-fixable issues, for instance, by creating a 
table entry in
             : the HMS for each Kudu table that
> How about, "Before enabling the Kudu-HMS integration"
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@206
PS2, Line 206: cuting it. When no automatic fix is
             : available, it will make suggestions on
> remove parens and start a new sentence instead
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@210
PS2, Line 210: the reasons documente
> Prepare for the Upgrade
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@215
PS2, Line 215: e Hive Metastores. It is disco
> using the hms list tool?
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@218
PS2, Line 218: pgrading Existing Tab
> Perform the Upgrade
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@228
PS2, Line 228: available,
> IMO optional is pretty different from conditional. This must be done, and y
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@237
PS2, Line 237:
> Same here, I don't think we should say the below steps are optional. Instea
Done


http://gerrit.cloudera.org:8080/#/c/13757/2/docs/hive_metastore.adoc@291
PS2, Line 291:
> nit: maybe link to hive_metastore.adoc#enabling-the-hive-metastore-integrat
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc
File docs/hive_metastore.adoc:

http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@102
PS4, Line 102: table
> Kudu tables ?
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@102
PS4, Line 102: Dropping an internal table from
             : Impala, the data and the table truly are dropped
> maybe rephrase as: Dropping an internal Kudu table from Impala also drops t
rephrased


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@141
PS4, Line 141: match hive.metastore.sasl.enabled
> Match that property from what source?
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@143
PS4, Line 143: hive_metastore_sasl_enabled
> Is this hive.metastore.sasl.enabled property from some XML file or this is
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@144
PS4, Line 144: hive_metastore_kerberos_principal
> --hive_metastore_kerberos_principal master's run-time flag ?
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@149
PS4, Line 149: needs to have read and write privileges on directories the HMS 
created for Kudu tables.
> How about: needs to be able to access directories created by the HMS
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@151
PS4, Line 151: command
> remove
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@153
PS4, Line 153: * Restart the Kudu master(s).
> Does it make sense to mention that Impala also need to be restarted if usin
Interesting, I don't think Impala should need to be restarted. Probably worth 
clarifying a bit.


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@159
PS4, Line 159: any
> Drop 'any': I know there are some issues which cannot be fixed with those t
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@162
PS4, Line 162: `hms` tools
> nit: The `hms` tools ?
Ack


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@178
PS4, Line 178: read and write
> nit: the read and the write ?
Ack


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@179
PS4, Line 179: which can be done in the Hive Metastore using
             : `sentry.metastore.service.users` flag
> nit: for clarity, move into a separate sentence, something like
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@185
PS4, Line 185: all
> I'm not sure what 'all' means here, but the tool doesn't list anything for
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@206
PS4, Line 206: before
> instead of
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@209
PS4, Line 209: The `kudu hms fix` tool will not automatically fix Impala 
external tables
             : for the reasons documented above
> Then maybe it makes sense to provide detailed instructions w.r.t. Kudu exte
I just noted that we recommend dropping and recreating them. How's that?


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@259
PS4, Line 259: does not report
             :   any inconsistent catalog warnings
> nit: does not report any catalog inconsistencies ?
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@277
PS4, Line 277: fix
> Is this about 'kudu hms fix' or it's about 'kudu hms check', actually?  I t
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@277
PS4, Line 277: . (Optional) If `kudu hms fix` tool reports inconsistent 
catalog, perform a dry-run of the
             :   `kudu hms fix` tool to understand how the tool will attempt to 
address the automatically-fixable
             :    issues.
> How does the operator know which issues are fixable and which are not in ge
Correct me if I'm wrong, but i think this may be referring to tables with the 
`impala::` prefix? At least looking through the code, it seems the check tool 
will only recommend manual renaming if the table name isn't Hive-compatible.


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@306
PS4, Line 306: --ignore_other_clusters=true
> Here and elsehwhere: why to add the options which are set to the specified
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@309
PS4, Line 309: drop_orphan_hms_tables
> nit: --drop_orphan_hms_tables
Ack


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@311
PS4, Line 311: The workaround is to downgrade
             : the tables using `kudu hms downgrade` tool and then rerun `kudu 
hms fix` tool.
> This assumes the new tables are using the new Kudu table storage handler.
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/hive_metastore.adoc@314
PS4, Line 314: . Recreate external tables manually using Impala Shell.
             :
             : . Enable the Hive Metastore Integration as described in the 
section above.
> Should these two be swapped?
I don't think it matters; external table creation is ignored by Kudu, no?


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc
File docs/kudu_impala_integration.adoc:

http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@102
PS4, Line 102: Work
> Working
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@104
PS4, Line 104:  that ships with CDH 6.3
> This probably belongs in CDH docs, rather than upstream.
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@104
PS4, Line 104: 1.10
> nit: For consistency, 1.10.0?
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@104
PS4, Line 104: Impala integration
> nit: the Impala integration
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@105
PS4, Line 105:  work with the Hive Metastore integration to take the advantage 
of automatically
             : catalog synchronization between Kudu and the HMS
> nit: maybe "can take advantage the automatic Kudu-HMS catalog synchronizati
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@106
PS4, Line 106: Since external table can map to
             : different Kudu tables with different names, and dropping an 
external table does not
             : drop the table from Kudu, only metadata of internal tables are 
automatically synchronized.
> How about "Since there is no one-to-one mapping between Kudu tables and ext
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@109
PS4, Line 109:
             : NOTE: When Kudu's integration with the Hive Metastore is 
enabled, Impala should be
             : con
> Probably also worth pointing to the Kudu-HMS docs here.
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@114
PS4, Line 114:
             : Without the Hive Metastore integration
> Without the HMS integration enabled
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@129
PS4, Line 129: However, such a manual process can be skipped with the Hive 
Metastore integration,
             : which an internal table with the same name will be automatically 
created by Kudu.
             : To query the table, you can simply run `invalidate metadata` to 
ensure Impala
             : is picking up the latest metadata.
> When the Kudu-HMS integration is enabled, internal table entries will be cr
Done


http://gerrit.cloudera.org:8080/#/c/13757/4/docs/kudu_impala_integration.adoc@746
PS4, Line 746: change table type from intenral
             : to external (and vice versa) is disallowed, due to potential to 
introduce catalog inconsistency
             : between Kudu and the HMS.
> changing the table type is disallowed to avoid potentially introducing inco
Done



--
To view, visit http://gerrit.cloudera.org:8080/13757
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8726a39d56c4e9954f208700e99e7bcf2bbc290d
Gerrit-Change-Number: 13757
Gerrit-PatchSet: 4
Gerrit-Owner: Hao Hao <[email protected]>
Gerrit-Reviewer: Alex Rodoni <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Hao Hao <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Priyanka Chheda <[email protected]>
Gerrit-Comment-Date: Tue, 02 Jul 2019 04:09:02 +0000
Gerrit-HasComments: Yes

Reply via email to