[
https://issues.apache.org/jira/browse/IMPALA-10976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17802247#comment-17802247
]
Michael Smith commented on IMPALA-10976:
----------------------------------------
Did the above commit address this ticket?
> Sync db/table in catalogd to latest HMS event id for all DDLs from Impala
> shell
> -------------------------------------------------------------------------------
>
> Key: IMPALA-10976
> URL: https://issues.apache.org/jira/browse/IMPALA-10976
> Project: IMPALA
> Issue Type: Task
> Components: Catalog, Frontend
> Reporter: Sourabh Goyal
> Assignee: Sai Hemanth Gantasala
> Priority: Major
>
> This is a follow up from IMPALA-10926. The idea is that when any DDL
> operation is performed from Impala shell, it also syncs the db/table to its
> latest event ID as per HMS. This way, updates to a db/table are applied in
> the same order as they appear in the HMS notification log, which ensures
> consistency. Currently catalogd applies any updates received from Impala
> shell in place. Instead, it should perform the HMS operation first and then
> replay all the HMS events since the last synced event.
> However, there are subtle differences between how Impala processes DDLs via
> the shell and how it processes HMS events. These are:
> * When processing an alter table event, catalogd currently does a full table
> reload. This has a performance impact, as a table reload is time consuming.
> In contrast, the in-place alter table DDL operation in catalogOpExecutor (via
> Impala shell) is faster since it detects whether to reload the table schema,
> the file metadata, or both. The alter table event processing logic needs
> improvements to detect whether to reload the file metadata or not. --> This
> is addressed by IMPALA-11534
> * A similar improvement is required in processing the alter partition event.
> As of now, when processing an AlterPartition HMS event, catalogd always
> reloads file metadata, but when doing the same from the shell, it reloads
> metadata only when it is required.
> * Impala shell already caches Hive functions in the catalog db object, but
> catalogd does *not* process CREATE/DROP function HMS events.
> * When creating a db/table from Impala shell, if the operation fails because
> the db/table already exists, then there is no reliable way in catalogd to
> determine the create event id for that db/table. The create event id is
> required so that for any subsequent DDL operations, catalogd can process HMS
> events starting from createEvent Id.
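
The ordering idea in the description (perform the HMS operation first, then replay every event since the last synced event id) can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyHms, CachedTable, and run_ddl_and_sync are hypothetical names invented for illustration, the event payload is a plain dict, and none of this reflects the actual catalogd or HMS APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    event_id: int
    table: str
    change: dict  # toy payload: property updates carried by the event


@dataclass
class ToyHms:
    """Hypothetical stand-in for the HMS notification log (not the real HMS API)."""
    log: list = field(default_factory=list)

    def apply_ddl(self, table, change):
        # Every DDL appends one event with a monotonically increasing id.
        event = Event(len(self.log) + 1, table, change)
        self.log.append(event)
        return event.event_id

    def events_since(self, table, last_event_id):
        # Events for this table that the cache has not yet applied, in log order.
        return [e for e in self.log
                if e.table == table and e.event_id > last_event_id]


@dataclass
class CachedTable:
    """Hypothetical cached table that remembers the last event id it applied."""
    name: str
    props: dict = field(default_factory=dict)
    last_synced_event_id: int = 0


def run_ddl_and_sync(hms, cached, change):
    """Perform the HMS operation first, then replay all unsynced events in order."""
    hms.apply_ddl(cached.name, change)
    for event in hms.events_since(cached.name, cached.last_synced_event_id):
        cached.props.update(event.change)
        cached.last_synced_event_id = event.event_id
    return cached
```

In this model, a change made by another client before the shell DDL is picked up by the replay, so the cached table ends up reflecting both changes in notification-log order rather than only the in-place update.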