[
https://issues.apache.org/jira/browse/IMPALA-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558337#comment-17558337
]
ASF subversion and git services commented on IMPALA-9670:
---------------------------------------------------------
Commit d74cc7319f21a49d9bba952722b3a622db905836 in impala's branch
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d74cc7319 ]
IMPALA-9670: Fix unloaded views are shown as tables for GET_TABLES requests
At startup, catalogd pulls the table names from HMS and tracks each
table using an IncompleteTable which only contains the table name. The
table types (TABLE/VIEW) and comments are unknown until the table/view
is loaded in catalogd. GET_TABLES is an HS2 protocol request that
fetches all the tables with their types and comments. For unloaded
tables/views, Impala always returns them with TABLE type (the default)
and empty comments.
This patch enables catalogd to load the table types and comments
along with the table names. This behavior is controlled by a
catalogd-only flag, --pull_table_types_and_comments, which is false by
default. When this flag is enabled, catalogd loads table types and
comments at startup and when executing INVALIDATE METADATA commands. In
other words, an unloaded table (IncompleteTable) then contains not just
the table name but also the correct table type and comment.
This is implemented by using the getTableMetas HMS API when invalidating
a table. The original behavior uses getAllTables to load all table names
and uses tableExists to verify whether a table still exists. When the
flag is set, we'll use getTableMetas instead to also load the table
types and comments.
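The difference between the two listing strategies can be sketched as
follows. This is a minimal Python illustration, not Impala's Java code:
FakeHms, TableMeta, and this IncompleteTable class are hypothetical
stand-ins, and the snake_case method names only mirror the HMS calls
named above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TableMeta:
    """Stand-in for the per-table summary returned by the HMS meta call."""
    name: str
    table_type: str  # HMS type string, e.g. "MANAGED_TABLE" or "VIRTUAL_VIEW"
    comment: str


@dataclass(frozen=True)
class IncompleteTable:
    """Placeholder for an unloaded table; None means 'not known yet'."""
    name: str
    table_type: Optional[str] = None
    comment: Optional[str] = None


class FakeHms:
    """In-memory stand-in for a metastore client (hypothetical)."""

    def __init__(self, tables: Dict[str, List[TableMeta]]):
        self._tables = tables

    def get_all_tables(self, db: str) -> List[str]:
        # Names only: this is all the original behavior has to work with.
        return [t.name for t in self._tables.get(db, [])]

    def get_table_metas(self, db: str) -> List[TableMeta]:
        # Names plus types and comments in one call.
        return list(self._tables.get(db, []))


def list_incomplete_tables(hms: FakeHms, db: str,
                           pull_table_types_and_comments: bool
                           ) -> List[IncompleteTable]:
    """Build placeholders for every table in db, honoring the flag."""
    if pull_table_types_and_comments:
        return [IncompleteTable(m.name, m.table_type, m.comment)
                for m in hms.get_table_metas(db)]
    return [IncompleteTable(name) for name in hms.get_all_tables(db)]
```

With the flag off, every placeholder has an unknown type, which is why
a view cannot be told apart from a table until it is loaded.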
Implementation:
Add a new table type, UNLOADED_TABLE, in TTableType to identify tables
that are known not to be views but whose storage type (Kudu or HDFS) is
still unknown because their full metadata has not been loaded.
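A rough sketch of that classification, in Python for brevity. The enum
is a simplified stand-in for TTableType with only the members discussed
here, and the two helper functions are hypothetical. The sketch assumes
HMS reports views as VIRTUAL_VIEW and does not cover how other HMS
types, such as materialized views, are handled.

```python
from enum import Enum


class TableType(Enum):
    """Simplified stand-in for TTableType (not the full Thrift enum)."""
    HDFS_TABLE = "HDFS_TABLE"
    KUDU_TABLE = "KUDU_TABLE"
    VIEW = "VIEW"
    UNLOADED_TABLE = "UNLOADED_TABLE"


def type_for_unloaded(hms_table_type: str) -> TableType:
    """Classify an unloaded entry using only the HMS-reported type."""
    if hms_table_type == "VIRTUAL_VIEW":
        return TableType.VIEW
    # Known not to be a view, but Kudu vs. HDFS is undecided until load.
    return TableType.UNLOADED_TABLE


def get_tables_type(table_type: TableType) -> str:
    """Type string shown to an HS2 client: VIEW for views, else TABLE."""
    return "VIEW" if table_type is TableType.VIEW else "TABLE"
```

The point of the extra enum member is that a GET_TABLES response only
needs the table/view distinction, which is available without loading
the full metadata.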
When propagating catalog objects from catalogd to coordinators, views
are sent using a catalog key explicitly prefixed by VIEW, so that
coordinators can create IncompleteTables/LocalIncompleteTables with the
correct types.
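The key-prefix idea can be illustrated roughly as below. The exact key
layout ("PREFIX:db.name") and both function names are assumptions made
for this sketch; only the use of a VIEW prefix for views comes from the
description above.

```python
from typing import Dict


def catalog_key(db: str, name: str, is_view: bool) -> str:
    """Build a topic key; the 'PREFIX:db.name' layout is an assumption."""
    prefix = "VIEW" if is_view else "TABLE"
    return f"{prefix}:{db}.{name}"


def incomplete_table_from_key(key: str) -> Dict[str, str]:
    """Coordinator side: recover the type from the key prefix alone."""
    prefix, sep, full_name = key.partition(":")
    if not sep or prefix not in ("TABLE", "VIEW"):
        raise ValueError(f"unexpected catalog key: {key!r}")
    db, dot, name = full_name.partition(".")
    if not dot or not db or not name:
        raise ValueError(f"unexpected table name in key: {key!r}")
    table_type = "VIEW" if prefix == "VIEW" else "UNLOADED_TABLE"
    return {"db": db, "name": name, "type": table_type}
```

Because the type travels in the key, a coordinator can build a
correctly typed placeholder without receiving the full object.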
In most cases where an IncompleteTable is created, the table type and
comment are available in the context. For instance, when adding an
IncompleteTable for a CreateTable/CreateView request, we know exactly
whether it's a table or a view, so we can create IncompleteTables with
the correct types.
Test infra changes:
- Add a get_tables() method to the hs2_client
- Extend ImpalaTestSuite.create_client_for_nth_impalad() to support
hs2 and hs2-http protocols, so that HS2 clients can be created on all
impalads.
Tests:
- Add custom cluster tests on all catalog modes (with/without
local-catalog or event processor). Verify the table types and
comments are always correct when pull_table_types_and_comments is
true.
Change-Id: I528bb20272ebdd66a0118c30efc2b0566f2b0e2f
Reviewed-on: http://gerrit.cloudera.org:8080/18626
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Fix unloaded views are shown as tables for GET_TABLES requests
> --------------------------------------------------------------
>
> Key: IMPALA-9670
> URL: https://issues.apache.org/jira/browse/IMPALA-9670
> Project: IMPALA
> Issue Type: Improvement
> Components: Catalog
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
>
> JDBC/ODBC clients send GET_TABLES requests to get the list of tables of
> a given db. Table types and comments should be returned as well. However, for
> unloaded views, which are represented as IncompleteTable, Impala doesn't know
> the exact table types and returns them as tables by default. This causes
> trouble for applications that need to distinguish views.
> Problems to fix in this JIRA:
> * Views are shown as tables when Impala launches (all views are unloaded).
> * Views are shown as tables after creation (the created view is unloaded).
> A possible solution is to add the table type to IncompleteTable:
> * {{IncompleteTable}}s are created in several places. Add the table type in
> those places.
> * For startup and global invalidate metadata,
> ** Approach 1: use getTables API with table type.
> *** Currently we use IMetaStoreClient#getAllTables(dbName) to get table
> names inside a db and create IncompleteTables for them.
> *** We can add an additional call to load names of views using
> IMetaStoreClient#getTables(dbName, tablePattern, tableType).
> *** This may increase the startup time by 2x.
> ** Approach 2: use getTableMeta API instead of getAllTables.
> *** Pros: can get the comments as well.
> *** Cons: This may increase the startup time by 4x.
> * Load view's metadata at the end of lightweight DDL operations like
> "INVALIDATE METADATA _table_", "CREATE VIEW", etc.
> * EventProcessors can extract table type from the event's HMS table object.
> * Add a feature flag for this so users who don't need it can turn it off to
> reduce the startup time.