Dimitris Tsirogiannis has uploaded a new patch set (#2).

Change subject: IMPALA-3719: Simplify CREATE TABLE statements with Kudu tables
......................................................................

IMPALA-3719: Simplify CREATE TABLE statements with Kudu tables

With this commit we simplify the syntax and handling of CREATE TABLE
statements for both managed and external Kudu tables.

Syntax example:
CREATE TABLE foo(a INT, b STRING, PRIMARY KEY (a, b))
DISTRIBUTE BY HASH (a) INTO 3 BUCKETS,
RANGE (b) SPLIT ROWS (('abc', 'def'))
STORED AS KUDU

Changes:
1) Remove the requirement to specify table properties such as key
   columns in tblproperties.
2) Read table schema (column definitions, primary keys, and distribution
   schemes) from Kudu instead of the HMS.
3) For external tables, the Kudu table is now required to exist at the
   time of creation in Impala.
4) Disallow table properties that could conflict with an existing
   table. Ex: key_columns cannot be specified.
5) Add KUDU as a file format.
6) Add a startup flag to impalad to specify the default Kudu master
   addresses. The flag is used as the default value for the table
   property kudu_master_addresses but it can still be overriden
   using TBLPROPERTIES.
7) Fix a post merge issue (IMPALA-3178) where DROP DATABASE CASCADE
   wasn't implemented for Kudu tables and silently ignored. The Kudu
   tables wouldn't be removed in Kudu.
8) Remove DDL delegates. There was only one functional delegate (for
   Kudu) the existence of the other delegate and the use of delegates in
   general has led to confusion. The Kudu delegate only exists to provide
   functionality missing from Hive.
9) Add PRIMARY KEY at the column and table level. This syntax is fairly
   standard. When used at the column level, only one column can be
   marked as a key. When used at the table level, multiple columns can
   be used as a key. Only Kudu tables are allowed to use PRIMARY KEY.
   The old "kudu.key_columns" table property is no longer accepted
   though it is still used internally. "PRIMARY" is now a keyword.
   "KEY" is expected to be common enough that the ident style
   declaration is used instead to avoid conflicts.
10) For managed tables, infer a Kudu table name if none was given.
   The table property "kudu.table_name" is optional for managed tables
   and is required for external tables. If for a managed table a Kudu
   table name is not provided, a table name will be generated based
   on the HMS database and table name. If the database is "default",
   then a Kudu table name will not include the database.
11) Use Kudu master as the source of truth for table metadata instead
   of HMS when a table is loaded or refreshed. Table/column metadata
   are cached in the catalog and are stored in HMS in order to be
   able to use table and column statistics.

Change-Id: I7b9d51b2720ab57649abdb7d5c710ea04ff50dc1
---
M be/src/service/frontend.cc
M bin/start-catalogd.sh
M bin/start-impala-cluster.py
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
A fe/src/main/java/com/cloudera/impala/analysis/AnalysisUtils.java
M fe/src/main/java/com/cloudera/impala/analysis/ColumnDef.java
A fe/src/main/java/com/cloudera/impala/analysis/ColumnDefOptions.java
M fe/src/main/java/com/cloudera/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/CreateTableDataSrcStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/CreateTableLikeStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/CreateTableStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/DistributeParam.java
M fe/src/main/java/com/cloudera/impala/analysis/ModifyStmt.java
A fe/src/main/java/com/cloudera/impala/analysis/TableDataLayout.java
A fe/src/main/java/com/cloudera/impala/analysis/TableDef.java
M fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
M fe/src/main/java/com/cloudera/impala/catalog/Catalog.java
M fe/src/main/java/com/cloudera/impala/catalog/Db.java
M fe/src/main/java/com/cloudera/impala/catalog/HdfsFileFormat.java
M fe/src/main/java/com/cloudera/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/com/cloudera/impala/catalog/KuduTable.java
M fe/src/main/java/com/cloudera/impala/catalog/Table.java
M fe/src/main/java/com/cloudera/impala/catalog/TableLoader.java
D fe/src/main/java/com/cloudera/impala/catalog/delegates/DdlDelegate.java
D fe/src/main/java/com/cloudera/impala/catalog/delegates/KuduDdlDelegate.java
D 
fe/src/main/java/com/cloudera/impala/catalog/delegates/UnsupportedOpDelegate.java
M fe/src/main/java/com/cloudera/impala/planner/HdfsPartitionFilter.java
M fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java
M fe/src/main/java/com/cloudera/impala/service/Frontend.java
M fe/src/main/java/com/cloudera/impala/service/JniCatalog.java
M fe/src/main/java/com/cloudera/impala/service/JniFrontend.java
A fe/src/main/java/com/cloudera/impala/service/KuduCatalogOpExecutor.java
A fe/src/main/java/com/cloudera/impala/util/KuduClient.java
M fe/src/main/java/com/cloudera/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/com/cloudera/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/com/cloudera/impala/analysis/ParserTest.java
M fe/src/test/java/com/cloudera/impala/service/JdbcTest.java
M fe/src/test/java/com/cloudera/impala/testutil/ImpaladTestCatalog.java
M infra/python/deps/requirements.txt
M testdata/bin/generate-schema-statements.py
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/tpch/tpch_schema_template.sql
D testdata/workloads/functional-query/queries/QueryTest/create_kudu.test
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
D testdata/workloads/functional-query/queries/QueryTest/kudu-show-create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
A testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M tests/common/__init__.py
A tests/common/kudu_test_suite.py
M tests/conftest.py
A tests/custom_cluster/test_kudu.py
M tests/metadata/test_ddl.py
M tests/query_test/test_kudu.py
60 files changed, 2,966 insertions(+), 2,075 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/4414/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7b9d51b2720ab57649abdb7d5c710ea04ff50dc1
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>

Reply via email to