[
https://issues.apache.org/jira/browse/IMPALA-3531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973506#comment-16973506
]
ASF subversion and git services commented on IMPALA-3531:
---------------------------------------------------------
Commit e0a98df3fabb84cae8c355d046c1aaa14e5bab25 in impala's branch
refs/heads/master from Anurag Mantripragada
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e0a98df ]
IMPALA-2112: Support primary key/foreign key constraints as part of
create table in Impala.
This is the first of several changes to use informational, unenforced
primary key (PK) and foreign key (FK) specifications in Impala.
The parent JIRA for this effort is IMPALA-3531.
This change adds support for specifying PK/FK information in CREATE
TABLE DDL statements. SQL syntax support is limited for now; later
changes will add other SQL styles, including ANSI syntax. Currently,
the only supported way to add PK/FK information is after the column
definitions. Examples:
CREATE TABLE pk(col1 INT, col2 STRING, PRIMARY KEY(col1, col2));
CREATE TABLE fk(id INT, col1 INT, col2 STRING, PRIMARY KEY(id),
FOREIGN KEY(col1, col2) REFERENCES pk(col1, col2));
In the current implementation, manual specification of constraint names
is not supported. Internally we use UUIDs for constraint name generation.
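The UUID-based naming described above can be illustrated with a minimal
Python sketch. It is not the Impala implementation (the frontend is
written in Java), and the helper name generate_constraint_name is
hypothetical; it only shows how a random UUID gives each constraint a
practically unique name without user input.

```python
import uuid


def generate_constraint_name() -> str:
    """Return a random UUID string to use as a constraint name.

    Hypothetical helper for illustration only; the source does not
    show how Impala formats its generated names.
    """
    return str(uuid.uuid4())


# Each PK/FK specification gets its own generated name.
pk_name = generate_constraint_name()
fk_name = generate_constraint_name()
print(pk_name)
print(fk_name)
```

Random (version 4) UUIDs make name collisions between constraints
extremely unlikely, which is presumably why they are a convenient
stand-in until user-specified names are supported.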
Additionally, three constraint states are supported to comply with
Hive's implementation, which in turn was taken from Oracle:
DISABLE (default true)
NOVALIDATE (default true)
RELY (default true)
More info here:
https://docs.oracle.com/database/121/DWHSG/schemas.htm#DWHSG9053
These constraint states can optionally be specified after each PK/FK
specification, for example:
CREATE TABLE pk(id INT, PRIMARY KEY(id) DISABLE, NOVALIDATE, RELY);
However, a specification like this will throw an analysis error:
CREATE TABLE pk(id INT, PRIMARY KEY(id) ENABLE, VALIDATE, RELY);
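The accept/reject behavior in the two statements above can be sketched
in Python as follows. This is an illustration under assumptions, not the
actual analyzer code: the names AnalysisError and
resolve_constraint_states are hypothetical, and the sketch assumes that
ENABLE and VALIDATE are the keywords that trigger the error, since the
constraints are informational and unenforced.

```python
# Defaults listed in the commit message: DISABLE, NOVALIDATE, RELY.
DEFAULT_STATES = {
    "enforcement": "DISABLE",
    "validation": "NOVALIDATE",
    "belief": "RELY",
}

# Keywords named in the commit message, grouped by the state they set.
KEYWORD_CATEGORY = {
    "DISABLE": "enforcement",
    "ENABLE": "enforcement",
    "NOVALIDATE": "validation",
    "VALIDATE": "validation",
    "RELY": "belief",
}

# Assumption: these are the keywords rejected during analysis.
UNSUPPORTED = {"ENABLE", "VALIDATE"}


class AnalysisError(Exception):
    """Hypothetical stand-in for an analysis-time error."""


def resolve_constraint_states(specified=()):
    """Merge user-specified keywords over the defaults, rejecting
    unsupported or unknown keywords."""
    states = dict(DEFAULT_STATES)
    for raw in specified:
        keyword = raw.strip().upper()
        if keyword in UNSUPPORTED:
            raise AnalysisError(f"{keyword} is not supported")
        if keyword not in KEYWORD_CATEGORY:
            raise AnalysisError(f"Unknown constraint state: {keyword}")
        states[KEYWORD_CATEGORY[keyword]] = keyword
    return states


# Accepted: matches the first CREATE TABLE statement.
print(resolve_constraint_states(["DISABLE", "NOVALIDATE", "RELY"]))

# Rejected: matches the second CREATE TABLE statement.
try:
    resolve_constraint_states(["ENABLE", "VALIDATE", "RELY"])
except AnalysisError as err:
    print("analysis error:", err)
```

Omitting the states entirely yields the defaults, which mirrors the
"default true" annotations in the list of supported states.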
Notes:
- toSql support is not fully functional. Observability changes like showing
PK/FK information in DESCRIBE output will be done separately.
- Retrieval of primary keys and foreign keys is currently not supported
in Local Catalog Mode.
Tests:
Added tests to:
- AnalyzeDDLTest#TestCreateTable
- ParserTest#TestCreateTable
- ToSqlTest#TestCreateTable
- Built against both Hive-2 and Hive-3
Change-Id: Id03d8d4d41a2ac1b15e7060e2a013e334d044ee7
Reviewed-on: http://gerrit.cloudera.org:8080/14592
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Implement deferrable and optionally enforced PK/FK constraints
> --------------------------------------------------------------
>
> Key: IMPALA-3531
> URL: https://issues.apache.org/jira/browse/IMPALA-3531
> Project: IMPALA
> Issue Type: New Feature
> Components: Catalog, Frontend, Perf Investigation
> Affects Versions: Impala 2.5.0, Impala 2.6.0
> Environment: CDH
> Reporter: Ruslan Dautkhanov
> Assignee: Anurag Mantripragada
> Priority: Critical
> Labels: CBO, performance, ramp-up, sql-language
>
> Oracle has a "RELY NOVALIDATE" option for constraints. It could be easier for
> Hive to start with something like that for PK/FK constraints, so that the CBO
> has more information for optimizations. It does not have to actually check
> whether the constraint relationship is true; it can just "rely" on that constraint.
> https://docs.oracle.com/database/121/SQLRF/clauses002.htm#sthref2289
> So it would be helpful with join cardinality estimates, and with cases like
> IMPALA-2929.
> https://docs.oracle.com/database/121/DWHSG/schemas.htm#DWHSG9053
> "Overview of Constraint States":
> - Enforcement
> - Validation
> - Belief
> So FK/PK with "rely novalidate" will have Enforcement and Validation disabled
> but Belief = RELY, as is possible in Oracle and now in Hive (HIVE-13076).
> It opens up many additional ways to optimize execution plans,
> as explained in Tom Kyte's "Metadata matters":
> http://www.peoug.org/wp-content/uploads/2009/12/MetadataMatters_PEOUG_Day2009_TKyte.pdf
> pp.30 - "Tell us how the tables relate and we can remove them from the
> plan...".
> pp.35 - "Tell us how the tables relate and we have more access paths
> available...".
> It might also be helpful when Impala is integrated with Kudu, as the
> latter has to have a PK.