This is an automated email from the ASF dual-hosted git repository.
abukor pushed a commit to branch branch-1.17.x
in repository https://gitbox.apache.org/repos/asf/kudu.git
The following commit(s) were added to refs/heads/branch-1.17.x by this push:
new 476aaa995 KUDU-1945 Update docs with non-unique PK
476aaa995 is described below
commit 476aaa995f17605849ecebd646dcd4deee83dcbf
Author: Marton Greber <[email protected]>
AuthorDate: Wed Apr 26 18:42:41 2023 +0000
KUDU-1945 Update docs with non-unique PK
Added small update to cover non-unique primary key. For further info, I
added a link to the examples folder. Right now we only have the C++
example in place for non-unique PK, I plan to translate that example to
Java and Python as well.
Change-Id: I84e1c6b85d4fdb5ac95bad611246c071a63bcd31
Reviewed-on: http://gerrit.cloudera.org:8080/19809
Tested-by: Kudu Jenkins
Reviewed-by: Wenzhe Zhou <[email protected]>
Reviewed-by: Abhishek Chennaka <[email protected]>
Reviewed-on: http://gerrit.cloudera.org:8080/19924
Reviewed-by: Marton Greber <[email protected]>
Reviewed-by: Yingchun Lai <[email protected]>
---
docs/administration.adoc | 3 +++
docs/schema_design.adoc | 38 +++++++++++++++++++++++++++++++++++++-
2 files changed, 40 insertions(+), 1 deletion(-)
diff --git a/docs/administration.adoc b/docs/administration.adoc
index 7aad2731a..ed9a88727 100644
--- a/docs/administration.adoc
+++ b/docs/administration.adoc
@@ -317,6 +317,9 @@
link:https://spark.apache.org/docs/latest/#downloading[Spark documentation].
Additionally review the Apache Spark documentation for
link:https://spark.apache.org/docs/latest/submitting-applications.html[Submitting
Applications].
+NOTE: Restoring tables with non-unique primary keys/auto-incrementing columns is
+not supported currently.
+
==== Backing up tables
To backup one or more Kudu tables the `KuduBackup` Spark job can be used.
diff --git a/docs/schema_design.adoc b/docs/schema_design.adoc
index fc7800b4e..95d4d251c 100644
--- a/docs/schema_design.adoc
+++ b/docs/schema_design.adoc
@@ -234,9 +234,12 @@ or double type.
Once set during table creation, the set of columns in the primary key may not
be altered.
-Unlike an RDBMS, Kudu does not provide an auto-incrementing column feature,
+Unlike an RDBMS, Kudu does not provide an explicit auto-incrementing column feature,
so the application must always provide the full primary key during insert.
+Columns which do not satisfy the uniqueness constraint can still be used as primary keys, by
+specifying them as non-unique primary keys.
+
Row delete and update operations must also specify the full primary key of the
row to be changed. Kudu does not natively support range deletes or updates.
@@ -257,6 +260,39 @@ NOTE: Primary key indexing optimizations apply to scans on individual tablets.
See the <<partition-pruning>> section for details on how scans can use
predicates to skip entire tablets.
+[[non-unique_primary_keys]]
+=== Non-unique Primary Key Index
+
+While specifying columns as non-unique primary key, Kudu internally creates an auto-incrementing
+column. The specified columns and the auto-incrementing column form the effective primary key.
+
+NOTE: The auto-incrementing counter which is used to assign value for auto-incrementing column is
+managed by Kudu, the counter values are monotonically increasing per tablet.
+
+Non-unique primary key columns must be non-nullable, and may not be a boolean, float
+or double type.
+
+Once set during table creation, the set of columns in the non-unique primary key and the
+auto-incrementing column can not be altered.
+
+For inserts, one has to provide values for the non-unique primary key columns without specifying
+the values for auto-incrementing column. The auto-incrementing column is populated on the server
+side automatically.
+
+For updates/deletes the full set of key columns is necessary. One has to perform a scan before
+update/delete operation to get the auto-incrementing value.
+
+Upsert operation is not supported on tables with non-unique primary key.
+
+The non-unique primary key values of a column may not be updated after the row is inserted.
+However, the row may be deleted and re-inserted with the updated value, moreover a new
+auto-incrementing counter value is assigned during insertion for auto-incrementing column.
+
+Restoring tables with non-unique primary keys is not supported currently.
+
+For more details on how to use non-unique primary key, please check the
+link:https://github.com/apache/kudu/tree/master/examples[examples] folder.
+
[[Backfilling]]
=== Considerations for Backfill Inserts
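
Illustration (not part of the commit): the non-unique primary key semantics
added to schema_design.adoc above can be sketched with a toy in-memory model of
a single tablet. This is not the Kudu client API. The class name
NonUniqueKeyTable, its method names, and the counter starting at 1 are all
assumptions made for illustration. Only the behaviour comes from the text:
inserts omit the auto-incrementing value, updates and deletes need the full
effective key (obtained by a scan), and upsert is rejected.

```python
from itertools import count


class NonUniqueKeyTable:
    """Hypothetical stand-in for one tablet; NOT the Kudu client API."""

    def __init__(self):
        # Effective primary key: (user key, auto-incrementing value).
        self._rows = {}
        # Assumption: counter starts at 1. The docs only state that the
        # values are monotonically increasing per tablet.
        self._counter = count(1)

    def insert(self, key, value):
        # The caller supplies only the non-unique key columns; the
        # auto-incrementing value is assigned here, as on the server side.
        auto_inc = next(self._counter)
        self._rows[(key, auto_inc)] = value
        return auto_inc

    def scan(self, key):
        # Return (auto_inc, value) pairs for every row sharing the key.
        return sorted((k[1], v) for k, v in self._rows.items() if k[0] == key)

    def update(self, key, auto_inc, value):
        # The full effective key is required to address one row.
        if (key, auto_inc) not in self._rows:
            raise KeyError((key, auto_inc))
        self._rows[(key, auto_inc)] = value

    def delete(self, key, auto_inc):
        # Raises KeyError if the full effective key does not exist.
        del self._rows[(key, auto_inc)]

    def upsert(self, key, value):
        # Upsert is not supported on tables with a non-unique primary key.
        raise NotImplementedError("upsert is not supported")


table = NonUniqueKeyTable()
table.insert("sensor-a", 10)
table.insert("sensor-a", 20)          # same key, distinct row

# Scan first to learn the auto-incrementing values, then update/delete.
first_id, _ = table.scan("sensor-a")[0]
table.update("sensor-a", first_id, 11)
table.delete("sensor-a", first_id)

# Changing a key value means delete plus re-insert; the re-inserted row
# receives a fresh auto-incrementing value.
new_id = table.insert("sensor-b", 11)
```

In this model, as in the documented behaviour, two rows with the same
user-supplied key can only be told apart by the auto-incrementing value, which
is why a scan has to precede any update or delete.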