[jira] [Updated] (IGNITE-16591) DDL syntax COLOCATE BY

Andrey N. Gura (Jira) Mon, 21 Feb 2022 05:00:07 -0800


     [ 
https://issues.apache.org/jira/browse/IGNITE-16591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Andrey N. Gura updated IGNITE-16591:
------------------------------------
    Description: 
Add COLOCATED BY for CREATE TABLE syntaxis:
h2. Colocation by non-composite primary key

As a typical example, let’s assume we have the following schema:
{code:sql}
CREATE TABLE Accounts (
  account_id INT,
  name       VARCHAR,
  balance    DOUBLE,
  PRIMARY KEY (account_id)
)
{code}
{code:sql}
CREATE TABLE Transactions (
  tx_id      INT,
  account_id INT,
  amount     DOUBLE,
  PRIMARY KEY (tx_id)
)
{code}
Most of the operations on such a schema will be isolated within a single 
account. Examples:
1. Add a new transaction, and update the balance of the corresponding account.
2. Run through all the transactions for an account, and calculate the average 
amount.

To achieve better performance, we want each of these operations to execute on a 
single node. Therefore, we need to group all transactions belonging to the same 
account within the same Raft group (partition, range, etc). Corresponding 
records that represent the account itself should also reside in the same group.

To achieve this, we want to perform affinity mapping for the Transactions table 
using the account_id field (as opposed to the tx_id, as it would happen 
currently). We need to allow users to specify the field used for colocation. 
For example, like this:
{code:sql}
CREATE TABLE Transactions (
  tx_id      INT,
  account_id INT,
  amount     DOUBLE,
  PRIMARY KEY (tx_id, account_id)
)
COLOCATE BY (account_id)
{code}
NOTE: the affinity key must always be a part of the primary key. Otherwise, an 
exception is thrown. 
NOTE: Primary key columns, and therefore affinity key columns, MUST NOT permit 
NULL values according to the SQL standard.

Colocation by composite primary key

In case of using the composite primary key the tables also could be colocated. 
For example:
{code:sql}
CREATE TABLE Orders (
  order_id     INT,
  department_id INT,
  create_date   TIMESTAMP,
  PRIMARY KEY (order_id, department_id)
)
{code}
{code:sql}
CREATE TABLE OrderItems (
  order_item_id      INT,
  order_id          INT,
  department_id      INT,
  amount              INT,
  PRIMARY KEY (order_item_id)
)
{code}
For such data model the OrderItems table could be colocated with the Orders 
table in the following way:
{code:sql}
CREATE TABLE OrderItems (
  order_item_id      INT,
  order_id          INT,
  department_id      INT,
  amount              INT,
  PRIMARY KEY (order_item_id, order_id, department_id)
)
COLOCATE BY (order_id, department_id)
{code}
Note: COLOCATE BY clause of colocated table (OrderItems table in the example 
above) must contain the same set of columns and in the same order as PRIMARY 
KEY clause of the main table (Orders table in the example above) in order to 
properly colocate the data.

  was:
Add COLOCATED BY for CREATE TABLE syntaxis:
Colocate by could contains list of  columns (one or more).
E.g.:

 
{code:java}
CREATE TABLE OrderItems (
  order_item_id      INT,
  order_id       INT,
  department_id      INT,
  amount           INT,
  PRIMARY KEY (order_item_id, order_id, department_id)
)
COLOCATE BY (order_id, department_id)
{code}
 


> DDL syntax COLOCATE BY
> ----------------------
>
>                 Key: IGNITE-16591
>                 URL: https://issues.apache.org/jira/browse/IGNITE-16591
>             Project: Ignite
>          Issue Type: Improvement
>          Components: sql
>            Reporter: Yury Gerzhedovich
>            Priority: Major
>              Labels: ignite-3
>
> Add COLOCATED BY for CREATE TABLE syntaxis:
> h2. Colocation by non-composite primary key
> As a typical example, let’s assume we have the following schema:
> {code:sql}
> CREATE TABLE Accounts (
>   account_id INT,
>   name       VARCHAR,
>   balance    DOUBLE,
>   PRIMARY KEY (account_id)
> )
> {code}
> {code:sql}
> CREATE TABLE Transactions (
>   tx_id      INT,
>   account_id INT,
>   amount     DOUBLE,
>   PRIMARY KEY (tx_id)
> )
> {code}
> Most of the operations on such a schema will be isolated within a single 
> account. Examples:
> 1. Add a new transaction, and update the balance of the corresponding account.
> 2. Run through all the transactions for an account, and calculate the average 
> amount.
> To achieve better performance, we want each of these operations to execute on 
> a single node. Therefore, we need to group all transactions belonging to the 
> same account within the same Raft group (partition, range, etc). 
> Corresponding records that represent the account itself should also reside in 
> the same group.
> To achieve this, we want to perform affinity mapping for the Transactions 
> table using the account_id field (as opposed to the tx_id, as it would happen 
> currently). We need to allow users to specify the field used for colocation. 
> For example, like this:
> {code:sql}
> CREATE TABLE Transactions (
>   tx_id      INT,
>   account_id INT,
>   amount     DOUBLE,
>   PRIMARY KEY (tx_id, account_id)
> )
> COLOCATE BY (account_id)
> {code}
> NOTE: the affinity key must always be a part of the primary key. Otherwise, 
> an exception is thrown. 
> NOTE: Primary key columns, and therefore affinity key columns, MUST NOT 
> permit NULL values according to the SQL standard.
> Colocation by composite primary key
> In case of using the composite primary key the tables also could be 
> colocated. For example:
> {code:sql}
> CREATE TABLE Orders (
>   order_id     INT,
>   department_id INT,
>   create_date   TIMESTAMP,
>   PRIMARY KEY (order_id, department_id)
> )
> {code}
> {code:sql}
> CREATE TABLE OrderItems (
>   order_item_id      INT,
>   order_id          INT,
>   department_id      INT,
>   amount              INT,
>   PRIMARY KEY (order_item_id)
> )
> {code}
> For such data model the OrderItems table could be colocated with the Orders 
> table in the following way:
> {code:sql}
> CREATE TABLE OrderItems (
>   order_item_id      INT,
>   order_id          INT,
>   department_id      INT,
>   amount              INT,
>   PRIMARY KEY (order_item_id, order_id, department_id)
> )
> COLOCATE BY (order_id, department_id)
> {code}
> Note: COLOCATE BY clause of colocated table (OrderItems table in the example 
> above) must contain the same set of columns and in the same order as PRIMARY 
> KEY clause of the main table (Orders table in the example above) in order to 
> properly colocate the data.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (IGNITE-16591) DDL syntax COLOCATE BY

Reply via email to