GitHub user vovik0134 added a comment to the discussion: [Ideas] Description of 
backward-incompatible changes between Greenplum 6 and Cloudberry in the 
documentation.

A large block of backward-incompatible changes related to partitioning.

## Changes in partitioning mechanisms

### Requirement to include all partition keys in unique indexes

Unlike Greenplum 6, in Cloudberry a unique index must include **all** partition 
keys.

Example of a query that works in Greenplum 6:

```sql
CREATE TABLE sales (
    id int NOT NULL,
    region text,
    amount decimal(10, 2)
)
    DISTRIBUTED BY (id)
    PARTITION BY LIST (region)
    (
        VALUES ('Asia'),
        VALUES ('Europe')
    );

INSERT INTO sales (id, region, amount)
VALUES
    (1, 'Asia', 100.00),
    (1, 'Europe', 150.00),
    (2, 'Asia', 200.00);

CREATE UNIQUE INDEX sales_id_idx ON sales (id);
```

The same query does not work in Cloudberry:

```sql
CREATE TABLE sales (
    id int NOT NULL,
    region text,
    amount decimal(10, 2)
)
    DISTRIBUTED BY (id)
    PARTITION BY LIST (region)
    (
        VALUES ('Asia'),
        VALUES ('Europe')
    );

INSERT INTO sales (id, region, amount)
VALUES
    (1, 'Asia', 100.00),
    (1, 'Europe', 150.00),
    (2, 'Asia', 200.00);

CREATE UNIQUE INDEX sales_id_idx ON sales (id);
ERROR:  unique constraint on partitioned table must include all partitioning 
columns
DETAIL:  UNIQUE constraint on table "sales" lacks column "region" which is part 
of the partition key.
```

**How to detect**

All unique indexes that do not include **all** partition keys can be found with 
the following query:

```sql
WITH partitions AS (
SELECT DISTINCT n.nspname, c.relname, c.oid, p.paratts
FROM pg_partition p
    JOIN pg_class c ON p.parrelid = c.oid
    JOIN pg_namespace n ON n.oid = c.relnamespace
)
SELECT p.nspname, p.relname, ir.relname AS indexname
FROM pg_index i
    JOIN partitions p ON p.oid = i.indrelid
    JOIN pg_class ir ON ir.oid = i.indexrelid
WHERE (i.indisunique OR i.indisprimary)
    AND NOT (p.paratts <@ i.indkey);
```

**How to fix**

Recreate the found indexes, adding the partition keys.

### Limitation on list partitioning by multiple columns

In Cloudberry, list partitioning by multiple columns is implemented through 
subpartitioning: the standard Cloudberry syntax assumes one column or 
expression at each level.

This limitation is related to the fact that list-based partitioning by multiple 
columns, for example `PARTITION BY LIST (col1, col2)`, often makes it 
impossible to use the GPORCA optimizer.

The following example works in Greenplum 6:

```sql
CREATE TABLE example (a int, b int, c int)
    PARTITION BY LIST (b, c)
    (
        DEFAULT PARTITION other
    );
```

But it does not work in Cloudberry:

```sql
CREATE TABLE example (a int, b int, c int)
    PARTITION BY LIST (b, c)
    (
        DEFAULT PARTITION other
    );
ERROR:  cannot use "list" partition strategy with more than one column
```

**How to detect**

Tables with list-based partitioning by multiple columns can be found with the 
following query:

```sql
SELECT n.nspname, c.relname
FROM pg_partition p
    JOIN pg_class c ON p.parrelid = c.oid
    JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE parkind = 'l'
    AND parnatts > 1;
```

**How to fix**

In Cloudberry, subpartitioning should be used for this:

```sql
CREATE TABLE example (a int, b int, c int)
PARTITION BY LIST (b)
    SUBPARTITION BY LIST (c)
    SUBPARTITION TEMPLATE (
        DEFAULT SUBPARTITION other_c
    )
    (
        DEFAULT PARTITION other
    );
```

### Support for `FOR (RANK(...))` in partition DDL has been removed

Cloudberry does not support the `FOR (RANK(...))` construct when working with 
partitions.

Working example for Greenplum 6:

```sql
CREATE TABLE sales (
    id int,
    date date,
    amount decimal(10, 2)
)
WITH (appendoptimized = true, compresstype = zstd, compresslevel = 1)
DISTRIBUTED BY (id)
PARTITION BY RANGE (date) (
    START ('2026-01-01') END ('2027-01-01') EVERY (INTERVAL '1 month')
);

ALTER TABLE sales
    ALTER PARTITION FOR (RANK(2))
    TRUNCATE PARTITION FOR (RANK(2));
```

Does not work in Cloudberry:

```sql
CREATE TABLE sales (
    id int,
    date date,
    amount decimal(10, 2)
)
WITH (appendoptimized = true, compresstype = zstd, compresslevel = 1)
DISTRIBUTED BY (id)
PARTITION BY RANGE (date) (
    START ('2026-01-01') END ('2027-01-01') EVERY (INTERVAL '1 month')
);

ALTER TABLE sales
    ALTER PARTITION FOR (RANK(2))
    TRUNCATE PARTITION FOR (RANK(2));
ERROR:  addressing partition by RANK is no longer supported
HINT:  Use partition name or FOR (<partition key value>) instead.
```

**How to detect**

Check the application codebase and scripts for uses of the `FOR (RANK(...))` 
construct.

**How to fix**

Replace `FOR (RANK(...))` with `FOR (VALUE ...)` or `FOR VALUES`.

### Inheritance of storage parameters for AO tables has changed

If an AO parameter is not explicitly specified for a partition, the behavior 
differs:

* in Greenplum 6, the default value is applied;
* in Cloudberry, the parameter is inherited from the parent table.

The following query works in Greenplum 6:

```sql
CREATE TABLE sales
(
    id int,
    date date,
    region text,
    amount decimal(10, 2)
)
    WITH (appendoptimized = true, compresstype = zstd, compresslevel = 1)
    DISTRIBUTED BY (id)
    PARTITION BY LIST (region)
        (
        PARTITION asia VALUES ('Asia')
            WITH (appendoptimized = true, compresslevel = 0),
        PARTITION europe VALUES ('Europe')
        );
```

But it does not work in Cloudberry:

```sql
CREATE TABLE sales
(
    id int,
    date date,
    region text,
    amount decimal(10, 2)
)
    WITH (appendoptimized = true, compresstype = zstd, compresslevel = 1)
    DISTRIBUTED BY (id)
    PARTITION BY LIST (region)
        (
        PARTITION asia VALUES ('Asia')
            WITH (appendoptimized = true, compresslevel = 0),
        PARTITION europe VALUES ('Europe')
        );
ERROR:  compresstype "zstd" can't be used with compresslevel 0
```

**How to detect**

Partitions created without explicit options can be found with the following 
query:

```sql
SELECT
    schemaname,
    tablename,
    partitionschemaname,
    partitiontablename,
    par.reloptions AS parent_options,
    c.reloptions AS child_options,
    po AS parent_option
FROM pg_partitions p
    JOIN pg_class par ON par.oid = format('%s.%s', schemaname, 
tablename)::regclass
    JOIN pg_class c ON c.oid = format('%s.%s', partitionschemaname, 
partitiontablename)::regclass
    JOIN unnest(par.reloptions) po ON TRUE
    LEFT JOIN unnest(c.reloptions) co ON split_part(po, '=', 1) = 
split_part(co, '=', 1)
WHERE co IS NULL;
```

**How to fix**

Create a copy of the problematic partition with explicitly specified options, 
move the data into it, and replace the original partition with the updated one.

### Limitations on `EXCLUSIVE` / `INCLUSIVE` boundaries for `float` / `numeric` 
/ `text` types

Unlike Greenplum 6, in Cloudberry the `START EXCLUSIVE` / `END INCLUSIVE` 
constructs work only for types with the `+` operator, such as `integer` and 
`timestamp`, but not for `float`, `numeric`, or `text`.

Correct example for Greenplum 6:

```sql
CREATE TABLE sales
(
    id int,
    date date,
    amount decimal(10, 2)
)
    WITH (appendoptimized = true)
    DISTRIBUTED BY (id)
    PARTITION BY RANGE (amount)
        (
        START (0) EXCLUSIVE
            END (1000)
            EVERY (250)
        );
```

Does not work in Cloudberry:

```sql
CREATE TABLE sales
(
    id int,
    date date,
    amount decimal(10, 2)
)
    WITH (appendoptimized = true)
    DISTRIBUTED BY (id)
    PARTITION BY RANGE (amount)
        (
        START (0) EXCLUSIVE
            END (1000)
            EVERY (250)
        );
ERROR:  START EXCLUSIVE not supported for partition key data type: numeric
HINT:  Specify an inclusive START value and remove the EXCLUSIVE keyword
```

**How to detect**

Tables with `START EXCLUSIVE` / `END INCLUSIVE` on unsupported types can be 
found with the following query:

```sql
SELECT n.nspname, c.relname, t.typname, pr.parchildrelid::regclass
FROM pg_partition p
    JOIN pg_partition_rule pr ON p.oid = pr.paroid
    JOIN pg_class c ON p.parrelid = c.oid
    JOIN pg_namespace n ON n.oid = c.relnamespace
    JOIN pg_attribute a ON a.attrelid = c.oid AND a.attnum = ANY(p.paratts)
    JOIN pg_type t ON t.oid = a.atttypid
WHERE t.typname IN ('text', 'float8', 'float4', 'numeric')
    AND (NOT pr.parrangestartincl OR pr.parrangeendincl);
```

**How to fix**

Recreate the problematic tables without `EXCLUSIVE` / `INCLUSIVE` on these 
types.

### The level system for partitioned tables has changed

In Greenplum 6, immediate child partitions have `pg_partition.parlevel = 0`. In 
Cloudberry, immediate child partitions have `pg_partition_tree().level = 1`. 
The `pg_partition` view has been removed in Cloudberry, so the 
`pg_partition_tree()` function should be used instead.

This change may break queries that rely on partition levels.

**How to detect**

Check the application codebase and scripts for uses of partition levels.

**How to fix**

Adapt the application codebase and scripts to account for the changed partition 
levels and the removal of the `pg_partition` view. Instead of `pg_partition`, 
use the `pg_partition_tree()` function or the `pg_partitioned_table` view.

### Partition constraints have been moved out of `pg_constraint`

In Greenplum 6, partition boundaries are displayed as `CHECK CONSTRAINT` 
entries in the `pg_constraint` system view.

In Cloudberry, partition boundaries are no longer `CHECK CONSTRAINT` entries 
and are not displayed in `pg_constraint`. Similar information can now be 
obtained from `gp_toolkit.gp_partitions`.

**How to detect**

Check the application codebase and scripts for uses of `pg_constraint` to work 
with partition constraints.

**How to fix**

Use the `gp_toolkit.gp_partitions` view instead of `pg_constraint` to work with 
partition constraints.

GitHub link: 
https://github.com/apache/cloudberry/discussions/1816#discussioncomment-17260313

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to