GitHub user vovik0134 added a comment to the discussion: [Ideas] Description of
backward-incompatible changes between Greenplum 6 and Cloudberry in the
documentation.
A large block of backward-incompatible changes related to partitioning.
## Changes in partitioning mechanisms
### Requirement to include all partition keys in unique indexes
Unlike Greenplum 6, in Cloudberry a unique index must include **all** partition
keys.
Example of a query that works in Greenplum 6:
```sql
CREATE TABLE sales (
id int NOT NULL,
region text,
amount decimal(10, 2)
)
DISTRIBUTED BY (id)
PARTITION BY LIST (region)
(
VALUES ('Asia'),
VALUES ('Europe')
);
INSERT INTO sales (id, region, amount)
VALUES
(1, 'Asia', 100.00),
(1, 'Europe', 150.00),
(2, 'Asia', 200.00);
CREATE UNIQUE INDEX sales_id_idx ON sales (id);
```
The same query does not work in Cloudberry:
```sql
CREATE TABLE sales (
id int NOT NULL,
region text,
amount decimal(10, 2)
)
DISTRIBUTED BY (id)
PARTITION BY LIST (region)
(
VALUES ('Asia'),
VALUES ('Europe')
);
INSERT INTO sales (id, region, amount)
VALUES
(1, 'Asia', 100.00),
(1, 'Europe', 150.00),
(2, 'Asia', 200.00);
CREATE UNIQUE INDEX sales_id_idx ON sales (id);
ERROR: unique constraint on partitioned table must include all partitioning
columns
DETAIL: UNIQUE constraint on table "sales" lacks column "region" which is part
of the partition key.
```
**How to detect**
All unique indexes that do not include **all** partition keys can be found with
the following query:
```sql
WITH partitions AS (
SELECT DISTINCT n.nspname, c.relname, c.oid, p.paratts
FROM pg_partition p
JOIN pg_class c ON p.parrelid = c.oid
JOIN pg_namespace n ON n.oid = c.relnamespace
)
SELECT p.nspname, p.relname, ir.relname AS indexname
FROM pg_index i
JOIN partitions p ON p.oid = i.indrelid
JOIN pg_class ir ON ir.oid = i.indexrelid
WHERE (i.indisunique OR i.indisprimary)
AND NOT (p.paratts <@ i.indkey);
```
**How to fix**
Recreate the found indexes, adding the partition keys.
### Limitation on list partitioning by multiple columns
In Cloudberry, list partitioning by multiple columns is implemented through
subpartitioning: the standard Cloudberry syntax assumes one column or
expression at each level.
This limitation is related to the fact that list-based partitioning by multiple
columns, for example `PARTITION BY LIST (col1, col2)`, often makes it
impossible to use the GPORCA optimizer.
The following example works in Greenplum 6:
```sql
CREATE TABLE example (a int, b int, c int)
PARTITION BY LIST (b, c)
(
DEFAULT PARTITION other
);
```
But it does not work in Cloudberry:
```sql
CREATE TABLE example (a int, b int, c int)
PARTITION BY LIST (b, c)
(
DEFAULT PARTITION other
);
ERROR: cannot use "list" partition strategy with more than one column
```
**How to detect**
Tables with list-based partitioning by multiple columns can be found with the
following query:
```sql
SELECT n.nspname, c.relname
FROM pg_partition p
JOIN pg_class c ON p.parrelid = c.oid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE parkind = 'l'
AND parnatts > 1;
```
**How to fix**
In Cloudberry, subpartitioning should be used for this:
```sql
CREATE TABLE example (a int, b int, c int)
PARTITION BY LIST (b)
SUBPARTITION BY LIST (c)
SUBPARTITION TEMPLATE (
DEFAULT SUBPARTITION other_c
)
(
DEFAULT PARTITION other
);
```
### Support for `FOR (RANK(...))` in partition DDL has been removed
Cloudberry does not support the `FOR (RANK(...))` construct when working with
partitions.
Working example for Greenplum 6:
```sql
CREATE TABLE sales (
id int,
date date,
amount decimal(10, 2)
)
WITH (appendoptimized = true, compresstype = zstd, compresslevel = 1)
DISTRIBUTED BY (id)
PARTITION BY RANGE (date) (
START ('2026-01-01') END ('2027-01-01') EVERY (INTERVAL '1 month')
);
ALTER TABLE sales
ALTER PARTITION FOR (RANK(2))
TRUNCATE PARTITION FOR (RANK(2));
```
Does not work in Cloudberry:
```sql
CREATE TABLE sales (
id int,
date date,
amount decimal(10, 2)
)
WITH (appendoptimized = true, compresstype = zstd, compresslevel = 1)
DISTRIBUTED BY (id)
PARTITION BY RANGE (date) (
START ('2026-01-01') END ('2027-01-01') EVERY (INTERVAL '1 month')
);
ALTER TABLE sales
ALTER PARTITION FOR (RANK(2))
TRUNCATE PARTITION FOR (RANK(2));
ERROR: addressing partition by RANK is no longer supported
HINT: Use partition name or FOR (<partition key value>) instead.
```
**How to detect**
Check the application codebase and scripts for uses of the `FOR (RANK(...))`
construct.
**How to fix**
Replace `FOR (RANK(...))` with `FOR (VALUE ...)` or `FOR VALUES`.
### Inheritance of storage parameters for AO tables has changed
If an AO parameter is not explicitly specified for a partition, the behavior
differs:
* in Greenplum 6, the default value is applied;
* in Cloudberry, the parameter is inherited from the parent table.
The following query works in Greenplum 6:
```sql
CREATE TABLE sales
(
id int,
date date,
region text,
amount decimal(10, 2)
)
WITH (appendoptimized = true, compresstype = zstd, compresslevel = 1)
DISTRIBUTED BY (id)
PARTITION BY LIST (region)
(
PARTITION asia VALUES ('Asia')
WITH (appendoptimized = true, compresslevel = 0),
PARTITION europe VALUES ('Europe')
);
```
But it does not work in Cloudberry:
```sql
CREATE TABLE sales
(
id int,
date date,
region text,
amount decimal(10, 2)
)
WITH (appendoptimized = true, compresstype = zstd, compresslevel = 1)
DISTRIBUTED BY (id)
PARTITION BY LIST (region)
(
PARTITION asia VALUES ('Asia')
WITH (appendoptimized = true, compresslevel = 0),
PARTITION europe VALUES ('Europe')
);
ERROR: compresstype "zstd" can't be used with compresslevel 0
```
**How to detect**
Partitions created without explicit options can be found with the following
query:
```sql
SELECT
schemaname,
tablename,
partitionschemaname,
partitiontablename,
par.reloptions AS parent_options,
c.reloptions AS child_options,
po AS parent_option
FROM pg_partitions p
JOIN pg_class par ON par.oid = format('%s.%s', schemaname,
tablename)::regclass
JOIN pg_class c ON c.oid = format('%s.%s', partitionschemaname,
partitiontablename)::regclass
JOIN unnest(par.reloptions) po ON TRUE
LEFT JOIN unnest(c.reloptions) co ON split_part(po, '=', 1) =
split_part(co, '=', 1)
WHERE co IS NULL;
```
**How to fix**
Create a copy of the problematic partition with explicitly specified options,
move the data into it, and replace the original partition with the updated one.
### Limitations on `EXCLUSIVE` / `INCLUSIVE` boundaries for `float` / `numeric`
/ `text` types
Unlike Greenplum 6, in Cloudberry the `START EXCLUSIVE` / `END INCLUSIVE`
constructs work only for types with the `+` operator, such as `integer` and
`timestamp`, but not for `float`, `numeric`, or `text`.
Correct example for Greenplum 6:
```sql
CREATE TABLE sales
(
id int,
date date,
amount decimal(10, 2)
)
WITH (appendoptimized = true)
DISTRIBUTED BY (id)
PARTITION BY RANGE (amount)
(
START (0) EXCLUSIVE
END (1000)
EVERY (250)
);
```
Does not work in Cloudberry:
```sql
CREATE TABLE sales
(
id int,
date date,
amount decimal(10, 2)
)
WITH (appendoptimized = true)
DISTRIBUTED BY (id)
PARTITION BY RANGE (amount)
(
START (0) EXCLUSIVE
END (1000)
EVERY (250)
);
ERROR: START EXCLUSIVE not supported for partition key data type: numeric
HINT: Specify an inclusive START value and remove the EXCLUSIVE keyword
```
**How to detect**
Tables with `START EXCLUSIVE` / `END INCLUSIVE` on unsupported types can be
found with the following query:
```sql
SELECT n.nspname, c.relname, t.typname, pr.parchildrelid::regclass
FROM pg_partition p
JOIN pg_partition_rule pr ON p.oid = pr.paroid
JOIN pg_class c ON p.parrelid = c.oid
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_attribute a ON a.attrelid = c.oid AND a.attnum = ANY(p.paratts)
JOIN pg_type t ON t.oid = a.atttypid
WHERE t.typname IN ('text', 'float8', 'float4', 'numeric')
AND (NOT pr.parrangestartincl OR pr.parrangeendincl);
```
**How to fix**
Recreate the problematic tables without `EXCLUSIVE` / `INCLUSIVE` on these
types.
### The level system for partitioned tables has changed
In Greenplum 6, immediate child partitions have `pg_partition.parlevel = 0`. In
Cloudberry, immediate child partitions have `pg_partition_tree().level = 1`.
The `pg_partition` view has been removed in Cloudberry, so the
`pg_partition_tree()` function should be used instead.
This change may break queries that rely on partition levels.
**How to detect**
Check the application codebase and scripts for uses of partition levels.
**How to fix**
Adapt the application codebase and scripts to account for the changed partition
levels and the removal of the `pg_partition` view. Instead of `pg_partition`,
use the `pg_partition_tree()` function or the `pg_partitioned_table` view.
### Partition constraints have been moved out of `pg_constraint`
In Greenplum 6, partition boundaries are displayed as `CHECK CONSTRAINT`
entries in the `pg_constraint` system view.
In Cloudberry, partition boundaries are no longer `CHECK CONSTRAINT` entries
and are not displayed in `pg_constraint`. Similar information can now be
obtained from `gp_toolkit.gp_partitions`.
**How to detect**
Check the application codebase and scripts for uses of `pg_constraint` to work
with partition constraints.
**How to fix**
Use the `gp_toolkit.gp_partitions` view instead of `pg_constraint` to work with
partition constraints.
GitHub link:
https://github.com/apache/cloudberry/discussions/1816#discussioncomment-17260313
----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]