Unsurprisingly ALTER TABLE worked alter table t1_p1 add primary key(a); ALTER TABLE postgres=# \d+ t1_p1 Table "public.t1_p1" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- a | integer | | not null | | plain | | b | integer | | | | plain | | Indexes: "t1_p1_pkey" PRIMARY KEY, btree (a) Inherits: t1
On Thu, Nov 24, 2016 at 11:05 AM, Ashutosh Bapat <ashutosh.ba...@enterprisedb.com> wrote: > I am trying to create a partitioned table with primary keys on the > partitions. Here's the corresponding syntax as per documentation > CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ > IF NOT EXISTS ] table_name > PARTITION OF parent_table [ ( > { column_name WITH OPTIONS [ column_constraint [ ... ] ] > | table_constraint } > [, ... ] > ) ] partition_bound_spec > > IIUC, it should allow "create table t1_p1 partition of t1 (a primary > key) ...", (a primary key) is nothing but "column_name > column_constraint", but here's what happens > create table t1_p1 partition of t1 (a primary key) for values from (0) to > (100); > ERROR: syntax error at or near "primary" > LINE 1: create table t1_p1 partition of t1 (a primary key) for value... > > The same syntax also suggests using table_constraints but that too doesn't > work > create table t1_p1 partition of t1 (primary key (a) ) for values > from (0) to (100); > ERROR: inherited relation "t1" is not a table or foreign table > > of course t1 is a table, what it isn't? > > Am I missing something? How do I define constraints on the partitions? > > > On Tue, Nov 22, 2016 at 2:45 PM, Amit Langote > <langote_amit...@lab.ntt.co.jp> wrote: >> >> Updated patches attached. I merged what used to be 0006 and 0007 into one. >> >> On 2016/11/19 2:23, Robert Haas wrote: >>> On Fri, Nov 18, 2016 at 5:59 AM, Amit Langote >>> <langote_amit...@lab.ntt.co.jp> wrote: >>>> Oh but wait, that means I can insert rows with NULLs in the range >>>> partition key if I choose to insert it directly into the partition, >>>> whereas I have been thinking all this while that there could never be >>>> NULLs in the partition key of a range partition. What's more, >>>> get_qual_for_partbound() (patch 0003) emits a IS NOT NULL constraint for >>>> every partition key column in case of a range partition. Is that >>>> wrongheaded altogether? (also see my reply to your earlier message about >>>> NULLs in the range partition key) >>> >>> The easiest thing to do might be to just enforce that all of the >>> partition key columns have to be not-null when the range-partitioned >>> table is defined, and reject any attempt to DROP NOT NULL on them >>> later. That's probably better that shoehorning it into the table >>> constraint. >> >> Agreed that forcing range partitioning columns to be NOT NULL during table >> creation would be a better approach. But then we would have to reject >> using expressions in the range partition key, right? >> >>>> Thanks for the idea below! >>>> >>>>> 1. Forget the idea of a tree. Instead, let the total number of tables >>>>> in the partitioning hierarchy be N and let the number of those that >> >> [ ... ] >> >>>>> >>>>> No recursion, minimal pointer chasing, no linked lists. The whole >>>>> thing is basically trivial aside from the cost of >>>>> list/range_partition_for_tuple itself; optimizing that is a different >>>>> project. I might have some details slightly off here, but hopefully >>>>> you can see what I'm going for: you want to keep the computation that >>>>> happens in get_partition_for_tuple() to an absolute minimum, and >>>>> instead set things up in advance so that getting the partition for a >>>>> tuple is FAST. And you want the data structures that you are using in >>>>> that process to be very compact, hence arrays instead of linked lists. >>>> >>>> This sounds *much* better. Here is a quick attempt at coding the design >>>> you have outlined above in the attached latest set of patches. >>> >>> That shrank both 0006 and 0007 substantially, and it should be faster, >>> too. I bet you can shrink them further: >> >> Some changes described below have reduced the size to a certain degree. >> >>> >>> - Why is PartitionKeyExecInfo a separate structure and why does it >>> have a NodeTag? I bet you can dump the node tag, merge it into >>> PartitionDispatch, and save some more code and some more >>> pointer-chasing. >> >> OK, I merged the fields of what used to be PartitionKeyExecInfo into >> PartitionDispatchData as the latter's new fields key and keystate. >> >>> - I still think it's a seriously bad idea for list partitioning and >>> range partitioning to need different code-paths all over the place >>> here. List partitions support nulls but not multi-column partitioning >>> keys and range partitions support multi-column partitioning keys but >>> not nulls, but you could use an internal structure that supports both. >>> Then you wouldn't need partition_list_values_bsearch and also >>> partition_rbound_bsearch; you could have one kind of bound structure >>> that can be bsearch'd for either list or range. You might even be >>> able to unify list_partition_for_tuple and range_partition_for_tuple >>> although that looks a little harder. In either case, you bsearch for >>> the greatest value <= the value you have. The only difference is that >>> for list partitioning, you have to enforce at the end that it is an >>> equal value, whereas for range partitioning less-than-or-equal-to is >>> enough. But you should still be able to arrange for more code >>> sharing. >> >> I have considered these suggestions in the latest patch. Now instead of >> PartitionListInfo, PartitionRangeInfo, and BoundCollectionData structs, >> there is only one PartitionBoundInfo which consolidates the partition >> bound information of a partitioned table. Some of the fields are >> applicable only to one of list or range case; for example, null-accepting >> list partition index, infinite status of individual range datums. >> >> Also, there is now only one binary search function named >> partition_bound_bsearch() which invokes a comparison function named >> partition_bound_cmp(). The former searches a probe (a partition bound or >> tuple) within a PartitionBoundInfo, which is passed all the way down to >> the comparison function. >> >> Also, we no longer have list_partition_for_tuple() and >> range_partition_for_tuple(). Instead, in get_partition_for_tuple() >> itself, there is a bsearch followed by list and range partitioning >> specific steps based on the returned offset. >> >>> - I don't see why you need the bound->lower stuff any more. If >>> rangeinfo.bounds[offset] is a lower bound for a partition, then >>> rangeinfo.bounds[offset+1] is either (a) the upper bound for that >>> partition and the partition is followed by a "gap" or (b) both the >>> upper bound for that partition and the lower bound for the next >>> partition. With the inclusive/exclusive bound stuff gone, every range >>> bound has the same sense: if the probed value is <= the bound then >>> we're supposed to be a lower-numbered partition, but if > then we're >>> supposed to be in this partition or a higher-numbered one. >> >> OK, I've managed to get rid of lower. At least it is no longer kept in >> the new relcache struct PartitionBoundInfo. It is still kept in >> PartitionRangeBound which is used to hold individual range bounds when >> sorting them (during relcache build). Comparisons invoked during the >> aforementioned sorting step still need to distinguish between lower and >> upper bounds (such that '1)' < '[1'). >> >> Tuple-routing no longer needs to look at lower. In that case, what you >> described above applies. >> >> As a result, one change became necessary: to how we flag individual range >> bound datum as infinite or not. Previously, it was a regular Boolean >> value (either infinite or not) and to distinguish +infinity from >> -infinity, we looked at whether the bound is lower or upper (the lower >> flag). Now, instead, the variable holding the status of individual range >> bound datum is set to a ternary value: RANGE_DATUM_FINITE (0), >> RANGE_DATUM_NEG_INF (1), and RANGE_DATUM_POS_INF (2), which still fits in >> a bool. Upon encountering an infinite range bound datum, whether it's >> negative or positive infinity derives the comparison result. Consider the >> following example: >> >> partition p1 from (1, unbounded) to (1, 1); >> partition p2 from (1, 1) to (1, 10); >> partition p3 from (1, 10) to (1, unbounded); >> partition p4 from (2, unbounded) to (2, 1); >> ... so on >> >> In this case, we need to be able to conclude, say, (1, -inf) < (1, 15) < >> (1, +inf), so that tuple (1, 15) is assigned to the proper partition. >> >> Does this last thing sound reasonable? >> >> Thanks, >> Amit > > > > -- > Best Wishes, > Ashutosh Bapat > EnterpriseDB Corporation > The Postgres Database Company -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers