Re: SQL:2011 application time
On 5/13/24 03:11, Peter Eisentraut wrote:
> It looks like we missed some of these fundamental design questions early on, and it might be too late now to fix them for PG17. For example, the discussion on unique constraints misses that the question of null values in unique constraints itself is controversial and that there is now a way to change the behavior. So I imagine there is also a selection of possible behaviors you might want for empty ranges. Intuitively, I don't think empty ranges are sensible for temporal unique constraints. But anyway, it's a bit late now to be discussing this.
>
> I'm also concerned that if ranges have this fundamental incompatibility with periods, then the plan to eventually evolve this patch set to support standard periods will also have as-yet-unknown problems. Some of these issues might be design flaws in the underlying mechanisms, like range types and exclusion constraints. Like, if you're supposed to use this for scheduling but you can use empty ranges to bypass exclusion constraints, how is one supposed to use this? Yes, a check constraint using isempty() might be the right answer. But I don't see this documented anywhere.
>
> On the technical side, adding an implicit check constraint as part of a primary key constraint is quite a difficult implementation task, as I think you are discovering. I'm just reminded about how the patch for catalogued not-null constraints struggled with linking these not-null constraints to primary keys correctly. This sounds a bit similar.
>
> I'm afraid that these issues cannot be resolved in good time for this release, so we should revert this patch set for now.

I think reverting is a good idea. I'm not really happy with the CHECK constraint solution either. I'd be happy to have some more time to rework this for v18.

A couple alternatives I'd like to explore:

1. Domain constraints instead of a CHECK constraint. I think this is probably worse, and I don't plan to spend much time on it, but I thought I'd mention it in case someone else thought otherwise.

2. A slightly different overlaps operator, say &&&, where 'empty' &&& 'empty' is true. But 'empty' with anything else could still be false (or not). That operator would prevent duplicates in an exclusion constraint. This also means we could support more types than just ranges & multiranges. I need to think about whether this combines badly with existing operators, but if not it has a lot of promise. If anything it might be *less* contradictory, because it fits better with 'empty' @> 'empty', which we say is true.

Another thing a revert would give me some time to consider: even though it's not standard syntax, I wonder if we want to require syntax something like `PRIMARY KEY USING gist (id, valid_at WITHOUT OVERLAPS)`. Everywhere else we default to btree, so defaulting to gist feels a little weird. In theory we could even someday support WITHOUT OVERLAPS with btree, if we taught that AM to answer that question. (I admit there is probably not a lot of desire for that though.)

Yours,

--
Paul ~{:-)
p...@illuminatedcomputing.com
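For readers following along, the isempty() workaround Peter mentions can be sketched in SQL like this (an illustrative fragment with invented table names, not something from the patch set):

```sql
-- Range-operator behavior driving this edge case:
--   SELECT 'empty'::int4range && 'empty'::int4range;  -- false (empty overlaps nothing)
--   SELECT 'empty'::int4range @> 'empty'::int4range;  -- true  (every range contains empty)

CREATE TABLE reservations (
    room_id  int4range,
    valid_at daterange,
    CONSTRAINT reservations_pk PRIMARY KEY (room_id, valid_at WITHOUT OVERLAPS),
    -- Explicitly forbid the special 'empty' value, so the temporal key
    -- cannot be bypassed by rows whose range overlaps nothing:
    CONSTRAINT valid_at_not_empty CHECK (NOT isempty(valid_at))
);

-- Without the CHECK constraint, both of these rows would be accepted:
-- INSERT INTO reservations VALUES ('[1,2)', 'empty'), ('[1,2)', 'empty');
```

This is the manual version of the implicit constraint the patch set tried to add automatically.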
Re: SQL:2011 application time
On 5/12/24 08:51, Paul Jungwirth wrote:
> On 5/12/24 05:55, Matthias van de Meent wrote:
>>>> pg=# CREATE UNIQUE INDEX ON temporal_testing USING gist (id, valid_during);
>>>> ERROR: access method "gist" does not support unique indexes
>>>
>>> To me that error message seems correct. The programmer hasn't said anything about the special temporal behavior they are looking for.
>>
>> But I showed that I had a GIST index that does have the indisunique flag set, which shows that GIST does support indexes with unique semantics. That I can't use CREATE UNIQUE INDEX to create such an index doesn't mean the feature doesn't exist, which is what the error message implies.
>
> True, the error message is not really telling the truth anymore. I do think most people who hit this error are not thinking about temporal constraints at all though, and for non-temporal constraints it is still true. It's also true for CREATE INDEX, since WITHOUT OVERLAPS is only available on the *constraint*. So how about adding a hint, something like this?:
>
>     ERROR: access method "gist" does not support unique indexes
>     HINT: To create a unique constraint with non-overlap behavior, use ADD CONSTRAINT ... WITHOUT OVERLAPS.

I thought a little more about eventually implementing WITHOUT OVERLAPS support for CREATE INDEX, and how it relates to this error message in particular. Even when that is done, it will still depend on the stratnum support function for the keys' opclasses, so the GiST AM itself will still have amcanunique set to false, I believe. So the existing error message is probably still the right one. The hint won't need to mention ADD CONSTRAINT anymore, but it should still point users to WITHOUT OVERLAPS, and possibly the stratnum support function too. I think what we are doing for v17 is all compatible with that plan.

Yours,

--
Paul ~{:-)
p...@illuminatedcomputing.com
Re: SQL:2011 application time
On 5/5/24 20:01, jian he wrote:
> hi. I hope I understand the problem correctly. my understanding is that we are trying to solve a corner case:
>
> create table t(a int4range, b int4range, primary key(a, b WITHOUT OVERLAPS));
> insert into t values ('[1,2]','empty'), ('[1,2]','empty');
>
> I think the entry point is ATAddCheckNNConstraint and index_create. in a chain of DDL commands, you cannot be sure which one (primary key constraint or check constraint) is being created first; you just want to make sure that after both constraints are created, you add a dependency between the primary key and the check constraint. so you need to validate in different functions (ATAddCheckNNConstraint, index_create) that these two constraints are indeed created; only after that do we have a dependency linking these two constraints.
>
> I've attached a patch trying to solve this problem. the patch is not totally polished, but works as expected, and also has lots of comments.

Thanks for this! I've incorporated it into the CHECK constraint patch with some changes. In particular I thought index_create was a strange place to change the conperiod value of a pg_constraint record, and it is not actually needed if we are copying that value correctly.

Some other comments on the patch file:

> N.B. we also need to take special care for the case where the check constraint was re-added, e.g. by ALTER TYPE. if ALTER TYPE is altering the PERIOD column of the primary key, altering a column of the primary key makes the index recreate and the check constraint recreate; however, the former internally also includes adding a check constraint. so we need to take care of merging the two check constraints.

This is a good point. I've included tests for this based on your patch.

> N.B. the check constraint name is hard-wired, so if you create a constraint with the same name, a PERIOD primary key cannot be created.

Yes, it may be worth doing something like other auto-named constraints and trying to avoid duplicates.
I haven't taken that on yet; I'm curious what others have to say about it.

> N.B. what about UNIQUE constraint?

See my previous posts on this thread about allowing 'empty' in UNIQUE constraints.

> N.B. seems ok to not care about FOREIGN KEY regarding this corner case?

Agreed.

v3 patches attached, rebased to 3ca43dbbb6.

Yours,

--
Paul ~{:-)
p...@illuminatedcomputing.com

From 4f4428fb41ea79056a13e425826fdac9c7b5d349 Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth"
Date: Tue, 2 Apr 2024 15:39:04 -0700
Subject: [PATCH v3 1/2] Don't treat WITHOUT OVERLAPS indexes as unique in planner

Because the special rangetype 'empty' never overlaps another value, it is possible for WITHOUT OVERLAPS tables to have two rows with the same key, despite being indisunique, if the application-time range is 'empty'. So to be safe we should not treat WITHOUT OVERLAPS indexes as unique in any proofs.

This still needs a test, but I'm having trouble finding a query that gives wrong results.
---
 src/backend/optimizer/path/indxpath.c     | 5 +++--
 src/backend/optimizer/plan/analyzejoins.c | 6 +++---
 src/backend/optimizer/util/plancat.c      | 1 +
 src/include/nodes/pathnodes.h             | 2 ++
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index c0fcc7d78df..72346f78ebe 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -3498,13 +3498,14 @@ relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,

 		/*
 		 * If the index is not unique, or not immediately enforced, or if it's
-		 * a partial index, it's useless here.  We're unable to make use of
+		 * a partial index, or if it's a WITHOUT OVERLAPS index (so not
+		 * literally unique), it's useless here.  We're unable to make use of
 		 * predOK partial unique indexes due to the fact that
 		 * check_index_predicates() also makes use of join predicates to
 		 * determine if the partial index is usable. Here we need proofs that
 		 * hold true before any joins are evaluated.
 		 */
-		if (!ind->unique || !ind->immediate || ind->indpred != NIL)
+		if (!ind->unique || !ind->immediate || ind->indpred != NIL || ind->hasperiod)
 			continue;

 		/*

diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index c3fd4a81f8a..dc8327d5769 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -814,8 +814,8 @@ rel_supports_distinctness(PlannerInfo *root, RelOptInfo *rel)
 		 * For a plain relation, we only know how to prove uniqueness by
 		 * reference to unique indexes.  Make sure there's at least one
 		 * suitable unique index.  It must be immediately enforced, and not a
-		 * partial index. (Keep these conditions in sync with
-		 * relation_has_unique_index_for!)
+		 * partial index, and not WITHOUT OVERLAPS (Keep these conditions
+		 * in sync with relation_has_unique_index_for!)
Re: SQL:2011 application time
On 5/12/24 05:55, Matthias van de Meent wrote:
>>> pg=# CREATE UNIQUE INDEX ON temporal_testing USING gist (id, valid_during);
>>> ERROR: access method "gist" does not support unique indexes
>>
>> To me that error message seems correct. The programmer hasn't said anything about the special temporal behavior they are looking for.
>
> But I showed that I had a GIST index that does have the indisunique flag set, which shows that GIST does support indexes with unique semantics. That I can't use CREATE UNIQUE INDEX to create such an index doesn't mean the feature doesn't exist, which is what the error message implies.

True, the error message is not really telling the truth anymore. I do think most people who hit this error are not thinking about temporal constraints at all though, and for non-temporal constraints it is still true. It's also true for CREATE INDEX, since WITHOUT OVERLAPS is only available on the *constraint*. So how about adding a hint, something like this?:

    ERROR: access method "gist" does not support unique indexes
    HINT: To create a unique constraint with non-overlap behavior, use ADD CONSTRAINT ... WITHOUT OVERLAPS.

Yours,

--
Paul ~{:-)
p...@illuminatedcomputing.com
Re: SQL:2011 application time
On 5/9/24 17:44, Matthias van de Meent wrote:
> I haven't really been following this thread, but after playing around a bit with the feature I feel there are new gaps in error messages. I also think there are gaps in the functionality regarding the (lack of) support for CREATE UNIQUE INDEX, and attaching these indexes to constraints.

Thank you for trying this out and sharing your thoughts! I think these are good points about CREATE UNIQUE INDEX and then creating the constraint by handing it an existing index. This is something that I am hoping to add, but it's not covered by the SQL:2011 standard, so I think it needs some discussion, and I don't think it needs to go into v17.

For instance you are saying:

> pg=# CREATE UNIQUE INDEX ON temporal_testing USING gist (id, valid_during);
> ERROR: access method "gist" does not support unique indexes

To me that error message seems correct. The programmer hasn't said anything about the special temporal behavior they are looking for. To get non-overlapping semantics from an index, this more explicit syntax seems better, similar to PKs in the standard:

> pg=# CREATE UNIQUE INDEX ON temporal_testing USING gist (id, valid_during WITHOUT OVERLAPS);
> ERROR: access method "gist" does not support unique indexes

We could also support *non-temporal* unique GiST indexes, particularly now that we have the stratnum support function. Those would use the syntax you gave, omitting WITHOUT OVERLAPS. But that seems like a separate effort to me.

> Additionally, because I can't create my own non-constraint-backing unique GIST indexes, I can't pre-create my unique constraints CONCURRENTLY as one could do for the non-temporal case: UNIQUE constraints hold ownership of the index and would drop the index if the constraint is dropped, too, and don't support a CONCURRENTLY modifier, nor an INVALID modifier. This means temporal unique constraints have much less administrative wiggle room than normal unique constraints, and I think that's not great.

This is a great use-case for why we should support this eventually, even if it uses non-standard syntax.

Yours,

--
Paul ~{:-)
p...@illuminatedcomputing.com
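For comparison, the non-temporal "administrative wiggle room" Matthias describes relies on building the index separately and then attaching it to a constraint, roughly like this (a sketch with an invented table):

```sql
-- Step 1: build the unique index without blocking concurrent writes.
CREATE UNIQUE INDEX CONCURRENTLY accounts_email_idx ON accounts (email);

-- Step 2: promote the finished index to a constraint; no long lock, no rebuild.
ALTER TABLE accounts
    ADD CONSTRAINT accounts_email_key UNIQUE USING INDEX accounts_email_idx;
```

Since a WITHOUT OVERLAPS index can't yet be created standalone with CREATE UNIQUE INDEX, temporal constraints have no equivalent of this two-step path.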
Re: SQL:2011 application time
On 5/11/24 17:00, jian he wrote:
> I hope I understand the problem correctly. my understanding is that we are trying to solve a corner case:
>
> create table t(a int4range, b int4range, primary key(a, b WITHOUT OVERLAPS));
> insert into t values ('[1,2]','empty'), ('[1,2]','empty');
>
> but we still do not yet address cases like:
>
> create table t10(a int4range, b int4range, unique (a, b WITHOUT OVERLAPS));
> insert into t10 values ('[1,2]','empty'), ('[1,2]','empty');
>
> one table can have more than one temporal unique constraint; for each temporal unique constraint, adding a check isempty constraint seems not easy.

I think we should add the not-empty constraint only for PRIMARY KEYs, not all UNIQUE constraints. The empty edge case is very similar to the NULL edge case, and while every PK column must be non-null, we do allow nulls in ordinary UNIQUE constraints. If users want to have 'empty' in those constraints, I think we should let them. And then the problems you give don't arise.

> Maybe we can just mention that the special 'empty' range value makes temporal unique constraints not "unique".

Just documenting the behavior is also an okay solution here, I think. I see two downsides though: (1) it makes rangetype temporal keys differ from PERIOD temporal keys, and (2) it could allow more planner/etc. bugs than we have thought of. So I think it's worth adding the constraint instead.

> also we can make sure that FOREIGN KEY can only reference primary keys, not unique temporal constraints, so the implication that unique temporal constraints are not "unique" is limited. I played around with it; we can error out in these cases in the function transformFkeyCheckAttrs.

I don't think it is a problem to reference a temporal UNIQUE constraint, even if it contains empty values. An empty value means you're not asserting that row at any time (though another row might assert the same thing for some time), so it could never contribute toward fulfilling a reference anyway.

I do think it would be nice if the *reference* could contain empty values. Right now the FK SQL will cause that to never match, because we use `&&` as an optimization, but we could tweak the SQL (maybe for v18 instead) so that users could get away with that kind of thing. As I said in an earlier email, this would give you an escape hatch to reference a temporal table from a non-temporal table. Otherwise temporal tables are "contagious," which is a bit of a drawback.

Yours,

--
Paul ~{:-)
p...@illuminatedcomputing.com
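The NULL analogy above can be seen with ordinary (non-temporal) constraints (illustrative only):

```sql
-- UNIQUE permits NULLs, and by default two NULLs don't conflict:
CREATE TABLE u (x int UNIQUE);
INSERT INTO u VALUES (NULL), (NULL);  -- both rows accepted

-- A PRIMARY KEY column, by contrast, must be non-null:
CREATE TABLE p (x int PRIMARY KEY);
INSERT INTO p VALUES (NULL);  -- ERROR: violates not-null constraint
```

The proposal treats 'empty' the same way: forbidden in a temporal PRIMARY KEY (via the implicit not-empty constraint), but permitted in a temporal UNIQUE constraint, where 'empty' duplicates are allowed just as duplicate NULLs are.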
Re: SQL:2011 application time
Here are a couple new patches, rebased to e305f715, addressing Peter's feedback. I'm still working on integrating jian he's suggestions for the last patch, so I've omitted that one here.

On 5/8/24 06:51, Peter Eisentraut wrote:
> About v2-0001-Fix-ON-CONFLICT-DO-NOTHING-UPDATE-for-temporal-in.patch, I think the ideas are right, but I wonder if we can fine-tune the new conditionals a bit.
>
> --- a/src/backend/executor/execIndexing.c
> +++ b/src/backend/executor/execIndexing.c
> @@ -210,7 +210,7 @@ ExecOpenIndices(ResultRelInfo *resultRelInfo, bool speculative)
>   * If the indexes are to be used for speculative insertion, add extra
>   * information required by unique index entries.
>   */
> - if (speculative && ii->ii_Unique)
> + if (speculative && ii->ii_Unique && !ii->ii_HasWithoutOverlaps)
>   BuildSpeculativeIndexInfo(indexDesc, ii);
>
> Here, I think we could check !indexDesc->rd_index->indisexclusion instead. So we wouldn't need ii_HasWithoutOverlaps.

Okay. Or we could push this into BuildSpeculativeIndexInfo(); it could just skip the rest if an exclusion constraint is passed, on the theory that all the speculative index info is already present in that case. I like how BuildSpeculativeIndexInfo starts with an Assert that it's given a unique index, so I've left the check outside the function. This seems cleaner anyway: the function stays more focused.

> --- a/src/backend/optimizer/util/plancat.c
> +++ b/src/backend/optimizer/util/plancat.c
> @@ -815,7 +815,7 @@ infer_arbiter_indexes(PlannerInfo *root)
>   */
>   if (indexOidFromConstraint == idxForm->indexrelid)
>   {
> - if (!idxForm->indisunique && onconflict->action == ONCONFLICT_UPDATE)
> + if ((!idxForm->indisunique || idxForm->indisexclusion) && onconflict->action == ONCONFLICT_UPDATE)
>   ereport(ERROR,
>   (errcode(ERRCODE_WRONG_OBJECT_TYPE),
>   errmsg("ON CONFLICT DO UPDATE not supported with exclusion constraints")));
>
> Shouldn't this use only idxForm->indisexclusion anyway? Like
>
> + if (idxForm->indisexclusion && onconflict->action == ONCONFLICT_UPDATE)
>
> That matches what the error message is reporting afterwards.

Agreed.

>   * constraints), so index under consideration can be immediately
>   * skipped if it's not unique
>   */
> - if (!idxForm->indisunique)
> + if (!idxForm->indisunique || idxForm->indisexclusion)
>   goto next;
>
> Maybe here we need a comment. Or make that a separate statement, like

Yes, that is nice. Done.

Yours,

--
Paul ~{:-)
p...@illuminatedcomputing.com

From 0ebe2e6d6cd8dc6f8120fe93b9024cf80472f8cc Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth"
Date: Sun, 24 Mar 2024 21:46:30 -0700
Subject: [PATCH v2 1/2] Fix ON CONFLICT DO NOTHING/UPDATE for temporal indexes

A PRIMARY KEY or UNIQUE constraint with WITHOUT OVERLAPS will be a GiST index, not a B-Tree, but it will still have indisunique set. The code for ON CONFLICT fails if it sees a non-btree index that has indisunique. This commit fixes that and adds some tests.

But now that we can't just test indisunique, we also need some extra checks to prevent DO UPDATE from running against a WITHOUT OVERLAPS constraint (because the conflict could happen against more than one row, and we'd only update one).
---
 src/backend/catalog/index.c                | 1 +
 src/backend/executor/execIndexing.c        | 2 +-
 src/backend/optimizer/util/plancat.c       | 9 +-
 src/include/nodes/execnodes.h              | 1 +
 .../regress/expected/without_overlaps.out  | 176 ++
 src/test/regress/sql/without_overlaps.sql  | 113 +++
 6 files changed, 300 insertions(+), 2 deletions(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 5a8568c55c9..1fd543cc550 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2443,6 +2443,7 @@ BuildIndexInfo(Relation index)
 							   &ii->ii_ExclusionOps,
 							   &ii->ii_ExclusionProcs,
 							   &ii->ii_ExclusionStrats);
+		ii->ii_HasWithoutOverlaps = ii->ii_Unique;
 	}

 	return ii;

diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c
index 9f05b3654c1..59acf67a36a 100644
--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -210,7 +210,7 @@ ExecOpenIndices(ResultRelInfo *resultRelInfo, bool speculative)
 	 * If the indexes are to be used for speculative insertion, add extra
 	 * information required by unique index entries.
 	 */
-	if (speculative && ii->ii_Unique)
+	if (speculative && ii->ii_Unique && !indexDesc->rd_index->indisexclusion)
 		BuildSpeculativeIndexInfo(indexDesc, ii);

 	relationDescs[i] = indexDesc;

diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 130f838629f..775c3e26cd8 100644
---
Re: PERIOD foreign key feature
On 5/8/24 07:44, Bruce Momjian wrote:
> On Wed, May 8, 2024 at 02:29:34PM +0200, Peter Eisentraut wrote:
>>> Yes, David is correct here on all points. I like his suggestion to clarify the language here also. If you need a patch from me let me know, but I assume it's something a committer can just make happen?
>>
>> In principle yes, but it's also very helpful if someone produces an actual patch file, with complete commit message, credits, mailing list link, etc.
>
> I am ready to do the work, but waited a day for Peter to reply, since he was the author of the text.

Here is a patch for this.

Yours,

--
Paul ~{:-)
p...@illuminatedcomputing.com

From d97a60b7e56c57a242668609c8fb82e6a6a32506 Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth"
Date: Wed, 8 May 2024 20:41:05 -0700
Subject: [PATCH] Reword docs for temporal PKs with implicit referent

The old docs had confusing language about "one side" of the foreign key, which could mean either the referenced primary key itself or the REFERENCES clause of the FK declaration.

Author: David G. Johnston
Discussion: https://www.postgresql.org/message-id/ZjpApuq8I9DE5Elv%40momjian.us
---
 doc/src/sgml/ref/create_table.sgml | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 02f31d2d6fd..75f06bc49cc 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1184,11 +1184,13 @@ WITH ( MODULUS numeric_literal, REM
       referent for its entire duration. This column must be a range or
       multirange type. In addition, the referenced table must have a primary
       key or unique constraint declared with WITHOUT
-      OVERLAPS. Finally, if one side of the foreign key uses
-      PERIOD, the other side must too. If the refcolumn list is omitted, the
-      WITHOUT OVERLAPS part of the primary key is treated
-      as if marked with PERIOD.
+      OVERLAPS. Finally, if the foreign key has a PERIOD
+      column_name specification
+      the corresponding refcolumn,
+      if present, must also be marked PERIOD. If the
+      refcolumn clause is omitted,
+      and thus the reftable's primary key constraint chosen, the primary key
+      must have its final column marked WITHOUT OVERLAPS.
--
2.42.0
Re: PERIOD foreign key feature
On 5/7/24 08:23, David G. Johnston wrote:
> On Tue, May 7, 2024 at 7:54 AM Bruce Momjian <br...@momjian.us> wrote:
>> In the two marked lines, it says "if one side of the foreign key uses PERIOD, the other side must too." However, looking at the example queries, it seems like if the foreign side has PERIOD, the primary side must have WITHOUT OVERLAPS, not PERIOD. Does this doc text need correcting?
>
> The text is factually correct, though a bit hard to parse. "the other side" refers to the part after "REFERENCES":
>
> FOREIGN KEY ( column_name [, ... ] [, PERIOD column_name ] ) REFERENCES reftable [ ( refcolumn [, ... ] [, PERIOD column_name ] ) ]
>
> ***(shouldn't the second occurrence be [, PERIOD refcolumn] ?)
>
> The text is pointing out that since the refcolumn specification is optional, you may very well not see a second PERIOD keyword in the clause. Instead it will be inferred from the PK. Maybe:
>
> Finally, if the foreign key has a PERIOD column_name specification the corresponding refcolumn, if present, must also be marked PERIOD. If the refcolumn clause is omitted, and thus the reftable's primary key constraint chosen, the primary key must have its final column marked WITHOUT OVERLAPS.

Yes, David is correct here on all points. I like his suggestion to clarify the language here also. If you need a patch from me let me know, but I assume it's something a committer can just make happen?

Yours,

--
Paul ~{:-)
p...@illuminatedcomputing.com
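To make the two forms concrete, here is a sketch of a temporal foreign key with and without the refcolumn list (table and column names invented for illustration):

```sql
CREATE TABLE employees (
    id       int4range,
    valid_at daterange,
    PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
);

-- Explicit form: PERIOD appears on both sides of REFERENCES.
CREATE TABLE positions (
    employee_id int4range,
    valid_at    daterange,
    FOREIGN KEY (employee_id, PERIOD valid_at)
        REFERENCES employees (id, PERIOD valid_at)
);

-- Implicit form: the refcolumn list is omitted, so the referenced
-- primary key is used, and its WITHOUT OVERLAPS column is treated
-- as if marked with PERIOD.
ALTER TABLE positions
    ADD FOREIGN KEY (employee_id, PERIOD valid_at) REFERENCES employees;
```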
Re: SQL:2011 application time
On 4/30/24 09:24, Robert Haas wrote:
> Peter, could you have a look at http://postgr.es/m/47550967-260b-4180-9791-b224859fe...@illuminatedcomputing.com and express an opinion about whether each of those proposals are (a) good or bad ideas and (b) whether they need to be fixed for the current release?

Here are the same patches but rebased. I've added a fourth which is my progress on adding the CHECK constraint. I don't really consider it finished though, because it has these problems:

- The CHECK constraint should be marked as an internal dependency of the PK, so that you can't drop it, and it gets dropped when you drop the PK. I don't see a good way to tie the two together though, so I'd appreciate any advice there. They are separate AlterTableCmds, so how do I get the ObjectAddress of both constraints at the same time? I wanted to store the PK's ObjectAddress on the Constraint node, but since ObjectAddress isn't a Node it doesn't work.

- The CHECK constraint should maybe be hidden when you say `\d foo`? Or maybe not, but that's what we do with FK triggers.

- When you create partitions you get a warning about the constraint already existing, because it gets created via the PK and then also the partitioning code tries to copy it. Solving the first issue here should solve this nicely though.

Alternately we could just fix the GROUP BY functional dependency code to only accept b-tree indexes. But I think the CHECK constraint approach is a better solution.

Thanks,

--
Paul ~{:-)
p...@illuminatedcomputing.com

From 0dbc008a654ab1fdc5f492345ee4575c352499d3 Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth"
Date: Sun, 24 Mar 2024 21:46:30 -0700
Subject: [PATCH v2 1/4] Fix ON CONFLICT DO NOTHING/UPDATE for temporal indexes

A PRIMARY KEY or UNIQUE constraint with WITHOUT OVERLAPS will be a GiST index, not a B-Tree, but it will still have indisunique set. The code for ON CONFLICT fails if it sees a non-btree index that has indisunique. This commit fixes that and adds some tests.

But now that we can't just test indisunique, we also need some extra checks to prevent DO UPDATE from running against a WITHOUT OVERLAPS constraint (because the conflict could happen against more than one row, and we'd only update one).
---
 src/backend/catalog/index.c                | 1 +
 src/backend/executor/execIndexing.c        | 2 +-
 src/backend/optimizer/util/plancat.c       | 4 +-
 src/include/nodes/execnodes.h              | 1 +
 .../regress/expected/without_overlaps.out  | 176 ++
 src/test/regress/sql/without_overlaps.sql  | 113 +++
 6 files changed, 294 insertions(+), 3 deletions(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 5a8568c55c9..1fd543cc550 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2443,6 +2443,7 @@ BuildIndexInfo(Relation index)
 							   &ii->ii_ExclusionOps,
 							   &ii->ii_ExclusionProcs,
 							   &ii->ii_ExclusionStrats);
+		ii->ii_HasWithoutOverlaps = ii->ii_Unique;
 	}

 	return ii;

diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c
index 9f05b3654c1..faa37ca56db 100644
--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -210,7 +210,7 @@ ExecOpenIndices(ResultRelInfo *resultRelInfo, bool speculative)
 	 * If the indexes are to be used for speculative insertion, add extra
 	 * information required by unique index entries.
 	 */
-	if (speculative && ii->ii_Unique)
+	if (speculative && ii->ii_Unique && !ii->ii_HasWithoutOverlaps)
 		BuildSpeculativeIndexInfo(indexDesc, ii);

 	relationDescs[i] = indexDesc;

diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 130f838629f..a398d7a78d1 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -815,7 +815,7 @@ infer_arbiter_indexes(PlannerInfo *root)
 	 */
 	if (indexOidFromConstraint == idxForm->indexrelid)
 	{
-		if (!idxForm->indisunique && onconflict->action == ONCONFLICT_UPDATE)
+		if ((!idxForm->indisunique || idxForm->indisexclusion) && onconflict->action == ONCONFLICT_UPDATE)
 			ereport(ERROR,
 					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 					 errmsg("ON CONFLICT DO UPDATE not supported with exclusion constraints")));
@@ -837,7 +837,7 @@ infer_arbiter_indexes(PlannerInfo *root)
 	 * constraints), so index under consideration can be immediately
 	 * skipped if it's not unique
 	 */
-	if (!idxForm->indisunique)
+	if (!idxForm->indisunique || idxForm->indisexclusion)
 		goto next;

 	/* Build BMS representation of plain (non expression) index attrs */

diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index d927ac44a82..fdfaef284e9 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -204,6 +204,7 @@ typedef struct IndexInfo
 	bool		ii_Summarizing;
 	int			ii_ParallelWorkers;
 	Oid			ii_Am;
+	bool
Re: SQL:2011 application time
On 4/26/24 12:25, Robert Haas wrote:
> I think this thread should be added to the open items list.

Thanks! I sent a request to pgsql-www to get edit permission. I didn't realize there was a wiki page tracking things like this. I agree it needs to be fixed if we want to include the feature.

Yours,

--
Paul ~{:-)
p...@illuminatedcomputing.com
Re: SQL:2011 application time
On 3/24/24 00:38, Peter Eisentraut wrote:
> I have committed the patches v33-0001-Add-temporal-FOREIGN-KEYs.patch and v33-0002-Support-multiranges-in-temporal-FKs.patch (together).

Hi Hackers,

I found some problems with temporal primary keys and the idea of uniqueness, especially around the indisunique column. Here are some small fixes and a proposal for a larger fix, which I think we need, but I'd like some feedback on.

The first patch fixes problems with ON CONFLICT DO NOTHING/UPDATE. DO NOTHING fails because it doesn't expect a non-btree unique index. It's fine to make it accept a temporal PRIMARY KEY/UNIQUE index though (i.e. an index with both indisunique and indisexclusion). This is no different than an exclusion constraint. So I skip BuildSpeculativeIndexInfo for WITHOUT OVERLAPS indexes. (Incidentally, AFAICT ii_UniqueOps is never used, only ii_UniqueProcs. Right?)

We should still forbid temporally-unique indexes for ON CONFLICT DO UPDATE, since there may be more than one row that conflicts. Ideally we would find and update all the conflicting rows, but we don't now, and we don't want to update just one:

postgres=# create table t (id int4range, valid_at daterange, name text, constraint tpk primary key (id, valid_at without overlaps));
CREATE TABLE
postgres=# insert into t values ('[1,2)', '[2000-01-01,2001-01-01)', 'a'), ('[1,2)', '[2001-01-01,2002-01-01)', 'b');
INSERT 0 2
postgres=# insert into t values ('[1,2)', '[2000-01-01,2002-01-01)', 'c') on conflict (id, valid_at) do update set name = excluded.name;
INSERT 0 1
postgres=# select * from t;
  id   |        valid_at         | name
-------+-------------------------+------
 [1,2) | [2001-01-01,2002-01-01) | b
 [1,2) | [2000-01-01,2001-01-01) | c
(2 rows)

So I also added code to prevent that. This is just preserving the old behavior for exclusion constraints, which was bypassed because of indisunique. All this is in the first patch.

That got me thinking about indisunique and where else it could cause problems.
Perhaps there are other places that assume only b-trees are unique. I couldn't find anywhere that just gives an error like ON CONFLICT, but I can imagine more subtle problems.

A temporal PRIMARY KEY or UNIQUE constraint is unique in at least three ways:

It is *metaphorically* unique: the conceit is that the scalar part is unique at every moment in time. You may have id 5 in your table more than once, as long as the records' application times don't overlap.

And it is *officially* unique: the standard calls these constraints unique. I think it is correct for us to report them as unique in pg_index.

But is it *literally* unique? Well, two identical keys, e.g. (5, '[Jan24,Mar24)') and (5, '[Jan24,Mar24)'), do have overlapping ranges, so the second is excluded. Normally a temporal unique index is *more* restrictive than a standard one, since it forbids other values too (e.g. (5, '[Jan24,Feb24)')). But sadly there is one exception: the ranges in these keys do not overlap: (5, 'empty'), (5, 'empty'). With ranges/multiranges, `'empty' && x` is false for all x. You can add that key as many times as you like, despite a PK/UQ constraint:

postgres=# insert into t values ('[1,2)', 'empty', 'foo'), ('[1,2)', 'empty', 'bar');
INSERT 0 2
postgres=# select * from t;
  id   | valid_at | name
-------+----------+------
 [1,2) | empty    | foo
 [1,2) | empty    | bar
(2 rows)

Cases like this shouldn't actually happen for temporal tables, since empty is not a meaningful value. An UPDATE/DELETE FOR PORTION OF would never cause an empty. But we should still make sure they don't cause problems.

One place we should avoid temporally-unique indexes is REPLICA IDENTITY. Fortunately we already do that, but patch 2 adds a test to keep it that way.

Uniqueness is an important property to the planner, too. We consider indisunique often for estimates, where it needn't be 100% true. Even if there are nullable columns or a non-indimmediate index, it still gives useful stats.
Duplicates from 'empty' shouldn't cause any new problems there.

In proof code we must be more careful. Patch 3 updates relation_has_unique_index_ext and rel_supports_distinctness to disqualify WITHOUT OVERLAPS indexes. Maybe that's more cautious than needed, but better safe than sorry. This patch has no new test though. I had trouble writing SQL that was wrong before its change. I'd be happy for help here!

Another problem is GROUP BY and functional dependencies. This is wrong:

postgres=# create table a (id int4range, valid_at daterange, name text, constraint apk primary key (id, valid_at without overlaps));
CREATE TABLE
postgres=# insert into a values ('[1,2)', 'empty', 'foo'), ('[1,2)', 'empty', 'bar');
INSERT 0 2
postgres=# select * from a group by id, valid_at;
  id   | valid_at | name
-------+----------+------
 [1,2) |
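Until there is a built-in answer, one way an application could guard against the 'empty' loophole described above is an explicit CHECK constraint. This is a sketch of a hypothetical workaround, not part of these patches; isempty() is the stock range function, and the WITHOUT OVERLAPS syntax assumes this patch series is applied:

```sql
-- Hypothetical guard: forbid 'empty' outright so the temporal PK
-- cannot be bypassed by non-overlapping empty ranges.
create table guarded (
  id int4range,
  valid_at daterange check (not isempty(valid_at)),
  name text,
  constraint guarded_pk primary key (id, valid_at without overlaps)
);
```

With this in place, the duplicate-'empty' inserts shown above would fail the CHECK instead of slipping past the uniqueness check.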
Re: altering a column's collation leaves an invalid foreign key
On 3/23/24 10:04, Paul Jungwirth wrote:
> Perhaps if the previous collation was nondeterministic we should force a re-check.

Here is a patch implementing this. It was a bit more fuss than I expected, so maybe someone has a better way.

We have had nondeterministic collations since v12, so perhaps it is something to back-patch?

Yours,

-- 
Paul ~{:-)
p...@illuminatedcomputing.com

From a8078f85859400f4e4cffb57d8ec1b069e46bfb8 Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth"
Date: Sun, 24 Mar 2024 23:37:48 -0700
Subject: [PATCH v1] Re-check foreign key if referenced collation was
 nondeterministic

With deterministic collations, we break ties by looking at binary
equality. But nondeterministic collations can allow non-identical
values to be considered equal, e.g. 'a' and 'A' when case-insensitive.
So some references that used to be valid may not be anymore.
---
 src/backend/commands/tablecmds.c              | 96 ++++++++++++++----
 src/include/nodes/parsenodes.h                |  2 +
 .../regress/expected/collate.icu.utf8.out     |  9 ++
 src/test/regress/sql/collate.icu.utf8.sql     |  8 ++
 4 files changed, 103 insertions(+), 12 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 71740984f33..940b1845edb 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -194,6 +194,7 @@ typedef struct AlteredTableInfo
 	/* Objects to rebuild after completing ALTER TYPE operations */
 	List	   *changedConstraintOids;	/* OIDs of constraints to rebuild */
 	List	   *changedConstraintDefs;	/* string definitions of same */
+	List	   *changedCollationOids;	/* collation of each attnum */
 	List	   *changedIndexOids;	/* OIDs of indexes to rebuild */
 	List	   *changedIndexDefs;	/* string definitions of same */
 	char	   *replicaIdentityIndex;	/* index to reset as REPLICA IDENTITY */
@@ -572,6 +573,7 @@ static ObjectAddress ATExecAlterColumnType(AlteredTableInfo *tab, Relation rel,
 										   AlterTableCmd *cmd, LOCKMODE lockmode);
 static void RememberAllDependentForRebuilding(AlteredTableInfo *tab,
 											  AlterTableType subtype,
 											  Relation rel, AttrNumber attnum,
 											  const char *colName);
+static void RememberCollationForRebuilding(AttrNumber attnum, AlteredTableInfo *tab);
 static void RememberConstraintForRebuilding(Oid conoid, AlteredTableInfo *tab);
 static void RememberIndexForRebuilding(Oid indoid, AlteredTableInfo *tab);
 static void RememberStatisticsForRebuilding(Oid stxoid, AlteredTableInfo *tab);
@@ -579,12 +581,12 @@ static void ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab,
 								   LOCKMODE lockmode);
 static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId,
 								 char *cmd, List **wqueue, LOCKMODE lockmode,
-								 bool rewrite);
+								 bool rewrite, List *collationOids);
 static void RebuildConstraintComment(AlteredTableInfo *tab, AlterTablePass pass,
 									 Oid objid,
 									 Relation rel, List *domname,
 									 const char *conname);
 static void TryReuseIndex(Oid oldId, IndexStmt *stmt);
-static void TryReuseForeignKey(Oid oldId, Constraint *con);
+static void TryReuseForeignKey(Oid oldId, Constraint *con, List *changedCollationOids);
 static ObjectAddress ATExecAlterColumnGenericOptions(Relation rel, const char *colName,
 													 List *options, LOCKMODE lockmode);
 static void change_owner_fix_column_acls(Oid relationOid,
@@ -9826,6 +9828,7 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
 	bool		old_check_ok;
 	ObjectAddress address;
 	ListCell   *old_pfeqop_item = list_head(fkconstraint->old_conpfeqop);
+	ListCell   *old_collation_item = list_head(fkconstraint->old_collations);

 	/*
 	 * Grab ShareRowExclusiveLock on the pk table, so that someone doesn't
@@ -10212,9 +10215,15 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
 		CoercionPathType new_pathtype;
 		Oid			old_castfunc;
 		Oid			new_castfunc;
+		Oid			old_collation;
 		Form_pg_attribute attr = TupleDescAttr(tab->oldDesc,
 											   fkattnum[i] - 1);

+		/* Get the collation for this key part. */
+		old_collation = lfirst_oid(old_collation_item);
+		old_collation_item = lnext(fkconstraint->old_collations,
+								   old_collation_item);
+
 		/*
 		 * Identify coercion pathways from each of the old and new FK-side
 		 * column types to the right (foreign) operand type of the pfeqop.
@@ -10250,9 +10259,12 @@
 		 * turn conform to the domain. Consequently, we need not treat
 		 * domains specially here.
 		 *
-		 * Since we require that all collations share the same notion of
-		 * equality (which they do, because texteq reduces to bitwise
-		 * equality), we don't compare collation here.
+		 * All deterministic collations use bitwise equality to resolve
+		 * ties, but if the previous collation was nonde
altering a column's collation leaves an invalid foreign key
Dear hackers,

I was looking at how foreign keys deal with collations, and I came across this comment about not re-checking a foreign key if the column type changes in a compatible way:

 * Since we require that all collations share the same notion of
 * equality (which they do, because texteq reduces to bitwise
 * equality), we don't compare collation here.

But now that we have nondeterministic collations, isn't that out of date? For instance I could make this foreign key:

paul=# create collation itext (provider = 'icu', locale = 'und-u-ks-level1', deterministic = false);
CREATE COLLATION
paul=# create table t1 (id text collate itext primary key);
CREATE TABLE
paul=# create table t2 (id text, parent_id text references t1);
CREATE TABLE

And then:

paul=# insert into t1 values ('a');
INSERT 0 1
paul=# insert into t2 values ('.', 'A');
INSERT 0 1

So far that behavior seems correct, because the user told us 'a' and 'A' were equivalent, but now I can change the collation on the referenced table and the FK doesn't complain:

paul=# alter table t1 alter column id type text collate "C";
ALTER TABLE

The constraint claims to be valid, but I can't drop & add it:

paul=# alter table t2 drop constraint t2_parent_id_fkey;
ALTER TABLE
paul=# alter table t2 add constraint t2_parent_id_fkey foreign key (parent_id) references t1;
ERROR:  insert or update on table "t2" violates foreign key constraint "t2_parent_id_fkey"
DETAIL:  Key (parent_id)=(A) is not present in table "t1".

Isn't that a problem? Perhaps if the previous collation was nondeterministic we should force a re-check.

(Tested on 17devel 697f8d266c and also 16.)

Yours,

-- 
Paul ~{:-)
p...@illuminatedcomputing.com
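For anyone wanting to audit an existing database for this, a manual anti-join can surface the rows the constraint no longer covers. This is a sketch using the tables above; the explicit COLLATE avoids an indeterminate-collation error between columns that now have different collations:

```sql
-- Sketch: list t2 rows whose parent_id has no match in t1 under
-- t1's new collation. With the data above this should return 'A'.
SELECT t2.parent_id
FROM t2
LEFT JOIN t1 ON t2.parent_id = t1.id COLLATE "C"
WHERE t1.id IS NULL;
```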
Re: SQL:2011 application time
On 2/13/24 21:00, jian he wrote:
> Hi, more minor issues.
>
> +		FindFKComparisonOperators(
> +			fkconstraint, tab, i, fkattnum,
> +			_check_ok, _pfeqop_item,
> +			pktypoid[i], fktypoid[i], opclasses[i],
> +			is_temporal, false,
> +			[i], [i], [i]);
> +	}
> +	if (is_temporal) {
> +		pkattnum[numpks] = pkperiodattnum;
> +		pktypoid[numpks] = pkperiodtypoid;
> +		fkattnum[numpks] = fkperiodattnum;
> +		fktypoid[numpks] = fkperiodtypoid;
> -		pfeqop = get_opfamily_member(opfamily, opcintype, fktyped,
> -									 eqstrategy);
> -		if (OidIsValid(pfeqop))
> -		{
> -			pfeqop_right = fktyped;
> -			ffeqop = get_opfamily_member(opfamily, fktyped, fktyped,
> -										 eqstrategy);
> -		}
> -		else
> -		{
> -			/* keep compiler quiet */
> -			pfeqop_right = InvalidOid;
> -			ffeqop = InvalidOid;
> -		}
> +		FindFKComparisonOperators(
> +			fkconstraint, tab, numpks, fkattnum,
> +			_check_ok, _pfeqop_item,
> +			pkperiodtypoid, fkperiodtypoid, opclasses[numpks],
> +			is_temporal, true,
> +			[numpks], [numpks], [numpks]);
> +		numfks += 1;
> +		numpks += 1;
> +	}
>
> The opening curly brace should be on the next line.

Fixed in v25 (submitted in my other email).

> Also, do you think it's a good idea to add the following in the `if (is_temporal)` branch?
>
>     Assert(OidIsValid(fkperiodtypoid) && OidIsValid(pkperiodtypoid));
>     Assert(OidIsValid(pkperiodattnum > 0 && fkperiodattnum > 0));
>
> In the `if (is_temporal)` branch, you can set the FindFKComparisonOperators 10th argument (is_temporal) to true, since you are already in the `if (is_temporal)` branch.
>
> Maybe we need some extra comments on
>
>     + numfks += 1;
>     + numpks += 1;
>
> since it might not be that evident?

That branch doesn't exist anymore. Same with the increments.

> Do you think listing arguments line by line (with good indentation) is a good format? Like:
>
>     FindFKComparisonOperators(fkconstraint, tab, i, fkattnum,
>                               _check_ok, _pfeqop_item,
>                               pktypoid[i], fktypoid[i], opclasses[i],
>                               false, false,
>                               [i], [i], [i]);

There are places we do that, but most code I've seen tries to fill the line.
I haven't followed that strictly here, but I'm trying to get better at doing what pgindent wants.

Yours,

-- 
Paul ~{:-)
p...@illuminatedcomputing.com
Re: automating RangeTblEntry node support
On 1/15/24 02:37, Peter Eisentraut wrote:
> In this updated patch set, I have also added the treatment of the Constraint type. (I also noted that the manual read/write functions for the Constraint type are out-of-sync again, so simplifying this would be really helpful.) I have also added commit messages to each patch.
>
> The way I have re-ordered the patch series now, I think patches 0001 through 0003 are candidates for inclusion after review, patch 0004 still needs a bit more analysis and testing, as described therein.

I had to apply the first patch by hand (on 9f13376396), so this looks due for a rebase. Patches 2-4 applied fine. Compiles & passes tests after each patch.

The overall idea seems like a good improvement to me.

A few remarks about cleaning up the RangeTblEntry comments:

After the fourth patch we have a "Fields valid in all RTEs" comment twice in the struct, once at the top and once at the bottom. It's fine IMO but maybe the second could be "More fields valid in all RTEs"?

The new order of fields in RangeTblEntry matches the intro comment, which seems like another small benefit.

It seems like we are moving away from ever putting RTEKind-specific fields into a union, as suggested by the FIXME comment here. It was written in 2002. Is it time to remove it?

This now needs to say "above" not "below":

    /*
     * join_using_alias is an alias clause attached directly to JOIN/USING. It
     * is different from the alias field (below) in that it does not hide the
     * range variables of the tables being joined.
     */
    Alias      *join_using_alias pg_node_attr(query_jumble_ignore);

Re bloating the serialization output, we could leave this last patch until after the work on that other thread is done to skip default-valued items.

Yours,

-- 
Paul ~{:-)
p...@illuminatedcomputing.com
Re: SQL:2011 application time
On 1/8/24 06:54, jian he wrote:
> On Fri, Jan 5, 2024 at 1:06 PM jian he wrote:
>> range_intersect returns the intersection of two ranges. I think here we are doing the opposite. Naming the main SQL function "range_not_intersect" and the internal function "range_not_intersect_internal" should be fine, so people don't need to understand the meaning of "portion".

Thank you for helping me figure out a name here! I realize that can be a bike-sheddy kind of discussion, so let me share some of my principles.

Range and multirange are highly mathematically "pure", and that's something I value in them. It makes them more general-purpose, less encumbered by edge cases, easier to combine, and easier to reason about. Preserving that close connection to math is a big goal.

What I've called `without_portion` is (like) a closed form of minus (hence `@-` for the operator). Minus isn't closed under everything (e.g. ranges), so `without_portion` adds arrays (much as to close subtraction we add negative numbers, and to close division we add rationals). We get the same effect from multiranges, but that only buys us range support. It would be awesome to support arbitrary types: ranges, multiranges, mdranges, boxes, polygons, inets, etc., so I think an array is the way to go here. And then each array element is a "leftover".

What do we call a closed form of minus that returns arrays? Using "not" suggests a function that returns true/false, but `@-` returns an array of things. So instead of "not" let's consider "complement". I think that's what you're expressing re intersection. But `@-` is not the same as the complement of intersection. For one thing, `@-` is not commutative. `old_range @- target_portion` is not the same as `target_portion @- old_range`. But `complement(old_range * target_portion)` *is* the same as `complement(target_portion * old_range)`.
Or from another angle: it's true that `old_range @- target_portion = old_range @- (old_range * target_portion)`, but the intersection isn't "doing" anything here. It's true that intersection and minus both "reduce" what you put in, but minus is more accurate.

So I think we want a name that captures that idea of "minus". Both "not" and "intersection" are misleading IMO. Of course "minus" is already taken (and you wouldn't expect it to return arrays anyway), which is why I'm thinking about names like "without" or "except". Or maybe "multi-minus".

I still think "without portion" is the closest to capturing everything above (and avoids ambiguity with other SQL operations). And the "portion" ties the operator to `FOR PORTION OF`, which is its purpose. But I wouldn't be surprised if there were something better.

Yours,

-- 
Paul ~{:-)
p...@illuminatedcomputing.com
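For comparison, here is how multiranges already close plain minus over ranges — the "same effect" mentioned above, limited to range support. A sketch with stock PostgreSQL types:

```sql
-- Range minus fails when the result would be discontiguous:
SELECT daterange('2018-01-01', '2020-01-01')
     - daterange('2018-06-01', '2019-06-01');
-- ERROR:  result of range difference would not be contiguous

-- Casting to multiranges closes the operation; the result holds the
-- two "leftover" pieces:
SELECT datemultirange(daterange('2018-01-01', '2020-01-01'))
     - datemultirange(daterange('2018-06-01', '2019-06-01'));
-- {[2018-01-01,2018-06-01),[2019-06-01,2020-01-01)}
```

An array-returning `@-` would generalize this leftover-collecting behavior beyond ranges.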
Re: Improve rowcount estimate for UNNEST(column)
Hello,

On 11/26/23 12:22, Tom Lane wrote:
> Yes, this regression test is entirely unacceptable; the numbers will not be stable enough. Even aside from the different-settings issue, you can't rely on ANALYZE deriving exactly the same stats every time. Usually what we try to do is devise a query where the plan shape changes because of the better estimate.

Here is a patch with an improved test. With the old "10" estimate we get a Merge Join, but now that the planner can see there are only ~4 elements per array, we get a Nested Loop.

It was actually hard to get a new plan, since all our regress tables' arrays have around 5 elements on average, which isn't so far from 10. Adding a table with 1- or 2-element arrays would be more dramatic. So I resorted to tuning the query with `WHERE seqno <= 50`. Hopefully that's not cheating too much. I thought about also adding a test where the old code *underestimates*, but then I'd have to add a new table with big arrays. If it's worth it let me know.

On 11/26/23 23:05, jian he wrote:
> I found using table array_op_test test more convincing.

True, arrtest is pretty small. The new test uses array_op_test instead.

On 11/29/23 20:35, jian he wrote:
> I did a minor change. change estimate_array_length return type to double

I'm not sure I want to change estimate_array_length from returning ints to returning doubles, since it's called in many places. I can see how that might give plans that are more accurate yet, so maybe it's worth it? It feels like it ought to be a separate patch to me, but if others want me to include it here please let me know. I did add clamp_row_est since it's essentially free and maybe gives some safety.

Rebased onto d43bd090a8.

Yours,

-- 
Paul ~{:-)
p...@illuminatedcomputing.com

From f8246ff806f4b3ec1441e515ada97662a8fe8967 Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth"
Date: Sat, 25 Nov 2023 09:12:52 -0800
Subject: [PATCH v2] Use statistics for estimating UNNEST(column) rows

Extends estimate_array_length to consult DECHIST statistics when called
with a Var. This improves planning with array_unnest_support (added in
a391ff3c3d41) and should give more accurate results for other callers
too. DECHIST has the average *distinct* element count, which isn't
exactly what we want, but it should still be an improvement over a
fixed 10.
---
 src/backend/optimizer/path/costsize.c    |  8 ++---
 src/backend/utils/adt/array_typanalyze.c |  1 +
 src/backend/utils/adt/arrayfuncs.c       |  2 +-
 src/backend/utils/adt/selfuncs.c         | 39 +++++++++++++++++-----
 src/include/utils/selfuncs.h             |  2 +-
 src/test/regress/expected/arrays.out     | 20 ++++++++++++
 src/test/regress/sql/arrays.sql          | 11 +++++++
 7 files changed, 70 insertions(+), 13 deletions(-)

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index d6ceafd51c..8bcdcc7fd8 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -1254,7 +1254,7 @@ cost_tidscan(Path *path, PlannerInfo *root,
 			ScalarArrayOpExpr *saop = (ScalarArrayOpExpr *) qual;
 			Node	   *arraynode = (Node *) lsecond(saop->args);

-			ntuples += estimate_array_length(arraynode);
+			ntuples += estimate_array_length(root, arraynode);
 		}
 		else if (IsA(qual, CurrentOfExpr))
 		{
@@ -4741,7 +4741,7 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 			Node	   *arraynode = (Node *) lsecond(saop->args);
 			QualCost	sacosts;
 			QualCost	hcosts;
-			int			estarraylen = estimate_array_length(arraynode);
+			int			estarraylen = estimate_array_length(context->root, arraynode);

 			set_sa_opfuncid(saop);
 			sacosts.startup = sacosts.per_tuple = 0;
@@ -4779,7 +4779,7 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 			 */
 			context->total.startup += sacosts.startup;
 			context->total.per_tuple += sacosts.per_tuple *
-				estimate_array_length(arraynode) * 0.5;
+				estimate_array_length(context->root, arraynode) * 0.5;
 		}
 	}
 	else if (IsA(node, Aggref) ||
@@ -4830,7 +4830,7 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 		context->total.startup += perelemcost.startup;
 		if (perelemcost.per_tuple > 0)
 			context->total.per_tuple += perelemcost.per_tuple *
-				estimate_array_length((Node *) acoerce->arg);
+				estimate_array_length(context->root, (Node *) acoerce->arg);
 	}
 	else if (IsA(node, RowCompareExpr))
 	{
diff --git a/src/backend/utils/adt/array_typanalyze.c b/src/backend/utils/adt/array_typanalyze.c
index 04b3764b68..a60d5fb5d2 100644
--- a/src/backend/utils/adt/array_typanalyze.c
+++ b/src/backend/utils/adt/array_typanalyze.c
@@ -604,6 +604,7 @@ compute_array_stats(VacAttrStats *stats, AnalyzeAttrFetchFunc fetchfunc,
 	/*
 	 * Prepare to fill stanumbers with the histogram, followed by the
 	 * average count. This array must be stored in anl_context.
+	 * We use the average count in estimate_array_length.
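The planner effect the patch aims for can be seen with a hypothetical session (table name and data invented here for illustration; not from the patch's regression tests):

```sql
-- With DECHIST statistics available, estimate_array_length can use the
-- average distinct-element count (~4 here) instead of a fixed 10.
CREATE TABLE docs (tags text[]);
INSERT INTO docs SELECT ARRAY['a','b','c','d'] FROM generate_series(1, 1000);
ANALYZE docs;
EXPLAIN SELECT unnest(tags) FROM docs;
-- Unpatched, the set-returning function's row estimate is 1000 * 10 = 10000;
-- with the patch, roughly 1000 * 4 = 4000.
```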
Re: SQL:2011 application time
On Thu, Nov 23, 2023 at 1:08 AM Peter Eisentraut wrote:
> After further thought, I think the right solution is to change btree_gist (and probably also btree_gin) to use the common RT* strategy numbers.

Okay. That will mean bumping the version of btree_gist, and you must be running that version to use the new temporal features, or you will get silly results. Right? Is there a way to protect users against that and communicate they need to upgrade the extension?

This also means temporal features may not work in custom GiST opclasses. What we're saying is they must have an appropriate operator for RTEqualStrategyNumber (18) and RTOverlapStrategyNumber (3). Equal matters for the scalar key part(s), overlap for the range part. So equal is more likely to be an issue, but overlap matters if we want to support non-ranges (which I'd say is worth doing). Also if they get it wrong, we won't really have any way to report an error.

I did some research on other extensions in contrib, as well as PostGIS. Here is what I found:

## btree_gin
3 is =
18 is undefined
(same for all types: macaddr8, int2, int4, int8, float4, float8, oid, timestamp, timestamptz, time, timetz, date, interval, inet, cidr, text, varchar, char, bytea, bit, varbit, numeric, anyenum, uuid, name, bool, bpchar)

## cube
3 is &&
18 is <=>

## intarray
3 is &&
18 is undefined

## ltree
3 is =
18 is undefined

## hstore
3 and 18 are undefined

## seg
3 is &&
18 is undefined

## postgis: geometry
3 is &&
18 is undefined

## postgis: geometry_nd
3 is &&&
18 is undefined

I thought about looking through pgxn for more, but I haven't yet. I may still do that. But already it seems like there is not much consistency.

So what do you think of this idea instead?: We could add a new (optional) support function to GiST that translates "well-known" strategy numbers into the opclass's own strategy numbers. This would be support function 12.
Then we can say translateStrategyNumber(RTEqualStrategyNumber) and look up the operator with the result. There is not a performance hit, because we do this for the DDL command (create pk/uq/fk), then store the operator in the index/constraint.

If you don't provide this new support function, then creating the pk/uq/fk fails with a hint about what you can do to make it work. This approach means we don't change the rules about GiST opclasses: you can still use the stranums how you like. This function would also let me support non-range "temporal" foreign keys, where I'll need to build queries with && and maybe other operators.

What do you think?

-- 
Paul ~{:-)
p...@illuminatedcomputing.com
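To make the proposal concrete, registration by an opclass author might look something like this. The operator family, function name, and signature below are all invented for illustration; nothing here is committed API:

```sql
-- Hypothetical: attach a strategy-translation function as GiST
-- support function 12, so core code can ask the opclass which of its
-- own strategy numbers corresponds to a "well-known" RT* number
-- (e.g. RTOverlapStrategyNumber = 3, RTEqualStrategyNumber = 18).
ALTER OPERATOR FAMILY my_geom_ops USING gist
  ADD FUNCTION 12 (geometry, geometry)
    my_geom_translate_stratnum(smallint);
```

At DDL time (creating the pk/uq/fk), core would call this function once, look up the operator for the returned strategy number, and store it in the constraint.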
Re: SQL:2011 application time
On 11/9/23 05:47, Peter Eisentraut wrote:
> I went over the patch v17-0001-Add-temporal-PRIMARY-KEY-and-UNIQUE-constraints.patch in more detail.

Thanks Peter! I'm about halfway through jian he's last two emails. I'll address your feedback also. I wanted to reply to this without waiting though:

> Overall, with these fixes, I think this patch is structurally okay. We just need to make sure we have all the weird corner cases covered.

One remaining issue I know about is with table partitions whose column order has changed. I've got an in-progress fix for that, but I've been prioritizing reviewer feedback the last few months. Just want to make sure you know about it for now.

Thanks!

-- 
Paul ~{:-)
p...@illuminatedcomputing.com
Re: SQL:2011 application time
On 9/25/23 14:00, Peter Eisentraut wrote:
> Looking through the tests in v16-0001:
>
> +-- PK with no columns just WITHOUT OVERLAPS:
> +CREATE TABLE temporal_rng (
> +  valid_at tsrange,
> +  CONSTRAINT temporal_rng_pk PRIMARY KEY (valid_at WITHOUT OVERLAPS)
> +);
> +ERROR:  syntax error at or near "WITHOUT"
> +LINE 3:   CONSTRAINT temporal_rng_pk PRIMARY KEY (valid_at WITHOUT OV...
> +                                                           ^
>
> I think this error is confusing. The SQL standard requires at least one non-period column in a PK. I don't know why that is or why we should implement it. But if we want to implement it, maybe we should enforce that in parse analysis rather than directly in the parser, to be able to produce a more friendly error message.

Okay.

(I think the reason the standard requires one non-period column is to identify the "entity". If philosophically the row is an Aristotelian proposition about that thing, the period qualifies it as true just during some time span. So the scalar part is doing the work that a PK conventionally does, and the period part does something else. Perhaps a PK/UNIQUE constraint with no scalar part would still be useful, but not very often I think, and I'm not sure it makes sense to call it PK/UNIQUE.)

> +-- PK with a range column/PERIOD that isn't there:
> +CREATE TABLE temporal_rng (
> +  id INTEGER,
> +  CONSTRAINT temporal_rng_pk PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
> +);
> +ERROR:  range or PERIOD "valid_at" in WITHOUT OVERLAPS does not exist
>
> I think here we should just produce a "column doesn't exist" error message, the same as if the "id" column was invalid. We don't need to get into the details of what kind of column it should be. That is done in the next test.

I'll change it. The reason for the different wording is that it might not be a column at all. It might be a PERIOD. So what about just "column or PERIOD doesn't exist"? (Your suggestion is fine too though.)
> +ERROR:  column "valid_at" in WITHOUT OVERLAPS is not a range type
>
> Also, in any case it would be nice to have a location pointer here (for both cases).

Agreed.

> +-- PK with one column plus a range:
> +CREATE TABLE temporal_rng (
> +  -- Since we can't depend on having btree_gist here,
> +  -- use an int4range instead of an int.
> +  -- (The rangetypes regression test uses the same trick.)
> +  id int4range,
> +  valid_at tsrange,
> +  CONSTRAINT temporal_rng_pk PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
> +);
>
> I'm confused why you are using int4range here (and in further tests) for the scalar (non-range) part of the primary key. Wouldn't a plain int4 serve here?

A plain int4 would be better, and it would match the normal use-case, but you must have btree_gist to create an index like that, and the regress tests can't assume we have that. Here is the part from sql/rangetypes.sql I'm referring to:

--
-- Btree_gist is not included by default, so to test exclusion
-- constraints with range types, use singleton int ranges for the "="
-- portion of the constraint.
--
create table test_range_excl(
  room int4range,
  speaker int4range,
  during tsrange,
  exclude using gist (room with =, during with &&),
  exclude using gist (speaker with =, during with &&)
);

> +SELECT pg_get_indexdef(conindid, 0, true) FROM pg_constraint WHERE conname = 'temporal_rng_pk';
> +                               pg_get_indexdef
> +-------------------------------------------------------------------------------
> + CREATE UNIQUE INDEX temporal_rng_pk ON temporal_rng USING gist (id, valid_at)
>
> Shouldn't this somehow show the operator classes for the columns? We are using different operator classes for the id and valid_at columns, aren't we?

We only print the operator classes if they are not the default, so they don't appear here. I do suspect something more is desirable though. For exclusion constraints we replace everything before the columns with just "EXCLUDE USING gist". I could embed WITHOUT OVERLAPS but it's not valid syntax in CREATE INDEX. Let me know if you have any ideas.
> +-- PK with USING INDEX (not possible):
> +CREATE TABLE temporal3 (
> +  id int4range,
> +  valid_at tsrange
> +);
> +CREATE INDEX idx_temporal3_uq ON temporal3 USING gist (id, valid_at);
> +ALTER TABLE temporal3
> +  ADD CONSTRAINT temporal3_pk
> +  PRIMARY KEY USING INDEX idx_temporal3_uq;
> +ERROR:  "idx_temporal3_uq" is not a unique index
> +LINE 2:   ADD CONSTRAINT temporal3_pk
> +          ^
> +DETAIL:  Cannot create a primary key or unique constraint using such an index.
>
> Could you also add a test where the index is unique and the whole thing does work?

No problem!

> Apart from the tests, how about renaming the column pg_constraint.contemporal to something like conwithoutoverlaps? Is that too verbose?

I've got some code already changing it to conoverlaps, but I'm probably happier with conwithoutoverlaps, assuming no one else minds it.

Yours,
--
Re: SQL:2011 application time
Hi Peter et al,

On 9/1/23 12:56, Paul Jungwirth wrote:
> On 9/1/23 11:30, Peter Eisentraut wrote:
>> I think the WITHOUT OVERLAPS clause should be per-column, so that something like UNIQUE (a WITHOUT OVERLAPS, b, c WITHOUT OVERLAPS) would be possible. Then the WITHOUT OVERLAPS clause would directly correspond to the choice between equality or overlaps operator per column.
>
> I think allowing multiple uses of `WITHOUT OVERLAPS` (and in any position) is a great recommendation that enables a lot of new functionality.

I've been working on implementing this, but I've come to think it is the wrong way to go. If we support this in primary key and unique constraints, then we must also support it for foreign keys and UPDATE/DELETE FOR PORTION OF. But implementing that logic is pretty tricky.

For example, take a foreign key on (id, PERIOD valid_at, PERIOD asserted_at). We need to ensure the referenced two-dimensional time space `contains` the referencing two-dimensional space. You can visualize a rectangle in two-dimensional space for each referencing record (which we validate one at a time). The referenced records must be aggregated and so form a polygon (of all right angles). For example, the referencing record may be (1, [0,2), [0,2)) with referenced records of (1, [0,2), [0,1)) and (1, [0,1), [1,2)). (I'm using int ranges since they're easier to read, but you could imagine these as dateranges like [2000-01-01,2002-01-01).) Now the range_agg of their valid_ats is [0,2) and of their asserted_ats is [0,2). But the referenced 2d space still doesn't contain the referencing space. It's got one corner missing.

This is a well-known problem among game developers. We're lucky not to have arbitrary polygons, but it's still a tough issue. Besides `contains` we also need to compute `overlaps` and `intersects` to support these temporal features. Implementing that for 2d, 3d, etc. looks very complicated, for something that is far outside the normal use case and also not part of the standard.
It will cost a little performance for the normal 1d use case too.

I think a better approach (which I want to attempt as an add-on patch, not in this main series) is to support not just range types, but any type with the necessary operators. Then you could have an mdrange (multi-dimensional range) or potentially even an arbitrary n-dimensional polygon. (PostGIS has something like this, but its `contains` operator compares (non-concave) *bounding boxes*, so it would not work for the example above. Still, the similarity between temporal and spatial data is striking. I'm going to see if I can get some input from PostGIS folks about how useful any of this is to them.)

This approach would also let us use multiranges: not for multiple dimensions, but for non-contiguous time spans stored in a single row. This puts the complexity in the types themselves (which seems more appropriate) and is ultimately more flexible (supporting not just mdrange but also multirange, and other things too).

This approach also means that instead of storing a mask/list of which columns use WITHOUT OVERLAPS, I can just store one attnum. Again, this saves the common use-case from paying a performance penalty to support a much rarer one.

I've still got my multi-WITHOUT OVERLAPS work, but I'm going to switch gears to what I've described here. Please let me know if you disagree!

Yours,

-- 
Paul ~{:-)
p...@illuminatedcomputing.com
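The corner-missing example above can be checked directly with range_agg. A sketch, with int4range standing in for the int ranges in the example:

```sql
-- Referenced rows: (1, [0,2), [0,1)) and (1, [0,1), [1,2)).
-- Aggregating each time dimension separately suggests full coverage
-- of [0,2) x [0,2), but the corner [1,2) x [1,2) is actually missing.
SELECT range_agg(valid_at)    AS valid_union,
       range_agg(asserted_at) AS asserted_union
FROM (VALUES (int4range(0, 2), int4range(0, 1)),
             (int4range(0, 1), int4range(1, 2))) AS ref(valid_at, asserted_at);
-- both columns aggregate to {[0,2)}, hiding the missing corner
```

This is why per-dimension aggregation can't prove 2d containment, and a real implementation would need genuine rectilinear-polygon operations.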
Re: SQL:2011 application time
On 9/7/23 18:24, jian he wrote:
> for a range primary key, is it fine to expect it to be unique, not null and also not overlap? (i am not sure how hard to implement it).
>
> - quote from 7IWD2-02-Foundation-2011-12.pdf. 4.18.3.2 Unique constraints, page 97 of 1483.
> ...
> - based on the above, the unique constraint does not specify that the column list must be range type. UNIQUE (a, c WITHOUT OVERLAPS). Here column "a" can be a range type (that has the overlap property) or not. In fact, many of your primary key, foreign key regress tests use something like '[11,11]' (which makes it easier to understand), which in logic is a non-range usage. So UNIQUE (a, c WITHOUT OVERLAPS) with column "a" a non-range data type does make sense?

I'm not sure I understand this question, but here are a few things that might help clarify things:

In SQL:2011, a temporal primary key, unique constraint, or foreign key may have one or more "scalar" parts (just like a regular key) followed by one "PERIOD" part, which is denoted with "WITHOUT OVERLAPS" (in PKs/UNIQUEs) or "PERIOD" (in FKs). Except for this last key part, everything is still compared for equality, just as in a traditional key. But this last part is compared for overlaps. It's exactly the same as `EXCLUDE (id WITH =, valid_at WITH &&)`. The overlap part must come last and you can have only one (but you may have more than one scalar part if you like).

In the patch, I have followed that pattern, except I also allow a regular range column anywhere I allow a PERIOD. In fact PERIODs are mostly implemented on top of range types. (Until recently PERIOD support was in the first patch, not the last, and there was code all throughout for handling both, e.g. within indexes, etc. But at pgcon Peter suggested building everything on just range columns, and then having PERIODs create an "internal" GENERATED column, and that cleaned up the code considerably.)
One possible source of confusion is that in the tests I'm using range columns *also* for the scalar key part. So valid_at is a tsrange, and the scalar part is an int4range. This is not normally how you'd use the feature, but you need the btree_gist extension to mix int & tsrange (e.g.), and that's not available in the regress tests. We are still comparing the int4range for regular equality and the tsrange for overlaps.

If you search this thread there was some discussion about bringing btree_gist into core, but it sounds like it doesn't need to happen. (It might still be desirable independently. EXCLUDE constraints are also not really something you can use practically without it, and their tests use the same trick of comparing ranges for plain equality.)

The piece of discussion you're replying to is about allowing *multiple* WITHOUT OVERLAPS modifiers on a PK/UNIQUE constraint, and in any position. I think that's a good idea, so I've started adapting the code to support it. (In fact there is a lot of code that assumes the overlaps key part will be in the last position, and I've never really been happy with that, so it's an excuse to make that more robust.) Here I'm saying (1) you will still need at least one scalar key part, (2) if there are no WITHOUT OVERLAPS parts then you just have a regular key, not a temporal one, and (3) changing this obliges us to do the same for foreign keys and FOR PORTION OF.

I hope that helps! I apologize if I've completely missed the point. If so please try again. :-)

Yours,

--
Paul ~{:-)
p...@illuminatedcomputing.com
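A minimal sketch of the trick described above (my own example, not the actual regress tests; it relies on the GiST range opclass supporting both `=` and `&&`):

```sql
-- Both key parts are ranges, so plain GiST handles the whole constraint
-- without btree_gist. The "scalar" part uses a single-point range.
CREATE TABLE temporal_demo (
  id int4range,        -- compared for equality, e.g. '[11,11]'
  valid_at tsrange,    -- compared for overlaps
  EXCLUDE USING gist (id WITH =, valid_at WITH &&)
);

INSERT INTO temporal_demo VALUES ('[11,11]', '[2020-01-01,2021-01-01)');

-- Fails with a conflict: same id, overlapping time span.
INSERT INTO temporal_demo VALUES ('[11,11]', '[2020-06-01,2020-07-01)');
```

In a real application the scalar part would be a plain int and the constraint would require the btree_gist extension to supply integer equality support for GiST.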
Re: SQL:2011 application time
On 9/1/23 03:50, Vik Fearing wrote:

> On 9/1/23 11:30, Peter Eisentraut wrote:
>> 1) If I write UNIQUE (a, b, c WITHOUT OVERLAPS), does the WITHOUT OVERLAPS clause attach to the last column, or to the whole column list? In the SQL standard, you can only have one period and it has to be listed last, so this question does not arise. But here we are building a more general facility to then build the SQL facility on top of. So I think it doesn't make sense that the range column must be last or that there can only be one. Also, your implementation requires at least one non-overlaps column, which also seems like a confusing restriction. I think the WITHOUT OVERLAPS clause should be per-column, so that something like UNIQUE (a WITHOUT OVERLAPS, b, c WITHOUT OVERLAPS) would be possible. Then the WITHOUT OVERLAPS clause would directly correspond to the choice between equality or overlaps operator per column. An alternative interpretation would be that WITHOUT OVERLAPS applies to the whole column list, and we would take it to mean, for any range column, use the overlaps operator, for any non-range column, use the equals operator. But I think this would be confusing and would prevent the case of using the equality operator for some ranges and the overlaps operator for some other ranges in the same key.
>
> I prefer the first option. That is: WITHOUT OVERLAPS applies only to the column or expression it is attached to, and need not be last in line.

I agree. The second option seems confusing and is more restrictive.

I think allowing multiple uses of `WITHOUT OVERLAPS` (and in any position) is a great recommendation that enables a lot of new functionality. Several books [1,2] about temporal databases describe a multi-dimensional temporal space (even beyond application time vs. system time), and the standard is pretty disappointing here. It's not a weird idea. But I just want to be explicit that this isn't something the standard describes.
(I think everyone in the conversation so far understands that.) So far I've tried to be pretty scrupulous about following SQL:2011, although personally I'd rather see Postgres support this functionality. And it's not like it goes *against* what the standard says. But if there are any objections, I'd love to hear them before putting in the work. :-)

If we allow multiple+anywhere WITHOUT OVERLAPS in PRIMARY KEY & UNIQUE constraints, then surely we also allow multiple+anywhere PERIOD in FOREIGN KEY constraints too. (I guess the standard switched keywords because a FK is more like "MUST OVERLAPS". :-) Also if you have multiple application-time dimensions we probably need to allow multiple FOR PORTION OF clauses. I think the syntax would be:

UPDATE t
FOR PORTION OF valid_at FROM ... TO ...
FOR PORTION OF asserted_at FROM ... TO ...
[...]
SET foo = bar

Does that sound okay?

I don't quite understand this part:

> Also, your implementation requires at least one non-overlaps column, which also seems like a confusing restriction.

That's just a regular non-temporal constraint. Right? If I'm missing something let me know.

[1] C. J. Date, Hugh Darwen, Nikos Lorentzos. Time and Relational Theory: Temporal Databases in the Relational Model and SQL. 2nd edition, 2014.
[2] Tom Johnston. Bitemporal Data: Theory and Practice. 2014.

--
Paul ~{:-)
p...@illuminatedcomputing.com
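Putting the pieces of this proposal together, a bitemporal key might look like the following. This is purely hypothetical syntax, sketched to match the discussion; none of it is implemented, and the table and column names are my own.

```sql
-- Hypothetical: two application-time dimensions in one key.
CREATE TABLE reservations (
  id int,
  valid_at daterange,     -- when the fact is true in the world
  asserted_at daterange,  -- when we claimed it was true
  UNIQUE (id, valid_at WITHOUT OVERLAPS, asserted_at WITHOUT OVERLAPS)
);

-- A matching FK would then need two PERIOD parts, something like:
-- FOREIGN KEY (id, PERIOD valid_at, PERIOD asserted_at)
--   REFERENCES reservations
```

An UPDATE against such a table would then carry one FOR PORTION OF clause per dimension, as sketched in the message above.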
Re: Exclusion constraints on partitioned tables
On 1/24/23 06:38, Ronan Dunklau wrote:

> I've taken a look at the patch, and I'm not sure why you keep the restriction on the GiST operator being of the RTEqualStrategyNumber strategy. I don't think we have any other place where we expect those strategy numbers to match. For hash it's different, as hash-equality is the only operator strategy and as such there is no other way to look at it. Can't we just enforce partition_operator == exclusion_operator without adding the RTEqualStrategyNumber for the opfamily into the mix?

Thank you for taking a look! I did some research on the history of the code here, and I think I understand Tom's concern about making sure the index uses the same equality operator as the partition. I was confused about his remarks about the opfamily, but I agree with you that if the operator is the same, we should be okay.

I added the code about RTEqualStrategyNumber because that's what we need to find an equals operator when the index is GiST (except if it's using an opclass from btree_gist; then it needs to be BTEqual again). But then I realized that for exclusion constraints we have already figured out the operator (in RelationGetExclusionInfo) and put it in indexInfo->ii_ExclusionOps. So we can just compare against that. This works whether your index uses btree_gist or not.

Here is an updated patch with that change (also rebased). I also included a more specific error message: if we find a matching column in the index but with the wrong operator, we should say so, and not say there is no matching column.

Thanks,

--
Paul ~{:-)
p...@illuminatedcomputing.com

From 928a31433a4b8cad25f74017ca96b843aa30e02d Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth"
Date: Wed, 23 Nov 2022 14:55:43 -0800
Subject: [PATCH v3] Allow some exclusion constraints on partitions

Previously we only allowed UNIQUE B-tree constraints on partitions (and only if the constraint included all the partition keys).
But we could allow exclusion constraints with the same restriction. We also require that those columns be compared for equality, not something like &&. --- doc/src/sgml/ddl.sgml | 12 ++-- src/backend/commands/indexcmds.c | 66 --- src/backend/parser/parse_utilcmd.c | 6 -- src/test/regress/expected/alter_table.out | 31 +++-- src/test/regress/expected/create_table.out | 16 +++-- src/test/regress/expected/indexing.out | 73 ++ src/test/regress/sql/alter_table.sql | 29 +++-- src/test/regress/sql/create_table.sql | 13 +++- src/test/regress/sql/indexing.sql | 57 +++-- 9 files changed, 236 insertions(+), 67 deletions(-) diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index 91c036d1cb..9d3c423ffd 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -4233,11 +4233,13 @@ ALTER INDEX measurement_city_id_logdate_key - There is no way to create an exclusion constraint spanning the - whole partitioned table. It is only possible to put such a - constraint on each leaf partition individually. Again, this - limitation stems from not being able to enforce cross-partition - restrictions. + Similarly an exclusion constraint must include all the + partition key columns. Furthermore the constraint must compare those + columns for equality (not e.g. ). + Again, this limitation stems from not being able to enforce + cross-partition restrictions. The constraint may include additional + columns that aren't part of the partition key, and it may compare + those with any operators you like. 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index 16ec0b114e..52d2395daa 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -709,11 +709,6 @@ DefineIndex(Oid relationId, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("cannot create index on partitioned table \"%s\" concurrently", RelationGetRelationName(rel; - if (stmt->excludeOpNames) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("cannot create exclusion constraints on partitioned table \"%s\"", - RelationGetRelationName(rel; } /* @@ -918,15 +913,16 @@ DefineIndex(Oid relationId, index_check_primary_key(rel, indexInfo, is_alter_table, stmt); /* - * If this table is partitioned and we're creating a unique index or a - * primary key, make sure that the partition key is a subset of the - * index's columns. Otherwise it would be possible to violate uniqueness - * by putting values that ought to be unique in different partitions. + * If this table is partitioned and we're creating a unique index, + * primary key, or exclusion constraint, make sure that the partition key + * is a subset of the index's columns. Otherwise it would be possible
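To make the rule in the message above concrete, here is a hedged sketch of what the patch permits and rejects (invented table names; assumes the patch is applied and btree_gist is installed to give GiST an integer `=`):

```sql
-- Allowed under the patch: the partition key column city_id appears in the
-- constraint and is compared with equality, so rows that could conflict
-- always land in the same partition.
CREATE TABLE meas (
  city_id int,
  logdate daterange,
  EXCLUDE USING gist (city_id WITH =, logdate WITH &&)
) PARTITION BY LIST (city_id);

-- Rejected under the patch (shown commented out): if the partition key
-- column were compared with a non-equality operator, enforcing the
-- constraint could require rows from other partitions.
--   EXCLUDE USING gist (city_id WITH &&, logdate WITH &&)
```

The improved error message discussed above is about exactly the rejected case: the column is present in the constraint, but with the wrong operator.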
Re: Exclusion constraints on partitioned tables
On 12/15/22 16:12, Tom Lane wrote:

>> This patch also requires the matching constraint columns to use equality comparisons (`(foo WITH =)`), so it is really equivalent to the existing b-tree rule.
>
> That's not quite good enough: you'd better enforce that it's the same equality operator (and same collation, if relevant) as is being used in the partition key. [snip] It might work better to consider the operator itself and ask if it's equality in the same btree opfamily that's used by the partition key.

Thank you for taking a look! Here is a comparison on just the operator itself. I included a collation check too, but I'm not sure it's necessary. Exclusion constraints don't have a collation per se; it comes from the index, and we choose it just a little above in this function. (I'm not even sure how to elicit that new error message in a test case.)

I'm not sure what to do about matching the opfamily. In practice an exclusion constraint will typically use GiST, but the partition key will always use btree/hash. You're saying that the equals operator can be inconsistent between those access methods? That is surprising to me, but I admit opclasses/families are still sinking in. (Even prior to this patch, isn't the code for hash-based partitions looking up ptkey_eqop via the hash opfamily, and then comparing it to idx_eqop looked up via the btree opfamily?)

If partitions can only support btree-based exclusion constraints, you still wouldn't be able to partition a temporal table, because those constraints would always be GiST. So I guess what I really want is to support GiST index constraints on partitioned tables.

Regards,

--
Paul ~{:-)
p...@illuminatedcomputing.com

From df25cb63cacd992fc7453fc432488c6046e59479 Mon Sep 17 00:00:00 2001
From: "Paul A.
Jungwirth" Date: Wed, 23 Nov 2022 14:55:43 -0800 Subject: [PATCH v2] Allow some exclusion constraints on partitions Previously we only allowed UNIQUE B-tree constraints on partitions (and only if the constraint included all the partition keys). But we could allow exclusion constraints with the same restriction. We also require that those columns be compared for equality, not something like &&. --- doc/src/sgml/ddl.sgml | 12 ++-- src/backend/commands/indexcmds.c | 42 +++-- src/backend/parser/parse_utilcmd.c | 6 -- src/test/regress/expected/alter_table.out | 31 +++-- src/test/regress/expected/create_table.out | 16 +++-- src/test/regress/expected/indexing.out | 73 ++ src/test/regress/sql/alter_table.sql | 29 +++-- src/test/regress/sql/create_table.sql | 13 +++- src/test/regress/sql/indexing.sql | 57 +++-- 9 files changed, 227 insertions(+), 52 deletions(-) diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index 6e92bbddd2..59be911471 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -4206,11 +4206,13 @@ ALTER INDEX measurement_city_id_logdate_key - There is no way to create an exclusion constraint spanning the - whole partitioned table. It is only possible to put such a - constraint on each leaf partition individually. Again, this - limitation stems from not being able to enforce cross-partition - restrictions. + Similarly an exclusion constraint must include all the + partition key columns. Furthermore the constraint must compare those + columns for equality (not e.g. ). + Again, this limitation stems from not being able to enforce + cross-partition restrictions. The constraint may include additional + columns that aren't part of the partition key, and it may compare + those with any operators you like. 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index 7dc1aca8fe..a9ecb7df19 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -709,11 +709,6 @@ DefineIndex(Oid relationId, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("cannot create index on partitioned table \"%s\" concurrently", RelationGetRelationName(rel; - if (stmt->excludeOpNames) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("cannot create exclusion constraints on partitioned table \"%s\"", - RelationGetRelationName(rel; } /* @@ -926,7 +921,7 @@ DefineIndex(Oid relationId, * We could lift this limitation if we had global indexes, but those have * their own problems, so this is a useful feature combination. */ - if (partitioned && (stmt->unique || stmt->primary)) + if (partitioned && (stmt->unique || stmt->primary || stmt->excludeOpNames != NIL)) { PartitionKey key = RelationGetPartitionKey(rel); const char *constraint_type; @@ -983,6 +978,8 @@ DefineIndex(Oid relationId, */ if (accessMethodId == BTREE_AM_OID) eq_strategy =
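One way to see how the same equality operator is registered in several operator families across access methods is a catalog query (standard SQL against stock PostgreSQL; btree_gist adds a gist row if installed):

```sql
-- Every opfamily that contains =(integer,integer), and with which
-- strategy number each access method labels it.
SELECT am.amname, opf.opfname, amop.amopstrategy
FROM pg_amop amop
JOIN pg_opfamily opf ON opf.oid = amop.amopfamily
JOIN pg_am am ON am.oid = opf.opfmethod
WHERE amop.amopopr = '=(integer,integer)'::regoperator
ORDER BY am.amname;
```

The operator OID is identical in every row; only the opfamily and strategy number differ (btree uses strategy 3, hash uses 1, and a btree_gist opfamily assigns its own). That is why comparing the operator itself, rather than the strategy number, works across access methods.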
Exclusion constraints on partitioned tables
Hello Hackers,

I'm trying to get things going again on my temporal tables work, and here is a small patch to move that forward. It lets you create exclusion constraints on partitioned tables, similar to today's rules for b-tree primary keys & unique constraints: just as we permit a PK on a partitioned table when the PK's columns are a superset of the partition keys, so we could also allow an exclusion constraint when its columns are a superset of the partition keys.

This patch also requires the matching constraint columns to use equality comparisons (`(foo WITH =)`), so it is really equivalent to the existing b-tree rule. Perhaps that is more conservative than necessary, but we can't permit an arbitrary operator, since some might require testing rows that fall into other partitions. For example `(foo WITH <>)` would obviously cause problems. The exclusion constraint may still include other columns beyond the partition keys, and those may use equality operators or something else.

This patch is required to support temporal partitioned tables, because temporal tables use exclusion constraints as their primary key. Essentially they are `(id WITH =, valid_at WITH &&)`. Since the primary key is not a b-tree, partitioning them would be forbidden prior to this patch. But now you could partition that table on `id`, and we could still correctly validate the temporal PK without requiring rows from other partitions.

This patch may be helpful beyond just temporal tables (or for DIY temporal tables), so it seems worth submitting it separately.

Yours,

--
Paul ~{:-)
p...@illuminatedcomputing.com

From 7daadf9e822509186c9e32794d0e29effdc90edc Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth"
Date: Wed, 23 Nov 2022 14:55:43 -0800
Subject: [PATCH v1] Allow some exclusion constraints on partitions

Previously we only allowed UNIQUE B-tree constraints on partitions (and only if the constraint included all the partition keys).
But we could allow exclusion constraints with the same restriction. We also require that those columns be compared for equality, not something like &&. --- doc/src/sgml/ddl.sgml | 12 ++-- src/backend/commands/indexcmds.c | 36 +-- src/backend/parser/parse_utilcmd.c | 6 -- src/test/regress/expected/alter_table.out | 31 +++-- src/test/regress/expected/create_table.out | 16 +++-- src/test/regress/expected/indexing.out | 73 ++ src/test/regress/sql/alter_table.sql | 29 +++-- src/test/regress/sql/create_table.sql | 13 +++- src/test/regress/sql/indexing.sql | 57 +++-- 9 files changed, 221 insertions(+), 52 deletions(-) diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index 6e92bbddd2..59be911471 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -4206,11 +4206,13 @@ ALTER INDEX measurement_city_id_logdate_key - There is no way to create an exclusion constraint spanning the - whole partitioned table. It is only possible to put such a - constraint on each leaf partition individually. Again, this - limitation stems from not being able to enforce cross-partition - restrictions. + Similarly an exclusion constraint must include all the + partition key columns. Furthermore the constraint must compare those + columns for equality (not e.g. ). + Again, this limitation stems from not being able to enforce + cross-partition restrictions. The constraint may include additional + columns that aren't part of the partition key, and it may compare + those with any operators you like. 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index 7dc1aca8fe..28840544b5 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -709,11 +709,6 @@ DefineIndex(Oid relationId, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("cannot create index on partitioned table \"%s\" concurrently", RelationGetRelationName(rel; - if (stmt->excludeOpNames) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("cannot create exclusion constraints on partitioned table \"%s\"", - RelationGetRelationName(rel; } /* @@ -926,7 +921,7 @@ DefineIndex(Oid relationId, * We could lift this limitation if we had global indexes, but those have * their own problems, so this is a useful feature combination. */ - if (partitioned && (stmt->unique || stmt->primary)) + if (partitioned && (stmt->unique || stmt->primary || stmt->excludeOpNames != NIL)) { PartitionKey key = RelationGetPartitionKey(rel); const char *constraint_type; @@ -983,6 +978,8 @@ DefineIndex(Oid relationId, */ if (accessMethodId == BTREE_AM_OID) eq_strategy = BTEqualStrategyNumber; + else if (accessMethodId == GIST_AM_OID) +eq_strategy =
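The temporal-partitioning scenario motivating this patch might look like the following. This is a sketch assuming the patch is applied; the table names are invented, and btree_gist supplies the integer equality support GiST needs.

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- A temporal table whose "primary key" is an exclusion constraint:
CREATE TABLE prices (
  id int,
  valid_at daterange,
  price numeric,
  EXCLUDE USING gist (id WITH =, valid_at WITH &&)
) PARTITION BY RANGE (id);

-- All rows for a given id land in exactly one partition, so the temporal
-- PK can be validated without consulting other partitions.
CREATE TABLE prices_1 PARTITION OF prices FOR VALUES FROM (1) TO (1000);
CREATE TABLE prices_2 PARTITION OF prices FOR VALUES FROM (1000) TO (2000);
```

Because the partition key (`id`) is compared with equality in the constraint, two rows that could violate `&&` on `valid_at` can never be in different partitions.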
Think-o in foreign key comments
Hello,

I noticed a few places in the new foreign key code where a comment says "the ON DELETE SET NULL/DELETE clause". I believe it should say "ON DELETE SET NULL/DEFAULT". These comments were added in d6f96ed94e7, "Allow specifying column list for foreign key ON DELETE SET actions." Here is a patch to correct them.

I don't think you usually create a commitfest entry for tiny fixes like this, right? But if you'd like one please let me know and I'll add it.

Yours,

--
Paul ~{:-)
p...@illuminatedcomputing.com

From dc5e317762266b93bfa44bb303fe1882853d8de7 Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth"
Date: Fri, 2 Dec 2022 14:02:55 -0800
Subject: [PATCH v1] Fix FK comment think-o

---
 src/backend/commands/tablecmds.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 3e83f375b5..716793e157 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -9467,10 +9467,10 @@ validateFkOnDeleteSetColumns(int numfks, const int16 *fkattnums,
  * numfks is the number of columns in the foreign key
  * pkattnum is the attnum array of referenced attributes.
  * fkattnum is the attnum array of referencing attributes.
- * numfkdelsetcols is the number of columns in the ON DELETE SET NULL/DELETE
+ * numfkdelsetcols is the number of columns in the ON DELETE SET NULL/DEFAULT
  * (...) clause
  * fkdelsetcols is the attnum array of the columns in the ON DELETE SET
- * NULL/DELETE clause
+ * NULL/DEFAULT clause
  * pf/pp/ffeqoperators are OID array of operators between columns.
  * old_check_ok signals that this constraint replaces an existing one that
  * was already validated (thus this one doesn't need validation).
@@ -9686,10 +9686,10 @@ addFkRecurseReferenced(List **wqueue, Constraint *fkconstraint, Relation rel,
  * pkattnum is the attnum array of referenced attributes.
  * fkattnum is the attnum array of referencing attributes.
  * pf/pp/ffeqoperators are OID array of operators between columns.
- * numfkdelsetcols is the number of columns in the ON DELETE SET NULL/DELETE
+ * numfkdelsetcols is the number of columns in the ON DELETE SET NULL/DEFAULT
  * (...) clause
  * fkdelsetcols is the attnum array of the columns in the ON DELETE SET
- * NULL/DELETE clause
+ * NULL/DEFAULT clause
  * old_check_ok signals that this constraint replaces an existing one that
  * was already validated (thus this one doesn't need validation).
  * lockmode is the lockmode to acquire on partitions when recursing.
-- 
2.25.1
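For context, the feature whose comments are being fixed (column lists for ON DELETE SET actions, from commit d6f96ed94e7, PostgreSQL 15) works like this; the table names are my own:

```sql
CREATE TABLE parent (tenant_id int, id int, PRIMARY KEY (tenant_id, id));

CREATE TABLE child (
  tenant_id int,
  parent_id int,
  note text,
  FOREIGN KEY (tenant_id, parent_id) REFERENCES parent
    -- Only parent_id is nulled when the referenced row is deleted;
    -- tenant_id keeps its value.
    ON DELETE SET NULL (parent_id)
);
```

Without the column list, ON DELETE SET NULL would null every referencing column, which is often wrong for composite keys that embed a tenant or partition identifier.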
Re: range_agg with multirange inputs
On 3/10/22 14:07, Chapman Flack wrote:

> When I apply this patch, I get a func.sgml with two entries for range_intersect_agg(anymultirange).

Arg, fixed.

> In range_agg_transfn, you've changed the message in the "must be called with a range or multirange"; that seems like another good candidate to be an elog.

Agreed. Updated here.

> I think your query finds aggregate declarations that share the same SQL function declaration as their finalizer functions. That seems to be more common. The query I used looks for cases where different SQL-declared functions appear as finalizers of aggregates, but the different SQL-declared functions share the same internal C implementation.

Okay, I see. I believe that is quite common for ordinary SQL functions. Sharing a prosrc seems even less remarkable than sharing an aggfinalfn. You're right there are no cases for other finalfns yet, but I don't think there is anything special about finalfns that would make this a weirder thing to do there than with ordinary functions. Still, noting it with a comment does seem helpful. I've updated the remark to match what you suggested.

Thank you again for the review, and sorry for so many iterations! :-)

Yours,

--
Paul ~{:-)
p...@illuminatedcomputing.com

From b409d7333132e34a928ec8cc0ecca0a3421b7268 Mon Sep 17 00:00:00 2001
From: "Paul A.
Jungwirth" Date: Fri, 10 Dec 2021 16:04:57 -0800 Subject: [PATCH v4] Add range_agg with multirange inputs --- doc/src/sgml/func.sgml| 30 ++ src/backend/utils/adt/multirangetypes.c | 69 ++-- src/include/catalog/pg_aggregate.dat | 3 + src/include/catalog/pg_proc.dat | 11 ++ src/test/regress/expected/multirangetypes.out | 100 ++ src/test/regress/expected/opr_sanity.out | 1 + src/test/regress/sql/multirangetypes.sql | 21 src/test/regress/sql/opr_sanity.sql | 1 + 8 files changed, 230 insertions(+), 6 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index df3cd5987b..ab6c4f0093 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -20012,8 +20012,23 @@ SELECT NULLIF(value, '(none)') ... No + + + + range_agg + +range_agg ( value + anymultirange ) +anymultirange + + +Computes the union of the non-null input values. + + No + + range_intersect_agg @@ -20027,8 +20042,23 @@ SELECT NULLIF(value, '(none)') ... No + + + + range_intersect_agg + +range_intersect_agg ( value + anymultirange ) +anymultirange + + +Computes the intersection of the non-null input values. + + No + + string_agg diff --git a/src/backend/utils/adt/multirangetypes.c b/src/backend/utils/adt/multirangetypes.c index c474b24431..a6d376b083 100644 --- a/src/backend/utils/adt/multirangetypes.c +++ b/src/backend/utils/adt/multirangetypes.c @@ -1344,11 +1344,9 @@ range_agg_transfn(PG_FUNCTION_ARGS) elog(ERROR, "range_agg_transfn called in non-aggregate context"); rngtypoid = get_fn_expr_argtype(fcinfo->flinfo, 1); if (!type_is_range(rngtypoid)) - ereport(ERROR, -(errcode(ERRCODE_DATATYPE_MISMATCH), - errmsg("range_agg must be called with a range"))); + elog(ERROR, "range_agg must be called with a range or multirange"); if (PG_ARGISNULL(0)) state = initArrayResult(rngtypoid, aggContext, false); else @@ -1362,8 +1360,11 @@ range_agg_transfn(PG_FUNCTION_ARGS) } /* * range_agg_finalfn: use our internal array to merge touching ranges. 
+ * + * Shared by range_agg_finalfn(anyrange) and + * multirange_agg_finalfn(anymultirange). */ Datum range_agg_finalfn(PG_FUNCTION_ARGS) { @@ -1397,8 +1398,66 @@ range_agg_finalfn(PG_FUNCTION_ARGS) PG_RETURN_MULTIRANGE_P(make_multirange(mltrngtypoid, typcache->rngtype, range_count, ranges)); } +/* + * multirange_agg_transfn: combine adjacent/overlapping multiranges. + * + * All we do here is gather the input multiranges' ranges into an array + * so that the finalfn can sort and combine them. + */ +Datum +multirange_agg_transfn(PG_FUNCTION_ARGS) +{ + MemoryContext aggContext; + Oid mltrngtypoid; + TypeCacheEntry *typcache; + TypeCacheEntry *rngtypcache; + ArrayBuildState *state; + MultirangeType *current; + int32 range_count; + RangeType **ranges; + int32 i; + + if (!AggCheckCallContext(fcinfo, )) + elog(ERROR, "multirange_agg_transfn called in non-aggregate context"); + + mltrngtypoid = get_fn_expr_argtype(fcinfo->flinfo, 1); + if (!type_is_multirange(mltrngtypoid)) + elog(ERROR, "range_agg must be called with a range or multirange"); + + typcache = multirange_get_typcache(fcinfo, mltrngtypoid); + rngtypcache = typcache->rngtype; + + if (PG_ARGISNULL(0)) +
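With this patch applied, aggregating multiranges looks like the following (my own example; per the finalfn comment above, touching ranges are merged):

```sql
SELECT range_agg(r)
FROM (VALUES
  ('{[1,2), [5,6)}'::int4multirange),
  ('{[2,4)}'::int4multirange)
) AS t(r);
-- [1,2) and [2,4) touch, so they merge into [1,4),
-- giving {[1,4),[5,6)}
```

Internally, as the message above describes, the transfn just flattens each input multirange into its component ranges; all the merging happens once, in the shared finalfn.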
Re: range_agg with multirange inputs
On 3/1/22 13:33, Chapman Flack wrote:

> I think the 4 lines should suffice, but it looks like this patch was generated from a rebase of the old one (with three lines) that ended up putting the new 'range_agg' entry ahead of 'max' in func.sgml, which position is now baked into the 4 lines of context. :)

You're right, my last rebase messed up the docs. Here it is fixed. Sorry about that!

> I would not change them to actual Assert, which would blow up the whole process on failure. If it's a genuine "not expected to happen" case, maybe changing it to elog (or ereport with errmsg_internal) would save a little workload for translators.

I like the elog solution. I've changed them in both places.

> I did a small double-take seeing the C range_agg_finalfn being shared by the SQL range_agg_finalfn and multirange_agg_finalfn. I infer that the reason it works is get_fn_expr_rettype works equally well with either parameter type. Do you think it would be worth adding a comment at the C function explaining that? In a quick query I just did, I found no other aggregate final functions sharing a C function that way, so this could be the first.

I see 13 other shared finalfns (using select array_agg(aggfnoid::regproc) as procs, array_agg(aggtransfn) as transfns, aggfinalfn from pg_aggregate where aggfinalfn is distinct from 0 group by aggfinalfn having count(*) > 1;) but a comment can't hurt! Added.

Thanks,

--
Paul ~{:-)
p...@illuminatedcomputing.com

From 1e7a96668b0ad341d29281055927ad693c656843 Mon Sep 17 00:00:00 2001
From: "Paul A.
Jungwirth" Date: Fri, 10 Dec 2021 16:04:57 -0800 Subject: [PATCH v3] Add range_agg with multirange inputs --- doc/src/sgml/func.sgml| 45 src/backend/utils/adt/multirangetypes.c | 66 +++- src/include/catalog/pg_aggregate.dat | 3 + src/include/catalog/pg_proc.dat | 11 ++ src/test/regress/expected/multirangetypes.out | 100 ++ src/test/regress/expected/opr_sanity.out | 1 + src/test/regress/sql/multirangetypes.sql | 21 src/test/regress/sql/opr_sanity.sql | 1 + 8 files changed, 244 insertions(+), 4 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index df3cd5987b..fb1963a687 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -20012,8 +20012,23 @@ SELECT NULLIF(value, '(none)') ... No + + + + range_agg + +range_agg ( value + anymultirange ) +anymultirange + + +Computes the union of the non-null input values. + + No + + range_intersect_agg @@ -20027,8 +20042,38 @@ SELECT NULLIF(value, '(none)') ... No + + + + range_intersect_agg + +range_intersect_agg ( value + anymultirange ) +anymultirange + + +Computes the intersection of the non-null input values. + + No + + + + + + range_intersect_agg + +range_intersect_agg ( value + anymultirange ) +anymultirange + + +Computes the intersection of the non-null input values. 
+ + No + + string_agg diff --git a/src/backend/utils/adt/multirangetypes.c b/src/backend/utils/adt/multirangetypes.c index c474b24431..5a4c00457a 100644 --- a/src/backend/utils/adt/multirangetypes.c +++ b/src/backend/utils/adt/multirangetypes.c @@ -1346,9 +1346,9 @@ range_agg_transfn(PG_FUNCTION_ARGS) rngtypoid = get_fn_expr_argtype(fcinfo->flinfo, 1); if (!type_is_range(rngtypoid)) ereport(ERROR, (errcode(ERRCODE_DATATYPE_MISMATCH), - errmsg("range_agg must be called with a range"))); + errmsg("range_agg must be called with a range or multirange"))); if (PG_ARGISNULL(0)) state = initArrayResult(rngtypoid, aggContext, false); else @@ -1362,8 +1362,10 @@ range_agg_transfn(PG_FUNCTION_ARGS) } /* * range_agg_finalfn: use our internal array to merge touching ranges. + * + * Shared by range_agg(anyrange) and range_agg(anymultirange). */ Datum range_agg_finalfn(PG_FUNCTION_ARGS) { @@ -1397,8 +1399,66 @@ range_agg_finalfn(PG_FUNCTION_ARGS) PG_RETURN_MULTIRANGE_P(make_multirange(mltrngtypoid, typcache->rngtype, range_count, ranges)); } +/* + * multirange_agg_transfn: combine adjacent/overlapping multiranges. + * + * All we do here is gather the input multiranges' ranges into an array + * so that the finalfn can sort and combine them. + */ +Datum +multirange_agg_transfn(PG_FUNCTION_ARGS) +{ + MemoryContext aggContext; + Oid mltrngtypoid; + TypeCacheEntry *typcache; + TypeCacheEntry *rngtypcache; + ArrayBuildState *state; + MultirangeType *current; + int32 range_count; + RangeType **ranges; + int32 i; + + if
Re: range_agg with multirange inputs
On 2/26/22 17:13, Chapman Flack wrote:

> This applies (with some fuzz) and passes installcheck-world, but a rebase is needed, because 3 lines of context aren't enough to get the doc changes in the right place in the aggregate function table. (I think generating the patch with 4 lines of context would be enough to keep that from being a recurring issue.)

Thank you for the review and the tip re 4 lines of context! Rebase attached.

> One thing that seems a bit funny is this message in the new multirange_agg_transfn:
>
> +   if (!type_is_multirange(mltrngtypoid))
> +       ereport(ERROR,
> +               (errcode(ERRCODE_DATATYPE_MISMATCH),
> +                errmsg("range_agg must be called with a multirange")));

I agree it would be more helpful to users to let them know we can take either kind of argument. I changed the message to "range_agg must be called with a range or multirange". How does that seem?

> I kind of wonder whether either message is really reachable, at least through the aggregate machinery in the expected way. Won't that machinery ensure that it is calling the right transfn with the right type of argument? If that's the case, maybe the messages could only be seen by someone calling the transfn directly ... which also seems ruled out for these transfns because the state type is internal. Is this whole test more of the nature of an assertion?

I don't think they are reachable, so perhaps they are more like asserts. Do you think I should change it? It seems like a worthwhile check in any case.

Yours,

--
Paul ~{:-)
p...@illuminatedcomputing.com

From a6689485aab9b1aaa6e866f2a577368c7a0e324e Mon Sep 17 00:00:00 2001
From: "Paul A.
Jungwirth" Date: Fri, 10 Dec 2021 16:04:57 -0800 Subject: [PATCH v2] Add range_agg with multirange inputs --- doc/src/sgml/func.sgml| 30 ++ src/backend/utils/adt/multirangetypes.c | 62 ++- src/include/catalog/pg_aggregate.dat | 3 + src/include/catalog/pg_proc.dat | 11 ++ src/test/regress/expected/multirangetypes.out | 100 ++ src/test/regress/expected/opr_sanity.out | 1 + src/test/regress/sql/multirangetypes.sql | 21 src/test/regress/sql/opr_sanity.sql | 1 + 8 files changed, 228 insertions(+), 1 deletion(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index df3cd5987b..6a72785327 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -19959,8 +19959,23 @@ SELECT NULLIF(value, '(none)') ... No + + + + range_agg + +range_agg ( value + anymultirange ) +anymultirange + + +Computes the union of the non-null input values. + + No + + max @@ -20012,8 +20027,23 @@ SELECT NULLIF(value, '(none)') ... No + + + + range_intersect_agg + +range_intersect_agg ( value + anymultirange ) +anymultirange + + +Computes the intersection of the non-null input values. + + No + + range_intersect_agg diff --git a/src/backend/utils/adt/multirangetypes.c b/src/backend/utils/adt/multirangetypes.c index c474b24431..0efef8cf35 100644 --- a/src/backend/utils/adt/multirangetypes.c +++ b/src/backend/utils/adt/multirangetypes.c @@ -1346,9 +1346,9 @@ range_agg_transfn(PG_FUNCTION_ARGS) rngtypoid = get_fn_expr_argtype(fcinfo->flinfo, 1); if (!type_is_range(rngtypoid)) ereport(ERROR, (errcode(ERRCODE_DATATYPE_MISMATCH), - errmsg("range_agg must be called with a range"))); + errmsg("range_agg must be called with a range or multirange"))); if (PG_ARGISNULL(0)) state = initArrayResult(rngtypoid, aggContext, false); else @@ -1397,8 +1397,68 @@ range_agg_finalfn(PG_FUNCTION_ARGS) PG_RETURN_MULTIRANGE_P(make_multirange(mltrngtypoid, typcache->rngtype, range_count, ranges)); } +/* + * multirange_agg_transfn: combine adjacent/overlapping multiranges. 
+ * + * All we do here is gather the input multiranges' ranges into an array + * so that the finalfn can sort and combine them. + */ +Datum +multirange_agg_transfn(PG_FUNCTION_ARGS) +{ + MemoryContext aggContext; + Oid mltrngtypoid; + TypeCacheEntry *typcache; + TypeCacheEntry *rngtypcache; + ArrayBuildState *state; + MultirangeType *current; + int32 range_count; + RangeType **ranges; + int32 i; + + if (!AggCheckCallContext(fcinfo, &aggContext)) + elog(ERROR, "multirange_agg_transfn called in non-aggregate context"); + + mltrngtypoid = get_fn_expr_argtype(fcinfo->flinfo, 1); + if (!type_is_multirange(mltrngtypoid)) + ereport(ERROR, +(errcode(ERRCODE_DATATYPE_MISMATCH), + errmsg("range_agg must be called with a range or multirange"))); + + typcache = multirange_get_typcache(fcinfo, mltrngtypoid); + rngtypcache = typcache->rngtype; + + if
range_agg with multirange inputs
Here is a patch adding range_agg(anymultirange). Previously range_agg only accepted anyrange. Here is a bug report from last month requesting this addition: https://www.postgresql.org/message-id/CAOC8YUcOtAGscPa31ik8UEMzgn8uAWA09s6CYOGPyP9_cBbWTw%40mail.gmail.com As that message points out, range_intersect_agg accepts either anyrange or anymultirange, so it makes sense for range_agg to do the same. I noticed that the docs only mentioned range_intersect_agg(anyrange), so I added the anymultirange versions of both on the aggregate functions page. I also added a few more tests for range_intersect_agg since the coverage there seemed light. Yours, -- Paul ~{:-) p...@illuminatedcomputing.com >From 116c9dd5b3fbb6626d81588c124cf8fdcb4185ce Mon Sep 17 00:00:00 2001 From: "Paul A. Jungwirth" Date: Fri, 10 Dec 2021 16:04:57 -0800 Subject: [PATCH v1] Add range_agg with multirange inputs --- doc/src/sgml/func.sgml| 30 ++ src/backend/utils/adt/multirangetypes.c | 60 +++ src/include/catalog/pg_aggregate.dat | 3 + src/include/catalog/pg_proc.dat | 11 ++ src/test/regress/expected/multirangetypes.out | 100 ++ src/test/regress/expected/opr_sanity.out | 1 + src/test/regress/sql/multirangetypes.sql | 21 src/test/regress/sql/opr_sanity.sql | 1 + 8 files changed, 227 insertions(+) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 0a725a6711..697bf1924e 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -19951,6 +19951,21 @@ SELECT NULLIF(value, '(none)') ... No + + + + range_agg + +range_agg ( value + anymultirange ) +anymultirange + + +Computes the union of the non-null input values. + + No + + @@ -19966,6 +19981,21 @@ SELECT NULLIF(value, '(none)') ... No + + + + range_intersect_agg + +range_intersect_agg ( value + anymultirange ) +anymultirange + + +Computes the intersection of the non-null input values. 
+ + No + + diff --git a/src/backend/utils/adt/multirangetypes.c b/src/backend/utils/adt/multirangetypes.c index 7773215564..beaa843eab 100644 --- a/src/backend/utils/adt/multirangetypes.c +++ b/src/backend/utils/adt/multirangetypes.c @@ -1394,6 +1394,66 @@ range_agg_finalfn(PG_FUNCTION_ARGS) PG_RETURN_MULTIRANGE_P(make_multirange(mltrngtypoid, typcache->rngtype, range_count, ranges)); } +/* + * multirange_agg_transfn: combine adjacent/overlapping multiranges. + * + * All we do here is gather the input multiranges' ranges into an array + * so that the finalfn can sort and combine them. + */ +Datum +multirange_agg_transfn(PG_FUNCTION_ARGS) +{ + MemoryContext aggContext; + Oid mltrngtypoid; + TypeCacheEntry *typcache; + TypeCacheEntry *rngtypcache; + ArrayBuildState *state; + MultirangeType *current; + int32 range_count; + RangeType **ranges; + int32 i; + + if (!AggCheckCallContext(fcinfo, &aggContext)) + elog(ERROR, "multirange_agg_transfn called in non-aggregate context"); + + mltrngtypoid = get_fn_expr_argtype(fcinfo->flinfo, 1); + if (!type_is_multirange(mltrngtypoid)) + ereport(ERROR, +(errcode(ERRCODE_DATATYPE_MISMATCH), + errmsg("range_agg must be called with a multirange"))); + + typcache = multirange_get_typcache(fcinfo, mltrngtypoid); + rngtypcache = typcache->rngtype; + + if (PG_ARGISNULL(0)) + state = initArrayResult(rngtypcache->type_id, aggContext, false); + else + state = (ArrayBuildState *) PG_GETARG_POINTER(0); + + /* skip NULLs */ + if (!PG_ARGISNULL(1)) + { + current = PG_GETARG_MULTIRANGE_P(1); + multirange_deserialize(rngtypcache, current, &range_count, &ranges); + if (range_count == 0) + { + /* + * Add an empty range so we get an empty result (not a null result).
+ */ + accumArrayResult(state, + RangeTypePGetDatum(make_empty_range(rngtypcache)), + false, rngtypcache->type_id, aggContext); + } + else + { + for (i = 0; i < range_count; i++) +accumArrayResult(state, RangeTypePGetDatum(ranges[i]), false, rngtypcache->type_id, aggContext); + } + } + + PG_RETURN_POINTER(state); +} + Datum multirange_intersect_agg_transfn(PG_FUNCTION_ARGS) { diff --git a/src/include/catalog/pg_aggregate.dat b/src/include/catalog/pg_aggregate.dat index fc6d3bfd94..5a06583798 100644 --- a/src/include/catalog/pg_aggregate.dat +++ b/src/include/catalog/pg_aggregate.dat @@ -557,6 +557,9 @@ { aggfnoid => 'range_agg(anyrange)', aggtransfn => 'range_agg_transfn', aggfinalfn => 'range_agg_finalfn', aggfinalextra => 't', aggtranstype => 'internal' }, +{ aggfnoid => 'range_agg(anymultirange)', aggtransfn => 'multirange_agg_transfn', + aggfinalfn => 'multirange_agg_finalfn', aggfinalextra => 't', + aggtranstype =>
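For readers following along, the semantics the patch implements can be sketched outside of C. Below is an illustrative Python model (not the patch's code): multiranges are modeled as lists of half-open [lo, hi) integer pairs, NULL inputs are skipped, and the aggregate flattens every input multirange's member ranges and merges the ones that touch or overlap, just as the transfn/finalfn pair does.

```python
def range_agg_multirange(multiranges):
    """Union of multiranges: flatten members, sort, merge touching/overlapping.

    Each multirange is a list of half-open (lo, hi) integer pairs; None
    inputs model SQL NULLs and are skipped, as in the transfn.
    """
    ranges = sorted(r for mr in multiranges if mr is not None for r in mr)
    merged = []
    for lo, hi in ranges:
        # With half-open ranges, lo <= previous hi means overlap or adjacency.
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged
```

For example, aggregating {[1,3), [5,7)} with {[2,6)} yields the single-range multirange {[1,7)}.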
Re: SQL:2011 PERIODS vs Postgres Ranges?
On 10/27/20 7:11 AM, Ibrar Ahmed wrote: I have spent some more time on the patch and did a lot of cleanup along with some fixes, compilation errors, and warnings. Thank you for taking a look at this! I've been swamped with ordinary work and haven't had a chance to focus on it for a while, but I'm hoping to make some improvements over the coming holidays, especially based on feedback from my talk at PgCon. There are a handful of small specific things I'd like to do, and then one big thing: add support for PERIODs. Vik said I could include his old patch for PERIODs, so I'd like to get that working on the latest master, and then rebase my own work on top of it. Then we can accept either ranges or PERIODs in various places (marked by TODOs in the code). Vik also pointed out a way to check foreign keys without using range_agg. He thinks it may even be more efficient. On the other hand it's a much more complicated SQL statement. I'd like to do a performance comparison to get concrete numbers, but if we did use his query, then this patch wouldn't depend on multiranges anymore---which seems like a big aid to moving it forward. Assuming multiranges gets committed, we can always swap in the range_agg query depending on the performance comparison results. I apologize for the slow progress here, and thank you for your help! Yours, -- Paul ~{:-) p...@illuminatedcomputing.com
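Vik's alternative query isn't shown here, but the range_agg-based foreign-key check can be modeled in a few lines. This is a hypothetical Python sketch (half-open [lo, hi) integer pairs standing in for ranges; not the actual SQL): a referencing row passes if its range is covered by the merged union of the referenced rows' ranges for the same key.

```python
from collections import defaultdict

def temporal_fk_ok(referencing, referenced):
    """Model a temporal FK check: each referencing (id, range) must be
    covered by the union of the referenced rows' ranges for that id.
    Rows are (id, (lo, hi)) with half-open integer ranges (illustrative)."""
    by_id = defaultdict(list)
    for rid, rng in referenced:
        by_id[rid].append(rng)

    def merged(ranges):
        # Merge touching/overlapping ranges -- the role range_agg plays.
        out = []
        for lo, hi in sorted(ranges):
            if out and lo <= out[-1][1]:
                out[-1] = (out[-1][0], max(out[-1][1], hi))
            else:
                out.append((lo, hi))
        return out

    for rid, (lo, hi) in referencing:
        if not any(mlo <= lo and hi <= mhi for mlo, mhi in merged(by_id[rid])):
            return False
    return True
```

The key point the sketch shows is why aggregation matters: the referenced history [1,5) plus [5,9) covers a referencing [2,8) only after the two pieces are merged.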
Re: range_agg
Thanks Alvaro! On Mon, Mar 23, 2020 at 4:33 PM Alvaro Herrera wrote: > > Thinking about the on-disk representation, can we do better than putting > the contained ranges in long-varlena format, including padding; also we > include the type OID with each element. Sounds wasteful. A more > compact representation might be to allow short varlenas and doing away > with the alignment padding, put the type OID just once. This is > important because we cannot change it later. Can you give me some guidance on this? I don't know how to make the on-disk format different from the in-memory format. (And for the in-memory format, I think it's important to have actual RangeTypes inside the multirange.) Is there something in the documentation, or a README in the repo, or even another type I can follow? > I'm also wondering if multirange_in() is the right strategy. Would it > be sensible to give each range to range_parse or range_parse_bound, so > that it determines where each range starts and ends? Then that function > doesn't have to worry about each quote and escape, duplicating range > parsing code. (This will probably require changing the signature of the > rangetypes.c function, and exporting it; for example have > range_parse_bound allow bound_str to be NULL and in that case don't mess > with the StringInfo and just return the end position of the parsed > bound.) Yeah, I really wanted to do it that way originally too. As you say it would require passing back more information from the range-parsing code. I can take a stab at making the necessary changes. I'm a bit more confident now than I was then in changing the range code we have already. Regards, -- Paul ~{:-) p...@illuminatedcomputing.com
Re: useless RangeIOData->typiofunc
On 3/4/20 1:57 PM, Alvaro Herrera wrote: I noticed while going over the multirange types patch that it adds a pointless typiofunc cached OID to a struct used for I/O functions' fn_extra. It seems to go completely unused, so I checked range types (which this was cribbed from) and indeed, it is completely unused there either. My guess is that it was in turn cribbed from array's ArrayMetaState, which is considerably more sophisticated; I suspect nobody noticed that caching it was pointless. I didn't believe it at first but I think you're right. :-) Here's a patch to remove it from rangetypes.c. It doesn't really waste much memory anyway, but removing it lessens the cognitive load by one or two bits. Looks good to me, and it seems okay to make the same edits to multirangetypes.c Yours, -- Paul ~{:-) p...@illuminatedcomputing.com
Re: range_agg
Thanks for looking at this again! On 3/4/20 1:33 PM, Alvaro Herrera wrote: I came across an interesting thing, namely multirange_canonicalize()'s use of qsort_arg with a callback of range_compare(). range_compare() calls range_deserialize() (non-trivial parsing) for each input range; multirange_canonicalize() later does a few extra deserialize calls of its own. Call me a premature optimization guy if you will, but I think it makes sense to have a different struct (let's call it "InMemoryRange") which stores the parsed representation of each range; then we can deserialize all ranges up front, and use that as many times as needed, without having to deserialize each range every time. I don't know, this sounds like a drastic change. I agree that multirange_deserialize and range_deserialize do a lot of copying (not really any parsing though, and they both assume their inputs are already de-TOASTED). But they are used very extensively, so if you wanted to remove them you'd have to rewrite a lot. I interpreted the intention of range_deserialize to be a way to keep the range struct fairly "private" and give a standard interface to extracting its attributes. Its motive seems akin to deconstruct_array. So I wrote multirange_deserialize to follow that principle. Both functions also handle memory alignment issues for you. With multirange_deserialize, there isn't actually much structure (just the list of ranges), so perhaps you could more easily omit it and give callers direct access into the multirange contents. That still seems risky though, and less well encapsulated. My preference would be to see if these functions are really a performance problem first, and only redo the in-memory structures if they are. Also that seems like something you could do as a separate project. (I wouldn't mind working on it myself, although I'd prefer to do actual temporal database features first.) There are no backwards-compatibility concerns to changing the in-memory structure, right? 
(Even if there are, it's too late to avoid them for ranges.) While I'm at this, why not name the new file simply multiranges.c instead of multirangetypes.c? As someone who doesn't do a lot of Postgres hacking, I tried to follow the approach in rangetypes.c as closely as I could, especially for naming things. So I named the file multirangetypes.c because there was already rangetypes.c. But also I can see how the "types" emphasizes that ranges and multiranges are not concrete types themselves, but more like abstract data types or generics (like arrays). Yours, -- Paul ~{:-) p...@illuminatedcomputing.com
Re: range_agg
On 11/21/19 1:06 AM, Pavel Stehule wrote: 2. I don't like introduction "safe" operators - now the basic operators are doubled, and nobody without documentation will use @* operators. It is not intuitive. I think is better to map this functionality to basic operators +- * and implement it just for pairs (Multirange, Multirange) and (Multirange, Range) if it is possible It's same relation line Numeric X integer. There should not be introduced new operators. If somebody need it for ranges, then he can use cast to multirange, and can continue. > [snip] 3. There are not prepared casts - postgres=# select int8range(10,15)::int8multirange; ERROR: cannot cast type int8range to int8multirange LINE 1: select int8range(10,15)::int8multirange; ^ There should be some a) fully generic solution, or b) possibility to build implicit cast when any multirange type is created. Okay, I like the idea of just having `range + range` and `multirange + multirange`, then letting you cast between ranges and multiranges. The analogy to int/numeric seems strong. I guess if you cast a multirange with more than one element to a range it will raise an error. That will let me clean up the docs a lot too. Thanks! -- Paul ~{:-) p...@illuminatedcomputing.com
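To make the numeric/integer analogy concrete, here is a hypothetical Python model of the multirange-to-range cast discussed above (the function name and error text are illustrative, not from any patch; a multirange is a list of range pairs):

```python
def multirange_to_range(mr):
    """Model the proposed multirange -> range cast: fine for zero or one
    member range, an error for more -- much like a numeric -> integer cast
    that only succeeds when no information would be lost."""
    if len(mr) > 1:
        raise ValueError("cannot cast multirange with more than one range")
    return mr[0] if mr else None  # None stands in for an empty range
```

So `{[1,3)}` casts cleanly to `[1,3)`, while `{[1,2), [3,4)}` raises, matching the "raise an error" behavior Paul describes for multi-element multiranges.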
Re: Add json_object(text[], json[])?
On 10/25/19 6:40 AM, Andrew Dunstan wrote: json{b}_build_object and json{b}_build_array are designed for creating nested json{b}. Not sure if they would work for your purpose. Thanks for the suggestion! I looked at these a bit, but they only work if you have a known-ahead-of-time number of arguments. (I did explore building an array and calling jsonb_build_object using VARIADIC, but you can't build an array with alternating text & jsonb elements. That made me curious how these functions even worked, which led me to extract_variadic_args (utils/fmgr/funcapi.c), which has some magic to support heterogeneous types when not called with the VARIADIC keyword, so it seems they bypass the normal variadic handling.) Regards, -- Paul ~{:-) p...@illuminatedcomputing.com
Add json_object(text[], json[])?
Hello, I noticed that our existing 2-param json{,b}_object functions take text[] for both keys and values, so they are only able to build one-layer-deep JSON objects. I'm interested in adding json{,b}_object functions that take text[] for the keys and json{,b}[] for the values. It would otherwise behave the same as json_object(text[], text[]) (e.g. re NULL handling). Does that seem worthwhile to anyone?

I'll share my specific problem where I felt I could use this function, although you can stop reading here if that isn't interesting to you. :-) I was building a jsonb_dasherize(j jsonb) function, which converts snake_case JSON keys into dashed-case JSON keys. (It's because of a Javascript framework :-) My function needs to walk the whole JSON structure, doing this recursively when it sees objects inside arrays or other objects. Here is the definition, including a comment where my proposed jsonb_object would have helped:

CREATE FUNCTION jsonb_dasherize(j jsonb)
RETURNS jsonb
IMMUTABLE
AS $$
DECLARE
  t text;
  key text;
  val jsonb;
  ret jsonb;
BEGIN
  t := jsonb_typeof(j);
  IF t = 'object' THEN
    -- So close! If only jsonb_object took text[] and jsonb[] params
    -- SELECT jsonb_object(
    --          array_agg(dasherize_key(k)),
    --          array_agg(jsonb_dasherize(v)))
    -- FROM   jsonb_each(j) AS t(k, v);
    ret := '{}';
    FOR key, val IN SELECT * FROM jsonb_each(j) LOOP
      ret := jsonb_set(ret, array[REPLACE(key, '_', '-')], jsonb_dasherize(val), true);
    END LOOP;
    RETURN ret;
  ELSIF t = 'array' THEN
    SELECT COALESCE(jsonb_agg(jsonb_dasherize(elem)), '[]')
    INTO   ret
    FROM   jsonb_array_elements(j) AS t(elem);
    RETURN ret;
  ELSIF t IS NULL THEN
    -- This should never happen internally
    -- but only from a passed-in NULL.
    RETURN NULL;
  ELSE
    -- string/number/null:
    RETURN j;
  END IF;
END;
$$ LANGUAGE plpgsql;

I also tried a recursive CTE there using jsonb_set, but it was too late at night for me to figure that one out. :-) It seems like a json-taking json_object would be just what I needed.
And in general I was surprised that Postgres didn't have a more convenient way to build multi-layer JSON. I'm happy to add this myself if other folks want it. Regards, -- Paul ~{:-) p...@illuminatedcomputing.com
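For comparison, the same transformation takes only a few lines in a language with native maps. This hypothetical Python sketch (not part of the proposal) performs the same recursive walk the plpgsql function above does: rewrite object keys, recurse into objects and arrays, and pass scalars through untouched.

```python
def dasherize(j):
    """Recursively replace '_' with '-' in JSON object keys, leaving
    values (and array elements that are scalars) untouched."""
    if isinstance(j, dict):
        return {k.replace('_', '-'): dasherize(v) for k, v in j.items()}
    if isinstance(j, list):
        return [dasherize(v) for v in j]
    return j  # string/number/bool/null
```

Note that only keys are rewritten: `{"a_b": "s_t"}` becomes `{"a-b": "s_t"}`, with the string value preserved.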
Re: SQL:2011 PERIODS vs Postgres Ranges?
Hi Ibrar, On 8/6/19 3:26 AM, Ibrar Ahmed wrote: - Why we are not allowing any other datatype other than ranges in the primary key. Without that there is no purpose of a primary key. A temporal primary key always has at least one ordinary column (of any type), so it is just a traditional primary key *plus* a PERIOD and/or range column to indicate when the record was true. - Thinking about some special token to differentiate between normal primary key and temporal primary key There is already some extra syntax. For the time part of a PK, you say `WITHOUT OVERLAPS`, like this: CONSTRAINT pk_on_t PRIMARY KEY (id, valid_at WITHOUT OVERLAPS) In this example `id` is an ordinary column, and `valid_at` is either a Postgres range or a SQL:2011 PERIOD. (The latter is not yet implemented in my patch but there are some placeholder comments.) Similarly a foreign key has one or more traditional columns *plus* a range/PERIOD. It needs to have a range/PERIOD on both sides. It too has some special syntax, but instead of `WITHOUT OVERLAPS` it is `PERIOD`. (Don't blame me, I didn't write the standard :-) So here is an example: CONSTRAINT fk_t2_to_t FOREIGN KEY (id, PERIOD valid_at) REFERENCES t (id, PERIOD valid_at) You should be able to see my changes to gram.y to support this new syntax. I hope this clears up how it works! I'm happy to answer more questions if you have any. Also if you want to read more: - This paper by Kulkarni & Michels is a 10-page overview of SQL:2011: https://sigmodrecord.org/publications/sigmodRecord/1209/pdfs/07.industry.kulkarni.pdf - This is a talk I gave at PGCon 2019 going over the concepts, with a lot of pictures. 
You can find text, slides, and a link to the video here: https://github.com/pjungwir/postgres-temporal-talk - This link is ostensibly an annotated bibliography but really tells a story about how the research has developed: https://illuminatedcomputing.com/posts/2017/12/temporal-databases-bibliography/ - There is also some discussion about PERIODs vs ranges upthread here, as well as here: https://www.postgresql-archive.org/Periods-td6022563.html Yours, -- Paul ~{:-) p...@illuminatedcomputing.com
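The WITHOUT OVERLAPS rule above can be stated operationally: no two rows may share the same scalar key while their application-time ranges overlap. Here is a hypothetical Python check of that invariant (half-open [lo, hi) integer pairs standing in for range values; rows are (id, range) tuples):

```python
from itertools import combinations, groupby

def violates_temporal_pk(rows):
    """Return True if any two rows share an id and have overlapping
    valid_at ranges -- the condition a temporal
    PRIMARY KEY (id, valid_at WITHOUT OVERLAPS) forbids.
    Ranges are half-open (lo, hi) integer pairs (illustrative only)."""
    rows = sorted(rows)
    for _id, grp in groupby(rows, key=lambda r: r[0]):
        for (_, a), (_, b) in combinations(list(grp), 2):
            if a[0] < b[1] and b[0] < a[1]:  # half-open overlap test
                return True
    return False
```

Note that ranges which merely touch, like [1,5) and [5,9), do not overlap under half-open semantics, so the same id may hold both, which is exactly what lets a temporal PK store the history of one entity as abutting periods.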
Re: range_agg
On 7/9/19 11:24 PM, David Fetter wrote: I seem to recall that the usual convention (at least in math) is to use intervals that are generally represented as open on the infinity side, but that might not fit how we do things. I think it does, unless I'm misunderstanding? Oh, I was just wondering about the square bracket on the left side of [null, 1). It's not super important. Ah, I understand now. Just a typo on my part. Thanks for catching it, and sorry for the confusion! !mr , perhaps? I like that suggestion. Honestly I'm not sure we even want an inverse, but it's so important theoretically we should at least consider whether it is appropriate here. Or maybe "inverse" is the wrong word for this, or there is a different meaning it should have. Jeff's suggestion of ~ for complement is better. Okay, thanks. I like it better too. Yours, -- Paul ~{:-) p...@illuminatedcomputing.com
Re: range_agg
On 7/9/19 12:01 PM, Alvaro Herrera wrote: On 2019-Jul-08, Paul A Jungwirth wrote: - You can subscript a multirange like you do an array (? This could be a function instead.) Note that we already have a patch in the pipe to make subscripting an extensible operation, which would fit pretty well here, I think. I'll take a look at that! Also, I suppose you would need unnest(multirange) to yield the set of ranges. I think that would be really nice, although it isn't critical I think if you can do something like UNNEST(multirange::tstzrange[]). -- Paul ~{:-) p...@illuminatedcomputing.com
Re: range_agg
I suspect that if you build it, they will come, "they" being anyone who has to schedule coverage, check usage of a resource over time, etc. Is this something you want help with at some level? Coding, testing, promoting... You might be right. :-) Most of this is done already, since it was largely copy/paste from my extension plus figuring out how to register built-in functions with the .dat files. I need to write some docs and do some cleanup and I'll have a CF entry. And I'll probably go ahead and add your two suggestions too.

Things I'd love help with:

- Getting more opinions about the functions' interface, either from you or others, especially:
  - In the extension I have a boolean param to let you accept gaps or raise an error, and another for overlaps. But what about accepting/raising/returning null? How should the parameters expose that? Maybe leave them as bools but accept true/false/null for permit/raise/nullify respectively? That seems like a questionable UI, but I'm not sure what would be better. Maybe someone with better taste can weigh in. :-) I tried to find existing built-in functions that gave an enumeration of options like that but couldn't find an existing example.
  - Also: what do you think of the question I asked in my reply to Corey? Is it better to have *all* range_agg functions return an array of ranges, or is it nicer to have a variant that always returns a single range?
- Getting it reviewed.
- Advice about sequencing it with respect to my temporal foreign keys patch, where I'm planning to call range_agg to check an FK. E.g. should my FK patch be a diff on top of the range_agg code? I assume they should have separate CF entries though?

Oh and here's something specific:

- I gave oids to my new functions starting with 8000, because I thought I saw some discussion about that recently, and the final committer will correct the oids to the current n+1? But I can't find that discussion anymore, so if that's the wrong approach let me know.

Thanks!
-- Paul ~{:-) p...@illuminatedcomputing.com
Re: range_agg
On 5/3/19 6:41 PM, David Fetter wrote: This suggests two different ways to extend ranges over aggregation: one which is a union of (in general) disjoint intervals, two others are a union of intervals, each of which has a weight. . . . I think the cases above, or at least the first two of them, should be available. They could be called range_agg, weighted_range_agg, and covering_range_agg. Thanks David! I think these two new functions make sense. Before I implement them too I wonder if anyone else has uses for them? Thanks, -- Paul ~{:-) p...@illuminatedcomputing.com
Re: range_agg
On 5/4/19 3:11 PM, Corey Huinker wrote: One question is how to aggregate ranges that would leave gaps and/or overlaps. So in my extension there is a one-param version that forbids gaps & overlaps, but I let you permit them by passing extra parameters, so the signature is: Perhaps a third way would be to allow and preserve the gaps. Thanks for the feedback! I think this is what I'm doing already (returning an array of ranges), but let me know if I'm misunderstanding. My extension has these signatures: range_agg(anyrange) returning anyrange range_agg(anyrange, bool) returning anyarray range_agg(anyrange, bool, bool) returning anyarray. The first variant raises an error if there are gaps or overlaps and always returns a single range, but the other two return an array of ranges. I was planning to use the same signatures for my patch to pg, unless someone thinks they should be different. But I'm starting to wonder if they shouldn't *all* return arrays. I have two concrete use-cases for these functions and they both require the array-returning versions. Is it helpful to have a version that always returns a single range? Or should I make them all consistent? Thanks, -- Paul ~{:-) p...@illuminatedcomputing.com
range_agg
Hello, I wrote an extension to add a range_agg function with similar behavior to existing *_agg functions, and I'm wondering if folks would like to have it in core? Here is the repo: https://github.com/pjungwir/range_agg I'm also working on a patch for temporal foreign keys, and having range_agg would make the FK check easier and faster, which is why I'd like to get it added. But also it just seems useful, like array_agg, json_agg, etc. One question is how to aggregate ranges that would leave gaps and/or overlaps. So in my extension there is a one-param version that forbids gaps & overlaps, but I let you permit them by passing extra parameters, so the signature is: range_agg(r anyrange, permit_gaps boolean, permit_overlaps boolean) Perhaps another useful choice would be to return NULL if a gap/overlap is found, so that each param would have three choices instead of just two: accept the inputs, raise an error, return a NULL. What do people think? I plan to work on a patch regardless, so that I can use it for temporal FKs, but I'd appreciate some feedback on the "user interface". Thanks, -- Paul ~{:-) p...@illuminatedcomputing.com
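The behavior described above (merge the inputs, but raise on gaps or overlaps unless the caller permits them) can be sketched compactly. This is an illustrative Python model of the extension's semantics, not its implementation: ranges are half-open [lo, hi) integer pairs, and the sketch always returns a list of merged ranges.

```python
def range_agg(ranges, permit_gaps=False, permit_overlaps=False):
    """Merge half-open (lo, hi) ranges; raise on a gap or overlap
    unless the corresponding flag permits it (illustrative sketch)."""
    out = []
    for lo, hi in sorted(ranges):
        if not out:
            out.append((lo, hi))
        elif lo < out[-1][1]:  # overlaps the previous range
            if not permit_overlaps:
                raise ValueError("overlapping ranges")
            out[-1] = (out[-1][0], max(out[-1][1], hi))
        elif lo > out[-1][1]:  # gap before this range
            if not permit_gaps:
                raise ValueError("gap between ranges")
            out.append((lo, hi))
        else:  # exactly adjacent: always merge
            out[-1] = (out[-1][0], hi)
    return out
```

Adjacent inputs like [1,2) and [2,3) always merge to [1,3); only true gaps and true overlaps trigger the flags.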
Re: Temporal Table Proposal
On 2/25/19 4:21 AM, Ibrar Ahmed wrote: Great to hear that you are working on that. Do you think I can help you with this? I did some groundwork to make it possible. I can help in coding/reviewing or even can take lead if you want to. Hi Ibrar, I'd love some help with this! I submitted my patch to the March commitfest, and Peter Moser & Anton Dignös submitted theirs also. I still need to rebase on the most recent commits, but I'll try to do that tonight or tomorrow. Personally I'd love some review and feedback, because this is my first substantial patch. (I made a small change to btree_gist a couple years ago also.) I think the challenge with temporal functionality is that there are a lot of new concepts, and we'd like them all to hang together in a coherent way. (That's why I want to give a talk about it: to increase background understanding in the Postgres community.) So having someone take the lead on it makes sense. I'm happy to provide some opinions and direction, but my own coding contributions are likely to be slow, and having a regular contributor more closely involved would help a lot. Here are some thoughts about things that need work:

- temporal primary keys (my patch)
- temporal foreign keys (I've done some work on this adding to my patch but I haven't finished it yet.)
- temporal joins (Moser/Dignös patch)
- declaring PERIODs (Vik Fearing's patch)
- showing PERIODs in the system catalog (Vik Fearing's patch)
- using PERIODs in SELECT, WHERE, GROUP BY, HAVING, function arguments, etc. (TODO)
- SYSTEM_TIME PERIODs for transaction-time tables (TODO)
- temporal insert/update/delete for transaction-time tables (TODO)
- temporal insert/update/delete for valid-time tables (TODO)
- temporal SELECT for valid-time tables (TODO, could build off the Moser/Dignös work)
- temporal SELECT for transaction-time tables (TODO)

I think the transaction-time stuff is easier, but also less interesting, and there are well-known patterns for accomplishing it already.
I'm more interested in supporting valid-time tables personally. -- Paul ~{:-) p...@illuminatedcomputing.com
Re: Temporal Table Proposal
On 2/22/19 11:31 AM, Euler Taveira wrote: On Fri, Feb 22, 2019 at 15:41, Ibrar Ahmed wrote: While working on another PostgreSQL feature, I was thinking that we could use a temporal table in PostgreSQL. Some existing databases offer this. I searched for any discussion on the PostgreSQL mailing list, but could not find any. Maybe my search wasn’t accurate enough: if anyone can point me to a discussion, that would be useful. https://www.postgresql.org/message-id/CA%2BrenyUb%2BXHzsrPHHR6ELqguxaUPGhOPyVc7NW%2BkRsRpBZuUFQ%40mail.gmail.com This is the last one. I don't know why it wasn't in the January CF. Oh that's by me! :-) I didn't put it into the CF because I wanted to get some feedback on primary keys before I got too far into foreign keys, but someone recently advised me to start adding to CFs anyway with "WIP" in the title, so I'll do that next time. Btw my own patch is very modest, and I'd love to see this other much more extensive patch get some attention: https://www.postgresql.org/message-id/flat/CAHO0eLYyvuqwF%3D2FsgDn1xOs_NOrFBu9Xh-Wq%2BaWfFy0y6%3DjWQ%40mail.gmail.com#4f7fbace3a2f2ce85fcc161cc3fdd273 They were told to adjust where in the query pipeline they do their work, and the latest patch does that (as I understand it), but I don't think anyone has looked at it yet. Both of these patches use range types rather than SQL:2011 PERIODs, but I'd like to *also* support PERIODs (and accept ranges everywhere we accept PERIODs). Vik Fearing already has a patch to let you *declare* PERIODs: https://www.postgresql-archive.org/Periods-td6022563.html Actually using PERIODs in queries seems like a decent chunk of work though: basically it means making our grammar & processing accept PERIODs anywhere they currently accept columns. I'd love to hear some thoughts/suggestions around that. For example: a PERIOD is *similar* to a GENERATED column, so maybe the work being done there can/should influence how we implement them.
I'm excited to be getting some momentum around temporal features though! I'm supposed to give a talk about them at PGCon in Ottawa this spring, so hopefully that will help too. Yours, -- Paul ~{:-) p...@illuminatedcomputing.com
Re: SQL:2011 PERIODS vs Postgres Ranges?
Hi Jeff, Thanks for sharing your thoughts and encouragement! :-) > The model in [7] is > based heavily on pack/unpack operators, and it's hard for me to see > how those fit into SQL. Also, the pack/unpack operators have some > theoretical weirdness that the book does not make clear*. > > *: My question was about the significance > of the order when packing on two intervals. Hugh Darwen was kind > enough to reply at length, and offered a lot of insight, but was still > somewhat inconclusive. I'd be interested in seeing that conversation if you ever find it again. I really like how Date/Darwen/Lorentzos use pack/unpack to explain temporal operations as operating on every concurrent "instant" separately, and then bringing the adjacent instants back together into ranges again. Even if you don't materialize that approach, conceptually it makes it easy to understand what's going on. So what is great about the patch from Anton Dignös (https://www.postgresql-archive.org/PROPOSAL-Temporal-query-processing-with-range-types-tt5913058.html) is that (like Date/Darwen/Lorentzos) you still have temporal variants for every operator in the relational algebra, but they give straightforward & efficient implementations of each based on traditional operators plus just their two new "normalize" and "align" operations. (I think they renamed these in later papers/patches though?) Their main paper is at https://files.ifi.uzh.ch/boehlen/Papers/modf174-dignoes.pdf if anyone wants to read it. It's short! :-) The biggest challenge implementing temporal operators in plain SQL is merging/splitting ranges from the left & right sides of an operator so they line up. A single row can get split into multiple rows, or several rows might be merged into one, etc. You can see how tricky Snodgrass's "coalesce" operation is in his book. 
I gave some example SQL to implement coalesce with UNNEST plus a range_agg function at https://github.com/pjungwir/range_agg but with the Dignös approach I don't think you'd need that. Normalize/align targets roughly the same problem. Anyway I'd be curious whether the theoretical weirdness you found in pack/unpack also applies to normalize/align. Yours, -- Paul ~{:-) p...@illuminatedcomputing.com
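The coalesce operation mentioned above is simple to state even though it is awkward in plain SQL: group rows by their non-temporal columns and merge periods that overlap or touch. A hypothetical Python sketch (rows as (value, (lo, hi)) pairs with half-open integer periods; illustrative only):

```python
from collections import defaultdict

def coalesce_rows(rows):
    """Temporal 'coalesce': merge rows whose non-temporal value is equal
    and whose half-open (lo, hi) periods overlap or are adjacent."""
    by_val = defaultdict(list)
    for val, rng in rows:
        by_val[val].append(rng)
    out = []
    for val, ranges in by_val.items():
        merged = []
        for lo, hi in sorted(ranges):
            if merged and lo <= merged[-1][1]:  # overlap or adjacency
                merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
            else:
                merged.append((lo, hi))
        out.extend((val, r) for r in merged)
    return sorted(out)
```

So ('a', [1,3)) and ('a', [3,5)) coalesce into ('a', [1,5)), while ('b', [2,4)) stays separate because its non-temporal value differs; this is the row-splitting/merging bookkeeping that the normalize/align operators handle inside the executor.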