I am working through a case that requires very high-throughput inserts, which would be slowed down by a unique index, but whose values are externally guaranteed to be unique and could benefit from the query planning and documentation provided by a unique constraint.
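For concreteness, a minimal sketch of the kind of schema I have in mind (all names are hypothetical):

    -- High-ingest table; id values are generated and de-duplicated
    -- upstream, so they are unique in practice, but a unique index
    -- on id would slow the insert path.
    CREATE TABLE events (
        id      bigint NOT NULL,
        payload jsonb
    );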
To fill this need, I propose adding a UNIQUE NOT ENFORCED constraint, which would tell Postgres to assume uniqueness without creating a unique index to enforce it, leaving uniqueness up to the developer, similar to unenforced CHECK and foreign key constraints (https://commitfest.postgresql.org/patch/5285/).

-- Reasoning --

In many cases, the idiomatic or simply best way to write a query involves a uniqueness check: a SELECT WHERE ... = ANY()/IN, really any semi-join once it is optimized to an inner join, a UNION, and so on. The planner then adds a Unique/HashAggregate node, increasing overhead, unless an explicit unique constraint lets it prove the node unnecessary. An unenforced unique constraint would allow developers to apply their knowledge of the data, or of previous validation procedures, to eliminate that extra node. It would also help documentation and observability tools by providing more context about the data without adding write overhead.

A number of SQL databases already implement this, including not only most data warehouses but also many more traditional databases, e.g. Db2 (https://www.ibm.com/docs/en/ias?topic=constraints-informational) and SingleStore (https://docs.singlestore.com/cloud/create-a-database/specifying-unenforced-unique-constraints/).

-- Implementation --

The hardest part of implementing this, as far as I can tell, would be updating the query planner to consider these unenforced constraints. I see two ways to do that:

1. Modify relation/query_has_unique_index (or some other function in the chain that determines uniqueness) to consider constraints instead of just indexes. At plan time, Postgres could include unenforced unique constraints in RelOptInfo, to be retrieved by the relevant function.

2. Create a new "dummy index" index type. This would have no update triggers and an infinite cost to keep it out of query planning, but it would still serve to prove the existence of a unique index.

I am leaning towards the first solution because it would be much simpler and less hacky to implement, although it would lose the flexibility given by custom unique indexes. A dummy index as described in the second solution would likely require much more code. It also might be less intuitive, because it redefines what an index means in that specific case and would expand unique indexes into their own type instead of a kind of B-Tree index.

Most other implementation details should be fairly similar to other unenforced constraints, with some differences when altering constraints. Unlike other unenforced constraints, we can't mark the constraint as NOT VALID, because a NOT VALID constraint doesn't validate already-inserted data, while an enforced UNIQUE must be able to scan all existing data to build the unique index and properly validate uniqueness. Instead, converting an unenforced unique constraint to enforced would follow the same logic as adding a unique constraint: either block for an extended period while the index is created during the ALTER, or build the index concurrently first and then alter the constraint.

Unenforced unique constraints also differ from other unenforced constraints in that their use in query planning can yield incorrect query results, rather than just worse planning estimates. This risk should be clearly documented, and error handling will likely need some changes, but ultimately it is not fundamentally different from the many other ways developers can write faulty queries.
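For illustration, here is a rough sketch of how this could look from the user's side, reusing the hypothetical events table from above. The spelling mirrors the NOT ENFORCED grammar of the patch linked earlier; whether UNIQUE would use exactly this syntax is an open question:

    -- Hypothetical syntax, mirroring the NOT ENFORCED grammar of the
    -- CHECK/foreign key patch linked above:
    ALTER TABLE events
        ADD CONSTRAINT events_id_unique UNIQUE (id) NOT ENFORCED;

    -- With the constraint in place, the planner could treat events.id
    -- as provably unique and skip the Unique/HashAggregate step it
    -- would otherwise need when flattening this semi-join into a
    -- plain join:
    SELECT p.*
    FROM payments p
    WHERE p.event_id IN (SELECT id FROM events);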
Let me know what you think.

Thanks!
Jacob