Re: PostgreSQL 17 Beta 1 release announcement draft

2024-05-23 Thread Jonathan S. Katz

On 5/23/24 8:00 AM, Alvaro Herrera wrote:

Looks good.  Some minor changes:


Thanks for this - right at the deadline! :D


### Query and Operational Performance Improvements

PostgreSQL 17 builds on recent releases and continues to improve performance 
across the entire system. 
[Vacuum](https://www.postgresql.org/docs/17/routine-vacuuming.html), the 
PostgreSQL process responsible for reclaiming storage, has a new internal data 
structure that has shown up to a 20x memory reduction for vacuum,


This reads funny:
"Vacuum ... has shown a memory reduction for vacuum"
Maybe just removing the "for vacuum" words at the end of the phrase is a
sufficient fix.


Fixed.


PostgreSQL 17 can now use both planner statistics and the sort order of [common 
table expressions](https://www.postgresql.org/docs/17/queries-with.html) (aka 
[`WITH` queries](https://www.postgresql.org/docs/17/queries-with.html)) to 
further


Is usage of "aka" typical?  I would have expected "a.k.a." but maybe I'm
just outdated.


Removed, just b/c I can't quickly verify this.


Finally, PostgreSQL 17 adds more explicit SIMD instructions, including AVX-512 
support for the 
[`bit_count](https://www.postgresql.org/docs/17/functions-bitstring.html) 
function.


Note the lack of closing backtick in [`bit_count`].


Fixed.


### Developer Experience

PostgreSQL 17 continues to build on the SQL/JSON standard, adding support for 
the `JSON_TABLE` features that can convert JSON to a standard PostgreSQL table, 
and SQL/JSON constructor (`JSON`, `JSON_SCALAR`, `JSON_SERIALIZE`) and query 
functions (`JSON_EXISTS`, `JSON_QUERY`, `JSON_VALUE`). Notably, these features 
were originally planned for the PostgreSQL 15 release but were reverted during 
the beta period due to design considerations, which is one reason we ask for 
you to help us test features during beta! Additionally, PostgreSQL 17 adds more 
functionality to its `jsonpath` implementation, including the ability to 
convert JSON values to different data types.


I'm not sure it's accurate to say that converting JSON values to
different datatypes is part of the jsonpath implementation; as I
understand, jsonpath is the representation used to search for elements
within JSON values.  If you replace "including" with "and", the result
seems reasonable.


Fixed.


PostgreSQL 17 adds a new connection parameter, `sslnegotation`, which allows 
PostgreSQL to perform direct TLS handshakes when using 
[ALPN](https://en.wikipedia.org/wiki/Application-Layer_Protocol_Negotiation), 
eliminating a network roundtrip. PostgreSQL is registered as `postgresql` in 
the ALPN directory.


Typo here "sslnegotation" missing an i, sslnegotiation.


Fixed.


PostgreSQL 17 normalizes the parameters for `CALL` in 
[`pg_stat_statements`](https://www.postgresql.org/docs/17/pgstatstatements.html),
 reducing the number of entries for frequently called stored procedures. 
Additionally, [`VACUUM` progress 
reporting](https://www.postgresql.org/docs/devel/progress-reporting.html#VACUUM-PROGRESS-REPORTING)
 now shows the progress of vacuuming indexes. PostgreSQL 17 also introduces a 
new view, 
[`pg_wait_events`](https://www.postgresql.org/docs/17/view-pg-wait-events.html),
 which provides descriptions about wait events and can be combined with 
`pg_stat_activity` to give more insight into why an active session is waiting. 
Additionally, some information in the 
[`pg_stat_bgwriter`](https://www.postgresql.org/docs/17/monitoring-stats.html#MONITORING-PG-STAT-BGWRITER-VIEW)
 view is now split out into the new 
[`pg_stat_checkpointer`](https://www.postgresql.org/docs/17/monitoring-stats.html#MONITORING-PG-STAT-CHECKPOINTER-VIEW)
 view.


Note use of one link to "/devel/" here.


Fixed.

Thanks!

Jonathan



OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: PostgreSQL 17 Beta 1 release announcement draft

2024-05-22 Thread Jonathan S. Katz

On 5/19/24 5:34 PM, Jonathan S. Katz wrote:

On 5/15/24 9:45 PM, Jonathan S. Katz wrote:

Hi,

Attached is a copy of the PostgreSQL 17 Beta 1 release announcement 
draft. This contains a user-facing summary of some of the features 
that will be available in the Beta, as well as a call to test. I've 
made an effort to group them logically around the different workflows 
they affect.


A few notes:

* The section with the features is not 80-char delimited. I will do 
that before the final copy


* There is an explicit callout that we've added in the SQL/JSON 
features that were previously reverted in PG15. I want to ensure we're 
transparent about that, but also use it as a hook to get people testing.


When reviewing:

* Please check for correctness of feature descriptions, keeping in 
mind this is targeted for a general audience


* Please indicate if you believe there's a notable omission, or if we 
should omit any of these callouts


* Please indicate if a description is confusing - I'm happy to rewrite 
to ensure it's clearer.


Please provide feedback no later than Wed 2024-05-22 18:00 UTC. As the 
beta release takes some extra effort, I want to ensure all changes are 
in with time to spare before release day.


Thanks for all the feedback to date. Please see the next revision. 
Again, please provide feedback no later than 2024-05-22 18:00 UTC.


Thanks again everyone for all your feedback. Attached is the final(-ish, 
as I'll do one more readthrough before release) draft of the release 
announcement.


If you catch something and are able to post it prior to 2024-05-23 12:00 
UTC, I may be able to incorporate into the announcement.


Thanks!

Jonathan

The PostgreSQL Global Development Group announces that the first beta release of
PostgreSQL 17 is now [available for 
download](https://www.postgresql.org/download/).
This release contains previews of all features that will be available when
PostgreSQL 17 is made generally available, though some details of the release
can change during the beta period.

You can find information about all of the PostgreSQL 17 features and changes in
the [release notes](https://www.postgresql.org/docs/17/release-17.html):

  
[https://www.postgresql.org/docs/17/release-17.html](https://www.postgresql.org/docs/17/release-17.html)

In the spirit of the open source PostgreSQL community, we strongly encourage you
to test the new features of PostgreSQL 17 on your systems to help us eliminate
bugs or other issues that may exist. While we do not advise you to run
PostgreSQL 17 Beta 1 in production environments, we encourage you to find ways
to run your typical application workloads against this beta release.

Your testing and feedback will help the community ensure that the PostgreSQL 17
release upholds our standards of delivering a stable, reliable release of the
world's most advanced open source relational database. Please read more about
our [beta testing process](https://www.postgresql.org/developer/beta/) and how
you can contribute:

  
[https://www.postgresql.org/developer/beta/](https://www.postgresql.org/developer/beta/)

PostgreSQL 17 Feature Highlights


### Query and Operational Performance Improvements

PostgreSQL 17 builds on recent releases and continues to improve performance 
across the entire system. 
[Vacuum](https://www.postgresql.org/docs/17/routine-vacuuming.html), the 
PostgreSQL process responsible for reclaiming storage, has a new internal data 
structure that has shown up to a 20x memory reduction for vacuum, along with 
improvements in overall time to complete its work. Additionally, the vacuum 
process no longer has a `1GB` limit on the memory it can use (controlled by 
[`maintenance_work_mem`](https://www.postgresql.org/docs/17/runtime-config-resource.html#GUC-MAINTENANCE-WORK-MEM)),
 giving you the option to apply more resources to vacuuming.

This release introduces an interface to stream I/O, and can show performance 
improvements when performing sequential scans and running 
[`ANALYZE`](https://www.postgresql.org/docs/17/sql-analyze.html). PostgreSQL 17 
also includes configuration parameters that can control scalability of 
[transaction, subtransaction and multixact 
buffers](https://www.postgresql.org/docs/17/runtime-config-resource.html#GUC-MULTIXACT-MEMBER-BUFFERS).

PostgreSQL 17 can now use both planner statistics and the sort order of [common 
table expressions](https://www.postgresql.org/docs/17/queries-with.html) (aka 
[`WITH` queries](https://www.postgresql.org/docs/17/queries-with.html)) to 
further optimize these queries and help them to execute more quickly. 
Additionally, this release significantly improves execution time of queries 
that use the `IN` clause with a [B-tree 
index](https://www.postgresql.org/docs/17/indexes-types.html#INDEXES-TYPES-BTREE).
 Starting with this release, PostgreSQL can remove redundant `IS NOT NULL` 
statements from execution on columns that have a `NOT NULL` constraint

Re: PostgreSQL 17 Beta 1 release announcement draft

2024-05-22 Thread Jonathan S. Katz

On 5/21/24 6:40 AM, John Naylor wrote:

On Mon, May 20, 2024 at 8:41 PM Masahiko Sawada  wrote:


On Mon, May 20, 2024 at 8:47 PM Jonathan S. Katz  wrote:


On 5/20/24 2:58 AM, John Naylor wrote:

Hi Jon,

Regarding vacuum "has shown up to a 6x improvement in overall time to
complete its work" -- I believe I've seen reported numbers close to
that only 1) when measuring the index phase in isolation or maybe 2)
the entire vacuum of unlogged tables with one, perfectly-correlated
index (testing has less variance with WAL out of the picture). I
believe tables with many indexes would show a lot of improvement, but
I'm not aware of testing that case specifically. Can you clarify where
6x came from?


Sawada-san showed me the original context, but I can't rapidly find it
in the thread. Sawada-san, can you please share the numbers behind this?



I referenced the numbers that I measured during the development[1]
(test scripts are here[2]). IIRC I used unlogged tables and indexes,
and these numbers were the entire vacuum execution time including heap
scanning, index vacuuming and heap vacuuming.


Thanks for confirming.

The wording "has a new internal data structure that reduces memory
usage and has shown up to a 6x improvement in overall time to complete
its work" is specific for runtime, and the memory use is less
specific. Unlogged tables are not the norm, so I'd be cautious of
reporting numbers specifically designed (for testing) to isolate the
thing that changed.

I'm wondering if it might be both more impressive-sounding and more
realistic for the average user experience to reverse that: specific on
memory, and less specific on speed. The best-case memory reduction
occurs for table update patterns that are highly localized, such as
the most recently inserted records, and I'd say those are a lot more
common than the use of unlogged tables.

Maybe something like "has a new internal data structure that reduces
overall time to complete its work and can use up to 20x less memory."

Now, it is true that when dead tuples are sparse and evenly spaced
(e.g. 1 every 100 pages), vacuum can now actually use more memory than
v16. However, the nature of that scenario also means that the number
of TIDs just can't get very big to begin with. In contrast, while the
runtime improvement for normal (logged) tables is likely not
earth-shattering, I believe it will always be at least somewhat
faster, and never slower.


Thanks for the feedback. I flipped it around, per your suggestion:

"has a new internal data structure that has shown up to a 20x memory 
reduction for vacuum, along with improvements in overall time to 
complete its work."


Thanks,

Jonathan


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: PostgreSQL 17 Beta 1 release announcement draft

2024-05-22 Thread Jonathan S. Katz

On 5/20/24 6:08 AM, Alvaro Herrera wrote:

On 2024-May-19, Jonathan S. Katz wrote:


### Query and Operational Performance Improvements


In this section I'd add mention the new GUCs to control SLRU memory
size, which is going to be a huge performance boon for cases where the
current fixed-size buffers cause bottlenecks.  Perhaps something like

"Increase scalability of transaction, subtransaction and multixact
shared memory buffer handling, and make their buffer sizes configurable".

I don't know if we have any published numbers of the performance
improvement achieved, but with this patch (or ancestor ones) some
systems go from completely unoperational to working perfectly fine.
Maybe the best link is here
https://www.postgresql.org/docs/devel/runtime-config-resource.html#GUC-MULTIXACT-MEMBER-BUFFERS
though exactly which GUC affects any particular user is workload-
dependant, so I'm not sure how best to do it.


I originally had penciled in this change, but didn't have a good way of 
describing it. The above solves that problem. I went with:


"PostgreSQL 17 also includes configuration parameters that can control 
scalability of [transaction, subtransaction and multixact 
buffers](https://www.postgresql.org/docs/devel/runtime-config-resource.html#GUC-MULTIXACT-MEMBER-BUFFERS)."



### Developer Experience


I think this section should also include the libpq query cancellation
improvements Jelte wrote.  Maybe something like "On the client side,
PostgreSQL 17 provides better support for asynchronous and more secure
query cancellation routines in libpq."  --> link to
https://www.postgresql.org/docs/17/libpq-cancel.html


I went with:

PostgreSQL 17 also provides better support for [asynchronous and more 
secure query cancellation 
routines](https://www.postgresql.org/docs/17/libpq-cancel.html), which 
drivers can adopt using the libpq API.


Thanks,

Jonathan



OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: PostgreSQL 17 Beta 1 release announcement draft

2024-05-22 Thread Jonathan S. Katz

On 5/19/24 6:15 PM, Erik Rijkers wrote:

Op 5/19/24 om 23:34 schreef Jonathan S. Katz:

On 5/15/24 9:45 PM, Jonathan S. Katz wrote:

Hi,

Attached is a copy of the PostgreSQL 17 Beta 1 release announcement 


'This release introduces adds an interface'  should be:
'This release adds an interface'
    (or 'introduces'; just not both...)


Thanks; adjusted in the next copy.

Jonathan



OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: PostgreSQL 17 Beta 1 release announcement draft

2024-05-22 Thread Jonathan S. Katz

On 5/20/24 5:34 AM, Bertrand Drouvot wrote:

Hi,

On Sun, May 19, 2024 at 05:10:10PM -0400, Jonathan S. Katz wrote:

On 5/16/24 1:15 AM, Bertrand Drouvot wrote:

Hi,

On Wed, May 15, 2024 at 09:45:35PM -0400, Jonathan S. Katz wrote:

Hi,

Attached is a copy of the PostgreSQL 17 Beta 1 release announcement draft.


Thanks for working on it!

I've one comment:


PostgreSQL 17 also introduces a new view, 
[`pg_wait_events`](https://www.postgresql.org/docs/17/view-pg-wait-events.html),
 which provides descriptions about wait events and can be combined with 
`pg_stat_activity` to give more insight into an operation.


Instead of "to give more insight into an operation", what about "to give more
insight about what a session is waiting for (should it be active)"?


I put:

"to give more in insight into why a session is blocked."


Thanks!



Does that work?



I think using "waiting" is better (as the view is "pg_wait_events" and the
join with pg_stat_activity would be on the "wait_event_type" and "wait_event"
columns).

The reason I mentioned "should it be active" is because wait_event and 
wait_event_type
could be non empty in pg_stat_activity while the session is not in an active 
state
anymore (then not waiting).

A right query would be like the one in [1]:

"
SELECT a.pid, a.wait_event, w.description
   FROM pg_stat_activity a JOIN
pg_wait_events w ON (a.wait_event_type = w.type AND
 a.wait_event = w.name)
   WHERE a.wait_event is NOT NULL and a.state = 'active';
"

means filtering on the "active" state too, and that's what the description
proposal I made was trying to highlight.


Thanks. As such I made it:

"which provides descriptions about wait events and can be combined with 
`pg_stat_activity` to give more insight into why an active session is 
waiting."


Jonathan


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: PostgreSQL 17 Beta 1 release announcement draft

2024-05-20 Thread Jonathan S. Katz

On 5/20/24 6:32 AM, David Rowley wrote:

On Mon, 20 May 2024 at 22:11, Alvaro Herrera  wrote:


On 2024-May-16, David Rowley wrote:


On Thu, 16 May 2024 at 17:37, zaidagilist  wrote:

I am trying to open the 17 docs but it looks removed. Getting
following message "Page not found"

https://www.postgresql.org/docs/17/


It's called "devel" for "development" until we branch sometime before July:

https://www.postgresql.org/docs/devel/


Hmm, but that would mean that the Beta1 announce would ship full of
links that will remain broken until July.  I'm not sure what the
workflow for this is, but I hope the /17/ URLs would become valid with
beta1, later this week.


I didn't quite click that it was Jonathan's links that were being
complained about.

I don't know how the website picks up where to link the doc page for a
given version.  I see from e0b82fc8e8 that the PACKAGE_VERSION was
changed from 16devel to 16beta1. Does the website have something that
extracts "devel" from the former and "16" from the latter?  I see the
release announcement for 16beta1 had /16/ links per [1].  So, I guess
it works. I just don't know how.


The tl;dr is that the /17/ links will be available on release day. I've 
validated the current links using the /devel/ heading.


Jonathan



OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: PostgreSQL 17 Beta 1 release announcement draft

2024-05-20 Thread Jonathan S. Katz

On 5/20/24 8:31 AM, jian he wrote:


release note (https://momjian.us/pgsql_docs/release-17.html)
is
"Add jsonpath methods to convert JSON values to other JSON data types
(Jeevan Chalke)"



Additionally, PostgreSQL 17 adds more functionality to its `jsonpath` 
implementation, including the ability to convert JSON values to different data 
types.

so, I am not sure this is 100% correct.

Maybe we can rephrase it like:


Additionally, PostgreSQL 17 adds more functionality to its `jsonpath` 
implementation, including the ability to convert JSON values to other JSON data 
types.


The release note goes on to state:

==
The jsonpath methods are .bigint(), .boolean(), .date(), 
.decimal([precision [, scale]]), .integer(), .number(), .string(), 
.time(), .time_tz(), .timestamp(), and .timestamp_tz().

==

And reviewing the docs[1], these are converted to a PostgreSQL native 
types, and not JSON types (additionally a bunch of those are not JSON 
types).


Jeevan: can you please confirm that this work converts into the 
PostgreSQL native types?


Thanks,

Jonathan

[1] https://www.postgresql.org/docs/devel/functions-json.html


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: PostgreSQL 17 Beta 1 release announcement draft

2024-05-20 Thread Jonathan S. Katz

On 5/20/24 2:58 AM, John Naylor wrote:

Hi Jon,

Regarding vacuum "has shown up to a 6x improvement in overall time to
complete its work" -- I believe I've seen reported numbers close to
that only 1) when measuring the index phase in isolation or maybe 2)
the entire vacuum of unlogged tables with one, perfectly-correlated
index (testing has less variance with WAL out of the picture). I
believe tables with many indexes would show a lot of improvement, but
I'm not aware of testing that case specifically. Can you clarify where
6x came from?


Sawada-san showed me the original context, but I can't rapidly find it 
in the thread. Sawada-san, can you please share the numbers behind this?


We can adjust the claim - but I'd like to ensure we highlight how the 
changes to vacuum will visibly impact users.


Jonathan


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: PostgreSQL 17 Beta 1 release announcement draft

2024-05-20 Thread Jonathan S. Katz

On 5/19/24 7:24 PM, David Rowley wrote:

On Mon, 20 May 2024 at 09:35, Jonathan S. Katz  wrote:

Thanks for all the feedback to date. Please see the next revision.
Again, please provide feedback no later than 2024-05-22 18:00 UTC.


Thanks for the updates.


[`COPY`](https://www.postgresql.org/docs/17/sql-copy.html) is used to 
efficiently bulk load data into PostgreSQL, and with PostgreSQL 17 shows a 2x 
performance improvement when loading large rows.


The 2x thing mentioned by Jelte is for COPY TO rather than COPY FROM.
So I think "exporting" or "sending large rows to the client"  rather
than "loading".


Thanks for the clarification - I've edited it as such. That also brings 
up a good point to highlight that COPY is not just for loading (since my 
bias is to do loads these days :) Now it reads:


[`COPY`](https://www.postgresql.org/docs/17/sql-copy.html) is used to 
efficiently bulk load and export data from PostgreSQL, and now with 
PostgreSQL 17 you may see up to a 2x performance improvement when 
exporting large rows.



There's also a stray "with" in that sentence.


Thanks, fixed.

Jonathan



OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: PostgreSQL 17 Beta 1 release announcement draft

2024-05-19 Thread Jonathan S. Katz

On 5/15/24 9:45 PM, Jonathan S. Katz wrote:

Hi,

Attached is a copy of the PostgreSQL 17 Beta 1 release announcement 
draft. This contains a user-facing summary of some of the features that 
will be available in the Beta, as well as a call to test. I've made an 
effort to group them logically around the different workflows they affect.


A few notes:

* The section with the features is not 80-char delimited. I will do that 
before the final copy


* There is an explicit callout that we've added in the SQL/JSON features 
that were previously reverted in PG15. I want to ensure we're 
transparent about that, but also use it as a hook to get people testing.


When reviewing:

* Please check for correctness of feature descriptions, keeping in mind 
this is targeted for a general audience


* Please indicate if you believe there's a notable omission, or if we 
should omit any of these callouts


* Please indicate if a description is confusing - I'm happy to rewrite 
to ensure it's clearer.


Please provide feedback no later than Wed 2024-05-22 18:00 UTC. As the 
beta release takes some extra effort, I want to ensure all changes are 
in with time to spare before release day.


Thanks for all the feedback to date. Please see the next revision. 
Again, please provide feedback no later than 2024-05-22 18:00 UTC.


Thanks,

Jonathan

The PostgreSQL Global Development Group announces that the first beta release of
PostgreSQL 17 is now [available for 
download](https://www.postgresql.org/download/).
This release contains previews of all features that will be available when
PostgreSQL 17 is made generally available, though some details of the release
can change during the beta period.

You can find information about all of the PostgreSQL 17 features and changes in
the [release notes](https://www.postgresql.org/docs/17/release-17.html):

  
[https://www.postgresql.org/docs/17/release-17.html](https://www.postgresql.org/docs/17/release-17.html)

In the spirit of the open source PostgreSQL community, we strongly encourage you
to test the new features of PostgreSQL 17 on your systems to help us eliminate
bugs or other issues that may exist. While we do not advise you to run
PostgreSQL 17 Beta 1 in production environments, we encourage you to find ways
to run your typical application workloads against this beta release.

Your testing and feedback will help the community ensure that the PostgreSQL 17
release upholds our standards of delivering a stable, reliable release of the
world's most advanced open source relational database. Please read more about
our [beta testing process](https://www.postgresql.org/developer/beta/) and how
you can contribute:

  
[https://www.postgresql.org/developer/beta/](https://www.postgresql.org/developer/beta/)

PostgreSQL 17 Feature Highlights


### Query and Operational Performance Improvements

PostgreSQL 17 builds on recent releases and continues to improve performance 
across the entire system. 
[Vacuum](https://www.postgresql.org/docs/17/routine-vacuuming.html), the 
PostgreSQL process responsible for reclaiming storage, has a new internal data 
structure that reduces memory usage and has shown up to a 6x improvement in 
overall time to complete its work. Additionally, the vacuum process no longer 
has a `1GB` limit on the memory it can use (controlled by 
[`maintenance_work_mem`](https://www.postgresql.org/docs/17/runtime-config-resource.html#GUC-MAINTENANCE-WORK-MEM)),
 giving you the option to apply more resources to vacuuming.

This release introduces adds an interface to stream I/O, and can show 
performance improvements when performing sequential scans and running 
[`ANALYZE`](https://www.postgresql.org/docs/17/sql-analyze.html).

PostgreSQL 17 can now use both planner statistics and the sort order of [common 
table expressions](https://www.postgresql.org/docs/17/queries-with.html) (aka 
[`WITH` queries](https://www.postgresql.org/docs/17/queries-with.html)) to 
further optimize these queries and help them to execute more quickly. 
Additionally, this release significantly improves execution time of queries 
that use the `IN` clause with a [B-tree 
index](https://www.postgresql.org/docs/17/indexes-types.html#INDEXES-TYPES-BTREE).
 Starting with this release, PostgreSQL can remove redundant `IS NOT NULL` 
statements from execution on columns that have a `NOT NULL` constraint, and no 
longer needs to do work on queries that contain an `IS NULL` clause on an `IS 
NOT NULL` column. Now as of PostgreSQL 17, you can use parallel index builds 
for [BRIN](https://www.postgresql.org/docs/17/brin.html) indexes.

Workloads with highly concurrent changes can benefit from PostgreSQL 17 with an 
improvement to how [write-ahead 
log](https://www.postgresql.org/docs/17/wal-intro.html) 
([WAL](https://www.postgresql.org/docs/17/wal-intro.html)) locks are managed, 
with some tests showing up to a 2x performance improvement. 

Finally, PostgreSQL 17 adds more explicit

Re: PostgreSQL 17 Beta 1 release announcement draft

2024-05-19 Thread Jonathan S. Katz

On 5/16/24 8:05 AM, Joe Conway wrote:

On 5/15/24 21:45, Jonathan S. Katz wrote:

Please provide feedback no later than Wed 2024-05-22 18:00 UTC. As the
beta release takes some extra effort, I want to ensure all changes are
in with time to spare before release day.


"You can find information about all of the features and changes found in
PostgreSQL 17"

Sounds repetitive; maybe:

"Information about all of the features and changes in PostgreSQL 17 can 
be found in the [release notes]"


The first is active voice, the suggestion passive. However, I tightened 
the language:


You can find information about all of the PostgreSQL 17 features and 
changes in the [release 
notes](https://www.postgresql.org/docs/17/release-17.html):


"PostgreSQL 17 includes a built-in collation provider that provides 
similar semantics to the `C` collation provided by libc."


I think that needs to mention UTF-8 encoding somehow, and "provided by 
libc" is not really true; maybe:


"PostgreSQL 17 includes a built-in collation provider that provides 
similar sorting semantics to the `C` collation except with UTF-8 
encoding rather than SQL_ASCII."


WFM. Taken verbatim.

Thanks,

Jonathan


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: PostgreSQL 17 Beta 1 release announcement draft

2024-05-19 Thread Jonathan S. Katz

On 5/16/24 6:41 AM, Jelte Fennema-Nio wrote:

On Thu, 16 May 2024 at 03:45, Jonathan S. Katz  wrote:

Attached is a copy of the PostgreSQL 17 Beta 1 release announcement
draft.


I think we can quickly mention c4ab7da6061 in the COPY paragraph, in
some benchmarks it improved perf by close to 2x. Something like this:
"has improved performance in PostgreSQL 17 when the source encoding
matches the destination encoding *and when sending large rows from
server to client*"


(I'm going to make a note to test this with loading large vectors :) 
I've modified the text to reflect this. Please see the new language 
upthread.



Also, I think it's a bit weird to put the current COPY paragraph under
Developer Experience. I think if you want to keep it there instead of
move it to the per section, we should put the line about IGNORE_ERROR
first instead of the perf improvements. Now the IGNORE_ERROR addition
seems more of an afterthought.


I don't agree with this. I think we want to push COPY as a developer 
feature - I see a lot of people not utilizing COPY appropriate when it 
would really benefit the performance of their app, and I think 
emphasizing it as the way to do bulk loads (while touting that it's even 
faster!) will help make it more apparent.



s/IGNORE_ERROR/ON_ERROR


Thanks.


I think it would be good to clarify if the following applies when
upgrading from or to PostgreSQL 17:
"Starting with PostgreSQL 17, you no longer need to drop logical
replication slots when using pg_upgrade"


Adjusted.


Finally, I personally would have included a lot more links for the new
items in this document. Some that would benefit from being a link
imho:
- pg_createsubscriber
- JSON_TABLE
- SQL/JSON constructor
- SQL/JSON query functions
- ON_ERROR
- sslnegotiation
- PQchangePassword
- pg_maintain


I have to check if these have deep links or not, but I was planning to 
make another pass once the copy (no pun intended) is closer to 
finalized, so I don't have to constantly edit markdown.


Jonathan


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: PostgreSQL 17 Beta 1 release announcement draft

2024-05-19 Thread Jonathan S. Katz

On 5/16/24 1:15 AM, Bertrand Drouvot wrote:

Hi,

On Wed, May 15, 2024 at 09:45:35PM -0400, Jonathan S. Katz wrote:

Hi,

Attached is a copy of the PostgreSQL 17 Beta 1 release announcement draft.


Thanks for working on it!

I've one comment:


PostgreSQL 17 also introduces a new view, 
[`pg_wait_events`](https://www.postgresql.org/docs/17/view-pg-wait-events.html),
 which provides descriptions about wait events and can be combined with 
`pg_stat_activity` to give more insight into an operation.


Instead of "to give more insight into an operation", what about "to give more
insight about what a session is waiting for (should it be active)"?


I put:

"to give more in insight into why a session is blocked."

Does that work?

Jonathan



OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: PostgreSQL 17 Beta 1 release announcement draft

2024-05-19 Thread Jonathan S. Katz

On 5/16/24 1:10 AM, Thom Brown wrote:
On Thu, May 16, 2024, 02:45 Jonathan S. Katz <mailto:jk...@postgresql.org>> wrote:


Hi,

Attached is a copy of the PostgreSQL 17 Beta 1 release announcement
draft. This contains a user-facing summary of some of the features that
will be available in the Beta, as well as a call to test. I've made an
effort to group them logically around the different workflows they
affect.

A few notes:

* The section with the features is not 80-char delimited. I will do
that
before the final copy

* There is an explicit callout that we've added in the SQL/JSON
features
that were previously reverted in PG15. I want to ensure we're
transparent about that, but also use it as a hook to get people testing.

When reviewing:

* Please check for correctness of feature descriptions, keeping in mind
this is targeted for a general audience

* Please indicate if you believe there's a notable omission, or if we
should omit any of these callouts

* Please indicate if a description is confusing - I'm happy to rewrite
to ensure it's clearer.

Please provide feedback no later than Wed 2024-05-22 18:00 UTC. As the
beta release takes some extra effort, I want to ensure all changes are
in with time to spare before release day.


"Now as of PostgreSQL 17, you can now use parallel index builds for 
[BRIN](https://www.postgresql.org/docs/17/brin.html 
<https://www.postgresql.org/docs/17/brin.html>) indexes."


The 2nd "now" is redundant.


"Finally, PostgreSQL 17 adds more explicitly SIMD instructions, 
including AVX-512 support for the 
[`bit_count](https://www.postgresql.org/docs/17/functions-bitstring.html 
<https://www.postgresql.org/docs/17/functions-bitstring.html>) function."


Would "SIMD-explicit instructions" be better? Also, I know you may not 
be using markdown for the final version, but the bit_count backtick 
isn't matched by a closing backtick.



"[`COPY`](https://www.postgresql.org/docs/17/sql-copy.html 
<https://www.postgresql.org/docs/17/sql-copy.html>), used to efficiently 
bulk load data into PostgreSQL"


The "used to" makes me stumble into reading it as meaning "it previously 
could efficiently bulk load data".


Perhaps just add a "which is" before "used"?


"PostgreSQL 17 includes a built-in collation provider that provides 
similar semantics to the `C` collation provided by libc."


"provider", "provides", and "provided" feels too repetitive.

How about, "PostgreSQL 17 includes a built-in collation provider with 
semantics similar to the `C` collation offered by libc."?


Thanks - I accepted (with modifications) most of the suggestions here. 
I'll include in the next version of the draft.


Jonathan



OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: PostgreSQL 17 Beta 1 release announcement draft

2024-05-19 Thread Jonathan S. Katz

On 5/15/24 10:42 PM, David Rowley wrote:

Thanks for writing that up. It's always exciting to see this each year.

On Thu, 16 May 2024 at 13:45, Jonathan S. Katz  wrote:

* Please indicate if you believe there's a notable omission, or if we
should omit any of these callouts


I'd say the streaming read stuff added in b5a9b18cd and subsequent
commits like b7b0f3f27 and 041b96802 are worth a mention. I'd be happy
to see this over the IS NOT NULL qual stuff that I worked on in there
or even the AVX512 bit counting. Speeding up a backwater aggregate
function is nice, but IMO, not compatible with reducing the number
reads.
There's some benchmarking in a youtube video:
https://youtu.be/QAYzWAlxCYc?si=L0UT6Lrf067ZBv46=237


Nice! Definitely agree on including this - it wasn't initially clear to 
me on the read of the release notes. I'll update it. Please see in the 
next revision (will posted upthread), proposed text here for 
convenience, as I'm not sure I'm appropriately capturing it:


==
This release introduces adds an interface to stream I/O, and can show 
performance improvements when performing sequential scans and running 
[`ANALYZE`](https://www.postgresql.org/docs/17/sql-analyze.html).

==

The AVX-512 bit counting showed solid impact[1] on the binary distance 
functions in pgvector (I have to re-run again w/v17, as I seem to recall 
seeing some numbers that boosted it 5-7x [but recall isn't 100% ;)]).



* Please indicate if a description is confusing - I'm happy to rewrite
to ensure it's clearer.

Please provide feedback no later than Wed 2024-05-22 18:00 UTC.


The only other thing I saw from a quick read was a stray "the" in "the
copy proceed even if the there is an error inserting a row."


Thanks!

Jonathan

[1] https://github.com/pgvector/pgvector/pull/519



OpenPGP_signature.asc
Description: OpenPGP digital signature


PostgreSQL 17 Beta 1 release announcement draft

2024-05-15 Thread Jonathan S. Katz

Hi,

Attached is a copy of the PostgreSQL 17 Beta 1 release announcement 
draft. This contains a user-facing summary of some of the features that 
will be available in the Beta, as well as a call to test. I've made an 
effort to group them logically around the different workflows they affect.


A few notes:

* The section with the features is not 80-char delimited. I will do that 
before the final copy


* There is an explicit callout that we've added in the SQL/JSON features 
that were previously reverted in PG15. I want to ensure we're 
transparent about that, but also use it as a hook to get people testing.


When reviewing:

* Please check for correctness of feature descriptions, keeping in mind 
this is targeted for a general audience


* Please indicate if you believe there's a notable omission, or if we 
should omit any of these callouts


* Please indicate if a description is confusing - I'm happy to rewrite 
to ensure it's clearer.


Please provide feedback no later than Wed 2024-05-22 18:00 UTC. As the 
beta release takes some extra effort, I want to ensure all changes are 
in with time to spare before release day.


Thanks,

Jonathan
The PostgreSQL Global Development Group announces that the first beta release of
PostgreSQL 17 is now [available for 
download](https://www.postgresql.org/download/).
This release contains previews of all features that will be available when
PostgreSQL 17 is made generally available, though some details of the release
can change during the beta period.

You can find information about all of the features and changes found in
PostgreSQL 17 in the [release 
notes](https://www.postgresql.org/docs/17/release-17.html):

  
[https://www.postgresql.org/docs/17/release-17.html](https://www.postgresql.org/docs/17/release-17.html)

In the spirit of the open source PostgreSQL community, we strongly encourage you
to test the new features of PostgreSQL 17 on your systems to help us eliminate
bugs or other issues that may exist. While we do not advise you to run
PostgreSQL 17 Beta 1 in production environments, we encourage you to find ways
to run your typical application workloads against this beta release.

Your testing and feedback will help the community ensure that the PostgreSQL 17
release upholds our standards of delivering a stable, reliable release of the
world's most advanced open source relational database. Please read more about
our [beta testing process](https://www.postgresql.org/developer/beta/) and how
you can contribute:

  
[https://www.postgresql.org/developer/beta/](https://www.postgresql.org/developer/beta/)

PostgreSQL 17 Feature Highlights


### Query and Operational Performance Improvements

PostgreSQL 17 builds on recent releases and continues to improve performance 
across the entire system. 
[Vacuum](https://www.postgresql.org/docs/17/routine-vacuuming.html), the 
PostgreSQL process responsible for reclaiming storage, has a new internal data 
structure that reduces memory usage and has shown up to a 6x improvement in 
overall time to complete its work. Additionally, the vacuum process no longer 
has a `1GB` limit on the memory it can use (controlled by 
[`maintenance_work_mem`](https://www.postgresql.org/docs/17/runtime-config-resource.html#GUC-MAINTENANCE-WORK-MEM)),
 giving you the option to apply more resources to vacuuming.

PostgreSQL 17 can now use both planner statistics and the sort order of [common 
table expressions](https://www.postgresql.org/docs/17/queries-with.html) (aka 
[`WITH` queries](https://www.postgresql.org/docs/17/queries-with.html)) to 
further optimize these queries and help them to execute more quickly. 
Additionally, this release significantly improves execution time of queries 
that use the `IN` clause with a [B-tree 
index](https://www.postgresql.org/docs/17/indexes-types.html#INDEXES-TYPES-BTREE).
 Starting with this release, PostgreSQL can remove redundant `IS NOT NULL` 
statements from execution on columns that have a `NOT NULL` constraint, and no 
longer needs to do work on queries that contain an `IS NULL` clause on an `IS 
NOT NULL` column. Now as of PostgreSQL 17, you can now use parallel index 
builds for [BRIN](https://www.postgresql.org/docs/17/brin.html) indexes.

Workloads with highly concurrent changes can benefit from PostgreSQL 17 with an 
improvement to how [write-ahead 
log](https://www.postgresql.org/docs/17/wal-intro.html) 
([WAL](https://www.postgresql.org/docs/17/wal-intro.html)) locks are managed, 
with some tests showing up to a 2x performance improvement. 

Finally, PostgreSQL 17 adds more explicitly SIMD instructions, including 
AVX-512 support for the 
[`bit_count](https://www.postgresql.org/docs/17/functions-bitstring.html) 
function.


### Partitioned and distributed workloads enhancements

PostgreSQL 17 brings more flexibility to partition management, adding the 
ability to both split and merge partitions, and adds support to partitioned 
tables for identity 

Re: 2024-05-09 release announcement draft

2024-05-08 Thread Jonathan S. Katz

On 5/8/24 5:44 PM, David Rowley wrote:

On Thu, 9 May 2024 at 04:17, Jonathan S. Katz  wrote:

* Fix how
[`INSERT`](https://www.postgresql.org/docs/current/sql-insert.html)
handles multiple
[`VALUES`](https://www.postgresql.org/docs/current/sql-values.html) rows
into a target column that is a domain over an array or composite type.


Maybe it's only me who thinks the double plural of "VALUES rows" is
hard to parse. If that's the case I'll just drop this as it's not that
important.

FWIW, I'd probably write:

* Fix issue with
[`INSERT`](https://www.postgresql.org/docs/current/sql-insert.html)
with a multi-row
[`VALUES`](https://www.postgresql.org/docs/current/sql-values.html) clause
where a target column is a domain over an array or composite type.


I like your wording, and went with that.

Thanks,

Jonathan



OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: 2024-05-09 release announcement draft

2024-05-08 Thread Jonathan S. Katz

On 5/7/24 12:16 AM, Tom Lane wrote:

David Rowley  writes:

Why not "Fix INSERT with multi-row VALUES clauses ..."?


To my mind, the VALUES clause is the data source for INSERT,
so "from" seems appropriate.  I'm not going to argue hard
about it.


OK, so I've read through this a few times and have sufficiently confused 
myself. So, how about this:


* Fix how 
[`INSERT`](https://www.postgresql.org/docs/current/sql-insert.html) 
handles multiple 
[`VALUES`](https://www.postgresql.org/docs/current/sql-values.html) rows 
into a target column that is a domain over an array or composite type.


Thanks,

Jonathan


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: 2024-05-09 release announcement draft

2024-05-06 Thread Jonathan S. Katz

On 5/6/24 11:05 PM, David Rowley wrote:

On Tue, 7 May 2024 at 14:58, Jonathan S. Katz  wrote:

* Throw an error if an index is accessed while it is being reindexed.




Based on this, I'd vote to just remove it from the release announcement.


I'd prefer that over leaving the wording the way it is.  Looking at
the test case in [1], it does not seem like a very likely thing for
people to hit. It basically requires someone to be telling lies about
a function's immutability.


I opted for that; and it turned out the other fix was simple, so here's 
an updated draft.


Jonathan

The PostgreSQL Global Development Group has released an update to all supported
versions of PostgreSQL, including 16.3, 15.7, 14.12, 13.15, and 12.19.
This release fixes over 55 bugs reported over the last several months.

For the full list of changes, please review the
[release notes](https://www.postgresql.org/docs/release/).

PostgreSQL 12 EOL Notice


PostgreSQL 12 will stop receiving fixes on November 14, 2024. If you are
running PostgreSQL 12 in a production environment, we suggest that you make
plans to upgrade to a newer, supported version of PostgreSQL. Please see our
[versioning policy](https://www.postgresql.org/support/versioning/) for more
information.

Bug Fixes and Improvements
--
 
This update fixes over 55 bugs that were reported in the last several months.
The issues listed below affect PostgreSQL 16. Some of these issues may also
affect other supported versions of PostgreSQL.

* Fix [`INSERT`](https://www.postgresql.org/docs/current/sql-insert.html) from
multiple [`VALUES`](https://www.postgresql.org/docs/current/sql-values.html)
rows into a target column that is a domain over an array or composite type.
* Require the [SELECT 
privilege](https://www.postgresql.org/docs/current/sql-grant.html)
on the target table when using 
[`MERGE`](https://www.postgresql.org/docs/current/sql-merge.html)
when using `MERGE ... DO NOTHING`.
* Per the SQL standard, throw an error if a target row in `MERGE` joins to more
than one source row during a modification.
* Fix incorrect pruning of `NULL` partition when a table is partitioned on
a boolean column and the query has a boolean `IS NOT` clause.
* Make [`ALTER FOREIGN TABLE ... SET 
SCHEMA`](https://www.postgresql.org/docs/current/sql-alterforeigntable.html)
move any owned sequences into the new schema.
* [`CREATE 
DATABASE`](https://www.postgresql.org/docs/current/sql-createdatabase.html)
now recognizes `STRATEGY` keywords case-insensitively.
* Fix how [EXPLAIN](https://www.postgresql.org/docs/current/sql-explain.html)
counts heap pages during bitmap heap scan to show all counted pages, not just
ones with visible tuples.
* Avoid deadlock during removal of orphaned temporary tables.
* Several fixes for 
[`VACUUM`](https://www.postgresql.org/docs/current/sql-vacuum.html),
including one that can reduce unnecessary I/O.
* Several query planner fixes.
* Add optimization for certain operations where an installation has thousands
of roles.
* Fix confusion for SQL-language procedures that return a single composite-type
column.
* Fix incorrect rounding and overflow hazards in 
[`date_bin()`](https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-BIN).
* Detect integer overflow when adding or subtracting an
[interval](https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT)
to/from a timestamp.
* Fix several race conditions with logical replication, including determining if
a table sync operation is required.
* Disconnect if a new server session's client socket cannot be put into 
non-blocking mode.
* [`initdb -c`](https://www.postgresql.org/docs/current/app-initdb.html) now
matches parameter names case-insensitively.
* Avoid leaking a query result from 
[`psql`](https://www.postgresql.org/docs/current/app-psql.html)
after the query is cancelled.
* Fix how PL/pgSQL parses of [single-line 
comments](https://www.postgresql.org/docs/current/plpgsql-structure.html) (`-- 
style comments`)
following expression.

Updating


All PostgreSQL update releases are cumulative. As with other minor releases,
users are not required to dump and reload their database or use `pg_upgrade` in
order to apply this update release; you may simply shutdown PostgreSQL and
update its binaries.

Users who have skipped one or more update releases may need to run additional
post-update steps; please see the release notes from earlier versions for
details.

For more details, please see the
[release notes](https://www.postgresql.org/docs/release/).

Links
-
* [Download](https://www.postgresql.org/download/)
* [Release Notes](https://www.postgresql.org/docs/release/)
* [Security](https://www.postgresql.org/support/security/)
* [Versioning Policy](https://www.postgresql.org/support/versioning/)
* [PostgreSQL 16 Release Announcement](https://www.postgresql.org/about/press/)
* [Follow @postgresql on X/Twitter

Re: 2024-05-09 release announcement draft,2024-05-09 release announcement draft

2024-05-06 Thread Jonathan S. Katz

On 5/6/24 5:36 PM, Sutou Kouhei wrote:

Hi,

In 
<779790c7-45d7-4010-9305-c3f9e6a60...@postgresql.org>,<779790c7-45d7-4010-9305-c3f9e6a60...@postgresql.org>
   "2024-05-09 release announcement draft,2024-05-09 release announcement 
draft" on Mon, 6 May 2024 13:44:05 -0400,
   "Jonathan S. Katz"  wrote:


* Added optimization for certain operations where an installation has thousands
of roles.


Added ->
Add


Fixed.


* [Follow @postgresql on Twitter](https://twitter.com/postgresql)


Twitter ->
X


I think this one is less clear, from browsing around. I think 
"X/Twitter" is considered acceptable, and regardless the domain is still 
pointing to "Twitter". However, I'll go with the hybrid adjustment.


Jonathan



OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: 2024-05-09 release announcement draft

2024-05-06 Thread Jonathan S. Katz

On 5/6/24 5:08 PM, David Rowley wrote:

On Tue, 7 May 2024 at 05:44, Jonathan S. Katz  wrote:

Please provide feedback no later (and preferably sooner) than 2024-05-09
12:00 UTC.


Thanks for the draft.  Here's some feedback.


* Fix [`INSERT`](https://www.postgresql.org/docs/current/sql-insert.html) from
multiple [`VALUES`](https://www.postgresql.org/docs/current/sql-values.html)
rows into a target column that is a domain over an array or composite type.
including requiring the [SELECT 
privilege](https://www.postgresql.org/docs/current/sql-grant.html)
on the target table when using 
[`MERGE`](https://www.postgresql.org/docs/current/sql-merge.html)
when using `MERGE ... DO NOTHING`.


Something looks wrong with the above. Are two separate items merged
into one?  52898c63e and a3f5d2056?


Ugh, I see what happened. I was originally planning to combine them, and 
then had one be the lede, then the other. Given I ended up consolidating 
quite a bit, I'll just have them each stand on their own. I'll fix this 
in the next draft (which I'll upload on my Tuesday).



* Fix confusion for SQL-language procedures that returns a single composite-type
column.


Should "returns" be singular here?


Fixed.


* Throw an error if an index is accessed while it is being reindexed.


  I know you want to keep these short and I understand the above is the
same wording from release notes, but these words sound like a terrible
oversite that we allow any concurrent query to still use the table
while a reindex is in progress.


Yeah, I was not happy with this one at all.

  Maybe we should give more detail

there so people don't think we made such a silly mistake. The release
note version I think does have enough words to allow the reader to
understand that the mistake is more complex. Maybe we could add
something here to make it sound like less of an embarrassing mistake?


Based on this, I'd vote to just remove it from the release announcement.

Jonathan



OpenPGP_signature.asc
Description: OpenPGP digital signature


2024-05-09 release announcement draft

2024-05-06 Thread Jonathan S. Katz

Hi,

Please find the draft of the 2024-05-09 release announcement.

Please review for corrections and any omissions that should be called 
out as part of this release.


Please provide feedback no later (and preferably sooner) than 2024-05-09 
12:00 UTC.


Thanks,

Jonathan
The PostgreSQL Global Development Group has released an update to all supported
versions of PostgreSQL, including 16.3, 15.7, 14.12, 13.15, and 12.19.
This release fixes over 55 bugs reported over the last several months.

For the full list of changes, please review the
[release notes](https://www.postgresql.org/docs/release/).

PostgreSQL 12 EOL Notice


PostgreSQL 12 will stop receiving fixes on November 14, 2024. If you are
running PostgreSQL 12 in a production environment, we suggest that you make
plans to upgrade to a newer, supported version of PostgreSQL. Please see our
[versioning policy](https://www.postgresql.org/support/versioning/) for more
information.

Bug Fixes and Improvements
--
 
This update fixes over 55 bugs that were reported in the last several months.
The issues listed below affect PostgreSQL 16. Some of these issues may also
affect other supported versions of PostgreSQL.

* Fix [`INSERT`](https://www.postgresql.org/docs/current/sql-insert.html) from
multiple [`VALUES`](https://www.postgresql.org/docs/current/sql-values.html)
rows into a target column that is a domain over an array or composite type.
including requiring the [SELECT 
privilege](https://www.postgresql.org/docs/current/sql-grant.html)
on the target table when using 
[`MERGE`](https://www.postgresql.org/docs/current/sql-merge.html)
when using `MERGE ... DO NOTHING`.
* Per the SQL standard, throw an error if a target row in `MERGE` joins to more
than one source row during a modification.
* Fix incorrect pruning of `NULL` partition when a table is partitioned on
a boolean column and the query has a boolean `IS NOT` clause.
* Make [`ALTER FOREIGN TABLE ... SET 
SCHEMA`](https://www.postgresql.org/docs/current/sql-alterforeigntable.html)
move any owned sequences into the new schema.
* [`CREATE 
DATABASE`](https://www.postgresql.org/docs/current/sql-createdatabase.html)
now recognizes `STRATEGY` keywords case-insensitively.
* Fix how [EXPLAIN](https://www.postgresql.org/docs/current/sql-explain.html)
counts heap pages during bitmap heap scan to show all counted pages, not just
ones with visible tuples.
* Avoid deadlock during removal of orphaned temporary tables.
* Several fixes for 
[`VACUUM`](https://www.postgresql.org/docs/current/sql-vacuum.html),
including one that can reduce unnecessary I/O.
* Several query planner fixes.
* Added optimization for certain operations where an installation has thousands
of roles.
* Fix confusion for SQL-language procedures that returns a single composite-type
column.
* Fix incorrect rounding and overflow hazards in 
[`date_bin()`](https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-BIN).
* Detect integer overflow when adding or subtracting an
[interval](https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT)
to/from a timestamp.
* Throw an error if an index is accessed while it is being reindexed.
* Fix several race conditions with logical replication, including determining if
a table sync operation is required.
* Disconnect if a new server session's client socket cannot be put into 
non-blocking mode.
* [`initdb -c`](https://www.postgresql.org/docs/current/app-initdb.html) now
matches parameter names case-insensitively.
* Avoid leaking a query result from 
[`psql`](https://www.postgresql.org/docs/current/app-psql.html)
after the query is cancelled.
* Fix how PL/pgSQL parses of [single-line 
comments](https://www.postgresql.org/docs/current/plpgsql-structure.html) (`-- 
style comments`)
following expression.

Updating


All PostgreSQL update releases are cumulative. As with other minor releases,
users are not required to dump and reload their database or use `pg_upgrade` in
order to apply this update release; you may simply shutdown PostgreSQL and
update its binaries.

Users who have skipped one or more update releases may need to run additional
post-update steps; please see the release notes from earlier versions for
details.

For more details, please see the
[release notes](https://www.postgresql.org/docs/release/).

Links
-
* [Download](https://www.postgresql.org/download/)
* [Release Notes](https://www.postgresql.org/docs/release/)
* [Security](https://www.postgresql.org/support/security/)
* [Versioning Policy](https://www.postgresql.org/support/versioning/)
* [PostgreSQL 16 Release Announcement](https://www.postgresql.org/about/press/)
* [Follow @postgresql on Twitter](https://twitter.com/postgresql)
* [Donate](https://www.postgresql.org/about/donate/)

If you have corrections or suggestions for this release announcement, please
send them to the _pgsql-www@lists.postgresql.org_ public

New committers: Melanie Plageman, Richard Guo

2024-04-26 Thread Jonathan S. Katz
The Core Team would like to extend our congratulations to Melanie 
Plageman and Richard Guo, who have accepted invitations to become our 
newest PostgreSQL committers.


Please join us in wishing them much success and few reverts!

Thanks,

Jonathan


OpenPGP_signature.asc
Description: OpenPGP digital signature


PostgreSQL 17 Beta 1 release date

2024-04-18 Thread Jonathan S. Katz

Hi,

PostgreSQL 17 Beta 1 is planned to be release on May 23, 2024. Please 
continue your hard work on closing out open items[1] ahead of the 
release and have the fixes targeted for the release committed by May 18, 
2024.


Thanks - it's very exciting that we're at this point in the release cycle!

Jonathan

[1] https://wiki.postgresql.org/wiki/PostgreSQL_17_Open_Items


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: Reports on obsolete Postgres versions

2024-04-12 Thread Jonathan S. Katz

On 4/12/24 2:12 PM, Bruce Momjian wrote:


I am ready to apply this patch to the website.  How do I do this?  Do I
just commit this to the pgweb git tree?  Does that push to the website?


I pushed this to the website[1].

Thanks,

Jonathan

[1] https://www.postgresql.org/support/versioning/



OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: PostgreSQL 17 Release Management Team & Feature Freeze

2024-04-08 Thread Jonathan S. Katz

On 4/8/24 8:29 AM, Andres Freund wrote:

Hi,

On 2024-04-08 09:26:09 -0400, Robert Haas wrote:

On Sun, Apr 7, 2024 at 6:50 PM Michael Paquier  wrote:
And maybe we need to think of a way to further mitigate this crush of
last minute commits. e.g. In the last week, you can't have more
feature commits, or more lines of insertions in your commits, than you
did in the prior 3 weeks combined. I don't know. I think this mad rush
of last-minute commits is bad for the project.


I don't think it's very useful to paint a very broad brush here,
unfortunately. Some will just polish commits until the last minute, until the
the dot's on the i's really shine, others will continue picking up more CF
entries until the freeze is reached, others will push half baked stuff.  Of
course there will be an increased commit rate, but it does looks like there
was some stuff that looked somewhat rickety.


I agree with Andres here (though decline to comment on the rickety-ness 
of the patches). I think overcoming human nature to be more proactive at 
a deadline is at least a NP-hard problem. This won't change if we adjust 
deadlines. I think it's better to ensure we're enforcing best practices 
for commits, and maybe that's a separate review to have.


As mentioned in different contexts, we do have several safeguards for a 
release:


* We have a (fairly long) beta period; this allows us to remove patches 
prior to GA and get in further testing.
* We have a RMT that (as Tom mentioned) can use its powers early and 
often to remove patches that are not up to our quality levels.
* We can evaluate the quality of the commits coming in and coach folks 
on what to do better.


I understand that a revert is costly, particularly the longer a commit 
stays in, and I do 100% agree we should maintain the high commit bar we 
have and not rush things in just so "they're in for feature freeze and 
we'll clean them up for beta." That's where best practices come in.


I tend to judge the release by the outcome: once we go GA, how buggy is 
the release? Did something during the release cycle (e.g. a sloppy 
commit during feature freeze, lack of testing) lead to a bug that 
warranted an out-of-cycle release? And yes, how we commit things at 
feature freeze / through the beta impact that - we should ensure we're 
still committing things that we would have committed at a least hectic 
time, but understand that the deadline is still a strong motivator co 
complete the work.


Thanks,

Jonathan


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: 2024-02-08 release announcement draft

2024-02-07 Thread Jonathan S. Katz

On 2/6/24 3:19 AM, jian he wrote:

On Tue, Feb 6, 2024 at 12:43 PM Jonathan S. Katz  wrote:


Hi,

Attached is a draft of the 2024-02-08 release announcement. Please
review for accuracy and notable omissions.

Please provide any feedback no later than 2024-02-08 12:00 UTC (and
preferably sooner).



* In PL/pgSQL, support SQL commands that are `CREATE FUNCTION`/`CREATE
PROCEDURE`
with SQL-standard bodies.

https://git.postgresql.org/cgit/postgresql.git/commit/?id=57b440ec115f57ff6e6a3f0df26063822e3123d2
says only for plpgsql routine or DO block.


I had some trouble wordsmithing this, but the commit title is pretty 
clear to me so I opted for that.



* Ensure initdb always uncomments `postgresql.conf` entries for the
`lc_` family of parameters.

maybe  `initdb`


I applied this change -- thanks!

Jonathan



OpenPGP_signature.asc
Description: OpenPGP digital signature


2024-02-08 release announcement draft

2024-02-05 Thread Jonathan S. Katz

Hi,

Attached is a draft of the 2024-02-08 release announcement. Please 
review for accuracy and notable omissions.


Please provide any feedback no later than 2024-02-08 12:00 UTC (and 
preferably sooner).


Thanks,

Jonathan
The PostgreSQL Global Development Group has released an update to all supported
versions of PostgreSQL, including 16.2, 15.6, 14.11, 13.14, and 12.18.
This release fixes over 65 bugs reported over the last several months.

If you use GIN indexes, you may need to reindex after updating to this release.
Please see the [release notes](https://www.postgresql.org/docs/release/) for
more information.

For the full list of changes, please review the
[release notes](https://www.postgresql.org/docs/release/).

Bug Fixes and Improvements
--
 
This update fixes over 65 bugs that were reported in the last several months.
The issues listed below affect PostgreSQL 16. Some of these issues may also
affect other supported versions of PostgreSQL.

* Fix memory leak when performing JIT inlining that could lead to out-of-memory
conditions.
* Several query planner fixes.
* Align `MERGE` behavior with `UPDATE` when updating a partition key column and
skip firing `AFTER UPDATE ROW` trigger and other post-update actions.
* Fix problems with duplicate token names in `ALTER TEXT SEARCH CONFIGURATION 
... MAPPING`
commands.
* Fix `DROP ROLE` with duplicate role names.
* Properly lock the associated table during DROP STATISTICS to prevent errors if
`ANALYZE` is running concurrently.
* Fix function volatility checking for `GENERATED` and `DEFAULT` expressions.
* Ensure collation matches when matching an existing index to a new partitioned
index.
* Avoid failure if a child index is dropped concurrently with `REINDEX INDEX` 
on a
partitioned index.
* Fix for locking during cleanup of GIN indexes. For this case, if multiple
processes tried to clean the same GIN index page, there was a chance of index
corruption. If you believe you were affected by this issue, reindex your GIN
indexes after installing this update.
* Avoid failure with partitioned SP-GiST indexes.
* Several ownership fixes for large objects.
* In `EXPLAIN (BUFFERS)`, change name of I/O timing data "shared/local" to
"shared".
* Ensure durability of the `CREATE DATABASE` command if a system crash occurred
during or shortly after execution.
* Add more logging messages when starting and ending recovery from a backup.
* Revert a change that made the walreceiver process unresponsive to `SIGTERM`
while waiting for a replication connection to be established.
* Several fixes for logical replication.
* Fix incompatibility with OpenSSL 3.2.
* In PL/pgSQL, support SQL commands that are `CREATE FUNCTION`/`CREATE 
PROCEDURE`
with SQL-standard bodies.
* Fix for error handling in [libpq pipeline 
mode](https://www.postgresql.org/docs/current/libpq-pipeline-mode.html).
* Ensure initdb always uncomments `postgresql.conf` entries for the
`lc_` family of parameters.
* In `pg_dump`, don't dump RLS policies or security labels for extension member
objects.

This release also updates time zone data files to tzdata release 2024a for DST
law changes in Greenland, Kazakhstan, and Palestine, plus corrections for the
Antarctic stations Casey and Vostok. Also historical corrections for Vietnam,
Toronto, and Miquelon.

Updating


All PostgreSQL update releases are cumulative. As with other minor releases,
users are not required to dump and reload their database or use `pg_upgrade` in
order to apply this update release; you may simply shutdown PostgreSQL and
update its binaries.

If you use GIN indexes, you may need to reindex after updating to this release.
Please see the [release notes](https://www.postgresql.org/docs/release/) for
more information.

Users who have skipped one or more update releases may need to run additional
post-update steps; please see the release notes from earlier versions for
details.

For more details, please see the
[release notes](https://www.postgresql.org/docs/release/).

Links
-
* [Download](https://www.postgresql.org/download/)
* [Release Notes](https://www.postgresql.org/docs/release/)
* [Security](https://www.postgresql.org/support/security/)
* [Versioning Policy](https://www.postgresql.org/support/versioning/)
* [PostgreSQL 16 Release Announcement](https://www.postgresql.org/about/press/)
* [Follow @postgresql on Twitter](https://twitter.com/postgresql)
* [Donate](https://www.postgresql.org/about/donate/)

If you have corrections or suggestions for this release announcement, please
send them to the _pgsql-www@lists.postgresql.org_ public
[mailing list](https://www.postgresql.org/list/).


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: heavily contended lwlocks with long wait queues scale badly

2024-01-16 Thread Jonathan S. Katz

On 1/16/24 1:11 AM, Michael Paquier wrote:

On Thu, Jan 11, 2024 at 09:47:33AM -0500, Jonathan S. Katz wrote:

I have similar data sources to Nathan/Michael and I'm trying to avoid piling
on, but one case that's interesting occurred after a major version upgrade
from PG10 to PG14 on a database supporting a very active/highly concurrent
workload. On inspection, it seems like backpatching would help this
particularly case.

With 10/11 EOL, I do wonder if we'll see more of these reports on upgrade to
< PG16.

(I was in favor of backpatching prior; opinion is unchanged).


Hearing nothing, I have prepared a set of patches for v12~v15,
checking all the lwlock paths for all the branches.  At the end the
set of changes look rather sane to me regarding the queue handlings.

I have also run some numbers on all the branches, and the test case
posted upthread falls off dramatically after 512 concurrent
connections at the top of all the stable branches :(

For example on REL_12_STABLE with and without the patch attached:
num  v12   v12+patch
129717.151665  29096.707588
263257.709301  61889.476318
4127921.873393 124575.901330
8231400.571662 230562.725174
16   343911.185351 312432.897015
32   291748.985280 281011.787701
64   268998.728648 269975.605115
128  297332.597018 286449.176950
256  243902.817657 240559.122309
512  190069.602270 194510.718508
768  58915.650225  165714.707198
1024 39920.950552  149433.836901
2048 16922.391688  108164.301054
4096 6229.063321   69032.338708

I'd like to apply that, just let me know if you have any comments
and/or objections.


Wow. All I can say is that my opinion remains unchanged on going forward 
with backpatching.


Looking at the code, I understand an argument for not backpatching given 
we modify the struct, but this does seem low-risk/high-reward and should 
help PostgreSQL to run better on this higher throughput workloads.


Thanks,

Jonathan


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: heavily contended lwlocks with long wait queues scale badly

2024-01-11 Thread Jonathan S. Katz

On 1/10/24 10:45 PM, Michael Paquier wrote:

On Wed, Jan 10, 2024 at 09:17:47PM -0600, Nathan Bossart wrote:

Now that commit a4adc31 has had some time to bake and concerns about
unintended consequences may have abated, I wanted to revive this
back-patching discussion.  I see a few possibly-related reports [0] [1]
[2], and I'm now seeing this in the field, too.  While it is debatable
whether this is a bug, it's a quite nasty issue for users, and it's both
difficult to detect and difficult to work around.


+1, I've seen this becoming a PITA for a few things.  Knowing that the
size of PGPROC does not change at all, I would be in favor for a
backpatch, especially since it's been in the tree for more than 1
year, and even more knowing that we have 16 released with this stuff
in.


I have similar data sources to Nathan/Michael and I'm trying to avoid 
piling on, but one case that's interesting occurred after a major 
version upgrade from PG10 to PG14 on a database supporting a very 
active/highly concurrent workload. On inspection, it seems like 
backpatching would help this particularly case.


With 10/11 EOL, I do wonder if we'll see more of these reports on 
upgrade to < PG16.


(I was in favor of backpatching prior; opinion is unchanged).

Thanks,

Jonathan



OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: Password leakage avoidance

2024-01-03 Thread Jonathan S. Katz

On 1/2/24 7:23 AM, Sehrope Sarkuni wrote:
Having worked on and just about wrapped up the JDBC driver patch for 
this, couple thoughts:


2. Password encoding should be split out and made available as its own 
functions. Not just as part of a wider "create _or_ alter a user's 
password" function attached to a connection. We went a step further and 
added an intermediate function that generates the "ALTER USER ... 
PASSWORD" SQL.


I agree with this. It's been a minute, but I had done some refactoring 
on the backend-side to support the "don't need a connection" case for 
SCRAM secret generation functions on the server-side[1]. But I think in 
general we should split out the password generation functions, which 
leads to:


5. Our SCRAM specific function allows for customizing the algo iteration 
and salt parameters. That topic came up on hackers previously[1]. Our 
high level "alterUserPassword(...)" function does not have those options 
but it is available as part of our PasswordUtil SCRAM API for advanced 
users who want to leverage it. The higher level functions have defaults 
for iteration counts (4096) and salt size (16-bytes).


This seems like a good approach -- the regular function just has the 
defaults (which can be aligned to the PostgreSQL defaults) (or inherit 
from the server configuration, which then requires the connection to be 
present) and then have a more advanced API available.


Thanks,

Jonathan

[1] 
https://www.postgresql.org/message-id/3a9b7126-01a0-7e1a-1b2a-a76df6176725%40postgresql.org


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: Password leakage avoidance

2023-12-31 Thread Jonathan S. Katz

On 12/31/23 9:50 AM, Magnus Hagander wrote:

On Wed, Dec 27, 2023 at 10:31 PM Jonathan S. Katz  wrote:


On 12/24/23 12:15 PM, Tom Lane wrote:


Maybe we need a PQcreaterole that provide the mechanism to set passwords
safely. It'd likely need to take all the options need for creating a
role, but that would at least give the underlying mechanism to ensure
we're always sending a hashed password to the server.


I'm kind of down on that, because it seems like it'd be quite hard to
design an easy-to-use C API that doesn't break the next time somebody
adds another option to CREATE USER.  What's so wrong with suggesting
that the password be set in a separate step?  (For comparison, typical
Unix utilities like useradd(8) also tell you to set the password
separately.)


Modern development frameworks tend to reduce things down to one-step,
even fairly complex operations. Additionally, a lot of these frameworks
don't even require a developer to build backend applications that
involve doing actually work on the backend (e.g. UNIX), so the approach
of useradd(8) et al. are not familiar. Adding the additional step would
lead to errors, e.g. the developer not calling the "change password"
function to create the obfuscated password. Granted, we can push the
problem down to driver authors to "be better" and simplify the process
for their end users, but that still can be error prone, having seen this
with driver authors implementing PostgreSQL SCRAM and having made
(correctable) mistakes that could have lead to security issues.


This seems to confuse "driver" with "framework".

I would say the "two step" approach is perfectly valid for a driver
whereas as you say most people building say webapps or similar on top
of a framework will expect it to handle things for them. But that's
more of a framework thing than a driver thing, depending on
terminology. E.g. it would be up to the "Postgres support driver" in
django/rails/whatnot to reduce it down to one step, not to a low level
driver like libpq (or other low level drivers).

None of those frameworks are likely to want to require direct driver
access anyway, they *want* to take control of that process in my
experience.


Fair point on the framework/driver comparison, but the above still 
applies to drivers. As mentioned above, non-libpq drivers did have 
mistakes that could have lead to security issues while implementing 
PostgreSQL SCRAM. Additionally, CVE-2021-23222[1] presented itself in 
both libpq/non-libpq drivers, either through the issue itself, or 
through implementing the protocol step in a way similar to libpq.


Keeping the implementation surface area simpler for driver maintainers 
does generally help mitigate further issues, though I'd defer to the 
driver maintainers if they agree with that statement.


Thanks,

Jonathan

[1] https://www.postgresql.org/support/security/CVE-2021-23222/



OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: Password leakage avoidance

2023-12-27 Thread Jonathan S. Katz

On 12/24/23 12:15 PM, Tom Lane wrote:


Maybe we need a PQcreaterole that provide the mechanism to set passwords
safely. It'd likely need to take all the options need for creating a
role, but that would at least give the underlying mechanism to ensure
we're always sending a hashed password to the server.


I'm kind of down on that, because it seems like it'd be quite hard to
design an easy-to-use C API that doesn't break the next time somebody
adds another option to CREATE USER.  What's so wrong with suggesting
that the password be set in a separate step?  (For comparison, typical
Unix utilities like useradd(8) also tell you to set the password
separately.)


Modern development frameworks tend to reduce things down to one-step, 
even fairly complex operations. Additionally, a lot of these frameworks 
don't even require a developer to build backend applications that 
involve doing actually work on the backend (e.g. UNIX), so the approach 
of useradd(8) et al. are not familiar. Adding the additional step would 
lead to errors, e.g. the developer not calling the "change password" 
function to create the obfuscated password. Granted, we can push the 
problem down to driver authors to "be better" and simplify the process 
for their end users, but that still can be error prone, having seen this 
with driver authors implementing PostgreSQL SCRAM and having made 
(correctable) mistakes that could have lead to security issues.


That said, I see why trying to keep track of all of the "CREATE ROLE" 
attributes from a C API can be cumbersome, so perhaps we could end up 
adding an API that just does "create-user-with-password" and applies a 
similar method to Joe's patch. That would also align with the developer 
experience above, as in those cases users tend to just be created with a 
password w/o any of the additional role options.


Also open to punting this to a different thread as we can at least make 
things better with the "change password" approach.


Thanks,

Jonathan


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: Password leakage avoidance

2023-12-24 Thread Jonathan S. Katz

On 12/24/23 10:14 AM, Joe Conway wrote:

On 12/23/23 11:00, Tom Lane wrote:

Joe Conway  writes:
The attached patch set moves the guts of \password from psql into the 
libpq client side -- PQchangePassword() (patch 0001).


Haven't really read the patch, just looked at the docs, but here's
a bit of bikeshedding:


Thanks!


Prior to bikeshedding -- thanks for putting this together. This should 
generally helpful, as it allows libpq-based drivers to adopt this method 
and provide a safer mechanism for setting/changing passwords! (which we 
should also promote once availbale).



* This seems way too eager to promote the use of md5.  Surely the
default ought to be SCRAM, full stop.  I question whether we even
need an algorithm parameter.  Perhaps it's a good idea for
future-proofing, but we could also plan that the function would
make its own decisions based on noting the server's version.
(libpq is far more likely to be au courant about what to do than
the calling application, IMO.)


We're likely to have new algorithms in the future, as there is a draft 
RFC for updating the SCRAM hashes, and already some regulatory bodies 
are looking to deprecate SHA256. My concern with relying on the 
"encrypted_password" GUC (which is why PQencryptPasswordConn takes 
"conn") makes it any easier for users to choose the algorithm, or if 
they need to rely on the server/session setting.


I guess in its current state, it does, and drivers could mask some of 
the complexity.


One thing I have not done but, considered, is adding an additional 
optional parameter to allow "VALID UNTIL" to be set. Seems like it 
would be useful to be able to set an expiration when setting a new 
password.


No strong opinion about that.


Thanks -- hopefully others will weigh in on that.


I think this is reasonable to add.

I think this is a good start and adds something that's better than what 
we have today. However, it seems like we also need something for "CREATE 
ROLE", otherwise we're either asking users to set passwords in two 
steps, or allowing for the unencrypted password to leak to the logs via 
CREATE ROLE.


Maybe we need a PQcreaterole that provide the mechanism to set passwords 
safely. It'd likely need to take all the options need for creating a 
role, but that would at least give the underlying mechanism to ensure 
we're always sending a hashed password to the server.


Jonathan


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: 2023-11-09 release announcement draft

2023-11-07 Thread Jonathan S. Katz

On 11/6/23 9:52 PM, Noah Misch wrote:

On Mon, Nov 06, 2023 at 05:04:25PM -0500, Jonathan S. Katz wrote:

The PostgreSQL Global Development Group has released an update to all supported
versions of PostgreSQL, including 16.1, 15.5, 14.10, 13.13, 12.17, and 11.22
This release fixes over 55 bugs reported over the last several months.

This release includes fixes for indexes where in certain cases, we advise
reindexing. Please see the "Update" section for more details.


s/"Update" section/"Updating" section/ or change section title below.


Fixed.


Delete lines starting here ...


This is the **final release of PostgreSQL 11**. PostgreSQL 10 will no longer
receive
[security and bug fixes](https://www.postgresql.org/support/versioning/).
If you are running PostgreSQL 10 in a production environment, we suggest that
you make plans to upgrade.


... to here.  They're redundant with "PostgreSQL 11 EOL Notice" below:


Initially, I strongly disagreed with this recommendation, as I've seen 
enough people say that they were unaware that a community version is 
EOL. We can't say this enough.


However, I did decide to clip it out because the notice is just below.

That said, perhaps we should put out a separate announcement that states 
PostgreSQL 11 is EOL. We may want to consider doing standalone EOL 
announcement -- perhaps 6 months out, and then day of, to make it 
abundantly clear that a version is deprecating.


Finally, I included Matthias' downthread recommendation in this version.

Thanks,

Jonathan
The PostgreSQL Global Development Group has released an update to all supported
versions of PostgreSQL, including 16.1, 15.5, 14.10, 13.13, 12.17, and 11.22
This release fixes over 55 bugs reported over the last several months.

This release includes fixes for indexes where in certain cases, we advise
reindexing. Please see the "Updating" section for more details.

For the full list of changes, please review the
[release notes](https://www.postgresql.org/docs/release/).

PostgreSQL 11 EOL Notice


**This is the final release of PostgreSQL 11**. PostgreSQL 11 is now end-of-life
and will no longer receive security and bug fixes. If you are
running PostgreSQL 11 in a production environment, we suggest that you make
plans to upgrade to a newer, supported version of PostgreSQL. Please see our
[versioning policy](https://www.postgresql.org/support/versioning/) for more
information.

Bug Fixes and Improvements
--
 
This update fixes over 55 bugs that were reported in the last several months.
The issues listed below affect PostgreSQL 16. Some of these issues may also
affect other supported versions of PostgreSQL.

* Fix issue where GiST indexes had an incorrect behavior during a "page split"
operation that could lead to incorrect results in subsequent index searches.
Please [reindex](https://www.postgresql.org/docs/current/sql-reindex.html) GiST
indexes after installing this update.
* Fix issue where B-tree indexes would incorrectly de-duplicate `interval`
columns. Please 
[reindex](https://www.postgresql.org/docs/current/sql-reindex.html)
any B-tree index that includes an `interval` column after installing this
update.
* Provide more efficient indexing of `date`, `timestamptz`, and `timestamp`
values in BRIN indexes when using a [`minmax_multi` 
opsclass](https://www.postgresql.org/docs/current/brin-builtin-opclasses.html).
While not required, we recommend
[reindexing](https://www.postgresql.org/docs/current/sql-reindex.html) BRIN
indexes that include these data types after installing this update.
* Fix for bulk table insertion into partitioned tables.
* Fix for hash-partitioned tables with multiple partition keys during step
generation and runtime pruning that could lead to crashes in some cases.
* Throw the correct error if 
[`pgrowlocks()`](https://www.postgresql.org/docs/current/pgrowlocks.html) is 
applied to a partitioned table

* Fix inconsistent rechecking of concurrently-updated rows during
[`MERGE`](https://www.postgresql.org/docs/current/sql-merge.html) when using
[`READ 
COMMITTED`](https://www.postgresql.org/docs/current/transaction-iso.html#XACT-READ-COMMITTED)
mode.
* Correctly identify the target table in an inherited `UPDATE`/`DELETE`/`MERGE`
even when the parent table is excluded by constraints.
* Fix over-allocation of a constructed 
[`tsvector`](https://www.postgresql.org/docs/current/datatype-textsearch.html#DATATYPE-TSVECTOR).
* Fix [`ALTER 
SUBSCRIPTION`](https://www.postgresql.org/docs/current/sql-altersubscription.html)
to apply changes in the `run_as_owner` option.
* Several fixes for [`COPY 
FROM`](https://www.postgresql.org/docs/current/sql-copy.html),
* Several fixes for handling torn reads with 
[`pg_control`](https://www.postgresql.org/docs/current/wal-internals.html).
* Fix "could not find pathkey item to sort" errors occurring while planning
aggregate functions wi

2023-11-09 release announcement draft

2023-11-06 Thread Jonathan S. Katz

Hi,

Attached is the release announcement draft for the 2023-11-09 release 
(16.1 et al.).


Please review for accuracy and notable omissions. Please have all 
feedback in by 2023-11-09 08:00 UTC at the latest (albeit the sooner the 
better).


Thanks,

Jonathan
The PostgreSQL Global Development Group has released an update to all supported
versions of PostgreSQL, including 16.1, 15.5, 14.10, 13.13, 12.17, and 11.22
This release fixes over 55 bugs reported over the last several months.

This release includes fixes for indexes where in certain cases, we advise
reindexing. Please see the "Update" section for more details.

This is the **final release of PostgreSQL 11**. PostgreSQL 10 will no longer
receive
[security and bug fixes](https://www.postgresql.org/support/versioning/).
If you are running PostgreSQL 10 in a production environment, we suggest that
you make plans to upgrade.

For the full list of changes, please review the
[release notes](https://www.postgresql.org/docs/release/).

PostgreSQL 11 EOL Notice


**This is the final release of PostgreSQL 11**. PostgreSQL 11 is now end-of-life
and will no longer receive security and bug fixes. If you are
running PostgreSQL 11 in a production environment, we suggest that you make
plans to upgrade to a newer, supported version of PostgreSQL. Please see our
[versioning policy](https://www.postgresql.org/support/versioning/) for more
information.

Bug Fixes and Improvements
--
 
This update fixes over 55 bugs that were reported in the last several months.
The issues listed below affect PostgreSQL 16. Some of these issues may also
affect other supported versions of PostgreSQL.

* Fix issue where GiST indexes had an incorrect behavior during a "page split"
operation that could lead to incorrect results in subsequent index searches.
Please [reindex](https://www.postgresql.org/docs/current/sql-reindex.html) GiST
indexes after installing this update.
* Fix issue where B-tree indexes would incorrectly de-duplicate `interval`
columns. Please 
[reindex](https://www.postgresql.org/docs/current/sql-reindex.html)
any B-tree index that includes an `interval` column after installing this
update.
* Provide more efficient indexing of `date`, `timestamptz`, and `timestamp`
values in BRIN indexes. While not required, we recommend
[reindexing](https://www.postgresql.org/docs/current/sql-reindex.html) BRIN
indexes that include these data types after installing this update.
* Fix for bulk table insertion into partitioned tables.
* Fix for hash-partitioned tables with multiple partition keys during step
generation and runtime pruning that could lead to crashes in some cases.
* Throw the correct error if 
[`pgrowlocks()`](https://www.postgresql.org/docs/current/pgrowlocks.html) is 
applied to a partitioned table

* Fix inconsistent rechecking of concurrently-updated rows during
[`MERGE`](https://www.postgresql.org/docs/current/sql-merge.html) when using
[`READ 
COMMITTED`](https://www.postgresql.org/docs/current/transaction-iso.html#XACT-READ-COMMITTED)
mode.
* Correctly identify the target table in an inherited `UPDATE`/`DELETE`/`MERGE`
even when the parent table is excluded by constraints.
* Fix over-allocation of a constructed 
[`tsvector`](https://www.postgresql.org/docs/current/datatype-textsearch.html#DATATYPE-TSVECTOR).
* Fix [`ALTER 
SUBSCRIPTION`](https://www.postgresql.org/docs/current/sql-altersubscription.html)
to apply changes in the `run_as_owner` option.
* Several fixes for [`COPY 
FROM`](https://www.postgresql.org/docs/current/sql-copy.html),
* Several fixes for handling torn reads with 
[`pg_control`](https://www.postgresql.org/docs/current/wal-internals.html).
* Fix "could not find pathkey item to sort" errors occurring while planning
aggregate functions with `ORDER BY` or `DISTINCT` options.
* When 
[`track_io_timing`](https://www.postgresql.org/docs/current/runtime-config-statistics.html#GUC-TRACK-IO-TIMING)
is enabled, include the time taken by relation extension operations as write
time.
* Track the dependencies of cached 
[`CALL`](https://www.postgresql.org/docs/current/sql-call.html)
statements, and re-plan them when needed.
* Treat out-of-memory failures as `FATAL` while reading WAL.
* Fix [`pg_dump`](https://www.postgresql.org/docs/current/app-pgdump.html) to
dump the new `run_as_owner` option of subscriptions.
* Fix [`pg_restore`](https://www.postgresql.org/docs/current/app-pgrestore.html)
so that selective restores will include both table-level and column-level ACLs
for selected tables.
* Add logic to 
[`pg_upgrade`](https://www.postgresql.org/docs/current/pgupgrade.html)
to check for use of obsolete data types `abstime`, `reltime`, and `tinterval`.
* Fix [`vacuumdb`](https://www.postgresql.org/docs/current/app-vacuumdb.html)
to have multiple `-N` switches actually exclude tables in multiple schemas.
* [`amcheck`](https://www.postgresql.org/docs/current/amcheck.html)
will no longer report interrupted page 

PostgreSQL 16 RC1 release announcement draft

2023-08-28 Thread Jonathan S. Katz

Hi,

Attached is the PostgreSQL 16 RC1 release announcement draft.

Currently there is only one item in it, as there was only one open item 
marked as closed. If there are any other fixes for the RC1 that were 
specific to v16 and should be included in the announcement, please let 
me know.


Please provide all feedback no later than August 31, 2023 @ 12:00 UTC 
(and preferably before that).


Thanks,

Jonathan
The PostgreSQL Global Development Group announces that the first release
candidate of PostgreSQL 16 is now available for download. As a release
candidate, PostgreSQL 16 RC 1 will be mostly identical to the initial release of
PostgreSQL 16, though some more fixes may be applied prior to the general
availability of PostgreSQL 16.

The planned date for the general availability of PostgreSQL 16 is
September 14, 2023. Please see the "Release Schedule" section for more details.

Upgrading to PostgreSQL 16 RC 1
---

To upgrade to PostgreSQL 16 RC 1 from earlier versions of PostgreSQL, you will
need to use a major version upgrade strategy, e.g. `pg_upgrade` or
`pg_dump` / `pg_restore`. For more information, please visit the documentation
section on 
[upgrading](https://www.postgresql.org/docs/16/static/upgrading.html):

[https://www.postgresql.org/docs/16/static/upgrading.html](https://www.postgresql.org/docs/16/static/upgrading.html)

Changes Since 16 Beta 3
---

Several bug fixes were applied for PostgreSQL 16 during the Beta 3 period. These
include:

* Fixed performance regression when running `COPY` concurrently on a single
table.

For a detailed list of fixes, please visit the
[open 
items](https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items#resolved_before_16rc1)
page.

Release Schedule


This is the first release candidate for PostgreSQL 16. Unless an issue is
discovered that warrants a delay or to produce an additional release candidate,
PostgreSQL 16 should be made generally available on September 14, 2023.

For further information please see the
[Beta Testing](https://www.postgresql.org/developer/beta/) page.

Links
-

* [Download](https://www.postgresql.org/download/)
* [Beta Testing Information](https://www.postgresql.org/developer/beta/)
* [PostgreSQL 16 Beta Release 
Notes](https://www.postgresql.org/docs/16/release-16.html)
* [PostgreSQL 16 Open 
Issues](https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items)
* [Feature 
Matrix](https://www.postgresql.org/about/featurematrix/#configuration-management)
* [Submit a Bug](https://www.postgresql.org/account/submitbug/)
* [Follow @postgresql on Twitter](https://twitter.com/postgresql)
* [Donate](https://www.postgresql.org/about/donate/)


OpenPGP_signature
Description: OpenPGP digital signature


Re: PostgreSQL 16 release announcement draft

2023-08-25 Thread Jonathan S. Katz

On 8/24/23 11:19 AM, Alvaro Herrera wrote:

On 2023-Aug-24, Jonathan S. Katz wrote:


### Performance Improvements

PostgreSQL 16 improves the performance of existing PostgreSQL functionality
through new query planner optimizations. In this latest release, the query
planner can parallelize  `FULL` and `RIGHT` joins, generate better optimized
plans for queries that use aggregate functions (e.g. `count`) with a `DISTINCT`
or `ORDER BY` clause, utilize incremental sorts for `SELECT DISTINCT` queries,
and optimize window function executions so they execute more efficiently.


"optimize window function executions so that they execute blah" sounds
redundant and strange. Maybe just "optimize execution of window
functions" is sufficient?  Also, using "e.g." there looks somewhat out
of place; maybe "(such as `count`)" is a good replacement?


It also introduces `RIGHT` and `OUTER` "anti-joins", which enable users to
identify rows not present in a joined table.


Wait.  Are you saying we didn't have those already?  Looking at
release-16.sgml I think this refers to commit 16dc2703c541, which means
this made them more efficient rather than invented them.



This release includes improvements for bulk loading using `COPY` in both single
and concurrent operations, with tests showing up to a 300% performance
improvement in some cases. PostgreSQL adds support for load balancing in clients


PostgreSQL 16


that use `libpq`, and improvements to vacuum strategy that reduce the necessity
of full-table freezes. Additionally, PostgreSQL 16 introduces CPU acceleration
using `SIMD` in both x86 and ARM architectures, resulting in performance gains
when processing ASCII and JSON strings, and performing array and subtransaction
searches.

### Logical replication

Logical replication lets PostgreSQL users stream data to other PostgreSQL


"L.R. in PostgreSQL lets users"?


instances or subscribers that can interpret the PostgreSQL logical replication
protocol. In PostgreSQL 16, users can perform logical decoding from a standby


s/decoding/replication/ ? (It seems odd to use "decoding" when the
previous sentence used "replication")


instance, meaning a standby can publish logical changes to other servers. This
provides developers with new workload distribution options – for example, using
a standby rather than the busier primary to logically replicate changes to
downstream systems.

Additionally, there are several performance improvements in PostgreSQL 16 to
logical replication. Subscribers can now apply large transactions using parallel
workers. For tables that do not have a `PRIMARY KEY`, subscribers can use B-tree


"a primary key", no caps.


indexes instead of sequential scans to find rows. Under certain conditions,
users can also speed up initial table synchronization using the binary format.

There are several access control improvements to logical replication in
PostgreSQL 16, including the new predefined role pg_create_subscription, which
grants users the ability to create a new logical subscription. Finally, this
release begins adding support for bidirectional logical replication, introducing
functionality to replicate data between two tables from different publishers.


"to create a new logical subscription" -> "to create new logical subscriptions"


### Developer Experience

PostgreSQL 16 adds more syntax from the SQL/JSON standard, including
constructors and predicates such as `JSON_ARRAY()`, `JSON_ARRAYAGG()`, and
`IS JSON`. This release also introduces the ability to use underscores for
thousands separators (e.g. `5_432_000`) and non-decimal integer literals, such
as `0x1538`, `0o12470`, and `0b1010100111000`.

Developers using PostgreSQL 16 will also benefit from the addition of multiple
commands to `psql` client protocol, including the `\bind` command, which allows
users to execute parameterized queries (e.g `SELECT $1 + $2`) then use `\bind`
to substitute the variables.


This paragraph sounds a bit suspicious.  What do you mean with "multiple
commands to psql client protocol"?  Also, I think "to execute parameterized
queries" should be "to prepare parameterized queries", and later "then
use \bind to execute the query substituting the variables".




### Monitoring

A key aspect of tuning the performance of database workloads is understanding
the impact of your I/O operations on your system. PostgreSQL 16 helps simplify
how you can analyze this data with the new pg_stat_io view, which tracks key I/O
statistics such as shared_buffer hits and I/O latency.


Hmm, I think what pg_stat_io gives you is data which wasn't available
previously at all.  Maybe do something like "Pg 16 introduces
pg_stat_io, a new source of key I/O metrics that can be used for more
fine grained something something".


Additionally, this release adds a new field to the `pg_sta

Re: PostgreSQL 16 release announcement draft

2023-08-25 Thread Jonathan S. Katz

On 8/24/23 11:17 AM, Erik Rijkers wrote:

Op 8/24/23 om 16:32 schreef Jonathan S. Katz:

On 8/23/23 5:07 PM, David Rowley wrote:
On Thu, 24 Aug 2023 at 05:55, Jonathan S. Katz  
wrote:


Hi,

When v15 docs have:

"27.2.7. Cascading Replication
The cascading replication feature allows a standby server to accept 
replication connections and stream WAL records to other standbys, acting 
as a relay. This can be used to reduce the number of direct connections 
to the primary and also to minimize inter-site bandwidth overheads."


why then, in the release draft, is that capability mentioned as 
something that is new for v16?

"
In PostgreSQL 16, users can perform logical decoding from a standby
instance, meaning a standby can publish logical changes to other servers.
"

Is there a difference between the two?


Yes. Those docs refer to **physical** replication, where a standby can 
continue to replicate WAL records to other standbys. In v16, standbys 
can now publish changes over **logical** replication.


Thanks,

Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


Re: PostgreSQL 16 release announcement draft

2023-08-25 Thread Jonathan S. Katz

On 8/24/23 12:54 PM, Dave Cramer wrote:




 > Postgres, PostgreSQL, and the Elephant Logo (Slonik) are all
registered
 > trademarks of the [PostgreSQL Community Association of
Canada](https://www.postgres.ca ).

Isn't this just the "PostgreSQL Community Association", no Canada?


Certainly confusing from the website, but in the about section is this
"PostgreSQL Community Association is a trade or business name of the 
PostgreSQL Community Association of Canada."


This was something I missed when reviewing the fulltext, and went ahead 
and fixed it. Thanks,


Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


Re: PostgreSQL 16 RC1 + GA release dates

2023-08-24 Thread Jonathan S. Katz

On 8/16/23 3:48 PM, Jonathan S. Katz wrote:

The date for PostgreSQL 16 Release Candidate 1 (RC1) is August 31, 2023. 
Please ensure all open items[1] are completed and committed before 
August 26, 2023 12:00 UTC.


Reminder: the RC1 open item[1] deadline is at August 26, 2023 @ 12:00 UTC.

Thanks,

Jonathan

[1] https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items



OpenPGP_signature
Description: OpenPGP digital signature


Re: PostgreSQL 16 release announcement draft

2023-08-24 Thread Jonathan S. Katz

On 8/24/23 11:16 AM, jian he wrote:

hi. Can you check my first email about "a" versus "the" and "pg_stat_activity".


I did when you first sent it, and did not make any changes.


also:
"including the `\bind` command, which allows
users to execute parameterized queries (e.g `SELECT $1 + $2`) then use `\bind`
to substitute the variables."

The example actually is very hard to reproduce. (it's not that super intuitive).
fail case:
test16-# SELECT $1 + $2 \bind 1 2
test16-# ;

a better example would  be (e.g `SELECT $1 , $2`).
The semicolon still needed to be in the next line.


I agree with updating the example, I'd propose:

SELECT $1::int + $2::int \bind 1 2 \g

which mirrors what's in the docs[1]

Thanks,

Jonathan

[1] 
https://www.postgresql.org/docs/16/app-psql.html#APP-PSQL-META-COMMAND-BIND


OpenPGP_signature
Description: OpenPGP digital signature


Re: PostgreSQL 16 release announcement draft

2023-08-24 Thread Jonathan S. Katz

On 8/23/23 5:07 PM, David Rowley wrote:

On Thu, 24 Aug 2023 at 05:55, Jonathan S. Katz  wrote:

We could add something about 1349d2790 -- do you have suggested wording?


I think it's worth a mention. See the text added in square brackets below:

PostgreSQL 16 improves the performance of existing PostgreSQL functionality
through new query planner optimizations. In this latest release, the query
planner can parallelize  `FULL` and `RIGHT` joins, [generate more
optimal plans for
queries containing aggregate functions with a `DISTINCT` or `ORDER BY` clause,]
utilize incremental sorts for `SELECT DISTINCT` queries, and optimize
window function
executions so they execute more efficiently. It also introduces
`RIGHT` and `OUTER`
"anti-joins", which enable users to identify rows not present in a joined table.


I added this in mostly verbatim. I'm concerned the sentence is a bit 
long, but we could break it up into two: (1) with the new JOIN 
capabilities and (2) with the optimizations.


Jonathan
September 14, 2023 - The PostgreSQL Global Development Group today announced the
release of PostgreSQL 16, the latest version of the world's most advanced open
source database.

PostgreSQL 16 raises its performance, with notable improvements to query
parallelism, bulk data loading, and logical replication. There are many features
in this release for developers and administrators alike, including more SQL/JSON
syntax, new monitoring stats for your workloads, and greater flexibility in
defining access control rules for management of policies across large fleets.



PostgreSQL, an innovative data management system known for its reliability and
robustness, benefits from over 25 years of open source development from a global
developer community and has become the preferred open source relational database
for organizations of all sizes.

### Performance Improvements

PostgreSQL 16 improves the performance of existing PostgreSQL functionality
through new query planner optimizations. In this latest release, the query
planner can parallelize  `FULL` and `RIGHT` joins, generate better optimized
plans for queries that use aggregate functions (e.g. `count`) with a `DISTINCT`
or `ORDER BY` clause, utilize incremental sorts for `SELECT DISTINCT` queries,
and optimize window function executions so they execute more efficiently.
It also introduces `RIGHT` and `OUTER` "anti-joins", which enable users to
identify rows not present in a joined table.

This release includes improvements for bulk loading using `COPY` in both single
and concurrent operations, with tests showing up to a 300% performance
improvement in some cases. PostgreSQL adds support for load balancing in clients
that use `libpq`, and improvements to vacuum strategy that reduce the necessity
of full-table freezes. Additionally, PostgreSQL 16 introduces CPU acceleration
using `SIMD` in both x86 and ARM architectures, resulting in performance gains
when processing ASCII and JSON strings, and performing array and subtransaction
searches.

### Logical replication 

Logical replication lets PostgreSQL users stream data to other PostgreSQL
instances or subscribers that can interpret the PostgreSQL logical replication
protocol. In PostgreSQL 16, users can perform logical decoding from a standby
instance, meaning a standby can publish logical changes to other servers. This
provides developers with new workload distribution options – for example, using
a standby rather than the busier primary to logically replicate changes to
downstream systems.

Additionally, there are several performance improvements in PostgreSQL 16 to
logical replication. Subscribers can now apply large transactions using parallel
workers. For tables that do not have a `PRIMARY KEY`, subscribers can use B-tree
indexes instead of sequential scans to find rows. Under certain conditions,
users can also speed up initial table synchronization using the binary format.

There are several access control improvements to logical replication in
PostgreSQL 16, including the new predefined role pg_create_subscription, which
grants users the ability to create a new logical subscription. Finally, this
release begins adding support for bidirectional logical replication, introducing
functionality to replicate data between two tables from different publishers.

### Developer Experience

PostgreSQL 16 adds more syntax from the SQL/JSON standard, including
constructors and predicates such as `JSON_ARRAY()`, `JSON_ARRAYAGG()`, and
`IS JSON`. This release also introduces the ability to use underscores for
thousands separators (e.g. `5_432_000`) and non-decimal integer literals, such
as `0x1538`, `0o12470`, and `0b1010100111000`.

Developers using PostgreSQL 16 will also benefit from the addition of multiple
commands to `psql` client protocol, including the `\bind` command, which allows
users to execute parameterized queries (e.g `SELECT $1 + $2`) then use `\bind`
to substitute the variables. 

PostgreSQL 16 impr

Re: PostgreSQL 16 release announcement draft

2023-08-23 Thread Jonathan S. Katz

On 8/23/23 8:02 AM, David Rowley wrote:

On Wed, 23 Aug 2023 at 22:21, jian he  wrote:





PostgreSQL 16 improves the performance of existing PostgreSQL functionality
through new query planner optimizations. In this latest release, the query
planner can parallelize  `FULL` and `RIGHT` joins, utilize incremental sorts for
`SELECT DISTINCT` queries, and execute window functions more efficiently. It
also introduces `RIGHT` and `OUTER` "anti-joins", which enable users to identify
rows not present in a joined table.




I think "utilize incremental sorts is for" something like select
my_avg(distinct one),my_sum(one) from (values(1),(3)) t(one);
so it's not the same as `SELECT DISTINCT` queries?
ref: 
https://git.postgresql.org/cgit/postgresql.git/commit/?id=1349d2790bf48a4de072931c722f39337e72055e


The incremental sorts for DISTINCT will likely be a reference to
3c6fc5820, so, not the same thing as 1349d2790.  I don't see anything
there relating to 1349d2790.


We could add something about 1349d2790 -- do you have suggested wording?


also
 "the query planner ., and execute window functions more efficiently."
since the query planner doesn't execute anything. probably "and
optimize window functions execution"?


Yeah, that or "and optimize window functions so they execute more
efficiently" is likely an improvement there.


Modified. See updated announcement, with other incorporated changes.

Reminder that the window to submit changes closes at **August 26, 12:00 
UTC**.


Thanks,

Jonathan

September 14, 2023 - The PostgreSQL Global Development Group today announced the
release of PostgreSQL 16, the latest version of the world's most advanced open
source database.

PostgreSQL 16 raises its performance, with notable improvements to query
parallelism, bulk data loading, and logical replication. There are many features
in this release for developers and administrators alike, including more SQL/JSON
syntax, new monitoring stats for your workloads, and greater flexibility in
defining access control rules for management of policies across large fleets.



PostgreSQL, an innovative data management system known for its reliability and
robustness, benefits from over 25 years of open source development from a global
developer community and has become the preferred open source relational database
for organizations of all sizes.

### Performance Improvements

PostgreSQL 16 improves the performance of existing PostgreSQL functionality
through new query planner optimizations. In this latest release, the query
planner can parallelize  `FULL` and `RIGHT` joins, utilize incremental sorts for
`SELECT DISTINCT` queries, and optimize window function executions so they
execute more efficiently. It also introduces `RIGHT` and `OUTER` "anti-joins",
which enable users to identify rows not present in a joined table.

This release includes improvements for bulk loading using `COPY` in both single
and concurrent operations, with tests showing up to a 300% performance
improvement in some cases. PostgreSQL adds support for load balancing in clients
that use `libpq`, and improvements to vacuum strategy that reduce the necessity
of full-table freezes. Additionally, PostgreSQL 16 introduces CPU acceleration
using `SIMD` in both x86 and ARM architectures, resulting in performance gains
when processing ASCII and JSON strings, and performing array and subtransaction
searches.

### Logical replication 

Logical replication lets PostgreSQL users stream data to other PostgreSQL
instances or subscribers that can interpret the PostgreSQL logical replication
protocol. In PostgreSQL 16, users can perform logical decoding from a standby
instance, meaning a standby can publish logical changes to other servers. This
provides developers with new workload distribution options – for example, using
a standby rather than the busier primary to logically replicate changes to
downstream systems.

Additionally, there are several performance improvements in PostgreSQL 16 to
logical replication. Subscribers can now apply large transactions using parallel
workers. For tables that do not have a `PRIMARY KEY`, subscribers can use B-tree
indexes instead of sequential scans to find rows. Under certain conditions,
users can also speed up initial table synchronization using the binary format.

There are several access control improvements to logical replication in
PostgreSQL 16, including the new predefined role pg_create_subscription, which
grants users the ability to create a new logical subscription. Finally, this
release begins adding support for bidirectional logical replication, introducing
functionality to replicate data between two tables from different publishers.

### Developer Experience

PostgreSQL 16 adds more syntax from the SQL/JSON standard, including
constructors and predicates such as `JSON_ARRAY()`, `JSON_ARRAYAGG()`, and
`IS JSON`. This release also introduces the ability to use underscores for
thousands separators (e.g. `5_432_000`) and 

Re: PostgreSQL 16 RC1 + GA release dates

2023-08-20 Thread Jonathan S. Katz

On 8/20/23 10:50 AM, Tom Lane wrote:

"Jonathan S. Katz"  writes:

The date for PostgreSQL 16 Release Candidate 1 (RC1) is August 31, 2023.
Please ensure all open items[1] are completed and committed before
August 26, 2023 12:00 UTC.


FYI, I moved the "Oversight in reparameterize_path_by_child leading to
executor crash" open item to the "Older bugs affecting stable branches"
section, because it is in fact an old bug: the given test case crashes
in every branch that has enable_partitionwise_join.  I'll still look
at getting in the fix before RC1, but we should understand what we're
dealing with.


[RMT hat]

Thanks -- appreciative of the accurate record keeping.

Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


PostgreSQL 16 release announcement draft

2023-08-19 Thread Jonathan S. Katz

Hi,

Attached is the first draft of the PostgreSQL 16 release announcement, 
authored by Chelsea Dole & myself.


To frame this up, the goal of the GA release announcement is to help 
folks discover the awesome new features of PostgreSQL. It's impossible 
to list out every single feature in the release and still have a 
coherent announcement, so we try to target features that have the 
broadest range of impact.


It's possible we missed or incorrectly stated something, so please 
provide feedback if we did so.


(Note I have not added in all of the links etc. to the Markdown yet, as 
I want to wait for the first pass of feedback to come through).


**Please provide feedback by August 26, 12:00 UTC**. After that point, 
we need to freeze all changes so we can begin the release announcement 
translation effort.


Thanks,

Jonathan
September 14, 2023 - The PostgreSQL Global Development Group today announced the
release of PostgreSQL 16, the latest version of the world’s most advanced open
source database.

PostgreSQL 16 raises its performance, with notable improvements to query
parallelism, bulk data loading, and logical replication. There are many features
in this release for developers and administrators alike, including more SQL/JSON
syntax, new monitoring stats for your workloads, and greater flexibility in
defining access control rules for large scale workloads.



PostgreSQL, an innovative data management system known for its reliability and
robustness, benefits from over 25 years of open source development from a global
developer community and has become the preferred open source relational database
for organizations of all sizes.

### Performance Improvements

PostgreSQL 16 improves the performance of existing PostgreSQL functionality
through new query planner optimizations. In this latest release, the query
planner can parallelize  `FULL` and `RIGHT` joins, utilize incremental sorts for
`SELECT DISTINCT` queries, and execute window functions more efficiently. It
also introduces `RIGHT` and `OUTER` "anti-joins", which enable users to identify
rows not present in a joined table.

This release includes improvements for bulk loading using `COPY` in both single
and concurrent operations, with tests showing up to a 300% performance
improvement in some cases. PostgreSQL adds support for load balancing in clients
that use `libpq`, and improvements to vacuum strategy that reduce the necessity
of full-table freezes. Additionally, PostgreSQL 16 introduces CPU acceleration
using `SIMD` in both x86 and ARM architectures, resulting in performance gains
when processing ASCII and JSON strings, and performing array and subtransaction
searches.

### Logical replication 

Logical replication lets PostgreSQL users stream data to other PostgreSQL
instances or subscribers that can interpret the PostgreSQL logical replication
protocol. In PostgreSQL 16, users can perform logical decoding from a standby
instance, meaning a standby can publish logical changes to other servers. This
provides developers with new workload distribution options – for example, using
a standby rather than the busier primary to logically replicate changes to
downstream systems.

Additionally, there are several performance improvements in PostgreSQL 16 to
logical replication. Subscribers can now apply large transactions using parallel
workers. For tables that do not have a `PRIMARY KEY`, subscribers can use B-tree
indexes instead of sequential scans to find rows. Under certain conditions,
users can also speed up initial table synchronization using the binary format.

There are several access control improvements to logical replication in
PostgreSQL 16, including the new predefined role pg_create_subscription, which
grants users the ability to create a new logical subscription. Finally, this
release begins adding support for bidirectional logical replication, introducing
functionality to replicate data between two tables from different publishers.

### Developer Experience

PostgreSQL 16 adds more syntax from the SQL/JSON standard, including
constructors and predicates such as `JSON_ARRAY()`, `JSON_ARRAYAGG()`, and
`IS JSON`. This release also introduces the ability to use underscores for
thousands separators (e.g. `5_432_000`) and non-decimal integer literals, such
as `0x1538`, `0o12470`, and `0b1010100111000`.

Developers using PostgreSQL 16 will also benefit from the addition of multiple
commands to `psql` client protocol, including the `\bind` command, which allows
users to execute parameterized queries (e.g `SELECT $1 + $2`) then use `\bind`
to substitute the variables. 

PostgreSQL 16 improves general support for text collations, which provide rules
for how text is sorted. PostgreSQL 16 builds with ICU support by default,
determines the default ICU locale from the environment, and allows users to
define custom ICU collation rules.

### Monitoring

A key aspect of tuning the performance of database workloads is understanding
the impact of your 

PostgreSQL 16 RC1 + GA release dates

2023-08-16 Thread Jonathan S. Katz

Hi,

The date for PostgreSQL 16 Release Candidate 1 (RC1) is August 31, 2023. 
Please ensure all open items[1] are completed and committed before 
August 26, 2023 12:00 UTC.


This means the current target date for the PostgreSQL 16 GA release is 
September 14, 2023. While this date could change if the release team 
decides the candidate release is not ready, please plan for this date to 
be the GA release.


Thanks,

Jonathan

[1] https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items


OpenPGP_signature
Description: OpenPGP digital signature


Re: 2023-08-10 release announcement draft

2023-08-09 Thread Jonathan S. Katz

On 8/8/23 11:13 PM, Robert Treat wrote:


"Users who have skipped one or more update releases may need to run
additional, post-update steps; "

The comma should be removed.

"please see the release notes for earlier versions for details."

Use of 'for' twice is grammatically incorrect; I am partial to "please
see the release notes from earlier versions for details." but could
also see "please see the release notes for earlier versions to get
details."


Interestingly, I think this language has been unmodified for awhile. 
Upon reading it, I agree, and took your suggestions.


Thanks,

Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


Re: 2023-08-10 release announcement draft

2023-08-09 Thread Jonathan S. Katz

On 8/9/23 1:04 AM, Noah Misch wrote:

On Mon, Aug 07, 2023 at 10:03:44PM -0400, Jonathan S. Katz wrote:

Fixes in PostgreSQL 16 Beta 3
-

The following includes fixes included in PostgreSQL 16 Beta 3:


With both "includes" and "included" there, this line reads awkwardly to me.
I'd just delete the line, since the heading has the same information.


I took your suggestion. Thanks!

Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


Re: 2023-08-10 release announcement draft

2023-08-08 Thread Jonathan S. Katz

On 8/8/23 1:30 AM, Erik Rijkers wrote:

Op 8/8/23 om 03:15 schreef Jonathan S. Katz:


Please provide your feedback no later than August 10, 2023 0:00 AoE[1].


'You us'  should be
'You use'
    (2x)


It should actually be just "Use" -- but I've fixed both instances. Thanks!

Jonathan




OpenPGP_signature
Description: OpenPGP digital signature


Re: 2023-08-10 release announcement draft

2023-08-07 Thread Jonathan S. Katz

On 8/7/23 9:53 PM, David Rowley wrote:

On Tue, 8 Aug 2023 at 13:49, Jonathan S. Katz  wrote:


On 8/7/23 9:45 PM, David Rowley wrote:


* Fix a performance regression when running concurrent
[`COPY`](https://www.postgresql.org/docs/16/sql-copy.html) statements on a
single table.


I think this is still outstanding. A bit of work has been done for the
int parsing regression but it seems there's still a performance
regression when running multiple COPYs on the same table, per [1].


Hm, the open item was closed[1] -- was that premature, or is this a new
issue (have not yet read the thread you referenced)?


I closed it thinking that enough had been done to resolve the
performance regression. In the linked thread, Sawadasan shows that
that's not the case. So, yes, premature.  I've reverted the change to
the open items list now.


Got it. I reverted it as well from the release announcement. Reattaching 
with the clean copy.


(Aside: I'm super excited for this PG16 improvement + fixed regression, 
as lately I've had to do some bulk imports on a single table that could 
really benefit from this :)


Jonathan

The PostgreSQL Global Development Group has released an update to all supported
versions of PostgreSQL, including 15.4, 14.9, 13.12, 12.16, and 11.21, as well
as the third beta release of PostgreSQL 16. This release fixes over 40 bugs
reported over the last several months.

If you use [BRIN]((https://www.postgresql.org/docs/current/brin-intro.html))
indexes to  look up `NULL` values, you will need to 
[reindex](https://www.postgresql.org/docs/current/sql-reindex.html)
them after upgrading to this release. You us
[`REINDEX 
CONCURRENTLY`](https://www.postgresql.org/docs/current/sql-reindex.html)
to avoid blocking writes to the affected index and table, for example:

```
REINDEX INDEX CONCURRENTLY your_index_name;
```

For the full list of changes, please review the
[release notes](https://www.postgresql.org/docs/release/).

PostgreSQL 11 EOL Notice


PostgreSQL 11 will stop receiving fixes on November 9, 2023. If you are
running PostgreSQL 11 in a production environment, we suggest that you make
plans to upgrade to a newer, supported version of PostgreSQL. Please see our
[versioning policy](https://www.postgresql.org/support/versioning/) for more
information.

A Note on the PostgreSQL 16 Beta


This release marks the third beta release of PostgreSQL 16 and puts the
community one step closer to general availability tentatively around the end of
the third quarter.

In the spirit of the open source PostgreSQL community, we strongly encourage you
to test the new features of PostgreSQL 16 on your systems to help us eliminate
bugs or other issues that may exist. While we do not advise you to run
PostgreSQL 16 Beta 3 in production environments, we encourage you to find ways
to run your typical application workloads against this beta release.

Your testing and feedback will help the community ensure that PostgreSQL 16
upholds our standards of delivering a stable, reliable release of the world's
most advanced open source relational database. Please read more about our
[beta testing process](https://www.postgresql.org/developer/beta/) and how you
can contribute:

  
[https://www.postgresql.org/developer/beta/](https://www.postgresql.org/developer/beta/)

You can find information about all of the PostgreSQL 16 features and changes in
the [release notes](https://www.postgresql.org/docs/16/release-16.html):

  
[https://www.postgresql.org/docs/16/release-16.html](https://www.postgresql.org/docs/16/release-16.html)

Bug Fixes and Improvements
--
 
This update fixes over 40 bugs that were reported in the last several months.
The issues listed below affect PostgreSQL 15. Some of these issues may also
affect other supported versions of PostgreSQL.

* Fix for handling of `NULL` values in 
[BRIN](https://www.postgresql.org/docs/current/brin-intro.html)
indexes. This fix does not apply to existing BRIN indexes -- you will need to
run [`REINDEX`](https://www.postgresql.org/docs/current/sql-reindex.html) to fix
any BRIN indexes used to search for NULL values.
* Avoid leaving a corrupted database behind when DROP DATABASE is interrupted.
* Several fixes for partitioned indexes.
* Fix [`ALTER EXTENSION ... SET 
SCHEMA`](https://www.postgresql.org/docs/current/sql-alterextension.html)
to error if the extension contains any objects outside the extension's schema.
* Fix dependency tracking of access methods for tables.
* Don't use partial unique indexes for uniqueness proofs in the planner.
* Correctly handle sub-SELECTs in RLS policy expressions and security-barrier
views when expanding rule actions.
* Fix race conditions in conflict detection for `SERIALIZABLE` transaction 
isolation mode.
* Fix intermittent failures when trying to update a field of a composite column
that requires [out-of-line 
TOASTing](https://www.postgresql.org/docs/current

Re: 2023-08-10 release announcement draft

2023-08-07 Thread Jonathan S. Katz

On 8/7/23 9:45 PM, David Rowley wrote:


* Fix a performance regression when running concurrent
[`COPY`](https://www.postgresql.org/docs/16/sql-copy.html) statements on a
single table.


I think this is still outstanding. A bit of work has been done for the
int parsing regression but it seems there's still a performance
regression when running multiple COPYs on the same table, per [1].


Hm, the open item was closed[1] -- was that premature, or is this a new 
issue (have not yet read the thread you referenced)?



or an previous major version of PostgreSQL, you will need to use a strategy


"a previous".


Thanks for the catch -- fixed locally.

Jonathan

[1] 
https://wiki.postgresql.org/index.php?title=PostgreSQL_16_Open_Items#resolved_before_16beta3


OpenPGP_signature
Description: OpenPGP digital signature


2023-08-10 release announcement draft

2023-08-07 Thread Jonathan S. Katz

Hi,

Attached is the release announcement draft for the 2023-08-10 update 
release, which also includes the release of PostgreSQL 16 Beta 3.


Please provide your feedback no later than August 10, 2023 0:00 AoE[1].

Thanks,

Jonathan

[1] https://en.wikipedia.org/wiki/Anywhere_on_Earth
The PostgreSQL Global Development Group has released an update to all supported
versions of PostgreSQL, including 15.4, 14.9, 13.12, 12.16, and 11.21, as well
as the third beta release of PostgreSQL 16. This release fixes over 40 bugs
reported over the last several months.

If you use [BRIN]((https://www.postgresql.org/docs/current/brin-intro.html))
indexes to  look up `NULL` values, you will need to 
[reindex](https://www.postgresql.org/docs/current/sql-reindex.html)
them after upgrading to this release. You us
[`REINDEX 
CONCURRENTLY`](https://www.postgresql.org/docs/current/sql-reindex.html)
to avoid blocking writes to the affected index and table, for example:

```
REINDEX INDEX CONCURRENTLY your_index_name;
```

For the full list of changes, please review the
[release notes](https://www.postgresql.org/docs/release/).

PostgreSQL 11 EOL Notice


PostgreSQL 11 will stop receiving fixes on November 9, 2023. If you are
running PostgreSQL 11 in a production environment, we suggest that you make
plans to upgrade to a newer, supported version of PostgreSQL. Please see our
[versioning policy](https://www.postgresql.org/support/versioning/) for more
information.

A Note on the PostgreSQL 16 Beta


This release marks the third beta release of PostgreSQL 16 and puts the
community one step closer to general availability tentatively around the end of
the third quarter.

In the spirit of the open source PostgreSQL community, we strongly encourage you
to test the new features of PostgreSQL 16 on your systems to help us eliminate
bugs or other issues that may exist. While we do not advise you to run
PostgreSQL 16 Beta 3 in production environments, we encourage you to find ways
to run your typical application workloads against this beta release.

Your testing and feedback will help the community ensure that PostgreSQL 16
upholds our standards of delivering a stable, reliable release of the world's
most advanced open source relational database. Please read more about our
[beta testing process](https://www.postgresql.org/developer/beta/) and how you
can contribute:

  
[https://www.postgresql.org/developer/beta/](https://www.postgresql.org/developer/beta/)

You can find information about all of the PostgreSQL 16 features and changes in
the [release notes](https://www.postgresql.org/docs/16/release-16.html):

  
[https://www.postgresql.org/docs/16/release-16.html](https://www.postgresql.org/docs/16/release-16.html)

Bug Fixes and Improvements
--
 
This update fixes over 40 bugs that were reported in the last several months.
The issues listed below affect PostgreSQL 15. Some of these issues may also
affect other supported versions of PostgreSQL.

* Fix for handling of `NULL` values in 
[BRIN](https://www.postgresql.org/docs/current/brin-intro.html)
indexes. This fix does not apply to existing BRIN indexes -- you will need to
run [`REINDEX`](https://www.postgresql.org/docs/current/sql-reindex.html) to fix
any BRIN indexes used to search for NULL values.
* Avoid leaving a corrupted database behind when DROP DATABASE is interrupted.
* Several fixes for partitioned indexes.
* Fix [`ALTER EXTENSION ... SET 
SCHEMA`](https://www.postgresql.org/docs/current/sql-alterextension.html)
to error if the extension contains any objects outside the extension's schema.
* Fix dependency tracking of access methods for tables.
* Don't use partial unique indexes for uniqueness proofs in the planner.
* Correctly handle sub-SELECTs in RLS policy expressions and security-barrier
views when expanding rule actions.
* Fix race conditions in conflict detection for `SERIALIZABLE` transaction 
isolation mode.
* Fix intermittent failures when trying to update a field of a composite column
that requires [out-of-line 
TOASTing](https://www.postgresql.org/docs/current/storage-toast.html).
* Fix several memory leaks that occurred during the lifespan of a query.
* Accept fractional seconds in the input to the [jsonpath 
`datetime()`](https://www.postgresql.org/docs/current/functions-json.html#FUNCTIONS-SQLJSON-PATH-OPERATORS)
method.
* Increase token limit in `pg_hba.conf` and `pg_ident.conf` to 10,240 bytes.
* An out-of-memory error from JIT will now cause a PostgreSQL `FATAL` error 
instead of a C++ exception.
* Allow `VACUUM` to continue after detecting certain types of B-tree index 
corruption. While this fix allows VACUUM to continue, you still need to 
`REINDEX` to fix the broken index.
* Avoid double replay of prepared transactions during crash recovery.
* Ensure that checkpoint calls `fsync` on a newly created but empty table.
* Silence "missing contrecord" errors to avoid logging 

Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-08-01 Thread Jonathan S. Katz

On 8/1/23 5:39 AM, Amit Kapila wrote:

On Fri, Jul 28, 2023 at 5:48 PM vignesh C  wrote:


Here is a patch which checks that there are no WAL records other than
CHECKPOINT_SHUTDOWN WAL record to be consumed based on the discussion
from [1].



Few comments:
=



2.
+ if (dopt.logical_slots_only)
+ {
+ if (!dopt.binary_upgrade)
+ pg_fatal("options --logical-replication-slots-only requires option
--binary-upgrade");
+
+ if (dopt.dataOnly)
+ pg_fatal("options --logical-replication-slots-only and
-a/--data-only cannot be used together");
+
+ if (dopt.schemaOnly)
+ pg_fatal("options --logical-replication-slots-only and
-s/--schema-only cannot be used together");

Can you please explain why the patch imposes these restrictions? I
guess the binary_upgrade is because you want this option to be used
for the upgrade. Do we want to avoid giving any other option with
logical_slots, if so, are the above checks sufficient and why?


Can I take this a step further on the user interface and ask why the 
flag would be "--include-logical-replication-slots" vs. being enabled by 
default?


Are there reasons why we wouldn't enable this feature by default on 
pg_upgrade, and instead (if need be) have a flag that would be 
"--exclude-logical-replication-slots"? Right now, not having the ability 
to run pg_upgrade with logical replication slots enabled on the 
publisher is a a very big pain point for users, so I would strongly 
recommend against adding friction unless there is a very large challenge 
with such an implementation.


Thanks,

Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


PostgreSQL 16 Beta 3 release date

2023-07-31 Thread Jonathan S. Katz

Hi,

The release date for PostgreSQL 16 Beta 3 is August 10, 2023, alongside 
the regular update release[1]. Please be  sure to commit any open 
items[2] for the Beta 3 release before August 6, 2023 0:00 AoE[3] to 
give them enough time to work through the buildfarm.


Thanks,

Jonathan

[1] https://www.postgresql.org/developer/roadmap/
[2] https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items
[3] https://en.wikipedia.org/wiki/Anywhere_on_Earth


OpenPGP_signature
Description: OpenPGP digital signature


Re: psql: Add role's membership options to the \du+ command

2023-07-20 Thread Jonathan S. Katz

On 7/19/23 1:44 PM, Pavel Luzanov wrote:

On 19.07.2023 19:47, Tom Lane wrote:

And done, with some minor editorialization.


Thanks to everyone who participated in the work.
Special thanks to David for moving forward this patch for a long time, 
and to Tom for taking commit responsibilities.


[RMT]

+1; thanks to everyone for seeing this through!

Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


Re: Incremental sort for access method with ordered scan support (amcanorderbyop)

2023-07-13 Thread Jonathan S. Katz

On 7/5/23 2:15 AM, Richard Guo wrote:


On Tue, Jul 4, 2023 at 7:15 PM David Rowley > wrote:


On Tue, 4 Jul 2023 at 20:12, Richard Guo mailto:guofengli...@gmail.com>> wrote:
 > The v4 patch looks good to me (maybe some cosmetic tweaks are still
 > needed for the comments).  I think it's now 'Ready for Committer'.

I agree. I went and hit the comments with a large hammer and while
there also adjusted the regression tests. I didn't think having "t" as
a table name was a good idea as it seems like a name with a high risk
of conflicting with a concurrently running test. Also, there didn't
seem to be much need to insert data into that table as the tests
didn't query any of it.

The only other small tweak I made was to not call list_copy_head()
when the list does not need to be shortened. It's likely not that
important, but if the majority of cases are not partial matches, then
we'd otherwise be needlessly making copies of the list.

I pushed the adjusted patch.


The adjustments improve the patch a lot.  Thanks for adjusting and
pushing the patch.


Thanks for working on this! While it allows the planner to consider 
choosing an incremental sort for indexes that implement 
"amcanorderbyop", it also has a positive side-effect that the planner 
will also consider choosing a plan for spawning parallel workers!


Because of that, I'd like to open the discussion that we consider 
backpatching this. Currently, extensions that implement index access 
methods (e.g. pgvector[1]) that are built primarily around 
"amcanorderbyop" are unable to get the planner to consider choosing a 
parallel scan, i.e. at this point in "create_order_paths"[2]:


/*
* If cheapest partial path doesn't need a sort, this is redundant
* with what's already been tried.
*/
if (!pathkeys_contained_in(root->sort_pathkeys,
   cheapest_partial_path->pathkeys))

However, 625d5b3c does unlock this path for these types of indexes to 
allow for a parallel index scan to be chosen, which would allow 
extensions that implement a "amcanorderbyop" scan to use it. I would 
argue that this is a bug, given we offer the ability for index access 
methods to implement parallel index scans.


That said, I do think they may still need to be one planner tweak to 
properly support parallel index scan in this case, as I have yet to see 
costs generated where the parallel index scan is cheaper. However, I 
have not yet narrowed what/where that is.


Thanks,

Jonathan

[1] https://github.com/pgvector/pgvector
[2] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/optimizer/plan/planner.c;#l5188


OpenPGP_signature
Description: OpenPGP digital signature


Re: plan_create_index_workers doesn't account for TOAST

2023-06-29 Thread Jonathan S. Katz

On 6/29/23 10:12 AM, Jonathan S. Katz wrote:

Hi,

plan_create_index_workers[1] does not consider the amount of tuples 
existing in TOAST pages when determining the number of parallel workers 
to use for a build. The estimation comes from estimate_rel_size[2], 
which in this case, will just take the value from rel->rd_rel->relpages.


We probably don't notice this much with B-trees, given a B-tree is 
typically used for data that does not require toasting. However, this 
becomes more visible when working on custom index access methods that 
implement their own parallel build strategy.


For example, pgvector[3] provides its own data types and index access 
method for indexing vector data. Vectors can get quite large fairly 
quickly, e.g. a 768-dimensional vector takes up 8 + 4*768 = 3080 bytes 
on disk, which quickly clears the default TOAST tuple threshold.


In a recent patch proposal to allow for building indexes in parallel[4], 
I performed a few experiments on how many parallel workers would be 
spawned when indexing 1,000,000 (1MM) 768-dim vectors, both with 
EXTEDNED (default) and PLAIN storage. In all cases, I allowed for leader 
participation, but the leader is not considered in 
plan_create_index_workers.


With EXTENDED, plan_create_index_workers recommended 2 workers. The 
build time was ~2x faster than the serial build.


With PLAIN, plan_create_index_workers recommended 4 workers. The build 
time was **~3X faster** than the serial build.


(I've been doing more detailed, less hand-waivy performance testing, but 
I wanted to provide directional numbers here)


It seems like we're leaving some performance for columns with TOASTed 
data that require indexing, so I wanted to propose allowing the pages in 
TOASTed tables to be considered when we're trying to index a column with 
TOASTed attributes.


Just to add to this: there is a lever to get more parallel workers by 
setting "min_parallel_table_scan_size" to a lower value, which does help 
in this case. However, it does mask the fact that a large chunk of the 
data required to build the index exists in the TOAST table, which is not 
intuitive to a user who rarely has to use tuning parameters.


Thanks,

Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


plan_create_index_workers doesn't account for TOAST

2023-06-29 Thread Jonathan S. Katz

Hi,

plan_create_index_workers[1] does not consider the amount of tuples 
existing in TOAST pages when determining the number of parallel workers 
to use for a build. The estimation comes from estimate_rel_size[2], 
which in this case, will just take the value from rel->rd_rel->relpages.


We probably don't notice this much with B-trees, given a B-tree is 
typically used for data that does not require toasting. However, this 
becomes more visible when working on custom index access methods that 
implement their own parallel build strategy.


For example, pgvector[3] provides its own data types and index access 
method for indexing vector data. Vectors can get quite large fairly 
quickly, e.g. a 768-dimensional vector takes up 8 + 4*768 = 3080 bytes 
on disk, which quickly clears the default TOAST tuple threshold.


In a recent patch proposal to allow for building indexes in parallel[4], 
I performed a few experiments on how many parallel workers would be 
spawned when indexing 1,000,000 (1MM) 768-dim vectors, both with 
EXTEDNED (default) and PLAIN storage. In all cases, I allowed for leader 
participation, but the leader is not considered in 
plan_create_index_workers.


With EXTENDED, plan_create_index_workers recommended 2 workers. The 
build time was ~2x faster than the serial build.


With PLAIN, plan_create_index_workers recommended 4 workers. The build 
time was **~3X faster** than the serial build.


(I've been doing more detailed, less hand-waivy performance testing, but 
I wanted to provide directional numbers here)


It seems like we're leaving some performance for columns with TOASTed 
data that require indexing, so I wanted to propose allowing the pages in 
TOASTed tables to be considered when we're trying to index a column with 
TOASTed attributes.


Thanks,

Jonathan

[1] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/optimizer/plan/planner.c;hb=refs/heads/master#l6734
[2] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/optimizer/util/plancat.c;hb=refs/heads/master#l1117

[3] https://github.com/pgvector/pgvector
[4] https://github.com/pgvector/pgvector/commits/parallel-index-build


OpenPGP_signature
Description: OpenPGP digital signature


Re: PostgreSQL 16 Beta 2 release announcement draft

2023-06-27 Thread Jonathan S. Katz

On 6/27/23 12:54 PM, Roberto Mello wrote:

On Tue, Jun 27, 2023 at 8:32 AM Jonathan S. Katz  wrote:


I used the open items list[1] to build the draft. If there are any
notable please omissions, please let me know.


I noticed that ldap_password_hook [1] was omitted from the release
notes. I believe it should be included if nothing else so that it's
written somewhere that it's there. AFAIK there's no other
documentation about it.


Was this discussed on the release notes thread?[1]. It can always be 
added to the release notes -- those aren't finalized until GA.


After Beta 1, the announcements are either about new feature 
additions/removals since the last beta release, or bug fixes. I don't 
think it makes sense to include this here.


Thanks,

Jonathan

[1] https://www.postgresql.org/message-id/ZGaPa7M3gc2THeDJ%40momjian.us


OpenPGP_signature
Description: OpenPGP digital signature


PostgreSQL 16 Beta 2 release announcement draft

2023-06-27 Thread Jonathan S. Katz

Hi,

Please see the attached draft of the PostgreSQL 16 Beta 2 release 
announcement.


I used the open items list[1] to build the draft. If there are any 
notable please omissions, please let me know.


Please leave all feedback by June 29, 0:00 AoE.

Thanks,

Jonathan

[1] https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items
The PostgreSQL Global Development Group announces that the second beta release 
of
PostgreSQL 16 is now [available for 
download](https://www.postgresql.org/download/).
This release contains previews of all features that will be available when
PostgreSQL 16 is made generally available, though some details of the release
can change during the beta period.

You can find information about all of the features and changes found in
PostgreSQL 16 in the [release 
notes](https://www.postgresql.org/docs/16/release-16.html):

  
[https://www.postgresql.org/docs/16/release-16.html](https://www.postgresql.org/docs/16/release-16.html)

In the spirit of the open source PostgreSQL community, we strongly encourage you
to test the new features of PostgreSQL 16 on your systems to help us eliminate
bugs or other issues that may exist. While we do not advise you to run
PostgreSQL 16 Beta 1 in production environments, we encourage you to find ways
to run your typical application workloads against this beta release.

Your testing and feedback will help the community ensure that the PostgreSQL 16
release upholds our standards of delivering a stable, reliable release of the
world's most advanced open source relational database. Please read more about
our [beta testing process](https://www.postgresql.org/developer/beta/) and how
you can contribute:

  
[https://www.postgresql.org/developer/beta/](https://www.postgresql.org/developer/beta/)

Upgrading to PostgreSQL 16 Beta 2
-

To upgrade to PostgreSQL 16 Beta 2 from an earlier version of PostgreSQL,
you will need to use a strategy similar to upgrading between major versions of
PostgreSQL (e.g. `pg_upgrade` or `pg_dump` / `pg_restore`). For more
information, please visit the documentation section on
[upgrading](https://www.postgresql.org/docs/15/static/upgrading.html).

Changes Since Beta 1


Fixes and changes in PostgreSQL 16 Beta 2 include:

* The default collation provider selected by `initdb` is changed back to `libc`.
* The behavior of selecting the `C` locale with `libicu` is deferred back to
`libicu`. On ICU 64 and higher, the `C` locale is obsolete, and ICU provides its
own mechanism for selecting a locale or throwing an error.
* Several fixes related to join optimizations.
* Fix for B-tree code related to the changes introduced by logical decoding
from standbys.
* Fix cache lookup hazards when looking up `MAINTAIN` privileges on partition
ancestors.

Please see the [release 
notes](https://www.postgresql.org/docs/16/release-16.html)
for a complete list of new and changed features:

  
[https://www.postgresql.org/docs/16/release-16.html](https://www.postgresql.org/docs/16/release-16.html)

Testing for Bugs & Compatibility


The stability of each PostgreSQL release greatly depends on you, the community,
to test the upcoming version with your workloads and testing tools in order to
find bugs and regressions before the general availability of PostgreSQL 16. As
this is a Beta, minor changes to database behaviors, feature details, and APIs
are still possible. Your feedback and testing will help determine the final
tweaks on the new features, so please test in the near future. The quality of
user testing helps determine when we can make a final release.

A list of [open 
issues](https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items)
is publicly available in the PostgreSQL wiki.  You can
[report bugs](https://www.postgresql.org/account/submitbug/) using this form on
the PostgreSQL website:

  
[https://www.postgresql.org/account/submitbug/](https://www.postgresql.org/account/submitbug/)

Beta Schedule
-

This is the second beta release of version 16. The PostgreSQL Project will
release additional betas as required for testing, followed by one or more
release candidates, until the final release in late 2023. For further
information please see the [Beta 
Testing](https://www.postgresql.org/developer/beta/)
page.

Links
-

* [Download](https://www.postgresql.org/download/)
* [Beta Testing Information](https://www.postgresql.org/developer/beta/)
* [PostgreSQL 16 Beta Release 
Notes](https://www.postgresql.org/docs/16/release-16.html)
* [PostgreSQL 16 Open 
Issues](https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items)
* [Feature 
Matrix](https://www.postgresql.org/about/featurematrix/#configuration-management)
* [Submit a Bug](https://www.postgresql.org/account/submitbug/)


OpenPGP_signature
Description: OpenPGP digital signature


Re: psql: Add role's membership options to the \du+ command

2023-06-23 Thread Jonathan S. Katz

On 6/23/23 12:16 PM, Tom Lane wrote:

"David G. Johnston"  writes:

On Thu, Jun 22, 2023 at 5:08 PM Tom Lane  wrote:

* I agree that the "tabular" format looks nicer and has fewer i18n
issues than the other proposals.



As you are on board with a separate command please clarify whether you mean
the tabular format but still with newlines, one row per grantee, or the
table with one row per grantor-grantee pair.


I'd lean towards a straight table with a row per grantee/grantor.
I tend to think that faking table layout with some newlines is
a poor idea.  I'm not dead set on that approach though.


[Personal hat]

Generally, I find the tabular view w/o newlines is easier to read, and 
makes it simpler to join to other data (though that may not be 
applicable here).


Again, I'm not the target user of this feature (until I need to use it), 
so my opinion comes with a few grains of salt.


Thanks,

Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


Re: psql: Add role's membership options to the \du+ command

2023-06-23 Thread Jonathan S. Katz

On 6/23/23 11:52 AM, David G. Johnston wrote:
On Thu, Jun 22, 2023 at 5:08 PM Tom Lane <mailto:t...@sss.pgh.pa.us>> wrote:


"Jonathan S. Katz" mailto:jk...@postgresql.org>> writes:
 > On 6/15/23 2:47 PM, David G. Johnston wrote:
 >> Robert - can you please comment on what you are willing to
commit in
 >> order to close out your open item here.  My take is that the
design for
 >> this, the tabular form a couple of emails ago (copied here), is
 >> ready-to-commit, just needing the actual (trivial) code changes
to be
 >> made to accomplish it.

 > Can we resolve this before Beta 2?[1] The RMT originally advised
to try
 > to resolve before Beta 1[2], and this seems to be lingering.

At this point I kinda doubt that we can get this done before beta2
either, but I'll put in my two cents anyway:


[RMT Hat]

Well, the probability of completing this before the beta 2 freeze is 
effectively zero now. This is a bit disappointing as there was ample 
time since the first RMT nudge on the issue. But let's move forward and 
resolve it before Beta 3.



* I agree that the "tabular" format looks nicer and has fewer i18n
issues than the other proposals.

As you are on board with a separate command please clarify whether you 
mean the tabular format but still with newlines, one row per grantee, or 
the table with one row per grantor-grantee pair.


I still like using newlines here even in the separate meta-command.


(I'll save for the downthread comment).



* Personally I could do without the "empty" business, but that seems
unnecessary in the tabular format; an empty column will serve fine.


I disagree, but not strongly.

I kinda expected you to be on the side of "why are we discussing a 
situation that should just be prohibited" though.


[Personal hat]

I'm still not a fan of "empty" but perhaps the formatting around the 
"separate command" will help drive a conclusion on this.




* I also agree with Pavel's comment that we'd be better off taking
this out of \du altogether and inventing a separate \d command.
Maybe "\drg" for "display role grants"?

Just to be clear, the open item fix proposal is to remove the presently 
broken (due to it showing duplicates without any context) "member of" 
array in \du and make a simple table report output in \drg instead.


I'm good with \drg as a new meta-command.


[Personal hat]

+1 for a new command. The proposal above seems reasonable.

Thanks,

Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


Re: collation-related loose ends before beta2

2023-06-21 Thread Jonathan S. Katz

On 6/20/23 5:02 AM, Jeff Davis wrote:


Status on collation loose ends:

1. There's an open item "Switch to ICU for 17". It's a little bit
confusing exactly what that means, and the CF entry refers to two
items, one of which is the build-time default to --with-icu. As far as
I know, building with ICU by default is a settled issue with no
objections. The second issue is the initdb default, which is covered by
the other open item. So I will just close that open item unless someone
thinks I'm missing something.


[RMT Hat]

No objections. The RMT had interpreted this as "Punt on making ICU the 
building default to v17" but it seems the consensus is to continue to 
leave it in as the default for v16.



2. Open item about the unfriendly rules for choosing an ICU locale at
initdb time. Tom, Robert, and Daniel Verite have expressed concerns
(and at least one objection) to initdb defaulting to  icu for --locale-
provider. Some of the problems have been addressed, but the issue about
C and C.UTF-8 locales is not settled. Even if it were settled I'm not
sure we'd have a clear consensus on all the details. I don't think this
should proceed to beta2 in this state, so I intend to revert back to
libc as the default for initdb. [ I believe we do have a general
consensus that ICU is better, but we can signal it other ways: through
documentation, packaging, etc. ]


[Personal hat]

(Building...)

I do think this raises a good point: it's really the packaging that will 
guide what users are using for v16. I don't know if we want to 
discuss/poll the packagers to see what they are thinking about this?



3. The ICU conversion from "C" to "en-US-u-va-posix": cut out this code
(it was a small part of a larger change). It's only purpose was
consistency between ICU versions, and nobody liked it. It's only here
right now to avoid test failures due to an order-of-commits issue; but
if the initdb default goes back to libc it won't matter and I can
remove it.

4. icu_validation_level WARNING or ERROR: right now an invalid ICU
locale raises a WARNING, but Peter Eisentraut would prefer an ERROR.
I'm still inclined to leave it as a WARNING for one release and
increase it to ERROR later. But if the default collation provider goes
back to libc, the risk of ICU validation errors goes way down, so I
don't object if Peter would like to change it back to an ERROR.


[Personal hat]

I'd be inclined for "WARNING" until getting a sense of what packagers 
who do an initdb as part of the installation process decide what 
collation provider they're going to use.


Thanks,

Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


Re: EBCDIC sorting as a use case for ICU rules

2023-06-21 Thread Jonathan S. Katz

On 6/21/23 12:14 PM, Jeff Davis wrote:

On Wed, 2023-06-21 at 15:28 +0200, Daniel Verite wrote:

At a conference this week I was asked if ICU could be able to
sort like EBCDIC [2]. It turns out it has been already  asked on
-general a few years ago [3] with no satisfactory answer at the time
,
and that it can be implemented with rules in v16.


Interesting, thank you!


+1 -- this is very helpful framing the problem, thank you!


This can be useful for people who migrate from mainframes to Postgres
and need their migration tests to produce the same sorted results as
the
original system.
Since rules can be defined at the database level with the icu_rules
option,
they don't even need to tweak their queries to add COLLATE clauses,
which surely is appreciable in that kind of project.


I still had some technical concerns about the ICU rules feature,
unfortunately, and one option is to only allow it for the collation
objects and not the database level collation. How much would that hurt
this use case?



I'm open to suggestions on whether this EBCDIC example is worth being
in the
doc in some form or putting this in the wiki would be good enough.


I like the idea of having a real example. Ideally, we could add some
explanation along the way about how the rule is constructed to match
EBCDIC, which would reduce the shock of a long rule like that.

I wonder why the rule syntax is such that it cannot be broken up? Would
it be incorrect for us to allow some whitespace in there?


I'll give the unhelpful comment of "yes, I agree we should have a real 
world example", especially one that seems relevant to helping more 
people adopt PostgreSQL.




OpenPGP_signature
Description: OpenPGP digital signature


Re: psql: Add role's membership options to the \du+ command

2023-06-19 Thread Jonathan S. Katz

On 6/15/23 2:47 PM, David G. Johnston wrote:
Robert - can you please comment on what you are willing to commit in 
order to close out your open item here.  My take is that the design for 
this, the tabular form a couple of emails ago (copied here), is 
ready-to-commit, just needing the actual (trivial) code changes to be 
made to accomplish it.


Tabular form

  rolname  | memberof |   options   | 
grantor 
--+--+-+-- postgres |  | |  regress_du_admin | regress_du_role0+| admin, inherit, set+| postgres    +  | regress_du_role1+| admin, inherit, set+| postgres    +  | regress_du_role2 | admin, inherit, set | postgres regress_du_role0 |  | |  regress_du_role1 | regress_du_role0+| admin, inherit, set+| regress_du_admin+  | regress_du_role0+| inherit    +| regress_du_role1+  | regress_du_role0 | set | regress_du_role2 regress_du_role2 | regress_du_role0+| admin  +| regress_du_admin+  | regress_du_role0+| inherit, set   +| regress_du_role1+  | regress_du_role0+| empty  +| regress_du_role2+  | regress_du_role1 | admin, set  | regress_du_admin(5 rows)




[RMT hat]

Can we resolve this before Beta 2?[1] The RMT originally advised to try 
to resolve before Beta 1[2], and this seems to be lingering.


Thanks,

Jonathan

[1] 
https://www.postgresql.org/message-id/460ae02a-3123-16a3-f2d7-ccd79778819b%40postgresql.org
[2] 
https://www.postgresql.org/message-id/d61db38b-29d9-81cc-55b3-8a5c704bb969%40postgresql.org




OpenPGP_signature
Description: OpenPGP digital signature


PostgreSQL 16 Beta 2 Release Date

2023-06-17 Thread Jonathan S. Katz

Hi,

The release date for PostgreSQL 16 Beta 2 is June 29, 2023. Please be 
sure to commit any open items[1] for the Beta 2 release before June 25, 
2023 0:00 AoE[2] to give them enough time to work through the buildfarm.


Thanks,

Jonathan

[1] 
https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items#Important_Dates

[2] https://en.wikipedia.org/wiki/Anywhere_on_Earth


OpenPGP_signature
Description: OpenPGP digital signature


Re: Order changes in PG16 since ICU introduction

2023-06-06 Thread Jonathan S. Katz

On 6/6/23 3:56 PM, Joe Conway wrote:

On 6/6/23 15:55, Tom Lane wrote:

Robert Haas  writes:

On Tue, Jun 6, 2023 at 3:25 PM Tom Lane  wrote:

Also +1, except that I find "none" a rather confusing choice of name.
There *is* a provider, it's just PG itself not either libc or ICU.
I thought Joe's suggestion of "internal" made more sense.



Or perhaps "builtin" or "postgresql".


Either OK by me


Same here


Since we're bikeshedding, "postgresql" or "builtin" could make it seem 
to a (app) developer that these may be recommended options, as we're 
trusting PostgreSQL to make the best choices for us. Granted, v16 is 
(theoretically) defaulting to ICU, so that choice is made, but the 
unsuspecting developer could make a switch based on that naming.


However, I don't have a strong alternative -- I understand the concern 
about "internal", so I'd be OK with "postgresql" unless a better name 
appears.


Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


Re: Let's make PostgreSQL multi-threaded

2023-06-05 Thread Jonathan S. Katz

On 6/5/23 11:33 AM, Heikki Linnakangas wrote:

On 05/06/2023 11:18, Tom Lane wrote:

Heikki Linnakangas  writes:

I spoke with some folks at PGCon about making PostgreSQL multi-threaded,
so that the whole server runs in a single process, with multiple
threads. It has been discussed many times in the past, last thread on
pgsql-hackers was back in 2017 when Konstantin made some experiments 
[0].



I feel that there is now pretty strong consensus that it would be a good
thing, more so than before. Lots of work to get there, and lots of
details to be hashed out, but no objections to the idea at a high level.



The purpose of this email is to make that silent consensus explicit. If
you have objections to switching from the current multi-process
architecture to a single-process, multi-threaded architecture, please
speak up.


For the record, I think this will be a disaster.  There is far too much
code that will get broken, largely silently, and much of it is not
under our control.


Noted. Other large projects have gone through this transition. It's not 
easy, but it's a lot easier now than it was 10 years ago. The platform 
and compiler support is there now, all libraries have thread-safe 
interfaces, etc.


I don't expect you or others to buy into any particular code change at 
this point, or to contribute time into it. Just to accept that it's a 
worthwhile goal. If the implementation turns out to be a disaster, then 
it won't be accepted, of course. But I'm optimistic.


I don't have enough expertise in this area to comment on if it'd be a 
"disaster" or not. My zoomed out observations are two-fold:


1. It seems like there's a lack of consensus on which of processes vs. 
threads yield the best performance benefit, and from talking to folks 
with greater expertise than me, this can vary between workloads. I 
believe one DB even gives uses a choice if they want to run in processes 
vs. threads.


2. While I wouldn't want to necessarily discourage a moonshot effort, I 
would ask if developer time could be better spent on tackling some of 
the other problems around vertical scalability? Per some PGCon 
discussions, there's still room for improvement in how PostgreSQL can 
best utilize resources available very large "commodity" machines (a 
448-core / 24TB RAM instance comes to mind).


I'm purposely giving a nonanswer on whether it's a worthwhile goal, but 
rather I'd be curious where it could stack up against some other efforts 
to continue to help PostgreSQL improve performance and handle very large 
workloads.


Thanks,

Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


Re: Docs: Encourage strong server verification with SCRAM

2023-06-05 Thread Jonathan S. Katz

On 6/5/23 11:22 AM, Jacob Champion wrote:

On Sat, Jun 3, 2023 at 2:50 PM Michael Paquier  wrote:

Took me some time to get back to it, but applied this way.


Thanks all!


+1; thank you!

Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


Re: Docs: Encourage strong server verification with SCRAM

2023-05-28 Thread Jonathan S. Katz

On 5/26/23 6:47 PM, Jacob Champion wrote:

On Thu, May 25, 2023 at 6:10 PM Jonathan S. Katz  wrote:



+   To prevent server spoofing from occurring when using
+   scram-sha-256 password authentication
+   over a network, you should ensure you are connecting using SSL.


seems to backtrack on the recommendation -- you have to use
sslmode=verify-full, not just SSL, to avoid handing a weak(er) hash to
an untrusted party.


The above assumes that the reader reviewed the previous paragraph and 
followed the guidelines there. However, we can make it explicit. Please 
see attached.


Thanks,

Jonathan
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index dbe23db54f..9a9fa7b206 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -2014,6 +2014,19 @@ pg_dumpall -p 5432 | psql -d postgres -p 5433
CA.
   
 
+  
+   To prevent server spoofing from occurring when using
+   scram-sha-256 password authentication
+   over a network, you should ensure that you connect to the server using SSL
+   and with one of the anti-spoofing methods described in the previous
+   paragraph. Additionally, the SCRAM implementation in
+   libpq cannot protect the entire authentication
+   exchange, but using the channel_binding=require 
connection
+   parameter provides a mitigation against server spoofing. An attacker that
+   uses a rogue server to intercept a SCRAM exchange can use offline analysis 
to
+   determine the hashed password from the client.
+  
+
   
 To prevent spoofing with GSSAPI, the server must be configured to accept
 only hostgssenc connections


OpenPGP_signature
Description: OpenPGP digital signature


Re: vector search support

2023-05-26 Thread Jonathan S. Katz

On 4/26/23 9:31 AM, Giuseppe Broccolo wrote:

Hi Nathan,

I find the patches really interesting. Personally, as Data/MLOps 
Engineer, I'm involved in a project where we use embedding techniques to 
generate vectors from documents, and use clustering and kNN searches to 
find similar documents basing on spatial neighbourhood of generated 
vectors.


Thanks! This seems to be a pretty common use-case these days.

We finally opted for ElasticSearch as search engine, considering that it 
was providing what we needed:


* support to store dense vectors
* support for kNN searches (last version of ElasticSearch allows this)


I do want to note that we can implement indexing techniques with GiST 
that perform K-NN searches with the "distance" support function[1], so 
adding the fundamental functions to help with this around known vector 
search techniques could add this functionality. We already have this 
today with "cube", but as Nathan mentioned, it's limited to 100 dims.


An internal benchmark showed us that we were able to achieve the 
expected performance, although we are still lacking some points:


* clustering of vectors (this has to be done outside the search engine, 
using DBScan for our use case)


From your experience, have you found any particular clustering 
algorithms better at driving a good performance/recall tradeoff?


* concurrency in updating the ElasticSearch indexes storing the dense 
vectors


I do think concurrent updates of vector-based indexes is one area 
PostgreSQL can ultimately be pretty good at, whether in core or in an 
extension.


I found these patches really interesting, considering that they would 
solve some of open issues when storing dense vectors. Index support 
would help a lot with searches though.


Great -- thanks for the feedback,

Jonathan

[1] https://www.postgresql.org/docs/devel/gist-extensibility.html


OpenPGP_signature
Description: OpenPGP digital signature


Re: vector search support

2023-05-26 Thread Jonathan S. Katz

On 5/25/23 1:48 PM, Oliver Rice wrote:

A nice side effect of using the float8[] to represent vectors is that it 
allows for vectors of different sizes to coexist in the same column.


We most frequently see (pgvector) vector columns being used for storing 
ML embeddings. Given that different models produce embeddings with a 
different number of dimensions, the need to specify a vector’s size in 
DDL tightly couples the schema to a single model. Support for variable 
length vectors would be a great way to decouple those concepts. It would 
also be a differentiating feature from existing vector stores.


I hadn't thought of that, given most of what I've seen (or at least my 
personal bias in designing systems) is you keep a vector of one 
dimensionality in a column. But this sounds like where having native 
support in a variable array would help.


One drawback is that variable length vectors complicates indexing for 
similarity search because similarity measures require vectors of 
consistent length. Partial indexes are a possible solution to that challenge


Yeah, that presents a challenge. This may also be an argument for a 
vector data type, since that would eliminate the need to check for 
consistent dimensionality on the indexing.


Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


Re: vector search support

2023-05-26 Thread Jonathan S. Katz

Hi,

On 4/21/23 8:07 PM, Nathan Bossart wrote:

Attached is a proof-of-concept/work-in-progress patch set that adds
functions for "vectors" repreѕented with one-dimensional float8 arrays.
These functions may be used in a variety of applications, but I am
proposing them with the AI/ML use-cases in mind.  I am posting this early
in the v17 cycle in hopes of gathering feedback prior to PGCon.


Thanks for proposing this. Looking forward to discussing more in person. 
There's definitely demand to use PostgreSQL to store / search over 
vector data, and I do think we need to improve upon this in core.



With the accessibility of AI/ML tools such as large language models (LLMs),
there has been a demand for storing and manipulating high-dimensional
vectors in PostgreSQL, particularly around nearest-neighbor queries.  Many
of these vectors have more than 1500 dimensions. 


1536 seems to be a popular one from LLMs, but I've been seeing much 
higher dimensionality (8K, 16K etc). My hunch is that at a practical 
level, apps are going to favor data sets / sources that use a reduced 
dimensionality, but I wouldn't be shocked if we see vectors of all sizes.



The cube extension [0]
provides some of the distance functionality (e.g., taxicab, Euclidean, and
Chebyshev), but it is missing some popular functions (e.g., cosine
similarity, dot product), and it is limited to 100 dimensions.  We could
extend cube to support more dimensions, but this would require reworking
its indexing code and filling in gaps between the cube data type and the
array types.  For some previous discussion about using the cube extension
for this kind of data, see [1].


I've stared at the cube code quite a bit over the past few months. There 
are definitely some clever methods in it for handling searches over 
(now) lower dimensionality data, but I generally agree we should add 
functionality that's specific to ARRAY types.


I'll start making specific comments on the patches below.



float8[] is well-supported and allows for effectively unlimited dimensions
of data.  float8 matches the common output format of many AI embeddings,
and it allows us or extensions to implement indexing methods around these
functions.  This patch set does not yet contain indexing support, but we
are exploring using GiST or GIN for the use-cases in question.  It might
also be desirable to add support for other linear algebra operations (e.g.,
operations on matrices).  The attached patches likely only scratch the
surface of the "vector search" use-case.

The patch set is broken up as follows:

  * 0001 does some minor refactoring of dsqrt() in preparation for 0002.


This seems pretty benign and may as well do anyway, though we may need 
to expand on it based on comments on next patch. Question on:


+static inline float8
+float8_sqrt(const float8 val)
+{
+   float8  result;
+
+   if (unlikely(val < 0))

Should this be:

  if (unlikely(float8_lt(val, 0))

Similarly:

+   if (unlikely(result == 0.0) && val != 0.0)

  if (unlikely(float8_eq(result,0.0)) && float8_ne(val, 0.0))



  * 0002 adds several vector-related functions, including distance functions
and a kmeans++ implementation.


Nice. Generally I like this patch. The functions seems to match the most 
commonly used vector distance functions I'm seeing, and it includes a 
function that can let a user specify a constraint on an ARRAY column so 
they can ensure it contains valid vectors.


While I think supporting float8 is useful, I've been seeing a mix of 
data types in the different AI/ML vector embeddings, i.e. float4 / 
float8. Additionally, it could be helpful to support integers as well, 
particularly based on some of the dimensionality reduction techniques 
I've seen. I think this holds double true for kmeans, which is often 
used in those calculations.


I'd suggest ensure these functions support:

* float4, float8
* int2, int4, int8

There's probably some nuance of how we document this too, given our 
docs[1] specify real / double precision, and smallint, int, bigint.


(Separately, we mention the int2/int4/int8 aliases in [1], but not 
float4/float8, which seems like a small addition we should make).


If you agree, I'd be happy to review more closely once that's implemented.

Other things:

* kmeans -- we're using kmeans++, should the function name reflect that? 
Do you think we could end up with a different kmeans algo in the future? 
Maybe we allow the user to specify the kmeans algo from the function 
name (with the default / only option today being kmeans++)?



  * 0003 adds support for optionally using the OpenBLAS library, which is an
implementation of the Basic Linear Algebra Subprograms [2]
specification.  Basic testing with this library showed a small
performance boost, although perhaps not enough to justify giving this
patch serious consideration.


It'd be good to see what else we could use OpenBLAS with. Maybe that's a 
discussion for PGCon.



Of 

Re: Docs: Encourage strong server verification with SCRAM

2023-05-25 Thread Jonathan S. Katz

On 5/25/23 3:27 PM, Jacob Champion wrote:

On Thu, May 25, 2023 at 10:48 AM Jonathan S. Katz  wrote:

Overall, +1 to tightening the language around the docs in this area.

However, to paraphrase Stephen, I think the language, as currently
written, makes the problem sound scarier than it actually is, and I
agree that we should just inline it above.


How does v2 look? I more or less divided the current text into a local
section and a network section. (I'm still not clear on where in the
current text you're wanting me to inline a sudden aside on SCRAM; it
doesn't really seem to fit in any of the existing paragraphs.)


I read through the proposal and like this much better. I missed 
Stephen's point on the "where" to put it in this section; I actually 
don't know if I agree with that (he says while painting the bikeshed), 
given the we spend two paragraphs describing how to prevent spoofing in 
general over the network, vs. just during SCRAM authentication.


I rewrote this to just focus on server spoofing that can occur with 
SCRAM authentication and did some wordsmithing. I was torn on keeping in 
the part of offline analysis of an intercepted hash, given one can do 
this with md5 as well, but perhaps it helps elaborate on the consequences.


Thanks,

Jonathan
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index dbe23db54f..a3f4b258f7 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -2014,6 +2014,18 @@ pg_dumpall -p 5432 | psql -d postgres -p 5433
CA.
   
 
+  
+   To prevent server spoofing from occurring when using
+   scram-sha-256 password authentication
+   over a network, you should ensure you are connecting using SSL. 
Additionally,
+   the SCRAM implementation in libpq cannot protect
+   the entire authentication exchange, but using the
+   channel_binding=require connection parameter provides a
+   mitigation against server spoofing. An attacker that uses a rogue server to
+   intercept a SCRAM exchange can use offline analysis to determine the hashed
+   password from the client.
+  
+
   
 To prevent spoofing with GSSAPI, the server must be configured to accept
 only hostgssenc connections


OpenPGP_signature
Description: OpenPGP digital signature


Re: Docs: Encourage strong server verification with SCRAM

2023-05-25 Thread Jonathan S. Katz

On 5/25/23 1:29 PM, Stephen Frost wrote:

Greetings,

* Jacob Champion (jchamp...@timescale.com) wrote:

On 5/24/23 05:04, Daniel Gustafsson wrote:

On 23 May 2023, at 23:02, Stephen Frost  wrote:
Perhaps more succinctly- maybe we should be making adjustments to the
current language instead of just adding a new paragraph.


+1


I'm 100% on board for doing that as well, but the "instead of"
suggestion makes me think I didn't explain my goal very well. For
example, Stephen, you said


I have to admit that the patch as presented strikes me as a warning
without really providing steps for how to address the issues mentioned
though; there's a reference to what was just covered at the bottom but
nothing really new.


but the last sentence of my patch *is* the suggested step:


+  ... It's recommended to employ strong server
+  authentication, using one of the above anti-spoofing measures, to prevent
+  these attacks.


I was referring specifically to that ordering as not being ideal or in
line with the rest of the flow of that section.  We should integrate the
concerns higher in the section where we outline the reason these things
matter and then follow those with the specific steps for how to address
them, not give a somewhat unclear reference from the very bottom back up
to something above.


Caught up on this exchange. Some of my comments may be out-of-order.

Overall, +1 to tightening the language around the docs in this area.

However, to paraphrase Stephen, I think the language, as currently 
written, makes the problem sound scarier than it actually is, and I 
agree that we should just inline it above. There may also be some 
follow-up development work we can do to mitigate the issues.


I think it's fine for us to recommend using channel binding, but if 
we're concerned about server spoofing / rogue servers, we should also 
recommend using sslmode=verify-full to ensure server-identity. That 
doesn't necessarily help in the case the server itself has gone rogue, 
but that mitigates the MITM risk. The SCRAM RFC is also very clear that 
you should be using TLS[1].


I really don't think the "startup packet" is an actual issue, but I 
think recommending good TLS / channel binding mitigates this.


For the iteration count, I'm generally ambivalent here. I think again, 
if you're using good TLS, this is most likely mitigated. If you're 
connecting to a rogue server using good TLS, you likely have other 
issues to deal with. However, there may be a libpq feature here that 
lets a client specify a minimum iteration count it will accept for 
purposes of SCRAM.


Thanks,

Jonathan

[1] https://www.rfc-editor.org/rfc/rfc5802#section-9



OpenPGP_signature
Description: OpenPGP digital signature


Re: PostgreSQL 16 Beta 1 release announcement draft

2023-05-25 Thread Jonathan S. Katz

On 5/25/23 12:16 AM, Andres Freund wrote:

Hi,

On 2023-05-24 23:30:58 -0400, Jonathan S. Katz wrote:

Ah, OK, that's why I didn't grok it. I read through the first message
in[1] and definitely agree it should be in the announcement. How about:

"PostgreSQL 16 also shows up to a 300% improvement when concurrently
loading data with `COPY`"


I currently have it as the below in the release announcement. If it you send
any suggested updates, I can try to put them in before release:

PostgreSQL 16 can also improve the performance of concurrent bulk loading of
data using [`COPY`](https://www.postgresql.org/docs/16/sql-copy.html) up to
a 300%.


It also speeds up concurrent loading when not using COPY, just to a lesser
degree. But I can't come up with a concise phrasing for that right now...


I left as is (in part because of a hurried morning), but we can improve 
upon it for the GA.


Thanks,

Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


Re: PostgreSQL 16 Beta 1 release announcement draft

2023-05-24 Thread Jonathan S. Katz

On 5/24/23 11:30 PM, Jonathan S. Katz wrote:

On 5/24/23 9:20 PM, Jonathan S. Katz wrote:


I currently have it as the below in the release announcement. If it you 
send any suggested updates, I can try to put them in before release:


PostgreSQL 16 can also improve the performance of concurrent bulk 
loading of data using 
[`COPY`](https://www.postgresql.org/docs/16/sql-copy.html) up to a 300%.


(without the "a 300%" typo).

Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


Re: PostgreSQL 16 Beta 1 release announcement draft

2023-05-24 Thread Jonathan S. Katz

On 5/24/23 9:20 PM, Jonathan S. Katz wrote:

On 5/24/23 8:04 PM, Andres Freund wrote:

Hi,

On 2023-05-24 19:57:39 -0400, Jonathan S. Katz wrote:

On 5/24/23 5:28 PM, Andres Freund wrote:


I think the relation extension improvements ought to be mentioned 
here as

well? Up to 3x faster concurrent data load with COPY seems practically
relevant.


I missed that -- not sure I'm finding it in the release notes with a 
quick

grep -- which commit/thread is this?


It was split over quite a few commits, the one improving COPY most
significantly is

commit 00d1e02be24987180115e371abaeb84738257ae2
Author: Andres Freund 
Date:   2023-04-06 16:35:21 -0700

 hio: Use ExtendBufferedRelBy() to extend tables more efficiently

Relevant thread: 
https://postgr.es/m/20221029025420.eplyow6k7tgu6...@awork3.anarazel.de


It's in the release notes as:
   Allow more efficient addition of heap and index pages (Andres Freund)


Ah, OK, that's why I didn't grok it. I read through the first message 
in[1] and definitely agree it should be in the announcement. How about:


"PostgreSQL 16 also shows up to a 300% improvement when concurrently 
loading data with `COPY`"


I currently have it as the below in the release announcement. If it you 
send any suggested updates, I can try to put them in before release:


PostgreSQL 16 can also improve the performance of concurrent bulk 
loading of data using 
[`COPY`](https://www.postgresql.org/docs/16/sql-copy.html) up to a 300%.


Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


Re: PostgreSQL 16 Beta 1 release announcement draft

2023-05-24 Thread Jonathan S. Katz

On 5/24/23 8:04 PM, Andres Freund wrote:

Hi,

On 2023-05-24 19:57:39 -0400, Jonathan S. Katz wrote:

On 5/24/23 5:28 PM, Andres Freund wrote:


I think the relation extension improvements ought to be mentioned here as
well? Up to 3x faster concurrent data load with COPY seems practically
relevant.


I missed that -- not sure I'm finding it in the release notes with a quick
grep -- which commit/thread is this?


It was split over quite a few commits, the one improving COPY most
significantly is

commit 00d1e02be24987180115e371abaeb84738257ae2
Author: Andres Freund 
Date:   2023-04-06 16:35:21 -0700

 hio: Use ExtendBufferedRelBy() to extend tables more efficiently

Relevant thread: 
https://postgr.es/m/20221029025420.eplyow6k7tgu6...@awork3.anarazel.de

It's in the release notes as:
   Allow more efficient addition of heap and index pages (Andres Freund)


Ah, OK, that's why I didn't grok it. I read through the first message 
in[1] and definitely agree it should be in the announcement. How about:


"PostgreSQL 16 also shows up to a 300% improvement when concurrently 
loading data with `COPY`"


Thanks,

Jonathan

[1] https://postgr.es/m/20221029025420.eplyow6k7tgu6...@awork3.anarazel.de



OpenPGP_signature
Description: OpenPGP digital signature


Re: PostgreSQL 16 Beta 1 release announcement draft

2023-05-24 Thread Jonathan S. Katz

On 5/24/23 5:28 PM, Andres Freund wrote:


I think the relation extension improvements ought to be mentioned here as
well? Up to 3x faster concurrent data load with COPY seems practically
relevant.


I missed that -- not sure I'm finding it in the release notes with a 
quick grep -- which commit/thread is this?


But yes this does sound like something that should be included, I just 
want to read upon it.


Thanks,

Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


Re: PostgreSQL 16 Beta 1 release announcement draft

2023-05-24 Thread Jonathan S. Katz

On 5/22/23 3:23 PM, Erik Rijkers wrote:

Op 5/21/23 om 19:07 schreef Jonathan S. Katz:

On 5/19/23 12:17 AM, Jonathan S. Katz wrote:

Hi,

Attached is a draft of the release announcement for PostgreSQL 16 
Beta Please provide feedback no later than May 24, 0:00 AoE. This 
will give 
Thanks everyone for your feedback. Here is the updated text that 


'substransaction'  should be
'subtransaction'


Fixed.


'use thousands separators'  perhaps is better:
'use underscore as digit-separator, as in `5_432` and `1_00_000`'


I looked at how other languages document this, and they do use the term 
"thousands separators." I left that in, but explicitly called out the 
underscore.



'instrcut'  should be
'instruct'


Fixed. Attached is the (hopefully) final draft.

Thanks,

Jonathan

The PostgreSQL Global Development Group announces that the first beta release of
PostgreSQL 16 is now [available for 
download](https://www.postgresql.org/download/).
This release contains previews of all features that will be available when
PostgreSQL 16 is made generally available, though some details of the release
can change during the beta period.

You can find information about all of the features and changes found in
PostgreSQL 16 in the [release 
notes](https://www.postgresql.org/docs/16/release-16.html):

  
[https://www.postgresql.org/docs/16/release-16.html](https://www.postgresql.org/docs/16/release-16.html)

In the spirit of the open source PostgreSQL community, we strongly encourage you
to test the new features of PostgreSQL 16 on your systems to help us eliminate
bugs or other issues that may exist. While we do not advise you to run
PostgreSQL 16 Beta 1 in production environments, we encourage you to find ways
to run your typical application workloads against this beta release.

Your testing and feedback will help the community ensure that the PostgreSQL 16
release upholds our standards of delivering a stable, reliable release of the
world's most advanced open source relational database. Please read more about
our [beta testing process](https://www.postgresql.org/developer/beta/) and how
you can contribute:

  
[https://www.postgresql.org/developer/beta/](https://www.postgresql.org/developer/beta/)

PostgreSQL 16 Feature Highlights


### Performance

PostgreSQL 16 includes performance improvements in query execution. This release
adds more query parallelism, including allowing `FULL` and `RIGHT` joins to
execute in parallel, and parallel execution of the `string_agg` and `array_agg`
aggregate functions. Additionally, PostgreSQL 16 can use incremental sorts in
`SELECT DISTINCT` queries. There are also several optimizations for
[window 
queries](https://www.postgresql.org/docs/16/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS),
improvements in lookups for `RANGE` and `LIST` partitions, and support for
"anti-joins" in `RIGHT` and `OUTER` queries.

This release also introduces support for CPU acceleration using SIMD for both
x86 and ARM architectures, including optimizations for processing ASCII and JSON
strings, and array and subtransaction searches. Additionally, PostgreSQL 16
introduces [load 
balancing](https://www.postgresql.org/docs/16/libpq-connect.html#LIBPQ-CONNECT-LOAD-BALANCE-HOSTS)
to libpq, the client library for PostgreSQL.

### Logical Replication Enhancements

Logical replication lets PostgreSQL users stream data in real-time to other
PostgreSQL or other external systems that implement the logical protocol. Until
PostgreSQL 16, users could only create logical replication publishers on primary
instances. PostgreSQL 16 adds the ability to perform logical decoding on a
standby instance, giving users more options to distribute their workload, for
example, use a standby that's less busy than a primary to logically replicate
changes.

PostgreSQL 16 also includes several performance improvements to logical
replication. This includes allowing the subscriber to apply large transactions
in parallel, use indexes other than the `PRIMARY KEY` to perform lookups during
`UPDATE` or `DELETE` operations, and allow for tables to be copied using binary
format during initialization.

### Developer Experience

PostgreSQL 16 continues to implement the 
[SQL/JSON](https://www.postgresql.org/docs/16/functions-json.html)
standard for manipulating 
[JSON](https://www.postgresql.org/docs/16/datatype-json.html)
data, including support for SQL/JSON constructors (e.g. `JSON_ARRAY()`,
`JSON_ARRAYAGG()` et al), and identity functions (`IS JSON`). This release also
adds the SQL standard 
[`ANY_VALUE`](https://www.postgresql.org/docs/16/functions-aggregate.html#id-1.5.8.27.5.2.4.1.1.1.1)
aggregate function, which returns any arbitrary value from the aggregate set.
For convenience, PostgreSQL 16 now lets you specify non-decimal integer
literals, such as `0xff`, `0o777`, and `0b101010`, and use underscores as
thousands separators, such as `5_432`.

This release adds support for the ext

Re: PG 16 draft release notes ready

2023-05-24 Thread Jonathan S. Katz

On 5/24/23 12:13 AM, David Rowley wrote:

On Wed, 24 May 2023 at 15:54, Bruce Momjian  wrote:


On Wed, May 24, 2023 at 08:37:45AM +1200, David Rowley wrote:

On Mon, 22 May 2023 at 07:05, Jonathan S. Katz  wrote:

* Parallel execution of queries that use `FULL` and `OUTER` joins


I think this should be `RIGHT` joins rather than `OUTER` joins.

LEFT joins have been parallelizable I think for a long time now.


Well, since we can swap left/right easily, why would we not have just
have swappted the tables and done the join in the past?  I think there
are two things missing in my description.

First, I need to mention parallel _hash_ join.  Second, I think this
item is saying that the _inner_ side of a parallel hash join can be an
OUTER or FULL join.  How about?

 Allow hash joins to be parallelized where the inner side is
 processed as an OUTER or FULL join (Melanie Plageman, Thomas Munro)

In this case, the inner side is the hashed side.


I think Jonathan's text is safe to swap OUTER to RIGHT as it mentions
"execution".


I made this swap in the release announcement. Thanks!

Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


Re: PG 16 draft release notes ready

2023-05-23 Thread Jonathan S. Katz

On 5/22/23 4:18 PM, Robert Haas wrote:

On Sun, May 21, 2023 at 3:05 PM Jonathan S. Katz  wrote:

* Support for regular expressions for matching usernames and databases
names in `pg_hba.conf`, and user names in `pg_ident.conf`


I suggest that this is not a major feature.

Perhaps the work that I did to improve CREATEROLE could be considered
for inclusion in the major features list. In previous releases,
someone with CREATEROLE can hack the PG OS account. Now they can't. In
previous releases, someone with CREATEROLE can manage all
non-superuser roles, but now they can manage the roles they create (or
ones they are given explicit authority to manage). You can even
control whether or not such users automatically inherit the privileges
of roles they create, as superusers inherit all privileges. There is
certainly some argument that this is not a sufficiently significant
set of changes to justify a major feature mention, and even if it is,
it's not clear to me exactly how it would be best worded. And yet I
feel like it's very likely that if we look back on this release in 3
years, those changes will have had a significant impact on many
PostgreSQL deployments, above all in the cloud, whereas I think it
likely that the ability to have regular expressions in pg_hba.conf and
pg_ident.conf will have had very little effect by comparison.

Of course, there is always a possibility that I'm over-estimating the
impact of my own work.


In general, I'm completely fine with people advocating for their own 
features during this process, in case there's something that I missed.


For this case, while I think this work is very impactful, but I don't 
know if I'd call it a major feature vs. modifying an unintended 
behavior. Additionally, folks have likely put mitigations in place for 
this through the years. I'm happy to be convinced otherwise.


The regular expressions in the files adds an ability that both we didn't 
have before, and has been a request I've heard from users with very 
large deployments. For them, it'll help simplify a lot of their 
configurations/automations for setting this up en masse. Again, I'm 
happy to be convinced otherwise.


I wanted to use the beta release to allow for us to see 1/ how people 
ultimately test these things and 2/ help better sift out what will be 
called a major feature. We could end up shuffling items in the list or 
completely rewriting it, so it's not set in stone.


Thanks,

Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


Re: PG 16 draft release notes ready

2023-05-23 Thread Jonathan S. Katz

On 5/23/23 4:37 PM, David Rowley wrote:

On Mon, 22 May 2023 at 07:05, Jonathan S. Katz  wrote:

* Parallel execution of queries that use `FULL` and `OUTER` joins


I think this should be `RIGHT` joins rather than `OUTER` joins.

LEFT joins have been parallelizable I think for a long time now.


I had grabbed it from this line:

  Allow outer and full joins to be performed in parallel (Melanie 
Plageman, Thomas Munro)


If we want to be specific on RIGHT joins, I can update it in the release 
announcement, but it may be too late for the release notes (at least for 
beta 1).


Thanks,

Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


Re: PG 16 draft release notes ready

2023-05-21 Thread Jonathan S. Katz

On 5/21/23 3:04 PM, Jonathan S. Katz wrote:

On 5/18/23 4:49 PM, Bruce Momjian wrote:

I have completed the first draft of the PG 16 release notes.


One thing that we could attempt for this beta is to include a 
prospective list of "major features + enhancements." Of course it can 
change before the GA, but it'll give readers some idea of things to test.


I'd propose the following (in no particular order):

* General performance improvements for read-heavy workloads (looking for 
clarification for that in[1])


Per [1] this sounds like it should be:

* Optimization to reduce overall memory usage, including general 
performance improvements.


We can get more specific for the GA.

Thanks,

Jonathan

[1] 
https://www.postgresql.org/message-id/5749E807-A5B7-4CC7-8282-84F6F0D4D1D0%40anarazel.de




OpenPGP_signature
Description: OpenPGP digital signature


Re: PG 16 draft release notes ready

2023-05-21 Thread Jonathan S. Katz

On 5/21/23 3:24 PM, Andres Freund wrote:

Hi,

On May 21, 2023 11:46:56 AM PDT, "Jonathan S. Katz"  
wrote:

On 5/21/23 1:13 PM, Andres Freund wrote:



Looking through the release notes, I didn't see an entry for

commit c6e0fe1f2a08505544c410f613839664eea9eb21
Author: David Rowley 
Date:   2022-08-29 17:15:00 +1200
Improve performance of and reduce overheads of memory management

even though I think that's one of the more impactful improvements. What was
the reason for leaving that out?


IIUC in[1], would this "just speed up" read-heavy workloads?


I don't think so. It can speed up write workloads as well. But more importantly 
it can noticeably reduce memory usage, including for things like the relcache.


Cool! I'll dive more into the thread later to learn more.

Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


Re: PG 16 draft release notes ready

2023-05-21 Thread Jonathan S. Katz

On 5/18/23 4:49 PM, Bruce Momjian wrote:

I have completed the first draft of the PG 16 release notes.


One thing that we could attempt for this beta is to include a 
prospective list of "major features + enhancements." Of course it can 
change before the GA, but it'll give readers some idea of things to test.


I'd propose the following (in no particular order):

* General performance improvements for read-heavy workloads (looking for 
clarification for that in[1])


* Parallel execution of queries that use `FULL` and `OUTER` joins

* Logical replication allowed from read-only standbys

* Logical replication subscribers can apply large transactions in parallel

* Monitoring of I/O statistics through the `pg_stat_io` view

* Addition of SQL/JSON constructors and identity functions

* Optimizations to the vacuum freezing strategy

* Support for regular expressions for matching usernames and databases 
names in `pg_hba.conf`, and user names in `pg_ident.conf`


The above is tossing items at the wall, and I'm OK with any or all being 
modified or replaced.


Thanks,

Jonathan

[1] 
https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=v...@mail.gmail.com


OpenPGP_signature
Description: OpenPGP digital signature


Re: PG 16 draft release notes ready

2023-05-21 Thread Jonathan S. Katz

On 5/21/23 1:13 PM, Andres Freund wrote:



Looking through the release notes, I didn't see an entry for

commit c6e0fe1f2a08505544c410f613839664eea9eb21
Author: David Rowley 
Date:   2022-08-29 17:15:00 +1200
  
 Improve performance of and reduce overheads of memory management


even though I think that's one of the more impactful improvements. What was
the reason for leaving that out?


IIUC in[1], would this "just speed up" read-heavy workloads?

Thanks,

Jonathan

[1] 
https://www.postgresql.org/message-id/CAApHDvpjauCRXcgcaL6%2Be3eqecEHoeRm9D-kcbuvBitgPnW%3Dvw%40mail.gmail.com




OpenPGP_signature
Description: OpenPGP digital signature


Re: PostgreSQL 16 Beta 1 release announcement draft

2023-05-21 Thread Jonathan S. Katz

On 5/19/23 12:17 AM, Jonathan S. Katz wrote:

Hi,

Attached is a draft of the release announcement for PostgreSQL 16 Beta 
1. The goal of this announcement is to get people excited about testing 
the beta and highlight many of the new features.


Please review for inaccuracies, omissions, and other suggestions / errors.

Please provide feedback no later than May 24, 0:00 AoE. This will give 
me enough time to incorporate the changes prior to the release the next 
day.


Thanks everyone for your feedback. Here is the updated text that 
combines all of the feedback from both -advocacy and -hackers.


Thanks,

Jonathan
The PostgreSQL Global Development Group announces that the first beta release of
PostgreSQL 16 is now [available for 
download](https://www.postgresql.org/download/).
This release contains previews of all features that will be available when
PostgreSQL 16 is made generally available, though some details of the release
can change during the beta period.

You can find information about all of the features and changes found in
PostgreSQL 16 in the [release 
notes](https://www.postgresql.org/docs/16/release-16.html):

  
[https://www.postgresql.org/docs/16/release-16.html](https://www.postgresql.org/docs/16/release-16.html)

In the spirit of the open source PostgreSQL community, we strongly encourage you
to test the new features of PostgreSQL 16 on your systems to help us eliminate
bugs or other issues that may exist. While we do not advise you to run
PostgreSQL 16 Beta 1 in production environments, we encourage you to find ways
to run your typical application workloads against this beta release.

Your testing and feedback will help the community ensure that the PostgreSQL 16
release upholds our standards of delivering a stable, reliable release of the
world's most advanced open source relational database. Please read more about
our [beta testing process](https://www.postgresql.org/developer/beta/) and how
you can contribute:

  
[https://www.postgresql.org/developer/beta/](https://www.postgresql.org/developer/beta/)

PostgreSQL 16 Feature Highlights


### Logical replication enhancements

Logical replication lets PostgreSQL users stream data in real-time to other
PostgreSQL or other external systems that implement the logical protocol. Until
PostgreSQL 16, users could only create logical replication publishers on primary
instances. PostgreSQL 16 adds the ability to perform logical decoding on a
standby instance, giving users more options to distribute their workload, for
example, use a standby that's less busy than a primary to logically replicate
changes.

PostgreSQL 16 also includes several performance improvements to logical
replication. This includes allowing the subscriber to apply large transactions
in parallel, use indexes other than the `PRIMARY KEY` to perform lookups during
`UPDATE` or `DELETE` operations, and allow for tables to be copied using binary
format during initialization.

### Performance

PostgreSQL 16 includes performance improvements in query execution. This release
adds more query parallelism, including allowing `FULL` and `OUTER` joins to
execute in parallel, and parallel execution of the `string_agg` and `array_agg`
aggregate functions. Additionally, PostgreSQL 16 can use incremental sorts in
`SELECT DISTINCT` queries. There are also several optimizations for
[window 
queries](https://www.postgresql.org/docs/devel/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS),
improvements in lookups for `RANGE` and `LIST` partitions, and support for
"anti-joins" in `RIGHT` and `OUTER` queries.

This release also introduces support for CPU acceleration using SIMD for both
x86 and ARM architectures, including optimizations for processing ASCII and JSON
strings, and array and substransaction searches. Additionally, PostgreSQL 16
introduces [load 
balancing](https://www.postgresql.org/docs/devel/libpq-connect.html#LIBPQ-CONNECT-LOAD-BALANCE-HOSTS)
to libpq, the client library for PostgreSQL.

### Developer Experience

PostgreSQL 16 continues to implement the 
[SQL/JSON](https://www.postgresql.org/docs/devel/functions-json.html)
standard for manipulating 
[JSON](https://www.postgresql.org/docs/devel/datatype-json.html)
data, including support for SQL/JSON constructors (e.g. `JSON_ARRAY()`,
`JSON_ARRAYAGG()` et al), and identity functions (`IS JSON`). This release also
adds the SQL standard 
[`ANY_VALUE`](https://www.postgresql.org/docs/devel/functions-aggregate.html#id-1.5.8.27.5.2.4.1.1.1.1)
aggregate function, which returns any arbitrary value from the aggregate set.
For convenience, PostgreSQL 16 now lets you specify non-decimal integer
literals, such as `0xff`, `0o777`, and `0b101010`, and use thousands separators,
such as `5_432`.

This release adds support for the extended query protocol to the 
[`psql`](https://www.postgresql.org/docs/devel/app-psql.html)
client. Users can execute a query, e.g. `SELECT $1 + $2`, and use the
[`\bi

Re: PostgreSQL 16 Beta 1 release announcement draft

2023-05-21 Thread Jonathan S. Katz

On 5/19/23 10:57 AM, Nathan Bossart wrote:

On Fri, May 19, 2023 at 12:17:50AM -0400, Jonathan S. Katz wrote:



PostgreSQL 16 continues to give users to the ability grant privileged access to
features without requiring superuser with new
[predefined roles](https://www.postgresql.org/docs/devel/predefined-roles.html).
These include `pg_maintain`, which enables execution of operations such as
`VACUUM`, `ANALYZE`, `REINDEX`, and others, and `pg_createsubscription`, which
allows users to create a logical replication subscription. Additionally,
starting with release, logical replication subscribers execute transactions on a
table as the table owner, not the superuser.


[pg_use_]reserved_connections might also deserve a mention here.  AFAICT
it's the only new predefined role that isn't mentioned in the announcement.
I'm okay with leaving it out if folks don't think it should make the cut.


I'm not sure how widely used this one would be, so I left it out. 
However, open to other opinions.


Thanks,

Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


Re: PostgreSQL 16 Beta 1 release announcement draft

2023-05-21 Thread Jonathan S. Katz

On 5/19/23 1:42 AM, Erik Rijkers wrote:

Op 5/19/23 om 06:17 schreef Jonathan S. Katz:

Hi,

Attached is a draft of the release announcement for PostgreSQL 16 Beta 


Hi,


The usual small fry:


'continues to to'  should be
'continues to'

'continues to give users to the ability'  should be
'continues to give users the ability to'

'pg_createsubscription'  should be
'pg_create_subscription'

'starting with release'  should be
'starting with this release'

'credentials to connected to other services'  should be
'credentials to connect to other services'


Thanks Erik. I made all of these changes and will upload them in the 
next review.


Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


PostgreSQL 16 Beta 1 release announcement draft

2023-05-18 Thread Jonathan S. Katz

Hi,

Attached is a draft of the release announcement for PostgreSQL 16 Beta 
1. The goal of this announcement is to get people excited about testing 
the beta and highlight many of the new features.


Please review for inaccuracies, omissions, and other suggestions / errors.

Please provide feedback no later than May 24, 0:00 AoE. This will give 
me enough time to incorporate the changes prior to the release the next day.


Thanks,

Jonathan
The PostgreSQL Global Development Group announces that the first beta release of
PostgreSQL 16 is now [available for 
download](https://www.postgresql.org/download/).
This release contains previews of all features that will be available when
PostgreSQL 16 is made generally available, though some details of the release
can change during the beta period.

You can find information about all of the features and changes found in
PostgreSQL 16 in the [release 
notes](https://www.postgresql.org/docs/16/release-16.html):

  
[https://www.postgresql.org/docs/16/release-16.html](https://www.postgresql.org/docs/16/release-16.html)

In the spirit of the open source PostgreSQL community, we strongly encourage you
to test the new features of PostgreSQL 16 on your systems to help us eliminate
bugs or other issues that may exist. While we do not advise you to run
PostgreSQL 16 Beta 1 in production environments, we encourage you to find ways
to run your typical application workloads against this beta release.

Your testing and feedback will help the community ensure that the PostgreSQL 16
release upholds our standards of delivering a stable, reliable release of the
world's most advanced open source relational database. Please read more about
our [beta testing process](https://www.postgresql.org/developer/beta/) and how
you can contribute:

  
[https://www.postgresql.org/developer/beta/](https://www.postgresql.org/developer/beta/)

PostgreSQL 16 Feature Highlights


### Logical replication enhancements

Logical replication lets PostgreSQL users stream data in real-time to other
PostgreSQL or other external systems that implement the logical protocol. Until
PostgreSQL 16, users could only create logical replication publishers on primary
instances. PostgreSQL 16 adds the ability to perform logical decoding on a
standby instance, given users more options to distribute their workload, for
example, use a standby that's less busy than a primary to logically replicate
changes.

PostgreSQL 16 also includes several performance improvements to logical
replication. This includes allowing the subscriber to apply large transactions
in parallel, use indexes other than the `PRIMARY KEY` to perform lookups during
`UPDATE` or `DELETE` operations, and allow for tables to be copied using binary
format during initialization.

### Performance

PostgreSQL 16 includes performance improvements in query execution. This release
adds more query parallelism, including allowing `FULL` and `OUTER` joins to
execute in parallel, and parallel execution of the `string_agg` and `array_agg`
aggregate functions. Additionally, PostgreSQL 16 can use incremental sorts in
`SELECT DISTINCT` queries. There are also several optimizations for
[window 
queries](https://www.postgresql.org/docs/devel/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS),
improvements in lookups for `RANGE` and `LIST` partitions, and support for
"anti-joins" in `RIGHT` and `OUTER` queries.

This release also introduces support for CPU acceleration using SIMD for both
x86 and ARM architectures, including optimizations for processing ASCII and JSON
strings, and array and substransaction searches. Additionally, PostgreSQL 16
introduces [load 
balancing](https://www.postgresql.org/docs/devel/libpq-connect.html#LIBPQ-CONNECT-LOAD-BALANCE-HOSTS)
to libpq, the client library for PostgreSQL.

### Developer Experience

PostgreSQL 16 continues to to implement the 
[SQL/JSON](https://www.postgresql.org/docs/devel/functions-json.html)
standard for manipulating 
[JSON](https://www.postgresql.org/docs/devel/datatype-json.html)
data, including support for SQL/JSON constructors (e.g. `JSON_ARRAY()`,
`JSON_ARRAYAGG()` et al), and identity functions (`IS JSON`). This release also
adds the SQL standard 
[`ANY_VALUE`](https://www.postgresql.org/docs/devel/functions-aggregate.html#id-1.5.8.27.5.2.4.1.1.1.1)
aggregate function, which returns any arbitrary value from the aggregate set.
For convenience, PostgreSQL 16 now lets you specify non-decimal integer
literals, such as `0xff`, `0o777`, and `0b101010`, and use thousands separators,
such as `5_432`.

This release adds support for the extended query protocol to the 
[`psql`](https://www.postgresql.org/docs/devel/app-psql.html)
client. Users can execute a query, e.g. `SELECT $1 + $2`, and use the
[`\bind`](https://www.postgresql.org/docs/devel/app-psql.html#APP-PSQL-META-COMMAND-BIND)
command to substitute the variables.

### Security features

PostgreSQL 16 continues to give users to the ability 

Re: PG 16 draft release notes ready

2023-05-18 Thread Jonathan S. Katz

On 5/18/23 4:49 PM, Bruce Momjian wrote:

I have completed the first draft of the PG 16 release notes.  You can
see the output here:

https://momjian.us/pgsql_docs/release-16.html

I will adjust it to the feedback I receive;  that URL will quickly show
all updates.


Still reading, but saw this:

  Allow incremental sorts in more cases, including DISTINCT (David 
Rowley)window


I didn't realize we had a DISTINCT (David Rowley) window, but it sounds 
like an awesome feature ;)


Thanks,

Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


Re: PG 16 draft release notes ready

2023-05-18 Thread Jonathan S. Katz

On 5/18/23 4:49 PM, Bruce Momjian wrote:

I have completed the first draft of the PG 16 release notes.  You can
see the output here:

https://momjian.us/pgsql_docs/release-16.html

I will adjust it to the feedback I receive;  that URL will quickly show
all updates.


Thanks for going through this. The release announcement draft will 
follow shortly after in a different thread.



I learned a few things creating it this time:

*  I can get confused over C function names and SQL function names in
commit messages.

*  The sections and ordering of the entries can greatly clarify the
items.

*  The feature count is slightly higher than recent releases:

release-10:  189
release-11:  170
release-12:  180
release-13:  178
release-14:  220
release-15:  184
-->  release-16:  200


This definitely feels like a very full release. Personally I'm very excited.

Thanks,

Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


Re: Order changes in PG16 since ICU introduction

2023-05-18 Thread Jonathan S. Katz

On 5/18/23 1:55 PM, Jeff Davis wrote:

On Wed, 2023-05-17 at 19:59 -0400, Jonathan S. Katz wrote:

I did a quicker read through this time. LGTM overall. I like what you
did with the explanations around sensitivity (now it makes sense).


Committed, thank you.


\o/


There are a few things I don't understand that would be good to
document better:

* Rules. I still don't quite understand the use case: are these for
people inventing new languages? What is a plausible use case that isn't
covered by the existing locales and collation settings? Do rules make
sense for a database default collation? Are they for language experts
only or might an ordinary developer benefit from using them?


From my read of them, as an app developer I'd be very unlikely to use 
this. Maybe there is something with building out some collation rules 
vis-a-vis an extension, but I have trouble imagining the use-case. I may 
also not be the target audience for this feature.



* The collation types "phonebk", "emoji", etc.: are these variants of
particular locales, or do they make sense in multiple locales? I don't
know where they fit in or how to document them.


I remember I had a exploratory use case for "phonebk" but I couldn't 
figure out how to get it to work. AIUI from random searching, the idea 
is that it provides the "phonebook" rules for ordering "names" in a 
particular locale, but I couldn't get it to work.



* I don't understand what "kc" means if "ks" is not set to "level1".


Me neither, but I haven't stared at this as hard as others.

Thanks,

Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


Re: psql: Add role's membership options to the \du+ command

2023-05-17 Thread Jonathan S. Katz

On 5/7/23 3:14 PM, Pavel Luzanov wrote:

On 05.05.2023 19:51, David G. Johnston wrote:
But if it is really a blocker then maybe we should produce 3 separate 
newline separated columns, one for the member of role, one for the 
list of attributes, and one with the grantor.  The column headers can 
be translated more easily as single nouns.  The readability quite 
probably would end up being equivalent (maybe even better) in tabular 
form instead of sentence form.


Just to visualize this approach. Below are the output for the tabular 
form and the sentence form from last patch version (sql script attached):


Tabular form rolname  | memberof |   options   
| grantor 
--+--+-+-- postgres |  | |  regress_du_admin | regress_du_role0+| admin, inherit, set+| postgres    +  | regress_du_role1+| admin, inherit, set+| postgres    +  | regress_du_role2 | admin, inherit, set | postgres regress_du_role0 |  | |  regress_du_role1 | regress_du_role0+| admin, inherit, set+| regress_du_admin+  | regress_du_role0+| inherit    +| regress_du_role1+  | regress_du_role0 | set | regress_du_role2 regress_du_role2 | regress_du_role0+| admin  +| regress_du_admin+  | regress_du_role0+| inherit, set   +| regress_du_role1+  | regress_du_role0+| empty  +| regress_du_role2+  | regress_du_role1 | admin, set  | regress_du_admin(5 rows)Sentence form from patch v7 rolname  |   memberof --+-- postgres |  regress_du_admin | regress_du_role0 from postgres (admin, inherit, set)    +  | regress_du_role1 from postgres (admin, inherit, set)    +  | regress_du_role2 from postgres (admin, inherit, set) regress_du_role0 |  regress_du_role1 | regress_du_role0 from regress_du_admin (admin, inherit, set)+  | regress_du_role0 from regress_du_role1 (inherit)    +  | regress_du_role0 from regress_du_role2 (set) regress_du_role2 | regress_du_role0 from regress_du_admin (admin)  +  | regress_du_role0 from regress_du_role1 (inherit, set)   +  | regress_du_role0 from regress_du_role2 (empty)  +  | regress_du_role1 from regress_du_admin (admin, set)(5 rows)  


The tabular form solves the latest patch translation problems mentioned by 
Kyotaro.
But it requires mapping elements between 3 array-like columns.

To move forward, needs more opinions?


[RMT Hat]

Nudging this along, as it's an open item. It'd be good to get this 
resolved before Beta 1, but that may be tough at this point.


[Personal hat]

I'm probably not the target user for this feature, so I'm not sure how 
much you should weigh my opinion (e.g. I still don't agree with 
explicitly showing "empty", but as mentioned, I'm not the target user).


That said, from a readability standpoint, it was easier for me to follow 
the tabular form vs. the sentence form.


Thanks,

Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


Re: Order changes in PG16 since ICU introduction

2023-05-17 Thread Jonathan S. Katz

On 5/17/23 6:59 PM, Jeff Davis wrote:

On Tue, 2023-05-16 at 20:23 -0700, Jeff Davis wrote:

Other than that, and I took your suggestions almost verbatim. Patch
attached. Thank you!


Attached new patch with a typo fix and a few other edits. I plan to
commit soon.


I did a quicker read through this time. LGTM overall. I like what you 
did with the explanations around sensitivity (now it makes sense).


Thanks,

Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


Re: Possible regression setting GUCs on \connect

2023-05-17 Thread Jonathan S. Katz

On 5/17/23 1:30 PM, Alexander Korotkov wrote:

Tom,

On Wed, May 17, 2023 at 3:08 PM Tom Lane  wrote:

Amit Kapila  writes:

Tom/Nathan, do you have any further suggestions here?


My recommendation is to revert this feature.  I do not see any
way that we won't regret it as a poor design.


I have carefully noted your concerns regarding the USER SET patch that
I've committed.  It's clear that you have strong convictions about
this, particularly in relation to your plan of storing the setter role
OID in pg_db_role_setting.

I want to take a moment to acknowledge the significance of your
perspective and I respect that you have a different view on this
matter.  Although I have not yet had the opportunity to see the
feasibility of your approach, I am open to understanding it further.

Anyway, I don't want to do anything counter-productive.  So, I've
taken the decision to revert the USER SET patch for the time being.


Thanks Alexander. I know reverting a feature is not easy and appreciate 
you taking the time to work through this discussion.



I'm looking forward to continuing working with you on this subject for v17.


+1; I think everyone agrees there is a feature here that will be helpful 
to our users.


Thanks,

Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


  1   2   3   4   5   6   7   >