Re: Logging which local address was connected to in log_line_prefix

2024-05-24 Thread Greg Sabino Mullane
Peter, thank you for the feedback. Attached is a new patch with "address"
rather than "interface", plus a new default of "local" if there is no
address. I also removed the questionable comment, and updated the
commitfest title.

Cheers,
Greg
From bfa69fc2fffcb29dee0c6acfa4fc3749f987b272 Mon Sep 17 00:00:00 2001
From: Greg Sabino Mullane 
Date: Fri, 24 May 2024 11:25:48 -0400
Subject: [PATCH] Add local address to log_line_prefix

---
 doc/src/sgml/config.sgml  |  5 
 src/backend/utils/error/elog.c| 25 +++
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 3 files changed, 31 insertions(+)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 698169afdb..d0b5e4d9ea 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7470,6 +7470,11 @@ local0.*/var/log/postgresql
  Remote host name or IP address
  yes
 
+
+ %L
+ Local address
+ yes
+
 
  %b
  Backend type
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index d91a85cb2d..b1525d901c 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -67,6 +67,7 @@
 #endif
 
 #include "access/xact.h"
+#include "common/ip.h"
 #include "libpq/libpq.h"
 #include "libpq/pqformat.h"
 #include "mb/pg_wchar.h"
@@ -79,6 +80,7 @@
 #include "storage/ipc.h"
 #include "storage/proc.h"
 #include "tcop/tcopprot.h"
+#include "utils/builtins.h"
 #include "utils/guc_hooks.h"
 #include "utils/memutils.h"
 #include "utils/ps_status.h"
@@ -3023,6 +3025,29 @@ log_status_format(StringInfo buf, const char *format, ErrorData *edata)
 	appendStringInfoSpaces(buf,
 		   padding > 0 ? padding : -padding);
 break;
+			case 'L':
+if (MyProcPort
+	&& (MyProcPort->laddr.addr.ss_family == AF_INET
+		||
+		MyProcPort->laddr.addr.ss_family == AF_INET6)
+	)
+{
+	Port *port = MyProcPort;
+	char local_host[NI_MAXHOST];
+
+	local_host[0] = '\0';
+
+	if (0 == pg_getnameinfo_all(>laddr.addr, port->laddr.salen,
+			   local_host, sizeof(local_host),
+			   NULL, 0,
+			   NI_NUMERICHOST | NI_NUMERICSERV)
+		)
+		appendStringInfo(buf, "%s", local_host);
+}
+else
+	appendStringInfo(buf, "[local]");
+
+break;
 			case 'r':
 if (MyProcPort && MyProcPort->remote_host)
 {
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 83d5df8e46..85a9c59116 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -587,6 +587,7 @@
 	#   %d = database name
 	#   %r = remote host and port
 	#   %h = remote host
+#   %L = local address
 	#   %b = backend type
 	#   %p = process ID
 	#   %P = process ID of parallel group leader
-- 
2.30.2



Re: commitfest.postgresql.org is no longer fit for purpose

2024-05-17 Thread Greg Sabino Mullane
On Fri, May 17, 2024 at 7:11 AM Tomas Vondra 
wrote:

> It does seem to me a part of the solution needs to be helping to get
> those patches reviewed. I don't know how to do that, but perhaps there's
> a way to encourage people to review more stuff, or review stuff from a
> wider range of contributors. Say by treating reviews more like proper
> contributions.


This is a huge problem. I've been in the situation before where I had some
cycles to do a review, but actually finding one to review is
super-difficult. You simply cannot tell without clicking on the link and
wading through the email thread. Granted, it's easy as an
occasional reviewer to simply disregard potential patches if the email
thread is over a certain size, but it's still a lot of work. Having some
sort of summary/status field would be great, even if not everything was
labelled. It would also be nice if simpler patches were NOT picked up by
experienced hackers, as we want to encourage new/inexperienced people, and
having some "easy to review" patches available will help people gain
confidence and grow.


> Long time ago there was a "rule" that people submitting patches are
> expected to do reviews. Perhaps we should be more strict this.
>

Big -1. How would we even be more strict about this? Public shaming?
Withholding a commit?

Cheers,
Greg


Re: Logging which interface was connected to in log_line_prefix

2024-05-01 Thread Greg Sabino Mullane
Thank you for taking the time to review this. I've attached a new rebased
version, which has no significant changes.


> There is a comment in the patch that states:
>
> /* We do not need clean_ipv6_addr here: just report verbatim */
>
> I am not quite sure what it means, but I am guessing it means that the
> patch does not need to format the IPv6 addresses in any specific way.


Yes, basically correct. There is a kluge (their word, not mine) in
utils/adt/network.c to strip the zone - see the comment for the
clean_ipv6_addr() function in that file. I added the patch comment in case
some future person wonders why we don't "clean up" the ipv6 address, like
other places in the code base do. We don't need to pass it back to anything
else, so we can simply output the correct version, zone and all.

Cheers,
Greg


add_local_interface_to_log_line_prefix.v2.patch
Description: Binary data


Re: Tarball builds in the new world order

2024-04-24 Thread Greg Sabino Mullane
On Tue, Apr 23, 2024 at 6:06 PM Tom Lane  wrote:

> This change seems like a good thing anyway for anyone who's tempted
> to use "make dist" manually, since they wouldn't necessarily want
> to package HEAD either.  Now, if we just do it exactly like that
> then trying to "make dist" without setting PG_COMMIT_HASH will
> fail, since "git archive" has no default for its 
> argument.  I can't quite decide if that's a good thing, or if we
> should hack the makefile a little further to allow PG_COMMIT_HASH
> to default to HEAD.
>

Just having it fail seems harsh. What if we had plain "make dist" at least
output a friendly hint about "please specify a hash"? That seems better
than an implicit HEAD default, as they can manually set it to HEAD
themselves per the hint.

Cheers,
Greg


Re: psql: Greatly speed up "\d tablename" when not using regexes

2024-04-10 Thread Greg Sabino Mullane
Patch looks good to me. Great idea overall, that forced regex has always
bugged me.

+   char   *regexChars = "|*+?()[]{}.^$\\";


One super minor optimization is that we technically do not need to scan for
')' and ']'. If they appear without their partner, the query will fail
anyway. :)
Cheers,
Greg


Re: Security lessons from liblzma

2024-04-04 Thread Greg Sabino Mullane
>
> It would be better if we created the required test files as part of the
> test run.  (Why not?  Too slow?)  Alternatively, I have been thinking
> that maybe we could make the output more reproducible by messing with
> whatever random seed OpenSSL uses.  Or maybe use a Python library to
> create the files.  Some things to think about.
>

I think this last idea is the way to go. I've hand-crafted GIF images and
PGP messages in the past; surely we have enough combined brain power around
here to craft our own SSL files? It may even be a wheel that someone has
invented already.

Cheers,
Greg


Re: Reports on obsolete Postgres versions

2024-04-04 Thread Greg Sabino Mullane
On Thu, Apr 4, 2024 at 2:23 PM Bruce Momjian  wrote:

> -end-of-life (EOL) and no longer supported.
> +after its initial release. After this, a final minor version will be
> released
> +and the software will then be unsupported (end-of-life).


Would be a shame to lose the EOL acronym.

+Such upgrades might require manual changes to complete so always read
> +the release notes first.


Proposal:
"Such upgrades might require additional steps, so always read the release
notes first."

I went with frequently-encountered and low risk bugs".
>

But neither of those classifications are really true. Especially the "low
risk" part - I could see various ways a reader could wrongly interpret that.

Cheers,
Greg


Re: On disable_cost

2024-04-03 Thread Greg Sabino Mullane
On Wed, Apr 3, 2024 at 3:21 PM Robert Haas  wrote:

> It's also pretty clear to me that the fact that enable_indexscan
> and enable_indexonlyscan work completely differently from each other
> is surprising at best, wrong at worst, but here again, what this patch
> does about that is not above reproach.


Yes, that is wrong, surely there is a reason we have two vars. Thanks for
digging into this: if nothing else, the code will be better for this
discussion, even if we do nothing for now with disable_cost.

Cheers,
Greg


Re: On disable_cost

2024-04-02 Thread Greg Sabino Mullane
On Mon, Apr 1, 2024 at 7:54 PM Robert Haas  wrote:

> What I think we're mostly doing in the regression tests is shutting
> off every relevant type of plan except one. I theorize that what we
> actually want to do is tell the planner what we do want to happen,
> rather than what we don't want to happen, but we've got this weird set
> of GUCs that do the opposite of that and we're super-attached to them
> because they've existed forever.


So rather than listing all the things we don't want to happen, we need a
way to force (nay, highly encourage) a particular solution. As our costing
is a based on positive numbers, what if we did something like this in
costsize.c?

 Costdisable_cost = 1.0e10;
 Costpromotion_cost = 1.0e10; // or higher or lower, depending on
how strongly we want to "beat" disable_costs effects.
...

if (!enable_seqscan)
startup_cost += disable_cost;
else if (promote_seqscan)
startup_cost -= promotion_cost; // or replace "promote" with
"encourage"?


Cheers,
Greg


Re: Possibility to disable `ALTER SYSTEM`

2024-03-27 Thread Greg Sabino Mullane
>
> The purpose of the setting is to prevent accidental
> modifications via ALTER SYSTEM in environments where


The emphasis on 'accidental' seems a bit heavy here, and odd. Surely, just
"to prevent modifications via ALTER SYSTEM in environments where..." is
enough?

Cheers,
Greg


Re: Adding comments to help understand psql hidden queries

2024-03-22 Thread Greg Sabino Mullane
On Fri, Mar 22, 2024 at 11:39 AM David Christensen 
wrote:

> I think it's easier to keep the widths balanced than constant (patch
> version included here)


Yeah, I'm fine with that, especially because nobody is translating it, nor
are they likely to, to be honest.

Cheers,
Greg


Re: Adding comments to help understand psql hidden queries

2024-03-22 Thread Greg Sabino Mullane
On Thu, Mar 21, 2024 at 6:20 PM Peter Eisentraut 
wrote:

> lines are supposed to align vertically.  With your patch, the first line
> would have variable length depending on the command.
>

Yes, that is a good point. Aligning those would be quite tricky, what if we
just kept a standard width for the closing query? Probably the 24 stars we
currently have to match "QUERY", which it appears nobody has changed for
translation purposes yet anyway. (If I am reading the code correctly, it
would be up to the translators to maintain the vertical alignment).

Cheers,
Greg


Re: Avoiding inadvertent debugging mode for pgbench

2024-03-20 Thread Greg Sabino Mullane
My mistake. Attached please find version 3, which should hopefully make
cfbot happy again.


pgbench.dash.d.or.not.dash.d.v3.patch
Description: Binary data


Re: Possibility to disable `ALTER SYSTEM`

2024-03-20 Thread Greg Sabino Mullane
>
> As a bonus, if that GUC is set, we could even check at server startup that
> all the configuration files are not writable by the postgres user,
> and print a warning or refuse to start up if they are.
>

Ugh, please let's not do this. This was bouncing around in my head last
night, and this is really a quite radical change - especially just to
handle the given ask, which is to prevent a specific command from running.
Not implement a brand new security system. There are so many ways this
could go wrong if we start having separate permissions for some of our
files. In addition to backups and other tools that need to write to the
conf files as the postgres user, what about systems that create a new
cluster automatically e.g. Patroni? It will now need elevated privs just to
create the conf files and assign the new ownership to them. Lots of moving
pieces there and ways things could go wrong. So a big -1 from me, as they
say/ :)

Cheers,
Greg


Re: Avoiding inadvertent debugging mode for pgbench

2024-03-19 Thread Greg Sabino Mullane
Rebased version attached (v2), with another sentence in the sgml to explain
the optional use of -d

Cheers,
Greg


pgbench.dash.d.or.not.dash.d.v2.patch
Description: Binary data


Re: Possibility to disable `ALTER SYSTEM`

2024-03-19 Thread Greg Sabino Mullane
On Tue, Mar 19, 2024 at 12:05 PM Tom Lane  wrote:

> If you aren't willing to build a solution that blocks off mods
> using COPY TO FILE/PROGRAM and other readily-available-to-superusers
> tools (plpythonu for instance), I think you shouldn't bother asking
> for a feature at all.  Just trust your superusers.
>

There is a huge gap between using a well-documented standard tool like
ALTER SYSTEM and going out of your way to modify the configuration files
through trickery. I think we need to only solve the former as in "hey,
please don't do that because your changes will be overwritten"

Cheers,
Greg


Re: Possibility to disable `ALTER SYSTEM`

2024-03-18 Thread Greg Sabino Mullane
Going to agree with Robert Treat here about an extension being a great
solution. I resisted posting earlier as I wanted to see how this all pans
out, but I wrote a quick little POC extension some months ago that does
the disabling and works well (and cannot be easily worked around).

On Mon, Mar 18, 2024 at 4:59 PM Robert Haas  wrote:

> I think that all of this is true except for (c). I think we'd need a
> new hook to make it work.
>

Seems we can just use ProcessUtility and:
if (IsA(parsetree, AlterSystemStmt) { ereport(ERROR, ...

When we know that a feature is
> widely-needed, it's better to have one good implementation of it in
> core than several perhaps not-so-good implementations out of core.
>

Meh, maybe. This one seems pretty dirt simple. Granted, I have expanded my
original POC to allow *some* things to be changed by ALTER SYSTEM, but the
original use case warrants a very small extension.

That allows us to focus all of our efforts on that one implementation
> instead of splitting them across several -- which is the whole selling
> point of open source, really -- and it makes it easier for users who
> want the feature to get access to it.
>

Well, yeah, but they have to wait until version 18 at best, while an
extension can run on any current version and probably be pretty
future-proof as well.

Cheers,
Greg


Re: Reports on obsolete Postgres versions

2024-03-11 Thread Greg Sabino Mullane
On Mon, Mar 11, 2024 at 4:38 PM Bruce Momjian  wrote:

> https://www.postgresql.org/support/versioning/
>
> This web page should correct the idea that "upgrades are more risky than
> staying with existing versions".  Is there more we can do?  Should we have
> a more consistent response for such reporters?
>

It could be helpful to remove this sentence:

"Upgrading to a minor release does not normally require a dump and restore"

While technically true, "not normally" is quite the understatement, as the
true answer is "never" or at least "not in the last few decades". I have a
hard time even imagining a scenario that would require a minor revision to
do a dump and restore - surely, that in itself would warrant a major
release?


> It would be a crazy idea to report something in the logs if a major
> version is run after a certain date, since we know the date when major
> versions will become unsupported.
>

Could indeed be useful to spit something out at startup. Heck, even minor
versions are fairly regular now. Sure would be nice to be able to point a
client at the database and say "See? Even Postgres itself thinks you should
upgrade from 11.3!!" (totally made up example, not at all related to an
actual production system /sarcasm)

Cheers,
Greg


Logging which interface was connected to in log_line_prefix

2024-03-06 Thread Greg Sabino Mullane
Someone on -general was asking about this, as they are listening on
multiple IPs and would like to know which exact one clients were hitting. I
took a quick look and we already have that information, so I grabbed some
stuff from inet_server_addr and added it as part of a "%L" (for 'local
interface'). Quick patch / POC attached.

Cheers,
Greg


add_local_interface_to_log_line_prefix.v1.patch
Description: Binary data


Re: Reducing the log spam

2024-03-06 Thread Greg Sabino Mullane
On Tue, Mar 5, 2024 at 7:55 AM Laurenz Albe 
wrote:

> My experience from the field is that a lot of log spam looks like
>
>   database/table/... "xy" does not exist
>   duplicate key value violates unique constraint "xy"


Forcibly hiding those at the Postgres level seems a heavy hammer for what
is ultimately an application problem.

Tell me about a system that logs different classes of errors to different
log files, and I'm interested again.

Cheers,
Greg


Avoiding inadvertent debugging mode for pgbench

2024-02-29 Thread Greg Sabino Mullane
Attached please find a patch to adjust the behavior of the pgbench program
and make it behave like the other programs that connect to a database
(namely, psql and pg_dump). Specifically, add support for using -d and
--dbname to specify the name of the database. This means that -d can no
longer be used to turn on debugging mode, and the long option --debug must
be used instead.

This removes a long-standing footgun, in which people assume that the -d
option behaves the same as other programs. Indeed, because it takes no
arguments, and because the first non-option argument is the database name,
it still appears to work. However, people then wonder why pgbench is so
darn verbose all the time! :)

This is a breaking change, but fixing it this way seems to have the least
total impact, as the number of people using the debug mode of pgbench is
likely quite small. Further, those already using the long option are
unaffected, and those using the short one simply need to replace '-d' with
'--debug', arguably making their scripts a little more self-documenting in
the process.

Cheers,
Greg


pgbench.dash.d.or.not.dash.d.v1.patch
Description: Binary data


Re: An improved README experience for PostgreSQL

2024-02-28 Thread Greg Sabino Mullane
+1 on the general idea. Maybe make that COPYRIGHT link go to an absolute
URI, like all the other links, in case this file gets copied somewhere?
Perhaps point it to https://www.postgresql.org/about/licence/
Cheers,
Greg


Re: What about Perl autodie?

2024-02-08 Thread Greg Sabino Mullane
>
> 2. Don't wait, migrate them all now.  This would mean requiring
> Perl 5.10.1 or later to run the TAP tests, even in back branches.
>

#2 please. For context, meson did not even exist in 2009.

Cheers,
Greg


Re: What about Perl autodie?

2024-02-07 Thread Greg Sabino Mullane
On Wed, Feb 7, 2024 at 9:05 AM Peter Eisentraut 
wrote:

> I came across the Perl autodie pragma
> (https://perldoc.perl.org/autodie).  This seems pretty useful; is this
> something we can use?  Any drawbacks?  Any minimum Perl version?


Big +1

No drawbacks. I've been using it heavily for many, many years. Came out in
5.10.1,
which should be available everywhere at this point (2009 was the year of
release)

Cheers,
Greg


Adding comments to help understand psql hidden queries

2023-12-11 Thread Greg Sabino Mullane
The use of the --echo-hidden flag in psql is used to show people the way
psql performs its magic for its backslash commands. None of them has more
magic than "\d relation", but it suffers from needing a lot of separate
queries to gather all of the information it needs. Unfortunately, those
queries can get overwhelming and hard to figure out which one does what,
especially for those not already very familiar with the system catalogs.
Attached is a patch to add a small SQL comment to the top of each SELECT
query inside describeOneTableDetail. All other functions use a single
query, and thus need no additional context. But "\d mytable" has the
potential to run over a dozen SQL queries! The new format looks like this:

/ QUERY */
/* Get information about row-level policies */
SELECT pol.polname, pol.polpermissive,
  CASE WHEN pol.polroles = '{0}' THEN NULL ELSE
pg_catalog.array_to_string(array(select rolname from pg_catalog.pg_roles
where oid = any (pol.polroles) order by 1),',') END,
  pg_catalog.pg_get_expr(pol.polqual, pol.polrelid),
  pg_catalog.pg_get_expr(pol.polwithcheck, pol.polrelid),
  CASE pol.polcmd
WHEN 'r' THEN 'SELECT'
WHEN 'a' THEN 'INSERT'
WHEN 'w' THEN 'UPDATE'
WHEN 'd' THEN 'DELETE'
END AS cmd
FROM pg_catalog.pg_policy pol
WHERE pol.polrelid = '134384' ORDER BY 1;
//

Cheers,
Greg


psql.echo.hidden.comments.v1.patch
Description: Binary data


Adding ordering to list of available extensions

2023-10-30 Thread Greg Sabino Mullane
Please find attached a patch to provide some basic ordering to the system
views pg_available_extensions and pg_available_extension_versions. It is
sorely tempting to add ORDER BYs to many of the other views in that file,
but I understand that would be contentious as there are reasons for not
adding an ORDER BY. However, in the case of pg_available_extensions, it's a
very, very small resultset, with an obvious default ordering, and extremely
unlikely to be a part of a larger complex query. It's much more likely
people like myself are just doing a "SELECT * FROM pg_available_extensions"
and then get annoyed at the random ordering.

Cheers,
Greg


show_extensions_in_natural_order.pg.patch
Description: Binary data


Re: [PATCH] Add inline comments to the pg_hba_file_rules view

2023-09-26 Thread Greg Sabino Mullane
Also a reluctant -1, as the comment-at-EOL style is very rare in my
experience over the years of seeing many a pg_hba file.


Re: Make --help output fit within 80 columns per line

2023-09-18 Thread Greg Sabino Mullane
On Fri, Sep 15, 2023 at 11:11 AM torikoshia 
wrote:

> I do not intend to adhere to this rule(my terminals are usually bigger
> than 80 chars per line), but wouldn't it be a not bad direction to use
> 80 characters for all commands?
>

Well, that's the question du jour, isn't it? The 80 character limit is
based on punch cards, and really has no place in modern systems. While gnu
systems are stuck in the past, many other ones have moved on to more
sensible defaults:

$ wget --help | wc -L
110

$ gcloud --help | wc -L
122

$ yum --help | wc -L
122

git is an interesting one, as they force things through a pager for their
help, but if you look at their raw help text files, they have plenty of
times they go past 80 when needed:

$ wc -L git/Documentation/git-*.txt | sort -g | tail -20
109 git-filter-branch.txt
109 git-rebase.txt
116 git-diff-index.txt
116 git-http-fetch.txt
117 git-restore.txt
122 git-checkout.txt
122 git-ls-tree.txt
129 git-init-db.txt
131 git-push.txt
132 git-update-ref.txt
142 git-maintenance.txt
144 git-interpret-trailers.txt
146 git-cat-file.txt
148 git-repack.txt
161 git-config.txt
162 git-notes.txt
205 git-stash.txt
251 git-submodule.txt

So in summary, I think 80 is a decent soft limit, but let's not stress out
about some lines going over that, and make a hard limit of perhaps 120.

See also: https://hilton.org.uk/blog/source-code-line-length

Cheers,
Greg

P.S. I know this won't change anything right away, but it will get the
conversation started, so we can escape the inertia of punch cards / VT100
terminals someday. :)


Re: Make --help output fit within 80 columns per line

2023-09-13 Thread Greg Sabino Mullane
On Tue, Jul 4, 2023 at 9:47 PM torikoshia 
wrote:

> Since it seems preferable to have consistent line break policy and some
> people use 80-column terminal, wouldn't it be better to make all commands
> in 80
> columns per line?
>

All this seems an awful lot of work to support this mythical 80-column
terminal user. It's 2023, perhaps it's time to widen the default assumption
past 80 characters?

Cheers,
Greg


Re: Possibility to disable `ALTER SYSTEM`

2023-09-13 Thread Greg Sabino Mullane
Seems to be some resistance to getting this in core, so why not just use an
extension? I was able to create a quick POC to do just that. Hook into PG
and look for AlterSystemStmt, throw a "Sorry, ALTER SYSTEM is not currently
allowed" error. Put into shared_preload_libraries and you're done. As a
bonus, works on all supported versions, so no need to wait for Postgres 17
- or Postgres 18/19 given the feature drift this thread is experiencing :)

Cheers,
Greg


Re: Prevent psql \watch from running queries that return no rows

2023-08-22 Thread Greg Sabino Mullane
Thank you for the feedback, everyone. Attached is version 4 of the patch,
featuring a few tests and minor rewordings.

Cheers,
Greg


psql_watch_exit_on_zero_rows_v4.patch
Description: Binary data


Re: Improve pg_stat_statements by making jumble handle savepoint names better

2023-07-25 Thread Greg Sabino Mullane
On Mon, Jul 24, 2023 at 6:46 PM Michael Paquier  wrote:

> Shouldn't this new field be marked as query_jumble_location
>

Yes, it should. I had some trouble getting it to work that way in the first
place, but now I realize it was just my unfamiliarity with this part of the
code. So thanks for the hint: v2 of the patch is much simplified by adding
two attributes to theTransactionStmt node. I've also added some tests per
your suggestion.

Unrelated to this patch, I'm struggling with meson testing. Why doesn't
this update the postgres test binary?:

meson test --suite pg_stat_statements

It runs "ninja" as expected, but it does not put a new
build/tmp_install/home/greg/pg/17/bin/postgres in place until I do a "meson
test"

Cheers,
Greg


savepoint_jumble.v2.patch
Description: Binary data


Improve pg_stat_statements by making jumble handle savepoint names better

2023-07-24 Thread Greg Sabino Mullane
Please find attached a patch to jumble savepoint name, to prevent certain
transactional commands from filling up pg_stat_statements.
This has been a problem with some busy systems that use django, which likes
to wrap everything in uniquely named savepoints. Soon, over 50% of
your pg_stat_statements buffer is filled with savepoint stuff, pushing out
the more useful queries. As each query is unique, it looks like this:

postgres=# select calls, query, queryid from pg_stat_statements where query
~ 'save|release|rollback' order by 2;

 calls |query |   queryid
---+--+--
 1 | release b900150983cd24fb0d6963f7d28e17f7 |  8797482500264589878
 1 | release ed9407630eb1000c0f6b63842defa7de | -9206510099095862114
 1 | rollback | -2049453941623996126
 1 | rollback to c900150983cd24fb0d6963f7d28e17f7 | -5335832667999552746
 1 | savepoint b900150983cd24fb0d6963f7d28e17f7   | -117254996647181
 1 | savepoint c47bce5c74f589f4867dbd57e9ca9f80   |   355123032993044571
 1 | savepoint c900150983cd24fb0d6963f7d28e17f7   | -5921314469994822125
 1 | savepoint d8f8e0260c64418510cefb2b06eee5cd   |  -981090856656063578
 1 | savepoint ed9407630eb1000c0f6b63842defa7de   |   -25952890433218603

As the actual name of the savepoint is not particularly useful, the patch
will basically ignore the savepoint name and allow things to be collapsed:

 calls | query  |   queryid
---++--
 2 | release $1 | -7998168840889089775
 1 | rollback   |  3749380189022910195
 1 | rollback to $1 | -1816677871228308673
 5 | savepoint $1   |  6160699978368237767

Without the patch, the only solution is to keep raising
pg_stat_statements.max to larger and larger values to compensate for the
pollution of the
statement pool.

Cheers,
Greg


savepoint_jumble.v1.patch
Description: Binary data


Re: Forgive trailing semicolons inside of config files

2023-07-11 Thread Greg Sabino Mullane
On Tue, Jul 11, 2023 at 11:04 AM Isaac Morland 
wrote:

> Please, no!
>
> There is no end to accepting sloppy syntax. What next, allow "SET
> random_page_cost = 2.5;" (with or without semicolon) in config files?
>

Well yes, there is an end. A single, trailing semicolon. Full stop. It's
not a slippery slope in which we end up asking the AI parser to interpret
our haikus to derive the actual value. The postgresql.conf file is not some
finicky YAML/JSON beast - we already support some looseness in quoting or
not quoting values, optional whitespace, etc. Think of the trailing
semicolon as whitespace, if you like. You can see from the patch that this
does not replace EOL/EOF.


> I'd be more interested in improvements in visibility of errors. For
> example, maybe if I try to start the server and there is a config file
> problem, I could somehow get a straightforward error message right in the
> terminal window complaining about the line of the configuration which is
> wrong.
>

That ship has long since sailed. We already send a detailed error message
with the line number, but in today's world of "service start", "systemctl
start", and higher level of control such as Patroni and Kubernetes, getting
things to show in a terminal window isn't happening. We can't work around
2>&1.


> Or maybe there could be a "check configuration" subcommand which checks
> the configuration.
>

There are things inside of Postgres once it has started, but yeah,
something akin to visudo would be nice for editing config files.


> But I think it would be way more useful than a potentially never-ending
> series of patches to liberalize the config parser.
>

It's a single semicolon, not a sign of the parser apocalypse. I've no plans
for future enhancements, but if they do no harm and make Postgres more user
friendly, I will support them.

Cheers,
Greg


Forgive trailing semicolons inside of config files

2023-07-11 Thread Greg Sabino Mullane
This has been a long-standing annoyance of mine. Who hasn't done something
like this?:

psql> SET random_page_cost = 2.5;
(do some stuff, realize that rpc was too high)

Let's put that inside of postgresql.conf:

#--
# CUSTOMIZED OPTIONS
#--

# Add settings for extensions here

random_page_cost = 2.5;


Boom! Server will not start. Surely, we can be a little more liberal in
what we accept? Attached patch allows a single trailing semicolon to be
silently discarded. As this parsing happens before the logging collector
starts up, the error about the semicolon is often buried somewhere in a
separate logfile or journald - so let's just allow postgres to start up
since there is no ambiguity about what random_page_cost (or any other GUC)
is meant to be set to.

I also considered doing an additional ereport(LOG) when we find one, but
seemed better on reflection to simply ignore it.

Cheers,
Greg


postgresql.conf_allow_trailing_semicolon.v1.patch
Description: Binary data


Re: Prevent psql \watch from running queries that return no rows

2023-07-05 Thread Greg Sabino Mullane
Thanks for the feedback!

On Wed, Jul 5, 2023 at 5:51 AM Daniel Gustafsson  wrote:

>
> The comment on ExecQueryAndProcessResults() needs to be updated with an
> explanation of what this parameter is.
>

I added a comment in the place where min_rows is used, but not sure what
you mean by adding it to the main comment at the top of the function?  None
of the other args are explained there, even the non-intuitive ones  (e.g.
svpt_gone_p)

-   return cancel_pressed ? 0 : success ? 1 : -1;
> +   return (cancel_pressed || return_early) ? 0 : success ? 1 : -1;
>
> I think this is getting tangled up enough that it should be replaced with
> separate if() statements for the various cases.
>

Would like to hear others weigh in, I think it's still only three states
plus a default, so I'm not convinced it warrants multiple statements yet. :)

+   HELP0("  \\watch [[i=]SEC] [c=N] [m=ROW]\n");
> +   HELP0("  execute query every SEC seconds,
> up to N times\n");
> +   HELP0("  stop if less than ROW minimum
> rows are rerturned\n");
>
> "less than ROW minimum rows" reads a bit awkward IMO, how about calling it
> [m=MIN] and describe as "stop if less than MIN rows are returned"?  Also,
> there
> is a typo: s/rerturned/returned/.
>

Great idea: changed and will attach a new patch

Cheers,
Greg


psql_watch_exit_on_zero_rows_v3.patch
Description: Binary data


Re: Bytea PL/Perl transform

2023-06-22 Thread Greg Sabino Mullane
>
> So I decided to propose a simple transform extension to pass bytea as
> native Perl octet strings.


Quick review, mostly housekeeping things:

* Needs a rebase, minor failure on Mkvcbuild.pm
* Code needs standardized formatting, esp. bytea_plperl.c
* Needs to be meson-i-fied (i.e. add a "meson.build" file)
* Do all of these transforms need to be their own contrib modules? So much
duplicated code across contrib/*_plperl already (and *plpython too for that
matter) ...

Cheers,
Greg


Re: Bypassing shared_buffers

2023-06-17 Thread Greg Sabino Mullane
On Thu, Jun 15, 2023 at 4:16 AM Vladimir Churyukin 
wrote:

> We're trying to see what is the worst performance in terms of I/O, i.e.
>> when the database just started up or the data/indexes being queried are not
>> cached at all.
>
>
You could create new tables that are copies of the existing ones (CREATE
TABLE foo as SELECT * FROM ...), create new indexes, and run a query on
those. Use schemas and search_path to keep the queries the same. No restart
needed! (just potentially lots of I/O, time, and disk space :) Don't forget
to do explain (analyze, buffers) to double check things.


Re: Let's make PostgreSQL multi-threaded

2023-06-08 Thread Greg Sabino Mullane
On Thu, Jun 8, 2023 at 8:44 AM Hannu Krosing  wrote:

> Do we have any statistics for the distribution of our user base ?
>
> My gut feeling says that for performance-critical use the non-Linux is
> in low single digits at best.
>

Stats are probably not possible, but based on years of consulting, as well
as watching places like SO, Slack, IRC, etc. over the years, IMO that's a
very accurate gut feeling. I'd hazard 1% or less for non-Linux systems.

Cheers,
Greg


Re: Prevent psql \watch from running queries that return no rows

2023-06-04 Thread Greg Sabino Mullane
On Sat, Jun 3, 2023 at 5:58 PM Michael Paquier  wrote:

>
> Wouldn't something like a target_rows be more flexible?  You could use
> this parameter with a target number of rows to expect, zero being one
> choice in that.
>

Thank you! That does feel better to me. Please see attached a new v2 patch
that uses
a min_rows=X syntax (defaults to 0). Also added some help.c changes.

Cheers,
Greg


psql_watch_exit_on_zero_rows_v2.patch
Description: Binary data


Prevent psql \watch from running queries that return no rows

2023-06-02 Thread Greg Sabino Mullane
Attached is a patch to allow a new behavior for the \watch command in psql.
When enabled, this instructs \watch to stop running once the query returns
zero rows. The use case is the scenario in which you are watching the
output of something long-running such as pg_stat_progress_create_index, but
once it finishes you don't need thousands of runs showing empty rows from
the view.

This adds a new argument "zero" to the existing i=SEC and c=N arguments

Notes:

* Not completely convinced of the name "zero" (better than
"stop_when_no_rows_returned"). Considered adding a new x=y argument, or
overloading c (c=-1) but neither seemed very intuitive. On the other hand,
it's tempting to stick to a single method moving forward, although this is
a boolean option not a x=y one like the other two.

* Did not update help.c on purpose - no need to make \watch span two lines
there.

* Considered leaving early (e.g. don't display the last empty result) but
seemed better to show the final empty result as an explicit confirmation as
to why it stopped.

* Quick way to test:
select * from pg_stat_activity where backend_start > now() - '20
seconds'::interval;
\watch zero

Cheers,
Greg


psql_watch_exit_on_zero_rows_v1.patch
Description: Binary data


Re: PostgreSQL 14 press release draft

2021-09-22 Thread Greg Sabino Mullane
Some super quick nitpicks; feel free to ignore/apply/laugh off.

...
administrators to deploy their data-backed applications. PostgreSQL
continues to
add innovations on complex data types, including more conveniences for
accessing
JSON and support for noncontiguous ranges of data. This latest release also
adds
to PostgreSQL's trend on improvements for high performance and distributed
data workloads, with advances in support for connection concurrency,
high-write
workloads, query parallelism and logical replication.

>> add innovations on complex data types -> add innovations to? "on" sounds
odd

>> comma after "query parallelism" please

now works. This aligns PostgreSQL with commonly recognized syntax for
retrieving information from JSON data. The subscripting framework added to
PostgreSQL 14 can be generally extended to other nested data structures,
and is
also applied to the `hstore` data type in this release.

>> with commonly recognized syntax -> with the commonly recognized syntax

>> hyperlink hstore?


[Range types](https://www.postgresql.org/docs/14/rangetypes.html), also
first
released in PostgreSQL 9.2, now have support for noncontiguous ranges
through
the introduction of the "[multirange](
https://www.postgresql.org/docs/14/rangetypes.html#RANGETYPES-BUILTIN)".
A multirange is an ordered list of ranges that are nonoverlapping, which
allows
for developers to write simpler queries for dealing with complex sequences
of

>> introduction of the multirange -> introduction of the multirange type

>> which allows for developers to write -> which lets developers write


on the recent improvements to the overall management of B-tree indexes by

>> to the overall management -> to the management

operations. As this is a client-side feature, you can use pipeline mode
with any
modern PostgreSQL database so long as you use the version 14 client.

>> more complicated than "version 14 client", more like - if the
application has explicit
>> support for it and was compiled via libpq against PG 14.
>> Just don't want to overpromise here

[Foreign data wrappers](
https://www.postgresql.org/docs/14/sql-createforeigndatawrapper.html),
which are used for working with federated workloads across PostgreSQL and
other

>> which are used for -> used for

In addition to supporting query parallelism, `postgres_fdw` can now also
bulk
insert data on foreign tables and import table partitions with the
[`IMPORT FOREIGN SCHEMA`](
https://www.postgresql.org/docs/14/sql-importforeignschema.html)
directive.

>> can now also bulk insert -> can now bulk insert

>> on foreign tables and import -> on foreign tables, and can import

PostgreSQL 14 extends its performance gains to the [vacuuming](
https://www.postgresql.org/docs/14/routine-vacuuming.html)
system, including optimizations for reducing overhead from B-Trees.

>> B-Tree or B-tree - pick one (latter used earlier in this doc)

lets you uniquely track a query through several PostgreSQL systems,
including
[`pg_stat_activity`](
https://www.postgresql.org/docs/14/monitoring-stats.html#MONITORING-PG-STAT-ACTIVITY-VIEW
),
[`EXPLAIN VERBOSE`](https://www.postgresql.org/docs/14/sql-explain.html),
and
through several logging functions.

>> "several PostgreSQL systems" sounds weird.

>> "several logging functions" - not sure what this means


parallel queries when using the `RETURN QUERY` directive, and enabling

>> directive -> command

now benefit from incremental sorts, a feature that was introduced in
[PostgreSQL 13](
https://www.postgresql.org/about/news/postgresql-13-released-2077/).

>> that was introduced in -> introduced in

function. This release also adds the SQL conforming
[`SEARCH`](
https://www.postgresql.org/docs/14/queries-with.html#QUERIES-WITH-SEARCH)
and [`CYCLE`](
https://www.postgresql.org/docs/14/queries-with.html#QUERIES-WITH-CYCLE)
directives to help with ordering and cycle detection for recursive
[common table expressions](
https://www.postgresql.org/docs/14/queries-with.html#QUERIES-WITH-RECURSIVE
).

>> directives -> clauses


PostgreSQL 14 makes it convenient to assign read-only and write-only
privileges

>> convenient -> easy

companies and organizations. Built on over 30 years of engineering,
starting at

>> companies -> companies,

>> we claims 25 years at the top, 30 here. That's 5 non-open-source years?
:)


Re: Add option --drop-cascade for pg_dump/restore

2021-08-12 Thread Greg Sabino Mullane
On Wed, Aug 11, 2021 at 10:53 PM Wu Haotian  wrote:

> Maybe we can add checks like "option --clean requires plain text format"?
> If so, should I start a new mail thread for this?
>

Shrug. To me, that seems related enough it could go into the existing
patch/thread.

Cheers,
Greg


Re: make MaxBackends available in _PG_init

2021-08-12 Thread Greg Sabino Mullane
On Wed, Aug 11, 2021 at 10:08 AM Tom Lane  wrote:

> You must not have enabled EXEC_BACKEND properly.  It's a compile-time
> #define that affects multiple modules, so it's easy to get wrong.
> The way I usually turn it on is
>

Thank you. I was able to get it all working, and withdraw any objections to
that bit of the patch. :)

Cheers,
Greg


Re: make MaxBackends available in _PG_init

2021-08-11 Thread Greg Sabino Mullane
On Mon, Aug 9, 2021 at 8:22 PM Bossart, Nathan  wrote:

> > Is this going to get tripped by a call from restore_backend_variables?
>
> I ran 'make check-world' with EXEC_BACKEND with no problems, so I
> don't think so.
>

v3 looks good, but I'm still not sure how to test the bit mentioned above.
I'm not familiar with this part of the code (SubPostmasterMain etc.), but
running make check-world with EXEC_BACKEND does not seem to execute that
code, as I added exit(1) to restore_backend_variables() and the tests still
ran fine. Further digging shows that even though the #ifdef EXEC_BACKEND
path is triggered, no --fork argument was being passed. Is there something
else one needs to provide to force that --fork (see line 189 of
src/backend/main/main.c) when testing?

Cheers,
Greg


Re: Add option --drop-cascade for pg_dump/restore

2021-08-10 Thread Greg Sabino Mullane
On Fri, Jul 16, 2021 at 9:40 AM Tom Lane  wrote:

> That would require pg_restore to try to edit the DROP commands during
> restore, which sounds horribly fragile.  I'm inclined to think that
> supporting this option only during initial dump is safer.
>

Safer, but not nearly as useful. Maybe see what the OP (Wu Haotian) can
come up with as a first implementation?

Cheers,
Greg


Re: Avoid stuck of pbgench due to skipped transactions

2021-08-10 Thread Greg Sabino Mullane
Apologies, just saw this. I found no problems, those "failures" were just
me missing checkboxes on the commitfest interface. +1 on the patch.

Cheers,
Greg


Re: make MaxBackends available in _PG_init

2021-08-09 Thread Greg Sabino Mullane
On Sat, Aug 7, 2021 at 2:01 PM Bossart, Nathan  wrote:

> Here is a rebased version of the patch.
>

Giving this a review.

Patch applies cleanly and `make check` works as of
e12694523e7e4482a052236f12d3d8b58be9a22c

Overall looks very nice and tucks MaxBackends safely away.
I have a few suggestions:

> + size = add_size(size, mul_size(GetMaxBackends(0),
sizeof(BTOneVacInfo)));

The use of GetMaxBackends(0) looks weird - can we add another constant in
there
for the "default" case? Or just have GetMaxBackends() work?


> + GMB_MAX_PREPARED_XACTS = 1 << 0, /* include max_prepared_xacts */
s/include/add in/


>  +typedef enum GMBOption
> +{
> + GMB_MAX_PREPARED_XACTS = 1 << 0, /* include max_prepared_xacts */
> + GMB_NUM_AUXILIARY_PROCS = 1 << 1 /* include NUM_AUXILIARY_PROCS */
> +} GMBOption;

Is a typedef enum really needed? As opposed to something like this style:

#define WL_LATCH_SET (1 << 0)
#define WL_SOCKET_READABLE   (1 << 1)
#define WL_SOCKET_WRITEABLE  (1 << 2)


> -  (MaxBackends + max_prepared_xacts + 1));
> +  (GetMaxBackends(GMB_MAX_PREPARED_XACTS) + 1));

This is a little confusing - there is no indication to the reader that this
is an additive function. Perhaps a little more intuitive name:

> +  (GetMaxBackends(GMB_PLUS_MAX_PREPARED_XACTS) + 1));

However, the more I went through this patch, the more the GetMaxBackends(0)
nagged at me. The vast majority of the calls are with "0". I'd argue for
just having no arguments at all, which removes a bit of code and actually
makes things like this easier to read:

Original change:
> - uint32  TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS +
max_prepared_xacts;
> + uint32  TotalProcs = GetMaxBackends(GMB_NUM_AUXILIARY_PROCS |
GMB_MAX_PREPARED_XACTS);

Versus:
> - uint32  TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS +
max_prepared_xacts;
> + uint32  TotalProcs = GetMaxBackends() + NUM_AUXILIARY_PROCS +
max_prepared_xacts;


> + * This must be called after modules have had the change to alter GUCs in
> + * shared_preload_libraries, and before shared memory size is determined.
s/change/chance/;


> +void
> +SetMaxBackends(int max_backends)
> +{
> + if (MaxBackendsInitialized)
> +  elog(ERROR, "MaxBackends already initialized");

Is this going to get tripped by a call from restore_backend_variables?

Cheers,
Greg


Mention --enable-tap-tests in the TAP section page

2021-07-01 Thread Greg Sabino Mullane
Quick patch to add mention of the need for compiling with
--enable-tap-tests on the TAP section page

https://www.postgresql.org/docs/current/regress-tap.html

Searching about the TAP tests often leads to this page, but there is no
easy link or mention of the fact that the sample invocations will not work
without the special config flag.

Cheers,
Greg


mention-enable-tap.patch
Description: Binary data


Re: Speed up pg_checksums in cases where checksum already set

2021-06-29 Thread Greg Sabino Mullane
On Tue, Jun 29, 2021 at 2:59 AM Michael Paquier  wrote:

> Does that look fine to you?
>

Looks great, I appreciate the renaming.

Cheers,
Greg


Re: Remove useless int64 range checks on BIGINT sequence MINVALUE/MAXVALUE values

2021-06-22 Thread Greg Sabino Mullane
Those code comments look good.

Re: Avoid stuck of pbgench due to skipped transactions

2021-06-22 Thread Greg Sabino Mullane
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   tested, failed
Spec compliant:   not tested
Documentation:not tested

Looks fine to me, as a way of catching this edge case.

Re: Speed up pg_checksums in cases where checksum already set

2021-06-18 Thread Greg Sabino Mullane
On Fri, Jun 18, 2021 at 1:57 AM Michael Paquier  wrote:

> This doc addition is a bit confusing, as it could mean that each file
> has just one single checksum.  We could be more precise, say:
> "When enabling checksums, each relation file block with a changed
> checksum is rewritten in place."
>

Agreed, I like that wording. New patch attached.


> Should we also mention that the sync happens even if no blocks are
> rewritten based on the reasoning of upthread (aka we'd better do the
> final flush as an interrupted pg_checksums may let a portion of the
> files as not flushed)?
>

I don't know that we need to bother: the default is already to sync and one
has to go out of one's way using the -N argument to NOT sync, so I think
it's a pretty safe assumption to everyone (except those who read my first
version of my patch!) that syncing always happens.

Cheers,
Greg


003.pg_checksums.optimize.writes.and.always.sync.patch
Description: Binary data


Re: Update maintenance_work_mem/autovacuum_work_mem to reflect the 1GB limitation with VACUUM

2021-06-17 Thread Greg Sabino Mullane
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

Latest patch looks fine to me, to be clear.

The new status of this patch is: Ready for Committer


Re: Update maintenance_work_mem/autovacuum_work_mem to reflect the 1GB limitation with VACUUM

2021-06-04 Thread Greg Sabino Mullane
Should we say "currently has"?

Re: Speed up pg_checksums in cases where checksum already set

2021-06-02 Thread Greg Sabino Mullane
Newer version attach that adds a small documentation tweak as well.

Cheers,
Greg


pg_checksums.optimize.writes.and.always.sync.patch
Description: Binary data


Re: Speed up pg_checksums in cases where checksum already set

2021-06-02 Thread Greg Sabino Mullane
Fair enough; thanks for the feedback. Attached is a new version that does
an unconditional sync (well, unless do_sync is false, a flag I am not
particularly fond of).

Cheers,
Greg


pg_checksums.optimize.writes.always.sync.patch
Description: Binary data


Re: Update maintenance_work_mem/autovacuum_work_mem to reflect the 1GB limitation with VACUUM

2021-06-01 Thread Greg Sabino Mullane
s/Node/Note/

Other than that, +1 to the patch and +1 to backpatching.

The new status of this patch is: Waiting on Author


Re: Add option --drop-cascade for pg_dump/restore

2021-05-28 Thread Greg Sabino Mullane
Overall the patch looks good, but I did notice a few small things:

1. In pg_dumpall.c, the section  /* Add long options to the pg_dump argument 
list */, we are now 
passing along the --drop-cascade option. However, --clean is not passed in, so 
any call to pg_dumpall using --drop-cascade fails a the pg_dump step. You'll 
note 
that --if-exists it not passed along either; because we are dropping the whole 
database, we don't 
need to have pg_dump worry about dropping objects at all. So I think that 
--drop-cascade should NOT be passed along from pg_dumpall to pg_dump.

2. I'm not even sure if --drop-cascade makes sense for pg_dumpall, as you 
cannot cascade global things like databases and roles.

3. In the file pg_backup_archiver.c, the patch does a 
stmtEnd = strstr(mark + strlen(buffer), ";");" and then spits 
out things "past" the semicolon as the final %s in the appendPQExpBuffer line. 
I'm not clear why: are we expecting more things to appear after the semi-colon? 
Why not just append a "\n" manually as part of the previous %s?

Cheers,
Greg

The new status of this patch is: Waiting on Author


Re: Reduce lock level for ALTER TABLE ... ADD CHECK .. NOT VALID

2021-05-28 Thread Greg Sabino Mullane
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

Looks fine to me

The new status of this patch is: Ready for Committer


Re: Speed up pg_checksums in cases where checksum already set

2021-05-27 Thread Greg Sabino Mullane
Thanks for the quick replies, everyone.

On Wed, May 26, 2021 at 10:17 PM Michael Paquier 
wrote:

>
> -if (do_sync)
> +if (do_sync && total_files_modified)
> Here, I am on the edge.  It could be an advantage to force a flush of
> the data folder anyway, no?


I was originally on the fence about including this as well, but it seems
like since the database is shut down and already in a consistent state,
there seems no advantage to syncing if we have not made any changes. Things
are no better or worse than when we arrived. However, the real-world use
case of running pg_checksums --enable and getting no changed blocks is
probably fairly rare, so if there is a strong objection, I'm happy
reverting to just (do_sync). (I'm not sure how cheap a sync is, I assume
it's low impact as the database is shut down, I guess it becomes a "might
as well while we are here"?)

On Wed, May 26, 2021 at 10:29 PM Justin Pryzby  wrote:

> In one of the checksum patches, there was an understanding that the pages
> should be written even if the checksum is correct, to handle replicas.
> ...
> Does your patch complicate things for the "stop all the clusters before
> switching them all" case?
>

I cannot imagine how it would, but, like Michael, I'm not really
understanding the reasoning here. We only run when safely shutdown, so no
WAL or dirty buffers need concern us :). Of course, once the postmaster is
up and running, fiddling with checksums becomes vastly more complicated, as
evidenced by that thread. I'm happy sticking to and speeding up the offline
version for now.

Cheers,
Greg


Speed up pg_checksums in cases where checksum already set

2021-05-26 Thread Greg Sabino Mullane
The attached patch makes an optimization to pg_checksums which prevents
rewriting the block if the checksum is already what we expect. This can
lead to much faster runs in cases where it is already set (e.g. enabled ->
disabled -> enable, external helper process, interrupted runs, future
parallel processes). There is also an effort to not sync the data directory
if no changes were written. Finally, added a bit more output on how many
files were actually changed, e.g.:

Checksum operation completed
Files scanned:   1236
Blocks scanned:  23283
Files modified:  38
Blocks modified: 19194
pg_checksums: syncing data directory
pg_checksums: updating control file
Checksums enabled in cluster

Cheers,
Greg


pg_checksums.optimize.writes.patch
Description: Binary data


Re: psql \df choose functions by their arguments

2021-04-07 Thread Greg Sabino Mullane
I like the wildcard aspect, but I have a few issues with the patch:

* It doesn't respect some common abbreviations that work elsewhere (e.g.
CREATE FUNCTION). So while "int4" works, "int" does not. Nor does "float",
which thus requires the mandatory-double-quoted "double precision"

* Adding commas to the args, as returned by psql itself via \df, provides
no matches.

* There seems to be no way (?) to limit the functions returned if they
share a common root. The previous incantation allowed you to pull out
foo(int) from foo(int, bigint). This was a big motivation for writing this
patch.

* SQL error on \df foo a..b as well as one on \df foo (bigint bigint)

Cheers,
Greg


Re: psql \df choose functions by their arguments

2021-01-19 Thread Greg Sabino Mullane
Ha ha ha, my bad, I am not sure why I left those out. Here is a new patch
with int2, int4, and int8. Thanks for the email.

Cheers,
Greg


v6-psql-df-pick-function-by-type.patch
Description: Binary data


Re: psql \df choose functions by their arguments

2021-01-14 Thread Greg Sabino Mullane
Thanks for the feedback: new version v5 (attached) has int8, plus the
suggested code formatting.

Cheers,
Greg


v5-psql-df-pick-function-by-type.patch
Description: Binary data


Re: psql \df choose functions by their arguments

2021-01-06 Thread Greg Sabino Mullane
On Sat, Jan 2, 2021 at 1:56 AM Thomas Munro  wrote:

> ...
> It looks like there is a collation dependency here that causes the
> test to fail on some systems:
>

Thanks for pointing that out. I tweaked the function definitions to
hopefully sidestep the ordering issue - attached is v4.

Cheers,
Greg


v4-psql-df-pick-function-by-type.patch
Description: Binary data


Re: psql \df choose functions by their arguments

2020-12-30 Thread Greg Sabino Mullane
Attached is the latest patch against HEAD - basically fixes a few typos.

Cheers,
Greg


v3-psql-df-pick-function-by-type.patch
Description: Binary data


Re: Prefer TG_TABLE_NAME over TG_RELNAME in tests

2020-11-11 Thread Greg Sabino Mullane
Thanks so much: the Bucardo PR has been merged in.


Re: psql \df choose functions by their arguments

2020-11-03 Thread Greg Sabino Mullane
On Sun, Nov 1, 2020 at 12:05 PM David Christensen  wrote:

>
> I can’t speak for the specific patch, but tab completion of proc args for
> \df, \ef and friends has long been a desired feature of mine, particularly
> when you are dealing with functions with huge numbers of arguments and the
> same name which I have (sadly) come across many times in the wild.
>

If someone can get this working against this current patch, that would be
great, but I suspect it will require some macro-jiggering in tab-complete.c
and possibly more, so yeah, could be something to add to the todo list.

Cheers,
Greg


Re: psql \df choose functions by their arguments

2020-11-03 Thread Greg Sabino Mullane
Thanks for looking this over!


> some Abbreviations of common types are not added to the
> type_abbreviations[] Such as:
>
> Int8=> bigint
>

I wasn't aiming to provide a canonical list, as I personally have never
seen anyone use int8 instead of bigint when (for example) creating a
function, but I'm not strongly opposed to expanding the list.

Single array seems difficult to handle it, may be we can use double array
> or use a struct.
>

I think the single works out okay, as this is a simple write-once variable
that is not likely to get updated often.


> And I think It's better to update '/?' info about '\df[+]' in function
> slashUsage(unsigned short int pager).
>

Suggestions welcome, but it's already pretty tight in there, so I couldn't
think of anything:

fprintf(output, _("  \\dew[+] [PATTERN]  list foreign-data
wrappers\n"));
fprintf(output, _("  \\df[anptw][S+] [PATRN] list [only
agg/normal/procedures/trigger/window] functions\n"));
fprintf(output, _("  \\dF[+]  [PATTERN]  list text search
configurations\n"));

The \df option is already our longest one, even with the silly attempt to
shorten PATTERN :)

Cheers,
Greg


Re: psql \df choose functions by their arguments

2020-11-01 Thread Greg Sabino Mullane
Thanks for the feedback, attached is version two of the patch. Major
changes:

* Use booleans not generic "int x"
* Build a quick list of abbreviations at the top of the function
* Add array mapping for all types
* Removed the tab-complete bit, it was too fragile and unhelpful

Cheers,
Greg
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 221a967bfe..cf3e8c7134 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1574,7 +1574,7 @@ testdb=
 
 
   
-\df[anptwS+] [ pattern ]
+\df[anptwS+] [ pattern ] [ types ]
 
 
 
@@ -1587,6 +1587,7 @@ testdb=
 If pattern is specified, only
 functions whose names match the pattern are shown.
+Any additional words are considered type arguments to help narrow the list of returned functions.
 By default, only user-created
 objects are shown; supply a pattern or the S
 modifier to include system objects.
@@ -1598,7 +1599,7 @@ testdb=
 
 
 
-To look up functions taking arguments or returning values of a specific
+To look up functions returning values of a specific
 data type, use your pager's search capability to scroll through the
 \df output.
 
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index c7a83d5dfc..426603b0cb 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -783,6 +783,8 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 			case 'f':			/* function subsystem */
 switch (cmd[2])
 {
+		char	   *funcargs;
+
 	case '\0':
 	case '+':
 	case 'S':
@@ -791,7 +793,9 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 	case 'p':
 	case 't':
 	case 'w':
-		success = describeFunctions([2], pattern, show_verbose, show_system);
+		funcargs = psql_scan_slash_option(scan_state, OT_WHOLE_LINE, NULL, true);
+		success = describeFunctions([2], pattern, show_verbose, show_system, funcargs);
+		free(funcargs);
 		break;
 	default:
 		status = PSQL_CMD_UNKNOWN;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 07d640021c..a8d3f3ba53 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -26,6 +26,7 @@
 #include "fe_utils/print.h"
 #include "fe_utils/string_utils.h"
 #include "settings.h"
+#include "stringutils.h"
 #include "variables.h"
 
 static bool describeOneTableDetails(const char *schemaname,
@@ -312,7 +313,7 @@ describeTablespaces(const char *pattern, bool verbose)
  * and you can mix and match these in any order.
  */
 bool
-describeFunctions(const char *functypes, const char *pattern, bool verbose, bool showSystem)
+describeFunctions(const char *functypes, const char *pattern, bool verbose, bool showSystem, const char *funcargs)
 {
 	bool		showAggregate = strchr(functypes, 'a') != NULL;
 	bool		showNormal = strchr(functypes, 'n') != NULL;
@@ -626,6 +627,67 @@ describeFunctions(const char *functypes, const char *pattern, bool verbose, bool
 		  "n.nspname", "p.proname", NULL,
 		  "pg_catalog.pg_function_is_visible(p.oid)");
 
+	/*
+	 * Check for any additional arguments to narrow down which functions are
+	 * desired
+	 */
+	if (funcargs)
+	{
+
+		bool		is_initial_run = true;
+		bool		found_abbreviation;
+		int			argoffset = 0;
+		char	   *functoken;
+
+		static const char *type_abbreviations[] = {
+			"bool", "boolean", "bool[]", "boolean[]",
+			"char", "character", "char[]", "character[]",
+			"double", "double precision", "double[]", "double precision[]",
+			"float", "double precision", "float[]", "double precision[]",
+			"int", "integer", "int[]", "integer[]",
+			"time", "time without time zone", "time[]", "time without time zone[]",
+			"timetz", "time with time zone", "timetz[]", "time with time zone[]",
+			"timestamp", "timestamp without timestamp zone", "timestamp[]", "timestamp without timestamp zone[]",
+			"timestamptz", "timestamp with timestamp zone", "timestamptz[]", "timestamp with timestamp zone[]",
+			"varbit", "bit varying", "varbit[]", "bit varying[]",
+			"varchar", "character varying", "varchar[]", "character varying[]",
+			NULL
+		};
+
+		while ((functoken = strtokx(is_initial_run ? funcargs : NULL, " \t\n\r", ".,();", "\"", 0, false, true, pset.encoding)))
+		{
+			is_initial_run = false;
+			found_abbreviation = false;
+
+			if (isalpha(functoken[0]))
+			{
+appendPQExpBuffer(, "  AND p.proargtypes[%d]::regtype::text = ", argoffset++);
+for (int i = 0; NULL != *(type_abbreviations + i); i += 2)
+{
+	const char *shortname = *(type_abbreviations + i);
+	const char *longname = *(type_abbreviations + i + 1);
+
+	if (pg_strcasecmp(functoken, shortname) == 0)
+	{
+		appendPQExpBuffer(, "LOWER('%s')::text\n", longname);
+		found_abbreviation = true;
+		break;
+	}
+}
+if (!found_abbreviation)
+{
+	

Re: psql \df choose functions by their arguments

2020-10-29 Thread Greg Sabino Mullane
Thank you for looking this over.


> This isn't working for arrays:
> ...
> postgres=# \df aa aa int[]
>

Arrays should work as expected, I think you have one too many "aa" in there?


> I think it should use the same syntax as \sf and \ef, which require
> parenthesis
> and commas, not spaces.
>

Hmm, that will not allow partial matches if we require a closing parens.
Right now both commas and parens are accepted, but optional.


> I think x is just used as "initial", so I think you should make it boolean
> and
> then set is_initial = false, or similar.
>

Good suggestion, it is done.


> +
>  pg_strcasecmp(functoken, "bool") == 0 ? "'boolean'"
>
> I think writing this all within a call to appendPQExpBuffer() is excessive.
> You can make an array or structure to search through and then append the
> result
> to the buffer.
>

Hmm, like a custom struct we loop through? I will look into implementing
that and submit a new patch.

Cheers,
Greg


psql \df choose functions by their arguments

2020-10-15 Thread Greg Sabino Mullane
-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


Improve psql \df to choose functions by their arguments

== OVERVIEW

Having to scroll through same-named functions with different argument types
when you know exactly which one you want is annoying at best, error causing
at worst. This patch enables a quick narrowing of functions with the
same name but different arguments. For example, to see the full details
of a function names "myfunc" with a TEXT argument, but not showing
the version of "myfunc" with a BIGINT argument, one can now do:

psql=# \df myfunc text

For this, we are fairly liberal in what we accept, and try to be as
intuitive as possible.

Features:

* Type names are case insensitive. Whitespace is optional, but quoting is
respected:

greg=# \df myfunc text "character varying" INTEGER

* Abbreviations of common types is permitted (because who really likes
to type out "character varying"?), so the above could also be written as:

greg=# \df myfunc text varchar int

* The matching is greedy, so you can see everything matching a subset:

greg=# \df myfunc timestamptz
   List of functions
 Schema |  Name  | Result data type |Argument data types
 | Type
-
++--+---+--
 public | myfunc | void | timestamp with time zone
 | func
 public | myfunc | void | timestamp with time zone, bigint
 | func
 public | myfunc | void | timestamp with time zone, bigint,
boolean | func
 public | myfunc | void | timestamp with time zone, integer
| func
 public | myfunc | void | timestamp with time zone, text, cidr
 | func
(5 rows)

* The appearance of a closing paren indicates we do not want the greediness:

greg=# \df myfunc (timestamptz, bigint)
  List of functions
 Schema |  Name  | Result data type |   Argument data types|
Type
-
++--+--+--
 public | myfunc | void | timestamp with time zone, bigint |
func
(1 row)


== TAB COMPLETION:

I'm not entirely happy with this, but I figure piggybacking
onto COMPLETE_WITH_FUNCTION_ARG is better than nothing at all.
Ideally we'd walk prev*_wd to refine the returned list, but
that's an awful lot of complexity for very little gain, and I think
the current behavior of showing the complete list of args each time
should suffice.


== DOCUMENTATION:

The new feature is briefly mentioned: wordsmithing help in the
sgml section is appreciated. I'm not sure how many of the above features
need to be documented in detail.

Regarding psql/help.c, I don't think this really warrants a change there.
As it is, we've gone through great lengths to keep this overloaded
backslash
command left justified with the rest!


== TESTS:

I put this into psql.c, seems the best place. Mostly testing out
basic functionality, quoting, and the various abbreviations. Not much
else to test, near as I can tell, as this is a pure convienence addition
and shouldn't affect anything else. Any extra words after a function name
for \df was previously treated as an error.


== IMPLEMENTATION:

Rather than messing with psqlscanslash, we simply slurp in the entire rest
of the line via psql_scan_slash_option (all of which was previously
ignored).
This is passed to describeFunction, which then uses strtokx to break it
into tokens. We look for a match by comparing the current proargtypes
entry,
casted to text, against the lowercase version of the token found by
strtokx.
Along the way, we convert things like "timestamptz" to the official version
(i.e. "timestamp with time zone"). If any of the tokens start with a
closing
paren, we immediately stop parsing and set pronargs to the current number
of valid tokens, thereby forcing a match to one (or zero) functions.

6ab7a45d541f2c31c5631b811f14081bf7b22271
v1-psql-df-pick-function-by-type.patch

- --
Greg Sabino Mullane
PGP Key: 0x14964AC8 202010151316
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-BEGIN PGP SIGNATURE-

iF0EAREDAB0WIQQlKd9quPeUB+lERbS8m5BnFJZKyAUCX4iENQAKCRC8m5BnFJZK
yIUKAKDiv1E9KgXuSO7lE9p+ttFdk02O2ACg44lu9VdKt3IggIrPiXBPKR8C85M=
=QPSd
-END PGP SIGNATURE-
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index ee3fc09577..c63255cebc 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1548,7 +1548,7 @@ testdb=
 
 
   
-\df[anptwS+] [ pattern ]
+\df[anptwS+] [ pattern ] [ types ]
 
 
 
@@ -1561,6 +1561,7 @@ testdb=
 If pattern is specified, only
 functions whose names match the pattern are shown.
+Any additional words are considered type arguments to help narrow the list of returned functions.
 By default, only user-c

WIP psql \df choose functions by their arguments

2020-10-14 Thread Greg Sabino Mullane
-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


Improve psql \df to choose functions by their arguments

== OVERVIEW

Having to scroll through same-named functions with different argument types 
when you know exactly which one you want is annoying at best, error causing 
at worst. This patch enables a quick narrowing of functions with the 
same name but different arguments. For example, to see the full details 
of a function names "myfunc" with a TEXT argument, but not showing 
the version of "myfunc" with a BIGINT argument, one can now do:

psql=# \df myfunc text

For this, we are fairly liberal in what we accept, and try to be as 
intuitive as possible.

Features:

* Type names are case insensitive. Whitespace is optional, but quoting is 
respected:

greg=# \df myfunc text "character varying" INTEGER

* Abbreviations of common types is permitted (because who really likes 
to type out "character varying"?), so the above could also be written as:

greg=# \df myfunc text varchar int

* The matching is greedy, so you can see everything matching a subset:

greg=# \df myfunc timestamptz
   List of functions
 Schema |  Name  | Result data type |Argument data types
| Type 
- 
++--+---+--
 public | myfunc | void | timestamp with time zone  
| func
 public | myfunc | void | timestamp with time zone, bigint  
| func
 public | myfunc | void | timestamp with time zone, bigint, boolean 
| func
 public | myfunc | void | timestamp with time zone, integer 
| func
 public | myfunc | void | timestamp with time zone, text, cidr  
| func
(5 rows)

* The appearance of a closing paren indicates we do not want the greediness:

greg=# \df myfunc (timestamptz, bigint)
  List of functions
 Schema |  Name  | Result data type |   Argument data types| Type 
- ++--+--+--
 public | myfunc | void | timestamp with time zone, bigint | func
(1 row)


== TAB COMPLETION:

I'm not entirely happy with this, but I figure piggybacking 
onto COMPLETE_WITH_FUNCTION_ARG is better than nothing at all.
Ideally we'd walk prev*_wd to refine the returned list, but 
that's an awful lot of complexity for very little gain, and I think 
the current behavior of showing the complete list of args each time 
should suffice.


== DOCUMENTATION:

The new feature is briefly mentioned: wordsmithing help in the 
sgml section is appreciated. I'm not sure how many of the above features 
need to be documented in detail.

Regarding psql/help.c, I don't think this really warrants a change there. 
As it is, we've gone through great lengths to keep this overloaded backslash 
command left justified with the rest!


== TESTS:

I put this into psql.c, seems the best place. Mostly testing out 
basic functionality, quoting, and the various abbreviations. Not much 
else to test, near as I can tell, as this is a pure convienence addition 
and shouldn't affect anything else. Any extra words after a function name 
for \df was previously treated as an error.


== IMPLEMENTATION:

Rather than messing with psqlscanslash, we simply slurp in the entire rest 
of the line via psql_scan_slash_option (all of which was previously ignored). 
This is passed to describeFunction, which then uses strtokx to break it 
into tokens. We look for a match by comparing the current proargtypes entry, 
casted to text, against the lowercase version of the token found by strtokx. 
Along the way, we convert things like "timestamptz" to the official version 
(i.e. "timestamp with time zone"). If any of the tokens start with a closing 
paren, we immediately stop parsing and set pronargs to the current number 
of valid tokens, thereby forcing a match to one (or zero) functions.

dcd972f6b945070ef4454ea39d25378427a90e89  df.patch

-BEGIN PGP SIGNATURE-

iF0EAREDAB0WIQQlKd9quPeUB+lERbS8m5BnFJZKyAUCX4bsgwAKCRC8m5BnFJZK
yGDvAJ9ix8jzwtTwKLDQUgu5yb/iBoC7EQCfQsf8LLZ0RWsiiMposi57u3S94nE=
=rQj2
-END PGP SIGNATURE-
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index ee3fc09577..c63255cebc 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1548,7 +1548,7 @@ testdb=
 
 
   
-\df[anptwS+] [ pattern ]
+\df[anptwS+] [ pattern ] [ types ]
 
 
 
@@ -1561,6 +1561,7 @@ testdb=
 If pattern is specified, only
 functions whose names match the pattern are shown.
+Any additional words are considered type arguments to help narrow the list of returned functions.
 By default, only user-created
 objects are shown; supply a pattern or the S
 modifier to include system objects.
@@ -1572,7 +1573,7 @@ testdb=
 
 
 
-