Re: logical decoding and replication of sequences, take 2

2023-03-16 Thread John Naylor
On Wed, Mar 15, 2023 at 7:00 PM Tomas Vondra 
wrote:
>
> On 3/10/23 11:03, John Naylor wrote:

> > + * When we're called via the SQL SRF there's already a transaction
> >
> > I see this was copied from existing code, but I found it confusing --
> > does this function have a stable name?
>
> What do you mean by "stable name"? It certainly is not exposed as a
> user-callable SQL function, so I think this comment is misleading and
> should be removed.

Okay, I was just trying to think of why it was phrased this way...

-- 
John Naylor
EDB: http://www.enterprisedb.com


Re: logical decoding and replication of sequences, take 2

2023-03-16 Thread John Naylor
On Wed, Mar 15, 2023 at 7:51 PM Tomas Vondra 
wrote:
>
>
>
> On 3/14/23 08:30, John Naylor wrote:
> > I tried a couple toy examples with various combinations of use styles.
> >
> > Three with "automatic" reading from sequences:
> >
> > create table test(i serial);
> > create table test(i int GENERATED BY DEFAULT AS IDENTITY);
> > create table test(i int default nextval('s1'));
> >
> > ...where s1 has some non-default parameters:
> >
> > CREATE SEQUENCE s1 START 100 MAXVALUE 100 INCREMENT BY -1;
> >
> > ...and then two with explicit use of s1, one inserting the 'nextval'
> > into a table with no default, and one with no table at all, just
> > selecting from the sequence.
> >
> > The last two seem to work similarly to the first three, so it seems like
> > FOR ALL TABLES adds all sequences as well. Is that expected?
>
> Yeah, that's a bug - we shouldn't replicate the sequence changes, unless
> the sequence is actually added to the publication. I tracked this down
> to a thinko in get_rel_sync_entry() which failed to check the object
> type when puballtables or puballsequences was set.
>
> Attached is a patch fixing this.

Okay, I can verify that with 0001-0006, sequences don't replicate unless
specified. I do see an additional change that doesn't make sense: on the
subscriber I no longer see a jump to the logged 32-value increment; I see
the very next value:

# alter system set wal_level='logical';
# port  is subscriber

echo
echo "PUB:"
psql -c "drop table if exists test;"
psql -c "drop publication if exists pub1;"

echo
echo "SUB:"
psql -p  -c "drop table if exists test;"
psql -p  -c "drop subscription if exists sub1 ;"

echo
echo "PUB:"
psql -c "create table test(i int GENERATED BY DEFAULT AS IDENTITY);"
psql -c "CREATE PUBLICATION pub1 FOR ALL TABLES;"
psql -c "CREATE PUBLICATION pub2 FOR ALL SEQUENCES;"

echo
echo "SUB:"
psql -p  -c "create table test(i int GENERATED BY DEFAULT AS IDENTITY);"
psql -p  -c "CREATE SUBSCRIPTION sub1 CONNECTION 'host=localhost
dbname=postgres application_name=sub1 port=5432' PUBLICATION pub1;"
psql -p  -c "CREATE SUBSCRIPTION sub2 CONNECTION 'host=localhost
dbname=postgres application_name=sub2 port=5432' PUBLICATION pub2;"

echo
echo "PUB:"
psql -c "insert into test default values;"
psql -c "insert into test default values;"
psql -c "select * from test;"
psql -c "select * from test_i_seq;"

sleep 1

echo
echo "SUB:"
psql -p  -c "select * from test;"
psql -p  -c "select * from test_i_seq;"

psql -p  -c "drop subscription sub1 ;"
psql -p  -c "drop subscription sub2 ;"

psql -p  -c "insert into test default values;"
psql -p  -c "select * from test;"
psql -p  -c "select * from test_i_seq;"

The last two queries on the subscriber show:

 i
---
 1
 2
 3
(3 rows)

 last_value | log_cnt | is_called
------------+---------+-----------
          3 |      30 | t
(1 row)

...whereas before with 0001-0003 I saw:

  i
----
  1
  2
 34
(3 rows)

 last_value | log_cnt | is_called
------------+---------+-----------
         34 |      32 | t
(1 row)
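
For context, a minimal sketch of why 34 is the value you'd expect after
replaying only the logged sequence state (sequence name is illustrative):

    CREATE SEQUENCE s2;
    SELECT nextval('s2');                -- returns 1
    SELECT last_value, log_cnt FROM s2;  -- last_value 1, log_cnt 32

nextval() WAL-logs 32 values beyond the one it returns, so after a crash
(or when only the logged state reaches a replica) the next value handed
out is 34.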

> > The documentation for CREATE PUBLICATION mentions sequence options,
> > but doesn't really say how these options should be used.
> Good point. The idea is that we handle tables and sequences the same
> way, i.e. if you specify 'sequence' then we'll replicate increments for
> sequences explicitly added to the publication.
>
> If this is not clear, the docs may need some improvements.
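
For illustration, a minimal sketch of that intended usage, in the syntax that
appears in this thread (the per-object FOR SEQUENCE form is my assumption,
not something confirmed above):

    CREATE PUBLICATION all_seq_pub FOR ALL SEQUENCES WITH (publish = 'sequence');
    -- hypothetical per-object form, if the patch exposes one:
    -- CREATE PUBLICATION one_seq_pub FOR SEQUENCE s1 WITH (publish = 'sequence');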

Aside from docs, I'm not clear what some of the tests are doing:

+CREATE PUBLICATION testpub_forallsequences FOR ALL SEQUENCES WITH (publish = 'sequence');
+RESET client_min_messages;
+ALTER PUBLICATION testpub_forallsequences SET (publish = 'insert, sequence');

What does it mean to add 'insert' to a sequence publication?

Likewise, from a brief change in my test above, 'sequence' seems to be a
noise word for table publications. I'm not fully read up on the background
of this topic, but wanted to make sure I understood the design of the
syntax.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: slapd logs to syslog during tests

2023-03-16 Thread Andres Freund
Hi,

On 2023-03-16 23:52:04 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > I got sidetracked trying to make slapd stop any and all syslog access,
> > but it doesn't look like that's possible. But working on committing the
> > logfile-only approach now. Planning to backpatch this, unless somebody
> > protests very soon.
> 
> Sadly, buildfarm seems to be having some indigestion with this ...

Unfortunately even just slightly older versions don't have the logfile-only
option :(.

For a bit I thought we were out of options, because 'loglevel 0' works, but
I was not seeing any contents in the logfile we specify. But as it turns out,
the logfile we were already specifying before this patch never contains
anything, because:

   logfile <filename>
          Specify a file for recording slapd debug messages. By default
          these messages only go to stderr, are not recorded anywhere else,
          and are unrelated to messages exposed by the loglevel
          configuration parameter. Specifying a logfile copies messages to
          both stderr and the logfile.
and
   loglevel <integer> [...]
          Specify the level at which debugging statements and operation
          statistics should be syslogged (currently logged to the
          syslogd(8) LOG_LOCAL4 facility).

yet using logfile-only does prevent things from ending up in syslog.

Because it's not at all confusing that a 'loglevel' option doesn't influence
at all what ends up in the file controlled by 'logfile'.


Given that 'loglevel 0' works and doesn't actually reduce the amount of
logging available, that seems to be the way to go.


Iff we actually want slapd logging, the stderr logging can be turned on (and
potentially redirected to a logfile via logfile or redirection). But
unfortunately it seems that the debug level can only be set on the server
commandline. And has a significant sideeffect:
   -d debug-level
          Turn on debugging as defined by debug-level.  If this option is
          specified, even with a zero argument, slapd will not fork or
          disassociate from the invoking terminal

Which means the server can't be started anymore the way we currently do; we'd
have to use IPC::Run::start.  I hacked that together locally, but that's more
than I think I can get right at my current level of tiredness.


So unless somebody has a better idea, I'm gonna replace 'logfile-only on' with
'loglevel 0' for now. I also am open to reverting and trying again tomorrow.

Greetings,

Andres Freund




Re: logical decoding and replication of sequences, take 2

2023-03-16 Thread vignesh C
On Thu, 16 Mar 2023 at 21:55, Tomas Vondra
 wrote:
>
> Hi!
>
> On 3/16/23 08:38, Masahiko Sawada wrote:
> > Hi,
> >
> > On Wed, Mar 15, 2023 at 9:52 PM Tomas Vondra
> >  wrote:
> >>
> >>
> >>
> >> On 3/14/23 08:30, John Naylor wrote:
> >>> I tried a couple toy examples with various combinations of use styles.
> >>>
> >>> Three with "automatic" reading from sequences:
> >>>
> >>> create table test(i serial);
> >>> create table test(i int GENERATED BY DEFAULT AS IDENTITY);
> >>> create table test(i int default nextval('s1'));
> >>>
> >>> ...where s1 has some non-default parameters:
> >>>
> >>> CREATE SEQUENCE s1 START 100 MAXVALUE 100 INCREMENT BY -1;
> >>>
> >>> ...and then two with explicit use of s1, one inserting the 'nextval'
> >>> into a table with no default, and one with no table at all, just
> >>> selecting from the sequence.
> >>>
> >>> The last two seem to work similarly to the first three, so it seems like
> >>> FOR ALL TABLES adds all sequences as well. Is that expected?
> >>
> >> Yeah, that's a bug - we shouldn't replicate the sequence changes, unless
> >> the sequence is actually added to the publication. I tracked this down
> >> to a thinko in get_rel_sync_entry() which failed to check the object
> >> type when puballtables or puballsequences was set.
> >>
> >> Attached is a patch fixing this.
> >>
> >>> The documentation for CREATE PUBLICATION mentions sequence options,
> >>> but doesn't really say how these options should be used.
> >> Good point. The idea is that we handle tables and sequences the same
> >> way, i.e. if you specify 'sequence' then we'll replicate increments for
> >> sequences explicitly added to the publication.
> >>
> >> If this is not clear, the docs may need some improvements.
> >>
> >
> > I'm late to this thread, but I have some questions and review comments.
> >
> > Regarding sequence logical replication, it seems that changes of
> > sequence created after CREATE SUBSCRIPTION are applied on the
> > subscriber even without REFRESH PUBLICATION command on the subscriber.
> > Which is a different behavior than tables. For example, I set both
> > publisher and subscriber as follows:
> >
> > 1. On publisher
> > create publication test_pub for all sequences;
> >
> > 2. On subscriber
> > create subscription test_sub connection 'dbname=postgres port=5551'
> > publication test_pub; -- port=5551 is the publisher
> >
> > 3. On publisher
> > create sequence s1;
> > select nextval('s1');
> >
> > I got the error "ERROR:  relation "public.s1" does not exist on the
> > subscriber". Probably we need to do should_apply_changes_for_rel()
> > check in apply_handle_sequence().
> >
>
> Yes, you're right - the sequence handling should have been calling the
> should_apply_changes_for_rel() etc.
>
> The attached 0005 patch should fix that - I still need to test it a bit
> more and maybe clean it up a bit, but hopefully it'll allow you to
> continue the review.
>
> I had to tweak the protocol a bit, so that this uses the same cache as
> tables. I wonder if maybe we should make it even more similar, by
> essentially treating sequences as tables with (last_value, log_cnt,
> called) columns.
>
> > If my understanding is correct, is there any case where the subscriber
> > needs to apply transactional sequence changes? The commit message of
> > 0001 patch says:
> >
> > * Changes for sequences created in the same top-level transaction are
> >   treated as transactional, i.e. just like any other change from that
> >   transaction, and discarded in case of a rollback.
> >
> > IIUC such sequences are not visible to the subscriber, so it cannot
> > subscribe to them until the commit.
> >
>
> The comment is slightly misleading, as it talks about creation of
> sequences, but it should be talking about relfilenodes. For example, if
> you create a sequence, add it to publication, and then in a later
> transaction you do
>
>ALTER SEQUENCE x RESTART
>
> or something else that creates a new relfilenode, then the subsequent
> increments are visible only in that transaction. We still need to apply
> those on the subscriber, but only as part of the transaction, because it
> might roll back.
>
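
A minimal sketch of the transactional case described above (object names are
illustrative):

    BEGIN;
    ALTER SEQUENCE x RESTART;  -- assigns a new relfilenode
    SELECT nextval('x');       -- increment visible only inside this transaction
    ROLLBACK;                  -- the subscriber must discard it as well
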
> > ---
> > I got an assertion failure. The reproducible steps are:
> >
>
> I do believe this was due to a thinko in apply_handle_sequence, which
> sometimes started transaction and didn't terminate it correctly. I've
> changed it to use the begin_replication_step() etc. and it seems to be
> working fine now.

One of the patches does not apply on HEAD because of a recent commit;
we might have to rebase the patch:
git am 0005-fixup-syncing-refresh-sequences-20230316.patch
Applying: fixup syncing/refresh sequences
error: patch failed: src/backend/replication/pgoutput/pgoutput.c:711
error: src/backend/replication/pgoutput/pgoutput.c: patch does not apply
Patch failed at 0001 fixup syncing/refresh sequences

Regards,
Vignesh




Re: psql \watch 2nd argument: iteration count

2023-03-16 Thread Andrey Borodin
On Wed, Mar 15, 2023 at 5:54 PM Michael Paquier  wrote:
>
> On Wed, Mar 15, 2023 at 04:58:49PM +0900, Michael Paquier wrote:
> > Yep.  You are right.
>
> Fixed that and applied 0001.
Great, thanks!

>
> +valptr++;
> +if (strncmp("i", opt, strlen("i")) == 0 ||
> +strncmp("interval", opt, strlen("interval")) == 0)
> +{
>
> Did you look at process_command_g_options() and if some consolidation
> was possible?  It would be nice to have APIs shaped so as more
> sub-commands could rely on the same facility in the future.
I've tried, but they behave so differently. I could reuse only the
"char *valptr = strchr(opt, '=');" thing from there :)
And process_command_g_options() changes data in-place...
Actually, I'm not sure having "i" == "interval" and "c" == "count" is a
good idea here either. I mean I like it, but is it coherent?
Also, I do not like repeating these 5 lines 4 times:
+ pg_log_error("\\watch: incorrect interval value '%s'", valptr);
+ free(opt);
+ resetPQExpBuffer(query_buf);
+ psql_scan_reset(scan_state);
+ return PSQL_CMD_ERROR;
But I hesitate defining a new function for this...

>
> -\watch [ <replaceable class="parameter">seconds</replaceable> ]
> +\watch [ <replaceable class="parameter">seconds</replaceable> [ <replaceable class="parameter">iterations</replaceable> ] ]
>
> This set of changes is not reflected in the documentation.
Done.

> With an interval in place, we could now automate some tests with
> \watch where it does not fail.  What do you think about adding a test
> with a simple query, an interval of 0s and one iteration?
Done. Also found a bug that we actually were doing N+1 iterations.
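
For reference, a minimal usage sketch of the patched syntax, per the docs
change above (queries are illustrative):

    SELECT now();
    \watch 2 5    -- re-run the query buffer every 2 seconds, 5 times

    SELECT 1;
    \watch 0 1    -- the added test's cheap case: 0s interval, one iteration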

Thank you for working on this!

Best regards, Andrey Borodin.


v9-0001-Iteration-count-argument-to-psql-watch-command.patch
Description: Binary data


Re: slapd logs to syslog during tests

2023-03-16 Thread Tom Lane
Andres Freund  writes:
> I got sidetracked trying to make slapd stop any and all syslog access, but it
> doesn't look like that's possible. But working on committing the logfile-only
> approach now. Planning to backpatch this, unless somebody protests very soon.

Sadly, buildfarm seems to be having some indigestion with this ...

regards, tom lane




Re: Simplify some codes in pgoutput

2023-03-16 Thread Peter Smith
On Wed, Mar 15, 2023 at 7:30 PM houzj.f...@fujitsu.com
 wrote:
>
> Hi,
>
> I noticed that there are some duplicated codes in pgoutput_change() function
> which can be simplified, and here is an attempt to do that.
>
> Best Regards,
> Hou Zhijie

Hi Hou-san.

I had a quick look at the 0001 patch.

Here are some first comments.

==

1.
+ if (relentry->attrmap)
+ old_slot = execute_attr_map_slot(relentry->attrmap, old_slot,
+ MakeTupleTableSlot(RelationGetDescr(targetrel),
+ &TTSOpsVirtual));

1a.
IMO maybe it was more readable before when there was a separate
'tupdesc' variable, instead of trying to squeeze too much into one
statement.

1b.
Should you retain the old comments that said "/* Convert tuple if needed. */"

~~~

2.
- if (old_slot)
- old_slot = execute_attr_map_slot(relentry->attrmap,
- old_slot,
- MakeTupleTableSlot(tupdesc, &TTSOpsVirtual));

The original code for REORDER_BUFFER_CHANGE_UPDATE was checking "if
(old_slot)" but that check seems no longer present. Is it OK?

~~~

3.
- /*
- * Send BEGIN if we haven't yet.
- *
- * We send the BEGIN message after ensuring that we will actually
- * send the change. This avoids sending a pair of BEGIN/COMMIT
- * messages for empty transactions.
- */

That original longer comment has been replaced with just "/* Send
BEGIN if we haven't yet */". Won't it be better to retain the more
informative longer comment?

~~~

4.
+
+cleanup:
  if (RelationIsValid(ancestor))
  {
  RelationClose(ancestor);

~

Since you've introduced a new label 'cleanup:' then IMO you can remove
that old comment "/* Cleanup */".

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: A problem about ParamPathInfo for an AppendPath

2023-03-16 Thread Richard Guo
On Fri, Mar 17, 2023 at 6:15 AM Tom Lane  wrote:

> Pushed.  I thought the comment needed to be completely rewritten not just
> tweaked, and I felt it was probably reasonable to continue to exclude
> dummy paths from getting the more expensive treatment.


Yes agreed.  Thanks for the changes and pushing.

Thanks
Richard


Re: gcc 13 warnings

2023-03-16 Thread John Naylor
On Fri, Mar 17, 2023 at 1:11 AM Andres Freund  wrote:

> On 2023-03-16 13:54:29 -0400, Tom Lane wrote:

> So I just elected to leave it at the default for meson.

In my build scripts I've been setting it to -O2, because that seemed the
obvious thing to do, and assumed some later commit would get rid of the
need to do it manually. (if it was discussed before, I missed that)

> > I'm not sure if we're prepared to go to -O3 by default though,
> > especially for some of the older buildfarm critters where that
> > might be buggy.  (I'd imagine you take a hit in gdb-ability too.)

Newer platforms can be buggy too. A while back, IIUC, gcc moved an
optimization pass from -O3 to -O2, which resulted in obviously bad code
generation, which I know because of a bug report filed by one Andres Freund:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101481

...which was never properly addressed as far as I know.

I'm a bit surprised we would even consider changing optimization level
based on a build tool default.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Allow logical replication to copy tables in binary format

2023-03-16 Thread Amit Kapila
On Fri, Mar 17, 2023 at 6:42 AM Peter Smith  wrote:
>
> On Thu, Mar 16, 2023 at 1:55 PM Amit Kapila  wrote:
> >
> > On Wed, Mar 15, 2023 at 3:33 PM Melih Mutlu  wrote:
> > >
> > > Amit Kapila wrote on Wed, 15 Mar 2023 at 12:31:
> > >>
> > >> On Tue, Mar 14, 2023 at 4:32 PM Melih Mutlu  
> > >> wrote:
> > >
> > >
> > >>
> > >> What purpose does this test serve w.r.t this patch? Before checking
> > >> the sync for different column orders, the patch has already changed
> > >> binary to false, so it doesn't seem to test the functionality of this
> > >> patch. Am, I missing something?
> > >
> > >
> > > I missed that binary has changed to false before testing column orders. I 
> > > moved that test case up before changing binary to false.
> > > Please see v14 [1].
> > >
> >
> > After thinking some more about this test, I don't think we need this
> > test as this doesn't add any value to this patch. This tests the
> > column orders which is well-established functionality of the apply
> > worker.
> >
>
> I agree that different column order is a "well-established
> functionality of the apply worker".
>
> But when I searched the TAP tests I could not find any existing tests
> that check the combination of
> - different column orders
> - CREATE SUBSCRIPTION with parameters binary=true and copy_data=true
>
> So there seemed to be a gap in the test coverage, which is why I suggested it.
>
> I guess that test was not strictly tied to this patch. Should I post
> this new test suggestion as a separate thread or do you think there is
> no point because it will not get any support?
>

Personally, I don't think we need to test every possible combination
unless it is really achieving something meaningful. In this particular
case, I don't see the need or maybe I am missing something.

-- 
With Regards,
Amit Kapila.




Re: Assert failure of the cross-check for nullingrels

2023-03-16 Thread Richard Guo
On Mon, Mar 13, 2023 at 5:44 PM Richard Guo  wrote:

> On Mon, Mar 13, 2023 at 5:03 PM Richard Guo 
> wrote:
>
>> Back to the original issue: if a join has more than one qual, we actually
>> treat them as a whole when we check if identity 3 applies, as well as
>> when we adjust them to be suitable for commutation according to identity
>> 3.  So when we check if a qual is computable at a given level, I think
>> we should also consider the join's quals as a whole.  I'm thinking that
>> we use a 'group' notion for RestrictInfos and then use the clause_relids
>> of the 'group' in clause_is_computable_at.  Does this make sense?
>>
>
> I'm imagining something like attached (no comments and test cases yet).
>

Here is an updated patch with comments and a test case.  I also changed the
code to store 'group_clause_relids' directly in RestrictInfo.

Thanks
Richard


v2-0001-Check-group_clause_relids-to-see-if-a-clause-is-computable.patch
Description: Binary data


Re: zstd compression for pg_dump

2023-03-16 Thread Tomas Vondra



On 3/16/23 05:50, Justin Pryzby wrote:
> On Fri, Mar 10, 2023 at 12:48:13PM -0800, Jacob Champion wrote:
>> On Wed, Mar 8, 2023 at 10:59 AM Jacob Champion  
>> wrote:
>>> I did some smoke testing against zstd's GitHub release on Windows. To
>>> build against it, I had to construct an import library, and put that
>>> and the DLL into the `lib` folder expected by the MSVC scripts...
>>> which makes me wonder if I've chosen a harder way than necessary?
>>
>> It looks like pg_dump's meson.build is missing dependencies on zstd
>> (meson couldn't find the headers in the subproject without them).
> 
> I saw that this was added for LZ4, but I hadn't added it for zstd since
> I didn't run into an issue without it.  Could you check that what I've
> added works for your case ?
> 
>>> Parallel zstd dumps seem to work as expected, in that the resulting
>>> pg_restore output is identical to uncompressed dumps and nothing
>>> explodes. I haven't inspected the threading implementation for safety
>>> yet, as you mentioned.
>>
>> Hm. Best I can tell, the CloneArchive() machinery is supposed to be
>> handling safety for this, by isolating each thread's state. I don't feel
>> comfortable pronouncing this new addition safe or not, because I'm not
>> sure I understand what the comments in the format-specific _Clone()
>> callbacks are saying yet.
> 
> My line of reasoning for unix is that pg_dump forks before any calls to
> zstd.  Nothing zstd does ought to affect the pg_dump layer.  But that
> doesn't apply to pg_dump under windows.  This is an open question.  If
> there's no solid answer, I could disable/ignore the option (maybe only
> under windows).
> 

I may be missing something, but why would the patch affect this? Why
would it even affect safety of the parallel dump? And I don't see any
changes to the clone stuff ...

>> On to code (not a complete review):
>>
>>> if (hasSuffix(fname, ".gz"))
>>> compression_spec.algorithm = PG_COMPRESSION_GZIP;
>>> else
>>> {
>>> bool        exists;
>>>
>>> exists = (stat(path, &st) == 0);
>>> /* avoid unused warning if it is not built with compression */
>>> if (exists)
>>> compression_spec.algorithm = PG_COMPRESSION_NONE;
>>> -#ifdef HAVE_LIBZ
>>> -   if (!exists)
>>> -   {
>>> -   free_keep_errno(fname);
>>> -   fname = psprintf("%s.gz", path);
>>> -   exists = (stat(fname, &st) == 0);
>>> -
>>> -   if (exists)
>>> -   compression_spec.algorithm = PG_COMPRESSION_GZIP;
>>> -   }
>>> -#endif
>>> -#ifdef USE_LZ4
>>> -   if (!exists)
>>> -   {
>>> -   free_keep_errno(fname);
>>> -   fname = psprintf("%s.lz4", path);
>>> -   exists = (stat(fname, &st) == 0);
>>> -
>>> -   if (exists)
>>> -   compression_spec.algorithm = PG_COMPRESSION_LZ4;
>>> -   }
>>> -#endif
>>> +   else if (check_compressed_file(path, &fname, "gz"))
>>> +   compression_spec.algorithm = PG_COMPRESSION_GZIP;
>>> +   else if (check_compressed_file(path, &fname, "lz4"))
>>> +   compression_spec.algorithm = PG_COMPRESSION_LZ4;
>>> +   else if (check_compressed_file(path, &fname, "zst"))
>>> +   compression_spec.algorithm = PG_COMPRESSION_ZSTD;
>>> }
>>
>> This function lost some coherence, I think. Should there be a hasSuffix
>> check at the top for ".zstd" (and, for that matter, ".lz4")?
> 

This was discussed in the lz4 thread a couple of days ago, and I think there
should be hasSuffix() cases for lz4/zstd too, not just for .gz.

> The function is first checking if it was passed a filename which already
> has a suffix.  And if not, it searches through a list of suffixes,
> testing for an existing file with each suffix.  The search with stat()
> doesn't happen if it has a suffix.  I'm having trouble seeing how the
> hasSuffix() branch isn't dead code.  Another open question.
> 

AFAICS it's done this way because of this comment in pg_backup_directory

 * ...
 * ".gz" suffix is added to the filenames. The TOC files are never
 * compressed by pg_dump, however they are accepted with the .gz suffix
 * too, in case the user has manually compressed them with 'gzip'.

I haven't tried, but I believe that if you manually compress the
directory, it may hit this branch. And IMO if we support that for gzip,
the other compression methods should do that too for consistency.

In any case, it's a tiny amount of code and I don't feel like ripping
that out when it might break some currently supported use case.

>> I'm a little suspicious of the replacement of supports_compression()
>> with parse_compress_specification(). For example:
>>
>>> -   errmsg = supports_compression(AH->compression_spec);
>>> -   if (errmsg)
>>> +   parse_compress_specification(AH->compression_spec.algorithm,
>>> +   NULL, &compress_spec);
>>> +   if (compress_spec.parse_error != NULL)
>>> {
>>> pg_log_warning("archive is compressed, but this installation does 
>>> not 

RE: Allow logical replication to copy tables in binary format

2023-03-16 Thread shiy.f...@fujitsu.com
On Thu, Mar 16, 2023 9:20 PM Melih Mutlu   wrote:
> 
> Hi,
> 
> Please see the attached v16.
> 

Thanks for updating the patch.

+# Cannot sync due to type mismatch
+$node_subscriber->wait_for_log(qr/ERROR: ( [A-Z0-9]+:)? incorrect binary data format/);

+# Ensure the COPY command is executed in text format on the publisher
+$node_publisher->wait_for_log(qr/LOG: ( [A-Z0-9]+:)? statement: COPY (.+)? TO STDOUT\n/);

It looks like you forgot to pass `offset` into wait_for_log().

Regards,
Shi Yu


Re: pg_dump versus hash partitioning

2023-03-16 Thread Julien Rouhaud
On Thu, Mar 16, 2023 at 08:43:56AM -0400, Tom Lane wrote:
> Julien Rouhaud  writes:
> > On Mon, Mar 13, 2023 at 07:39:12PM -0400, Tom Lane wrote:
> >> Yeah, we need to do both.  Attached find an updated patch series:
>
> > I didn't find a CF entry, is it intended?
>
> Yeah, it's there:
>
> https://commitfest.postgresql.org/42/4226/

Ah thanks!  I was looking for "pg_dump" in the page.

> > I'm not sure if you intend to keep the current 0002 - 0003 separation, but
> > if yes the part about te->defn and possible fallback should be moved to
> > 0003.  In 0002 we're only looking at the COPY statement.
>
> I was intending to smash it all into one commit --- the separation is
> just to ease review.

Ok, no problem then and I agree it's better to squash them.

> > I know you're only inheriting this comment, but isn't it well outdated and
> > not accurate anymore?  I'm assuming that "archiving is not on" was an
> > acceptable way to mean "wal_level < archive" at some point, but it's now
> > completely misleading.
>
> Sure, want to propose wording?

Just mentioning the exact wal_level, something like

 * [...].  If wal_level is set to minimal this prevents
 * WAL-logging the COPY.  This obtains a speedup similar to
 * [...]


> > More generally, I also think that forcing --load-via-partition-root for
> > known unsafe partitioning seems like the best compromise.  I'm not sure
> > if we should have an option to turn it off though.
>
> I think the odds of that yielding a usable dump are nil, so I don't
> see why we should bother.

No objection from me.




Re: Add pg_walinspect function with block info columns

2023-03-16 Thread Peter Geoghegan
On Thu, Mar 16, 2023 at 2:19 AM Bharath Rupireddy
 wrote:
> On Wed, Mar 15, 2023 at 12:20 PM Michael Paquier  wrote:
> > I am not sure to get the concern here.  As long as one is smart enough
> > with SQL, there is no need to perform a double scan of the contents of
> > pg_wal with a large scan on the start LSN.  If one wishes to only
> > extract some block for a given record type, or for a filter of your
> > choice, it is possible to use a LATERAL on pg_get_wal_block_info(),
> > say:
> > SELECT r.start_lsn, b.blockid
> >   FROM pg_get_wal_records_info('0/0128', '0/1911AA8') AS r,
> >   LATERAL pg_get_wal_block_info(start_lsn, end_lsn) as b
> >   WHERE r.resource_manager = 'Heap2';
> >
> > This will extract the block information that you'd want for a given
> > record type.

The same information *already* appears in pg_get_wal_records_info()'s
block_ref output! Why should the user be expected to use a LATERAL
join (or any type of join) to get _the same information_, just in a
usable form?

> IIUC, the concern raised so far in this thread is not just on the
> performance of JOIN queries to get both block info and record level
> info, but on ease of using pg_walinspect functions. If
> pg_get_wal_block_info emits the record level information too (which
> turns out to be 50 LOC more), one doesn't have to be expert at writing
> JOIN queries or such, but just can run the function, which actually
> takes way less time (3sec) to scan the same 5mn WAL records [3].

That's exactly my concern, yes. As you say, it's not just the
performance aspect. Requiring users to write a needlessly ornamental
query is actively misleading. It suggests that block_ref is distinct
information from the blocks output by pg_get_wal_block_info().
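
For illustration, the two query shapes under discussion (LSNs are
placeholders):

    -- today: block details come from joining the two functions
    SELECT r.start_lsn, b.blockid
      FROM pg_get_wal_records_info('0/01000000', '0/02000000') AS r,
      LATERAL pg_get_wal_block_info(r.start_lsn, r.end_lsn) AS b;

    -- proposed: pg_get_wal_block_info() also emits the record-level
    -- columns, so a single call suffices
    SELECT * FROM pg_get_wal_block_info('0/01000000', '0/02000000');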

-- 
Peter Geoghegan




Re: Add LZ4 compression in pg_dump

2023-03-16 Thread Tomas Vondra



On 3/16/23 23:58, Justin Pryzby wrote:
> On Thu, Mar 16, 2023 at 11:30:50PM +0100, Tomas Vondra wrote:
>> On 3/16/23 01:20, Justin Pryzby wrote:
>>> But try reading the diff while looking for the cause of a bug.  It's the
>>> difference between reading 50, two-line changes, and reading a hunk that
>>> replaces 100 lines with a different 100 lines, with empty/unrelated
>>> lines randomly thrown in as context.
>>
>> I don't know, maybe I'm doing something wrong or maybe I just am bad at
>> looking at diffs, but if I apply the patch you submitted on 8/3 and do
>> the git-diff above (output attached), it seems pretty incomprehensible
>> to me :-( I don't see 50 two-line changes (I certainly wouldn't be able
>> to identify the root cause of the bug based on that).
> 
> It's true that most of the diff is still incomprehensible...
> 
> But look at the part relevant to the "empty-data" bug:
> 

Well, yeah. If you know where to look, and if you squint just the right
way, then you can see any bug. I don't think I'd be able to spot the bug
in the diff unless I knew in advance what the bug is.

That being said, I don't object to moving the function etc. Unless there
are alternative ideas how to fix the empty-data issue, I'll get this
committed after playing with it a bit more.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Allow logical replication to copy tables in binary format

2023-03-16 Thread Peter Smith
On Fri, Mar 17, 2023 at 12:20 AM Melih Mutlu  wrote:
>
> Hi,
>
> Please see the attached v16.
>
> Peter Smith wrote on Thu, 16 Mar 2023 at 03:03:
>>
>> Here are some review comments for v15-0001
>
>
> I applied your comments in the updated patch.

Thanks.

I checked patchv16-0001 and have only one minor comment.

==
doc/src/sgml/logical-replication.sgml

1.
diff --git a/doc/src/sgml/logical-replication.sgml
b/doc/src/sgml/logical-replication.sgml
index 6b0e300adc..bad25e54cd 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -251,7 +251,10 @@
    column of type <type>bigint</type>.  The target table can also have
    additional columns not provided by the published table.  Any such columns
    will be filled with the default value as specified in the definition of the
-   target table.
+   target table. However, logical replication in <literal>binary</literal>
+   format is more restrictive. See the <literal>binary</literal> option of
+   <link linkend="sql-createsubscription"><command>CREATE SUBSCRIPTION</command></link>
+   for details.
   </para>

IMO the sentence "However, logical replication in binary format is
more restrictive." should just be plain text.

There should not be the <literal>binary</literal> markup in that 1st sentence.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: "current directory" in a server error message

2023-03-16 Thread Kyotaro Horiguchi
At Thu, 16 Mar 2023 12:05:32 -0400, Tom Lane  wrote in 
> Kyotaro Horiguchi  writes:
> > At Thu, 16 Mar 2023 09:32:05 +0530, Bharath Rupireddy 
> >  wrote in 
> >> On Thu, Mar 16, 2023 at 7:47 AM Kyotaro Horiguchi
> >>  wrote:
> >>> Thus I think that the message should read "path must be in or below
> >>> the data directory" instead.
> 
> >> BTW, adminpack too has the same error message.
> 
> > I somehow dropped them. Thanks for pointing.
> 
> Agreed, this is an improvement.  I fixed adminpack too and pushed it.

Oh, thanks for committing this.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


Re: Amcheck verification of GiST and GIN

2023-03-16 Thread Peter Geoghegan
On Thu, Mar 16, 2023 at 4:48 PM Peter Geoghegan  wrote:
> Some feedback on the GiST patch:

I see that the Bloom filter that's used to implement heapallindexed
verification fingerprints index tuples that are formed via calls to
gistFormTuple(), without any attempt to normalize-away differences in
TOAST input state. In other words, there is nothing like
verify_nbtree.c's bt_normalize_tuple() function involved in the
fingerprinting process. Why is that safe, though? See the "toast_bug"
test case within contrib/amcheck/sql/check_btree.sql for an example of
how inconsistent TOAST input state confused verify_nbtree.c's
heapallindexed verification (before bugfix commit eba775345d). I'm
concerned about GiST heapallindexed verification being buggy in
exactly the same way, or in some way that is roughly analogous.
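
For reference, a minimal sketch of the existing B-Tree entry point whose
behavior is being compared here (relation name is illustrative):

    CREATE EXTENSION IF NOT EXISTS amcheck;
    -- fingerprints every index tuple, normalizing TOAST input state via
    -- bt_normalize_tuple(), then scans the heap and probes the Bloom filter
    SELECT bt_index_check('my_btree_idx', heapallindexed => true);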

I do have some concerns about there being analogous problems that are
unique to GiST, since GiST is an AM that gives opclass authors many
more choices than B-Tree opclass authors have. In particular, I wonder
if heapallindexed verification needs to account for how GiST
compression might end up breaking heapallindexed. I refer to the
"compression" implemented by GiST support routine 3 of GiST opclasses.
The existence of GiST support routine 7, the "same" routine, also
makes me feel a bit squeamish about heapallindexed verification -- the
existence of a "same" routine hints at some confusion about "equality
versus equivalence" issues.

In more general terms: heapallindexed verification works by
fingerprinting index tuples during the index verification stage, and
then performing Bloom filter probes in a separate CREATE INDEX style
heap-matches-index stage (obviously). There must be some justification
for our assumption that there can be no false positive corruption
reports due only to a GiST opclass (either extant or theoretical) that
follows the GiST contract, and yet allows an inconsistency to arise
that isn't really index corruption. This justification won't be easy
to come up with, since the GiST contract was not really designed with
these requirements in mind. But...we should try to come up with
something.

What are the assumptions underlying heapallindexed verification for
GiST? It doesn't have to be provably correct or anything, but it
should at least be empirically falsifiable. Basically, something that
says: "Here are our assumptions, if we were wrong in making these
assumptions then you could tell that we made a mistake because of X,
Y, Z". It's not always clear when something is corrupt. Admittedly I
have much less experience with GiST than other people, which likely
includes you (Andrey). I am likely missing some context around the
evolution of GiST. Possibly I'm making a big deal out of something
without it being helpful. Unsure.

Here is an example of the basic definition of correctness being
unclear, in a bad way: Is a HOT chain corrupt when its root
LP_REDIRECT points to an LP_DEAD item, or does that not count as
corruption? I'm pretty sure that the answer is ambiguous even today,
or was ambiguous until recently, at least. Hopefully the
verify_heapam.c HOT chain verification patch will be committed,
providing us with a clear *definition* of HOT chain corruption -- the
definition itself may not be the easy part.

On a totally unrelated note: I wonder if we should be checking that
internal page tuples have 0xffff as their offset number? Seems like
it'd be a cheap enough cross-check.

-- 
Peter Geoghegan




Re: Allow logical replication to copy tables in binary format

2023-03-16 Thread Peter Smith
On Thu, Mar 16, 2023 at 1:55 PM Amit Kapila  wrote:
>
> On Wed, Mar 15, 2023 at 3:33 PM Melih Mutlu  wrote:
> >
> > Amit Kapila wrote on Wed, 15 Mar 2023 at 12:31:
> >>
> >> On Tue, Mar 14, 2023 at 4:32 PM Melih Mutlu  wrote:
> >
> >
> >>
> >> What purpose does this test serve w.r.t this patch? Before checking
> >> the sync for different column orders, the patch has already changed
> >> binary to false, so it doesn't seem to test the functionality of this
> >> patch. Am, I missing something?
> >
> >
> > I missed that binary has changed to false before testing column orders. I 
> > moved that test case up before changing binary to false.
> > Please see v14 [1].
> >
>
> After thinking some more about this test, I don't think we need this
> test as this doesn't add any value to this patch. This tests the
> column orders which is well-established functionality of the apply
> worker.
>

I agree that different column order is a "well-established
functionality of the apply worker".

But when I searched the TAP tests I could not find any existing tests
that check the combination of
- different column orders
- CREATE SUBSCRIPTION with parameters binary=true and copy_data=true

So there seemed to be a gap in the test coverage, which is why I suggested it.
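
A minimal sketch of that combination (table names are illustrative):

    -- publisher
    CREATE TABLE t (a int, b text);
    CREATE PUBLICATION pub FOR TABLE t;

    -- subscriber: same columns, different order
    CREATE TABLE t (b text, a int);
    CREATE SUBSCRIPTION sub CONNECTION 'host=localhost dbname=postgres'
        PUBLICATION pub WITH (binary = true, copy_data = true);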

I guess that test was not strictly tied to this patch. Should I post
this new test suggestion as a separate thread or do you think there is
no point because it will not get any support?

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: slapd logs to syslog during tests

2023-03-16 Thread Andres Freund
Hi,

On 2023-03-16 16:14:58 -0400, Andrew Dunstan wrote:
> are you moving ahead with this?

I got sidetracked trying to make slapd stop any and all syslog access, but it
doesn't look like that's possible. But working on committing the logfile-only
approach now. Planning to backpatch this, unless somebody protests very soon.

Greetings,

Andres Freund




Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

2023-03-16 Thread Melanie Plageman
On Wed, Mar 15, 2023 at 9:03 PM Melanie Plageman
 wrote:
> On Sat, Mar 11, 2023 at 2:16 PM Justin Pryzby  wrote:
> > On Sat, Mar 11, 2023 at 09:55:33AM -0500, Melanie Plageman wrote:
> > This patch should add support in vacuumdb.c.  And maybe a comment about
> > adding support there, since it's annoying when it the release notes one
> > year say "support VACUUM (FOO)" and then one year later say "support
> > vacuumdb --foo".
>
> So, v4 adds support for buffer-usage-limit to vacuumdb. There are a few
> issues. The main one is that no other vacuumdb option takes a size as a
> parameter. I couldn't actually find any other client with a parameter
> specified as a size.
>
> My VACUUM option code is using the GUC size parsing code from
> parse_int() -- including the unit flag GUC_UNIT_KB. Now that vacuumdb
> also needs to parse sizes, I think we'll need to lift the parse_int()
> code and the unit_conversion struct and
> unit_conversion_memory_unit_conversion_table out of guc.c and put it
> somewhere that it can be accessed for more than guc parsing (e.g. option
> parsing).
>
> For vacuumdb in this version, I just specified buffer-usage-limit is
> only in kB and thus can only be specified as an int.
>
> If we had something like pg_parse_size() in common, would this make
> sense? It would be a little bit of work to figure out what to do about
> the flags, etc.
>
> Another issue is the server-side guc
> #define MAX_BAS_RING_SIZE_KB (16 * 1024 * 1024)
> I just redefined it in vacuumdb code. I'm not sure what the preferred
> method for dealing with this is.
>
> I know this validation would get done server-side if I just passed the
> user-specified option through, but all of the other vacuumdb options
> appear to be doing min/max boundary validation on the client side.

So, after discussing vacuumdb client-side validation off-list with Jelte,
I realized that I was trying to do too much there.

Attached v5 passes the contents of the buffer-usage-limit option to
vacuumdb unvalidated into the VACUUM command string which vacuumdb
builds. This solves most of the problems.
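
For illustration, a sketch of the command vacuumdb ends up building (option
spelling per this patch series; the size value is illustrative):

    VACUUM (BUFFER_USAGE_LIMIT '2048 kB') my_table;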

I also improved the error messages coming from VACUUM
(buffer_usage_limit) handling.

- Melanie
From 5ca6eed0c126ed050277960d0a74979d70031239 Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Wed, 22 Feb 2023 12:26:01 -0500
Subject: [PATCH v5 2/3] use shared buffers when failsafe active

---
 src/backend/access/heap/vacuumlazy.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8f14cf85f3..3de7976cf6 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2622,6 +2622,13 @@ lazy_check_wraparound_failsafe(LVRelState *vacrel)
	if (unlikely(vacuum_xid_failsafe_check(&vacrel->cutoffs)))
 	{
 		vacrel->failsafe_active = true;
+		/*
+		 * Abandon use of a buffer access strategy when entering failsafe mode,
+		 * as completing the ongoing VACUUM is our top priority.
+		 * Assume the caller who allocated the memory for the
+		 * BufferAccessStrategy object will free it.
+		 */
+		vacrel->bstrategy = NULL;
 
 		/* Disable index vacuuming, index cleanup, and heap rel truncation */
 		vacrel->do_index_vacuuming = false;
-- 
2.37.2

From a3bf94e4044f14dcd7b5c062b8417a3e0b9fe686 Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Wed, 22 Feb 2023 12:06:41 -0500
Subject: [PATCH v5 1/3] remove global variable vac_strategy

---
 src/backend/commands/vacuum.c | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index c54360a6a0..a6aac30529 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -75,7 +75,6 @@ int			vacuum_multixact_failsafe_age;
 
 /* A few variables that don't seem worth passing around as parameters */
 static MemoryContext vac_context = NULL;
-static BufferAccessStrategy vac_strategy;
 
 
 /*
@@ -94,7 +93,7 @@ static void vac_truncate_clog(TransactionId frozenXID,
 			  TransactionId lastSaneFrozenXid,
 			  MultiXactId lastSaneMinMulti);
 static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params,
-	   bool skip_privs);
+	   bool skip_privs, BufferAccessStrategy bstrategy);
 static double compute_parallel_delay(void);
 static VacOptValue get_vacoptval_from_boolean(DefElem *def);
 static bool vac_tid_reaped(ItemPointer itemptr, void *state);
@@ -338,7 +337,7 @@ vacuum(List *relations, VacuumParams *params,
 		in_outer_xact = IsInTransactionBlock(isTopLevel);
 
 	/*
-	 * Due to static variables vac_context, anl_context and vac_strategy,
+	 * Due to static variable vac_context
 	 * vacuum() is not reentrant.  This matters when VACUUM FULL or ANALYZE
 	 * calls a hostile index expression that itself calls ANALYZE.
 	 */
@@ -404,7 +403,6 @@ vacuum(List *relations, VacuumParams *params,
 		bstrategy = GetAccessStrategy(BAS_VACUUM);
 		MemoryContextSwitchTo(old_context);
 	}
-	

Re: Remove last traces of SCM credential auth from libpq?

2023-03-16 Thread Michael Paquier
On Thu, Mar 16, 2023 at 08:10:12PM -0400, Tom Lane wrote:
> Maybe flush those special messages too?  I'm not sure how long
> they've been obsolete, though.

KRB4 was switched in a159ad3 back in 2005, and KRB5 in 98de86e back in
2014 (deprecated in 8.3, so that's even older than creds).  So yes,
that could be removed as well, I guess, falling back to the default
error message.

> Nah, I see no reason to wait.  We already dropped the higher-level
> client support (psql/pg_dump) for these server versions in v15.

Okay.  I'll clean up this part today, then.
--
Michael




Re: suppressing useless wakeups in logical/worker.c

2023-03-16 Thread Nathan Bossart
I've attached a minimally-updated patch that doesn't yet address the bigger
topics under discussion.

On Thu, Mar 16, 2023 at 03:30:37PM +0530, Amit Kapila wrote:
> On Wed, Feb 1, 2023 at 5:35 AM Nathan Bossart  
> wrote:
>> On Sat, Jan 28, 2023 at 10:26:25AM +0530, Amit Kapila wrote:
>> > BTW, do we need to do something about wakeups in
>> > wait_for_relation_state_change()?
>>
>> ... and wait_for_worker_state_change(), and copy_read_data().  From a quick
>> glance, it looks like fixing these would be a more invasive change.
> 
> What kind of logic do you have in mind to avoid waking up once per
> second in those cases?

I haven't looked into this too much yet.  I'd probably try out Tom's
suggestions from upthread [0] next and see if those can be applied here,
too.

>>  TBH
>> I'm beginning to wonder whether all this is really worth it to prevent
>> waking up once per second.
> 
> If we can't do it for all cases, do you see any harm in doing it for
> cases where we can achieve it without adding much complexity? We can
> probably add comments for others so that if someone else has better
> ideas in the future we can deal with those as well.

I don't think there's any harm, but I'm also not sure it does a whole lot
of good.  At the very least, I think we should figure out something better
than the process_syncing_tables() hacks before taking this patch seriously.

[0] https://postgr.es/m/3220831.1674772625%40sss.pgh.pa.us

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From d76710d4b79e173e2d1baf1af228fe7dd8927e72 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 24 Jan 2023 21:12:28 -0800
Subject: [PATCH v4 1/1] suppress useless wakeups in logical/worker.c

---
 src/backend/replication/logical/tablesync.c |  28 +++
 src/backend/replication/logical/worker.c| 189 
 src/include/replication/worker_internal.h   |   4 +
 src/tools/pgindent/typedefs.list|   1 +
 4 files changed, 186 insertions(+), 36 deletions(-)

diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 07eea504ba..573b46b5a2 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -419,6 +419,13 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
 
 	Assert(!IsTransactionState());
 
+	/*
+	 * If we've made it past our previously-stored special wakeup time, reset
+	 * it so that it can be recalculated as needed.
+	 */
+	if (LogRepWorkerGetSyncStartWakeup() <= GetCurrentTimestamp())
+		LogRepWorkerClearSyncStartWakeup();
+
 	/* We need up-to-date sync state info for subscription tables here. */
	FetchTableStates(&started_tx);
 
@@ -592,6 +599,27 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
  DSM_HANDLE_INVALID);
 		hentry->last_start_time = now;
 	}
+	else
+	{
+		TimestampTz retry_time;
+
+		/*
+		 * Store when we can start the sync worker so that we
+		 * know how long to sleep.
+		 */
+		retry_time = TimestampTzPlusMilliseconds(hentry->last_start_time,
+ wal_retrieve_retry_interval);
+		LogRepWorkerUpdateSyncStartWakeup(retry_time);
+	}
+}
+else
+{
+	TimestampTz now = GetCurrentTimestamp();
+	TimestampTz retry_time;
+
+	/* Maybe there will be a free slot in a second... */
+	retry_time = TimestampTzPlusSeconds(now, 1);
+	LogRepWorkerUpdateSyncStartWakeup(retry_time);
 }
 			}
 		}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 10f9711972..8e68540b6f 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -208,8 +208,6 @@
 #include "utils/syscache.h"
 #include "utils/timeout.h"
 
-#define NAPTIME_PER_CYCLE 1000	/* max sleep time between cycles (1s) */
-
 typedef struct FlushPosition
 {
 	dlist_node	node;
@@ -351,6 +349,26 @@ static XLogRecPtr skip_xact_finish_lsn = InvalidXLogRecPtr;
 /* BufFile handle of the current streaming file */
 static BufFile *stream_fd = NULL;
 
+/*
+ * Reasons to wake up and perform periodic tasks.
+ */
+typedef enum LogRepWorkerWakeupReason
+{
+	LRW_WAKEUP_TERMINATE,
+	LRW_WAKEUP_PING,
+	LRW_WAKEUP_STATUS,
+	LRW_WAKEUP_SYNC_START
+#define NUM_LRW_WAKEUPS (LRW_WAKEUP_SYNC_START + 1)
+} LogRepWorkerWakeupReason;
+
+/*
+ * Wake up times for periodic tasks.
+ */
+static TimestampTz wakeup[NUM_LRW_WAKEUPS];
+
+static void LogRepWorkerComputeNextWakeup(LogRepWorkerWakeupReason reason,
+		  TimestampTz now);
+
 typedef struct SubXactInfo
 {
 	TransactionId xid;			/* XID of the subxact */
@@ -3441,10 +3459,9 @@ UpdateWorkerStats(XLogRecPtr last_lsn, TimestampTz send_time, bool reply)
 static void
 LogicalRepApplyLoop(XLogRecPtr last_received)
 {
-	TimestampTz last_recv_timestamp = GetCurrentTimestamp();
-	bool		ping_sent = false;
 	TimeLineID	tli;
 	ErrorContextCallback errcallback;
+	TimestampTz now;
 
 	/*
 	 * Init the 

Re: Dropped and generated columns might cause wrong data on subs when REPLICA IDENTITY FULL

2023-03-16 Thread Amit Kapila
On Fri, Mar 17, 2023 at 5:41 AM Tom Lane  wrote:
>
> Amit Kapila  writes:
> > On Thu, Mar 16, 2023 at 9:33 PM Önder Kalacı  wrote:
> >>> and backpatch the fix for dropped column to PG10.
>
> > You can first submit the fix for dropped columns with patches till
> > v10. Once that is committed, then you can send the patches for
> > generated columns.
>
> Don't worry about v10 --- that's out of support and shouldn't
> get patched for this.
>

Okay, thanks for reminding me.

-- 
With Regards,
Amit Kapila.




Re: Dropped and generated columns might cause wrong data on subs when REPLICA IDENTITY FULL

2023-03-16 Thread Tom Lane
Amit Kapila  writes:
> On Thu, Mar 16, 2023 at 9:33 PM Önder Kalacı  wrote:
>>> and backpatch the fix for dropped column to PG10.

> You can first submit the fix for dropped columns with patches till
> v10. Once that is committed, then you can send the patches for
> generated columns.

Don't worry about v10 --- that's out of support and shouldn't
get patched for this.

regards, tom lane




Re: Remove last traces of SCM credential auth from libpq?

2023-03-16 Thread Tom Lane
Michael Paquier  writes:
> On Thu, Mar 16, 2023 at 10:49:45AM -0400, Tom Lane wrote:
>> Also, in pg_fe_sendauth, couldn't you just let the default: case
>> handle it instead of adding a bespoke error message?  We're not
>> really expecting that anyone is ever going to hit this, so I'm
>> not convinced it's worth the translation burden.

> Yes, I was wondering if that's worth keeping or not, so I chose
> consistency with AUTH_REQ_KRB4 and AUTH_REQ_KRB5.

Maybe flush those special messages too?  I'm not sure how long
they've been obsolete, though.

> Would it be better to hold on this patch for 17~?

Nah, I see no reason to wait.  We already dropped the higher-level
client support (psql/pg_dump) for these server versions in v15.

regards, tom lane




Re: Dropped and generated columns might cause wrong data on subs when REPLICA IDENTITY FULL

2023-03-16 Thread Amit Kapila
On Thu, Mar 16, 2023 at 9:33 PM Önder Kalacı  wrote:
>
> Hi Amit, Shi Yu,
>
> > Generated columns were introduced in PG12, and I reproduced the generated
> > column problem in PG12~PG15.
> > For the dropped column problem, I reproduced it in PG10~PG15. (Logical
> > replication was introduced in PG10.)
>
> So, I'm planning to split the changes into two commits. The first one fixes
> for dropped columns, and the second one adds generated columns check/test.
>
> Is that the right approach for such a case?
>

Works for me.

> >  and backpatch the fix for dropped column to PG10.
>
> Still, even the first commit fails to apply cleanly to PG12 (and below).
>
> What is the procedure here? Should I be creating multiple patches per version?
>

You can first submit the fix for dropped columns with patches till
v10. Once that is committed, then you can send the patches for
generated columns.

> Or does the committer prefer to handle the conflicts? Depending on your reply,
> I can work on the followup.
>

Thanks for working on it.

-- 
With Regards,
Amit Kapila.




Re: Remove last traces of SCM credential auth from libpq?

2023-03-16 Thread Michael Paquier
On Thu, Mar 16, 2023 at 10:49:45AM -0400, Tom Lane wrote:
> In addition to the changes here, it looks like you could drop the
> configure/meson probes that set HAVE_STRUCT_CMSGCRED.

Right, done.

> Also, in pg_fe_sendauth, couldn't you just let the default: case
> handle it instead of adding a bespoke error message?  We're not
> really expecting that anyone is ever going to hit this, so I'm
> not convinced it's worth the translation burden.

Yes, I was wondering if that's worth keeping or not, so I chose
consistency with AUTH_REQ_KRB4 and AUTH_REQ_KRB5.

Would it be better to hold on this patch for 17~?  I have just noticed
that while looking at Jacob's patch for require_auth, so the timing is
not good.  Honestly, I don't see a reason to wait a few extra months to
remove that, particularly now that pg_dump and pg_upgrade go down to
9.2.
--
Michael
diff --git a/src/include/libpq/pqcomm.h b/src/include/libpq/pqcomm.h
index 5268d442ab..bff7dd18a2 100644
--- a/src/include/libpq/pqcomm.h
+++ b/src/include/libpq/pqcomm.h
@@ -116,7 +116,7 @@ extern PGDLLIMPORT bool Db_user_namespace;
 #define AUTH_REQ_PASSWORD	3	/* Password */
 #define AUTH_REQ_CRYPT		4	/* crypt password. Not supported any more. */
 #define AUTH_REQ_MD5		5	/* md5 password */
-#define AUTH_REQ_SCM_CREDS	6	/* transfer SCM credentials */
+/* 6 is available.  It was used for SCM creds, not supported any more. */
 #define AUTH_REQ_GSS		7	/* GSSAPI without wrap() */
 #define AUTH_REQ_GSS_CONT	8	/* Continue GSS exchanges */
 #define AUTH_REQ_SSPI		9	/* SSPI negotiate without wrap() */
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 20c82f5979..4882c70559 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -427,9 +427,6 @@
 /* Define to 1 if you have the `strsignal' function. */
 #undef HAVE_STRSIGNAL
 
-/* Define to 1 if the system has the type `struct cmsgcred'. */
-#undef HAVE_STRUCT_CMSGCRED
-
 /* Define to 1 if the system has the type `struct option'. */
 #undef HAVE_STRUCT_OPTION
 
diff --git a/src/interfaces/libpq/fe-auth.c b/src/interfaces/libpq/fe-auth.c
index a3b80dc550..fa95f8e6e9 100644
--- a/src/interfaces/libpq/fe-auth.c
+++ b/src/interfaces/libpq/fe-auth.c
@@ -688,68 +688,6 @@ pg_SASL_continue(PGconn *conn, int payloadlen, bool final)
 	return STATUS_OK;
 }
 
-/*
- * Respond to AUTH_REQ_SCM_CREDS challenge.
- *
- * Note: this is dead code as of Postgres 9.1, because current backends will
- * never send this challenge.  But we must keep it as long as libpq needs to
- * interoperate with pre-9.1 servers.  It is believed to be needed only on
- * Debian/kFreeBSD (ie, FreeBSD kernel with Linux userland, so that the
- * getpeereid() function isn't provided by libc).
- */
-static int
-pg_local_sendauth(PGconn *conn)
-{
-#ifdef HAVE_STRUCT_CMSGCRED
-	char		buf;
-	struct iovec iov;
-	struct msghdr msg;
-	struct cmsghdr *cmsg;
-	union
-	{
-		struct cmsghdr hdr;
-		unsigned char buf[CMSG_SPACE(sizeof(struct cmsgcred))];
-	}			cmsgbuf;
-
-	/*
-	 * The backend doesn't care what we send here, but it wants exactly one
-	 * character to force recvmsg() to block and wait for us.
-	 */
-	buf = '\0';
-	iov.iov_base = &buf;
-	iov.iov_len = 1;
-
-	memset(&msg, 0, sizeof(msg));
-	msg.msg_iov = &iov;
-	msg.msg_iovlen = 1;
-
-	/* We must set up a message that will be filled in by kernel */
-	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
-	msg.msg_control = &cmsgbuf.buf;
-	msg.msg_controllen = sizeof(cmsgbuf.buf);
-	cmsg = CMSG_FIRSTHDR(&msg);
-	cmsg->cmsg_len = CMSG_LEN(sizeof(struct cmsgcred));
-	cmsg->cmsg_level = SOL_SOCKET;
-	cmsg->cmsg_type = SCM_CREDS;
-
-	if (sendmsg(conn->sock, &msg, 0) == -1)
-	{
-		char		sebuf[PG_STRERROR_R_BUFLEN];
-
-		appendPQExpBuffer(&conn->errorMessage,
-		  "pg_local_sendauth: sendmsg: %s\n",
-		  strerror_r(errno, sebuf, sizeof(sebuf)));
-		return STATUS_ERROR;
-	}
-
-	conn->client_finished_auth = true;
-	return STATUS_OK;
-#else
-	libpq_append_conn_error(conn, "SCM_CRED authentication method not supported");
-	return STATUS_ERROR;
-#endif
-}
-
 static int
 pg_password_sendauth(PGconn *conn, const char *password, AuthRequest areq)
 {
@@ -830,8 +768,6 @@ auth_method_description(AuthRequest areq)
 			return libpq_gettext("server requested GSSAPI authentication");
 		case AUTH_REQ_SSPI:
 			return libpq_gettext("server requested SSPI authentication");
-		case AUTH_REQ_SCM_CREDS:
-			return libpq_gettext("server requested UNIX socket credentials");
 		case AUTH_REQ_SASL:
 		case AUTH_REQ_SASL_CONT:
 		case AUTH_REQ_SASL_FIN:
@@ -922,7 +858,6 @@ check_expected_areq(AuthRequest areq, PGconn *conn)
 			case AUTH_REQ_GSS:
 			case AUTH_REQ_GSS_CONT:
 			case AUTH_REQ_SSPI:
-			case AUTH_REQ_SCM_CREDS:
 			case AUTH_REQ_SASL:
 			case AUTH_REQ_SASL_CONT:
 			case AUTH_REQ_SASL_FIN:
@@ -1183,11 +1118,6 @@ pg_fe_sendauth(AuthRequest areq, int payloadlen, PGconn *conn)
 			}
 			break;
 
-		case AUTH_REQ_SCM_CREDS:
-			if (pg_local_sendauth(conn) != STATUS_OK)
-return STATUS_ERROR;
-			break;
-
 		default:
 			

Re: Allow logical replication to copy tables in binary format

2023-03-16 Thread Amit Kapila
On Thu, Mar 16, 2023 at 6:59 PM Melih Mutlu  wrote:
>
> Amit Kapila wrote on Thu, 16 Mar 2023 at 06:25:
>>
>> On Thu, Mar 16, 2023 at 8:27 AM Euler Taveira  wrote:
>> >
>> > On Wed, Mar 8, 2023, at 11:50 PM, Amit Kapila wrote:
>> >
>> > It is not clear to me which version check you wanted to add because we
>> > seem to have a binary option in COPY from the time prior to logical
>> > replication. I feel we need a publisher version 14 check as that is
>> > where we start to support binary mode transfer in logical replication.
>> > See the check in function libpqrcv_startstreaming().
>> >
>> > ... then you are breaking existent cases. Even if you have a convincing
>> > argument, you are introducing a behavior change in prior versions (commit
>> > messages should always indicate that you are breaking backward 
>> > compatibility).
>> >
>> > +
>> > +   /*
>> > +* The binary option for replication is supported since v14
>> > +*/
>> > +   if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 14 &&
>> > +   MySubscription->binary)
>> > +   {
>> > +   appendStringInfo(&cmd, " WITH (FORMAT binary)");
>> > +   options = lappend(options, makeDefElem("format", (Node *) 
>> > makeString("binary"), -1));
>> > +   }
>> > +
>> >
>> > What are the arguments to support since v14 instead of the to-be-released
>> > version? I read the thread but it is not clear. It was said about the
>> > restrictive nature of this feature and it will be frustrating to see that 
>> > the
>> > same setup (with the same commands) works with v14 and v15 but it doesn't 
>> > with
>> > v16.
>> >
>>
>> If the failure has to happen it will anyway happen later when the
>> publisher will be upgraded to v16. The reason for the version checks
>> as v14 was to allow the initial sync from the same version where the
>> binary mode for replication was introduced. However, if we expect
>> failures in the existing setup, I am fine with supporting this for >=
>> v16.
>
>
> Upgrading the subscriber to v16 and keeping the publisher in v14 could break 
> existing subscriptions. I don't know how likely such a case is.
>
> I don't have a strong preference on this. What do you think? Should we change 
> it >=v16 or keep it as it is?
>

I think to reduce the risk of breakage, let's change the check to
>=v16. Also, accordingly, update the doc and commit message.
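
For concreteness, a sketch of what I have in mind (assuming
walrcv_server_version() reports the full version number, e.g. 160000
for v16, as PQserverVersion() does):

    /* request a binary COPY only when the publisher is v16 or newer */
    if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 160000 &&
        MySubscription->binary)
    {
        appendStringInfo(&cmd, " WITH (FORMAT binary)");
        options = lappend(options,
                          makeDefElem("format",
                                      (Node *) makeString("binary"), -1));
    }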

-- 
With Regards,
Amit Kapila.




Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?

2023-03-16 Thread Andres Freund
Hi,

On 2023-03-07 15:44:46 +1300, Thomas Munro wrote:
> On Tue, Mar 7, 2023 at 3:42 PM Thomas Munro  wrote:
> > Apparently ye olde GCC 4.7 on "lapwing" doesn't like the way you
> > initialised that struct.  I guess it wants {{0}} instead of {0}.
> > Apparently old GCC was wrong about that warning[1], but that system
> > doesn't have the back-patched fixes?  Not sure.
> 
> Oh, you already pushed a fix.  But now I'm wondering if it's useful to
> have old buggy compilers set to run with -Werror.

I think it's actively harmful to do so. Avoiding warnings on a > 10 year old
compiler a) is a waste of time, and b) unnecessarily requires making our code
uglier.

Greetings,

Andres Freund




Re: Amcheck verification of GiST and GIN

2023-03-16 Thread Peter Geoghegan
On Sun, Feb 5, 2023 at 4:45 PM Andrey Borodin  wrote:
> Here's v24 == (v23 + a step for pg_amcheck). There's a lot of
> shotgun-style changes, but I hope next index types will be easy to add
> now.

Some feedback on the GiST patch:

* You forgot to initialize GistCheckState.heaptuplespresent to 0.

It might be better to allocate GistCheckState dynamically, using
palloc0(). That's future proof. "Simple and obvious" is usually the
most important goal for managing memory in amcheck code. It can be a
little inefficient if that makes it simpler.
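
Something like this, say (just a sketch, using the patch's
GistCheckState):

    /* zeroed at allocation, so heaptuplespresent starts out as 0 */
    GistCheckState *state = (GistCheckState *) palloc0(sizeof(GistCheckState));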

* ISTM that gist_index_check() should allow the caller to omit a
"heapallindexed" argument by specifying "DEFAULT FALSE", for
consistency with bt_index_check().

(Actually there are two versions of bt_index_check(), with
overloading, but that's just because of the way that the extension
evolved over time).

* What's the point in having a custom memory context that is never reset?

I believe that gistgetadjusted() will leak memory here, so there is a
need for some kind of high level strategy for managing memory. The
strategy within verify_nbtree.c is to call MemoryContextReset() right
after every loop iteration within bt_check_level_from_leftmost() --
which is pretty much once every call to bt_target_page_check(). That
kind of approach is obviously not going to suffer any memory leaks.

Again, "simple and obvious" is good for memory management in amcheck.

* ISTM that it would be clearer if the per-page code within
gist_check_parent_keys_consistency() was broken out into its own
function -- a little like bt_target_page_check()..

That way the control flow would be easier to understand when looking
at the code at a high level.

* ISTM that gist_refind_parent() should throw an error about
corruption in the event of a parent page somehow becoming a leaf page.

Obviously this is never supposed to happen, and likely never will
happen, even with corruption. But it seems like a good idea to make
the most conservative possible assumption by throwing an error. If it
never happens anyway, then the fact that we handle it with an error
won't matter -- so the error is harmless. If it does happen then we'll
want to hear about it as soon as possible -- so the error is useful.

* I suggest using c99 style variable declarations in loops.

Especially for things like "for (OffsetNumber offset =
FirstOffsetNumber; ... ; ... )".

-- 
Peter Geoghegan




Re: improving user.c error messages

2023-03-16 Thread Nathan Bossart
Here is a rebased patch in which I've addressed the latest feedback except
for the DropRole() part that is under discussion.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From cd6a75109471e173869a15b39342ff4882eac61f Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Thu, 26 Jan 2023 11:05:13 -0800
Subject: [PATCH v8 1/1] Improve several permission-related error messages.

---
 contrib/file_fdw/expected/file_fdw.out|   3 +-
 contrib/file_fdw/file_fdw.c   |  10 +-
 .../test_decoding/expected/permissions.out|  12 +-
 src/backend/backup/basebackup_server.c|   4 +-
 src/backend/catalog/objectaddress.c   |  16 +-
 src/backend/commands/copy.c   |  12 +-
 src/backend/commands/user.c   | 167 +-
 src/backend/replication/slot.c|   4 +-
 src/backend/storage/ipc/procarray.c   |   4 +-
 src/backend/storage/ipc/signalfuncs.c |  16 +-
 src/backend/tcop/utility.c|   5 +-
 src/backend/utils/init/postinit.c |  10 +-
 src/backend/utils/misc/guc.c  |  15 +-
 .../expected/dummy_seclabel.out   |   3 +-
 .../unsafe_tests/expected/rolenames.out   |   3 +-
 src/test/regress/expected/create_role.out |  80 ++---
 src/test/regress/expected/dependency.out  |   4 +
 src/test/regress/expected/privileges.out  |  23 ++-
 18 files changed, 282 insertions(+), 109 deletions(-)

diff --git a/contrib/file_fdw/expected/file_fdw.out b/contrib/file_fdw/expected/file_fdw.out
index f5ae29732a..72304e0ff3 100644
--- a/contrib/file_fdw/expected/file_fdw.out
+++ b/contrib/file_fdw/expected/file_fdw.out
@@ -491,7 +491,8 @@ ALTER FOREIGN TABLE agg_text OWNER TO regress_file_fdw_user;
 ALTER FOREIGN TABLE agg_text OPTIONS (SET format 'text');
 SET ROLE regress_file_fdw_user;
 ALTER FOREIGN TABLE agg_text OPTIONS (SET format 'text');
-ERROR:  only superuser or a role with privileges of the pg_read_server_files role may specify the filename option of a file_fdw foreign table
+ERROR:  permission denied to set the "filename" option of a file_fdw foreign table
+DETAIL:  Only roles with privileges of the "pg_read_server_files" role may set this option.
 SET ROLE regress_file_fdw_superuser;
 -- cleanup
 RESET ROLE;
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 99b21e8316..9e330b9934 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -279,13 +279,19 @@ file_fdw_validator(PG_FUNCTION_ARGS)
 !has_privs_of_role(GetUserId(), ROLE_PG_READ_SERVER_FILES))
 ereport(ERROR,
 		(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
-		 errmsg("only superuser or a role with privileges of the pg_read_server_files role may specify the filename option of a file_fdw foreign table")));
+		 errmsg("permission denied to set the \"%s\" option of a file_fdw foreign table",
+"filename"),
+		 errdetail("Only roles with privileges of the \"%s\" role may set this option.",
+   "pg_read_server_files")));
 
 			if (strcmp(def->defname, "program") == 0 &&
 !has_privs_of_role(GetUserId(), ROLE_PG_EXECUTE_SERVER_PROGRAM))
 ereport(ERROR,
 		(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
-		 errmsg("only superuser or a role with privileges of the pg_execute_server_program role may specify the program option of a file_fdw foreign table")));
+		 errmsg("permission denied to set the \"%s\" option of a file_fdw foreign table",
+"program"),
+		 errdetail("Only roles with privileges of the \"%s\" role may set this option.",
+   "pg_execute_server_program")));
 
 			filename = defGetString(def);
 		}
diff --git a/contrib/test_decoding/expected/permissions.out b/contrib/test_decoding/expected/permissions.out
index ed97f81dda..d6eaba8c55 100644
--- a/contrib/test_decoding/expected/permissions.out
+++ b/contrib/test_decoding/expected/permissions.out
@@ -54,13 +54,16 @@ RESET ROLE;
 -- plain user *can't* can control replication
 SET ROLE regress_lr_normal;
 SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot', 'test_decoding');
-ERROR:  must be superuser or replication role to use replication slots
+ERROR:  permission denied to use replication slots
+DETAIL:  Only roles with the REPLICATION attribute may use replication slots.
 INSERT INTO lr_test VALUES('lr_superuser_init');
 ERROR:  permission denied for table lr_test
 SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
-ERROR:  must be superuser or replication role to use replication slots
+ERROR:  permission denied to use replication slots
+DETAIL:  Only roles with the REPLICATION attribute may use replication slots.
 SELECT pg_drop_replication_slot('regression_slot');
-ERROR:  must be superuser or replication role to use replication slots
+ERROR:  permission denied to use replication slots
+DETAIL:  Only roles with the 

Re: PGDOCS - Replica Identity quotes

2023-03-16 Thread Peter Smith
A rebase was needed due to the recent REPLICA IDENTITY push [1].

PSA v2.

--
[1] 
https://github.com/postgres/postgres/commit/89e46da5e511a6970e26a020f265c9fb4b72b1d2

Kind Regards,
Peter Smith.
Fujitsu Australia


v2-0001-PGDOCS-replica-identity-quotes.patch
Description: Binary data


Re: Refactor calculations to use instr_time

2023-03-16 Thread Melanie Plageman
On Thu, Mar 09, 2023 at 04:02:44PM +0300, Nazir Bilal Yavuz wrote:
> From dcd49e48a0784a95b8731df1c6ee7c3a612a8529 Mon Sep 17 00:00:00 2001
> From: Nazir Bilal Yavuz 
> Date: Thu, 9 Mar 2023 15:35:38 +0300
> Subject: [PATCH v5] Refactor instr_time calculations
> 
> Also, some calculations are refactored to use 'INSTR_TIME_ACCUM_DIFF' instead
> of using 'INSTR_TIME_SUBTRACT' and 'INSTR_TIME_ADD'.
> ---
>  src/backend/access/transam/xlog.c   |  6 ++---
>  src/backend/storage/file/buffile.c  |  6 ++---
>  src/backend/utils/activity/pgstat_wal.c | 31 +
>  src/include/pgstat.h| 17 +-
>  src/tools/pgindent/typedefs.list|  1 +
>  5 files changed, 37 insertions(+), 24 deletions(-)
> 
> diff --git a/src/backend/utils/activity/pgstat_wal.c 
> b/src/backend/utils/activity/pgstat_wal.c
> index e8598b2f4e0..58daae3fbd6 100644
> --- a/src/backend/utils/activity/pgstat_wal.c
> +++ b/src/backend/utils/activity/pgstat_wal.c
> @@ -88,25 +88,26 @@ pgstat_flush_wal(bool nowait)
>* Calculate how much WAL usage counters were increased by subtracting 
> the
>* previous counters from the current ones.
>*/
> -	WalUsageAccumDiff(&diff, &pgWalUsage, &prevWalUsage);
> - PendingWalStats.wal_records = diff.wal_records;
> - PendingWalStats.wal_fpi = diff.wal_fpi;
> - PendingWalStats.wal_bytes = diff.wal_bytes;
> +	WalUsageAccumDiff(&wal_usage_diff, &pgWalUsage, &prevWalUsage);
>  
>   if (!nowait)
>   		LWLockAcquire(&stats_shmem->lock, LW_EXCLUSIVE);
>   	else if (!LWLockConditionalAcquire(&stats_shmem->lock, LW_EXCLUSIVE))
>   return true;
>  
> -#define WALSTAT_ACC(fld) stats_shmem->stats.fld += PendingWalStats.fld
> - WALSTAT_ACC(wal_records);
> - WALSTAT_ACC(wal_fpi);
> - WALSTAT_ACC(wal_bytes);
> - WALSTAT_ACC(wal_buffers_full);
> - WALSTAT_ACC(wal_write);
> - WALSTAT_ACC(wal_sync);
> - WALSTAT_ACC(wal_write_time);
> - WALSTAT_ACC(wal_sync_time);
> +#define WALSTAT_ACC(fld, var_to_add) \
> + (stats_shmem->stats.fld += var_to_add.fld)
> +#define WALLSTAT_ACC_INSTR_TIME_TYPE(fld) \
> + (stats_shmem->stats.fld += INSTR_TIME_GET_MICROSEC(PendingWalStats.fld))
> + WALSTAT_ACC(wal_records, wal_usage_diff);
> + WALSTAT_ACC(wal_fpi, wal_usage_diff);
> + WALSTAT_ACC(wal_bytes, wal_usage_diff);
> + WALSTAT_ACC(wal_buffers_full, PendingWalStats);
> + WALSTAT_ACC(wal_write, PendingWalStats);
> + WALSTAT_ACC(wal_sync, PendingWalStats);
> + WALLSTAT_ACC_INSTR_TIME_TYPE(wal_write_time);

I think you want one less L here?
WALLSTAT_ACC_INSTR_TIME_TYPE -> WALSTAT_ACC_INSTR_TIME_TYPE

Also, I don't quite understand why TYPE is at the end of the name. I
think it would still be clear without it.

I might find it clearer if the WALSTAT_ACC_INSTR_TIME_TYPE macro was
defined before using it for those fields instead of defining it right
after defining WALSTAT_ACC.
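
Something like this, defined up front (an untested sketch of the
rename I'm suggesting):

    #define WALSTAT_ACC(fld, var_to_add) \
        (stats_shmem->stats.fld += var_to_add.fld)
    #define WALSTAT_ACC_INSTR_TIME(fld) \
        (stats_shmem->stats.fld += INSTR_TIME_GET_MICROSEC(PendingWalStats.fld))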

> + WALLSTAT_ACC_INSTR_TIME_TYPE(wal_sync_time);
> +#undef WALLSTAT_ACC_INSTR_TIME_TYPE
>  #undef WALSTAT_ACC
>  
>   	LWLockRelease(&stats_shmem->lock);
> diff --git a/src/include/pgstat.h b/src/include/pgstat.h
> index f43fac09ede..5bbc55bb341 100644
> --- a/src/include/pgstat.h
> +++ b/src/include/pgstat.h
> @@ -442,6 +442,21 @@ typedef struct PgStat_WalStats
>   TimestampTz stat_reset_timestamp;
>  } PgStat_WalStats;
>  
> +/*
> + * This struct stores wal-related durations as instr_time, which makes it
> + * easier to accumulate them without requiring type conversions. Then,
> + * during stats flush, they will be moved into shared stats with type
> + * conversions.
> + */
> +typedef struct PgStat_PendingWalStats
> +{
> + PgStat_Counter wal_buffers_full;
> + PgStat_Counter wal_write;
> + PgStat_Counter wal_sync;
> + instr_time wal_write_time;
> + instr_time wal_sync_time;
> +} PgStat_PendingWalStats;
> +

So, I am not a fan of having this second struct (PgStat_PendingWalStats)
which only has a subset of the members of PgStat_WalStats. It is pretty
confusing.

It is okay to have two structs -- one that is basically "in-memory" and
one that is a format that can be on disk, but these two structs with
different members are confusing and don't convey why we have the two
structs.

I would either put WalUsage into PgStat_PendingWalStats (so that it has
all the same members as PgStat_WalStats), or figure out a way to
maintain WalUsage separately from PgStat_WalStats or something else.
Worst case, add more comments to the struct definitions to explain why
they have the members they have and how WalUsage relates to them.
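
For instance, the first option could look roughly like this
(hypothetical, reusing WalUsage from instrument.h so the two structs
line up member-for-member):

    typedef struct PgStat_PendingWalStats
    {
        WalUsage    wal_usage;      /* wal_records / wal_fpi / wal_bytes */
        PgStat_Counter wal_buffers_full;
        PgStat_Counter wal_write;
        PgStat_Counter wal_sync;
        instr_time  wal_write_time;
        instr_time  wal_sync_time;
    } PgStat_PendingWalStats;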

- Melanie




Re: Add LZ4 compression in pg_dump

2023-03-16 Thread Justin Pryzby
On Thu, Mar 16, 2023 at 11:30:50PM +0100, Tomas Vondra wrote:
> On 3/16/23 01:20, Justin Pryzby wrote:
> > But try reading the diff while looking for the cause of a bug.  It's the
> > difference between reading 50, two-line changes, and reading a hunk that
> > replaces 100 lines with a different 100 lines, with empty/unrelated
> > lines randomly thrown in as context.
>
> I don't know, maybe I'm doing something wrong or maybe I just am bad at
> looking at diffs, but if I apply the patch you submitted on 8/3 and do
> the git-diff above (output attached), it seems pretty incomprehensible
> to me :-( I don't see 50 two-line changes (I certainly wouldn't be able
> to identify the root cause of the bug based on that).

It's true that most of the diff is still incomprehensible...

But look at the part relevant to the "empty-data" bug:

[... incomprehensible changes elided ...]
>  static void
> -InitCompressorZlib(CompressorState *cs, int level)
> +DeflateCompressorInit(CompressorState *cs)
>  {
> + GzipCompressorState *gzipcs;
>   z_streamp   zp;
>  
> - zp = cs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
> + gzipcs = (GzipCompressorState *) 
> pg_malloc0(sizeof(GzipCompressorState));
> + zp = gzipcs->zp = (z_streamp) pg_malloc(sizeof(z_stream));
>   zp->zalloc = Z_NULL;
>   zp->zfree = Z_NULL;
>   zp->opaque = Z_NULL;
>  
>   /*
> -  * zlibOutSize is the buffer size we tell zlib it can output to.  We
> -  * actually allocate one extra byte because some routines want to 
> append a
> -  * trailing zero byte to the zlib output.
> +  * outsize is the buffer size we tell zlib it can output to.  We 
> actually
> +  * allocate one extra byte because some routines want to append a 
> trailing
> +  * zero byte to the zlib output.
>*/
> - cs->zlibOut = (char *) pg_malloc(ZLIB_OUT_SIZE + 1);
> - cs->zlibOutSize = ZLIB_OUT_SIZE;
> + gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
> + gzipcs->outsize = ZLIB_OUT_SIZE;
>  
> - if (deflateInit(zp, level) != Z_OK)
> - pg_fatal("could not initialize compression library: %s",
> -  zp->msg);
> + /* -Z 0 uses the "None" compressor -- not zlib with no compression */
> + Assert(cs->compression_spec.level != 0);
> +
> + if (deflateInit(zp, cs->compression_spec.level) != Z_OK)
> + pg_fatal("could not initialize compression library: %s", 
> zp->msg);
>  
>   /* Just be paranoid - maybe End is called after Start, with no Write */
> - zp->next_out = (void *) cs->zlibOut;
> - zp->avail_out = cs->zlibOutSize;
> + zp->next_out = gzipcs->outbuf;
> + zp->avail_out = gzipcs->outsize;
> +
> + /* Keep track of gzipcs */
> + cs->private_data = gzipcs;
>  }
>  
>  static void
> -EndCompressorZlib(ArchiveHandle *AH, CompressorState *cs)
> +DeflateCompressorEnd(ArchiveHandle *AH, CompressorState *cs)
>  {
> - z_streamp   zp = cs->zp;
> + GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
> + z_streamp   zp;
>  
> + zp = gzipcs->zp;
>   zp->next_in = NULL;
>   zp->avail_in = 0;
>  
>   /* Flush any remaining data from zlib buffer */
> - DeflateCompressorZlib(AH, cs, true);
> + DeflateCompressorCommon(AH, cs, true);
>  
>   if (deflateEnd(zp) != Z_OK)
>   pg_fatal("could not close compression stream: %s", zp->msg);
>  
> - free(cs->zlibOut);
> - free(cs->zp);
> + pg_free(gzipcs->outbuf);
> + pg_free(gzipcs->zp);
> + pg_free(gzipcs);
> + cs->private_data = NULL;
>  }
>  
>  static void
> -DeflateCompressorZlib(ArchiveHandle *AH, CompressorState *cs, bool flush)
> +DeflateCompressorCommon(ArchiveHandle *AH, CompressorState *cs, bool flush)
>  {
> - z_streamp   zp = cs->zp;
> - char   *out = cs->zlibOut;
> + GzipCompressorState *gzipcs = (GzipCompressorState *) cs->private_data;
> + z_streamp   zp = gzipcs->zp;
> + void   *out = gzipcs->outbuf;
>   int res = Z_OK;
>  
> - while (cs->zp->avail_in != 0 || flush)
> + while (gzipcs->zp->avail_in != 0 || flush)
>   {
>   res = deflate(zp, flush ? Z_FINISH : Z_NO_FLUSH);
>   if (res == Z_STREAM_ERROR)
>   pg_fatal("could not compress data: %s", zp->msg);
> - if ((flush && (zp->avail_out < cs->zlibOutSize))
> + if ((flush && (zp->avail_out < gzipcs->outsize))
>   || (zp->avail_out == 0)
>   || (zp->avail_in != 0)
>   )
> @@ -289,18 +122,18 @@ DeflateCompressorZlib(ArchiveHandle *AH, 
> CompressorState *cs, bool flush)
>* chunk is the EOF marker in the custom format. This 
> should never
>* happen but ...
>*/
> - if (zp->avail_out < cs->zlibOutSize)
> + if (zp->avail_out 

Re: Add LZ4 compression in pg_dump

2023-03-16 Thread Tomas Vondra


On 3/16/23 01:20, Justin Pryzby wrote:
> On Mon, Mar 13, 2023 at 10:47:12PM +0100, Tomas Vondra wrote:
>>> Rearrange functions to their original order allowing a cleaner diff to the 
>>> prior code;
>>
>> OK. I wasn't very enthusiastic about this initially, but after thinking
>> about it a bit I think it's meaningful to make diffs clearer. But I
>> don't see much difference with/without the patch. The
>>
>> git diff --diff-algorithm=minimal -w 
>> e9960732a~:src/bin/pg_dump/compress_io.c src/bin/pg_dump/compress_gzip.c
>>
>> Produces ~25k diff with/without the patch. What am I doing wrong?
> 
> Do you mean 25 kB of diff ?

Yes, if you redirect the git-diff to a file, it's a 25kB file.

> I agree that the statistics of the diff output don't change a lot:
> 
>   1 file changed, 201 insertions(+), 570 deletions(-)
>   1 file changed, 198 insertions(+), 548 deletions(-)
> 
> But try reading the diff while looking for the cause of a bug.  It's the
> difference between reading 50, two-line changes, and reading a hunk that
> replaces 100 lines with a different 100 lines, with empty/unrelated
> lines randomly thrown in as context.
> 
> When the diff is readable, the pg_fatal() also stands out.
> 

I don't know, maybe I'm doing something wrong or maybe I just am bad at
looking at diffs, but if I apply the patch you submitted on 8/3 and do
the git-diff above (output attached), it seems pretty incomprehensible
to me :-( I don't see 50 two-line changes (I certainly wouldn't be able
to identify the root cause of the bug based on that).

>>> Change pg_fatal() to an assertion+comment;
>>
>> Yeah, that's reasonable. I'd even ditch the assert/comment, TBH. We
>> could add such protections against "impossible" stuff to a zillion other
>> places and the confusion likely outweighs the benefits.
>>
>>> Update the commit message and fix a few typos;
>>
>> Thanks. I don't want to annoy you too much, but could you split the
>> patch into the "empty-data" fix and all the other changes (rearranging
>> functions etc.)? I'd rather not mix those in the same commit.
> 
> I don't know if that makes sense?  The "empty-data" fix creates a new
> function called DeflateCompressorInit().  My proposal was to add the new
> function in the same place in the file as it used to be.
> 

Got it. In that case I agree it's fine to do that in a single commit.

> The patch also moves the pg_fatal() that's being removed.  I don't think
> it's going to look any cleaner to read a history involving the
> pg_fatal() first being added, then moved, then removed.  Anyway, I'll
> wait while the community continues discussion about the pg_fatal().
> 

I think the agreement was to replace the pg_fatal with an assert, and I
see your patch already does that.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Companydiff --git a/src/bin/pg_dump/compress_io.c b/src/bin/pg_dump/compress_gzip.c
index 5ac21f091f0..3c9ac55c266 100644
--- a/src/bin/pg_dump/compress_io.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -1,285 +1,118 @@
 /*-
  *
- * compress_io.c
- *  Routines for archivers to write an uncompressed or compressed data
- *  stream.
+ * compress_gzip.c
+ *  Routines for archivers to read or write a gzip compressed data stream.
  *
  * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * This file includes two APIs for dealing with compressed data. The first
- * provides more flexibility, using callbacks to read/write data from the
- * underlying stream. The second API is a wrapper around fopen/gzopen and
- * friends, providing an interface similar to those, but abstracts away
- * the possible compression. Both APIs use libz for the compression, but
- * the second API uses gzip headers, so the resulting files can be easily
- * manipulated with the gzip utility.
- *
- * Compressor API
- * --
- *
- * The interface for writing to an archive consists of three functions:
- * AllocateCompressor, WriteDataToArchive and EndCompressor. First you call
- * AllocateCompressor, then write all the data by calling 
WriteDataToArchive
- * as many times as needed, and finally EndCompressor. WriteDataToArchive
- * and EndCompressor will call the WriteFunc that was provided to
- * AllocateCompressor for each chunk of compressed data.
- *
- * The interface for reading an archive consists of just one function:
- * ReadDataFromArchive. ReadDataFromArchive reads the whole compressed 
input
- * stream, by repeatedly calling the given ReadFunc. ReadFunc returns the
- * compressed data chunk at a time, and ReadDataFromArchive decompresses it
- * and passes the decompressed data to ahwrite(), until ReadFunc returns 0
- * to signal EOF.
- *
- * The interface is the same for compressed and uncompressed 

Re: A problem about ParamPathInfo for an AppendPath

2023-03-16 Thread Tom Lane
Richard Guo  writes:
> Attached is a patch for this change and the changes described upthread.

Pushed.  I thought the comment needed to be completely rewritten not just
tweaked, and I felt it was probably reasonable to continue to exclude
dummy paths from getting the more expensive treatment.

regards, tom lane




Re: slapd logs to syslog during tests

2023-03-16 Thread Thomas Munro
On Fri, Mar 17, 2023 at 9:15 AM Andrew Dunstan  wrote:
> On 2023-03-11 Sa 18:37, Andres Freund wrote:
> On my buildfarm host (for all my animals) I noted that slapd was by far the
> biggest contributor to syslog. Even though there's not normally slapd
> running. It's of course the slapds started by various tests.
>
> Would anybody mind if I add 'logfile_only' to slapd's config in LdapServer.pm?
> That still leaves a few logline, from before the config file parsing, but it's
> a lot better than all requests getting logged.
>
> Obviously I also could reconfigure syslog to just filter this stuff, but it
> seems that the tests shouldn't spam like that.
>
>
> Hi, Andres,
>
> are you moving ahead with this?

+1 for doing so.  It has befuddled me before that I had to hunt down
error messages by tracing system calls[1], and that's useless for
readers of CI/BF logs.

[1] 
https://www.postgresql.org/message-id/flat/CA%2BhUKGJdwNiwM5iWXVh050kKw5p3VCMJyoFyCpPbEf6ZNOC1pw%40mail.gmail.com#efe8bb4695171288e4e600391df3f9fa




Re: More weird compiler warnings

2023-03-16 Thread Andres Freund
On 2023-03-16 14:31:37 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2022-03-26 13:55:49 -0700, Andres Freund wrote:
> >> On 2022-03-26 16:23:26 -0400, Tom Lane wrote:
> >>> but I'm wondering if we could silence the warning by changing the loop 
> >>> condition to
> >>> while (--nb >= 0)
> >>> which seems like it might be marginally more readable anyway.
> 
> >> Yes, that looks like it silences it.  I modified the small reproducer I 
> >> had in
> >> that bug (https://godbolt.org/z/ejK9h6von) and the warning vanishes.
> 
> > The recent discussion about warnings reminded me of this. Given the gcc bug
> > hasn't been fixed, I think we should make that change. I'd vote for
> > backpatching it as well - what do you think?
> 
> +1, can't hurt anything AFAICS.

Done.




Re: doc: mentioned CREATE+ATTACH PARTITION as an alternative to CREATE TABLE..PARTITION OF

2023-03-16 Thread Tom Lane
Justin Pryzby  writes:
> On Thu, Mar 16, 2023 at 04:52:07PM -0400, Tom Lane wrote:
>   It is possible to use ALTER
>   TABLE ATTACH/DETACH PARTITION to perform these
>   operations with a weaker lock, thus reducing interference with
>   concurrent operations on the partitioned table.

> Note that in order for DETACH+DROP to use a lower lock level, it has to be
> DETACH CONCURRENTLY.  ATTACH implicitly uses a lower lock level, but for
> DETACH it's only on request.

Right, but that's the sort of detail you should read on that command's man
page, we don't need to duplicate it in N other places.

regards, tom lane




Re: Add LZ4 compression in pg_dump

2023-03-16 Thread Tomas Vondra



On 3/16/23 18:04, gkokola...@pm.me wrote:
> 
> --- Original Message ---
> On Tuesday, March 14th, 2023 at 4:32 PM, Tomas Vondra 
>  wrote:
>>
>> On 3/14/23 16:18, gkokola...@pm.me wrote:
>>
>>> ...> Would you mind me trying to come with a patch to address your points?
>>
>>
>> That'd be great, thanks. Please keep it split into smaller patches - two
>> might work, with one patch for "cosmetic" changes and the other tweaking
>> the API error-handling stuff.
> 
> Please find attached a set for it. I will admit that the splitting in the
> series might not be ideal and what you requested. It is split on what
> seemed as a logical units. Please advice how a better split can look like.
> 
> 0001 is unifying types and return values on the API
> 0002 is addressing the constant definitions
> 0003 is your previous 0004 adding comments
> 

Thanks. I think the split seems reasonable - the goal was to not mix
different changes, and from that POV it works.

I'm not sure I understand the Gzip_read/Gzip_write changes in 0001. I
mean, gzread/gzwrite returns int, so how does renaming the size_t
variable solve the issue of negative values for errors? I mean, this

-	size_t		ret;
+	size_t		gzret;

-	ret = gzread(gzfp, ptr, size);
+	gzret = gzread(gzfp, ptr, size);

means we still lost the information gzread() returned a negative value,
no? We'll still probably trigger an error, but it's a bit weird.


ISTM all this kinda assumes we're processing chunks of memory small
enough that we'll never actually overflow int - I did check what the
code in 15 does, and it seems use int and size_t quite arbitrarily.

For example cfread() seems quite sane:

int
cfread(void *ptr, int size, cfp *fp)
{
int ret;
...
ret = gzread(fp->compressedfp, ptr, size);
...
return ret;
}

but then _PrintFileData() happily stashes it into a size_t, ignoring the
signedness. Surely, if

static void
_PrintFileData(ArchiveHandle *AH, char *filename)
{
size_t		cnt;
...
while ((cnt = cfread(buf, buflen, cfp)))
{
ahwrite(buf, 1, cnt, AH);
}
...
}

Unless I'm missing something, if gzread() ever returns -1 or some other
negative error value, we'll cast it to  size_t, while condition will
evaluate to "true" and we'll happily chew on some random chunk of data.

So the confusion is (at least partially) a preexisting issue ...
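
A defensive version of that loop might look like this (a sketch, not
something the patch currently does):

    int     cnt;

    while ((cnt = cfread(buf, buflen, cfp)) > 0)
        ahwrite(buf, 1, cnt, AH);

That way a negative return value terminates the loop instead of being
cast to a huge size_t.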

For gzwrite() it seems to be fine, because that only returns 0 on error.
OTOH it's defined to take 'int size' but then we happily pass size_t
values to it.

As I wrote earlier, this apparently assumes we never need to deal with
buffers larger than int, and I don't think we have the ambition to relax
that (I'm not sure it's even needed / possible).


I see the read/write functions are now defined as int, but we only ever
return 0/1 from them, and then interpret that as bool. Why not to define
it like that? I don't think we need to adhere to the custom that
everything returns "int". This is an internal API. Or if we want to
stick to int, I'd define meaningful "nice" constants for 0/1.
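
To illustrate the bool-returning variant (a sketch, assuming the
handle keeps its gzFile in private_data as in the current patches):

    static bool
    Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
    {
        gzFile      gzfp = (gzFile) CFH->private_data;

        /* gzwrite() returns 0 on error, else the number of bytes written */
        return gzwrite(gzfp, ptr, size) > 0;
    }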



0002 seems fine to me. I see you've ditched the idea of having two
separate buffers, and replaced them with DEFAULT_IO_BUFFER_SIZE. Fine
with me, although I wonder if this might have negative impact on
performance or something (but I doubt that).

0003 seems fine too.


> As far as the error handling is concerned, you had said upthread:
> 
>> I think the right approach is to handle all library errors and not just
>> let them through. So Gzip_write() needs to check the return value, and
>> either call pg_fatal() or translate it to an error defined by the API.
> 
> While working on it, I thought it would be clearer and more consistent
> for the pg_fatal() to be called by the caller of the individual functions.
> Each individual function can keep track of the specifics of the error
> internally. Then the caller upon detecting that there was an error by
> checking the return value, can call pg_fatal() with a uniform error
> message and then add the specifics by calling the get_error_func().
> 

I agree it's cleaner the way you did it.

I was thinking that with each compression function handling error
internally, the callers would not need to do that. But I hadn't
realized there's logic to detect ENOSPC and so on, and we'd need to
duplicate that in every compression func.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Track IO times in pg_stat_io

2023-03-16 Thread Melanie Plageman
v5 attached mostly addresses instr_time persistence issues.

On Tue, Mar 14, 2023 at 6:56 PM Andres Freund  wrote:
> On 2023-03-09 11:50:38 -0500, Melanie Plageman wrote:
> > On Tue, Mar 7, 2023 at 1:39 PM Andres Freund  wrote:
> > > On 2023-03-06 11:30:13 -0500, Melanie Plageman wrote:
> > > > > As pgstat_bktype_io_stats_valid() is called only in Assert(), I think 
> > > > > that would be a good idea
> > > > > to also check that if counts are not Zero then times are not Zero.
> > > >
> > > > Yes, I think adding some validation around the relationship between
> > > > counts and timing should help prevent developers from forgetting to call
> > > > pg_stat_count_io_op() when calling pgstat_count_io_time() (as relevant).
> > > >
> > > > However, I think that we cannot check that if IO counts are non-zero
> > > > that IO times are non-zero, because the user may not have
> > > > track_io_timing enabled. We can check that if IO times are not zero, IO
> > > > counts are not zero. I've done this in the attached v3.
> > >
> > > And even if track_io_timing is enabled, the timer granularity might be so 
> > > low
> > > that we *still* get zeroes.
> > >
> > > I wonder if we should get rid of pgStatBlockReadTime, 
> > > pgStatBlockWriteTime,
> >
> > And then have pg_stat_reset_shared('io') reset pg_stat_database IO
> > stats?
>
> Yes.

I think this makes sense but I am hesitant to do it in this patchset,
because it feels a bit hidden...maybe?

But, if you feel strongly, I will make the change.

> > > >  typedef struct PgStat_BktypeIO
> > > >  {
> > > > - PgStat_Counter 
> > > > data[IOOBJECT_NUM_TYPES][IOCONTEXT_NUM_TYPES][IOOP_NUM_TYPES];
> > > > + PgStat_Counter 
> > > > counts[IOOBJECT_NUM_TYPES][IOCONTEXT_NUM_TYPES][IOOP_NUM_TYPES];
> > > > + instr_time  
> > > > times[IOOBJECT_NUM_TYPES][IOCONTEXT_NUM_TYPES][IOOP_NUM_TYPES];
> > > >  } PgStat_BktypeIO;
> > >
> > > Ah, you're going to hate me. We can't store instr_time on disk. There's
> > > another patch that gets substantial peformance gains by varying the 
> > > frequency
> > > at which instr_time keeps track of time based on the CPU frequency...
> >
> > What does that have to do with what we can store on disk?
>
> The frequency can change.

Ah, I see.

> > If so, would it not be enough to do this when reading/writing the stats
> > file?
>
> Theoretically yes. But to me it seems cleaner to do it when flushing to shared
> stats. See also the overflow issue below.
>
> > > It also just doesn't have enough range to keep track of system wide
> > > time on a larger system. A single backend won't run for 293 years, but
> > > with a few thousand backends that's a whole different story.
> > >
> > > I think we need to accumulate in instr_time, but convert to floating point
> > > when flushing stats.
> >
> > Hmmm. So, are you saying that we need to read from disk when we query
> > the view and add that to what is in shared memory? That we only store
> > the delta since the last restart in the instr_time array?
>
> No, I don't think I am suggesting that. What I am trying to suggest is that
> PendingIOStats should contain instr_time, but that PgStat_IO should contain
> PgStat_Counter as microseconds, as before.

So, I've modified the code to make a union of instr_time and
PgStat_Counter in PgStat_BktypeIO. I am not quite sure if this is okay.
I store in microsec and then in pg_stat_io, I multiply to get
milliseconds for display.

I considered refactoring pgstat_io_end() to use INSTR_TIME_ACCUM_DIFF()
like [1], but, in the end I actually think I would end up with more
operations because of the various different counters needing to be
updated. As it is now, I do a single subtract and a few adds (one for
each of the different statistics objects tracking IO times
(pgBufferUsage, pgStatBlockWrite/ReadTime). Whereas, I would need to do
an accum diff for every one of those.
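
For reference, the shape I ended up with is roughly this (illustrative
names, not the exact patch code):

    instr_time  io_time;

    INSTR_TIME_SET_CURRENT(io_time);
    INSTR_TIME_SUBTRACT(io_time, start_time);   /* one subtraction */

    /* ... then one plain add per consumer of the elapsed time */
    INSTR_TIME_ADD(PendingIOStats.times[io_object][io_context][io_op],
                   io_time);
    INSTR_TIME_ADD(pgBufferUsage.blk_write_time, io_time);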

- Melanie

[1] 
https://www.postgresql.org/message-id/1feedb83-7aa9-cb4b-5086-598349d3f555%40gmail.com
From 5c02cd9cb784bf22f756fb9d92f4bd29ba2e744a Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Mon, 6 Mar 2023 10:41:51 -0500
Subject: [PATCH v5] Track IO times in pg_stat_io

Add IO timing for reads, writes, extends, and fsyncs to pg_stat_io.

Reviewed-by: Bertrand Drouvot 
Reviewed-by: Andres Freund 
Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_ay5iKmnbXZ3DsauViF3eMxu4m1oNnJXqV_HyqYeg55Ww%40mail.gmail.com
---
 doc/src/sgml/monitoring.sgml   | 59 +++
 src/backend/catalog/system_views.sql   |  4 ++
 src/backend/storage/buffer/bufmgr.c| 56 ++-
 src/backend/storage/buffer/localbuf.c  |  6 +-
 src/backend/storage/smgr/md.c  | 27 ---
 src/backend/utils/activity/pgstat.c| 77 +++-
 src/backend/utils/activity/pgstat_io.c | 99 +-
 src/backend/utils/adt/pgstatfuncs.c| 48 +++--
 src/include/catalog/pg_proc.dat|  6 +-
 src/include/pgstat.h   | 14 +++-
 

Re: doc: mentioned CREATE+ATTACH PARTITION as an alternative to CREATE TABLE..PARTITION OF

2023-03-16 Thread Justin Pryzby
On Thu, Mar 16, 2023 at 04:52:07PM -0400, Tom Lane wrote:
> Justin Pryzby  writes:
> > On Thu, Jan 19, 2023 at 04:47:59PM -0500, Robert Treat wrote:
> >> I think all of that feedback is useful, I guess the immediate question
> >> becomes if Justin wants to try to proceed with his patch implementing
> >> the change, or if adjusting the documentation for the current
> >> implementation is the right move for now.
> 
> > The docs change is desirable in any case, since it should be
> > backpatched, and any patch to change CREATE..PARTITION OF would be for
> > v17+ anyway.
> 
> Right.  Pushed with a little further effort to align it better with
> surrounding text.

Thanks.

  It is possible to use ALTER
  TABLE ATTACH/DETACH PARTITION to perform these
  operations with a weaker lock, thus reducing interference with
  concurrent operations on the partitioned table.

Note that in order for DETACH+DROP to use a lower lock level, it has to be
DETACH CONCURRENTLY.  ATTACH implicitly uses a lower lock level, but for
DETACH it's only on request.

-- 
Justin




Re: How to check for in-progress transactions

2023-03-16 Thread Tejasvi Kashi
On Thu, Mar 16, 2023 at 17:01, Melanie Plageman 
wrote:

> On Thu, Mar 16, 2023 at 4:43 PM Tejasvi Kashi  wrote:
> > Thanks a lot for your reply. It looks like this is exactly what I need.
> For my use case, I'm trying to get read-only transactions to wait for the
> replication of prior writes.
>
> can't you use remote_apply?
>
> https://www.postgresql.org/docs/15/runtime-config-wal.html


That will ensure that the writes are acknowledged only after remote
application. But, in my case, I’m trying to get read transactions to wait
if they have seen a write that is yet to be replicated.




Re: How to check for in-progress transactions

2023-03-16 Thread Melanie Plageman
On Thu, Mar 16, 2023 at 4:43 PM Tejasvi Kashi  wrote:
> Thanks a lot for your reply. It looks like this is exactly what I need. For 
> my use case, I'm trying to get read-only transactions to wait for the 
> replication of prior writes.

can't you use remote_apply?

https://www.postgresql.org/docs/15/runtime-config-wal.html




Re: doc: mentioned CREATE+ATTACH PARTITION as an alternative to CREATE TABLE..PARTITION OF

2023-03-16 Thread Tom Lane
Justin Pryzby  writes:
> On Thu, Jan 19, 2023 at 04:47:59PM -0500, Robert Treat wrote:
>> I think all of that feedback is useful, I guess the immediate question
>> becomes if Justin wants to try to proceed with his patch implementing
>> the change, or if adjusting the documentation for the current
>> implementation is the right move for now.

> The docs change is desirable in any case, since it should be
> backpatched, and any patch to change CREATE..PARTITION OF would be for
> v17+ anyway.

Right.  Pushed with a little further effort to align it better with
surrounding text.

regards, tom lane




Re: How to check for in-progress transactions

2023-03-16 Thread Tejasvi Kashi
Hi Bharath,

Thanks a lot for your reply. It looks like this is exactly what I need. For
my use case, I'm trying to get read-only transactions to wait for the
replication of prior writes.

Sincerely,

Tej Kashi
MMath CS, University of Waterloo
Waterloo, ON, CA

On Thu, 16 Mar 2023 at 01:36, Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:

> On Thu, Mar 16, 2023 at 1:18 AM Tejasvi Kashi  wrote:
> >
> > For my use case, I'm trying to ascertain if there are any in-flight
> transactions that are yet to be replicated to synchronous standbys (in a
> synchronous streaming replication setting)
> >
> > I've been looking at sent_lsn, write_lsn, flush_lsn etc., of the
> walsender, but with no success. Considering the visibility change added
> above, is there a way for me to check for transactions that have been
> committed locally but are waiting for replication?
>
> I think you can look for SyncRep wait_event from pg_stat_activity,
> something like [1]. The backends will wait indefinitely until latch is
> set (postmaster death or an ack is received from sync standbys) in
> SyncRepWaitForLSN(). backend_xid is your
> locally-committed-but-not-yet-replicated txn id. Will this help?
>
> Well, if you're planning to know all
> locally-committed-but-not-yet-replicated txns from an extension or any
> other source code, you may run the full query [1] or if running a
> query seems costly, you can look at what pg_stat_get_activity() does
> to get each backend's wait_event_info and have your code do that.
>
> BTW, what exactly is the use-case that'd want
> locally-committed-but-not-yet-replicated txns info?
>
> [1]
> postgres=# select * from pg_stat_activity where backend_type = 'client
> backend' and wait_event = 'SyncRep';
> -[ RECORD 1 ]+--
> datid| 5
> datname  | postgres
> pid  | 4187907
> leader_pid   |
> usesysid | 10
> usename  | ubuntu
> application_name | psql
> client_addr  |
> client_hostname  |
> client_port  | -1
> backend_start| 2023-03-16 05:16:56.917124+00
> xact_start   | 2023-03-16 05:17:09.472092+00
> query_start  | 2023-03-16 05:17:09.472092+00
> state_change | 2023-03-16 05:17:09.472095+00
> wait_event_type  | IPC
> wait_event   | SyncRep
> state| active
> backend_xid  | 731
> backend_xmin | 731
> query_id |
> query| create table foo(col1 int);
> backend_type | client backend
>
> --
> Bharath Rupireddy
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com
>


Re: slapd logs to syslog during tests

2023-03-16 Thread Andrew Dunstan


On 2023-03-11 Sa 18:37, Andres Freund wrote:

Hi,

On my buildfarm host (for all my animals) I noted that slapd was by far the
biggest contributor to syslog. Even though there's not normally slapd
running. It's of course the slapds started by various tests.

Would anybody mind if I add 'logfile_only' to slapd's config in LdapServer.pm?
That still leaves a few logline, from before the config file parsing, but it's
a lot better than all requests getting logged.

Obviously I also could reconfigure syslog to just filter this stuff, but it
seems that the tests shouldn't spam like that.



Hi, Andres,


are you moving ahead with this?


cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: Experiments with Postgres and SSL

2023-03-16 Thread Greg Stark
Here's an updated patch for direct SSL connections.

I've added libpq client support with a new connection parameter. This
allows testing it easily with psql. It's still a bit hard to see
what's going on though. I'm thinking it would be good to have libpq
keep a string which describes what negotiations were attempted and
failed and what was eventually accepted, which psql could print with
the SSL message or expose some other way.

In the end I didn't see how adding an API for this really helped any
more than just saying the API is to stuff the unread data into the
Port structure. So I just documented that. If anyone has any better
idea...

I added documentation for the libpq connection setting.

One thing, I *think* it's ok to replace the send(2) call with
secure_write() in the negotiation. It does mean it's possible for the
connection to fail with FATAL at that point instead of COMMERROR but I
don't think that's a problem.

I haven't added tests. I'm not sure how to test this since to test it
properly means running the server with every permutation of ssl and
gssapi configurations.

Incidentally, some of the configuration combinations -- namely
sslnegotiation=direct and default gssencmode and sslmode results in a
counter-intuitive behaviour. But I don't see a better option that
doesn't mean making the defaults less useful.
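
For example, with the patch applied a direct connection can be
requested from C like this (a sketch; sslnegotiation is the new
parameter added by this patch):

    /* try a direct SSL connection, skipping the protocol negotiation */
    PGconn *conn = PQconnectdb("host=localhost dbname=postgres "
                               "sslmode=require sslnegotiation=direct");
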
From b07e19223bee52b7bb9b50afea39e4baaa0a46f3 Mon Sep 17 00:00:00 2001
From: Greg Stark 
Date: Thu, 16 Mar 2023 15:10:15 -0400
Subject: [PATCH v2 3/3] Direct SSL connections documentation

---
 doc/src/sgml/libpq.sgml | 102 
 1 file changed, 93 insertions(+), 9 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 3706d349ab..e2f0891ea5 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1701,10 +1701,13 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
 Note that if GSSAPI encryption is possible,
 that will be used in preference to SSL
 encryption, regardless of the value of sslmode.
-To force use of SSL encryption in an
-environment that has working GSSAPI
-infrastructure (such as a Kerberos server), also
-set gssencmode to disable.
+To negotiate SSL encryption in an environment that
+has working GSSAPI infrastructure (such as a
+Kerberos server), also set gssencmode
+to disable.
+Use of non-default values of sslnegotiation can
+also cause SSL to be used instead of
+negotiating GSSAPI encryption.

   
  
@@ -1731,6 +1734,75 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
   
  
 
+ 
+  sslnegotiation
+  
+   
+This option controls whether PostgreSQL
+will perform its protocol negotiation to request encryption from the
+server or will just directly make a standard SSL
+connection.  Traditional PostgreSQL
+protocol negotiation is the default and the most flexible with
+different server configurations. If the server is known to support
+direct SSL connections then the latter requires one
+fewer round trip reducing connection latency and also allows the use
+of protocol agnostic SSL network tools.
+   
+
+
+ 
+  postgres
+  
+   
+ perform PostgreSQL protocol
+ negotiation. This is the default if the option is not provided.
+   
+  
+ 
+
+ 
+  direct
+  
+   
+ first attempt to establish a standard SSL connection and if that
+ fails reconnect and perform the negotiation. This fallback
+ process adds significant latency if the initial SSL connection
+ fails.
+   
+  
+ 
+
+ 
+  requiredirect
+  
+   
+ attempt to establish a standard SSL connection and if that fails
+ return a connection failure immediately.
+   
+  
+ 
+
+
+   
+If sslmode set to disable
+or allow then sslnegotiation is
+ignored. If gssencmode is set
+to require then sslnegotiation
+must be the default postgres value.
+   
+
+   
+Moreover, note if gssencmode is set
+to prefer and sslnegotiation
+to direct then the effective preference will be
+direct SSL connections, followed by
+negotiated GSS connections, followed by
+negotiated SSL connections, possibly followed
+lastly by unencrypted connections.
+   
+  
+ 
+
  
   sslcompression
   
@@ -1884,11 +1956,13 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
 

 The Server Name Indication can be used by SSL-aware proxies to route
-connections without having to decrypt the SSL stream.  (Note that this
-requires a proxy that is 

Re: The use of atooid() on non-Oid results

2023-03-16 Thread Daniel Gustafsson
> On 16 Mar 2023, at 15:58, Tom Lane  wrote:
> 
> Daniel Gustafsson  writes:
>> When looking at the report in [0] an API choice in the relevant pg_upgrade 
>> code
>> path stood out as curious.  check_is_install_user() runs this query to ensure
>> that only the install user is present in the cluster:
> 
>>res = executeQueryOrDie(conn,
>>"SELECT COUNT(*) "
>>"FROM pg_catalog.pg_roles "
>>"WHERE rolname !~ '^pg_'");
> 
>> The result is then verified with the following:
> 
>>if (cluster == &new_cluster && atooid(PQgetvalue(res, 0, 0)) != 1)
>>pg_fatal("Only the install user can be defined in the new cluster.");
> 
>> This was changed from atoi() in ee646df59 with no specific comment on why.
>> This is not a bug, since atooid() will do the right thing here, but it threw 
>> me
>> off reading the code and might well confuse others.  Is there a reason not to
>> change this back to atoi() for code clarity as we're not reading an Oid here?
> 
> Hmm ... in principle, you could have more than 2^31 entries in pg_roles,
> but not more than 2^32 since they all have to have distinct OIDs.  So
> I can see the point of avoiding that theoretical overflow hazard.  But
> personally I'd probably avoid assuming anything about how wide the COUNT()
> result could be, and instead writing
> 
>   ... && strcmp(PQgetvalue(res, 0, 0), "1") != 0)

Yeah, that makes sense.  I'll go ahead with that solution instead and possibly
a brief addition to the comment.
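
So the check in check_is_install_user() would end up as (roughly):

    res = executeQueryOrDie(conn,
                            "SELECT COUNT(*) "
                            "FROM pg_catalog.pg_roles "
                            "WHERE rolname !~ '^pg_'");

    /* compare the textual count to avoid any integer-width assumption */
    if (cluster == &new_cluster &&
        strcmp(PQgetvalue(res, 0, 0), "1") != 0)
        pg_fatal("Only the install user can be defined in the new cluster.");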

--
Daniel Gustafsson





Re: [PATCH] Add pretty-printed XML output option

2023-03-16 Thread Jim Jones

On 15.03.23 22:13, Tom Lane wrote:

I wrote:
It occurred to me to test v23 for memory leaks, and it had bad ones:
* the "newline" node used in the CONTENT case never got freed.
Making another one for each line wasn't helping, either.

Oh, I really did miss that one. Thanks!

Pushed with those changes and some other cosmetic cleanup.
Thanks for working so hard on this!

Great! Thank you, Peter and Andrey for the very nice reviews.

BTW, the libxml leak problem seems to extend to other cases too.
I tested with code like

do $$
declare x xml; t text;
begin
x := '73';
for i in 1..1000 loop
   t := xmlserialize(document x as text);
end loop;
raise notice 't = %', t;
end;
$$;

That case is fine, but if you change the encoding spec to "latin1",
it leaks like mad.  That problem is not the fault of this patch,
I don't think.  I wonder if we need to do something to prevent
libxml from seeing encoding declarations other than utf8?


In my environment (libxml2 v2.9.10 and Ubuntu 22.04) I couldn't 
reproduce this memory leak. It has most likely been fixed in later 
libxml2 versions. Unfortunately, their GitLab page has no release notes 
for versions prior to 2.9.13 :(


Best, Jim





Re: More weird compiler warnings

2023-03-16 Thread Tom Lane
Andres Freund  writes:
> On 2022-03-26 13:55:49 -0700, Andres Freund wrote:
>> On 2022-03-26 16:23:26 -0400, Tom Lane wrote:
>>> but I'm wondering if we could silence the warning by changing the loop 
>>> condition to
>>> while (--nb >= 0)
>>> which seems like it might be marginally more readable anyway.

>> Yes, that looks like it silences it.  I modified the small reproducer I had 
>> in
>> that bug (https://godbolt.org/z/ejK9h6von) and the warning vanishes.

> The recent discussion about warnings reminded me of this. Given the gcc bug
> hasn't been fixed, I think we should make that change. I'd vote for
> backpatching it as well - what do you think?

+1, can't hurt anything AFAICS.

regards, tom lane




Re: Date-Time dangling unit fix

2023-03-16 Thread Tom Lane
Hearing no further comments on this, I adjusted DecodeTimeOnly to
look more like DecodeDateTime as I recommended upthread, and pushed.

regards, tom lane




Re: gcc 13 warnings

2023-03-16 Thread Andres Freund
Hi,

On 2023-03-16 13:54:29 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2023-03-16 12:10:27 -0400, Tom Lane wrote:
> >> It wouldn't be entirely surprising if meson is selecting some -W
> >> switches that the configure script doesn't ... but I don't know
> >> where to check or change that.
>
> > I think it's just that meson defaults to -O3 (fwiw, I see substantial gains 
> > of
> > that over -O2).  I see such warnings with autoconf as well if I make it use
> > -O3.
>
> Oh, interesting.  Should we try to standardize the two build systems
> on the same -O level, and if so which one?

I'm on the fence on this one (and posed it as a question before). O3 does
result in higher performance for me, but it also does take longer to build,
and increases the numbers of warnings.

So I just elected to leave it at the default for meson.


> To my mind, you should ideally get the identical built bits out of
> either system, so defaulting to a different -O level seems bad.

I doubt that is attainable, unfortunately. My experience is that even trivial
changes can lead to substantial changes in output. Even just being in a
different directory (the root build directory for meson vs the subdirectory in
make builds) apparently sometimes leads to different compiler output.


> I'm not sure if we're prepared to go to -O3 by default though,
> especially for some of the older buildfarm critters where that
> might be buggy.  (I'd imagine you take a hit in gdb-ability too.)

My experience is that debuggability is already bad enough at O2 that the
difference to O3 is pretty marginal. But it certainly depends a bit on the
compiler version and what level of debug information one enables.

Greetings,

Andres Freund




Default libpq connection parameter handling and copy-paste of apparently dead code for it?

2023-03-16 Thread Greg Stark
I notice a number of places in fe-connect.c have copied this idiom
where if an option is present they validate the legal options and
otherwise they strdup a default value. This strdup of the default
option I think is being copied from sslmode's validation which is a
bit special but afaics the following still applies to it.

   /*
* validate channel_binding option
*/
   if (conn->channel_binding)
   {
   if (strcmp(conn->channel_binding, "disable") != 0
   && strcmp(conn->channel_binding, "prefer") != 0
   && strcmp(conn->channel_binding, "require") != 0)
   {
   conn->status = CONNECTION_BAD;
   libpq_append_conn_error(conn, "invalid %s value: \"%s\"",

"channel_binding", conn->channel_binding);
   return false;
   }
   }
   else
   {
   conn->channel_binding = strdup(DefaultChannelBinding);
   if (!conn->channel_binding)
   goto oom_error;
   }

AFAICS the else branch of this is entirely dead code. These options
cannot be NULL because the default option is present in the
PQconninfoOptions array as the "compiled in default value" which
conninfo_add_defaults() will strdup in for us.

Unless conninfo_array_parse() is passed use_defaults=false, in
which case no... But why does this parameter exist? This is a static
function with one call site and this parameter is passed as true at
that call site.

So I think this is just dead from some no longer extant code path
that's being copied by new parameters that are added.

As an aside conninfo_add_defaults doesn't put the default value there
if the option is an empty string. I think we should make it do that,
effectively forcing all options to treat empty strings as missing
options. Otherwise it's annoying to use environment variables when you
want to explicitly set a parameter to a default value since it's much
less convenient to "remove" an environment variable in a shell than
pass it as an empty string. And it would just be confusing to have
empty strings behave differently from omitted parameters.
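
A sketch of what that could look like inside conninfo_add_defaults()
(hypothetical; "option" is the PQconninfoOption entry being filled in):

    /* treat an empty string the same as an unset option */
    if ((option->val == NULL || option->val[0] == '\0') &&
        option->compiled != NULL)
    {
        free(option->val);
        option->val = strdup(option->compiled);
        if (option->val == NULL)
            return false;       /* out of memory */
    }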


-- 
greg




Re: gcc 13 warnings

2023-03-16 Thread Tom Lane
Andres Freund  writes:
> On 2023-03-16 12:10:27 -0400, Tom Lane wrote:
>> It wouldn't be entirely surprising if meson is selecting some -W
>> switches that the configure script doesn't ... but I don't know
>> where to check or change that.

> I think it's just that meson defaults to -O3 (fwiw, I see substantial gains of
> that over -O2).  I see such warnings with autoconf as well if I make it use
> -O3.

Oh, interesting.  Should we try to standardize the two build systems
on the same -O level, and if so which one?

To my mind, you should ideally get the identical built bits out of
either system, so defaulting to a different -O level seems bad.
I'm not sure if we're prepared to go to -O3 by default though,
especially for some of the older buildfarm critters where that
might be buggy.  (I'd imagine you take a hit in gdb-ability too.)

regards, tom lane




Re: Remove last traces of SCM credential auth from libpq?

2023-03-16 Thread Tom Lane
"Jonathan S. Katz"  writes:
> It looks like in the po files there are a bunch of "SCM_CRED 
> authentication method not supported" messages that can also be removed.

Those will go away in the normal course of translation maintenance,
there's no need to remove them by hand.  (Generally speaking, there
is no need to ever touch the .po files except when new versions get
imported from the translation repo.)

regards, tom lane




Re: Remove last traces of SCM credential auth from libpq?

2023-03-16 Thread Jonathan S. Katz

On 3/16/23 10:49 AM, Tom Lane wrote:

> Michael Paquier  writes:
>> libpq has kept some code related to the support of authentication with
>> SCM credentials for some time now, code dead in the backend since
>> 9.1.  Wouldn't it be time to let it go and remove this code entirely,
>> erroring in libpq if attempting to connect to a server that supports
>> that?
> 
> +1.  Since that's only used on Unix-domain sockets, it could only be
> useful if you were using current libpq while talking to a pre-9.1
> server on the same machine.

+1.

> Also, in pg_fe_sendauth, couldn't you just let the default: case
> handle it instead of adding a bespoke error message?  We're not
> really expecting that anyone is ever going to hit this, so I'm
> not convinced it's worth the translation burden.

+1 to this, that was my thought as well. That would let us remove the
"AUTH_REQ_SCM_CREDS" constant too.

It looks like in the po files there are a bunch of "SCM_CRED
authentication method not supported" messages that can also be removed.

Thanks,

Jonathan




Re: gcc 13 warnings

2023-03-16 Thread Andres Freund
Hi,

On 2023-03-16 10:05:06 -0700, Andres Freund wrote:
> I think it's just that meson defaults to -O3 (fwiw, I see substantial gains of
> that over -O2).  I see such warnings with autoconf as well if I make it use
> -O3.

WRT:

In file included from 
/home/andres/src/postgresql/src/include/access/htup_details.h:22,
 from 
/home/andres/src/postgresql/src/pl/plpgsql/src/pl_exec.c:21:
In function ‘assign_simple_var’,
inlined from ‘exec_set_found’ at 
/home/andres/src/postgresql/src/pl/plpgsql/src/pl_exec.c:8307:2:
/home/andres/src/postgresql/src/include/varatt.h:230:36: warning: array 
subscript 0 is outside array bounds of ‘char[0]’ [-Warray-bounds]
  230 | (((varattrib_1b_e *) (PTR))->va_tag)
  |^
/home/andres/src/postgresql/src/include/varatt.h:94:12: note: in definition of 
macro ‘VARTAG_IS_EXPANDED’
   94 | (((tag) & ~1) == VARTAG_EXPANDED_RO)
  |^~~
/home/andres/src/postgresql/src/include/varatt.h:284:57: note: in expansion of 
macro ‘VARTAG_1B_E’
  284 | #define VARTAG_EXTERNAL(PTR)VARTAG_1B_E(PTR)
  | ^~~
/home/andres/src/postgresql/src/include/varatt.h:301:57: note: in expansion of 
macro ‘VARTAG_EXTERNAL’
  301 | (VARATT_IS_EXTERNAL(PTR) && 
!VARTAG_IS_EXPANDED(VARTAG_EXTERNAL(PTR)))
  | ^~~
/home/andres/src/postgresql/src/pl/plpgsql/src/pl_exec.c:8495:17: note: in 
expansion of macro ‘VARATT_IS_EXTERNAL_NON_EXPANDED’
 8495 | 
VARATT_IS_EXTERNAL_NON_EXPANDED(DatumGetPointer(newvalue)))
  | ^~~

I think that's basically because gcc does realize that the datum is just an 8
byte by-value datum:
assign_simple_var(estate, var, BoolGetDatum(state), false, false);

but doesn't (and probably can't, with the available information) grok that
that means we don't even get to the VARATT_IS_EXTERNAL_NON_EXPANDED() in
assign_simple_var().

If I add

    if (var->datatype->typlen == -1)
        pg_unreachable();

to exec_set_found(), the warning indeed goes away.


I've wondered before if we should make at least some Asserts() into something
like the above (if we have something better backing it than abort()), so the
compiler can understand unreachable code paths even when building without
cassert.
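
One possible shape for that, as a sketch (AssertHint is a made-up name
here, and the non-assert branch would want something better backing it
than plain abort()/unreachable semantics, as said above):

    #ifdef USE_ASSERT_CHECKING
    #define AssertHint(condition) Assert(condition)
    #else
    /* tell the optimizer the condition holds in non-assert builds */
    #define AssertHint(condition) \
        do { if (!(condition)) pg_unreachable(); } while (0)
    #endif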

Greetings,

Andres Freund




Re: More weird compiler warnings

2023-03-16 Thread Andres Freund
Hi,

On 2022-03-26 13:55:49 -0700, Andres Freund wrote:
> On 2022-03-26 16:23:26 -0400, Tom Lane wrote:
> > serinus' experimental gcc whines about a few places in network.c:
> >
> > ../../../../../pgsql/src/backend/utils/adt/network.c: In function 'inetnot':
> > ../../../../../pgsql/src/backend/utils/adt/network.c:1893:34: warning: 
> > writing 1 byte into a region of size 0 [-Wstringop-overflow=]
> >  1893 | pdst[nb] = ~pip[nb];
> >   | ~^~
> > ../../../../../pgsql/src/include/utils/inet.h:27:23: note: at offset -1 
> > into destination object 'ipaddr' of size 16
> >27 | unsigned char ipaddr[16];   /* up to 128 bits of 
> > address */
> >   |   ^~
> > ../../../../../pgsql/src/include/utils/inet.h:27:23: note: at offset -1 
> > into destination object 'ipaddr' of size 16
> >
> > The code in question looks like
> >
> > {
> >     int            nb = ip_addrsize(ip);
> >     unsigned char *pip = ip_addr(ip);
> >     unsigned char *pdst = ip_addr(dst);
> >
> >     while (nb-- > 0)
> >         pdst[nb] = ~pip[nb];
> > }
> >
> > There's nothing actually wrong with this
> 
> I reported this to the gcc folks, that's clearly a bug. I suspect that it
> might not just cause spurious warnings, but also code generation issues - but
> I don't know that part for sure.
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104986
> 
> 
> > but I'm wondering if we could silence the warning by changing the loop 
> > condition to
> >
> > while (--nb >= 0)
> >
> > which seems like it might be marginally more readable anyway.
> 
> Yes, that looks like it silences it.  I modified the small reproducer I had in
> that bug (https://godbolt.org/z/ejK9h6von) and the warning vanishes.

The recent discussion about warnings reminded me of this. Given the gcc bug
hasn't been fixed, I think we should make that change. I'd vote for
backpatching it as well - what do you think?
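
For reference, the inetnot() loop with that change applied would read:

    {
        int            nb = ip_addrsize(ip);
        unsigned char *pip = ip_addr(ip);
        unsigned char *pdst = ip_addr(dst);

        while (--nb >= 0)
            pdst[nb] = ~pip[nb];
    }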

Greetings,

Andres Freund




Re: gcc 13 warnings

2023-03-16 Thread Andres Freund
Hi,

On 2023-03-16 12:10:27 -0400, Tom Lane wrote:
> Pavel Stehule  writes:
> > On Thu, 16 Mar 2023 at 16:43, Tom Lane wrote:
> >> Hmm, I do not see any warnings on HEAD with Fedora 37's gcc 12.2.1.
> >> What non-default configure switches, CFLAGS, etc are you using?
> 
> > meson build without any settings
> > I think so it is related to meson build, I didn't see these warnings with
> > autoconf
> 
> It wouldn't be entirely surprising if meson is selecting some -W
> switches that the configure script doesn't ... but I don't know
> where to check or change that.

I think it's just that meson defaults to -O3 (fwiw, I see substantial gains of
that over -O2).  I see such warnings with autoconf as well if I make it use
-O3.

I think some of these are stemming from
https://postgr.es/m/20230204130708.pta7pjc4dvu225ey%40alap3.anarazel.de

Greetings,

Andres Freund




Re: Add LZ4 compression in pg_dump

2023-03-16 Thread gkokolatos


--- Original Message ---
On Tuesday, March 14th, 2023 at 4:32 PM, Tomas Vondra 
 wrote:


> 
> 
> 
> 
> On 3/14/23 16:18, gkokola...@pm.me wrote:
> 
> > ...> Would you mind me trying to come up with a patch to address your points?
> 
> 
> That'd be great, thanks. Please keep it split into smaller patches - two
> might work, with one patch for "cosmetic" changes and the other tweaking
> the API error-handling stuff.

Please find attached a set for it. I will admit that the split of the
series might not be ideal, nor quite what you requested. It is split into
what seemed like logical units. Please advise on what a better split could
look like.

0001 is unifying types and return values on the API
0002 is addressing the constant definitions
0003 is your previous 0004 adding comments

As far as the error handling is concerned, you had said upthread:

> I think the right approach is to handle all library errors and not just
> let them through. So Gzip_write() needs to check the return value, and
> either call pg_fatal() or translate it to an error defined by the API.

While working on it, I thought it would be clearer and more consistent
for pg_fatal() to be called by the caller of the individual functions.
Each individual function can keep track of the specifics of the error
internally. Then the caller, upon detecting an error by checking the
return value, can call pg_fatal() with a uniform error message and add
the specifics by calling get_error_func().
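
To make that concrete, a rough sketch of the caller side (the names
write_func/get_error_func and the message are placeholders here, not the
final API):

    /* caller: detect the failure via the return value, report uniformly */
    if (!cs->write_func(cs, buf, len))
        pg_fatal("could not write data: %s", cs->get_error_func(cs));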

Thoughts?

Cheers,
//Georgios

> 
> regards
> 
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

From 4aa7603d891c62bf9d95af9910b8fb4b0fe2fb10 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos 
Date: Thu, 16 Mar 2023 16:30:09 +
Subject: [PATCH v2 2/3] Clean up constants in pg_dump's compression API.

Prior to the introduction of the API, pg_dump would use the ZLIB_[IN|OUT]_SIZE
constants to handle buffer sizes throughout. This behaviour is confusing after
the introduction of the API. Amend it by introducing a DEFAULT_IO_BUFFER_SIZE
constant to use where appropriate, while giving specific compression
implementations the opportunity to use their own.

With the help and guidance of Tomas Vondra.
---
 src/bin/pg_dump/compress_gzip.c   | 22 +++---
 src/bin/pg_dump/compress_io.h |  5 ++---
 src/bin/pg_dump/compress_lz4.c| 17 +
 src/bin/pg_dump/compress_none.c   |  4 ++--
 src/bin/pg_dump/pg_backup_directory.c |  4 ++--
 5 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index 29e2fd8d50..4106d4c866 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -120,8 +120,8 @@ WriteDataToArchiveGzip(ArchiveHandle *AH, CompressorState *cs,
 		 * actually allocate one extra byte because some routines want to
 		 * append a trailing zero byte to the zlib output.
 		 */
-		gzipcs->outbuf = pg_malloc(ZLIB_OUT_SIZE + 1);
-		gzipcs->outsize = ZLIB_OUT_SIZE;
+		gzipcs->outsize = DEFAULT_IO_BUFFER_SIZE;
+		gzipcs->outbuf = pg_malloc(gzipcs->outsize + 1);
 
 		/*
 		 * A level of zero simply copies the input one block at the time. This
@@ -158,10 +158,10 @@ ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
 	zp->zfree = Z_NULL;
 	zp->opaque = Z_NULL;
 
-	buf = pg_malloc(ZLIB_IN_SIZE);
-	buflen = ZLIB_IN_SIZE;
+	buflen = DEFAULT_IO_BUFFER_SIZE;
+	buf = pg_malloc(buflen);
 
-	out = pg_malloc(ZLIB_OUT_SIZE + 1);
+	out = pg_malloc(DEFAULT_IO_BUFFER_SIZE + 1);
 
 	if (inflateInit(zp) != Z_OK)
 		pg_fatal("could not initialize compression library: %s",
@@ -176,14 +176,14 @@ ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
 		while (zp->avail_in > 0)
 		{
 			zp->next_out = (void *) out;
-			zp->avail_out = ZLIB_OUT_SIZE;
+			zp->avail_out = DEFAULT_IO_BUFFER_SIZE;
 
 			res = inflate(zp, 0);
 			if (res != Z_OK && res != Z_STREAM_END)
 pg_fatal("could not uncompress data: %s", zp->msg);
 
-			out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
-			ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+			out[DEFAULT_IO_BUFFER_SIZE - zp->avail_out] = '\0';
+			ahwrite(out, 1, DEFAULT_IO_BUFFER_SIZE - zp->avail_out, AH);
 		}
 	}
 
@@ -192,13 +192,13 @@ ReadDataFromArchiveGzip(ArchiveHandle *AH, CompressorState *cs)
 	while (res != Z_STREAM_END)
 	{
 		zp->next_out = (void *) out;
-		zp->avail_out = ZLIB_OUT_SIZE;
+		zp->avail_out = DEFAULT_IO_BUFFER_SIZE;
 		res = inflate(zp, 0);
 		if (res != Z_OK && res != Z_STREAM_END)
 			pg_fatal("could not uncompress data: %s", zp->msg);
 
-		out[ZLIB_OUT_SIZE - zp->avail_out] = '\0';
-		ahwrite(out, 1, ZLIB_OUT_SIZE - zp->avail_out, AH);
+		out[DEFAULT_IO_BUFFER_SIZE - zp->avail_out] = '\0';
+		ahwrite(out, 1, DEFAULT_IO_BUFFER_SIZE - zp->avail_out, AH);
 	}
 
 	if (inflateEnd(zp) != Z_OK)
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index 

RE: Initial Schema Sync for Logical Replication

2023-03-16 Thread Kumar, Sachin
Hi Peter,

> Hi,
> 
> I have a couple of questions.
> 
> Q1.
> 
> What happens if the subscriber already has some tables present? For
> example, I did not see the post saying anything like "Only if the table does
> not already exist then it will be created".
> 
My assumption was that if the subscriber is doing initial schema sync, it does
not have any conflicting database objects.
> On the contrary, the post seemed to say SUBREL_STATE_CREATE 'c' would
> *always* be set when this subscriber mode is enabled. And then it seemed
> to say the table would *always* get re-created by the tablesync in this new
> mode.
Right
> 
> Won't this cause problems
> - if the user wanted a slightly different subscriber-side table? (eg some 
> extra
> columns on the subscriber-side table)
> - if there was some pre-existing table data on the subscriber-side table that
> you now are about to re-create and clobber?
> 
> Or does the idea intend that the CREATE TABLE DDL that will be executed is
> like "CREATE TABLE ... IF NOT EXISTS"?
> 
pg_dump does not support --if-not-exists, but I think it can be added so that
we get a dump with IF NOT EXISTS.
On the subscriber side we get the table OID list; we can use this to change
table_state = SUBREL_STATE_INIT so that the table won't be recreated.
> ~~~
> 
> Q2.
> 
> The post says. "DDL replication is required for this feature". And "It should
> also have turned on publication of DDL commands."
> 
> It wasn't entirely clear to me why those must be a requirement. Is that just 
> to
> make implementation easier?
DDL replication is needed to facilitate concurrent DDL, so that we don't have
to worry about schema changes happening while the subscriber is initializing.
If we can block the publisher so that it does not do DDL, or if the subscriber
can simply error out when it sees conflicting table information, then we don't
need to use DDL replication.
Regards
Sachin
> 
> Sure, I see that the idea might have some (or maybe a lot?) of common
> internal code with the table DDL replication work, but OTOH an auto-create
> feature for subscriber tables seems like it might be a useful feature to have
> regardless of the value of the publication 'ddl' parameter.
> 
> --
> Kind Regards,
> Peter Smith.
> Fujitsu Australia.


Re: Transparent column encryption

2023-03-16 Thread Andres Freund
Hi,

On 2023-03-16 11:26:46 +0100, Peter Eisentraut wrote:
> On 13.03.23 22:11, Andres Freund wrote:
> > > It adds branches, and it makes tupledescs wider. In tight spots, such as
> > > printtup, that can hurt, even if the branches aren't ever entered.
> > In fact, I do see a noticable, but not huge, regression:
> 
> I tried to reproduce your measurements, but I don't have the CPU affinity
> tools like that on macOS, so the differences were lost in the noise.  (The
> absolute numbers looked very similar to yours.)
> 
> I can reproduce a regression if I keep adding more columns to pg_attribute,
> like 8 OID columns does start to show.
>
> [...]
> How do we proceed here?

Maybe a daft question, but why do we need a separate type and typmod for
encrypted columns? Why isn't the fact that the column is encrypted exactly one
new field, and we use the existing type/typmod fields?


> A lot of user-facing table management stuff like compression, statistics,
> collation, dropped columns, and probably future features like column
> reordering or periods, have to be represented in pg_attribute somehow, at
> least in the current architecture.  Do we hope that hardware keeps up with
> the pace at which we add new features?

Yea, it's not great as is. I think it's also OK to decide that the slowdown is
worth it in some cases - it just has to be a conscious decision, in the open.


> Do we need to decouple tuple descriptors from pg_attribute altogether?

Yes. Very clearly. The amount of memory and runtime we spend on tupledescs is
disproportionate. A second angle is that we build tupledescs way way too
frequently. The executor infers them everywhere, so not even prepared
statements protect against that.


> How do we decide what goes into the tuple descriptor and what does not?  I'm
> interested in addressing this, because obviously we do want the ability to
> add more features in the future, but I don't know what the direction should
> be.

We've had some prior discussion around this, see e.g.
https://postgr.es/m/20210819114435.6r532qbadcsyfscp%40alap3.anarazel.de

Greetings,

Andres Freund




Re: improving user.c error messages

2023-03-16 Thread Nathan Bossart
On Thu, Mar 16, 2023 at 04:59:53PM +0100, Peter Eisentraut wrote:
> On 16.03.23 16:48, Nathan Bossart wrote:
>> > I think the following change in DropRole() is incorrect:
>> > 
>> >     if (!is_admin_of_role(GetUserId(), roleid))
>> >         ereport(ERROR,
>> >                 (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
>> > -                errmsg("must have admin option on role \"%s\"",
>> > -                       role)));
>> > +                errmsg("permission denied to drop role"),
>> > +                errdetail("Only roles with the %s attribute and the %s option on role \"%s\" may drop this role.",
>> > +                          "CREATEROLE", "ADMIN", NameStr(roleform->rolname))));
>> > 
>> > The message does not reflect what check is actually performed.  (Perhaps
>> > this was confused with a similar but not exactly the same check in
>> > RenameRole().)
>> Hm.  Is your point that we should only mention the admin option here?  I
>> mentioned both createrole and admin option in this message (and the
>> createrole check above this point) in an attempt to avoid giving partial
>> information.
> 
> AFAICT, the mention of CREATEROLE is incorrect, because the code doesn't
> actually check for the CREATEROLE attribute.

There is a createrole check at the top of DropRole():

    /*
     * DROP ROLE
     */
    void
    DropRole(DropRoleStmt *stmt)
    {
        Relation    pg_authid_rel,
                    pg_auth_members_rel;
        ListCell   *item;
        List       *role_addresses = NIL;

        if (!have_createrole_privilege())
            ereport(ERROR,
                    (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
                     errmsg("permission denied to drop role")));

Granted, no one will see the admin option error unless they at least have
createrole, so we could leave it out, but my intent was to list the full
set of privileges required to drop the role to avoid ambiguity.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: gcc 13 warnings

2023-03-16 Thread Tom Lane
Pavel Stehule  writes:
> On Thu, 16 Mar 2023 at 16:43, Tom Lane wrote:
>> Hmm, I do not see any warnings on HEAD with Fedora 37's gcc 12.2.1.
>> What non-default configure switches, CFLAGS, etc are you using?

> meson build without any settings
> I think so it is related to meson build, I didn't see these warnings with
> autoconf

It wouldn't be entirely surprising if meson is selecting some -W
switches that the configure script doesn't ... but I don't know
where to check or change that.

If that is the case, do we want to beat meson over the head till
it stops doing that, or try to silence the warnings?  The ones
you show here don't look terribly helpful ...

regards, tom lane




Re: "current directory" in a server error message

2023-03-16 Thread Tom Lane
Kyotaro Horiguchi  writes:
> At Thu, 16 Mar 2023 09:32:05 +0530, Bharath Rupireddy 
>  wrote in 
>> On Thu, Mar 16, 2023 at 7:47 AM Kyotaro Horiguchi
>>  wrote:
>>> Thus I think that the message should read "path must be in or below
>>> the data directory" instead.

>> BTW, adminpack too has the same error message.

> I somehow dropped them. Thanks for pointing.

Agreed, this is an improvement.  I fixed adminpack too and pushed it.

regards, tom lane




Re: Dropped and generated columns might cause wrong data on subs when REPLICA IDENTITY FULL

2023-03-16 Thread Önder Kalacı
Hi Amit, Shi Yu,

> Generated column is introduced in PG12, and I reproduced generated column
> problem in PG12~PG15.
> For dropped column problem, I reproduced it in PG10~PG15. (Logical
> replication was introduced in PG10)

So, I'm planning to split the changes into two commits. The first one fixes
the dropped column problem, and the second one adds the generated column
check/test.

Is that the right approach for such a case?

>  and backpatch the fix for dropped column to PG10.

Still, even the first commit fails to apply cleanly to PG12 (and below).

What is the procedure here? Should I be creating multiple patches per
version? Or does the committer prefer to handle the conflicts? Depending on
your reply, I can work on the followup.

I'm still attaching the dropped column patch for reference.


Thanks,
Onder


shiy.f...@fujitsu.com wrote on Thu, 16 Mar 2023 at 13:38:

> On Thu, Mar 16, 2023 5:23 PM Amit Kapila  wrote:
> >
> > On Mon, Mar 13, 2023 at 6:26 PM Önder Kalacı 
> > wrote:
> > >
> > > Attaching v2
> > >
> >
> > Can we change the comment to: "Ignore dropped and generated columns as
> > the publisher doesn't send those."? After your change, att =
> > TupleDescAttr(slot1->tts_tupleDescriptor, attrnum); is done twice in
> > the same function.
> >
> > In test cases, let's change the comment to: "The bug was that when the
> > REPLICA IDENTITY FULL is used with dropped or generated columns, we
> > fail to apply updates and deletes.". Also, I think we don't need to
> > provide the email link as anyway commit message will have a link to
> > the discussion.
> >
> > Did you check this in the back branches?
> >
>
> I tried to reproduce this bug in backbranch.
>
> Generated column is introduced in PG12, and I reproduced generated column
> problem
> in PG12~PG15.
>
> For dropped column problem, I reproduced it in PG10~PG15. (Logical
> replication
> was introduced in PG10)
>
> So I think we should backpatch the fix for generated column to PG12, and
> backpatch the fix for dropped column to PG10.
>
> Regards,
> Shi Yu
>
From d25b2feac3b26646034bf3ace6ab93056c3c29af Mon Sep 17 00:00:00 2001
From: Onder Kalaci 
Date: Thu, 16 Mar 2023 18:06:54 +0300
Subject: [PATCH v1 1/2] Ignore dropped columns when REPLICA IDENTITY FULL

Dropped columns are filled with NULL values on slot_store_data()
but not on table_scan_getnextslot(). With this commit, we skip
such columns while checking tuple equality.
---
 src/backend/executor/execReplication.c |  9 -
 src/test/subscription/t/100_bugs.pl| 51 ++
 2 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index fa8628e3e1..7c8b0d2667 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -289,6 +289,13 @@ tuples_equal(TupleTableSlot *slot1, TupleTableSlot *slot2,
 		Form_pg_attribute att;
 		TypeCacheEntry *typentry;
 
+		/*
+		 * Ignore dropped columns as the publisher doesn't send those
+		 */
+		att = TupleDescAttr(slot1->tts_tupleDescriptor, attrnum);
+		if (att->attisdropped)
+			continue;
+
 		/*
 		 * If one value is NULL and other is not, then they are certainly not
 		 * equal
@@ -302,8 +309,6 @@ tuples_equal(TupleTableSlot *slot1, TupleTableSlot *slot2,
 		if (slot1->tts_isnull[attrnum] || slot2->tts_isnull[attrnum])
 			continue;
 
-		att = TupleDescAttr(slot1->tts_tupleDescriptor, attrnum);
-
 		typentry = eq[attrnum];
 		if (typentry == NULL)
 		{
diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index 036576acab..255746094c 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -373,4 +373,55 @@ is( $result, qq(1
 $node_publisher->stop('fast');
 $node_subscriber->stop('fast');
 
+# The bug was that when the REPLICA IDENTITY FULL is used with dropped columns,
+# we fail to apply updates and deletes
+my $node_publisher_d_cols = PostgreSQL::Test::Cluster->new('node_publisher_d_cols');
+$node_publisher_d_cols->init(allows_streaming => 'logical');
+$node_publisher_d_cols->start;
+
+my $node_subscriber_d_cols = PostgreSQL::Test::Cluster->new('node_subscriber_d_cols');
+$node_subscriber_d_cols->init(allows_streaming => 'logical');
+$node_subscriber_d_cols->start;
+
+$node_publisher_d_cols->safe_psql(
+	'postgres', qq(
+	CREATE TABLE dropped_cols (a int, b_drop int, c int);
+	ALTER TABLE dropped_cols REPLICA IDENTITY FULL;
+	CREATE PUBLICATION pub_dropped_cols FOR TABLE dropped_cols;
+	-- some initial data
+	INSERT INTO dropped_cols VALUES (1, 1, 1);
+));
+
+$node_subscriber_d_cols->safe_psql(
+	'postgres', qq(
+	 CREATE TABLE dropped_cols (a int, b_drop int, c int);
+));
+
+my $publisher_connstr_d_cols = $node_publisher_d_cols->connstr . ' dbname=postgres';
+$node_subscriber_d_cols->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub_dropped_cols CONNECTION '$publisher_connstr_d_cols' PUBLICATION pub_dropped_cols"
+);

Re: gcc 13 warnings

2023-03-16 Thread Pavel Stehule
On Thu, 16 Mar 2023 at 16:43, Tom Lane wrote:

> Melanie Plageman  writes:
> > On Thu, Mar 16, 2023 at 9:40 AM Pavel Stehule 
> wrote:
> >> ../src/include/varatt.h:230:36: warning: array subscript 0 is outside
> array bounds of ‘char[0]’ [-Warray-bounds=]
>
> > I see these with gcc 12.2.0 also.
>
> Hmm, I do not see any warnings on HEAD with Fedora 37's gcc 12.2.1.
> What non-default configure switches, CFLAGS, etc are you using?
>

meson build without any settings

I think so it is related to meson build, I didn't see these warnings with
autoconf

regards

Pavel



>
> regards, tom lane
>


Re: improving user.c error messages

2023-03-16 Thread Peter Eisentraut

On 16.03.23 16:48, Nathan Bossart wrote:

>> I think the following change in DropRole() is incorrect:
>>
>>     if (!is_admin_of_role(GetUserId(), roleid))
>>         ereport(ERROR,
>>                 (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
>> -                errmsg("must have admin option on role \"%s\"",
>> -                       role)));
>> +                errmsg("permission denied to drop role"),
>> +                errdetail("Only roles with the %s attribute and the %s option on role \"%s\" may drop this role.",
>> +                          "CREATEROLE", "ADMIN", NameStr(roleform->rolname))));
>>
>> The message does not reflect what check is actually performed.  (Perhaps
>> this was confused with a similar but not exactly the same check in
>> RenameRole().)
>
> Hm.  Is your point that we should only mention the admin option here?  I
> mentioned both createrole and admin option in this message (and the
> createrole check above this point) in an attempt to avoid giving partial
> information.

AFAICT, the mention of CREATEROLE is incorrect, because the code doesn't
actually check for the CREATEROLE attribute.






Re: improving user.c error messages

2023-03-16 Thread Nathan Bossart
On Thu, Mar 16, 2023 at 04:24:07PM +0100, Peter Eisentraut wrote:
> I have committed two pieces that were not message changes separately.

Thanks!

> I think the following change in DropRole() is incorrect:
> 
>     if (!is_admin_of_role(GetUserId(), roleid))
>         ereport(ERROR,
>                 (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> -                errmsg("must have admin option on role \"%s\"",
> -                       role)));
> +                errmsg("permission denied to drop role"),
> +                errdetail("Only roles with the %s attribute and the %s option on role \"%s\" may drop this role.",
> +                          "CREATEROLE", "ADMIN", NameStr(roleform->rolname))));
> 
> The message does not reflect what check is actually performed.  (Perhaps
> this was confused with a similar but not exactly the same check in
> RenameRole().)

Hm.  Is your point that we should only mention the admin option here?  I
mentioned both createrole and admin option in this message (and the
createrole check above this point) in an attempt to avoid giving partial
information.

> In file_fdw_validator(), the option names "filename" and "program" could be
> parameterized.
> 
> 
> In DropOwnedObjects() and ReassignOwnedObjects(), I suggest the following
> changes, for clarity:
> 
> - errdetail("Only roles with privileges of role \"%s\" may drop its
> objects.",
> + errdetail("Only roles with privileges of role \"%s\" may drop objects
> owned by it.",
> 
> - errdetail("Only roles with privileges of role \"%s\" may reassign its
> objects.",
> + errdetail("Only roles with privileges of role \"%s\" may reassign objects
> owned by it.",

Will do.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: gcc 13 warnings

2023-03-16 Thread Tom Lane
Melanie Plageman  writes:
> On Thu, Mar 16, 2023 at 9:40 AM Pavel Stehule  wrote:
>> ../src/include/varatt.h:230:36: warning: array subscript 0 is outside array 
>> bounds of ‘char[0]’ [-Warray-bounds=]

> I see these with gcc 12.2.0 also.

Hmm, I do not see any warnings on HEAD with Fedora 37's gcc 12.2.1.
What non-default configure switches, CFLAGS, etc are you using?

regards, tom lane




Re: improving user.c error messages

2023-03-16 Thread Peter Eisentraut

On 10.03.23 01:03, Nathan Bossart wrote:

>> By the way, I'm not sure what the separation between 0001 and 0002 is
>> supposed to be.
> 
> I'll combine them.  I first started with user.c only, but we kept finding
> new messages to improve.
> 
> I combined the patches in v7.

I have committed two pieces that were not message changes separately.

I think the following change in DropRole() is incorrect:

    if (!is_admin_of_role(GetUserId(), roleid))
        ereport(ERROR,
                (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
-                errmsg("must have admin option on role \"%s\"",
-                       role)));
+                errmsg("permission denied to drop role"),
+                errdetail("Only roles with the %s attribute and the %s option on role \"%s\" may drop this role.",
+                          "CREATEROLE", "ADMIN", NameStr(roleform->rolname))));

The message does not reflect what check is actually performed.  (Perhaps
this was confused with a similar but not exactly the same check in
RenameRole().)

That was the only "factual" error that I found.

In file_fdw_validator(), the option names "filename" and "program" could
be parameterized.

In DropOwnedObjects() and ReassignOwnedObjects(), I suggest the
following changes, for clarity:

- errdetail("Only roles with privileges of role \"%s\" may drop its objects.",
+ errdetail("Only roles with privileges of role \"%s\" may drop objects owned by it.",

- errdetail("Only roles with privileges of role \"%s\" may reassign its objects.",
+ errdetail("Only roles with privileges of role \"%s\" may reassign objects owned by it.",

The rest looks okay to me.





Re: Progress report of CREATE INDEX for nested partitioned tables

2023-03-16 Thread Ilya Gladyshev



> On 16 Mar 2023, at 04:07, Justin Pryzby wrote:
> 
> On Tue, Mar 14, 2023 at 06:58:14PM +0400, Ilya Gladyshev wrote:
>>> The only change from the current patch is (3).  (1) still calls
>>> count_leaf_partitions(), but only once.  I'd prefer that to rearranging
>>> the progress reporting to set the TOTAL in ProcessUtilitySlow().
>> 
>> As for reusing TOTAL calculated outside of DefineIndex, as I can see, 
>> ProcessUtilitySlow is not the only call site for DefineIndex (although, I 
>> don’t know whether all of them need progress tracking), for instance, there 
>> is ALTER TABLE that calls DefineIndex to create index for constraints. So I 
>> feel like rearranging progress reporting will result in unnecessary code 
>> duplication in those call sites, so passing in an optional parameter seems 
>> to be easier here, if we are going to optimize it, after all. Especially if 
>> back-patching is a non-issue.
> 
> Yeah.  See attached.  I don't like duplicating the loop.  Is this really
> the right direction to go ?
> 
> I haven't verified if the child tables are locked in all the paths which
> would call count_leaf_partitions().  But why is it important to lock
> them for this?  If they weren't locked before, that'd be a pre-existing
> problem...
> <0001-fix-CREATE-INDEX-progress-report-with-nested-partiti.patch>

I'm not sure what the general policy on locking is, but I have checked ALTER
TABLE ADD INDEX, and all the partitions seem to be locked on the first entry
to DefineIndex there. All other call sites pass in the parentIndexId, which
means the progress tracking machinery will not be initialized, so I think we
don't need to do locking in count_leaf_partitions().

The approach in the patch looks good to me. Some nitpicks on the patch: 
1. There's an unnecessary second call to get_rel_relkind in ProcessUtilitySlow;
we can just use what's already in the variable relkind.
2. We can also combine else and if to have one less nested level like that:
+   else if (!RELKIND_HAS_PARTITIONS(child_relkind))

3. There was a part of the comment saying "If the index was built by calling 
DefineIndex() recursively, the called function is responsible for updating the 
progress report for built indexes.", I think it is still useful to have it 
there.






Re: The use of atooid() on non-Oid results

2023-03-16 Thread Tom Lane
Daniel Gustafsson  writes:
> When looking at the report in [0] an API choice in the relevant pg_upgrade 
> code
> path stood out as curious.  check_is_install_user() runs this query to ensure
> that only the install user is present in the cluster:

> res = executeQueryOrDie(conn,
> "SELECT COUNT(*) "
> "FROM pg_catalog.pg_roles "
> "WHERE rolname !~ '^pg_'");

> The result is then verified with the following:

> if (cluster == &new_cluster && atooid(PQgetvalue(res, 0, 0)) != 1)
> pg_fatal("Only the install user can be defined in the new cluster.");

> This was changed from atoi() in ee646df59 with no specific comment on why.
> This is not a bug, since atooid() will do the right thing here, but it threw 
> me
> off reading the code and might well confuse others.  Is there a reason not to
> change this back to atoi() for code clarity as we're not reading an Oid here?

Hmm ... in principle, you could have more than 2^31 entries in pg_roles,
but not more than 2^32 since they all have to have distinct OIDs.  So
I can see the point of avoiding that theoretical overflow hazard.  But
personally I'd probably avoid assuming anything about how wide the COUNT()
result could be, and instead writing

... && strcmp(PQgetvalue(res, 0, 0), "1") != 0)
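
Putting the two pieces together, the check would then read:

    if (cluster == &new_cluster &&
        strcmp(PQgetvalue(res, 0, 0), "1") != 0)
        pg_fatal("Only the install user can be defined in the new cluster.");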

regards, tom lane




Re: Remove last traces of SCM credential auth from libpq?

2023-03-16 Thread Tom Lane
Michael Paquier  writes:
> libpq has kept some code related to the support of authentication with
> SCM credentials for some time now, code dead in the backend since
> 9.1.  Wouldn't it be time to let it go and remove this code entirely,
> erroring in libpq if attempting to connect to a server that supports
> that?

+1.  Since that's only used on Unix-domain sockets, it could only be
useful if you were using current libpq while talking to a pre-9.1
server on the same machine.  That seems fairly unlikely --- and if
you did have to do that, you could still connect, just not with peer
auth.  You'd be suffering other quality-of-life problems too,
because we removed support for such old servers from psql and pg_dump
awhile ago.

> Hard to say if this is actually working these days.

I didn't trace the old discussions, but the commit that removed the
server-side support (be4585b1c) mentions something about portability
issues with that code ... so it's rather likely that it didn't work
anyway.

In addition to the changes here, it looks like you could drop the
configure/meson probes that set HAVE_STRUCT_CMSGCRED.

Also, in pg_fe_sendauth, couldn't you just let the default: case
handle it instead of adding a bespoke error message?  We're not
really expecting that anyone is ever going to hit this, so I'm
not convinced it's worth the translation burden.

regards, tom lane




Re: gcc 13 warnings

2023-03-16 Thread Melanie Plageman
On Thu, Mar 16, 2023 at 9:40 AM Pavel Stehule  wrote:
> [1700/2287] Compiling C object src/pl/plpgsql/src/plpgsql.so.p/pl_exec.c.o
> In file included from ../src/include/access/htup_details.h:22,
>  from ../src/pl/plpgsql/src/pl_exec.c:21:
> In function ‘assign_simple_var’,
> inlined from ‘exec_set_found’ at ../src/pl/plpgsql/src/pl_exec.c:8307:2:
> ../src/include/varatt.h:230:36: warning: array subscript 0 is outside array 
> bounds of ‘char[0]’ [-Warray-bounds=]
>   230 | (((varattrib_1b_e *) (PTR))->va_tag)
>   |^
> ../src/include/varatt.h:94:12: note: in definition of macro 
> ‘VARTAG_IS_EXPANDED’
>94 | (((tag) & ~1) == VARTAG_EXPANDED_RO)
>   |^~~
> ../src/include/varatt.h:284:57: note: in expansion of macro ‘VARTAG_1B_E’
>   284 | #define VARTAG_EXTERNAL(PTR)
> VARTAG_1B_E(PTR)
>   | ^~~
> ../src/include/varatt.h:301:57: note: in expansion of macro ‘VARTAG_EXTERNAL’
>   301 | (VARATT_IS_EXTERNAL(PTR) && 
> !VARTAG_IS_EXPANDED(VARTAG_EXTERNAL(PTR)))
>   | 
> ^~~
> ../src/pl/plpgsql/src/pl_exec.c:8495:17: note: in expansion of macro 
> ‘VARATT_IS_EXTERNAL_NON_EXPANDED’
>  8495 | 
> VARATT_IS_EXTERNAL_NON_EXPANDED(DatumGetPointer(newvalue)))
>   | ^~~
> In function ‘exec_set_found’:
> cc1: note: source object is likely at address zero

I see these with gcc 12.2.0 also.




Re: Move defaults toward ICU in 16?

2023-03-16 Thread Peter Eisentraut

On 09.03.23 20:14, Jeff Davis wrote:

> Let's come back to that after dealing with the other two.

Leaving 0001 open for now.

I suspect making a change like this now would result in a bloodbath on
the build farm that we could do without.  I suggest revisiting this
after the commit fest ends.





gcc 13 warnings

2023-03-16 Thread Pavel Stehule
Hi

see

[504/2287] Compiling C object
src/backend/postgres_lib.a.p/access_transam_xlogrecovery.c.o
In function ‘recoveryStopsAfter’,
inlined from ‘PerformWalRecovery’ at
../src/backend/access/transam/xlogrecovery.c:1749:8:
../src/backend/access/transam/xlogrecovery.c:2737:42: warning:
‘recordXtime’ may be used uninitialized [-Wmaybe-uninitialized]
 2737 | recoveryStopTime = recordXtime;
  | ~^
../src/backend/access/transam/xlogrecovery.c: In function
‘PerformWalRecovery’:
../src/backend/access/transam/xlogrecovery.c:2628:21: note: ‘recordXtime’
was declared here
 2628 | TimestampTz recordXtime;
  | ^~~
[1642/2287] Compiling C object src/bin/pgbench/pgbench.p/pgbench.c.o
In function ‘coerceToInt’,
inlined from ‘evalStandardFunc’ at ../src/bin/pgbench/pgbench.c:2617:11:
../src/bin/pgbench/pgbench.c:2042:17: warning: ‘vargs[0].type’ may be used
uninitialized [-Wmaybe-uninitialized]
 2042 | if (pval->type == PGBT_INT)
  | ^~
../src/bin/pgbench/pgbench.c: In function ‘evalStandardFunc’:
../src/bin/pgbench/pgbench.c:2250:22: note: ‘vargs’ declared here
 2250 | PgBenchValue vargs[MAX_FARGS];
  |  ^
In function ‘coerceToInt’,
inlined from ‘evalStandardFunc’ at ../src/bin/pgbench/pgbench.c:2617:11:
../src/bin/pgbench/pgbench.c:2044:32: warning: ‘vargs[0].u.ival’ may be
used uninitialized [-Wmaybe-uninitialized]
 2044 | *ival = pval->u.ival;
  | ~~~^
../src/bin/pgbench/pgbench.c: In function ‘evalStandardFunc’:
../src/bin/pgbench/pgbench.c:2250:22: note: ‘vargs’ declared here
 2250 | PgBenchValue vargs[MAX_FARGS];
  |  ^
In function ‘coerceToInt’,
inlined from ‘evalStandardFunc’ at ../src/bin/pgbench/pgbench.c:2617:11:
../src/bin/pgbench/pgbench.c:2049:40: warning: ‘vargs[0].u.dval’ may be
used uninitialized [-Wmaybe-uninitialized]
 2049 | double  dval = rint(pval->u.dval);
  |^~
../src/bin/pgbench/pgbench.c: In function ‘evalStandardFunc’:
../src/bin/pgbench/pgbench.c:2250:22: note: ‘vargs’ declared here
 2250 | PgBenchValue vargs[MAX_FARGS];
  |  ^
[1700/2287] Compiling C object src/pl/plpgsql/src/plpgsql.so.p/pl_exec.c.o
In file included from ../src/include/access/htup_details.h:22,
 from ../src/pl/plpgsql/src/pl_exec.c:21:
In function ‘assign_simple_var’,
inlined from ‘exec_set_found’ at ../src/pl/plpgsql/src/pl_exec.c:8307:2:
../src/include/varatt.h:230:36: warning: array subscript 0 is outside array
bounds of ‘char[0]’ [-Warray-bounds=]
  230 | (((varattrib_1b_e *) (PTR))->va_tag)
  |^
../src/include/varatt.h:94:12: note: in definition of macro
‘VARTAG_IS_EXPANDED’
   94 | (((tag) & ~1) == VARTAG_EXPANDED_RO)
  |^~~
../src/include/varatt.h:284:57: note: in expansion of macro ‘VARTAG_1B_E’
  284 | #define VARTAG_EXTERNAL(PTR)
 VARTAG_1B_E(PTR)
  | ^~~
../src/include/varatt.h:301:57: note: in expansion of macro
‘VARTAG_EXTERNAL’
  301 | (VARATT_IS_EXTERNAL(PTR) &&
!VARTAG_IS_EXPANDED(VARTAG_EXTERNAL(PTR)))
  |
^~~
../src/pl/plpgsql/src/pl_exec.c:8495:17: note: in expansion of macro
‘VARATT_IS_EXTERNAL_NON_EXPANDED’
 8495 |
VARATT_IS_EXTERNAL_NON_EXPANDED(DatumGetPointer(newvalue)))
  | ^~~
In function ‘exec_set_found’:
cc1: note: source object is likely at address zero

Regards

Pavel


FW: uuid-ossp source or binaries for Windows

2023-03-16 Thread Mark Hill
I posted this to pgsql-general but I think that's more for Postgres users.   
I'm trying to build Postgres 
with the uuid-ossp extension on Windows using the msvc toolset provided with 
the Postgres source
in /src/tools/msvc, e.g. postgresql-14.7/src/tools/msvc.

I think I need uuid-ossp installed.  The uuid-ossp source located here:  
ftp://ftp.ossp.org/pkg/lib/uuid/uuid-1.6.2.tar.gz
is Unix-specific.

Is there uuid-ossp source download for Windows or are uuid-ossp prebuilt 
binaries for Windows available?

Thanks, Mark

-Original Message-
From: Mark Hill  
Sent: Wednesday, March 15, 2023 11:15 PM
To: 'Daniel Gustafsson' 
Cc: pgsql-gene...@lists.postgresql.org; Ken Peressini ; 
Michael King 
Subject: RE: uuid-ossp source or binaries for Windows


Hey Daniel,

Thanks for getting back to me.

I think the issue I'm having is that my build of Postgres is missing uuid 
pieces needed by our users.

They're executing the command:   CREATE EXTENSION "uuid-ossp"

and getting the error

ERROR:  could not open extension control file 
"/share/extension/uuid-ossp.control"

The only file matching "*uuid*" in my build of Postgres is:   
/include/server/utils/uuid.h

I should have in addition:
/include/uuid.h
/lib/uuid-ossp.dll
/share/extension/uuid-ossp--1.1.sql
/share/extension/uuid-ossp.control
/share/extension/uuid-ossp--unpackaged--1.0.sql
/share/extension/uuid-ossp--1.0--1.1.sql

I need a Windows-specific install of uuid-ossp for the Postgres build to use, 
for both 32bit and 64bit Windows.

Thanks, Mark

-Original Message-
From: Daniel Gustafsson 
Sent: Wednesday, March 15, 2023 3:16 PM
To: Mark Hill 
Cc: pgsql-gene...@lists.postgresql.org; Ken Peressini ; 
Michael King 
Subject: Re: uuid-ossp source or binaries for Windows


> On 15 Mar 2023, at 19:31, Mark Hill  wrote:
>
> I've downloaded the PostgreSQL 14.7 source and building it on Windows 64bit 
> and 32bit.
>
> I'm using the Visual Studio tools in the src/tools/msvc folder.
>
> I'm trying to build with the uuid extension but it looks like I need 
> uuid-ossp installed in order to get it to work.

Do you need the extension specifically or does the built-in generator function 
do what you need?

> The source download referenced in the Postgresql doc here,
> https://www.postgresql.org/docs/current/uuid-ossp.html#id-1.11.7.58.6,
> this source download,
> ftp://ftp.ossp.org/pkg/lib/uuid/uuid-1.6.2.tar.gz,
> is Unix-specific as far as I can tell.
>
> Where can I find uuid-ossp for Windows, 32 and 64 bit, either the 
> source so I can build it or prebuilt libraries?

I don't know windows at all, but uuid-ossp.dll is provided in the EDB packages 
(looking at the binary zip bundle) so it's clearly available to be built.
Maybe someone from EDB can chime in with pointers for building on Windows so we 
can update the docs accordingly?

--
Daniel Gustafsson







Re: Allow logical replication to copy tables in binary format

2023-03-16 Thread Melih Mutlu
Amit Kapila wrote on Thu, 16 Mar 2023 at 06:25:

> On Thu, Mar 16, 2023 at 8:27 AM Euler Taveira  wrote:
> >
> > On Wed, Mar 8, 2023, at 11:50 PM, Amit Kapila wrote:
> >
> > It is not clear to me which version check you wanted to add because we
> > seem to have a binary option in COPY from the time prior to logical
> > replication. I feel we need a publisher version 14 check as that is
> > where we start to support binary mode transfer in logical replication.
> > See the check in function libpqrcv_startstreaming().
> >
> > ... then you are breaking existent cases. Even if you have a convincing
> > argument, you are introducing a behavior change in prior versions (commit
> > messages should always indicate that you are breaking backward
> compatibility).
> >
> > +
> > +   /*
> > +* The binary option for replication is supported since v14
> > +*/
> > +   if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 14 &&
> > +   MySubscription->binary)
> > +   {
> > +   appendStringInfo(, " WITH (FORMAT binary)");
> > +   options = lappend(options, makeDefElem("format", (Node *)
> makeString("binary"), -1));
> > +   }
> > +
> >
> > What are the arguments to support since v14 instead of the to-be-released
> > version? I read the thread but it is not clear. It was said about the
> > restrictive nature of this feature and it will be frustrating to see
> that the
> > same setup (with the same commands) works with v14 and v15 but it
> doesn't with
> > v16.
> >
>
> If the failure has to happen it will anyway happen later when the
> publisher will be upgraded to v16. The reason for the version checks
> as v14 was to allow the initial sync from the same version where the
> binary mode for replication was introduced. However, if we expect
> failures in the existing setup, I am fine with supporting this for >=
> v16.
>

Upgrading the subscriber to v16 while keeping the publisher on v14 could
break existing subscriptions. I don't know how likely such a case is.

I don't have a strong preference on this. What do you think? Should we
change it to >=v16 or keep it as it is?

Best,
-- 
Melih Mutlu
Microsoft


Re: Allow logical replication to copy tables in binary format

2023-03-16 Thread Melih Mutlu
Hi,

Please see the attached v16.

Peter Smith wrote on Thu, 16 Mar 2023 at 03:03:

> Here are some review comments for v15-0001
>

I applied your comments in the updated patch.

shiy.f...@fujitsu.com wrote on Thu, 16 Mar 2023 at 05:35:

> On Thu, Mar 16, 2023 2:26 AM Melih Mutlu  wrote:
> >
> > Right, it needs to be ordered. Fixed.
> >
>
> Hi,
>
> Thanks for updating the patch. I tested some cases like toast data,
> combination
> of row filter and column lists, and it works well.
>

Thanks for testing. I changed wait_for_log lines as you suggested.

houzj.f...@fujitsu.com wrote on Thu, 16 Mar 2023 at 05:55:

> 1.
> +   {
> +   appendStringInfo(, " WITH (FORMAT binary)");
>
> We could use appendStringInfoString here.
>

Done.


> 2.
> I think it would be better to pass the log offset when using wait_for_log,
> because otherwise it will check the whole log file to find the target
> message,
> This might not be a big problem, but it has a risk of getting unexpected
> log message
> which was generated by previous commands.
>

You're right. I added offsets for the wait_for_log calls.

Thanks,
-- 
Melih Mutlu
Microsoft


v16-0001-Allow-logical-replication-to-copy-table-in-binary.patch
Description: Binary data


Re: [BUG] Logical replica crash if there was an error in a function.

2023-03-16 Thread Anton A. Melnikov

Hello!

On 15.03.2023 21:29, Gregory Stark (as CFM) wrote:


> These patches that are "Needs Review" and have received no comments at
> all since before March 1st are these. If your patch is amongst this
> list I would suggest any of:
> 
> 1) Move it yourself to the next CF (or withdraw it)
> 2) Post to the list with any pending questions asking for specific
> feedback -- it's much more likely to get feedback than just a generic
> "here's a patch plz review"...
> 3) Mark it Ready for Committer and possibly post explaining the
> resolution to any earlier questions to make it easier for a committer
> to understand the state



There were some remarks:
1) very poor use of test cycles (by Tom Lane)
Solved in v2 by adding a few extra queries to an existing test without any
additional syncing.
2) the patch-tester fails on all platforms (by Andres Freund)
Fixed in v3.
3) warning with "my" variable $result and a suggestion to correct a comment
(by vignesh C)
Both fixed in v4.

Now there are no pending questions, so I have moved the patch to RFC.

With the best regards!

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 16beb574708e0e2e993eb2fa1b093be97b8e53cf
Author: Anton A. Melnikov 
Date:   Sun Dec 11 06:26:03 2022 +0300

Add test for syntax error in the function in a logical replication
worker. See dea834938.

diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index 036576acab..b241a7d498 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -69,6 +69,51 @@ $node_publisher->wait_for_catchup('sub1');
 
 pass('index predicates do not cause crash');
 
+# https://www.postgresql.org/message-id/flat/adf0452f-8c6b-7def-d35e-ab516c80088e%40inbox.ru
+
+# The bug was that when a syntax error occurred in a SQL-language or PL/pgSQL-language
+# CREATE FUNCTION or DO command executed in a logical replication worker,
+# we'd suffer a null pointer dereference or assertion failure.
+
+$node_subscriber->safe_psql('postgres', q{
+	create or replace procedure rebuild_test(
+	) as
+	$body$
+	declare
+		l_code  text:= E'create or replace function public.test_selector(
+	) returns setof public.tab1 as
+	\$body\$
+		select * from  error_name
+	\$body\$ language sql;';
+	begin
+		execute l_code;
+		perform public.test_selector();
+	end
+	$body$ language plpgsql;
+	create or replace function test_trg()
+	returns trigger as
+	$body$
+		declare
+		begin
+			call public.rebuild_test();
+			return NULL;
+		end
+	$body$ language plpgsql;
+	create trigger test_trigger after insert on tab1 for each row
+execute function test_trg();
+	alter table tab1 enable replica trigger test_trigger;
+});
+
+# This command will cause a crash on the replica if that bug hasn't been fixed
+$node_publisher->safe_psql('postgres', "INSERT INTO tab1 VALUES (3, 4)");
+
+my $result = $node_subscriber->wait_for_log(
+	"ERROR:  relation \"error_name\" does not exist at character"
+);
+
+ok($result,
+	"ERROR: Logical decoding doesn't fail on function error");
+
 # We'll re-use these nodes below, so drop their replication state.
 $node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION sub1");
 $node_publisher->safe_psql('postgres', "DROP PUBLICATION pub1");
@@ -352,7 +397,7 @@ $node_subscriber->wait_for_subscription_sync($node_publisher, 'tap_sub_sch');
 
 # Subscriber's sch1.t1 should receive the row inserted into the new sch1.t1,
 # but not the row inserted into the old sch1.t1 post-rename.
-my $result = $node_subscriber->safe_psql('postgres', "SELECT * FROM sch1.t1");
+$result = $node_subscriber->safe_psql('postgres', "SELECT * FROM sch1.t1");
 is( $result, qq(1
 2), 'check data in subscriber sch1.t1 after schema rename');
 


Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry

2023-03-16 Thread Drouvot, Bertrand

On 3/16/23 12:46 PM, Michael Paquier wrote:

> On Thu, Mar 16, 2023 at 11:32:56AM +0100, Drouvot, Bertrand wrote:
>> On 3/16/23 7:29 AM, Michael Paquier wrote:
>>> From what I get with this change, the number of tuples changed by DMLs
>>> have their computations done a bit earlier,
>>
>> Thanks for looking at it!
>>
>> Right, but note this is in a dedicated new tablestatus (created
>> within find_tabstat_entry()).
>
> Sure, however it copies the pointer of the PgStat_TableXactStatus from
> tabentry, doesn't it?


Oh I see what you mean, yeah, the pointer is copied.
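
For concreteness, a sketch of that copy (the struct assignment
duplicates the entry, but not the chain its trans pointer references):

    /* in find_tabstat_entry(), roughly */
    tablestatus = palloc(sizeof(PgStat_TableStatus));
    *tablestatus = *tabentry;   /* shallow copy */
    /* tablestatus->trans still points at the live PgStat_TableXactStatus chain */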


> This means that it keeps a reference of the chain
> of subtransactions.  It does not matter for the functions but it could
> for out-of-core callers of find_tabstat_entry(), no?

Yeah, maybe.

> Perhaps you are
> right and that's not worth worrying, still I don't feel particularly
> confident that this is the best approach we can take.

Due to what potential out-of-core callers could do with it?

>>> How much do we need to care about the remaining two callers
>>> pg_stat_get_xact_blocks_fetched() and pg_stat_get_xact_blocks_hit()?
>>
>> Regarding pg_stat_get_xact_blocks_fetched() and pg_stat_get_xact_blocks_hit(),
>> the callers (if any) are outside of core PG (as from what I can
>> see they are not used at all).
>>
>> I don't think we should pay any particular attention to those 2,
>> as anyway nothing prevents the 7 others from being called outside of
>> the pg_stat_xact_all_tables view.
>
> I am not quite sure, TBH.  Did you look at the difference with a long
> chain of subtrans, like savepoints?  The ODBC driver "loves" producing
> a lot of savepoints, for example.

No, I did not measure the impact.

>>> It would feel a bit safer to me to document that find_tabstat_entry()
>>> is currently only used for this xact system view.  The extra
>>> computation could lead to surprises, actually, if this routine is used
>>> outside this context?  Perhaps that's OK, but it does not give me a
>>> warm feeling, just to reshape three functions of pgstatfuncs.c with
>>> macros.
>>
>> That's a fair point. On the other hand those 9 functions (which can
>> all be used outside of the pg_stat_xact_all_tables view) are not
>> documented, so I'm not sure this is that much of a concern (and if
>> we think it is, we still have the option to add an extra flag to
>> indicate whether or not the extra computation is needed.)
>
> That's not quite exact, I think.  The first 7 functions are used in a
> system catalog that is documented.

Right.

> Still we have a problem here.  I
> can actually see a few projects relying on these two functions while
> looking a bit around, so they are used.  And the issue comes from
> ddfc2d9, that has removed these functions from the documentation
> ignoring that they are used in no system catalogs.  I think that we
> should fix that and re-add the two missing functions with a proper
> description in the docs, at least?

As they could be/are used outside of the xact view, yes I think the same.

> There is no trace of them.
> Perhaps the ones exposed through pg_stat_xact_all_tables are fine if
> not listed.

I'd be tempted to add documentation for all of them, I can look at it.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: pg_dump versus hash partitioning

2023-03-16 Thread Tom Lane
Julien Rouhaud  writes:
> On Mon, Mar 13, 2023 at 07:39:12PM -0400, Tom Lane wrote:
>> Yeah, we need to do both.  Attached find an updated patch series:

> I didn't find a CF entry, is it intended?

Yeah, it's there:

https://commitfest.postgresql.org/42/4226/

> I'm not sure if you intend to keep the current 0002 - 0003 separation, but if
> yes the part about te->defn and possible fallback should be moved to 0003.  In
> 0002 we're only looking at the COPY statement.

I was intending to smash it all into one commit --- the separation is
just to ease review.

> I know you're only inheriting this comment, but isn't it well outdated and not
> accurate anymore?  I'm assuming that "archiving is not on" was an acceptable
> way to mean "wal_level < archive" at some point, but it's now completely
> misleading.

Sure, want to propose wording?

> Hash partitioning was added with pg11, should we bypass pg10 too with a 
> comment
> saying that we only care about hash, at least for the forseeable future?

Good point.  With v10 already out of support, it hardly matters in the
real world, but we might as well not expend cycles when we clearly
needn't.

> More generally, I also think that forcing --load-via-partition-root for
> known unsafe partitioning seems like the best compromise.  I'm not sure if we
> should have an option to turn it off though.

I think the odds of that yielding a usable dump are nil, so I don't
see why we should bother.

regards, tom lane




Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements

2023-03-16 Thread Andrei Zubkov
A little comment fix in update script of a patch
-- 
Andrei Zubkov
From 52e75fa05f5dea5700d96aea81ea81d91492b018 Mon Sep 17 00:00:00 2001
From: Andrei Zubkov 
Date: Thu, 16 Mar 2023 13:18:59 +0300
Subject: [PATCH 1/2] pg_stat_statements tests: Add NOT NULL checking of
 pg_stat_statements_reset

This is preliminary patch. It adds NOT NULL checking for the result of
pg_stat_statements_reset() function. It is needed for upcoming patch
"Track statement entry timestamp" that will change the result type of
this function to the timestamp of a reset performed.

Reviewed by: Julien Rouhaud, Hayato Kuroda, Yuki Seino, Chengxi Sun,
Anton Melnikov, Darren Rush

Discussion: https://www.postgresql.org/message-id/flat/72e80e7b160a6eb189df9ef6f068cce3765d37f8.camel%40moonset.ru
---
 .../pg_stat_statements/expected/cursors.out   |  28 ++--
 contrib/pg_stat_statements/expected/dml.out   |  20 +--
 .../expected/level_tracking.out   |  80 +-
 .../pg_stat_statements/expected/planning.out  |  18 +--
 .../pg_stat_statements/expected/select.out|  44 ++---
 .../expected/user_activity.out| 120 +++---
 .../pg_stat_statements/expected/utility.out   | 150 +-
 contrib/pg_stat_statements/expected/wal.out   |  10 +-
 contrib/pg_stat_statements/sql/cursors.sql|   6 +-
 contrib/pg_stat_statements/sql/dml.sql|   4 +-
 .../pg_stat_statements/sql/level_tracking.sql |  12 +-
 contrib/pg_stat_statements/sql/planning.sql   |   4 +-
 contrib/pg_stat_statements/sql/select.sql |  10 +-
 .../pg_stat_statements/sql/user_activity.sql  |  15 +-
 contrib/pg_stat_statements/sql/utility.sql|  28 ++--
 contrib/pg_stat_statements/sql/wal.sql|   2 +-
 16 files changed, 278 insertions(+), 273 deletions(-)

diff --git a/contrib/pg_stat_statements/expected/cursors.out b/contrib/pg_stat_statements/expected/cursors.out
index 46375ea9051..0fc4b2c098d 100644
--- a/contrib/pg_stat_statements/expected/cursors.out
+++ b/contrib/pg_stat_statements/expected/cursors.out
@@ -3,10 +3,10 @@
 --
 -- These tests require track_utility to be enabled.
 SET pg_stat_statements.track_utility = TRUE;
-SELECT pg_stat_statements_reset();
- pg_stat_statements_reset 
---
- 
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t 
+---
+ t
 (1 row)
 
 -- DECLARE
@@ -20,13 +20,13 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
 ---+--+---
  2 |0 | CLOSE cursor_stats_1
  2 |0 | DECLARE cursor_stats_1 CURSOR WITH HOLD FOR SELECT $1
- 1 |1 | SELECT pg_stat_statements_reset()
+ 1 |1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
 (3 rows)
 
-SELECT pg_stat_statements_reset();
- pg_stat_statements_reset 
---
- 
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t 
+---
+ t
 (1 row)
 
 -- FETCH
@@ -59,12 +59,12 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
  1 |0 | DECLARE cursor_stats_2 CURSOR WITH HOLD FOR SELECT $1
  1 |1 | FETCH 1 IN cursor_stats_1
  1 |1 | FETCH 1 IN cursor_stats_2
- 1 |1 | SELECT pg_stat_statements_reset()
+ 1 |1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
 (9 rows)
 
-SELECT pg_stat_statements_reset();
- pg_stat_statements_reset 
---
- 
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t 
+---
+ t
 (1 row)
 
diff --git a/contrib/pg_stat_statements/expected/dml.out b/contrib/pg_stat_statements/expected/dml.out
index 7b9c8f979ee..4c723a038b9 100644
--- a/contrib/pg_stat_statements/expected/dml.out
+++ b/contrib/pg_stat_statements/expected/dml.out
@@ -81,16 +81,16 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
  1 |   12 | SELECT * FROM pgss_dml_tab ORDER BY a
  2 |4 | SELECT * FROM pgss_dml_tab WHERE a > $1 ORDER BY a
  1 |8 | SELECT * FROM pgss_dml_tab WHERE a IN ($1, $2, $3, $4, $5)
- 1 |1 | SELECT pg_stat_statements_reset()
+ 1 |1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
  1 |0 | SET pg_stat_statements.track_utility = FALSE
  6 |6 | UPDATE pgss_dml_tab SET b = $1 WHERE a = $2
  1 |3 | UPDATE pgss_dml_tab SET b = $1 WHERE a > $2
 (10 rows)
 
-SELECT pg_stat_statements_reset();
- pg_stat_statements_reset 
---
- 
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t 
+---
+ t
 (1 row)
 
 -- MERGE
@@ -136,12 +136,12 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
|  |  WHEN NOT MATCHED THEN INSERT (a, b) VALUES ($1, $2)
  1 |0 | MERGE INTO pgss_dml_tab USING pgss_dml_tab st ON (st.a = pgss_dml_tab.a)   +
|  |  WHEN NOT MATCHED THEN INSERT (b, a) VALUES ($1, $2)
- 1 |1 | SELECT pg_stat_statements_reset()
+ 1 |1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
 (10 

Re: May be BUG. Periodic burst growth of the checkpoint_req counter on replica.

2023-03-16 Thread Anton A. Melnikov

Hello!


On 15.03.2023 21:29, Gregory Stark (as CFM) wrote:


These patches that are "Needs Review" and have received no comments at
all since before March 1st are these. If your patch is amongst this
list I would suggest any of:

1) Move it yourself to the next CF (or withdraw it)
2) Post to the list with any pending questions asking for specific
feedback -- it's much more likely to get feedback than just a generic
"here's a patch plz review"...
3) Mark it Ready for Committer and possibly post explaining the
resolution to any earlier questions to make it easier for a committer
to understand the state



There are two different patch variants and some discussion expected.
So moved them to the next CF.

With the best wishes!

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 6f3ab7b6b5562aaf68ccfdd6dde17c27c2d2869e
Author: Anton A. Melnikov 
Date:   Wed Dec 7 10:10:58 2022 +0300

Remove burst growth of the checkpoint_req on replica.

with corrections according to the comment:  https://www.postgresql.org/message-id/20220907.163946.2214338139097492905.horikyota.ntt%40gmail.com

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index a31fbbff78..02d86bf370 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -8935,3 +8935,20 @@ SetWalWriterSleeping(bool sleeping)
 	XLogCtl->WalWriterSleeping = sleeping;
 	SpinLockRelease(&XLogCtl->info_lck);
 }
+
+/*
+ * Check if a new checkpoint WAL record has been received since the
+ * last restartpoint. So it is possible to create a new one.
+ */
+bool RestartPointAvailable(void)
+{
+	bool result = false;
+
+	SpinLockAcquire(&XLogCtl->info_lck);
+	if (!XLogRecPtrIsInvalid(XLogCtl->lastCheckPointRecPtr)
+		&& XLogCtl->lastCheckPoint.redo > XLogCtl->RedoRecPtr)
+		result = true;
+	SpinLockRelease(&XLogCtl->info_lck);
+
+	return result;
+}
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 97b882564f..a88c716197 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -3196,7 +3196,15 @@ XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, int reqLen,
 			{
 				(void) GetRedoRecPtr();
 				if (XLogCheckpointNeeded(readSegNo))
-					RequestCheckpoint(CHECKPOINT_CAUSE_XLOG);
+				{
+					/*
+					 * If there are no new checkpoint WAL records since the
+					 * last restartpoint, creating a new one will certainly
+					 * fail, so skip it.
+					 */
+					if (RestartPointAvailable())
+						RequestCheckpoint(CHECKPOINT_CAUSE_XLOG);
+				}
 			}
 		}
 
diff --git a/src/include/access/xlogrecovery.h b/src/include/access/xlogrecovery.h
index f3398425d8..dcc2e64ca7 100644
--- a/src/include/access/xlogrecovery.h
+++ b/src/include/access/xlogrecovery.h
@@ -83,6 +83,7 @@ extern void InitWalRecovery(ControlFileData *ControlFile,
 			bool *wasShutdown_ptr, bool *haveBackupLabel_ptr,
 			bool *haveTblspcMap_ptr);
 extern void PerformWalRecovery(void);
+extern bool RestartPointAvailable(void);
 
 /*
  * FinishWalRecovery() returns this.  It contains information about the point
commit 9ec1f90bbe407da94f8e940084e8815b4f8e874b
Author: Anton A. Melnikov 
Date:   Wed Dec 7 10:49:19 2022 +0300

Add statistic about restartpoint into pg_stat_bgwriter

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2d8104b090..658cafdc03 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1107,6 +1107,9 @@ CREATE VIEW pg_stat_bgwriter AS
 SELECT
 pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed,
 pg_stat_get_bgwriter_requested_checkpoints() AS checkpoints_req,
+pg_stat_get_bgwriter_timed_restartpoints() AS restartpoints_timed,
+pg_stat_get_bgwriter_requested_restartpoints() AS restartpoints_req,
+pg_stat_get_bgwriter_performed_restartpoints() AS restartpoints_done,
 pg_stat_get_checkpoint_write_time() AS checkpoint_write_time,
 pg_stat_get_checkpoint_sync_time() AS checkpoint_sync_time,
 pg_stat_get_bgwriter_buf_written_checkpoints() AS buffers_checkpoint,
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index 5fc076fc14..2296701ddf 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -350,6 +350,8 @@ CheckpointerMain(void)
 		pg_time_t	now;
 		int			elapsed_secs;
 		int			cur_timeout;
+		bool		chkpt_or_rstpt_requested = false;
+		bool		chkpt_or_rstpt_timed = false;
 
 		/* Clear any already-pending wakeups */
 		ResetLatch(MyLatch);
@@ -368,7 +370,7 @@ CheckpointerMain(void)
 		if (((volatile CheckpointerShmemStruct *) CheckpointerShmem)->ckpt_flags)
 		{
 			do_checkpoint = true;
-			PendingCheckpointerStats.requested_checkpoints++;
+			chkpt_or_rstpt_requested = true;
 		}
 
 		/*
@@ -382,7 +384,7 @@ 

Re: Data is copied twice when specifying both child and parent table in publication

2023-03-16 Thread Amit Kapila
On Wed, Feb 8, 2023 at 9:21 AM wangw.f...@fujitsu.com
 wrote:
>
> I think this failure is caused by the recently commit (b7ae039) in the current
> HEAD. Rebased the patch set and attach them.
>

+ if (server_version >= 16)
+ {
+ appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n"
+ "  ( SELECT array_agg(a.attname ORDER BY a.attnum)\n"
+ "FROM pg_attribute a\n"
+ "WHERE a.attrelid = GPT.relid AND a.attnum > 0 AND\n"
+ "  NOT a.attisdropped AND\n"
+ "  (a.attnum = ANY(GPT.attrs) OR GPT.attrs IS NULL)\n"
+ "  ) AS attnames\n"
+ " FROM pg_class C\n"
+ "   JOIN pg_namespace N ON N.oid = C.relnamespace\n"
+ "   JOIN ( SELECT (pg_get_publication_tables(VARIADIC array_agg(pubname::text))).*\n"
+ "  FROM pg_publication\n"
+ "  WHERE pubname IN ( %s )) as GPT\n"
+ "   ON GPT.relid = C.oid\n",
+ pub_names.data);

The function pg_get_publication_tables() already handles dropped
columns, so we don't need it here in this query. Also, the part to
build attnames should be the same as it is in view
pg_publication_tables. Can we directly try to pass the list of
pubnames to the function pg_get_publication_tables() instead of
joining it with pg_publication?
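
(Roughly, i.e. an untested sketch, assuming the function's output columns
match the pg_publication_tables view definition:

    SELECT gpt.relid, gpt.attrs
      FROM pg_get_publication_tables(VARIADIC ARRAY['pub1', 'pub2']) AS gpt;
)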

Can we keep the changes in the else part (fix when publisher < 16) the
same as HEAD and move the proposed change to a separate patch?
Basically, for the HEAD patch, let's just try to fix this when
publisher >=16. I am slightly worried that as this is a corner case
bug and we didn't see any user complaints for this, so introducing a
complex fix for back branches may not be required or at least we can
discuss that separately.

-- 
With Regards,
Amit Kapila.




Re: proposal: possibility to read dumped table's name from file

2023-03-16 Thread Pavel Stehule
On Tue, Mar 7, 2023 at 3:47 AM Julien Rouhaud  wrote:

> Hi,
>
> On Mon, Mar 06, 2023 at 10:20:32PM +0100, Daniel Gustafsson wrote:
> > > On 6 Mar 2023, at 21:45, Gregory Stark (as CFM) 
> wrote:
> > >
> > > So This patch has been through a lot of commitfests. And it really
> > > doesn't seem that hard to resolve -- Pavel has seemingly been willing
> > > to go along whichever way the wind has been blowing but honestly it
> > > kind of seems like he's just gotten drive-by suggestions and he's put
> > > a lot of work into trying to satisfy them.
> >
> > Agreed.
>
> Indeed, I'm not sure I would have had that much patience.
>
> > > He implemented --include-tables-from-file=... etc. Then he implemented
> > > a hand-written parser for a DSL to select objects, then he implemented
> > > a bison parser, then he went back to the hand-written parser.
> >
> > Well, kind of.  I was trying to take the patch to the finishing line but
> was
> > uncomfortable with the hand written parser so I implemented a parser in
> Bison
> > to replace it with.  Not that hand-written parsers are bad per se (or
> that my
> > bison parser was perfect), but reading quoted identifiers across line
> > boundaries tend to require a fair amount of handwritten code.  Pavel did
> not
> > object to this version, but it was objected to by two other committers.
> >
> > At this point [0] I stepped down from trying to finish it as the
> approach I was
> > comfortable didn't gain traction (which is totally fine).
> >
> > Downthread from this the patch got a lot of reviews from Julien with the
> old
> > parser back in place.
>
> Yeah, and the current state seems quite good to me.
>
> > > Can we get some consensus on whether the DSL looks right
> >
> > I would consider this pretty settled.
>
> Agreed.
>

Rebased, plus an enhancement to the related option from a563c24.

Regards

Pavel
From 76f224be5f88b2a33f9bdc4a38a5bd49c86a7d3b Mon Sep 17 00:00:00 2001
From: "ok...@github.com" 
Date: Thu, 16 Mar 2023 08:18:08 +0100
Subject: [PATCH] possibility to read options for dump from file

---
 doc/src/sgml/ref/pg_dump.sgml   | 110 +++
 doc/src/sgml/ref/pg_dumpall.sgml|  22 +
 doc/src/sgml/ref/pg_restore.sgml|  25 +
 src/bin/pg_dump/Makefile|   5 +-
 src/bin/pg_dump/filter.c| 489 ++
 src/bin/pg_dump/filter.h|  56 ++
 src/bin/pg_dump/meson.build |   2 +
 src/bin/pg_dump/pg_dump.c   | 117 
 src/bin/pg_dump/pg_dumpall.c|  61 +-
 src/bin/pg_dump/pg_restore.c| 114 
 src/bin/pg_dump/t/004_pg_dump_filterfile.pl | 701 
 src/tools/msvc/Mkvcbuild.pm |   1 +
 12 files changed, 1700 insertions(+), 3 deletions(-)
 create mode 100644 src/bin/pg_dump/filter.c
 create mode 100644 src/bin/pg_dump/filter.h
 create mode 100644 src/bin/pg_dump/t/004_pg_dump_filterfile.pl

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index e6b003bf10..2a8cbe41bc 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -829,6 +829,102 @@ PostgreSQL documentation
   
  
 
+ 
+  --filter=filename
+  
+   
+Specify a filename from which to read patterns for objects to include
+or exclude from the dump. The patterns are interpreted according to the
+same rules as the corresponding options:
+-t/--table,
+--table-and-children,
+--exclude-table-and-children or
+-T for tables,
+-n/--schema for schemas,
+--include-foreign-data for data on foreign servers and
+--exclude-table-data,
+--exclude-table-data-and-children for table data.
+To read from STDIN, use - as the
+filename.  The --filter option can be specified in
+conjunction with the above listed options for including or excluding
+objects, and can also be specified more than once for multiple filter
+files.
+   
+
+   
+The file lists one object pattern per row, with the following format:
+
+{ include | exclude } { table | table_and_children | schema | foreign_data | table_data | table_data_and_children } PATTERN
+
+   
+
+   
+The first keyword specifies whether the objects matched by the pattern
+are to be included or excluded. The second keyword specifies the type
+of object to be filtered using the pattern:
+
+ 
+  
+   table: tables, works like
+   -t/--table
+  
+ 
+ 
+  
+   table_and_children: tables, works like
+   -t/--table, except that
+   it also includes any partitions or inheritance child
+   tables of the table(s) matching the
+   pattern.
+  
+ 
+ 
+  
+   schema: schemas, works like
+   -n/--schema
+  
+ 

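To make the proposed format concrete, a filter file could look like this
(hypothetical object names; '#' comment lines are my assumption, check the
patch for the exact lexical rules):

    # dump the app schema, but skip the bulky audit data
    include schema app
    include table_and_children app.events
    exclude table_data app.audit_log

and would be used as, e.g., pg_dump --filter=objects.txt mydb.
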
Re: Split index and table statistics into different types of stats

2023-03-16 Thread Michael Paquier
On Thu, Mar 16, 2023 at 10:24:32AM +0100, Drouvot, Bertrand wrote:
> My plan was to get [1] done before resuming working on the "Split
> index and table statistics into different types of stats" one.

Okay, I was unsure what should be the order here.  Let's see about [1]
first, then.
--
Michael




Re: Speed-up shared buffers prewarming

2023-03-16 Thread Drouvot, Bertrand

Hi,

On 3/15/23 10:40 PM, Matthias van de Meent wrote:

On Wed, 15 Mar 2023 at 21:38, Konstantin Knizhnik  wrote:


Hi hackers,

It is a well-known fact that queries using a sequential scan cannot be used to
prewarm the cache, because they use a ring buffer
even if shared buffers are almost empty.
I have searched the hackers archive but failed to find any discussion about it.
What are the drawbacks of using free buffers even with the BAS_BULKREAD strategy?
I mean the following trivial patch:

diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c
index 6be80476db..243335d0e4 100644
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -208,8 +208,15 @@ StrategyGetBuffer(BufferAccessStrategy strategy, uint32 *buf_state)
 	/*
 	 * If given a strategy object, see whether it can select a buffer. We
 	 * assume strategy objects don't need buffer_strategy_lock.
 	 */
-	if (strategy != NULL)
+	if (strategy != NULL && StrategyControl->firstFreeBuffer < 0)
 	{
 		buf = GetBufferFromRing(strategy, buf_state);
 		if (buf != NULL)

So if there are free buffers, then use normal buffer allocation instead of 
GetBufferFromRing.
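
(A cheap way to observe the current ring-buffer behavior, assuming the
pg_buffercache extension and a hypothetical table big_tbl:

    CREATE EXTENSION IF NOT EXISTS pg_buffercache;
    -- after a seqscan of big_tbl with mostly-empty shared buffers:
    SELECT count(*) AS cached_buffers
      FROM pg_buffercache
     WHERE relfilenode = pg_relation_filenode('big_tbl');

With BAS_BULKREAD the count stays around 32 buffers (256kB), no matter how
much free space shared buffers has.)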


Yes. As seen in [1], ring buffers aren't all that great in some cases,
and I think this is one. Buffer allocation should always make use of
the available resources, so that it doesn't take O(N/ring_size) scans
on a table to fill the buffers if that seqscan is the only workload of
the system.


Agreed, but then what do you think about also paying special attention to those
buffers when eviction needs to be done?

Those buffers are "usually" needed briefly, so something like being able to
distinguish them and evict them more aggressively could help.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry

2023-03-16 Thread Michael Paquier
On Thu, Mar 16, 2023 at 11:32:56AM +0100, Drouvot, Bertrand wrote:
> On 3/16/23 7:29 AM, Michael Paquier wrote:
>>  From what I get with this change, the number of tuples changed by DMLs
>> have their computations done a bit earlier,
> 
> Thanks for looking at it!
> 
> Right, but note this is in a dedicated new tablestatus (created
> within find_tabstat_entry()).

Sure, however it copies the pointer of the PgStat_TableXactStatus from
tabentry, doesn't it?  This means that it keeps a reference to the chain
of subtransactions.  It does not matter for the functions but it could
for out-of-core callers of find_tabstat_entry(), no?  Perhaps you are
right and that's not worth worrying about; still, I don't feel particularly
confident that this is the best approach we can take.

>> How much do we need to care about the remaining two callers
>> pg_stat_get_xact_blocks_fetched() and pg_stat_get_xact_blocks_hit()?
> 
> Regarding pg_stat_get_xact_blocks_fetched() and pg_stat_get_xact_blocks_hit()
> the callers (if any) are outside of the core PG (as from what I can
> see they are not used at all).
> 
> I don't think we should pay any particular attention to those 2 ones
> as anyway nothing prevent the 7 others to be called outside of the
> pg_stat_xact_all_tables view.

I am not quite sure, TBH.  Did you look at the difference with a long
chain of subtrans, like savepoints?  The ODBC driver "loves" producing
a lot of savepoints, for example.
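
For instance, a chain like this is enough to make the copied stats walk the
subxact list (table name invented):

    BEGIN;
    CREATE TABLE tst (i int);
    INSERT INTO tst VALUES (1);
    SAVEPOINT s1;
    INSERT INTO tst VALUES (2);
    SAVEPOINT s2;
    INSERT INTO tst VALUES (3);
    SELECT pg_stat_get_xact_tuples_inserted('tst'::regclass);
    ROLLBACK;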

>> It would feel a bit safer to me to document that find_tabstat_entry()
>> is currently only used for this xact system view..  The extra
>> computation could lead to surprises, actually, if this routine is used
>> outside this context?  Perhaps that's OK, but it does not give me a
>> warm feeling, just to reshape three functions of pgstatfuncs.c with
>> macros.
> 
> That's a fair point. On the other hand those 9 functions (which can
> all be used outside of the pg_stat_xact_all_tables view) are not
> documented, so I'm not sure this is that much of a concern (and if
> we think it is we still have the option to add an extra flag to
> indicate whether or not the extra computation is needed.)

That's not quite exact, I think.  The first 7 functions are used in a
system catalog that is documented.  Still we have a problem here.  I
can actually see a few projects relying on these two functions while
looking a bit around, so they are used.  And the issue comes from
ddfc2d9, which removed these functions from the documentation,
ignoring that they are not used in any system catalog.  I think that we
should fix that and re-add the two missing functions with a proper
description in the docs, at least?  There is no trace of them.
Perhaps the ones exposed through pg_stat_xact_all_tables are fine if
not listed.
--
Michael




Re: Making empty Bitmapsets always be NULL

2023-03-16 Thread Yuya Watari
Hello,

On Thu, Mar 16, 2023 at 10:30 AM Yuya Watari  wrote:
> My idea is to compute the bitwise OR of all bitmapwords of the result
> Bitmapset. The bitwise OR can be represented as a single operation in
> the machine code and does not require any conditional branches. If the
> bitwise ORed value is zero, we can conclude the result Bitmapset is
> empty. The costs related to this operation can be almost negligible;
> it is significantly cheaper than calling bms_is_empty_internal() and
> less expensive than using a conditional branch such as 'if.'

After posting the patch, I noticed that my patch had some bugs. My
idea above is not applicable to bms_del_member(), and I missed some
additional operations required in bms_del_members(). I have attached
the fixed version to this email. I really apologize for making the
mistakes. Should we add new regression tests to prevent this kind of
bug?
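
For readers skimming the thread, here is a self-contained sketch of the
bitwise-OR idea (a simplified layout, not the actual nodes/bitmapset.c code;
intersection is just one of the operations the patch touches):

    #include <stddef.h>

    typedef unsigned long bitmapword;

    typedef struct Bitmapset
    {
        int         nwords;
        bitmapword  words[1];   /* variable-length in the real struct */
    } Bitmapset;

    /* Intersect b into a; return NULL (the empty set) if nothing remains. */
    static Bitmapset *
    bms_int_members_sketch(Bitmapset *a, const Bitmapset *b)
    {
        bitmapword  allbits = 0;
        int         n = (a->nwords < b->nwords) ? a->nwords : b->nwords;

        for (int i = 0; i < n; i++)
        {
            a->words[i] &= b->words[i];
            allbits |= a->words[i];     /* branch-free accumulation */
        }
        a->nwords = n;

        if (allbits == 0)
            return NULL;    /* the real code would also pfree(a) here */
        return a;
    }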

The following tables illustrate the result of a re-run experiment. The
significant improvement was a mistake, but a speedup of about 2% was
still obtained when the number of partitions, namely n, was large.
This result indicates that the optimization regarding
bms_is_empty_internal() is effective on some workloads.

Table 1: Planning time (ms)
(n: the number of partitions of each table)
-
   n |   Master | Patched (trailing-zero) | Patched (bitwise-OR)
-
   1 |   36.903 |  36.621 |   36.731
   2 |   35.842 |  35.031 |   35.704
   4 |   37.756 |  37.457 |   37.409
   8 |   42.069 |  42.578 |   42.322
  16 |   53.670 |  67.792 |   53.618
  32 |   88.412 | 100.605 |   89.147
  64 |  229.734 | 271.259 |  225.971
 128 |  889.367 |1272.270 |  870.472
 256 | 4209.312 |8223.623 | 4129.594
-

Table 2: Planning time speedup (higher is better)
--
   n | Patched (trailing-zero) | Patched (bitwise-OR)
--
   1 |  100.8% |   100.5%
   2 |  102.3% |   100.4%
   4 |  100.8% |   100.9%
   8 |   98.8% |99.4%
  16 |   79.2% |   100.1%
  32 |   87.9% |99.2%
  64 |   84.7% |   101.7%
 128 |   69.9% |   102.2%
 256 |   51.2% |   101.9%
--

-- 
Best regards,
Yuya Watari


v2-0001-Remove-bms_is_empty_internal-function-calls.patch
Description: Binary data


Re: [PATCH] Tracking statements entry timestamp in pg_stat_statements

2023-03-16 Thread Andrei Zubkov
Hi Michael,

Thank you for your attention.

On Thu, 2023-03-16 at 16:13 +0900, Michael Paquier wrote:
> +/* First we have to remove them from the extension */
> +ALTER EXTENSION pg_stat_statements DROP VIEW pg_stat_statements;
> +ALTER EXTENSION pg_stat_statements DROP FUNCTION
> pg_stat_statements(boolean);
> +ALTER EXTENSION pg_stat_statements DROP FUNCTION
> +  pg_stat_statements_reset(Oid, Oid, bigint);
> 
> The upgrade script of an extension is launched by the backend in the
> context of an extension, so these three queries should not be needed,
> as far as I recall.

Agreed. I've done it as it was in previous versions. But I'm sure those
are unnecessary.

> Wouldn't it be better to do this kind of refactoring in its own patch
> to make the follow-up changes more readable?  This function is
> changed
> to return a timestamp rather than void, but IS NOT NULL applied on
> the
> existing queries would also return true.  This would clean up quite a
> few diffs in the main patch..
Splitting this commit seems reasonable to me.

New version is attached.
From 52e75fa05f5dea5700d96aea81ea81d91492b018 Mon Sep 17 00:00:00 2001
From: Andrei Zubkov 
Date: Thu, 16 Mar 2023 13:18:59 +0300
Subject: [PATCH 1/2] pg_stat_statements tests: Add NOT NULL checking of
 pg_stat_statements_reset

This is a preliminary patch. It adds NOT NULL checking for the result of
pg_stat_statements_reset() function. It is needed for upcoming patch
"Track statement entry timestamp" that will change the result type of
this function to the timestamp of a reset performed.

Reviewed by: Julien Rouhaud, Hayato Kuroda, Yuki Seino, Chengxi Sun,
Anton Melnikov, Darren Rush

Discussion: https://www.postgresql.org/message-id/flat/72e80e7b160a6eb189df9ef6f068cce3765d37f8.camel%40moonset.ru
---
 .../pg_stat_statements/expected/cursors.out   |  28 ++--
 contrib/pg_stat_statements/expected/dml.out   |  20 +--
 .../expected/level_tracking.out   |  80 +-
 .../pg_stat_statements/expected/planning.out  |  18 +--
 .../pg_stat_statements/expected/select.out|  44 ++---
 .../expected/user_activity.out| 120 +++---
 .../pg_stat_statements/expected/utility.out   | 150 +-
 contrib/pg_stat_statements/expected/wal.out   |  10 +-
 contrib/pg_stat_statements/sql/cursors.sql|   6 +-
 contrib/pg_stat_statements/sql/dml.sql|   4 +-
 .../pg_stat_statements/sql/level_tracking.sql |  12 +-
 contrib/pg_stat_statements/sql/planning.sql   |   4 +-
 contrib/pg_stat_statements/sql/select.sql |  10 +-
 .../pg_stat_statements/sql/user_activity.sql  |  15 +-
 contrib/pg_stat_statements/sql/utility.sql|  28 ++--
 contrib/pg_stat_statements/sql/wal.sql|   2 +-
 16 files changed, 278 insertions(+), 273 deletions(-)

diff --git a/contrib/pg_stat_statements/expected/cursors.out b/contrib/pg_stat_statements/expected/cursors.out
index 46375ea9051..0fc4b2c098d 100644
--- a/contrib/pg_stat_statements/expected/cursors.out
+++ b/contrib/pg_stat_statements/expected/cursors.out
@@ -3,10 +3,10 @@
 --
 -- These tests require track_utility to be enabled.
 SET pg_stat_statements.track_utility = TRUE;
-SELECT pg_stat_statements_reset();
- pg_stat_statements_reset 
---
- 
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t 
+---
+ t
 (1 row)
 
 -- DECLARE
@@ -20,13 +20,13 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
 ---+--+---
  2 |0 | CLOSE cursor_stats_1
  2 |0 | DECLARE cursor_stats_1 CURSOR WITH HOLD FOR SELECT $1
- 1 |1 | SELECT pg_stat_statements_reset()
+ 1 |1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
 (3 rows)
 
-SELECT pg_stat_statements_reset();
- pg_stat_statements_reset 
---
- 
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t 
+---
+ t
 (1 row)
 
 -- FETCH
@@ -59,12 +59,12 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
  1 |0 | DECLARE cursor_stats_2 CURSOR WITH HOLD FOR SELECT $1
  1 |1 | FETCH 1 IN cursor_stats_1
  1 |1 | FETCH 1 IN cursor_stats_2
- 1 |1 | SELECT pg_stat_statements_reset()
+ 1 |1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
 (9 rows)
 
-SELECT pg_stat_statements_reset();
- pg_stat_statements_reset 
---
- 
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t 
+---
+ t
 (1 row)
 
diff --git a/contrib/pg_stat_statements/expected/dml.out b/contrib/pg_stat_statements/expected/dml.out
index 7b9c8f979ee..4c723a038b9 100644
--- a/contrib/pg_stat_statements/expected/dml.out
+++ b/contrib/pg_stat_statements/expected/dml.out
@@ -81,16 +81,16 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
  1 |   12 | SELECT * FROM pgss_dml_tab ORDER BY a
  2 |4 | SELECT * FROM pgss_dml_tab WHERE a > $1 ORDER BY a
  1 |8 | SELECT * FROM pgss_dml_tab 

Re: logical decoding and replication of sequences, take 2

2023-03-16 Thread Amit Kapila
On Thu, Mar 16, 2023 at 1:08 PM Masahiko Sawada  wrote:
>
> Hi,
>
> On Wed, Mar 15, 2023 at 9:52 PM Tomas Vondra
>  wrote:
> >
> >
> >
> > On 3/14/23 08:30, John Naylor wrote:
> ---
> I got an assertion failure. The reproducible steps are:
>
> 1. On publisher
> alter system set logical_replication_mode = 'immediate';
> select pg_reload_conf();
> create publication test_pub for all sequences;
>
> 2. On subscriber
> create subscription test_sub connection 'dbname=postgres port=5551'
> publication test_pub with (streaming = 'parallel')
>
> 3. On publisher
> begin;
> create table bar (c int, d serial);
> insert into bar(c) values (100);
> commit;
>
> I got the following assertion failure:
>
> TRAP: failed Assert("(!seq.transactional) || in_remote_transaction"),
...
>
> seq.transactional is true and in_remote_transaction is false. It might
> be an issue of the parallel apply feature rather than this patch.
>

During parallel apply we didn't need to rely on in_remote_transaction,
so it was not set. I haven't checked the patch in detail but am
wondering, isn't it sufficient to instead check IsTransactionState()
and or IsTransactionOrTransactionBlock()?
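
That is, something like the following untested sketch instead of the current
assertion:

    Assert(!seq.transactional || IsTransactionState());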

-- 
With Regards,
Amit Kapila.




Re: [EXTERNAL] Support load balancing in libpq

2023-03-16 Thread Daniel Gustafsson
In general I think this feature makes sense (which has been echoed many times
in the thread), and the implementation strikes a good balance of robustness and
simplicity.  Reading this I think it's very close to being committable, but I
have a few comments on the patch series:

+sent to the same server. There are currently two modes:
The documentation lists the modes disabled and random, but I wonder if it's
worth expanding the docs to mention that "disabled" is pretty much a round
robin load balancing scheme?  It reads a bit odd to present load balancing
without a mention of round robin balancing given how common it is.
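
(For reference, a randomized multi-host connection with this patch would look
something like the following; the exact parameter spelling is whatever the
patch settles on:

    psql "host=node1,node2,node3 load_balance_hosts=random"
)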

-   conn->addrlist_family = hint.ai_family = AF_UNSPEC;
+   hint.ai_family = AF_UNSPEC;
This removes all uses of conn->addrlist_family and that struct member can be
removed.

+to, for example, load balance over stanby servers only. Once 
successfully
s/stanby/standby/

+Postgres servers.
s/Postgres/PostgreSQL/

+more addresses than others. So if you want uniform load balancing,
+this is something to keep in mind. However, non-uniform load
+balancing can also be used to your advantage, e.g. by providing the
The documentation typically use a less personal form, I would suggest something
along the lines of:

"If uniform load balancing is required then an external load balancing tool
must be used.  Non-uniform load balancing can also be used to skew the
results, e.g.  by providing the.."

+   if (!libpq_prng_init(conn))
+   return false;
This needs to set a returned error message with libpq_append_conn_error (feel
free to come up with a better wording of course):

  libpq_append_conn_error(conn, "unable to initiate random number generator");

-#ifndef WIN32
+/* MinGW has sys/time.h, but MSVC doesn't */
+#ifndef _MSC_VER
 #include <sys/time.h>
This seems unrelated to the patch in question, and should be a separate commit 
IMO.

+   LOAD_BALANCE_RANDOM,/* Read-write server */
I assume this comment is a copy/paste and should have been reflecting random 
order?

+++ b/src/interfaces/libpq/t/003_loadbalance_host_list.pl
Nitpick, but we should probably rename this load_balance to match the parameter
being tested.

+++ b/src/interfaces/libpq/t/004_loadbalance_dns.pl
On the subject of tests, I'm a bit torn.  I appreciate that the patch includes
a thorough test, but I'm not convinced we should add that to the tree.  A test
which require root permission level manual system changes stand a very low
chance of ever being executed, and as such will equate to dead code that may
easily be broken or subtly broken.
 
I am also not a fan of the random_seed parameter.  A parameter only useful for
testing, and which enables testing by circumventing the mechanism to test
(making randomness predictable), seems like expensive bagage to carry around.
From experience we also know this risks ending up in production configs for all
the wrong reasons.

Given the implementation of this feature, the actual connection phase isn't any
different from just passing multiple hostnames and operating in the round-robin
fashion.  What we want to ensure is that the randomization isn't destroying the
connection array.  Let's see what we can do while still having tests that can
be executed in the buildfarm.

A few ideas:

  * Add basic tests for the load_balance_host connection param only accepting
sane values

  * Alter the connect_ok() tests in 003_loadbalance_host_list.pl to not require
random_seed but instead using randomization. Thinking out loud, how about
something along these lines?
- Passing a list of unreachable hostnames with a single good hostname
  should result in a connection.
- Passing a list of one good hostname should result in a connection
- Passing a list on n good hostname (where all are equal) should result in
  a connection
- Passing a list of n unreachable hostnames should result in log entries
  for n broken resolves in some order, and running that test n' times
  shouldn't - statistically - result in the same order for a large enough 
n'.

  * Remove random_seed and 004_loadbalance_dns.pl

Thoughts?

--
Daniel Gustafsson





RE: Dropped and generated columns might cause wrong data on subs when REPLICA IDENTITY FULL

2023-03-16 Thread shiy.f...@fujitsu.com
On Thu, Mar 16, 2023 5:23 PM Amit Kapila  wrote:
> 
> On Mon, Mar 13, 2023 at 6:26 PM Önder Kalacı 
> wrote:
> >
> > Attaching v2
> >
> 
> Can we change the comment to: "Ignore dropped and generated columns as
> the publisher doesn't send those."? After your change, att =
> TupleDescAttr(slot1->tts_tupleDescriptor, attrnum); is done twice in
> the same function.
> 
> In test cases, let's change the comment to: "The bug was that when the
> REPLICA IDENTITY FULL is used with dropped or generated columns, we
> fail to apply updates and deletes.". Also, I think we don't need to
> provide the email link as anyway commit message will have a link to
> the discussion.
> 
> Did you check this in the back branches?
> 

I tried to reproduce this bug on the back branches.

Generated columns were introduced in PG12, and I reproduced the generated
column problem in PG12~PG15.

For the dropped column problem, I reproduced it in PG10~PG15. (Logical
replication was introduced in PG10.)

So I think we should backpatch the fix for generated columns to PG12, and
backpatch the fix for dropped columns to PG10.
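
For reference, the generated-column case can be reproduced with a sketch like
this on a publisher/subscriber pair (hypothetical names):

create table t (a int, b int generated always as (a * 2) stored);
alter table t replica identity full;
-- publisher: create publication pub for table t;
-- subscriber: create subscription sub connection '...' publication pub;
insert into t (a) values (1);
update t set a = 2;  -- before the fix, the subscriber failed to apply this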

Regards,
Shi Yu


Re: Add a hook to allow modification of the ldapbindpasswd

2023-03-16 Thread Andrew Dunstan


On 2023-03-15 We 18:18, Andrew Dunstan wrote:



On 2023-03-15 We 17:50, Tom Lane wrote:

Andrew Dunstan  writes:

pushed.

drongo is not happy with this, but I'm kind of baffled as to why:

"c:\\prog\\bf\\root\\HEAD\\pgsql.build\\pgsql.sln" (default target) (1) ->
"c:\\prog\\bf\\root\\HEAD\\pgsql.build\\ldap_password_func.vcxproj" (default 
target) (60) ->
(Link target) ->
   ldap_password_func.obj : error LNK2001: unresolved external symbol 
__imp_ldap_password_hook 
[c:\\prog\\bf\\root\\HEAD\\pgsql.build\\ldap_password_func.vcxproj]
   .\\Release\\ldap_password_func\\ldap_password_func.dll : fatal error 
LNK1120: 1 unresolved externals 
[c:\\prog\\bf\\root\\HEAD\\pgsql.build\\ldap_password_func.vcxproj]

The only obvious explanation for a link problem would be if the
variable's declaration were missing PGDLLIMPORT; but it's not.





Ugh. Not batting 1000 today. Will investigate.





The issue was apparently that I had neglected to suppress building the 
test module on MSVC if not configured to build with LDAP, since the hook 
is only defined in that case. I have pushed a fix for that and drongo is 
happy once more.
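
For anyone wanting to try the hook, a minimal module sketch (the hook variable
name is from the commit; the signature here is my assumption -- check
src/include/libpq/auth.h for the real typedef):

    #include "postgres.h"
    #include "fmgr.h"
    #include "libpq/auth.h"

    PG_MODULE_MAGIC;

    /* return the real ldapbindpasswd; a pass-through for illustration */
    static char *
    my_ldap_password(char *password)
    {
        return password;
    }

    void
    _PG_init(void)
    {
        ldap_password_hook = my_ldap_password;
    }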



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com


Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry

2023-03-16 Thread Drouvot, Bertrand

Hi,

On 3/16/23 7:29 AM, Michael Paquier wrote:

On Mon, Mar 06, 2023 at 08:33:15AM +0100, Drouvot, Bertrand wrote:

Thanks for having looked at it!


Looking at that, I have a few comments.

+	tabentry = (PgStat_TableStatus *) entry_ref->pending;
+	tablestatus = palloc(sizeof(PgStat_TableStatus));
+	*tablestatus = *tabentry;
+
[...]
+	for (trans = tabentry->trans; trans != NULL; trans = trans->upper)
+	{
+		tablestatus->t_counts.t_tuples_inserted += trans->tuples_inserted;
+		tablestatus->t_counts.t_tuples_updated += trans->tuples_updated;
+		tablestatus->t_counts.t_tuples_deleted += trans->tuples_deleted;
+	}

-	if (entry_ref)
-		return entry_ref->pending;
-	return NULL;
+	return tablestatus;

 From what I get with this change, the number of tuples changed by DMLs
have their computations done a bit earlier,


Thanks for looking at it!

Right, but note this is in a dedicated new tablestatus (created within 
find_tabstat_entry()).


meaning that it would make
all the callers of find_tabstat_entry() pay the computation cost.


Right. Another suggested approach was to add a flag, but then we'd not really
hide the internals from pgstatfuncs.


Still it is not really going to matter, because we will just do the
computation once when looking at any pending changes of
pg_stat_xact_all_tables for each entry. 


Yes.


There are 9 callers of
find_tabstat_entry, with 7 being used for pg_stat_xact_all_tables.


Right, those are:

pg_stat_get_xact_numscans()
pg_stat_get_xact_tuples_returned()
pg_stat_get_xact_tuples_fetched()
pg_stat_get_xact_tuples_inserted()
pg_stat_get_xact_tuples_updated()
pg_stat_get_xact_tuples_deleted()
pg_stat_get_xact_tuples_hot_updated()


How much do we need to care about the remaining two callers
pg_stat_get_xact_blocks_fetched() and pg_stat_get_xact_blocks_hit()?


Regarding pg_stat_get_xact_blocks_fetched() and pg_stat_get_xact_blocks_hit()
the callers (if any) are outside of the core PG (as from what I can see they 
are not used
at all).

I don't think we should pay any particular attention to those 2, as anyway
nothing prevents the 7 others from being called outside of the
pg_stat_xact_all_tables view.


Could it be a problem if these two also pay the extra computation cost
if a transaction with many subtransactions needs to look at their
data?  These two are used nowhere, they have pg_proc entries and they
are undocumented, so it is hard to say the impact of this change on
them..



Right, and that's the same for the 7 others, as nothing prevents them from
being called outside of the pg_stat_xact_all_tables view.

Do you think it would be better to add the extra flag then?


Second question: once the data from the subtransactions is copied,
would it be cleaner to set trans to NULL after the data copy is done?



That would not hurt but I'm not sure it's worth it (well, it's currently
not done in pg_stat_get_xact_tuples_inserted() for example).


It would feel a bit safer to me to document that find_tabstat_entry()
is currently only used for this xact system view..  The extra
computation could lead to surprises, actually, if this routine is used
outside this context?  Perhaps that's OK, but it does not give me a
warm feeling, just to reshape three functions of pgstatfuncs.c with
macros.


That's a fair point. On the other hand those 9 functions (which can all be used
outside of the pg_stat_xact_all_tables view) are not documented, so I'm not sure
this is that much of a concern (and if we think it is, we still have the option
to add an extra flag to indicate whether or not the extra computation is
needed).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



