Re: make_ctags: use -I option to ignore pg_node_attr macro

2023-02-06 Thread Tatsuo Ishii
> On 2022/10/19 13:25, Tatsuo Ishii wrote:
>> Thanks. The v6 patch has been pushed to the master branch.
> 
> Since this commit, make_etags has started failing to generate
> tags files with the following error messages, on my MacOS.
> 
> $ src/tools/make_etags
> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ctags:
> illegal option -- e
> usage: ctags [-BFadtuwvx] [-f tagsfile] file ...
> sort: No such file or directory
> 
> 
> On my macOS, a non-Exuberant ctags is installed, which doesn't support
> the -e option. But the commit changed make_etags so that it always
> calls ctags with the -e option via make_ctags. This seems to be the cause
> of the above failure.
> 
> IS_EXUBERANT=""
> ctags --version 2>&1 | grep Exuberant && IS_EXUBERANT="Y"
> 
> make_ctags has the above code and seems to support non-Exuberant
> ctags.
> If so, should we revert the changes made to make_etags by the commit and
> make make_etags work with that ctags? Or should we support
> only Exuberant-type ctags (BTW, I'm OK with this) and get rid of
> code like the above?

Thanks for the report. I will look into this.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




RE: Deadlock between logrep apply worker and tablesync worker

2023-02-06 Thread houzj.f...@fujitsu.com
On Tuesday, February 7, 2023 12:12 PM Peter Smith  wrote:
> On Fri, Feb 3, 2023 at 6:58 PM houzj.f...@fujitsu.com 
> wrote:
> >
> ...
> > > Right, I think that case could be addressed by Tom's patch to some
> > > extent but I am thinking we should also try to analyze if we can
> > > completely avoid the need to remove origins from both processes. One
> > > idea could be to introduce another relstate something like
> > > PRE_SYNCDONE and set it in a separate transaction before we set the
> > > state as SYNCDONE and remove the slot and origin in tablesync worker.
> > > Now, if the tablesync worker errors out due to some reason during
> > > the second transaction, it can remove the slot and origin after restart by
> checking the state.
> > > However, it would add another relstate which may not be the best way
> > > to address this problem. Anyway, that can be accomplished as a separate
> patch.
> >
> > Here is an attempt to achieve the same.
> > Basically, the patch removes the code that drops the origin in the apply
> > worker, and adds a new state PRE_SYNCDONE after synchronization has
> > finished in front of apply (sublsn set), but before dropping the
> > origin and other final cleanups. The tablesync worker will restart and redo
> > the cleanup if it failed after reaching the new state. Besides, since
> > the changes can already be applied on the table in the PRE_SYNCDONE state,
> > I also modified the check in should_apply_changes_for_rel(). And
> > some other conditions for the origin drop in subscription commands
> > were adjusted in this patch.
> >
> 
> Here are some review comments for the 0001 patch
> 
> ==
> General Comment
> 
> 0.
> The idea of using the extra relstate for clean-up seems OK, but the
> implementation of the new state in this patch appears misordered and
> misnamed to me.
> 
> The state name should indicate what it is doing (PRE_SYNCDONE is
> meaningless). The patch describes in several places that this state means
> "synchronized, but not yet cleaned up" therefore IMO it means the SYNCDONE
> state should be *before* this new state. And since this new state is for
> "cleanup" then let's call it something like that.
> 
> To summarize, I don’t think the meaning of SYNCDONE should be touched.
> SYNCDONE means the synchronization is done, same as before. And your new
> "cleanup" state belongs directly *after* that. IMO it should be like this:
> 
> 1. STATE_INIT
> 2. STATE_DATASYNC
> 3. STATE_FINISHEDCOPY
> 4. STATE_SYNCDONE
> 5. STATE_CLEANUP <-- new relstate
> 6. STATE_READY
> 
> Of course, this is going to impact almost every aspect of the patch, but I 
> think
> everything will be basically the same as you have it now
> -- only all the state names and comments need to be adjusted according to the
> above.

Although I agree that CLEANUP is easier to understand, I am a bit concerned
that the changes would be rather invasive.

If we add a CLEANUP state at the end as suggested, it will change the meaning
of the existing SYNCDONE state: before the change it means both data sync and
cleanup have been done, but after the change it only means the data sync is
over. This also means that all the current C code that treats SYNCDONE as
the final state of table sync will need to be changed. Moreover, it's common
for users to query the relation state from pg_subscription_rel to identify
whether the table sync of a table is finished (e.g. check relstate IN ('r', 's')),
but if we add a new state (CLEANUP) as the final state, then all such SQL would
need to be changed to check something like relstate IN ('r', 'x') (where 'x' is
the new cleanup state).
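
For example, a monitoring query of the kind I mean might look like this (just
a sketch; the subscription name is hypothetical):

    -- "Has the initial sync of each table in this subscription finished?"
    -- Today, 's' (SYNCDONE) and 'r' (READY) are the states such checks test for.
    SELECT srrelid::regclass, srsubstate
    FROM pg_subscription_rel
    WHERE srsubid = (SELECT oid FROM pg_subscription WHERE subname = 'mysub')
      AND srsubstate IN ('r', 's');

If a new cleanup state became the final state, every such query would also
need to add its letter to the IN list.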

Best Regards,
Hou zj


Re: Perform streaming logical transactions by background workers and parallel apply

2023-02-06 Thread Amit Kapila
On Tue, Feb 7, 2023 at 12:41 PM Masahiko Sawada  wrote:
>
> On Fri, Feb 3, 2023 at 6:44 PM Amit Kapila  wrote:
>
> > We need to think of a predictable
> > way to test this path which may not be difficult. But I guess it would
> > be better to wait for some feedback from the field about this feature
> > before adding more to it and anyway it shouldn't be a big deal to add
> > this later as well.
>
> Agreed to hear some feedback before adding it. It's not an urgent feature.
>

Okay, thanks! AFAIK, there is no pending patch left in this proposal.
If so, I think it is better to close the corresponding CF entry.

-- 
With Regards,
Amit Kapila.




RE: [PATCH] Reuse Workers and Replication Slots during Logical Replication

2023-02-06 Thread wangw.f...@fujitsu.com
On Wed, Feb 1, 2023 20:07 PM Melih Mutlu  wrote:
> Thanks for pointing out this review. I somehow skipped that, sorry.
> 
> Please see attached patches.

Thanks for updating the patch set.
Here are some comments.

1. In the function ApplyWorkerMain.
+   /* This is main apply worker */
+   run_apply_worker(, myslotname, originname, 
sizeof(originname), _startpos);

I think we need to keep the worker name as "leader apply worker" in the comment
like the current HEAD.

---

2. In the function LogicalRepApplyLoop.
+* can be reused, we need to take care of 
memory contexts here
+* before moving to sync a table.
+*/
+   if (MyLogicalRepWorker->ready_to_reuse)
+   {
+   
MemoryContextResetAndDeleteChildren(ApplyMessageContext);
+   MemoryContextSwitchTo(TopMemoryContext);
+   return;
+   }

I think in this case we also need to pop the error context stack before
returning. Otherwise, I think we might use the wrong callback
(apply_error_callback) after we return from this function.

---

3. About the function UpdateSubscriptionRelReplicationSlot.
This newly introduced function UpdateSubscriptionRelReplicationSlot does not
seem to be invoked. Do we need this function?

Regards,
Wang Wei


Re: OpenSSL 3.0.0 vs old branches

2023-02-06 Thread Peter Eisentraut

On 06.02.23 16:56, Andrew Dunstan wrote:
I recently moved crake to a new machine running Fedora 36, which has 
OpenSSL 3.0.0. This causes the SSL tests to fail on branches earlier 
than release 13, so I propose to backpatch commit f0d2c65f17 to the 
release 11 and 12 branches.


This is not the only patch that we did to support OpenSSL 3.0.0.  There 
was a very lengthy discussion that resulted in various patches.  Unless 
we have a complete analysis of what was done and how it affects various 
branches, I would not do this.  Notably, we did actually consider what 
to backpatch, and the current state is the result of that.  So let's not 
throw that away without considering that carefully.  Even if it gets it 
to compile, I personally would not *trust* it without that analysis.  I 
think we should just leave it alone and consider OpenSSL 3.0.0 
unsupported in the branches where it is now unsupported.  OpenSSL 1.1.1 
is still supported upstream to serve those releases.






Re: Perform streaming logical transactions by background workers and parallel apply

2023-02-06 Thread Masahiko Sawada
On Fri, Feb 3, 2023 at 6:44 PM Amit Kapila  wrote:
>
> On Fri, Feb 3, 2023 at 1:28 PM Masahiko Sawada  wrote:
> >
> > On Fri, Feb 3, 2023 at 12:29 PM houzj.f...@fujitsu.com
> >  wrote:
> > >
> > > On Friday, February 3, 2023 11:04 AM Amit Kapila 
> > >  wrote:
> > > >
> > > > On Thu, Feb 2, 2023 at 4:52 AM Peter Smith 
> > > > wrote:
> > > > >
> > > > > Some minor review comments for v91-0001
> > > > >
> > > >
> > > > Pushed this yesterday after addressing your comments!
> > >
> > > Thanks for pushing.
> > >
> > > Currently, we have two remaining patches which we are not sure whether 
> > > it's worth
> > > committing for now. Just share them here for reference.
> > >
> > > 0001:
> > >
> > > Based on our discussion[1] on -hackers, it's not clear whether it's
> > > necessary
> > > to add the sub-feature to stop extra workers when
> > > max_apply_workers_per_subscription is reduced. Because:
> > >
> > > - it's not clear whether reducing the
> > > 'max_apply_workers_per_subscription' is very
> > >   common.
> >
> > A use case I'm concerned about is a temporarily intensive data load,
> > for example, a data loading batch job in a maintenance window. In this
> > case, the user might want to temporarily increase
> > max_parallel_workers_per_subscription in order to avoid a large
> > replication lag, and revert the change back to normal after the job.
> > If the regular workload is unlikely to stream changes because
> > logical_decoding_work_mem is big enough to handle the regular
> > transaction data, the excess parallel workers won't exit.
> >
>
> In such a case, wouldn't it be better to just switch off the
> parallel option for the subscription?

Not sure. Changing the parameter would be easier since it doesn't
require restarts.

> We need to think of a predictable
> way to test this path which may not be difficult. But I guess it would
> be better to wait for some feedback from the field about this feature
> before adding more to it and anyway it shouldn't be a big deal to add
> this later as well.

Agreed to hear some feedback before adding it. It's not an urgent feature.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




A bug in make_outerjoininfo

2023-02-06 Thread Richard Guo
In cases where we have any clauses between two outer joins, these
clauses should be treated as degenerate clauses in the upper OJ, and
they may prevent us from re-ordering the two outer joins.  Previously we
had the flag 'delay_upper_joins' to help avoid the re-ordering in such
cases.

Since b448f1c8, remove_useless_result_rtes removes useless FromExprs and
merges their quals up to the parent.  This makes the 'delay_upper_joins'
flag unnecessary if the clauses between the two outer joins come from
FromExprs.  However, if the clauses between the two outer joins come
from the JoinExpr of an inner join, it seems we have holes in preserving
the ordering.  As an example, consider

create table t (a int unique);

select * from t t1 left join (t t2 left join t t3 on t2.a = t3.a) inner
join t t4 on coalesce(t3.a,1) = t4.a on t1.a = t2.a;

When building SpecialJoinInfo for outer join t1/t2, make_outerjoininfo
thinks identity 3 applies between OJ t1/t2 and OJ t2/t3, which is wrong
as there is an inner join between them.

This query will trigger the assertion for cross-checking on nullingrels
in search_indexed_tlist_for_var.

Thanks
Richard


Re: Remove some useless casts to (void *)

2023-02-06 Thread Peter Eisentraut

On 03.02.23 00:59, Corey Huinker wrote:
On Thu, Feb 2, 2023 at 5:22 PM Peter Eisentraut wrote:


I have found that in some corners of the code some calls to standard C
functions are decorated with casts to (void *) for no reason, and this
code pattern then gets copied around.  I have gone through and cleaned
this up a bit, in the attached patches.

The involved functions are: repalloc, memcpy, memset, memmove, memcmp,
qsort, bsearch

Also hash_search(), for which there was a historical reason (the
argument used to be char *), but not anymore.


+1


committed


All code is example code.


I like that one!





Re: Make mesage at end-of-recovery less scary.

2023-02-06 Thread Kyotaro Horiguchi
Thanks!

At Fri, 3 Feb 2023 15:16:02 +0100, Alvaro Herrera  
wrote in 
> So this patch is now failing because it applies new tests to
> 011_crash_recovery.pl, which was removed recently.  Can you please move
> them elsewhere?

I didn't find an appropriate file to move them to. In the end I created a
new file with the name 034_recovery.pl.  I added a test for standbys,
too (which is the first objective of this patch).

> I think the comment for ValidXLogRecordLength should explain what the
> return value is.

Agreed.


/*
  * Validate record length of an XLOG record header.
  *
  * This is substantially a part of ValidXLogRecordHeader.  But XLogReadRecord
  * needs this separate from the function in case of a partial record header.
+ *
+ * Returns true if the xl_tot_len header field has a seemingly valid value,
+ * which means the caller can proceed reading to the following part of the
+ * record.
  */
 static bool
 ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,

I added a similar description to ValidXLogRecordHeader.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From c58ca4d5db52c75dec9882d158d5724e12617005 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Wed, 30 Nov 2022 11:51:46 +0900
Subject: [PATCH v24] Make End-Of-Recovery error less scary

When recovery of any type ends, we see a somewhat scary error message like
"invalid record length" that suggests something serious is
happening. Actually, if recovery meets a record with length = 0, that
usually means it has finished applying all available WAL records.

Make this message less scary, as "reached end of WAL", and instead raise
the error level for other kinds of WAL failure to WARNING.  To make
sure that the detection is correct, this patch checks that all trailing
bytes in the same page are zeroed in that case.
---
 src/backend/access/transam/xlogreader.c   | 144 +-
 src/backend/access/transam/xlogrecovery.c |  94 ++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 +-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/034_recovery.pl   | 135 
 6 files changed, 335 insertions(+), 59 deletions(-)
 create mode 100644 src/test/recovery/t/034_recovery.pl

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index aa6c929477..8cb2d55333 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -48,6 +48,8 @@ static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
 static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -149,6 +151,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -558,6 +561,7 @@ XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking)
 	/* reset error state */
 	state->errormsg_buf[0] = '\0';
 	decoded = NULL;
+	state->EndOfWAL = false;
 
 	state->abortedRecPtr = InvalidXLogRecPtr;
 	state->missingContrecPtr = InvalidXLogRecPtr;
@@ -640,25 +644,21 @@ restart:
 	Assert(pageHeaderSize <= readOff);
 
 	/*
-	 * Read the record length.
+	 * Validate the record header.
 	 *
-	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of the
-	 * struct, so it must be on this page (the records are MAXALIGNed), but we
-	 * cannot access any other fields until we've verified that we got the
-	 * whole header.
-	 */
-	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
-
-	/*
-	 * If the whole record header is on this page, validate it immediately.
-	 * Otherwise do just a basic sanity check on xl_tot_len, and validate the
-	 * rest of the header after reading it from the next page.  The xl_tot_len
+	 * Even though we use an XLogRecord pointer here, the whole record header
+	 * might not fit on this page.  If the whole record header is on this page,
+	 * validate it immediately.  Even otherwise xl_tot_len must be on this page
+	 * (it is the first field of MAXALIGNed records), but we still cannot
+	 * access any further fields until we've verified that we got the whole
+	 * header, so do just a basic sanity check on record length, and validate
+	 * the rest of the header after reading it from the next page.  The length
 	 * check is necessary here to ensure that we enter the "Need to reassemble
 	 * 

Re: A problem in deconstruct_distribute_oj_quals

2023-02-06 Thread Tom Lane
Richard Guo  writes:
> In deconstruct_distribute_oj_quals, when we've identified a commutable
> left join which provides join clause with flexible semantics, we try to
> generate multiple versions of the join clause.  Here we have the logic
> that puts back any ojrelids that were removed from its min_righthand.

>  /*
>   * Put any OJ relids that were removed from min_righthand back into
>   * ojscope, else distribute_qual_to_rels will complain.
>   */
>  ojscope = bms_join(ojscope, bms_intersect(sjinfo->commute_below,
>sjinfo->syn_righthand));

> I doubt this is necessary.  It seems to me that all relids mentioned
> within the join clause have already been contained in ojscope, which is
> the union of min_lefthand and min_righthand.

Hmm ... that was needed at some point in the development of that
function, but maybe it isn't as the code stands now.  It does look
like the "this_ojscope" manipulations within the loop cover this.

> I noticed this code because I came across a problem with a query as
> below.

> create table t (a int);

> select t1.a from (t t1 left join t t2 on true) left join (t t3 left join t
> t4 on t3.a = t4.a) on t2.a = t3.a;

> When we deal with qual 't2.a = t3.a', deconstruct_distribute_oj_quals
> would always add the OJ relid of t3/t4 into its required_relids, due to
> the code above, which I think is wrong.  The direct consequence is that
> we would miss the plan that joins t2 and t3 directly.

I don't see any change in this query plan when I remove that code, so
I'm not sure you're explaining your point very well.

regards, tom lane




Re: [PATCH] Compression dictionaries for JSONB

2023-02-06 Thread Nikita Malakhov
Hi,

On updating dictionary -

>You cannot "just" refresh a dictionary used once to compress an
>object, because you need it to decompress the object too.

and when you have many, updating an existing dictionary requires
going through all objects compressed with it in the whole database.
How to implement this feature correctly is a very tricky question.
Also, there are some thoughts on using JSON schema to optimize
storage for JSON objects.
(That applies to TOAST too, so at first glance we've decided
to forbid dropping or changing TOAST implementations already
registered in a particular database.)

In my experience, in the modern world, even with fast SSD storage
arrays, with a large database (about 40-50 TB) we had disk access
as a bottleneck more often than CPU, except for cases with
a lot of parallel execution threads for a single query (Oracle).

On Mon, Feb 6, 2023 at 10:33 PM Andres Freund  wrote:

> Hi,
>
> On 2023-02-06 16:16:41 +0100, Matthias van de Meent wrote:
> > On Mon, 6 Feb 2023 at 15:03, Aleksander Alekseev
> >  wrote:
> > >
> > > Hi,
> > >
> > > I see your point regarding the fact that creating dictionaries on a
> > > training set is too beneficial to neglect it. Can't argue with this.
> > >
> > > What puzzles me though is: what prevents us from doing this on a page
> > > level as suggested previously?
> >
> > The complexity of page-level compression is significant, as pages are
> > currently a base primitive of our persistency and consistency scheme.
>
> +many
>
> It's also not all a panacea performance-wise, datum-level decompression can
> often be deferred much longer than page level decompression. For things
> like
> json[b], you'd hopefully normally have some "pre-filtering" based on proper
> columns, before you need to dig into the json datum.
>
> It's also not necessarily that good, compression ratio wise. Particularly
> for
> wider datums you're not going to be able to remove much duplication,
> because
> there's only a handful of tuples. Consider the case of json keys - the
> dictionary will often do better than page level compression, because it'll
> have the common keys in the dictionary, which means the "full" keys never
> will
> have to appear on a page, whereas page-level compression will have the
> keys on
> it, at least once.
>
> Of course you can use a dictionary for page-level compression too, but the
> gains when it works well will often be limited, because in most OLTP usable
> page-compression schemes I'm aware of, you can't compress a page all that
> far
> down, because you need a small number of possible "compressed page sizes".
>
>
> > > More similar data you compress the more space and disk I/O you save.
> > > Additionally you don't have to compress/decompress the data every time
> > > you access it. Everything that's in shared buffers is uncompressed.
> > > Not to mention the fact that you don't care what's in pg_attribute,
> > > the fact that schema may change, etc. There is a table and a
> > > dictionary for this table that you refresh from time to time. Very
> > > simple.
> >
> > You cannot "just" refresh a dictionary used once to compress an
> > object, because you need it to decompress the object too.
>
> Right. That's what I was trying to refer to when mentioning that we might
> need
> to add a bit of additional information to the varlena header for datums
> compressed with a dictionary.
>
> Greetings,
>
> Andres Freund
>
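
To make the "pre-filtering" point above a bit more concrete, the pattern is
roughly the following (just a sketch; the table and column names are
hypothetical):

    -- Predicates on ordinary columns can discard most rows before the wide
    -- jsonb datum ever has to be detoasted and decompressed, so datum-level
    -- decompression is deferred or skipped entirely.
    SELECT payload->>'customer_id', payload->'items'
    FROM orders
    WHERE region = 'EU'
      AND created_at >= now() - interval '1 day';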


-- 
Regards,

--
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/


Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-06 Thread Amit Kapila
On Tue, Feb 7, 2023 at 10:42 AM Peter Smith  wrote:
>
> On Tue, Feb 7, 2023 at 4:02 PM Amit Kapila  wrote:
> >
> > On Tue, Feb 7, 2023 at 10:07 AM Kyotaro Horiguchi
> >  wrote:
> > >
> > > At Tue, 7 Feb 2023 09:10:01 +0530, Amit Kapila  
> > > wrote in
> > > > On Tue, Feb 7, 2023 at 6:03 AM Peter Smith  
> > > > wrote:
> > > > > 5b.
> > > > > Since there are no translator considerations here why not write the
> > > > > second error like:
> > > > >
> > > > > errmsg("%d ms is outside the valid range for parameter
> > > > > \"min_apply_delay\" (%d .. %d)",
> > > > > result, 0, PG_INT32_MAX))
> > > > >
> > > >
> > > > I see that existing usage in the code matches what the patch had
> > > > before this comment. See below and similar usages in the code.
> > > > if (start <= 0)
> > > > ereport(ERROR,
> > > > (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > > > errmsg("invalid value for parameter \"%s\": %d",
> > > > "start", start)));
> > >
> > The same errmsg text occurs many times in the tree. On the other hand
> > > the pointed message is the only one.  I suppose Peter considered this
> > > aspect.
> > >
> > > # "%d%s%s is outside the valid range for parameter \"%s\" (%d .. %d)"
> > > # also appears just once
> > >
> > As for me, it seems to me a good practice to do that regardless of the
> > > number of duplicates to (semi)mechanically avoid duplicates.
> > >
> > > (But I believe I would do as Peter suggests by myself for the first
> > > cut, though:p)
> > >
> >
> > Personally, I would prefer consistency. I think we can later start a
> > new thread to change the existing message and if there is a consensus
> > and value in the same then we could use the same style here as well.
> >
>
> Of course, if there is a convention then we should stick to it.
>
> My understanding was that (string literal) message parameters are
> specified separately from the message format string primarily as an
> aid to translators. That makes good sense for parameters with names
> that are also English words (like "start" etc), but for non-word
> parameters like "min_apply_delay" there is no such ambiguity in the
> first place.
>

TBH, I am not an expert in this matter. So, to avoid making any
mistakes, I thought of keeping it close to the existing style.

-- 
With Regards,
Amit Kapila.




Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-06 Thread Amit Kapila
On Tue, Feb 7, 2023 at 10:13 AM Kyotaro Horiguchi
 wrote:
>
> At Mon, 6 Feb 2023 13:10:01 +, "Takamichi Osumi (Fujitsu)" 
>  wrote in
> > The attached patch v29 has included your changes.
>
> catalogs.sgml
>
> +  
> +   The minimum delay (ms) for applying changes.
> +  
>
> I think we don't use unit symbols that way. Namely I think we would
> write it as "The minimum delay for applying changes in milliseconds"
>

Okay, if we prefer to use milliseconds, then how about: "The minimum
delay, in milliseconds, for applying changes"?
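
For instance, if I am reading the patch correctly, the value is interpreted in
milliseconds by default and a unit suffix is also accepted, along the lines of
(a sketch; the subscription name is hypothetical):

    -- delay applying changes on the subscriber by 3 seconds
    ALTER SUBSCRIPTION mysub SET (min_apply_delay = 3000);
    -- or, equivalently, with an explicit unit
    ALTER SUBSCRIPTION mysub SET (min_apply_delay = '3s');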

>
> alter_subscription.sgml
>
>are slot_name,
>synchronous_commit,
>binary, streaming,
> -  disable_on_error, and
> -  origin.
> +  disable_on_error,
> +  origin, and
> +  min_apply_delay.
>   
>
> By the way, is there any rule for the order among the words?
>

Currently, it is in the order in which the corresponding features are added.

> They
> don't seem to be in alphabetical order, nor in the same order as the
> create_subscription page.
>

On the create_subscription page too, it appears to be in the order in
which those were added, with the difference that they are divided into two
categories (parameters that control what happens during subscription
creation, and parameters that control the subscription's replication
behavior after it has been created).

>  (It seems to be in the order of the SUBOPT_*
> symbols, but I'm not sure that's a good idea.)
>
>
> subscriptioncmds.c
>
> +   if (opts.streaming == 
> LOGICALREP_STREAM_PARALLEL &&
> +   !IsSet(opts.specified_opts, 
> SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0)
> ..
> +   if (opts.min_apply_delay > 0 &&
> +   !IsSet(opts.specified_opts, 
> SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL)
>
> Don't we wrap the lines?
>
>
> worker.c
>
> +   if (wal_receiver_status_interval > 0 &&
> +   diffms > wal_receiver_status_interval * 1000L)
> +   {
> +   WaitLatch(MyLatch,
> + WL_LATCH_SET | WL_TIMEOUT | 
> WL_EXIT_ON_PM_DEATH,
> + wal_receiver_status_interval * 
> 1000L,
> + WAIT_EVENT_RECOVERY_APPLY_DELAY);
> +   send_feedback(last_received, true, false, true);
> +   }
> +   else
> +   WaitLatch(MyLatch,
> + WL_LATCH_SET | WL_TIMEOUT | 
> WL_EXIT_ON_PM_DEATH,
> + diffms,
> + WAIT_EVENT_RECOVERY_APPLY_DELAY);
>
> send_feedback always handles the case where
> wal_receiver_status_interval == 0.
>

It only handles that when force is false, but here we are passing it as
true. So I am not sure if what you said would be an improvement.

>  thus we can simply wait for
> min(wal_receiver_status_interval, diffms) then call send_feedback()
> unconditionally.
>
>
> -start_apply(XLogRecPtr origin_startpos)
> +start_apply(void)
>
> -LogicalRepApplyLoop(XLogRecPtr last_received)
> +LogicalRepApplyLoop(void)
>
> Does this patch require this change?
>

I think this is because the scope of last_received has been changed so
that it can be passed to send_feedback() during the delay.

-- 
With Regards,
Amit Kapila.




Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-06 Thread Peter Smith
On Tue, Feb 7, 2023 at 4:02 PM Amit Kapila  wrote:
>
> On Tue, Feb 7, 2023 at 10:07 AM Kyotaro Horiguchi
>  wrote:
> >
> > At Tue, 7 Feb 2023 09:10:01 +0530, Amit Kapila  
> > wrote in
> > > On Tue, Feb 7, 2023 at 6:03 AM Peter Smith  wrote:
> > > > 5b.
> > > > Since there are no translator considerations here why not write the
> > > > second error like:
> > > >
> > > > errmsg("%d ms is outside the valid range for parameter
> > > > \"min_apply_delay\" (%d .. %d)",
> > > > result, 0, PG_INT32_MAX))
> > > >
> > >
> > > I see that existing usage in the code matches what the patch had
> > > before this comment. See below and similar usages in the code.
> > > if (start <= 0)
> > > ereport(ERROR,
> > > (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > > errmsg("invalid value for parameter \"%s\": %d",
> > > "start", start)));
> >
> > The same errmsg text occurs many times in the tree. On the other hand
> > the pointed message is the only one.  I suppose Peter considered this
> > aspect.
> >
> > # "%d%s%s is outside the valid range for parameter \"%s\" (%d .. %d)"
> > # also appears just once
> >
> > As for me, it seems to me a good practice to do that regardless of the
> > number of duplicates to (semi)mechanically avoid duplicates.
> >
> > (But I believe I would do as Peter suggests by myself for the first
> > cut, though:p)
> >
>
> Personally, I would prefer consistency. I think we can later start a
> new thread to change the existing message and if there is a consensus
> and value in the same then we could use the same style here as well.
>

Of course, if there is a convention then we should stick to it.

My understanding was that (string literal) message parameters are
specified separately from the message format string primarily as an
aid to translators. That makes good sense for parameters with names
that are also English words (like "start" etc), but for non-word
parameters like "min_apply_delay" there is no such ambiguity in the
first place.

Anyway, I am fine with it being written either way.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-06 Thread Amit Kapila
On Tue, Feb 7, 2023 at 10:07 AM Kyotaro Horiguchi
 wrote:
>
> At Tue, 7 Feb 2023 09:10:01 +0530, Amit Kapila  
> wrote in
> > On Tue, Feb 7, 2023 at 6:03 AM Peter Smith  wrote:
> > > 5b.
> > > Since there are no translator considerations here why not write the
> > > second error like:
> > >
> > > errmsg("%d ms is outside the valid range for parameter
> > > \"min_apply_delay\" (%d .. %d)",
> > > result, 0, PG_INT32_MAX))
> > >
> >
> > I see that existing usage in the code matches what the patch had
> > before this comment. See below and similar usages in the code.
> > if (start <= 0)
> > ereport(ERROR,
> > (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > errmsg("invalid value for parameter \"%s\": %d",
> > "start", start)));
>
> The same errmsg text occurs many times in the tree. On the other hand
> the pointed message is the only one.  I suppose Peter considered this
> aspect.
>
> # "%d%s%s is outside the valid range for parameter \"%s\" (%d .. %d)"
> # also appears just once
>
> As for me, it seems to me a good practice to do that regardless of the
> number of duplicates to (semi)mechanically avoid duplicates.
>
> (But I believe I would do as Peter suggests by myself for the first
> cut, though:p)
>

Personally, I would prefer consistency. I think we can later start a
new thread to change the existing message and if there is a consensus
and value in the same then we could use the same style here as well.

-- 
With Regards,
Amit Kapila.




Re: Where is the logig to create a table file?

2023-02-06 Thread Kyotaro Horiguchi
At Fri, 3 Feb 2023 13:44:46 +0400, Pavel Borisov  wrote 
in 
> Hi, Jack
> 
> On Fri, 3 Feb 2023 at 13:19, jack...@gmail.com  wrote:
> >
> > When I use 'create table t(a int);', suppose that this table t's OID is
> > 1200;
> > then postgres will create a file named 1200 in $PGDATA/base. So where
> > is the logic for this in the internals?
> >
> heapam_relation_set_new_filenode()->RelationCreateStorage()

Or if you are searching for the logic to determine the file name, see
GetNewRelFileNumber().
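
Incidentally, you can also check the resulting file path from SQL (a small
sketch, using your example table):

    create table t (a int);
    -- shows the path relative to $PGDATA, e.g. base/<database oid>/<relfilenode>
    select pg_relation_filepath('t');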

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Exit walsender before confirming remote flush in logical replication

2023-02-06 Thread Andres Freund
Hi,

On 2023-02-07 09:00:13 +0530, Amit Kapila wrote:
> On Tue, Feb 7, 2023 at 2:04 AM Andres Freund  wrote:
> > How about we make it an option in START_REPLICATION? Delayed logical rep can
> > toggle that on by default.

> Works for me. So, when this option is set in START_REPLICATION
> message, walsender will set some flag and allow itself to exit at
> shutdown without waiting for WAL to be sent?

Yes. I think that might be useful in other situations as well, but we don't
need to make those configurable initially. But I imagine it'd be useful to set
things up so that non-HA physical replicas don't delay shutdown, particularly
if they're geographically far away.

Greetings,

Andres Freund




Re: In-placre persistance change of a relation

2023-02-06 Thread Kyotaro Horiguchi
Thank you for the comment!

At Fri, 3 Feb 2023 08:42:52 +0100, Heikki Linnakangas  wrote 
in 
> I want to call out this part of this patch:
> 
> > Also this allows for the cleanup of files left behind by a crash of
> > the transaction that created them.
> 
> This is interesting to a lot wider audience than ALTER TABLE SET
> LOGGED/UNLOGGED. It also adds most of the complexity, with the new
> marker files. Can you please split the first patch into two:
> 
> 1. Cleanup of newly created relations on crash
> 
> 2. ALTER TABLE SET LOGGED/UNLOGGED changes
> 
> Then we can review the first part independently.

Ah, indeed.  I'll do that.

> Regarding the first part, I'm not sure the marker files are the best
> approach to implement it. You need to create an extra file for every
> relation, just to delete it at commit. It feels a bit silly, but maybe

Agreed. (But I haven't come up with a better idea.)

> it's OK in practice. The undo log patch set solved this problem with
> the undo log, but it looks like that patch set isn't going
> anywhere. Maybe invent a very lightweight version of the undo log for
> this?

I hadn't thought along that line. Yes, indeed, the marker files are a kind
of undo log.

Anyway, I'll split the current patch into two parts as suggested.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: tests against running server occasionally fail, postgres_fdw & tenk1

2023-02-06 Thread Andres Freund
Hi,

On 2023-02-06 19:29:46 -0800, Andres Freund wrote:
> There's something off. Isolationtester's control connection emits *loads* of
> invalidation messages:
> 2023-02-06 19:29:06.430 PST [2125297][client 
> backend][6/0:121864][isolation/receipt-report/control connection] LOG:  
> previously emitted 7662 messages, 21 this time
> 2023-02-06 19:29:06.566 PST [2125297][client 
> backend][6/0:121873][isolation/receipt-report/control connection] LOG:  
> previously emitted 8155 messages, 99 this time
> 2023-02-06 19:29:06.655 PST [2125297][client 
> backend][6/0:121881][isolation/receipt-report/control connection] LOG:  
> previously emitted 8621 messages, 99 this time
> 2023-02-06 19:29:06.772 PST [2125297][client 
> backend][6/0:121892][isolation/receipt-report/control connection] LOG:  
> previously emitted 9208 messages, 85 this time
> 2023-02-06 19:29:06.867 PST [2125297][client 
> backend][6/0:121900][isolation/receipt-report/control connection] LOG:  
> previously emitted 9674 messages, 85 this time

Ah, that's just due to setup-teardown happening in the control connection.


FWIW, I see plenty of sinval resets even if I increase the sinval queue size
substantially. I suspect we've increased the number of sinval messages we send
a good bit over time, due to additional syscaches.

As we only process catchup interrupts while in ReadCommand(), it's not
surprising that we very quickly get behind. A single statement suffices.

Anyway, that's not really a correctness issue, just a performance issue.


But the postgres_fdw.sql vulnerability to syscache resets seems not
great. Perhaps pgfdw_inval_callback() could check if the definition of the
foreign server actually changed, before dropping the connection?

Greetings,

Andres Freund




Re: run pgindent on a regular basis / scripted manner

2023-02-06 Thread Noah Misch
On Mon, Feb 06, 2023 at 06:17:02PM +0100, Peter Eisentraut wrote:
> Also, pgindent takes tens of seconds to run, so hooking that into the git
> push process would slow this down quite a bit.

The pre-receive hook would do a full pgindent when you change typedefs.list.
Otherwise, it would reindent only the files being changed.  The average push
need not take tens of seconds.

> If somehow your local
> indenting doesn't give you the "correct" result for some reason, you might
> sit there for minutes and minutes trying to fix and push and fix and push.

As Andres mentioned, the hook could print the command it used.  It could even
print the diff it found, for you to apply.

On Mon, Feb 06, 2023 at 04:36:07PM -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2023-02-06 18:17:02 +0100, Peter Eisentraut wrote:
> >> First, as a matter of principle, it would introduce another level of
> >> gatekeeping power.  Right now, the committers are as a group in charge of
> >> what gets into the tree.  Adding commit hooks that are installed
> >> somewhere(?) by someone(?) and can only be seen by some(?) would upset 
> >> that.
> >> If we were using something like github or gitlab (not suggesting that, but
> >> for illustration), then you could put this kind of thing under .github/ or
> >> similar and then it would be under the same control as the source code
> >> itself.
> 
> > Well, we did talk about adding a pre-commit hook to the repository, with
> > instructions for how to enable it. And I don't see a problem with adding the
> > pre-receive we're discussing here to src/tools/something.
> 
> Yeah.  I don't think we are seriously considering putting any restrictions
> in place on gitmaster

I could have sworn that was exactly what we were discussing, a pre-receive
hook on gitmaster.




Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-06 Thread Kyotaro Horiguchi
Thanks!

At Mon, 6 Feb 2023 13:10:01 +, "Takamichi Osumi (Fujitsu)" 
 wrote in 
> The attached patch v29 has included your changes.

catalogs.sgml

+  
+   The minimum delay (ms) for applying changes.
+  

I think we don't use unit symbols that way. Namely I think we would
write it as "The minimum delay for applying changes in milliseconds"


alter_subscription.sgml

   are slot_name,
   synchronous_commit,
   binary, streaming,
-  disable_on_error, and
-  origin.
+  disable_on_error,
+  origin, and
+  min_apply_delay.
  

By the way, is there any rule for the order among the words? They
don't seem to be in alphabetical order, nor in the same order as the
create_subscription page.  (It seems to be in the order of the SUBOPT_*
symbols, but I'm not sure that's a good idea.)


subscriptioncmds.c

+   if (opts.streaming == 
LOGICALREP_STREAM_PARALLEL &&
+   !IsSet(opts.specified_opts, 
SUBOPT_MIN_APPLY_DELAY) && sub->minapplydelay > 0)
..
+   if (opts.min_apply_delay > 0 &&
+   !IsSet(opts.specified_opts, 
SUBOPT_STREAMING) && sub->stream == LOGICALREP_STREAM_PARALLEL)

Don't we wrap the lines?


worker.c

+   if (wal_receiver_status_interval > 0 &&
+   diffms > wal_receiver_status_interval * 1000L)
+   {
+   WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | 
WL_EXIT_ON_PM_DEATH,
+ wal_receiver_status_interval * 1000L,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);
+   send_feedback(last_received, true, false, true);
+   }
+   else
+   WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | 
WL_EXIT_ON_PM_DEATH,
+ diffms,
+ WAIT_EVENT_RECOVERY_APPLY_DELAY);

send_feedback always handles the case where
wal_receiver_status_interval == 0.  Thus we can simply wait for
min(wal_receiver_status_interval, diffms) then call send_feedback()
unconditionally.


-start_apply(XLogRecPtr origin_startpos)
+start_apply(void)

-LogicalRepApplyLoop(XLogRecPtr last_received)
+LogicalRepApplyLoop(void)

Does this patch require this change?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: heapgettup refactoring

2023-02-06 Thread David Rowley
On Fri, 3 Feb 2023 at 15:26, David Rowley  wrote:
> I've pushed all but the final 2 patches now.

I just pushed the final patch in the series.  I held back on moving
the setting of rs_inited back into the heapgettup_initial_block()
helper function as I wondered if we should even keep that field.

It seems that rs_cblock == InvalidBlockNumber in all cases where
rs_inited == false, so maybe it's better just to use that as a
condition to check if the scan has started or not. I've attached a
patch which does that.

I ended up adjusting HeapScanDescData more than what is minimally
required to remove rs_inited. I wondered if rs_cindex should be closer
to rs_cblock in the struct so that we're more likely to be adjusting
the same cache line than if that field were closer to the end of the
struct.  We don't need rs_coffset and rs_cindex at the same time, so I
made it a union. I see that the struct is still 712 bytes before and
after this change. I've not yet tested to see if there are any
performance gains to this change.

David
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 7eb79cee58..e171d6e38b 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -321,13 +321,15 @@ initscan(HeapScanDesc scan, ScanKey key, bool 
keep_startblock)
}
 
scan->rs_numblocks = InvalidBlockNumber;
-   scan->rs_inited = false;
scan->rs_ctup.t_data = NULL;
ItemPointerSetInvalid(&scan->rs_ctup.t_self);
scan->rs_cbuf = InvalidBuffer;
scan->rs_cblock = InvalidBlockNumber;
 
-   /* page-at-a-time fields are always invalid when not rs_inited */
+   /*
+* page-at-a-time fields are always invalid when
+* rs_cblock == InvalidBlockNumber
+*/
 
/*
 * copy the scan key, if appropriate
@@ -355,7 +357,8 @@ heap_setscanlimits(TableScanDesc sscan, BlockNumber 
startBlk, BlockNumber numBlk
 {
HeapScanDesc scan = (HeapScanDesc) sscan;
 
-   Assert(!scan->rs_inited);   /* else too late to change */
+   /* we can't set any limits when there's a scan already running */
+   Assert(scan->rs_cblock == InvalidBlockNumber);
/* else rs_startblock is significant */
Assert(!(scan->rs_base.rs_flags & SO_ALLOW_SYNC));
 
@@ -493,7 +496,7 @@ heapgetpage(TableScanDesc sscan, BlockNumber block)
 static BlockNumber
 heapgettup_initial_block(HeapScanDesc scan, ScanDirection dir)
 {
-   Assert(!scan->rs_inited);
+   Assert(scan->rs_cblock == InvalidBlockNumber);
 
/* When there are no pages to scan, return InvalidBlockNumber */
if (scan->rs_nblocks == 0 || scan->rs_numblocks == 0)
@@ -559,7 +562,7 @@ heapgettup_start_page(HeapScanDesc scan, ScanDirection dir, 
int *linesleft,
 {
Pagepage;
 
-   Assert(scan->rs_inited);
+   Assert(scan->rs_cblock != InvalidBlockNumber);
Assert(BufferIsValid(scan->rs_cbuf));
 
/* Caller is responsible for ensuring buffer is locked if needed */
@@ -592,7 +595,7 @@ heapgettup_continue_page(HeapScanDesc scan, ScanDirection 
dir, int *linesleft,
 {
Pagepage;
 
-   Assert(scan->rs_inited);
+   Assert(scan->rs_cblock != InvalidBlockNumber);
Assert(BufferIsValid(scan->rs_cbuf));
 
/* Caller is responsible for ensuring buffer is locked if needed */
@@ -717,9 +720,9 @@ heapgettup_advance_block(HeapScanDesc scan, BlockNumber 
block, ScanDirection dir
  * the scankeys.
  *
  * Note: when we fall off the end of the scan in either direction, we
- * reset rs_inited.  This means that a further request with the same
- * scan direction will restart the scan, which is a bit odd, but a
- * request with the opposite scan direction will start a fresh scan
+ * reset rs_cblock to InvalidBlockNumber.  This means that a further request
+ * with the same scan direction will restart the scan, which is a bit odd, but
+ * a request with the opposite scan direction will start a fresh scan
  * in the proper direction.  The latter is required behavior for cursors,
  * while the former case is generally undefined behavior in Postgres
  * so we don't care too much.
@@ -737,12 +740,11 @@ heapgettup(HeapScanDesc scan,
OffsetNumber lineoff;
int linesleft;
 
-   if (unlikely(!scan->rs_inited))
+   if (unlikely(scan->rs_cblock == InvalidBlockNumber))
{
block = heapgettup_initial_block(scan, dir);
/* ensure rs_cbuf is invalid when we get InvalidBlockNumber */
Assert(block != InvalidBlockNumber || 
!BufferIsValid(scan->rs_cbuf));
-   scan->rs_inited = true;
}
else
{
@@ -824,7 +826,6 @@ continue_page:
scan->rs_cbuf = InvalidBuffer;
scan->rs_cblock = InvalidBlockNumber;
tuple->t_data = NULL;
-   scan->rs_inited = false;
 }
 
 /* 
@@ -852,12 +853,11 @@ heapgettup_pagemode(HeapScanDesc scan,
 

Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-06 Thread Kyotaro Horiguchi
At Tue, 7 Feb 2023 09:10:01 +0530, Amit Kapila  wrote 
in 
> On Tue, Feb 7, 2023 at 6:03 AM Peter Smith  wrote:
> > 5b.
> > Since there are no translator considerations here why not write the
> > second error like:
> >
> > errmsg("%d ms is outside the valid range for parameter
> > \"min_apply_delay\" (%d .. %d)",
> > result, 0, PG_INT32_MAX))
> >
> 
> I see that existing usage in the code matches what the patch had
> before this comment. See below and similar usages in the code.
> if (start <= 0)
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> errmsg("invalid value for parameter \"%s\": %d",
> "start", start)));

The same errmsg text occurs many times in the tree. On the other hand
the pointed message is the only one.  I suppose Peter considered this
aspect.

# "%d%s%s is outside the valid range for parameter \"%s\" (%d .. %d)"
# also appears just once

As for me, it seems to me a good practice to do that regardless of the
number of duplicates to (semi)mechanically avoid duplicates.

(But I believe I would do as Peter suggests by myself for the first
cut, though:p)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Deadlock between logrep apply worker and tablesync worker

2023-02-06 Thread Peter Smith
On Fri, Feb 3, 2023 at 6:58 PM houzj.f...@fujitsu.com
 wrote:
>
...
> > Right, I think that case could be addressed by Tom's patch to some extent 
> > but
> > I am thinking we should also try to analyze if we can completely avoid the 
> > need
> > to remove origins from both processes. One idea could be to introduce
> > another relstate something like PRE_SYNCDONE and set it in a separate
> > transaction before we set the state as SYNCDONE and remove the slot and
> > origin in tablesync worker.
> > Now, if the tablesync worker errors out due to some reason during the second
> > transaction, it can remove the slot and origin after restart by checking 
> > the state.
> > However, it would add another relstate which may not be the best way to
> > address this problem. Anyway, that can be accomplished as a separate patch.
>
> Here is an attempt to achieve the same.
> Basically, the patch removes the code that drops the origin in the apply worker. 
> And
> add a new state PRE_SYNCDONE after synchronization finished in front of apply
> (sublsn set), but before dropping the origin and other final cleanups. The
> tablesync will restart and redo the cleanup if it failed after reaching the 
> new
> state. Besides, since the changes can already be applied on the table in
> PRE_SYNCDONE state, so I also modified the check in
> should_apply_changes_for_rel(). And some other conditions for the origin drop
> in subscription commands were adjusted in this patch.
>

Here are some review comments for the 0001 patch

==
General Comment

0.
The idea of using the extra relstate for clean-up seems OK, but the
implementation of the new state in this patch appears misordered and
misnamed to me.

The state name should indicate what it is doing (PRE_SYNCDONE is
meaningless). The patch describes in several places that this state
means "synchronized, but not yet cleaned up" therefore IMO it means
the SYNCDONE state should be *before* this new state. And since this
new state is for "cleanup" then let's call it something like that.

To summarize, I don’t think the meaning of SYNCDONE should be touched.
SYNCDONE means the synchronization is done, same as before. And your
new "cleanup" state belongs directly *after* that. IMO it should be
like this:

1. STATE_INIT
2. STATE_DATASYNC
3. STATE_FINISHEDCOPY
4. STATE_SYNCDONE
5. STATE_CLEANUP <-- new relstate
6. STATE_READY

Of course, this is going to impact almost every aspect of the patch,
but I think everything will be basically the same as you have it now
-- only all the state names and comments need to be adjusted according
to the above.

==
Commit Message

1.
Previously, we allowed the apply worker to drop the origin to avoid the case
that the tablesync worker fails to the origin(due to crash). In this case we
don't restart the tablesync worker, and the apply worker can clean the origin.

~

There seem to be some words missing in this paragraph.

SUGGESTION
Previously, we allowed the apply worker to drop the origin as a way to
recover from the scenario where the tablesync worker failed to drop it
(due to crash).

~~~

2.
To improve this, we introduce a new relstate SUBREL_STATE_PRE_SYNCDONE which
will be set after synchronization finished in front of apply (sublsn set), but
before dropping the origin and other final cleanups. The apply worker will
restart tablesync worker if the relstate is SUBREL_STATE_PRE_SYNCDONE. This
way, even if the tablesync worker error out in the transaction that tries to
drop the origin, the apply worker will restart the tablesync worker to redo the
cleanup(for origin and other stuff) and then directly exit.

~

2a.
This is going to be impacted by my "General Comment". Notice how you again
describe "will be set after synchronization finished".  This is again
evidence that the new CLEANUP state should directly follow the SYNCDONE state.

2b.
"error out” --> "encounters an error"

2c.
"cleanup(for origin" --> space before the "("

==
doc/src/sgml/catalogs.sgml

3.
@@ -8071,7 +8071,8 @@ SCRAM-SHA-256$iteration
count:
i = initialize,
d = data is being copied,
f = finished table copy,
-   s = synchronized,
+   p = synchronized but not yet cleaned up,
+   s = synchronization done,
r = ready (normal replication)
   
  
@@ -8082,8 +8083,8 @@ SCRAM-SHA-256$iteration
count:
   
   
Remote LSN of the state change used for synchronization coordination
-   when in s or r states,
-   otherwise null
+   when in p, s or
+   r states, otherwise null
   
  
 

This state order and choice of the letter are impacted by my "General Comment".

IMO it should be more like this:

State code: i = initialize, d = data is being copied, f = finished
table copy,  s = synchronization done, c = clean-up done, r = ready
(normal replication)

==
src/backend/commands/subscriptioncmds.c

4. AlterSubscription_refresh

Some adjustments are needed according to my "General Comment".

~~~


Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-06 Thread Amit Kapila
On Tue, Feb 7, 2023 at 6:03 AM Peter Smith  wrote:
>
> 5.
> +defGetMinApplyDelay(DefElem *def)
> +{
> + char*input_string;
> + int result;
> + const char *hintmsg;
> +
> + input_string = defGetString(def);
> +
> + /*
> + * Parse given string as parameter which has millisecond unit
> + */
> + if (!parse_int(input_string, , GUC_UNIT_MS, ))
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("invalid value for parameter \"%s\": \"%s\"",
> + "min_apply_delay", input_string),
> + hintmsg ? errhint("%s", _(hintmsg)) : 0));
> +
> + /*
> + * Check both the lower boundary for the valid min_apply_delay range and
> + * the upper boundary as the safeguard for some platforms where INT_MAX is
> + * wider than int32 respectively. Although parse_int() has confirmed that
> + * the result is less than or equal to INT_MAX, the value will be stored
> + * in a catalog column of int32.
> + */
> + if (result < 0 || result > PG_INT32_MAX)
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
> + result,
> + "min_apply_delay",
> + 0, PG_INT32_MAX)));
> +
> + return result;
> +}
>
> 5a.
> Since there are no translator considerations here why not write the
> first error like:
>
> errmsg("invalid value for parameter \"min_apply_delay\": \"%s\"",
> input_string)
>
> ~
>
> 5b.
> Since there are no translator considerations here why not write the
> second error like:
>
> errmsg("%d ms is outside the valid range for parameter
> \"min_apply_delay\" (%d .. %d)",
> result, 0, PG_INT32_MAX))
>

I see that existing usage in the code matches what the patch had
before this comment. See below and similar usages in the code.
if (start <= 0)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("invalid value for parameter \"%s\": %d",
"start", start)));

-- 
With Regards,
Amit Kapila.




Re: Exit walsender before confirming remote flush in logical replication

2023-02-06 Thread Amit Kapila
On Tue, Feb 7, 2023 at 2:04 AM Andres Freund  wrote:
>
> On 2023-02-06 12:23:54 +0530, Amit Kapila wrote:
> > On Mon, Feb 6, 2023 at 10:33 AM Andres Freund  wrote:
> > > Smart shutdown is practically unusable. I don't think it makes sense to 
> > > tie behavior of walsender to it in any way.
> > >
> >
> > So, we have the following options: (a) do nothing for this; (b)
> > clarify the current behavior in docs. Any suggestions?
>
> b) seems good.
>
> I also think it'd make sense to improve this on a code-level. Just not in the
> wholesale way discussed so far.
>
> How about we make it an option in START_REPLICATION? Delayed logical rep can
> toggle that on by default.
>

Works for me. So, when this option is set in START_REPLICATION
message, walsender will set some flag and allow itself to exit at
shutdown without waiting for WAL to be sent?

-- 
With Regards,
Amit Kapila.




Re: tests against running server occasionally fail, postgres_fdw & tenk1

2023-02-06 Thread Andres Freund
Hi,

On 2023-02-06 17:53:00 -0800, Andres Freund wrote:
> WRT the fdw_retry_check: I wonder if we should increase the log level of
> a) pgfdw_inval_callback deciding to disconnect
> b) ReceiveSharedInvalidMessages() deciding to reset
>
> to DEBUG1, at least temporarily?
>
> Alternatively we could add a
> SET log_min_messages=debug4;
> ...
> RESET log_min_messages;
>
> around the postgres_fdw connection re-establishment test?
>
>
> One thing nudging me towards the more global approach is that I have the vague
> feelign that there's a wider issue around hitting more sinval resets than we
> should - and it'd right now be very hard to know about that.

Luckily it proved to be easy enough to reproduce on a private branch, by
setting the test to repeat a couple of times.

I set the aforementioned log messages to LOG. And indeed:

2023-02-07 02:54:18.773 UTC [10800][client backend] 
[pg_regress/postgres_fdw][7/10526:0] LOG:  cache state reset
2023-02-07 02:54:18.773 UTC [10800][client backend] 
[pg_regress/postgres_fdw][7/10526:0] LOG:  discarding connection 0x802251f00

that was preceded by another log message less than 200 ms before:
2023-02-07 02:54:18.588 UTC [10800][client backend] 
[pg_regress/postgres_fdw][7/10523:55242] STATEMENT:  ALTER SERVER loopback 
OPTIONS (application_name 'fdw_retry_check');

The log statements inbetween are about isolation/reindex-concurrently-toast
and pg_regress/indexing.

So the problem is indeed that we somehow quickly overflow the sinval queue. I
guess we need a bit more logging around the size of the sinval queue and its
"fullness"?


I'm a bit surprised that MAXNUMMESSAGES is a hardcoded 4096.  It's not
particularly surprising that that's quickly overflown?


There's something off. Isolationtester's control connection emits *loads* of
invalidation messages:
2023-02-06 19:29:06.430 PST [2125297][client 
backend][6/0:121864][isolation/receipt-report/control connection] LOG:  
previously emitted 7662 messages, 21 this time
2023-02-06 19:29:06.566 PST [2125297][client 
backend][6/0:121873][isolation/receipt-report/control connection] LOG:  
previously emitted 8155 messages, 99 this time
2023-02-06 19:29:06.655 PST [2125297][client 
backend][6/0:121881][isolation/receipt-report/control connection] LOG:  
previously emitted 8621 messages, 99 this time
2023-02-06 19:29:06.772 PST [2125297][client 
backend][6/0:121892][isolation/receipt-report/control connection] LOG:  
previously emitted 9208 messages, 85 this time
2023-02-06 19:29:06.867 PST [2125297][client 
backend][6/0:121900][isolation/receipt-report/control connection] LOG:  
previously emitted 9674 messages, 85 this time

and this happens with lots of other tests.

Greetings,

Andres Freund


PS: The reindex-concurrently-toast output seems to indicate something is
broken in the test... There are lots of references to non-existing tables in
the expected file, without that immediately making sense.




Re: Perform streaming logical transactions by background workers and parallel apply

2023-02-06 Thread Amit Kapila
On Mon, Feb 6, 2023 at 3:43 PM houzj.f...@fujitsu.com
 wrote:
>
> while reading the code, I noticed that in pa_send_data() we set wait event
> to WAIT_EVENT_LOGICAL_PARALLEL_APPLY_STATE_CHANGE while sending the
> message to the queue. Because this state is used in multiple places, users
> might not be able to distinguish what they are waiting for. So it seems we'd
> better use WAIT_EVENT_MQ_SEND here, which will be easier to distinguish and
> understand. Here is a tiny patch for that.
>

Thanks for noticing this. The patch LGTM. I'll push this in some time.

-- 
With Regards,
Amit Kapila.




A problem in deconstruct_distribute_oj_quals

2023-02-06 Thread Richard Guo
In deconstruct_distribute_oj_quals, when we've identified a commutable
left join which provides a join clause with flexible semantics, we try to
generate multiple versions of the join clause.  Here we have the logic
that puts back any ojrelids that were removed from its min_righthand.

 /*
  * Put any OJ relids that were removed from min_righthand back into
  * ojscope, else distribute_qual_to_rels will complain.
  */
 ojscope = bms_join(ojscope, bms_intersect(sjinfo->commute_below,
   sjinfo->syn_righthand));

I doubt this is necessary.  It seems to me that all relids mentioned
within the join clause are already contained in ojscope, which is
the union of min_lefthand and min_righthand.

I noticed this code because I came across a problem with a query as
below.

create table t (a int);

select t1.a from (t t1 left join t t2 on true) left join (t t3 left join t
t4 on t3.a = t4.a) on t2.a = t3.a;

When we deal with qual 't2.a = t3.a', deconstruct_distribute_oj_quals
would always add the OJ relid of t3/t4 into its required_relids, due to
the code above, which I think is wrong.  The direct consequence is that
we would miss the plan that joins t2 and t3 directly.

If we add a unique constraint on 'a' and try the outer-join removal
logic, we would notice that the left join of t2/t3 cannot be removed
because its join qual is treated as pushed down, due to the fact that its
required_relids exceed the scope of the join.  I think this is also not
correct.
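
For reference, that experiment can be tried with something along these lines
(illustrative only; the constraint name is arbitrary, and the query is the
same one as above):

alter table t add constraint t_a_unique unique (a);

explain (costs off)
select t1.a from (t t1 left join t t2 on true)
  left join (t t3 left join t t4 on t3.a = t4.a) on t2.a = t3.a;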

So is it safe to remove that code?

Thanks
Richard


RE: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-06 Thread Hayato Kuroda (Fujitsu)
Dear Peter,

Thank you for reviewing! PSA new version.

> ==
> Commit Message
> 
> 1.
> Discussion:
> https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4r
> zr...@mail.gmail.com
> 
> tmp
> 
> ~
> 
> What's that "tmp" doing there? A typo?

Removed. It was a typo.
I used the `git rebase` command to combine the local commits,
but the old commit message seems to have remained.

> ==
> doc/src/sgml/catalogs.sgml
> 
> 2.
> + 
> +  
> +   subminapplydelay int4
> +  
> +  
> +   The minimum delay (ms) for applying changes.
> +  
> + 
> 
> For consistency remove the period (.) because the other
> single-sentence descriptions on this page do not have one.

I have confirmed this as well and agree. Fixed.

> ==
> src/backend/commands/subscriptioncmds.c
> 
> 3. AlterSubscription
> + errmsg("cannot set parallel streaming mode for subscription with %s",
> +"min_apply_delay"));
> 
> Since there are no translator considerations here why not write it like this:
> 
> errmsg("cannot set parallel streaming mode for subscription with
> min_apply_delay")

Fixed.

> ~~~
> 
> 4. AlterSubscription
> + errmsg("cannot set %s for subscription in parallel streaming mode",
> +"min_apply_delay"));
> 
> Since there are no translator considerations here why not write it like this:
> 
> errmsg("cannot set min_apply_delay for subscription in parallel streaming 
> mode")

Fixed.

> ~~~
> 
> 5.
> +defGetMinApplyDelay(DefElem *def)
> +{
> + char*input_string;
> + int result;
> + const char *hintmsg;
> +
> + input_string = defGetString(def);
> +
> + /*
> + * Parse given string as parameter which has millisecond unit
> + */
> + if (!parse_int(input_string, , GUC_UNIT_MS, ))
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("invalid value for parameter \"%s\": \"%s\"",
> + "min_apply_delay", input_string),
> + hintmsg ? errhint("%s", _(hintmsg)) : 0));
> +
> + /*
> + * Check both the lower boundary for the valid min_apply_delay range and
> + * the upper boundary as the safeguard for some platforms where INT_MAX is
> + * wider than int32 respectively. Although parse_int() has confirmed that
> + * the result is less than or equal to INT_MAX, the value will be stored
> + * in a catalog column of int32.
> + */
> + if (result < 0 || result > PG_INT32_MAX)
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
> + result,
> + "min_apply_delay",
> + 0, PG_INT32_MAX)));
> +
> + return result;
> +}
> 
> 5a.
> Since there are no translator considerations here why not write the
> first error like:
> 
> errmsg("invalid value for parameter \"min_apply_delay\": \"%s\"",
> input_string)
> 
> ~
> 
> 5b.
> Since there are no translator considerations here why not write the
> second error like:
> 
> errmsg("%d ms is outside the valid range for parameter
> \"min_apply_delay\" (%d .. %d)",
> result, 0, PG_INT32_MAX))

Both of the points you mentioned were fixed.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

> -Original Message-
> From: Peter Smith 
> Sent: Tuesday, February 7, 2023 9:33 AM
> To: Osumi, Takamichi/大墨 昂道 
> Cc: Amit Kapila ; Shi, Yu/侍 雨
> ; Kyotaro Horiguchi ;
> vignes...@gmail.com; Kuroda, Hayato/黒田 隼人
> ; shveta.ma...@gmail.com; dilipbal...@gmail.com;
> eu...@eulerto.com; m.melihmu...@gmail.com; and...@anarazel.de;
> mar...@f10.com.br; pgsql-hack...@postgresql.org
> Subject: Re: Time delayed LR (WAS Re: logical replication restrictions)
> 
> Here are my review comments for v29-0001.
> 
> ==
> Commit Message
> 
> 1.
> Discussion:
> https://postgr.es/m/CAB-JLwYOYwL=XTyAXKiH5CtM_Vm8KjKh7aaitCKvmCh4r
> zr...@mail.gmail.com
> 
> tmp
> 
> ~
> 
> What's that "tmp" doing there? A typo?
> 
> ==
> doc/src/sgml/catalogs.sgml
> 
> 2.
> + 
> +  
> +   subminapplydelay int4
> +  
> +  
> +   The minimum delay (ms) for applying changes.
> +  
> + 
> 
> For consistency remove the period (.) because the other
> single-sentence descriptions on this page do not have one.
> 
> ==
> src/backend/commands/subscriptioncmds.c
> 
> 3. AlterSubscription
> + errmsg("cannot set parallel streaming mode for subscription with %s",
> +"min_apply_delay"));
> 
> Since there are no translator considerations here why not write it like this:
> 
> errmsg("cannot set parallel streaming mode for subscription with
> min_apply_delay")
> 
> ~~~
> 
> 4. AlterSubscription
> + errmsg("cannot set %s for subscription in parallel streaming mode",
> +"min_apply_delay"));
> 
> Since there are no translator considerations here why not write it like this:
> 
> errmsg("cannot set min_apply_delay for subscription in parallel streaming 
> mode")
> 
> ~~~
> 
> 5.
> +defGetMinApplyDelay(DefElem *def)
> +{
> + char*input_string;
> + int result;
> + const char *hintmsg;
> +
> + input_string = defGetString(def);
> +
> + /*
> + * Parse given string as parameter which has millisecond unit
> + */

RE: [PATCH] Reuse Workers and Replication Slots during Logical Replication

2023-02-06 Thread shiy.f...@fujitsu.com

On Thu, Feb 2, 2023 11:48 AM shveta malik  wrote:
> 
> On Wed, Feb 1, 2023 at 5:42 PM Melih Mutlu 
> wrote:
> >
> > Hi Shveta,
> >
> > shveta malik , 1 Şub 2023 Çar, 15:01 tarihinde
> şunu yazdı:
> >>
> >> On Wed, Feb 1, 2023 at 5:05 PM Melih Mutlu 
> wrote:
> >> 2) I found a crash in the previous patch (v9), but have not tested it
> >> on the latest yet. Crash happens when all the replication slots are
> >> consumed and we are trying to create new. I tweaked the settings like
> >> below so that it can be reproduced easily:
> >> max_sync_workers_per_subscription=3
> >> max_replication_slots = 2
> >> and then ran the test case shared by you. I think there is some memory
> >> corruption happening. (I did test in debug mode, have not tried in
> >> release mode). I tried to put some traces to identify the root-cause.
> >> I observed that worker_1 keeps on moving from 1 table to another table
> >> correctly, but at some point, it gets corrupted i.e. origin-name
> >> obtained for it is wrong and it tries to advance that and since that
> >> origin does not exist, it  asserts and then something else crashes.
> >> From log: (new trace lines added by me are prefixed by shveta, also
> >> tweaked code to have my comment 1 fixed to have clarity on worker-id).
> >>
> >> form below traces, it is clear that worker_1 was moving from one
> >> relation to another, always getting correct origin 'pg_16688_1', but
> >> at the end it got 'pg_16688_49' which does not exist. Second part of
> >> trace shows who updated 'pg_16688_49', it was done by worker_49
> which
> >> even did not get chance to create this origin due to max_rep_slot
> >> reached.
> >
> >
> > Thanks for investigating this error. I think it's the same error as the one 
> > Shi
> reported earlier. [1]
> > I couldn't reproduce it yet but will apply your tweaks and try again.
> > Looking into this.
> >
> > [1] https://www.postgresql.org/message-
> id/OSZPR01MB631013C833C98E826B3CFCB9FDC69%40OSZPR01MB6310.jpn
> prd01.prod.outlook.com
> >
> 
> Hi Melih,
> I think I am able to identify the root cause. It is not memory
> corruption, but the way origin-names are stored in system-catalog
> mapped to a particular relation-id before even those are created.
> 
> After adding few more logs:
> 
> [4227] LOG:  shveta- LogicalRepSyncTableStart- worker_49 constructed
> originname :pg_16684_49, relid:16540
> [4227] LOG:  shveta- LogicalRepSyncTableStart- worker_49
> updated-origin in system catalog:pg_16684_49,
> slot:pg_16684_sync_49_7195149685251088378, relid:16540
> [4227] LOG:  shveta- LogicalRepSyncTableStart- worker_49
> get-origin-id2 originid:0, originname:pg_16684_49
> [4227] ERROR:  could not create replication slot
> "pg_16684_sync_49_7195149685251088378": ERROR:  all replication slots
> are in use
> HINT:  Free one or increase max_replication_slots.
> 
> 
> [4428] LOG:  shveta- LogicalRepSyncTableStart- worker_148 constructed
> originname :pg_16684_49, relid:16540
> [4428] LOG:  could not drop replication slot
> "pg_16684_sync_49_7195149685251088378" on publisher: ERROR:
> replication slot "pg_16684_sync_49_7195149  685251088378" does not
> exist
> [4428] LOG:  shveta- LogicalRepSyncTableStart- worker_148 drop-origin
> originname:pg_16684_49
> [4428] LOG:  shveta- LogicalRepSyncTableStart- worker_148
> updated-origin:pg_16684_49,
> slot:pg_16684_sync_148_7195149685251088378, relid:16540
> 
> So from above, worker_49 came and picked up relid:16540, constructed
> origin-name and slot-name and updated in system-catalog and then it
> tried to actually create that slot and origin but since max-slot count
> was reached, it failed and exited, but did not do cleanup from system
> catalog for that relid.
> 
> Then worker_148 came and also picked up table 16540 since it was not
> completed/started by previous worker, but this time it found that
> origin and slot entry present in system-catalog against this relid, so
> it picked the same names and started processing, but since those do
> not exist, it asserted while advancing the origin.
> 
> The assert is only reproduced when an already running worker (say
> worker_1) who has 'created=true' set, gets to sync the relid for which
> a previously failed worker has tried and updated origin-name w/o
> creating it. In such a case worker_1 (with created=true) will try to
> reuse the origin and thus will try to advance it and will end up
> asserting. That is why you might not see the assertion always. The
> condition 'created' is set to true for that worker and it goes to
> reuse the origin updated by the previous worker.
> 
> So to fix this, I think either we update origin and slot entries in
> the system catalog after the creation has passed or we clean-up the
> system catalog in case of failure. What do you think?
> 

I think the first way seems better.

I reproduced the problem I reported before with latest patch (v7-0001,
v10-0002), and looked into this problem. It is caused by a similar reason. Here
is some analysis 

RE: Perform streaming logical transactions by background workers and parallel apply

2023-02-06 Thread Hayato Kuroda (Fujitsu)
Dear Hou,

> I think PARALLEL_APPLY_MAIN waits for two kinds of events: 1) wait for a new
> message from the queue 2) wait for the partial file state to be set. So, I
> think introducing a new general event for them is better, and it is also
> consistent with the WAIT_EVENT_LOGICAL_APPLY_MAIN which is used in the main
> loop of the leader apply worker (LogicalRepApplyLoop). But the event in
> pa_send_data() is only for message send, so it seems fine to use
> WAIT_EVENT_MQ_SEND; besides, MQ_SEND is also unique in the parallel apply
> worker, so users can distinguish it without adding a new event.

Thank you for your explanation. I think what both of you said is reasonable.


Best Regards,
Hayato Kuroda
FUJITSU LIMITED



Re: tests against running server occasionally fail, postgres_fdw & tenk1

2023-02-06 Thread Andres Freund
Hi,

On 2022-12-08 16:36:07 -0800, Andres Freund wrote:
> On 2022-12-08 16:15:11 -0800, Andres Freund wrote:
> > Unfortunately cfbot shows that that doesn't work entirely reliably.
> >
> > The most frequent case is postgres_fdw, which somewhat regularly fails with 
> > a
> > regression.diff like this:
> >
> > diff -U3 
> > /tmp/cirrus-ci-build/contrib/postgres_fdw/expected/postgres_fdw.out 
> > /tmp/cirrus-ci-build/build/testrun/postgres_fdw-running/regress/results/postgres_fdw.out
> > --- /tmp/cirrus-ci-build/contrib/postgres_fdw/expected/postgres_fdw.out 
> > 2022-12-08 20:35:24.772888000 +
> > +++ 
> > /tmp/cirrus-ci-build/build/testrun/postgres_fdw-running/regress/results/postgres_fdw.out
> > 2022-12-08 20:43:38.19945 +
> > @@ -9911,8 +9911,7 @@
> > WHERE application_name = 'fdw_retry_check';
> >   pg_terminate_backend
> >  --
> > - t
> > -(1 row)
> > +(0 rows)
> >
> >  -- This query should detect the broken connection when starting new remote
> >  -- transaction, reestablish new connection, and then succeed.

> >
> > I guess that a cache reset message arrives and leads to the connection being
> > terminated. Unfortunately that's hard to see right now, as the relevant log
> > messages are output with DEBUG3 - it's quite verbose, so enabling it for all
> > tests will be painful.
> >
> > I *think* I have seen this failure locally at least once, when running the
> > test normally.
> >
> >
> > I'll try to reproduce this locally for a bit. If I can't, the only other 
> > idea
> > I have for debugging this is to change log_min_messages in that section of 
> > the
> > postgres_fdw test to DEBUG3.
>
> Oh. I tried to reproduce the issue, without success so far, but eventually my
> test loop got stuck in something I reported previously and forgot about since:
> https://www.postgresql.org/message-id/20220925232237.p6uskba2dw6fnwj2%40awork3.anarazel.de
>
> I wonder if the timing on the freebsd CI task works out to hitting a "smaller
> version" of the problem that eventually resolves itself, which then leads to a
> sinval reset getting sent, causing the observable problem.

The issue referenced above is now fixed, and I haven't seen instances of it
since then. I also just now fixed the issue that often led to failing to
upload logs.

Unfortunately the fdw_retry_check issue from above has re-occurred since then:

https://cirrus-ci.com/task/4901157940756480
https://api.cirrus-ci.com/v1/artifact/task/4901157940756480/testrun/build/testrun/postgres_fdw-running/regress/regression.diffs


Another run hit an issue we've been fighting repeatedly on the buildfarm / CI:
https://cirrus-ci.com/task/5527490404286464
https://api.cirrus-ci.com/v1/artifact/task/5527490404286464/testrun/build/testrun/regress-running/regress/regression.diffs

diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/create_index.out 
/tmp/cirrus-ci-build/build/testrun/regress-running/regress/results/create_index.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/create_index.out 
2023-02-06 23:52:43.627604000 +
+++ 
/tmp/cirrus-ci-build/build/testrun/regress-running/regress/results/create_index.out
 2023-02-07 00:03:04.535232000 +
@@ -1930,12 +1930,13 @@
 SELECT thousand, tenthous FROM tenk1
 WHERE thousand < 2 AND tenthous IN (1001,3000)
 ORDER BY thousand;
-  QUERY PLAN

- Index Only Scan using tenk1_thous_tenthous on tenk1
-   Index Cond: (thousand < 2)
-   Filter: (tenthous = ANY ('{1001,3000}'::integer[]))
-(3 rows)
+  QUERY PLAN
+--
+ Sort
+   Sort Key: thousand
+   ->  Index Only Scan using tenk1_thous_tenthous on tenk1
+ Index Cond: ((thousand < 2) AND (tenthous = ANY 
('{1001,3000}'::integer[])))
+(4 rows)

 SELECT thousand, tenthous FROM tenk1
 WHERE thousand < 2 AND tenthous IN (1001,3000)


I'd be tempted to disable the test, but it also found genuine issues in a
bunch of CF entries, and all these test instabilities seem like ones we'd also
see, with a lower occurrence on the buildfarm.


WRT the fdw_retry_check: I wonder if we should increase the log level of
a) pgfdw_inval_callback deciding to disconnect
b) ReceiveSharedInvalidMessages() deciding to reset

to DEBUG1, at least temporarily?

Alternatively we could add a
SET log_min_messages=debug4;
...
RESET log_min_messages;

around the postgres_fdw connection re-establishment test?
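
Roughly like the following, i.e. only the connection re-establishment section
of postgres_fdw.sql would run with the raised level (the statements in between
are the existing fdw_retry_check steps, abbreviated here from memory):

SET log_min_messages = debug4;
SELECT pg_terminate_backend(pid) FROM pg_stat_activity
  WHERE application_name = 'fdw_retry_check';
-- ... the query that detects the broken connection and reconnects ...
RESET log_min_messages;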


One thing nudging me towards the more global approach is that I have the vague
feeling that there's a wider issue around hitting more sinval resets than we
should - and it'd right now be very hard to know about that.


Greetings,

Andres Freund




Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-06 Thread Peter Smith
Here are my review comments for v29-0001.

==
Commit Message

1.
Discussion: 
https://postgr.es/m/CAB-JLwYOYwL=xtyaxkih5ctm_vm8kjkh7aaitckvmch4rzr...@mail.gmail.com

tmp

~

What's that "tmp" doing there? A typo?

==
doc/src/sgml/catalogs.sgml

2.
+ 
+  
+   subminapplydelay int4
+  
+  
+   The minimum delay (ms) for applying changes.
+  
+ 

For consistency remove the period (.) because the other
single-sentence descriptions on this page do not have one.

==
src/backend/commands/subscriptioncmds.c

3. AlterSubscription
+ errmsg("cannot set parallel streaming mode for subscription with %s",
+"min_apply_delay"));

Since there are no translator considerations here why not write it like this:

errmsg("cannot set parallel streaming mode for subscription with
min_apply_delay")

~~~

4. AlterSubscription
+ errmsg("cannot set %s for subscription in parallel streaming mode",
+"min_apply_delay"));

Since there are no translator considerations here why not write it like this:

errmsg("cannot set min_apply_delay for subscription in parallel streaming mode")

~~~

5.
+defGetMinApplyDelay(DefElem *def)
+{
+ char*input_string;
+ int result;
+ const char *hintmsg;
+
+ input_string = defGetString(def);
+
+ /*
+ * Parse given string as parameter which has millisecond unit
+ */
+ if (!parse_int(input_string, , GUC_UNIT_MS, ))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": \"%s\"",
+ "min_apply_delay", input_string),
+ hintmsg ? errhint("%s", _(hintmsg)) : 0));
+
+ /*
+ * Check both the lower boundary for the valid min_apply_delay range and
+ * the upper boundary as the safeguard for some platforms where INT_MAX is
+ * wider than int32 respectively. Although parse_int() has confirmed that
+ * the result is less than or equal to INT_MAX, the value will be stored
+ * in a catalog column of int32.
+ */
+ if (result < 0 || result > PG_INT32_MAX)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("%d ms is outside the valid range for parameter \"%s\" (%d .. %d)",
+ result,
+ "min_apply_delay",
+ 0, PG_INT32_MAX)));
+
+ return result;
+}

5a.
Since there are no translator considerations here why not write the
first error like:

errmsg("invalid value for parameter \"min_apply_delay\": \"%s\"",
input_string)

~

5b.
Since there are no translator considerations here why not write the
second error like:

errmsg("%d ms is outside the valid range for parameter
\"min_apply_delay\" (%d .. %d)",
result, 0, PG_INT32_MAX))

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Generating code for query jumbling through gen_node_support.pl

2023-02-06 Thread Michael Paquier
On Sun, Feb 05, 2023 at 07:40:57PM +0900, Michael Paquier wrote:
> Comments or objections are welcome, of course.

Done this part.

> (FWIW, I'd like to think that there is an argument to normalize the
> A_Const nodes for a portion of the DDL queries, by ignoring their
> values in the query jumbling and mark a location, which would be
> really useful for some workloads, but that's a separate discussion I
> am keeping for later.)

And I have a patch pretty much OK for this part of the discussion.
Will post soon..
--
Michael




Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)

2023-02-06 Thread Thomas Munro
On Tue, Feb 7, 2023 at 1:06 PM Tomas Vondra
 wrote:
> On 2/7/23 00:48, Thomas Munro wrote:
> > On Tue, Feb 7, 2023 at 12:46 PM Tomas Vondra
> >  wrote:
> >> No, I left the workload as it was for the first lockup, so `make check`
> >> runs everything as is up until the "join" test suite.
> >
> > Wait, shouldn't that be join_hash?
>
> No, because join_hash does not exist on 11 (it was added in 12). Also,
> it actually locked up like this - that's the lockup I reported on 28/1.

Oh, good.  I had been trying to repro with 12 here and forgot that you
were looking at 11...




Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)

2023-02-06 Thread Tomas Vondra
On 2/7/23 00:48, Thomas Munro wrote:
> On Tue, Feb 7, 2023 at 12:46 PM Tomas Vondra
>  wrote:
>> No, I left the workload as it was for the first lockup, so `make check`
>> runs everything as is up until the "join" test suite.
> 
> Wait, shouldn't that be join_hash?

No, because join_hash does not exist on 11 (it was added in 12). Also,
it actually locked up like this - that's the lockup I reported on 28/1.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)

2023-02-06 Thread Thomas Munro
On Tue, Feb 7, 2023 at 12:46 PM Tomas Vondra
 wrote:
> No, I left the workload as it was for the first lockup, so `make check`
> runs everything as is up until the "join" test suite.

Wait, shouldn't that be join_hash?




Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)

2023-02-06 Thread Tomas Vondra



On 2/6/23 20:20, Andres Freund wrote:
> Hi,
> 
> On 2023-02-06 19:51:19 +0100, Tomas Vondra wrote:
>>> No. The only thing the machine is doing is
>>>
>>>   while /usr/bin/true; do
>>> make check
>>>   done
>>>
>>> I can't reduce the workload further, because the "join" test is in a
>>> separate parallel group (I cut down parallel_schedule). I could make the
>>> machine busier, of course.
>>>
>>> However, the other lockup I saw was when using serial_schedule, so I
>>> guess lower concurrency makes it more likely.
>>>
>>
>> FWIW the machine is now on run ~2700 without any further lockups :-/
>>
>> Seems it was quite lucky we hit it twice in a handful of attempts.
> 
> Did you cut down the workload before you reproduced it the first time, or
> after? It's quite possible that it's not reproducible in isolation.
> 

No, I left the workload as it was for the first lockup, so `make check`
runs everything as is up until the "join" test suite.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: gokiburi versus the back branches

2023-02-06 Thread Tom Lane
Michael Paquier  writes:
> On Mon, Feb 06, 2023 at 06:27:50PM -0500, Tom Lane wrote:
>> Is it time to back-patch that commit?

> Yes, this is my intention as of this message from last week, once this
> week's release is tagged:
> https://www.postgresql.org/message-id/y9smhxo51hrxa...@paquier.xyz

D'oh, I'd totally forgotten that conversation already :-(

regards, tom lane




Re: pg_upgrade test failure

2023-02-06 Thread Andres Freund
On 2023-02-06 14:14:22 -0800, Andres Freund wrote:
> On 2023-02-07 11:03:18 +1300, Thomas Munro wrote:
> > What I see is that there were 1254 FreeBSD tasks run in that window, of
> > which 163 failed, and (more interestingly) 111 of those failures succeeded
> > on every other platform.  And clicking on a few on cfbot's page reveals that
> > it's the new running stuff, and I'm still trying to find the interesting
> > logs...
> 
> I think I figured out why the logs frequently fail to upload - the server is
> still running, so the size changes during the upload, causing the upload to
> fail with errors like:
> 
> [12:46:43.552] Failed to upload artifacts: Put 
> "https://storage.googleapis.com/cirrus-ci-5309429912436736-3271c9/artifacts/postgresql-cfbot/postgresql/6729936359129088/testrun/build/testrun/runningcheck.log?X-Goog-Algorithm=GOOG4-RSA-SHA256=cirrus-ci%40cirrus-ci-community.iam.gserviceaccount.com%2F20230206%2Fauto%2Fstorage%2Fgoog4_request=20230206T124536Z=600=host%3Bx-goog-content-length-range%3Bx-goog-meta-created_by_task=8e84192cbc754180b8baa6c00c41b463f580fe7183f0e7113c253aac13cc2458b835caef7940b91e102e96d54cff2b5714c77390e74244e2fb88c00c9957a801e33cbee2ac960e0db8a01fe08ee945bedf4616881e6beafa3a162c22948ac0b9a9359d93e1f461fc9f49385b784b75d633f1b01805b987d9d53bc7fb55263917ec85180a2140659d50990f066160f03e8bb8984e8d2aadb64c875c253167cf24da152a18d69fcd3d941edce145931e4feb23dc8cf43de7b7bbfc565786c1c692406f2a0a127f30385a8c4b66f96709b51d26d3c71617991c731b0e7206ee3906338dedf6359412edd024f8c76bd33400f4c9320c2bde9512fa8bcd6289e54d52":
>  http2: request body larger than specified content length
> 
> I'm testing adding a pg_ctl stop to the on_failure right now.

Pushed the fix doing so.




Re: Temporary tables versus wraparound... again

2023-02-06 Thread Andres Freund
Hi,

On 2023-02-05 15:30:57 -0800, Andres Freund wrote:
> Hm, I suspect the problem is that we didn't shut down the server due to
> the error, so the log file was changing while cirrus was trying to
> upload.

Pushed a fix for that.

Greetings,

Andres Freund




Re: gokiburi versus the back branches

2023-02-06 Thread Michael Paquier
On Mon, Feb 06, 2023 at 06:27:50PM -0500, Tom Lane wrote:
> Is it time to back-patch that commit?  The alternative would be
> to run the animal with an ASLR-disabling environment variable.
> On the whole I think testing that f3e78069d works is more
> useful than working around lack of it.

Yes, this is my intention as of this message from last week, once this
week's release is tagged:
https://www.postgresql.org/message-id/y9smhxo51hrxa...@paquier.xyz

Thanks,
--
Michael




gokiburi versus the back branches

2023-02-06 Thread Tom Lane
I notice that Michael's new BF animal gokiburi is failing in
all the pre-v15 branches, though it's fine in v15 and HEAD.
It's evidently dying from ASLR effects because it's trying
to build with EXEC_BACKEND on Linux: there's lots of

2023-02-06 06:07:02.131 GMT [1503972] FATAL:  could not reattach to shared 
memory (key=813803, addr=0x8c3a5000): Invalid argument
2023-02-06 06:07:02.131 GMT [1503971] FATAL:  could not reattach to shared 
memory (key=813803, addr=0x8c3a5000): Invalid argument
2023-02-06 06:07:02.132 GMT [1503976] FATAL:  could not reattach to shared 
memory (key=813803, addr=0x8c3a5000): Invalid argument

in its logs.

The reason it's okay in v15 and up is presumably this:

Author: Thomas Munro 
Branch: master Release: REL_15_BR [f3e78069d] 2022-01-11 00:04:33 +1300

Make EXEC_BACKEND more convenient on Linux and FreeBSD.

Try to disable ASLR when building in EXEC_BACKEND mode, to avoid random
memory mapping failures while testing.  For developer use only, no
effect on regular builds.

Is it time to back-patch that commit?  The alternative would be
to run the animal with an ASLR-disabling environment variable.
On the whole I think testing that f3e78069d works is more
useful than working around lack of it.

regards, tom lane




Re: pg_upgrade test failure

2023-02-06 Thread Andres Freund
Hi,

On 2023-02-07 11:03:18 +1300, Thomas Munro wrote:
> On Tue, Feb 7, 2023 at 10:57 AM Andres Freund  wrote:
> > On February 6, 2023 1:51:20 PM PST, Thomas Munro  
> > wrote:
> > >Next up: the new "running" tests, spuriously failing around 8.8% of CI
> > >builds on FreeBSD.  I'll go and ping that thread...
> >
> > Is that rate unchanged? I thought I fixed the main issue last week?
> 
> Unfortunately my cfbot database only holds a week's history.

Would be interesting to increase that to a considerably longer time. I can't
imagine that that'd take all that much resources?


> What I see is that there were 1254 FreeBSD tasks run in that window, of
> which 163 failed, and (more interestingly) 111 of those failures succeeded
> on every other platform.  And clicking on a few on cfbot's page reveals that
> it's the new running stuff, and I'm still trying to find the interesting
> logs...

I think I figured out why the logs frequently fail to upload - the server is
still running, so the size changes during the upload, causing the upload to
fail with errors like:

[12:46:43.552] Failed to upload artifacts: Put 
"https://storage.googleapis.com/cirrus-ci-5309429912436736-3271c9/artifacts/postgresql-cfbot/postgresql/6729936359129088/testrun/build/testrun/runningcheck.log?X-Goog-Algorithm=GOOG4-RSA-SHA256=cirrus-ci%40cirrus-ci-community.iam.gserviceaccount.com%2F20230206%2Fauto%2Fstorage%2Fgoog4_request=20230206T124536Z=600=host%3Bx-goog-content-length-range%3Bx-goog-meta-created_by_task=8e84192cbc754180b8baa6c00c41b463f580fe7183f0e7113c253aac13cc2458b835caef7940b91e102e96d54cff2b5714c77390e74244e2fb88c00c9957a801e33cbee2ac960e0db8a01fe08ee945bedf4616881e6beafa3a162c22948ac0b9a9359d93e1f461fc9f49385b784b75d633f1b01805b987d9d53bc7fb55263917ec85180a2140659d50990f066160f03e8bb8984e8d2aadb64c875c253167cf24da152a18d69fcd3d941edce145931e4feb23dc8cf43de7b7bbfc565786c1c692406f2a0a127f30385a8c4b66f96709b51d26d3c71617991c731b0e7206ee3906338dedf6359412edd024f8c76bd33400f4c9320c2bde9512fa8bcd6289e54d52":
 http2: request body larger than specified content length

I'm testing adding a pg_ctl stop to the on_failure right now.

Greetings,

Andres Freund




Re: pg_upgrade test failure

2023-02-06 Thread Thomas Munro
On Tue, Feb 7, 2023 at 11:03 AM Thomas Munro  wrote:
> On Tue, Feb 7, 2023 at 10:57 AM Andres Freund  wrote:
> > On February 6, 2023 1:51:20 PM PST, Thomas Munro  
> > wrote:
> > >Next up: the new "running" tests, spuriously failing around 8.8% of CI
> > >builds on FreeBSD.  I'll go and ping that thread...
> >
> > Is that rate unchanged? I thought I fixed the main issue last week?
>
> Unfortunately my cfbot database only holds a week's history.  What I
> see is that there were 1254 FreeBSD tasks run in that window, of which
> 163 failed, and (more interestingly) 111 of those failures succeeded
> on every other platform.  And clicking on a few on cfbot's page
> reveals that it's the new running stuff, and I'm still trying to find
> the interesting logs...

Ah, that number might include some other problems, including in
subscription (#2900).  That's the problem with flapping tests, you get
desensitised and stop looking closely and miss things...




Re: pg_upgrade test failure

2023-02-06 Thread Thomas Munro
On Tue, Feb 7, 2023 at 10:57 AM Andres Freund  wrote:
> On February 6, 2023 1:51:20 PM PST, Thomas Munro  
> wrote:
> >Next up: the new "running" tests, spuriously failing around 8.8% of CI
> >builds on FreeBSD.  I'll go and ping that thread...
>
> Is that rate unchanged? I thought I fixed the main issue last week?

Unfortunately my cfbot database only holds a week's history.  What I
see is that there were 1254 FreeBSD tasks run in that window, of which
163 failed, and (more interestingly) 111 of those failures succeeded
on every other platform.  And clicking on a few on cfbot's page
reveals that it's the new running stuff, and I'm still trying to find
the interesting logs...




Re: OpenSSL 3.0.0 vs old branches

2023-02-06 Thread Tom Lane
Andrew Dunstan  writes:
> On 2023-02-06 Mo 11:13, Tom Lane wrote:
>> So presumably, changing this test would break it for OpenSSL 0.9.8,
>> which is still nominally supported in those branches.  On the other
>> hand, this test isn't run by default, so users would likely never
>> notice anyway.

> Presumably we don't have any buildfarm animals running with such old 
> versions of openssl, or they would be failing the same test on release 
> >= 13.

That test isn't run by default in the buildfarm either, no?

But indeed, probably nobody in the community is testing such builds
at all.  I did have such setups on my old dinosaur BF animals, but
they bit the dust last year for unrelated reasons.  I wonder how
realistic it is to claim that we still support those old OpenSSL
versions.

regards, tom lane




Re: pg_upgrade test failure

2023-02-06 Thread Andres Freund
Hi, 

On February 6, 2023 1:51:20 PM PST, Thomas Munro  wrote:
>Next up: the new "running" tests, spuriously failing around 8.8% of CI
>builds on FreeBSD.  I'll go and ping that thread...

Is that rate unchanged? I thought I fixed the main issue last week?

Greetings,

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: pg_upgrade test failure

2023-02-06 Thread Thomas Munro
On Wed, Feb 1, 2023 at 2:44 PM Thomas Munro  wrote:
> OK, I pushed that.  Third time lucky?

I pulled down logs for a week of Windows CI, just over ~1k builds.
The failure rate was a few per day before, but there are no failures
like that after that went in.  There are logs that contain the
"Directory not empty" warning, but clearly the
try-again-and-this-time-wait-for-the-other-process logic must be
working (as horrible as it is) because then the test checks that the
directory is gone, and succeeds.  Hooray.

So that's one of our biggest CI flappers fixed.  Unfortunately without
treating the root cause, really.

Next up: the new "running" tests, spuriously failing around 8.8% of CI
builds on FreeBSD.  I'll go and ping that thread...




Re: run pgindent on a regular basis / scripted manner

2023-02-06 Thread Tom Lane
Andres Freund  writes:
> On 2023-02-06 18:17:02 +0100, Peter Eisentraut wrote:
>> First, as a matter of principle, it would introduce another level of
>> gatekeeping power.  Right now, the committers are as a group in charge of
>> what gets into the tree.  Adding commit hooks that are installed
>> somewhere(?) by someone(?) and can only be seen by some(?) would upset that.
>> If we were using something like github or gitlab (not suggesting that, but
>> for illustration), then you could put this kind of thing under .github/ or
>> similar and then it would be under the same control as the source code
>> itself.

> Well, we did talk about adding a pre-commit hook to the repository, with
> instructions for how to enable it. And I don't see a problem with adding the
> pre-receive we're discussing here to src/tools/something.

Yeah.  I don't think we are seriously considering putting any restrictions
in place on gitmaster --- the idea is to offer better tools to committers
to let them check/fix the indentation of what they are working on.  If
somebody wants to run that as a local pre-commit hook, that's their choice.

regards, tom lane




Re: run pgindent on a regular basis / scripted manner

2023-02-06 Thread Andres Freund
Hi,

On 2023-02-06 18:17:02 +0100, Peter Eisentraut wrote:
> First, as a matter of principle, it would introduce another level of
> gatekeeping power.  Right now, the committers are as a group in charge of
> what gets into the tree.  Adding commit hooks that are installed
> somewhere(?) by someone(?) and can only be seen by some(?) would upset that.
> If we were using something like github or gitlab (not suggesting that, but
> for illustration), then you could put this kind of thing under .github/ or
> similar and then it would be under the same control as the source code
> itself.

Well, we did talk about adding a pre-commit hook to the repository, with
instructions for how to enable it. And I don't see a problem with adding the
pre-receive we're discussing here to src/tools/something.


> Also, pgindent takes tens of seconds to run, so hooking that into the git
> push process would slow this down quite a bit.  And maybe we want to add
> pgperltidy and so on, where would this lead?

Yes, I've been annoyed by this as well. This is painful, even without a push
hook.  Not just for pgindent, headerscheck/cpluspluscheck are quite painful as
well. I came to the conclusion that we ought to integrate pgindent etc into
the buildsystem properly. Instead of running such targets serially across all
files, without logic to prevent re-processing files, the relevant targets
ought to be run once for each process, and create a stamp file.



> If somehow your local indenting doesn't give you the "correct" result for
> some reason, you might sit there for minutes and minutes trying to fix and
> push and fix and push.

I was imagining that such a pre-receive hook would spit out the target that
you'd need to run locally to verify that the issue is resolved.


> Then, consider the typedefs issue.  If you add a typedef but don't add it to
> the typedefs list but otherwise pgindent your code perfectly, the push would
> be accepted.  If then later someone updates the typedefs list, perhaps from
> the build farm, it would then reject the indentation of your previously
> committed code, thus making it their problem.

I'd like to address this one via the buildsystem as well. We can do the
trickery that the buildfarm uses to extract typedefs as part of the build, and
update typedefs.list with the additional types.  There's really no need to
force us to do this manually.

Greetings,

Andres Freund




Re: OpenSSL 3.0.0 vs old branches

2023-02-06 Thread Andrew Dunstan


On 2023-02-06 Mo 11:13, Tom Lane wrote:

Andrew Dunstan  writes:

I recently moved crake to a new machine running Fedora 36, which has
OpenSSL 3.0.0. This causes the SSL tests to fail on branches earlier
than release 13, so I propose to backpatch commit f0d2c65f17 to the
release 11 and 12 branches.

Hmm ... according to that commit message,

   Note that the minimum supported OpenSSL version is 1.0.1 as of
   7b283d0e1d1d79bf1c962d790c94d2a53f3bb38a, so this does not introduce
   any new version requirements.

So presumably, changing this test would break it for OpenSSL 0.9.8,
which is still nominally supported in those branches.  On the other
hand, this test isn't run by default, so users would likely never
notice anyway.

On the whole, +1 for doing this (after the release freeze lifts).





Presumably we don't have any buildfarm animals running with such old
versions of openssl, or they would be failing the same test on release >= 13.



I'll push this in due course.


cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: Allow tailoring of ICU locales with custom rules

2023-02-06 Thread Peter Eisentraut

On 04.02.23 14:41, Daniel Verite wrote:

However it still leaves "daticurules" empty in the destination db,
because of an actual bug in the current patch.

Looking at createdb() in commands.c, it creates this variable:

@@ -711,6 +714,7 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
char   *dbcollate = NULL;
char   *dbctype = NULL;
char   *dbiculocale = NULL;
+   char   *dbicurules = NULL;
chardblocprovider = '\0';
char   *canonname;
int encoding = -1;

and then reads it later

@@ -1007,6 +1017,8 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
dblocprovider = src_locprovider;
if (dbiculocale == NULL && dblocprovider == COLLPROVIDER_ICU)
dbiculocale = src_iculocale;
+   if (dbicurules == NULL && dblocprovider == COLLPROVIDER_ICU)
+   dbicurules = src_icurules;
  
	/* Some encodings are client only */

if (!PG_VALID_BE_ENCODING(encoding))

but it forgets to assign it in between, so it stays NULL and src_icurules
is taken instead.


Right.  Here is a new patch with this fixed.
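
For the archives, the reported symptom can be checked with something like the
following once the patch is applied (option and column names as in the patch;
encoding/locale details are elided here, and the rule string is arbitrary):

CREATE DATABASE icu_rules_test
    TEMPLATE template0 LOCALE_PROVIDER icu ICU_LOCALE 'en'
    ICU_RULES '&a < g';

-- before the fix daticurules came back empty, because the user-supplied
-- value was dropped and the template's (empty) value was copied instead
SELECT datname, daticurules FROM pg_database WHERE datname = 'icu_rules_test';
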
From 7ca76032097397d685a313500c96a41b2c38ecc6 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 6 Feb 2023 21:58:24 +0100
Subject: [PATCH v4] Allow tailoring of ICU locales with custom rules

This exposes the ICU facility to add custom collation rules to a
standard collation.

Discussion: 
https://www.postgresql.org/message-id/flat/821c71a4-6ef0-d366-9acf-bb8e367f7...@enterprisedb.com
---
 doc/src/sgml/catalogs.sgml| 18 +++
 doc/src/sgml/ref/create_collation.sgml| 22 
 doc/src/sgml/ref/create_database.sgml | 12 +
 src/backend/catalog/pg_collation.c|  5 ++
 src/backend/commands/collationcmds.c  | 23 +++--
 src/backend/commands/dbcommands.c | 51 +--
 src/backend/utils/adt/pg_locale.c | 41 ++-
 src/backend/utils/init/postinit.c | 11 +++-
 src/include/catalog/pg_collation.h|  2 +
 src/include/catalog/pg_database.h |  3 ++
 src/include/utils/pg_locale.h |  1 +
 .../regress/expected/collate.icu.utf8.out | 30 +++
 src/test/regress/sql/collate.icu.utf8.sql | 13 +
 13 files changed, 222 insertions(+), 10 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c1e4048054..746baf5053 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -2428,6 +2428,15 @@ pg_collation 
Columns
   
  
 
+ 
+  
+   collicurules text
+  
+  
+   ICU collation rules for this collation object
+  
+ 
+
  
   
collversion text
@@ -3106,6 +3115,15 @@ pg_database 
Columns
   
  
 
+ 
+  
+   daticurules text
+  
+  
+   ICU collation rules for this database
+  
+ 
+
  
   
datcollversion text
diff --git a/doc/src/sgml/ref/create_collation.sgml 
b/doc/src/sgml/ref/create_collation.sgml
index 58f5f0cd63..2c7266107e 100644
--- a/doc/src/sgml/ref/create_collation.sgml
+++ b/doc/src/sgml/ref/create_collation.sgml
@@ -27,6 +27,7 @@
 [ LC_CTYPE = lc_ctype, ]
 [ PROVIDER = provider, ]
 [ DETERMINISTIC = boolean, ]
+[ RULES = rules, ]
 [ VERSION = version ]
 )
 CREATE COLLATION [ IF NOT EXISTS ] name FROM 
existing_collation
@@ -149,6 +150,19 @@ Parameters
  
 
 
+
+ rules
+
+ 
+  
+   Specifies additional collation rules to customize the behavior of the
+   collation.  This is supported for ICU only.  See https://unicode-org.github.io/icu/userguide/collation/customization/"/>
+   for details on the syntax.
+  
+ 
+
+
 
  version
 
@@ -228,6 +242,14 @@ Examples
 
   
 
+  
+   To create a collation using the ICU provider, based on the English ICU
+   locale, with custom rules:
+
+
+
+  
+
   
To create a collation from an existing collation:
 
diff --git a/doc/src/sgml/ref/create_database.sgml 
b/doc/src/sgml/ref/create_database.sgml
index f3df2def86..860211e4d6 100644
--- a/doc/src/sgml/ref/create_database.sgml
+++ b/doc/src/sgml/ref/create_database.sgml
@@ -192,6 +192,18 @@ Parameters
   
  
 
+ 
+  icu_rules
+  
+   
+Specifies additional collation rules to customize the behavior of the
+collation.  This is supported for ICU only.  See https://unicode-org.github.io/icu/userguide/collation/customization/"/>
+for details on the syntax.
+   
+  
+ 
+
  
   locale_provider
 
diff --git a/src/backend/catalog/pg_collation.c 
b/src/backend/catalog/pg_collation.c
index 287b13725d..fd022e6fc2 100644
--- a/src/backend/catalog/pg_collation.c
+++ b/src/backend/catalog/pg_collation.c
@@ -50,6 +50,7 @@ CollationCreate(const char *collname, Oid collnamespace,

Re: run pgindent on a regular basis / scripted manner

2023-02-06 Thread Andrew Dunstan


On 2023-02-06 Mo 12:17, Peter Eisentraut wrote:

On 02.02.23 07:40, Tom Lane wrote:

Noah Misch  writes:
Regarding the concern about a pre-receive hook blocking an emergency push,
the hook could approve every push where a string like "pgindent: no" appears
in a commit message within the push.  You'd still want to make the tree clean
sometime the same week or so.  It's cheap to provide a break-glass like that.


I think the real question here is whether we can get all (or at least
a solid majority of) committers to accept such draconian constraints.
I'd buy into it, and evidently so would you, but I can't help noting
that less than a quarter of active committers have bothered to
comment on this thread.  I suspect the other three-quarters would
be quite annoyed if we tried to institute such requirements. That's
not manpower we can afford to drive away.


I have some concerns about this.

First, as a matter of principle, it would introduce another level of 
gatekeeping power.  Right now, the committers are as a group in charge 
of what gets into the tree.  Adding commit hooks that are installed 
somewhere(?) by someone(?) and can only be seen by some(?) would upset 
that.  If we were using something like github or gitlab (not 
suggesting that, but for illustration), then you could put this kind 
of thing under .github/ or similar and then it would be under the same 
control as the source code itself.


Also, pgindent takes tens of seconds to run, so hooking that into the 
git push process would slow this down quite a bit.  And maybe we want 
to add pgperltidy and so on, where would this lead?  If somehow your 
local indenting doesn't give you the "correct" result for some reason, 
you might sit there for minutes and minutes trying to fix and push and 
fix and push.



Well, pgindent should produce canonical results or we're surely doing it 
wrong. Regarding the time it takes, if we are only indenting the changed 
files that time will be vastly reduced for most cases.


But I take your point to some extent. I think we should start by making 
it easier and quicker to run pgindent locally, both by hand and in local 
git hooks, for ordinary developers and for committers, and we should 
encourage committers to be stricter in their use of pgindent. If there 
are features we need to make this possible, speak up (c.f. Robert's 
email earlier today). I'm committed to making this as easy as possible 
for people.


Once we get over those hurdles we can possibly revisit automation.




Then, consider the typedefs issue.  If you add a typedef but don't add 
it to the typedefs list but otherwise pgindent your code perfectly, 
the push would be accepted.  If then later someone updates the 
typedefs list, perhaps from the build farm, it would then reject the 
indentation of your previously committed code, thus making it their 
problem.



It would be nice if there were a gadget that would find new typedefs and 
warn you about them. Unfortunately our current code to find typedefs 
isn't all that fast either. Nicer still would be a way of not needing 
the typedefs list, but I don't think anyone has come up with one yet 
that meets our other requirements.





I think a better way to address these issues would be making this into 
a test suite, so that you can run some command that checks "is 
everything indented correctly".  Then you can run this locally, on the 
build farm, in the cfbot etc. in a uniform way and apply the existing 
blaming/encouragement processes like for any other test failure.





Well arguably the new --silent-diff and --show-diff modes are such tests :-)


cheers


andrew


--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE

2023-02-06 Thread Andres Freund
Hi,

On 2023-02-04 11:10:55 -0800, Peter Geoghegan wrote:
> On Sat, Feb 4, 2023 at 2:57 AM Andres Freund  wrote:
> > Is there a good way to make breakage in the page recycling mechanism
> > visible with gist? I guess to see corruption, I'd have to halt a scan
> > before a page is visited with gdb, then cause the page to be recycled
> > prematurely in another session, then unblock the first? Which'd then
> > visit that page, thinking it to be in a different part of the tree than
> > it actually is?
> 
> Yes. This bug is similar to an ancient nbtree bug fixed back in 2012,
> by commit d3abbbeb.
> 
> > which clearly doesn't seem right.
> >
> > I just can't quite judge how bad that is.
> 
> It's really hard to judge, even if you're an expert. We're talking
> about a fairly chaotic scenario. My guess is that there is a very
> small chance of a very unpleasant scenario if you have a GiST index
> that has regular page deletions, and if you use
> vacuum_defer_cleanup_age. It's likely that most GiST indexes never
> have any page deletions due to the workload characteristics.

Thanks.


Sounds like the problem here is too hard to repro. I mostly wanted to know how
to be more confident about a fix working correctly. There are no tests for the
whole page recycling behaviour, afaics, so it's a bit scary to change things
around.
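
For what it's worth, the rough shape of such a test would presumably be
something like the sketch below (purely illustrative, not an existing test):
bulk-load a GiST index, delete most of the rows, and vacuum so that emptied
leaf pages get deleted and can later be recycled.

create table gist_recycle (p point);
create index gist_recycle_idx on gist_recycle using gist (p);
insert into gist_recycle select point(i, i) from generate_series(1, 100000) i;
delete from gist_recycle where p[0] > 100;
vacuum gist_recycle;   -- deletes the now-empty leaf pages
vacuum gist_recycle;   -- a later vacuum can recycle them once that is safe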

I didn't quite feel confident pushing a fix for this just before a minor
release, so I'll push once the minor releases are tagged. A quite minimal fix
to GetFullRecentGlobalXmin() in 12-13 (returning FirstNormalTransactionId if
epoch == 0 and RecentGlobalXmin > nextxid_xid), and the slightly larger fix in
14+.

Greetings,

Andres Freund




Re: Exit walsender before confirming remote flush in logical replication

2023-02-06 Thread Andres Freund
Hi,

On 2023-02-06 12:23:54 +0530, Amit Kapila wrote:
> On Mon, Feb 6, 2023 at 10:33 AM Andres Freund  wrote:
> > Smart shutdown is practically unusable. I don't think it makes sense to tie 
> > behavior of walsender to it in any way.
> >
> 
> So, we have the following options: (a) do nothing for this; (b)
> clarify the current behavior in docs. Any suggestions?

b) seems good.

I also think it'd make sense to improve this on a code-level. Just not in the
wholesale way discussed so far.

How about we make it an option in START_REPLICATION? Delayed logical rep can
toggle that on by default.

Greetings,

Andres Freund




Re: SLRUs in the main buffer pool - Page Header definitions

2023-02-06 Thread Andres Freund
Hi,

> From 098f37c0a17fc32a94bff43817414e01fcfb234f Mon Sep 17 00:00:00 2001
> From: Rishu Bagga 
> Date: Thu, 15 Sep 2022 00:55:25 +
> Subject: [PATCH] slru to buffercache + page headers + upgrade
> 
> ---
>  contrib/amcheck/verify_nbtree.c |2 +-
>  [...]
>  65 files changed, 2176 insertions(+), 3258 deletions(-)

Unfortunately a quite large patch, with this little explanation, is hard to
review. I could read through the entire thread to try to figure out what this
is doing, but I really shouldn't have to.

You're changing quite fundamental APIs across the tree. Why is that required
for the topic at hand? Why is it worth doing that, given it'll break loads of
extensions?

Greetings,

Andres Freund




Re: Pluggable toaster

2023-02-06 Thread Nikita Malakhov
Hi,

It'll be great to see more opinions on the approach as a whole.

>The problem that I see being raised here, is that there was little
>discussion and no observed community consensus about the design of
>this complex feature *before* this patch with high complexity was
>provided.
>The next action that was requested is to take a step back and decide
>how we would want to implement type-aware TOASTing (and the associated
>patch compression dictionaries) *before* we look into the type-aware
>toasting.

We decided to present this improvement as a patch because we thought
that the most complex and questionable parts would be the TOAST
implementations (the Toasters) themselves, and the Pluggable TOAST is
just a tool to make plugging different TOAST implementations clean
and simple.

--
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/


Re: pglz compression performance, take two

2023-02-06 Thread Andrey Borodin
On Mon, Feb 6, 2023 at 11:57 AM Tomas Vondra
 wrote:
>
> I wonder what that means for the patch. I haven't investigated this at
> all, but it seems as if the optimization means we fail to find a match,
> producing a tad larger output. That may still be a good tradeoff, as
> long as the output is correct (assuming it doesn't break some promise
> regarding expected output).
>

Yes, the patch produces correct results and is faster. And it keeps the
compression ratio the same except for one odd case.
The only problem is that I do not understand _why_ it happens in that odd
case. And so far I have failed to extract the inputs/outputs of that odd case,
because it is buried under so many layers of abstraction and affects
only late stats.
Maybe the problem is not in compression at all...

Best regards, Andrey Borodin.




Re: Pluggable toaster

2023-02-06 Thread Matthias van de Meent
On Mon, 6 Feb 2023 at 15:38, Aleksander Alekseev
 wrote:
> I would like to point out however that there were several other pieces
> of feedback that could have been missed:
>
> * No one wants to see this as an extension. This was my original
> proposal (adding ZSON to /contrib/) and it was rejected. The community
> explicitly wants this to be a core feature with its syntax,
> autocompletion, documentation and stuff.

I believe that is a misrepresentation of the situation. ZSON had
(has?) several systemic issues and could not be accepted to /contrib/
in the way it was implemented, and it was commented that it would make
sense that the feature of compression assisted by dictionaries would
be implemented in core. Still, that feature is only closely related to
pluggable TOAST; it is not the same as making TOAST pluggable.

> * The community wants the feature to have a simple implementation. You
> said yourself that the idea of type-aware TOASTers is very invasive,
> and I completely agree.

I'm not sure that this is correct either. Compression (and TOAST) is
inherently complex, and I don't think we should reject improvements
because they are complex.
The problem that I see being raised here, is that there was little
discussion and no observed community consensus about the design of
this complex feature *before* this patch with high complexity was
provided.
The next action that was requested is to take a step back and decide
how we would want to implement type-aware TOASTing (and the associated
patch compression dictionaries) *before* we look into the type-aware
toasting.

> * People also want this to be simple from the user perspective, as
> simple as just CREATE COMPRESSED TABLE ... [USING lz4|zstd];

Could you provide a reference to this? I have yet to see a COMPRESSED
TABLE feature or syntax, let alone users asking for TOAST to be as
easily usable as that feature or syntax.


Kind regards,

Matthias van de Meent




Re: Pluggable toaster

2023-02-06 Thread Matthias van de Meent
On Mon, 6 Feb 2023 at 20:24, Andres Freund  wrote:
>
> Hi,
>
> On 2023-02-06 16:38:01 +0300, Nikita Malakhov wrote:
> > So we decided to reduce changes in the core to the minimum necessary
> > to make it available through the hooks, because the hooks part is very
> > lightweight and simple to keep rebasing onto the vanilla core.
>
> At least I don't think we should accept such hooks. I don't think I am alone
> in that.

+1

Assuming type-aware TOASTing is the goal, I don't think hooks are the
right design to implement that.

-Matthias van de Meent




Re: pglz compression performance, take two

2023-02-06 Thread Tomas Vondra



On 2/6/23 03:00, Andrey Borodin wrote:
> On Sun, Feb 5, 2023 at 5:51 PM Tomas Vondra
>  wrote:
>>
>> On 2/5/23 19:36, Andrey Borodin wrote:
>>> On Fri, Jan 6, 2023 at 10:02 PM Andrey Borodin  
>>> wrote:

 Hello! Please find attached v8.
>>>
>>> I got some interesting feedback from some patch users.
>>> There was an oversight that frequently yielded results that are 1,2 or
>>> 3 bytes longer than expected.
>>> Looking closer I found that the correctness of the last 3-byte tail is
>>> checked in two places. PFA fix for this. Previously compressed data
>>> was correct, however in some cases few bytes longer than the result of
>>> current pglz implementation.
>>>
>>
>> Thanks. What were the consequences of the issue? Lower compression
>> ratio, or did we then fail to decompress the data (or would current pglz
>> implementation fail to decompress it)?
>>
> The data was decompressed fine. But extension tests (Citus's columnar
> engine) hard-coded a lot of compression ratio stuff.

OK. Not sure I'd blame the patch for these failures, as long as
the result is still correct and can be decompressed. I'm not aware of a
specification of what the compression must (not) produce.

> And there is still 1 more test where optimized version produces 1 byte
> longer output. I'm trying to find it, but with no success yet.
> 
> There are known and documented cases when optimized pglz version would
> do so. good_match without 10-division and memcmp by 4 bytes. But even
> disabling this, still observing 1-byte longer compression results
> persists... The problem is the length is changed after deleting some
> data, so compression of that particular sequence seems to be somewhere
> far away.
> It was funny at the beginning - to hunt for 1 byte. But the weekend is
> ending, and it seems that byte slipped from me again...
> 

I wonder what that means for the patch. I haven't investigated this at
all, but it seems as if the optimization means we fail to find a match,
producing a tad larger output. That may still be a good tradeoff, as
long as the output is correct (assuming it doesn't break some promise
regarding expected output).

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Non-superuser subscription owners

2023-02-06 Thread Robert Haas
On Mon, Feb 6, 2023 at 2:18 PM Andres Freund  wrote:
> It's decidedly not great, yes. I don't know if it's quite a CVE type issue,
> after all, the same is true for any other type of query the superuser
> executes. But at the very least the documentation needs to be better, with a
> big red box making sure the admin is aware of the problem.

I don't think that's the same thing at all. A superuser executing a
query interactively can indeed cause all sorts of bad things to
happen, but you don't have to log in as superuser and run DML queries
on tables owned by unprivileged users, and you shouldn't.

But what we're talking about here is -- the superuser comes along and
sets up logical replication in the configuration in what seems to be
exactly the way it was intended to be used, and now any user who can
log into the subscriber node can become superuser for free whenever
they want, without the superuser doing anything at all, even logging
in. Saying it's "not ideal" seems like you're putting it in the same
category as "the cheese got moldy in the fridge" but to me it sounds
more like "the fridge exploded and the house is on fire."

If we were to document this, I assume that the warning we would add to
the documentation would look like this:

<-- begin documentation text -->
Pretty much don't ever use logical replication. In any normal
configuration, it lets every user on your system escalate to superuser
whenever they want. It is possible to make it safe, if you make sure
all the tables on the replica are owned by the superuser and none of
them have any triggers, defaults, expression indexes, or anything else
associated with them that might execute any code while replicating.
But notice that this makes logical replication pretty much useless for
one of its intended purposes, which is high availability, because if
you actually fail over, you're going to then have to change the owners
of all of those tables and apply any missing triggers, defaults,
expression indexes, or anything like that which you may want to have.
And then to fail back you're going to have to remove all of that stuff
again and once again make the tables superuser-owned. That's obviously
pretty impractical, so you probably shouldn't use logical replication
at all until we get around to fixing this. You might wonder why we
implemented a feature that can't be used in any kind of normal way
without completely and totally breaking your system security -- but
don't ask us, we don't know, either!
<-- end documentation text -->

Honestly, this makes the CREATEROLE exploit that I fixed recently in
master look like a walk in the park. Sure, it's a pain for service
providers, who might otherwise use it, but most normal users don't and
wouldn't no matter how it worked, and really are not going to care.
But people do use logical replication, and it seems to me that the
issue you're describing here means that approximately 100% of those
installations have a vulnerability allowing any local user who owns a
table or can create one to escalate to superuser. Far from being not
quite a CVE issue, that seems substantially more serious than most
things we get CVEs for.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Compression dictionaries for JSONB

2023-02-06 Thread Andres Freund
Hi,

On 2023-02-06 16:16:41 +0100, Matthias van de Meent wrote:
> On Mon, 6 Feb 2023 at 15:03, Aleksander Alekseev
>  wrote:
> >
> > Hi,
> >
> > I see your point regarding the fact that creating dictionaries on a
> > training set is too beneficial to neglect it. Can't argue with this.
> >
> > What puzzles me though is: what prevents us from doing this on a page
> > level as suggested previously?
> 
> The complexity of page-level compression is significant, as pages are
> currently a base primitive of our persistency and consistency scheme.

+many

It's also not a panacea performance-wise: datum-level decompression can
often be deferred much longer than page level decompression. For things like
json[b], you'd hopefully normally have some "pre-filtering" based on proper
columns, before you need to dig into the json datum.
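
A minimal sketch of that pre-filtering pattern, with made-up table and column
names: the predicates touch only plain columns, so the wide jsonb datum has to
be detoasted and decompressed only for the rows that survive the filter --
something page-level compression cannot defer in the same way.

-- "payload" is a wide jsonb column, the other columns are ordinary scalars;
-- only rows passing the scalar filters ever need their jsonb decompressed.
SELECT payload
FROM   orders
WHERE  customer_id = 42
  AND  created_at >= date '2023-01-01';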

It's also not necessarily that good, compression ratio wise. Particularly for
wider datums you're not going to be able to remove much duplication, because
there's only a handful of tuples. Consider the case of json keys - the
dictionary will often do better than page level compression, because it'll
have the common keys in the dictionary, which means the "full" keys never will
have to appear on a page, whereas page-level compression will have the keys on
it, at least once.

Of course you can use a dictionary for page-level compression too, but the
gains when it works well will often be limited, because in most OLTP usable
page-compression schemes I'm aware of, you can't compress a page all that far
down, because you need a small number of possible "compressed page sizes".


> > More similar data you compress the more space and disk I/O you save.
> > Additionally you don't have to compress/decompress the data every time
> > you access it. Everything that's in shared buffers is uncompressed.
> > Not to mention the fact that you don't care what's in pg_attribute,
> > the fact that schema may change, etc. There is a table and a
> > dictionary for this table that you refresh from time to time. Very
> > simple.
> 
> You cannot "just" refresh a dictionary used once to compress an
> object, because you need it to decompress the object too.

Right. That's what I was trying to refer to when mentioning that we might need
to add a bit of additional information to the varlena header for datums
compressed with a dictionary.

Greetings,

Andres Freund




Re: Pluggable toaster

2023-02-06 Thread Andres Freund
Hi,

On 2023-02-06 16:38:01 +0300, Nikita Malakhov wrote:
> So we decided to reduce changes in the core to the minimum necessary
> to make it available through the hooks, because the hooks part is very
> lightweight and simple to keep rebasing onto the vanilla core.

At least I don't think we should accept such hooks. I don't think I am alone
in that.

Greetings,

Andres Freund




Re: GUCs to control abbreviated sort keys

2023-02-06 Thread Thomas Munro
On Fri, Jan 27, 2023 at 12:29 PM Jeff Davis  wrote:
> I am using these GUCs for testing the various collation paths in my
> collation refactoring branch.

Speaking of testing, has anyone ever tried porting Tom's random test
program[1] to ICU?

[1] https://www.postgresql.org/message-id/31913.1458747...@sss.pgh.pa.us




Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)

2023-02-06 Thread Andres Freund
Hi,

On 2023-02-06 19:51:19 +0100, Tomas Vondra wrote:
> > No. The only thing the machine is doing is
> > 
> >   while /usr/bin/true; do
> > make check
> >   done
> > 
> > I can't reduce the workload further, because the "join" test is in a
> > separate parallel group (I cut down parallel_schedule). I could make the
> > machine busier, of course.
> > 
> > However, the other lockup I saw was when using serial_schedule, so I
> > guess lower concurrency makes it more likely.
> > 
> 
> FWIW the machine is now on run ~2700 without any further lockups :-/
> 
> Seems it was quite lucky we hit it twice in a handful of attempts.

Did you cut down the workload before you reproduced it the first time, or
after? It's quite possible that it's not reproducible in isolation.

Greetings,

Andres Freund




Re: Non-superuser subscription owners

2023-02-06 Thread Andres Freund
Hi,

On 2023-02-06 14:07:39 -0500, Robert Haas wrote:
> On Fri, Feb 3, 2023 at 3:47 AM Andres Freund  wrote:
> > On 2023-02-02 09:28:03 -0500, Robert Haas wrote:
> > > I don't know what you mean by this. DML doesn't confer privileges. If
> > > code gets executed and runs with the replication user's credentials,
> > > that could lead to privilege escalation, but just moving rows around
> > > doesn't, at least not in the database sense.
> >
> > Executing DML ends up executing code. Think predicated/expression
> > indexes, triggers, default expressions etc. If a badly written trigger
> > etc can be tricked to do arbitrary code exec, an attack will be able to
> > run with the privs of the run-as user.  How bad that is is influenced to
> > some degree by the amount of privileges that user has.
> 
> I spent some time studying this today. I think you're right. What I'm
> confused about is: why do we consider this situation even vaguely
> acceptable? Isn't this basically an admission that our logical
> replication security model is completely and totally broken and we
> need to fix it somehow and file for a CVE number? Like, in released
> branches, you can't even have a subscription owned by a non-superuser.
> But any non-superuser can set a default expression or create an enable
> always trigger and sure enough, if that table is replicated, the
> system will run that trigger as the subscription owner, who is a
> superuser. Which AFAICS means that if a non-superuser owns a table
> that is part of a subscription, they can instantly hack superuser.
> Which seems, uh, extremely bad. Am I missing something?

It's decidedly not great, yes. I don't know if it's quite a CVE type issue,
after all, the same is true for any other type of query the superuser
executes. But at the very least the documentation needs to be better, with a
big red box making sure the admin is aware of the problem.

I think we need some more fundamental ways to deal with this issue, including
but not restricted to the replication context. Some potentially relevant
discussion is in this thread:
https://postgr.es/m/75b0dbb55e9febea54c441efff8012a6d2cb5bd7.camel%40j-davis.com

I don't agree with Jeff's proposal, but I think there are some worthwhile ideas
in that idea and the follow-ups.


> And then, in master, where there's some provision for non-superuser
> subscription owners, we maybe need to re-think the privileges required
> to replicate into a table in the first place. I don't think that
> having I/U/D permissions on a table is really sufficient to justify
> performing those operations *as the table owner*; perhaps the check
> ought to be whether you have the privileges of the table owner.

Yes, I think we ought to check role membership, including non-inherited
memberships.
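
As a rough illustration (not the actual patch), the kind of check being
suggested could be expressed at the SQL level along these lines, with a
hypothetical role and table name:

-- Does role "sub_owner" hold the owning role of the target table,
-- counting non-inherited (SET ROLE-able) memberships as well?
SELECT pg_has_role('sub_owner', c.relowner, 'MEMBER')
FROM   pg_class c
WHERE  c.oid = 'public.replicated_table'::regclass;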

Greetings,

Andres Freund




Re: Non-superuser subscription owners

2023-02-06 Thread Robert Haas
On Fri, Feb 3, 2023 at 3:47 AM Andres Freund  wrote:
> On 2023-02-02 09:28:03 -0500, Robert Haas wrote:
> > I don't know what you mean by this. DML doesn't confer privileges. If
> > code gets executed and runs with the replication user's credentials,
> > that could lead to privilege escalation, but just moving rows around
> > doesn't, at least not in the database sense.
>
> Executing DML ends up executing code. Think predicated/expression
> indexes, triggers, default expressions etc. If a badly written trigger
> etc can be tricked to do arbitrary code exec, an attack will be able to
> run with the privs of the run-as user.  How bad that is is influenced to
> some degree by the amount of privileges that user has.

I spent some time studying this today. I think you're right. What I'm
confused about is: why do we consider this situation even vaguely
acceptable? Isn't this basically an admission that our logical
replication security model is completely and totally broken and we
need to fix it somehow and file for a CVE number? Like, in released
branches, you can't even have a subscription owned by a non-superuser.
But any non-superuser can set a default expression or create an enable
always trigger and sure enough, if that table is replicated, the
system will run that trigger as the subscription owner, who is a
superuser. Which AFAICS means that if a non-superuser owns a table
that is part of a subscription, they can instantly hack superuser.
Which seems, uh, extremely bad. Am I missing something?
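
A minimal sketch of the mechanism being described, with made-up object names:
an unprivileged table owner attaches an ENABLE ALWAYS trigger, and the apply
worker then executes the trigger body with the subscription owner's
privileges.

-- Runs on the subscriber, as the ordinary user who owns the table.
CREATE FUNCTION note_applier() RETURNS trigger AS $$
BEGIN
    RAISE NOTICE 'applying as %', current_user;  -- the subscription owner
    -- any SQL placed here executes with that user's privileges
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER on_apply BEFORE INSERT ON replicated_table
    FOR EACH ROW EXECUTE FUNCTION note_applier();

-- Without ENABLE ALWAYS the trigger would not fire during apply, since
-- the apply worker runs with session_replication_role = 'replica'.
ALTER TABLE replicated_table ENABLE ALWAYS TRIGGER on_apply;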

Based on other remarks you made upthread, it seems like we ought to be
doing the actual replication as the table owner, since the table owner
has to be prepared for executable code attached to the table to be
re-run on rows in the table at any time when somebody does a REINDEX.
And then, in master, where there's some provision for non-superuser
subscription owners, we maybe need to re-think the privileges required
to replicate into a table in the first place. I don't think that
having I/U/D permissions on a table is really sufficient to justify
performing those operations *as the table owner*; perhaps the check
ought to be whether you have the privileges of the table owner.

--
Robert Haas
EDB: http://www.enterprisedb.com




Re: undersized unions

2023-02-06 Thread Robert Haas
On Mon, Feb 6, 2023 at 1:28 PM Andres Freund  wrote:
> Perhaps something like

Yeah, that'd work. You'd want a big ol' warning comment here:

> typedef struct NumericData
> {
> int32   vl_len_;/* varlena header (do not 
> touch directly!) */
> NumericBase data;
> } NumericData;

like /* actually NumericShort or NumericLong */ or something

> Although I'd actually be tempted to rip out all the casts but NUMERIC_LONG()
> in this case, because it's all just accessing the base n_header anyway.

Yeah.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)

2023-02-06 Thread Tomas Vondra
On 1/29/23 19:08, Tomas Vondra wrote:
> 
> 
> On 1/29/23 18:53, Andres Freund wrote:
>> Hi,
>>
>> On 2023-01-29 18:39:05 +0100, Tomas Vondra wrote:
>>> Will do, but I'll wait for another lockup to see how frequent it
>>> actually is. I'm now at ~90 runs total, and it didn't happen again yet.
>>> So hitting it after 15 runs might have been a bit of a luck.
>>
>> Was there a difference in how much load there was on the machine between
>> "reproduced in 15 runs" and "not reproed in 90"?  If indeed lack of barriers
>> is related to the issue, an increase in context switches could substantially
>> change the behaviour (in both directions).  More intra-process context
>> switches can amount to "probabilistic barriers" because that'll be a
>> barrier. At the same time it can make it more likely that the relatively
>> narrow window in WaitEventSetWait() is hit, or lead to larger delays
>> processing signals.
>>
> 
> No. The only thing the machine is doing is
> 
>   while /usr/bin/true; do
> make check
>   done
> 
> I can't reduce the workload further, because the "join" test is in a
> separate parallel group (I cut down parallel_schedule). I could make the
> machine busier, of course.
> 
> However, the other lockup I saw was when using serial_schedule, so I
> guess lower concurrency makes it more likely.
> 

FWIW the machine is now on run ~2700 without any further lockups :-/

Seems it was quite lucky we hit it twice in a handful of attempts.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: GUCs to control abbreviated sort keys

2023-02-06 Thread Peter Geoghegan
On Fri, Jan 27, 2023 at 12:34 PM Jeff Davis  wrote:
> It is non-deterministic, but I tried with two generated files, and got
> similar results.

Jeff and I coordinated off-list. It turned out that the
nondeterministic nature of the program to generate test data was
behind my initial inability to recreate Jeff's results. Once Jeff
provided me with the exact data that he saw the problem with, I was
able to recreate the problematic case for abbreviated keys.

It turns out that this was due to aborting abbreviation way too late
in the process. It would happen relatively late in the process, when
more than 50% of all tuples had already had abbreviations generated by
ICU. This was a marginal case for abbreviated keys, which is precisely
why it only happened this long into the process. That factor is also
likely why I couldn't recreate the problem at first, even though I had
test data that was substantially the same as the data required to show
the problem.

Attached patch fixes the issue. It teaches varstr_abbrev_abort to do
something similar to every other abbreviated keys abort function: stop
estimating cardinality entirely (give up on giving up) once there are
a certain number of distinct abbreviated keys, regardless of any other
factor.

This is very closely based on existing code from numeric_abbrev_abort,
though I use a cutoff of 10k rather than a cutoff of 100k. This
difference is justified by the special considerations for text, where
authoritative comparisons have further optimizations such as
strcoll caching and the memcmp equality fast path. It's also required
to actually fix the test case at hand -- 100k isn't enough to avoid
the performance issue Jeff reported.

I think that this should be committed to HEAD only.

-- 
Peter Geoghegan


v1-0001-Abort-abbreviated-keys-for-text-less-eagerly.patch
Description: Binary data


Re: undersized unions

2023-02-06 Thread Andres Freund
Hi,

On 2023-02-06 11:55:40 -0500, Tom Lane wrote:
> I am, however, very dubious that Andres is correct that there's a
> problem here.  Given that two of the variants of union NumericChoice
> are structs ending with a flexible array, any compiler that thinks
> it knows the size of the union precisely is broken.

The compiler just complains about the minimum size of the union, which is
  Max(offsetof(NumericShort, n_data), offsetof(NumericLong, n_data))
IOW, our trickery with flexible arrays would allow us to allocate just 8 bytes
for a NumericData, but not just 6.

Flexible arrays allow the compiler to understand the variable size, but we
don't use it for all variability. Hence the warnings.

Greetings,

Andres Freund




Re: undersized unions

2023-02-06 Thread Andres Freund
Hi

On 2023-02-06 11:42:57 -0500, Robert Haas wrote:
> On Sun, Feb 5, 2023 at 6:28 AM Andres Freund  wrote:
> > On the other hand, it also just seems risky from a code writing 
> > perspective. It's not immediate obvious that it'd be unsafe to create an 
> > on-stack Numeric by assigning *ptr. But it is.
>
> Well, I think that is pretty obvious: we have lots of things that are
> essentially variable-length types, and you can't put any of them on
> the stack.

There's a difference between a stack copy not containing all the data, and a
stack copy triggering undefined behaviour. But I agree, it's not something
that'll commonly endanger us.


> But I do also think that the Numeric situation is messier than some
> others we have got, and that's partly my fault, and it would be nice
> to make it better.
>
> I do not really know exactly how to do that, though. Our usual pattern
> is to just have a struct and end with a variable-length array, or
> alternatively add a comment that says "other stuff follows!" at the end of
> the struct definition, without doing anything that C knows about at
> all. But here it's more complicated: there's a uint16 value for sure,
> and then maybe an int16 value, and then some number of NumericDigit
> values. That "maybe an int16 value" part is not something that C has a
> built-in way of representing, to my knowledge, which is why we end up
> with this hackish thing.

Perhaps something like

typedef struct NumericBase
{
uint16  n_header;
} NumericBase;

typedef struct NumericData
{
int32   vl_len_;/* varlena header (do not touch 
directly!) */
NumericBase data;
} NumericData;

/* subclass of NumericBase, needs to start in a compatible way */
typedef struct NumericLong
{
uint16  n_sign_dscale;  /* Sign + display scale */
int16   n_weight;   /* Weight of 1st digit  */
} NumericLong;

/* subclass of NumericBase, needs to start in a compatible way */
struct NumericShort
{
uint16  n_header;   /* Sign + display scale + 
weight */
NumericDigit n_data[FLEXIBLE_ARRAY_MEMBER]; /* Digits */
};

Macros that e.g. access n_long would need to cast, before they're able to
access n_long. So we'd end up with something like

#define NUMERIC_SHORT(n)  ((NumericShort *)&((n)->data))
#define NUMERIC_LONG(n)  ((NumericLong *)&((n)->data))

#define NUMERIC_WEIGHT(n)   (NUMERIC_HEADER_IS_SHORT((n)) ? \
((NUMERIC_SHORT(n)->n_header & NUMERIC_SHORT_WEIGHT_SIGN_MASK ? \
~NUMERIC_SHORT_WEIGHT_MASK : 0) \
 | (NUMERIC_SHORT(n)->n_header & NUMERIC_SHORT_WEIGHT_MASK)) \
: (NUMERIC_LONG(n)->n_weight))

Although I'd actually be tempted to rip out all the casts but NUMERIC_LONG()
in this case, because it's all just accessing the base n_header anyway.

Greetings,

Andres Freund




2023-02-09 release announcement draft

2023-02-06 Thread Jonathan S. Katz

Hi,

Attached is a draft of the announcement for the 2023-02-09 update release.

Please review and provide corrections, notable omissions, and 
suggestions no later than 2023-02-09 0:00 AoE.


Thanks!

Jonathan
The PostgreSQL Global Development Group has released an update to all supported
versions of PostgreSQL, including 15.2, 14.7, 13.10, 12.14, and 11.19.
This release fixes over 60 bugs reported over the last several months.

For the full list of changes, please review the
[release notes](https://www.postgresql.org/docs/release/).

Bug Fixes and Improvements
--
 
This update fixes over 60 bugs that were reported in the last several months.
The issues listed below affect PostgreSQL 15. Some of these issues may also
affect other supported versions of PostgreSQL.

Included in this release:

* Fix for partitioned tables to correctly update
[`GENERATED`](https://www.postgresql.org/docs/current/ddl-generated-columns.html)
columns in child tables if the `GENERATED` column does not exist in the parent
table or the child generated column has different dependencies than the parent.
* Several fixes for the 
[`MERGE`](https://www.postgresql.org/docs/current/sql-merge.html)
command.
* Allow a
[`WITH RECURSIVE ... 
CYCLE`](https://www.postgresql.org/docs/current/queries-with.html#QUERIES-WITH-CYCLE)
query to access its output column.
* Fix an issue with bulk insertions on foreign tables that could lead to logical
inconsistencies, for example, a `BEFORE ROW` trigger may not process rows that
should be available.
* Reject uses of undefined variables in
[`jsonpath`](https://www.postgresql.org/docs/current/functions-json.html#FUNCTIONS-SQLJSON-PATH)
existence checks.
* Fix for [`jsonb` 
subscripting](https://www.postgresql.org/docs/current/datatype-json.html#JSONB-SUBSCRIPTING)
to handle very large subscript values.
* Honor updated values of `checkpoint_completion_target` on reload.
* Log the correct ending timestamp in `recovery_target_xid` mode.
* Fix issue to allow longer column lists when using logical replication.
* Prevent "wrong tuple length" failure at the end of
[`VACUUM`](https://www.postgresql.org/docs/current/sql-vacuum.html).
* Avoid an immediate commit after
[`ANALYZE`](https://www.postgresql.org/docs/current/sql-analyze.html) when using
query pipelining.
* Several fixes to the query planner, including one that provides more
opportunities for using
[memoization with partitionwise 
joins](https://www.postgresql.org/docs/current/runtime-config-query.html).
* Fix for statistics collection to correctly handle when a relation changes
(e.g. a table is converted to a view).
* Ensure [full text 
search](https://www.postgresql.org/docs/current/textsearch.html)
queries can be cancelled while performing phrase matches.
* Fix deadlock between [`DROP 
DATABASE`](https://www.postgresql.org/docs/current/sql-dropdatabase.html)
and logical replication worker process.
* Fix small session-lifespan memory leak when
[`CREATE 
SUBSCRIPTION`](https://www.postgresql.org/docs/current/sql-createsubscription.html)
fails its connection attempt.
* Performance improvement for replicas with
[`hot_standby`](https://www.postgresql.org/docs/current/hot-standby.html)
enabled that are processing `SELECT` queries.
* Several fixes for logical decoding that improve its stability and bloat
handling.
* Fix the default logical replication plug-in, `pgoutput`, to not send columns
that are not listed in a table's replication
[column 
list](https://www.postgresql.org/docs/current/logical-replication-col-lists.html).
* Fix possible corruption of very large tablespace map files in
[`pg_basebackup`](https://www.postgresql.org/docs/current/app-pgbasebackup.html).
* Remove a harmless warning from
[`pg_dump`](https://www.postgresql.org/docs/current/app-pgdump.html) in
`--if-exists` mode when the
[`public` 
schema](https://www.postgresql.org/docs/current/ddl-schemas.html#DDL-SCHEMAS-PUBLIC)
has a non-default owner.
* Fix the [`psql`](https://www.postgresql.org/docs/current/app-psql.html)
commands `\sf` and `\ef` to handle SQL-language functions that have
[SQL-standard function 
bodies](https://www.postgresql.org/docs/current/sql-createprocedure.html)
(i.e. `BEGIN ATOMIC`).
* Fix tab completion of `ALTER FUNCTION/PROCEDURE/ROUTINE ... SET SCHEMA`.
* Update the 
[`pageinspect`](https://www.postgresql.org/docs/current/pageinspect.html)
extension to mark its disk-accessing functions as `PARALLEL RESTRICTED`.
* Fix the [`seg`](https://www.postgresql.org/docs/current/seg.html) extension to
not crash or print garbage if an input number has more than 127 digits.

This release also updates time zone data files to tzdata release 2022g for
DST law changes in Greenland and Mexico, plus historical corrections for
northern Canada, Colombia, and Singapore. Notably, a new timezone,
America/Ciudad_Juarez, has been split off from America/Ojinaga.

For the full list of changes available, please review the
[release 

Re: [PATCH] Add pretty-printed XML output option

2023-02-06 Thread Jim Jones

On 06.02.23 17:23, Tom Lane wrote:

Your patch needs to also update expected/xml_1.out to match the output
the test produces when run without --with-libxml.


Thanks a lot! It seems to do the trick.

xml_1.out updated in v4.

Best, Jim
From ced9fccddc033de98709a6e93dc6530ce68149db Mon Sep 17 00:00:00 2001
From: Jim Jones 
Date: Thu, 2 Feb 2023 21:27:16 +0100
Subject: [PATCH v4 1/4] Add pretty-printed XML output option

This small patch introduces a XML pretty print function.
It basically takes advantage of the indentation feature
of xmlDocDumpFormatMemory from libxml2 to format XML strings.
---
 doc/src/sgml/func.sgml|  34 ++
 src/backend/utils/adt/xml.c   |  30 +
 src/include/catalog/pg_proc.dat   |   3 +
 src/test/regress/expected/xml.out | 107 ++
 src/test/regress/sql/xml.sql  |  34 ++
 5 files changed, 208 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e09e289a43..e8b5e581f0 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -14293,6 +14293,40 @@ SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
 ]]>
 

+   
+ 
+xmlpretty
+
+ 
+ xmlpretty
+ 
+ 
+
+xmlpretty ( xml ) xml
+
+ 
+ 
+ Converts the given XML value to pretty-printed, indented text.
+ 
+ 
+ 
+ Example:
+ 
+ 
+   
+   

 

diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 079bcb1208..6409133137 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -473,6 +473,36 @@ xmlBuffer_to_xmltype(xmlBufferPtr buf)
 }
 #endif
 
+Datum
+xmlpretty(PG_FUNCTION_ARGS)
+{
+#ifdef USE_LIBXML
+
+xmlDocPtr  doc;
+xmlChar*buf = NULL;
+text   *arg = PG_GETARG_TEXT_PP(0);
+
+doc = xml_parse(arg, XMLOPTION_DOCUMENT, false, GetDatabaseEncoding(), NULL);
+
+/**
+* xmlDocDumpFormatMemory (
+*   xmlDocPtr doc,  # the XML document.
+*   xmlChar ** buf, # buffer where the formatted XML document will be stored.
+*   int *size,  # this could store the size of the created buffer
+* but as we do not need it, we can leave it NULL.
+*   int format) # 1 = node indenting.
+*/
+xmlDocDumpFormatMemory(doc, &buf, NULL, 1);
+
+xmlFreeDoc(doc);
+PG_RETURN_XML_P(cstring_to_xmltype((char*)buf));
+
+#else
+NO_XML_SUPPORT();
+return 0;
+#endif
+}
+
 
 Datum
 xmlcomment(PG_FUNCTION_ARGS)
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index c0f2a8a77c..3224dc3e76 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -8842,6 +8842,9 @@
 { oid => '3053', descr => 'determine if a string is well formed XML content',
   proname => 'xml_is_well_formed_content', prorettype => 'bool',
   proargtypes => 'text', prosrc => 'xml_is_well_formed_content' },
+  { oid => '4642', descr => 'Indented text from xml',
+  proname => 'xmlpretty', prorettype => 'xml',
+  proargtypes => 'xml', prosrc => 'xmlpretty' },
 
 # json
 { oid => '321', descr => 'I/O',
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index 3c357a9c7e..98a338ad8d 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1599,3 +1599,110 @@ SELECT * FROM XMLTABLE('.' PASSING XMLELEMENT(NAME a) columns a varchar(20) PATH
   | foo/
 (1 row)
 
+-- XML pretty print: single line XML string
+SELECT xmlpretty('Belgian Waffles$5.95Two of our famous Belgian Waffles with plenty of real maple syrup650')::xml;
+xmlpretty 
+--
+ +
+ +
+ Belgian Waffles+
+ $5.95+
+ Two of our famous Belgian Waffles with plenty of real maple syrup+
+ 650+
+  +
++
+ 
+(1 row)
+
+-- XML pretty print: XML string with space, tabs and newline between nodes
+SELECT xmlpretty(' Belgian Waffles $15.95
+ Two of our famous Belgian Waffles with plenty of real maple syrup
+650')::xml;
+xmlpretty 
+--
+ 

Re: run pgindent on a regular basis / scripted manner

2023-02-06 Thread Andrew Dunstan


On 2023-02-06 Mo 12:03, Andrew Dunstan wrote:



On 2023-02-06 Mo 10:36, Robert Haas wrote:

On Mon, Feb 6, 2023 at 10:21 AM Andrew Dunstan  wrote:

Good suggestions. 1 and 3 seem fairly straightforward. I'll start on those, and 
look into 2.

Thanks!



Here's a quick patch for 1 and 3. Would also need to adjust the docco.




This time with patch.


cheers


andrew


--
Andrew Dunstan
EDB:https://www.enterprisedb.com
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index 56640e576a..ca329747a4 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -23,12 +23,14 @@ my $devnull = File::Spec->devnull;
 
 my ($typedefs_file, $typedef_str, $code_base,
 	@excludes,  $indent,  $build,
-	$show_diff, $silent_diff, $help);
+	$show_diff, $silent_diff, $help,
+	@commits,);
 
 $help = 0;
 
 my %options = (
 	"help"   => \$help,
+	"commit=s"   => \@commits,
 	"typedefs=s" => \$typedefs_file,
 	"list-of-typedefs=s" => \$typedef_str,
 	"code-base=s"=> \$code_base,
@@ -44,6 +46,9 @@ usage() if $help;
 usage("Cannot have both --silent-diff and --show-diff")
   if $silent_diff && $show_diff;
 
+usage("Cannot use --commit with --code-base or command line file list")
+  if (@commits && ($code_base || @ARGV));
+
 run_build($code_base) if ($build);
 
 # command line option wins, then environment (which is how --build sets it) ,
@@ -388,6 +393,7 @@ Usage:
 pgindent [OPTION]... [FILE]...
 Options:
 	--help  show this message and quit
+--commit=gitref use files modified by the named commit
 	--typedefs=FILE file containing a list of typedefs
 	--list-of-typedefs=STR  string containing typedefs, space separated
 	--code-base=DIR path to the base of PostgreSQL source code
@@ -396,7 +402,7 @@ Options:
 	--build build the pg_bsd_indent program
 	--show-diff show the changes that would be made
 	--silent-diff   exit with status 2 if any changes would be made
-The --excludes option can be given more than once.
+The --excludes and --commit options can be given more than once.
 EOF
 	if ($help)
 	{
@@ -412,27 +418,38 @@ EOF
 
 # main
 
-# get the list of files under code base, if it's set
-File::Find::find(
-	{
-		wanted => sub {
-			my ($dev, $ino, $mode, $nlink, $uid, $gid);
-			(($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))
-			  && -f _
-			  && /^.*\.[ch]\z/s
-			  && push(@files, $File::Find::name);
-		}
-	},
-	$code_base) if $code_base;
-
 $filtered_typedefs_fh = load_typedefs();
 
 check_indent();
 
-# any non-option arguments are files to be processed
-push(@files, @ARGV);
+build_clean($code_base) if $build;
 
-# the exclude list applies to command line arguments as well as found files
+my $wanted = sub
+{
+	my ($dev, $ino, $mode, $nlink, $uid, $gid);
+	(($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))
+	  && -f _
+	  && /^.*\.[ch]\z/s
+	  && push(@files, $File::Find::name);
+};
+
+# get the list of files under code base, if it's set
+File::Find::find({wanted => $wanted }, $code_base) if $code_base;
+
+# any non-option arguments are files or directories to be processed
+File::Find::find({wanted => $wanted}, @ARGV) if @ARGV;
+
+# process named commits by comparing each with their immediate ancestor
+foreach my $commit (@commits)
+{
+	my $prev="$commit~";
+	my @affected=`git diff-tree --no-commit-id --name-only -r $commit $prev`;
+	die "git error" if $?;
+	chomp(@affected);
+	push(@files,@affected);
+}
+
+# remove excluded files from the file list
 process_exclude();
 
 foreach my $source_filename (@files)
@@ -481,6 +498,4 @@ foreach my $source_filename (@files)
 	}
 }
 
-build_clean($code_base) if $build;
-
 exit 0;


Re: Logical Replication Custom Column Expression

2023-02-06 Thread Stavros Koureas
>> And yes, probably you need to change the way you reply to email on
>> this list. Top-posting is generally avoided. See
>> https://wiki.postgresql.org/wiki/Mailing_Lists.

>Thanks for bringing this into the discussion :)

Thinking more about this topic these days: using the subscriber name is not a
bad idea, although it would also make sense to be able to supply your own
value at the subscriber level, for example an integer.
A custom integer value is preferable because that integer will later
participate in all joins between the consolidated tables, and joining on an
integer should be quicker than joining on a character varying column (plus
the rest of the join columns).
In addition, from discussions with other people, and on Stack Overflow and
DBA Stack Exchange, I have found that others also consider this a great
enhancement for analytical purposes.
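
As a sketch of that analytical pattern (table and column names are invented):
each subscription tags its rows with an integer identifying the origin, and
downstream queries join and filter on that integer rather than on a text
label.

-- Consolidated tables receiving rows from several publishers; tenant_id
-- would be the per-subscription integer being proposed here.
SELECT o.order_id, c.customer_name, o.amount
FROM   consolidated_orders    o
JOIN   consolidated_customers c
       ON  c.tenant_id   = o.tenant_id
       AND c.customer_id = o.customer_id
WHERE  o.tenant_id = 3;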

On Wed, Nov 30, 2022 at 10:39 AM Stavros Koureas <koureasstav...@gmail.com> wrote:

>
>
> On Tue, Nov 29, 2022 at 3:27 PM Ashutosh Bapat <ashutosh.bapat@gmail.com> wrote:
> > That would be too restrictive - not necessarily in your application
> > but generally. There could be some tables where consolidating rows
> > with same PK from different publishers into a single row in subscriber
> > would be desirable. I think we need to enable the property for every
> > subscriber that intends to add publisher column to the desired and
> > subscribed tables. But there should be another option per table which
> > will indicate that receiver should add publisher when INSERTING row to
> > that table.
>
> So we are discussing the scope level of this property, if this property
> will be implemented on subscriber level or on subscriber table.
> In that case I am not sure how this will be implemented as currently
> postgres subscribers can have multiple tables streamed from a single
> publisher.
> In that case we may have an additional syntax on subscriber, for example:
>
> CREATE SUBSCRIPTION sub1 CONNECTION 'host=localhost port=5432
> user=postgres password=XX dbname=publisher1' PUBLICATION pub1 with
> (enabled = false, create_slot = false, slot_name = NONE, tables =
> {tableA:union, tableB:none, });
>
> Something like this?
>
> > And yes, probably you need to change the way you reply to email on
> > this list. Top-posting is generally avoided. See
> > https://wiki.postgresql.org/wiki/Mailing_Lists.
>
> Thanks for bringing this into the discussion :)
>


Re: Understanding years part of Interval

2023-02-06 Thread Marcos Pegoraro
On Mon, Feb 6, 2023 at 10:59 AM Erik Wienhold wrote:

> > On 06/02/2023 12:20 CET Marcos Pegoraro  wrote:
> >
> > I was just playing with some random timestamps for a week, for a month,
> > for a year ...
> >
> > select distinct current_date+((random()::numeric)||'month')::interval
> from generate_series(1,100) order by 1;
> > It´s with distinct clause because if you change that 'month' for a 'year'
> > it´ll return only 12 rows, instead of 100. So, why years part of interval
> > works differently than any other ?
> >
> > select '1.01 week'::interval; --> 0 years 0 mons 7 days 1 hours 40 mins
> 48.00 secs
> > select '1.01 month'::interval; --> 0 years 1 mons 0 days 7 hours 12 mins
> 0.00 secs
> > select '1.01 year'::interval; --> 1 years 0 mons 0 days 0 hours 0 mins
> 0.00 secs
>
> Explained in
> https://www.postgresql.org/docs/15/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
> :
>
> Field values can have fractional parts: for example, '1.5 weeks' or
> '01:02:03.45'. However, because interval internally stores only
> three integer units (months, days, microseconds), fractional units
> must be spilled to smaller units. Fractional parts of units greater
> than months are rounded to be an integer number of months, e.g.
> '1.5 years' becomes '1 year 6 mons'. Fractional parts of weeks and
> days are computed to be an integer number of days and microseconds,
> assuming 30 days per month and 24 hours per day, e.g., '1.75
> months'
> becomes 1 mon 22 days 12:00:00. Only seconds will ever be shown as
> fractional on output.
>
> Internally interval values are stored as months, days, and
> microseconds. This is done because the number of days in a month
> varies, and a day can have 23 or 25 hours if a daylight savings
> time
> adjustment is involved.
>
I sent this message initially to pgsql-general, and Erik told me it's
documented, so perhaps -hackers can help me understand why it's done that
way.

select '1 year'::interval = '1.05 year'::interval -->true ?
I cannot agree that this select returns true.
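
For reference, the behaviours being compared in this thread, written out as
queries (outputs as reported above; exact formatting depends on
IntervalStyle):

SELECT '1.01 week'::interval;   -- 7 days 01:40:48, fraction spills to smaller units
SELECT '1.01 month'::interval;  -- 1 mon 07:12:00, fraction spills to days and hours
SELECT '1.01 year'::interval;   -- 1 year, the fraction of a year is lost entirely
SELECT '1 year'::interval = '1.05 year'::interval;  -- true, per the report above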


Re: run pgindent on a regular basis / scripted manner

2023-02-06 Thread Peter Eisentraut

On 02.02.23 07:40, Tom Lane wrote:

Noah Misch  writes:

Regarding the concern about a pre-receive hook blocking an emergency push, the
hook could approve every push where a string like "pgindent: no" appears in a
commit message within the push.  You'd still want to make the tree clean
sometime the same week or so.  It's cheap to provide a break-glass like that.


I think the real question here is whether we can get all (or at least
a solid majority of) committers to accept such draconian constraints.
I'd buy into it, and evidently so would you, but I can't help noting
that less than a quarter of active committers have bothered to
comment on this thread.  I suspect the other three-quarters would
be quite annoyed if we tried to institute such requirements.  That's
not manpower we can afford to drive away.


I have some concerns about this.

First, as a matter of principle, it would introduce another level of 
gatekeeping power.  Right now, the committers are as a group in charge 
of what gets into the tree.  Adding commit hooks that are installed 
somewhere(?) by someone(?) and can only be seen by some(?) would upset 
that.  If we were using something like github or gitlab (not suggesting 
that, but for illustration), then you could put this kind of thing under 
.github/ or similar and then it would be under the same control as the 
source code itself.


Also, pgindent takes tens of seconds to run, so hooking that into the 
git push process would slow this down quite a bit.  And maybe we want to 
add pgperltidy and so on, where would this lead?  If somehow your local 
indenting doesn't give you the "correct" result for some reason, you 
might sit there for minutes and minutes trying to fix and push and fix 
and push.


Then, consider the typedefs issue.  If you add a typedef but don't add 
it to the typedefs list but otherwise pgindent your code perfectly, the 
push would be accepted.  If then later someone updates the typedefs 
list, perhaps from the build farm, it would then reject the indentation 
of your previously committed code, thus making it their problem.


I think a better way to address these issues would be making this into a 
test suite, so that you can run some command that checks "is everything 
indented correctly".  Then you can run this locally, on the build farm, 
in the cfbot etc. in a uniform way and apply the existing 
blaming/encouragement processes like for any other test failure.






Re: run pgindent on a regular basis / scripted manner

2023-02-06 Thread Andrew Dunstan


On 2023-02-06 Mo 10:36, Robert Haas wrote:

On Mon, Feb 6, 2023 at 10:21 AM Andrew Dunstan  wrote:

Good suggestions. 1 and 3 seem fairly straightforward. I'll start on those, and 
look into 2.

Thanks!



Here's a quick patch for 1 and 3. Would also need to adjust the docco.


cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: undersized unions

2023-02-06 Thread Tom Lane
Robert Haas  writes:
> I do not really know exactly how to do that, though. Our usual pattern
> is to just have a struct and end with a variable-length array, or
> alternatively add a comment that says "other stuff follows!" at the end of
> the struct definition, without doing anything that C knows about at
> all. But here it's more complicated: there's a uint16 value for sure,
> and then maybe an int16 value, and then some number of NumericDigit
> values. That "maybe an int16 value" part is not something that C has a
> built-in way of representing, to my knowledge, which is why we end up
> with this hackish thing.

If we were willing to blow off the optimizations for NBASE < 10000,
and say that NumericDigit is always int16, then it would be possible
to represent all of these variants as plain array-of-int16, with
some conventions about which indexes are what (and some casting
between int16 and uint16).

I am, however, very dubious that Andres is correct that there's a
problem here.  Given that two of the variants of union NumericChoice
are structs ending with a flexible array, any compiler that thinks
it knows the size of the union precisely is broken.

regards, tom lane




Re: undersized unions

2023-02-06 Thread Robert Haas
On Sun, Feb 5, 2023 at 6:28 AM Andres Freund  wrote:
> On the other hand, it also just seems risky from a code writing perspective. 
> It's not immediately obvious that it'd be unsafe to create an on-stack Numeric
> by assigning *ptr. But it is.

Well, I think that is pretty obvious: we have lots of things that are
essentially variable-length types, and you can't put any of them on
the stack.

But I do also think that the Numeric situation is messier than some
others we have got, and that's partly my fault, and it would be nice
to make it better.

I do not really know exactly how to do that, though. Our usual pattern
is to just have a struct and end with a variable-length array, or
alternatively add a comment that says "other stuff follows!" at the end of
the struct definition, without doing anything that C knows about at
all. But here it's more complicated: there's a uint16 value for sure,
and then maybe an int16 value, and then some number of NumericDigit
values. That "maybe an int16 value" part is not something that C has a
built-in way of representing, to my knowledge, which is why we end up
with this hackish thing.

--
Robert Haas
EDB: http://www.enterprisedb.com




Re: Parallelize correlated subqueries that execute within each worker

2023-02-06 Thread Antonin Houska
James Coleman  wrote:

> On Sat, Jan 21, 2023 at 10:07 PM James Coleman  wrote:
> > ...
> > While working through Tomas's comment about a conditional in the
> > max_parallel_hazard_waker being guaranteed true I realized that in the
> > current version of the patch the safe_param_ids tracking in
> > is_parallel_safe isn't actually needed any longer. That seemed
> > suspicious, and so I started digging, and I found out that in the
> > current approach all of the tests pass with only the changes in
> > clauses.c. I don't believe that the other changes aren't needed;
> > rather I believe there isn't yet a test case exercising them, but I
> > realize that means I can't prove they're needed. I spent some time
> > poking at this, but at least with my current level of imagination I
> > haven't stumbled across a query that would exercise these checks.
> 
> I played with this a good bit more yesterday, I'm now a good bit more
> confident this is correct. I've cleaned up the patch; see attached for
> v7.
> 
> Here's some of my thought process:
> The comments in src/include/nodes/pathnodes.h:2953 tell us that
> PARAM_EXEC params are used to pass values around from one plan node to
> another in the following ways:
> 1. Values down into subqueries (for outer references in subqueries)
> 2. Up out of subqueries (for the results of a subplan)
> 3. From a NestLoop plan node into its inner relation (when the inner
> scan is parameterized with values from the outer relation)
> 
> Case (2) is already known to be safe (we currently add these params to
> safe_param_ids in max_parallel_hazard_walker when we encounter a
> SubPlan node).
> 
> I also believe case (3) is already handled. We don't build partial
> paths for joins when joinrel->lateral_relids is non-empty, and join
> order calculations already require that parameterization here go the
> correct way (i.e., inner depends on outer rather than the other way
> around).
> 
> That leaves us with only case (1) to consider in this patch. Another
> way of saying this is that this is really the only thing the
> safe_param_ids tracking is guarding against. For params passed down
> into subqueries we can further distinguish between init plans and
> "regular" subplans. We already know that params from init plans are
> safe (at the right level). So we're concerned here with a way to know
> if the params passed to subplans are safe. We already track required
> rels in ParamPathInfo, so it's fairly simple to do this test.
> 
> Which this patch we do in fact now see (as expected) rels with
> non-empty lateral_relids showing up in generate_[useful_]gather_paths.
> And the partial paths can now have non-empty required outer rels. I'm
> not able to come up with a plan that would actually be caught by those
> checks; I theorize that because of the few places we actually call
> generate_[useful_]gather_paths we are in practice already excluding
> those, but for now I've left these as a conditional rather than an
> assertion because it seems like the kind of guard we'd want to ensure
> those methods are safe.

Maybe we can later (in separate patches) relax the restrictions imposed on
partial path creation a little bit, so that more parameterized partial paths
are created.

One particular case that should be rejected by your checks is a partial index
path, which can be parameterized, but I couldn't construct a query that makes
your checks fire. Maybe the reason is that a parameterized index path is
mostly used on the inner side of a NL join; however, no partial path can be
used there. (The problem is that each worker evaluating the NL join would only
see a subset of the inner relation, which would lead to incorrect results.)

So I'd also choose conditions rather than assert statements.


Following are my (minor) findings:

In generate_gather_paths() you added this test

/*
 * Delay gather path creation until the level in the join tree where all
 * params used in a worker are generated within that worker.
 */
if (!bms_is_subset(required_outer, rel->relids))
return;

but I'm not sure if required_outer can contain anything of rel->relids. How
about using bms_is_empty(required_outer), or even this?

if (required_outer)
return;

Similarly,

/* We can't pass params to workers. */
if (!bms_is_subset(PATH_REQ_OUTER(cheapest_partial_path), rel->relids))

might look like

if (!bms_is_empty(PATH_REQ_OUTER(cheapest_partial_path)))

or

if (PATH_REQ_OUTER(cheapest_partial_path))

In particular, build_index_paths() does the following when setting
outer_relids (which eventually becomes (path->param_info->ppi_req_outer):

/* Enforce convention that outer_relids is exactly NULL if empty */
if (bms_is_empty(outer_relids))
outer_relids = NULL;


Another question is whether in this call

simple_gather_path = (Path *)
create_gather_path(root, rel, 

Re: [PATCH] Add pretty-printed XML output option

2023-02-06 Thread Tom Lane
Jim Jones  writes:
> The cfbot on "Windows - Server 2019, VS 2019 - Meson & ninja" is failing 
> the regression tests with the error:

> ERROR:  unsupported XML feature
> DETAIL:  This functionality requires the server to be built with libxml 
> support.

> Is there anything I can do to enable libxml to run my regression tests?

Your patch needs to also update expected/xml_1.out to match the output
the test produces when run without --with-libxml.

regards, tom lane




Re: when the startup process doesn't (logging startup delays)

2023-02-06 Thread Robert Haas
On Mon, Feb 6, 2023 at 11:15 AM Tom Lane  wrote:
> Robert Haas  writes:
> > On Mon, Feb 6, 2023 at 11:07 AM Tom Lane  wrote:
> >> Umm ... is this really the sort of patch to be committing on a
> >> release wrap day?
>
> > Oh, shoot, I wasn't thinking about that. Would you like me to revert
> > it in v15 for now?
>
> Yeah, seems like the safest course.  I wouldn't object to it going in
> after the release is over, but right now there's zero margin for error.

Done.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

2023-02-06 Thread Robert Haas
On Wed, Feb 1, 2023 at 3:37 PM Jacob Champion  wrote:
> > I'm not an expert on this stuff, but to me that feels like a weak and
> > fuzzy concept. If the client is going to tell the server something,
> > I'd much rather have it say something like "i'm proxying a request
> > from my local user rhaas, who authenticated using such and such a
> > method and connected from such and such an IP yadda yadda". That feels
> > to me like really clear communication that the server can then be
> > configured to something about via pg_hba.conf or similar. Saying "use
> > in-band authentication only", to me, feels much murkier. As the
> > recipient of that message, I don't know exactly what to do about it,
> > and it feels like whatever heuristic I adopt might end up being wrong
> > and something bad happens anyway.
>
> Is it maybe just a matter of terminology? If a proxy tells the server,
> "This user is logging in. Here's the password I have for them. DO NOT
> authenticate using anything else," and the HBA says to use ident auth
> for that user, then the server fails the connection. That's what I
> mean by in-band -- the proxy says, "here are the credentials for this
> connection." That's it.

I don't think that's quite the right concept. It seems to me that the
client is responsible for informing the server of what the situation
is, and the server is responsible for deciding whether to allow the
connection. In your scenario, the client is not only communicating
information ("here's the password I have got") but also making demands
on the server ("DO NOT authenticate using anything else"). I like the
first part fine, but not the second part.

Consider the scenario where somebody wants to allow a connection that
is proxied and does not require a password. For example, maybe I have
a group of three machines that all mutually trust each other and the
network is locked down so that we need not worry about IP spoofing or
whatever. Just to be doubly sure, they all have SSL certificates so that
they can verify that an incoming connection is from one of the other
trusted machines. I, as the administrator, want to configure things so
that each machine will proxy connections to the others as long as
local user = remote user. When the remote machine receives the
connection, it can trust that the request is legitimate provided that
the SSL certificate is successfully verified.

The way I think this should work is, first, on each machine, in the
proxy configuration, there should be a rule that says "only proxy
connections where local user = remote user" (and any other rules I
want to enforce). Second, in the HBA configuration, there should be a
rule that says "if somebody is trying to proxy a connection, it has to
be for one of these IPs and they have to authenticate using an SSL
certificate". In this kind of scenario, the client has no business
demanding that the server authenticate using the password rather than
anything else. The server, not the client, is in charge of deciding
which connections to accept; the client's job is only to decide which
connections to proxy. And the human being is responsible for making
sure that the combination of those two things implements the intended
security policy.

> Agreed. The danger from my end is, I'm trained on configuration
> formats that have infinite bells and whistles. I don't really want to
> go too crazy with it.

Yeah. If I remember my math well enough, the time required to
implement infinite bells and whistles will also be infinite, and as a
wise man once said, real artists ship.

It does seem like a good idea, if we can, to make the configuration
file format flexible enough that we can easily extend it with more
bells and whistles later if we so choose. But realistically most
people are going to have very simple configurations.

> > and that maybe
> > has something in common with our existing configuration file syntaxes.
> > But if we have to invent something new, then we can do that.
>
> Okay. Personally I'd like
> - the ability to set options globally (so filters are optional)
> - the ability to maintain many options for a specific scope (host? IP
> range?) without making my config lines grow without bound
> - the ability to audit a configuration without trusting its comments
>
> But getting all of my wishlist into a sane configuration format that
> handles all the use cases is the tricky part. I'll think about it.

Nobody seemed too keen on my proposal of a bunch of tab-separated
fields; maybe we're all traumatized from pg_hba.conf and should look
for something more complex with a real parser. I thought that
tab-separated fields might be good enough and simple to implement, but
it doesn't matter how simple it is to implement if nobody likes it. We
could do something that looks more like a series of if-then rules,
e.g.

target-host 127.0.0.0/8 => reject
authentication-method scram => accept
reject

But it's only a hop, skip and a jump from there to something that
looks an awful 

Re: [PATCH] Add pretty-printed XML output option

2023-02-06 Thread Jim Jones

Hi,

The cfbot on "Windows - Server 2019, VS 2019 - Meson & ninja" is failing 
the regression tests with the error:


ERROR:  unsupported XML feature
DETAIL:  This functionality requires the server to be built with libxml 
support.


Is there anything I can do to enable libxml to run my regression tests?

v3 adds a missing xmlFree call.

Best, Jim
From ced9fccddc033de98709a6e93dc6530ce68149db Mon Sep 17 00:00:00 2001
From: Jim Jones 
Date: Thu, 2 Feb 2023 21:27:16 +0100
Subject: [PATCH v3 1/3] Add pretty-printed XML output option

This small patch introduces a XML pretty print function.
It basically takes advantage of the indentation feature
of xmlDocDumpFormatMemory from libxml2 to format XML strings.
---
 doc/src/sgml/func.sgml|  34 ++
 src/backend/utils/adt/xml.c   |  30 +
 src/include/catalog/pg_proc.dat   |   3 +
 src/test/regress/expected/xml.out | 107 ++
 src/test/regress/sql/xml.sql  |  34 ++
 5 files changed, 208 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e09e289a43..e8b5e581f0 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -14293,6 +14293,40 @@ SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
 ]]>
 

+   
+ 
+xmlpretty
+
+ 
+ xmlpretty
+ 
+ 
+
+xmlpretty ( xml ) xml
+
+ 
+ 
+ Converts the given XML value to pretty-printed, indented text.
+ 
+ 
+ 
+ Example:
+ 
+ 
+   
+   

 

diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 079bcb1208..6409133137 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -473,6 +473,36 @@ xmlBuffer_to_xmltype(xmlBufferPtr buf)
 }
 #endif
 
+Datum
+xmlpretty(PG_FUNCTION_ARGS)
+{
+#ifdef USE_LIBXML
+
+xmlDocPtr  doc;
+xmlChar*buf = NULL;
+text   *arg = PG_GETARG_TEXT_PP(0);
+
+doc = xml_parse(arg, XMLOPTION_DOCUMENT, false, GetDatabaseEncoding(), NULL);
+
+/**
+* xmlDocDumpFormatMemory (
+*   xmlDocPtr doc,  # the XML document.
+*   xmlChar ** buf, # buffer where the formatted XML document will be stored.
+*   int *size,  # this could store the size of the created buffer
+* but as we do not need it, we can leave it NULL.
+*   int format) # 1 = node indenting.
+*/
+xmlDocDumpFormatMemory(doc, &buf, NULL, 1);
+
+xmlFreeDoc(doc);
+PG_RETURN_XML_P(cstring_to_xmltype((char*)buf));
+
+#else
+NO_XML_SUPPORT();
+return 0;
+#endif
+}
+
 
 Datum
 xmlcomment(PG_FUNCTION_ARGS)
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index c0f2a8a77c..3224dc3e76 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -8842,6 +8842,9 @@
 { oid => '3053', descr => 'determine if a string is well formed XML content',
   proname => 'xml_is_well_formed_content', prorettype => 'bool',
   proargtypes => 'text', prosrc => 'xml_is_well_formed_content' },
+  { oid => '4642', descr => 'Indented text from xml',
+  proname => 'xmlpretty', prorettype => 'xml',
+  proargtypes => 'xml', prosrc => 'xmlpretty' },
 
 # json
 { oid => '321', descr => 'I/O',
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index 3c357a9c7e..98a338ad8d 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1599,3 +1599,110 @@ SELECT * FROM XMLTABLE('.' PASSING XMLELEMENT(NAME a) columns a varchar(20) PATH
   | foo/
 (1 row)
 
+-- XML pretty print: single line XML string
+SELECT xmlpretty('Belgian Waffles$5.95Two of our famous Belgian Waffles with plenty of real maple syrup650')::xml;
+xmlpretty 
+--
+ +
+ +
+ Belgian Waffles+
+ $5.95+
+ Two of our famous Belgian Waffles with plenty of real maple syrup+
+ 650+
+  +
++
+ 
+(1 row)
+
+-- XML pretty print: XML string with space, tabs and newline between nodes
+SELECT xmlpretty(' Belgian Waffles $15.95
+ Two of our famous Belgian Waffles with plenty of real maple syrup
+650')::xml;
+xmlpretty 

Re: make_ctags: use -I option to ignore pg_node_attr macro

2023-02-06 Thread Fujii Masao




On 2022/10/19 13:25, Tatsuo Ishii wrote:
> Thanks. the v6 patch pushed to master branch.

Since this commit, make_etags has started failing to generate
tags files with the following error messages, on my MacOS.

$ src/tools/make_etags
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ctags:
 illegal option -- e
usage: ctags [-BFadtuwvx] [-f tagsfile] file ...
sort: No such file or directory


In my MacOS, non-Exuberant ctags is installed and doesn't support
-e option. But the commit changed make_etags so that it always
calls ctags with -e option via make_ctags. This seems the cause of
the above failure.

IS_EXUBERANT=""
ctags --version 2>&1 | grep Exuberant && IS_EXUBERANT="Y"

make_ctags has the above code and seems to support non-Exuberant ctags.
If so, we should revert the changes of make_etags by the commit and
make make_etags work with that ctags? Or, we should support
only Exuberant-type ctags (btw, I'm ok with this) and get rid of
something like the above code?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Where is the filter?

2023-02-06 Thread Robert Haas
On Sat, Feb 4, 2023 at 11:29 PM jack...@gmail.com  wrote:
> When I use 'select * from t where a = 1'; And I debug to find where the 'a = 
> 1' is used,
> when I arrive ExecScan in src/backend/executor/execScan.c, line 158, where 
> this 'a = 1' is
> stored in?

It depends somewhat on what query plan you got. For instance, if it was
a Seq Scan, then it will be a filter condition, or "qual", and the call
to ExecQual() later in ExecScan() will be responsible for evaluating
it. But if you are using an index scan, then it will probably become
an index qual, and those are passed down into the index machinery and
internally handled by the index AM. It's a good idea when you're
debugging this sort of thing to start by looking at the EXPLAIN or
EXPLAIN ANALYZE output, and perhaps also the output with
debug_print_plan = true.
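
For example (illustrative only -- assuming a table t with an index on
column a; the exact costs and plan shape will vary):

EXPLAIN SELECT * FROM t WHERE a = 1;

 Seq Scan on t  (cost=0.00..41.88 rows=13 width=8)
   Filter: (a = 1)

 Index Scan using t_a_idx on t  (cost=0.29..8.30 rows=1 width=8)
   Index Cond: (a = 1)

In the first plan the 'a = 1' shows up as a Filter, which is what
ExecQual() evaluates inside ExecScan(); in the second it shows up as an
Index Cond, which is handed off to the index AM.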

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2023-02-06 Thread Tom Lane
Robert Haas  writes:
> On Mon, Feb 6, 2023 at 11:07 AM Tom Lane  wrote:
>> Umm ... is this really the sort of patch to be committing on a
>> release wrap day?

> Oh, shoot, I wasn't thinking about that. Would you like me to revert
> it in v15 for now?

Yeah, seems like the safest course.  I wouldn't object to it going in
after the release is over, but right now there's zero margin for error.

regards, tom lane




Re: OpenSSL 3.0.0 vs old branches

2023-02-06 Thread Tom Lane
Andrew Dunstan  writes:
> I recently moved crake to a new machine running Fedora 36, which has 
> OpenSSL 3.0.0. This causes the SSL tests to fail on branches earlier 
> than release 13, so I propose to backpatch commit f0d2c65f17 to the 
> release 11 and 12 branches.

Hmm ... according to that commit message,

  Note that the minimum supported OpenSSL version is 1.0.1 as of
  7b283d0e1d1d79bf1c962d790c94d2a53f3bb38a, so this does not introduce
  any new version requirements.

So presumably, changing this test would break it for OpenSSL 0.9.8,
which is still nominally supported in those branches.  On the other
hand, this test isn't run by default, so users would likely never
notice anyway.

On the whole, +1 for doing this (after the release freeze lifts).

regards, tom lane




Re: when the startup process doesn't (logging startup delays)

2023-02-06 Thread Robert Haas
On Mon, Feb 6, 2023 at 11:07 AM Tom Lane  wrote:
> Umm ... is this really the sort of patch to be committing on a
> release wrap day?

Oh, shoot, I wasn't thinking about that. Would you like me to revert
it in v15 for now?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2023-02-06 Thread Tom Lane
Robert Haas  writes:
> On Thu, Feb 2, 2023 at 11:22 PM Bharath Rupireddy
>  wrote:
>> I took the v4 patch, added some comments and attached it as the v7
>> patch here. Please find it.

> Committed and back-patched to v15.

Umm ... is this really the sort of patch to be committing on a
release wrap day?

regards, tom lane




Re: when the startup process doesn't (logging startup delays)

2023-02-06 Thread Robert Haas
On Thu, Feb 2, 2023 at 11:22 PM Bharath Rupireddy
 wrote:
> I took the v4 patch, added some comments and attached it as the v7
> patch here. Please find it.

Committed and back-patched to v15.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




OpenSSL 3.0.0 vs old branches

2023-02-06 Thread Andrew Dunstan


I recently moved crake to a new machine running Fedora 36, which has 
OpenSSL 3.0.0. This causes the SSL tests to fail on branches earlier 
than release 13, so I propose to backpatch commit f0d2c65f17 to the 
release 11 and 12 branches.



cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com


Re: run pgindent on a regular basis / scripted manner

2023-02-06 Thread Robert Haas
On Mon, Feb 6, 2023 at 10:21 AM Andrew Dunstan  wrote:
> Good suggestions. 1 and 3 seem fairly straightforward. I'll start on those, 
> and look into 2.

Thanks!

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: run pgindent on a regular basis / scripted manner

2023-02-06 Thread Robert Haas
On Mon, Feb 6, 2023 at 10:16 AM Tom Lane  wrote:
> I'll just note that adding features like those to a Perl script
> is going to be a ton easier than doing it inside pg_bsd_indent.

+1.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

2023-02-06 Thread Tom Lane
Alvaro Herrera  writes:
> I'm writing my own patch for this problem.  While playing around with
> it, I noticed this:

> struct Command {
> /* size: 2168, cachelines: 34, members: 11 */
> /* sum members: 2164, holes: 1, sum holes: 4 */
> /* last cacheline: 56 bytes */
> };

I think the original intent was for argv[] to be at the end,
which fell victim to ye olde add-at-the-end antipattern.
Cache-friendliness-wise, putting it back to the end would
likely be enough.  But turning it into a variable-size array
would be better from a functionality standpoint.
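
Roughly what I have in mind, as a sketch only (the surrounding fields
are placeholders, not the real pgbench definitions):

typedef struct Command
{
    /* ... existing fixed-size members come first ... */
    int         type;
    int         argc;
    char       *argv[FLEXIBLE_ARRAY_MEMBER];    /* sized per command */
} Command;

/* allocate only as many argv slots as this command actually needs */
Command    *cmd = (Command *) pg_malloc(offsetof(Command, argv) +
                                        nargs * sizeof(char *));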

regards, tom lane



