Re: extra semicolon in postgres_fdw test cases

2021-03-30 Thread Amit Kapila
On Wed, Mar 31, 2021 at 9:47 AM Amit Kapila  wrote:
>
> On Wed, Mar 31, 2021 at 9:35 AM Michael Paquier  wrote:
> >
> > On Tue, Mar 30, 2021 at 05:00:53PM +0530, Amit Kapila wrote:
> > > Looks good to me as well but I think one can choose not to backpatch
> > > as there is no functional impact but OTOH, there is some value in
> > > keeping tests/code consistent.
> >
> > FWIW, I would not bother with the back branches for just that, but if
> > you feel that this is better, of course feel free.
> >
>
> Fair enough. I'll push this just for HEAD.
>

Pushed!

-- 
With Regards,
Amit Kapila.




Re: Stronger safeguard for archive recovery not to miss data

2021-03-30 Thread Kyotaro Horiguchi
At Wed, 31 Mar 2021 14:23:26 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> I apologize in advance if I'm missing something.

Oops! I missed the case of the replica side.  It's obviously pointless for a
replica to continue running while allowing a gap created by
wal_level=minimal.

So, I don't object to this patch.

Sorry for the noise.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Stronger safeguard for archive recovery not to miss data

2021-03-30 Thread Kyotaro Horiguchi
At Wed, 31 Mar 2021 02:11:48 +0900, Fujii Masao  
wrote in 
> 
> 
> On 2021/03/26 22:14, David Steele wrote:
> > On 3/25/21 9:23 PM, Fujii Masao wrote:
> >>
> >> On 2021/03/25 23:21, David Steele wrote:
> >>> On 1/25/21 3:55 AM, Laurenz Albe wrote:
>  On Mon, 2021-01-25 at 08:19 +, osumi.takami...@fujitsu.com wrote:
> >> I think you should post another patch where the second, now
> >> superfluous, error
> >> message is removed.
> >
> > Updated. This patch showed no failure during regression tests
> > and has been aligned by pgindent.
> 
>  Looks good to me.
>  I'll set it to "ready for committer" again.
> >>>
> >>> Fujii, does the new patch in [1] address your concerns?
> >>
> >> No. I'm still not sure if this patch is a good idea... I understand
> >> why this safeguard is necessary. OTOH I'm afraid it increases
> >> a bit the risk that users get an unstartable database, i.e., lose the
> >> whole database.
> >> But maybe I'm concerned about a rare case and my opinion is a
> >> minority one.
> >> So I'd like to hear more opinions about this patch.
> > After reviewing the patch I am +1. I think allowing corruption in
> > recovery by default is not a good idea. There is currently a warning
> > but I don't think that is nearly strong enough and is easily missed.
> > Also, "data may be missing" makes this sound like an acceptable
> > situation. What is really happening is corruption is being introduced
> > that may make the cluster unusable or at the very least lead to errors
> > during normal operation.
> 
> Ok, now you, Osumi-san and Laurenz agree to this change
> while I'm the only person who doesn't like it. So unless we hear
> any other objections to this change, we should probably commit the
> patch.

Sorry for being late, but FWIW, I'm confused.

This patch checks whether archive recovery is requested after shutting down
while in minimal mode.  Since we officially refuse to take a backup
while wal_level=minimal, the error is not supposed to be raised
while recovering from a past backup.  If the cluster was running in
minimal mode, the archive recovery does nothing substantial (it would end
up running a crash recovery at worst).  I'm not sure what the concrete
corruption mode caused by the operation looks like.  I agree
that we should reject archive recovery on a wal_level=minimal cluster,
but the reason is not that the operation breaks the past backup,
but that it's simply nonsensical.

On the other hand, I don't think this patch effectively protects the
archive from getting a gap. Just starting the server without
archive recovery succeeds, and the archive is then broken for
the backups taken in the past.  In other words, if we are intending to let
users know that their backups have gotten a gap, this patch
doesn't seem to work in that sense.  I would object to rejecting a start
with wal_level=replica and without archive recovery after shutting
down in minimal mode; that would obviously be overkill.

So, what I think we should do for this objective is to warn if
archiving is enabled but a backup has not yet been taken.  That said (as I
remember, such a discussion has happened before), I don't think we have
a reliable way to know that.

I apologize in advance if I'm missing something.

> > If we want to allow recovery to play past this point I think it would
> > make more sense to have a GUC (like ignore_invalid_pages [1]) that
> > allows recovery to proceed and emits a warning instead of fatal.
> > Looking at the patch, I see a few things:
> > 1) Typo in the tests:
> > This protection shold apply to recovery mode
> > should be:
> > This protection should apply to recovery mode
> > 2) I don't think it should be the job of this patch to restructure the
> > if conditions, even if it is more efficient. It just obscures the
> > purpose of the patch.
> 
> +1

So, I don't think even this is needed.

> > So, I would revert all the changes in xlog.c except changing the
> > warning to an error:
> > -    ereport(WARNING,
> > -    (errmsg("WAL was generated with wal_level=minimal,
> > -data may be missing"),
> > - errhint("This happens if you temporarily set
> > -wal_level=minimal without taking a new base backup.")));
> > +    ereport(FATAL,
> > +    (errmsg("WAL was generated with
> > wal_level=minimal, cannot continue recovering"),
> > + errdetail("This happens if you temporarily set
> > wal_level=minimal on the server."),
> > + errhint("Run recovery again from a new base
> > backup taken after setting wal_level higher than minimal")));
> I guess that users usually encounter this error because they have not
> taken base backups yet after setting wal_level to higher than minimal
> and have to use the old base backup for archive recovery. So I'm not
> sure
> how much only this HINT is helpful for them. Isn't it better to append
> something like "If there is no such backup, recover to the point in

Re: Asynchronous Append on postgres_fdw nodes.

2021-03-30 Thread Etsuro Fujita
On Wed, Mar 31, 2021 at 10:11 AM Kyotaro Horiguchi
 wrote:
> + async_capable
> + 
> +  
> +   This option controls whether postgres_fdw allows
> +   foreign tables to be scanned concurrently for asynchronous execution.
> +   It can be specified for a foreign table or a foreign server.
>
> Isn't it strange that an option named "async_capable" *allows* async?

I think "async_capable" is a good name for that option.  See the
option "updatable" below in the postgres_fdw documentation.

> +* We'll prefer to consider this join async-capable if any 
> table from
> +* either side of the join is considered async-capable.
> +*/
> +   fpinfo->async_capable = fpinfo_o->async_capable ||
> +   fpinfo_i->async_capable;
>
> We need to explain this behavior in the documentation.
>
> Regarding the wording "async capable", if it literally represents
> the capability to run asynchronously, when any one element of a
> combined path doesn't have the capability, the whole path cannot be
> async-capable.  If it represents allowance for an element to run
> asynchronously, then the whole path is inhibited to run asynchronously
> unless all elements are allowed to do so.  If it represents
> enforcement or suggestion to run asynchronously, enforcing asynchrony
> to an element would lead to running the whole path asynchronously
> since all elements of postgres_fdw are capable of running asynchronously
> by nature.
>
> It looks somewhat inconsistent to be inhibitive for the default value
> of "async_capable", but aggressive in merging?

If the foreign table has async_capable=true, it actually means that
there are resources (CPU, IO, network, etc.) to scan the foreign table
concurrently.  And if any table from either side of the join has such
resources, then they could also be used for the join.  So I don't
think this behavior is aggressive.  I think it would be better to add
more comments, though.
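
For illustration, a hedged sketch of how the option would be set at the two
levels mentioned in the documentation snippet above (the server and foreign
table names here are hypothetical):

-- enable asynchronous execution for an existing postgres_fdw server
ALTER SERVER loopback OPTIONS (ADD async_capable 'true');
-- or for an individual foreign table
ALTER FOREIGN TABLE remote_tab OPTIONS (ADD async_capable 'true');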

Anyway, these are all about naming and docs/comments, so I'll return
to this after committing the patch.

Thanks for the review!

Best regards,
Etsuro Fujita




Re: Use consistent terminology for tablesync slots.

2021-03-30 Thread Amit Kapila
On Wed, Mar 31, 2021 at 6:39 AM Peter Smith  wrote:
>
> On Tue, Mar 30, 2021 at 8:14 PM Amit Kapila  wrote:
> >
> > On Tue, Mar 30, 2021 at 2:21 PM Peter Smith  wrote:
> > >
> > > Hi,
> > >
> > > The logical replication tablesync worker creates tablesync slots.
> > >
> > > Previously some PG docs pages were referring to these as "tablesync
> > > slots", but other pages called them as "table synchronization slots".
> > >
> > > PSA a trivial patch which (for consistency) now calls them all the
> > > same -  "tablesync slots"
> > >
> >
> > +1 for the consistency. But I think it better to use "table
> > synchronization slots" in the user-facing docs as that makes it easier
> > for users to understand.
> >
>
> PSA patch version 2 updated to use "table synchronization slots" as suggested.
>

Thanks, Pushed!

-- 
With Regards,
Amit Kapila.




Re: [PATCH] add concurrent_abort callback for output plugin

2021-03-30 Thread Amit Kapila
On Wed, Mar 31, 2021 at 6:42 AM Ajin Cherian  wrote:
>
> Updated.
>

I have slightly adjusted the comments, docs, and commit message. What
do you think about the attached?

-- 
With Regards,
Amit Kapila.


v4-0001-Ensure-to-send-a-prepare-after-we-detect-concurre.patch
Description: Binary data


Re: pg_amcheck contrib application

2021-03-30 Thread Mark Dilger


> On Mar 30, 2021, at 12:45 PM, Robert Haas  wrote:
> 
> On Mon, Mar 29, 2021 at 7:16 PM Mark Dilger
>  wrote:
>> Sure, here are four patches which do the same as the single v12 patch did.
> 
> Thanks. Here are some comments on 0003 and 0004:
> 
> When you posted v11, you said that "Rather than print out all four
> toast pointer fields for each toast failure, va_rawsize, va_extsize,
> and va_toastrelid are only mentioned in the corruption message if they
> are related to the specific corruption.  Otherwise, just the
> va_valueid is mentioned in the corruption message." I like that
> principle; in fact, as you know, I suggested it. But, with the v13
> patches applied, exactly zero of the callers to
> report_toast_corruption() appear to be following it, because none of
> them include the value ID. I think you need to revise the messages,
> e.g.

These changes got lost between v11 and v12.  I've put them back, and also
updated the messages to use your language.

> "toasted value for attribute %u missing from toast table" ->
> "toast value %u not found in toast table";

Changed.

> "final toast chunk number
> %u differs from expected value %u" -> "toast value %u was expected to
> end at chunk %u, but ended at chunk %u";

Changed.

> "toast chunk sequence number
> is null" -> "toast value %u has toast chunk with null sequence
> number".

Changed.

> In the first of those example cases, I think you need not
> mention the attribute number because it's already there in its own
> column.

Correct.  I'd removed that but lost that work in v12.

> On a related note, it doesn't look like you are actually checking
> va_toastrelid here. Doing so seems like it would be a good idea. It
> also seems like it would be good to check that the compressed size is
> less than or equal to the uncompressed size.

Yeah, those checks were in v11 but got lost when I changed things for v12.  
They are back in v14.

> I do not like the name tuple_is_volatile, because volatile has a
> couple of meanings already, and this isn't one of them. A
> SQL-callable function is volatile if it might return different outputs
> given the same inputs, even within the same SQL statement. A C
> variable is volatile if it might be magically modified in ways not
> known to the compiler. I had suggested tuple_cannot_die_now, which is
> closer to the mark. If you want to be even more precise, you could
> talk about whether the tuple is potentially prunable (e.g.
> tuple_can_be_pruned, which inverts the sense). That's really what
> we're worried about: whether MVCC rules would permit the tuple to be
> pruned after we release the buffer lock and before we check TOAST.

I used "tuple_can_be_pruned".  I didn't like "tuple_cannot_die_now", and still 
don't like that name, as it has several wrong interpretations.  One meaning of 
"cannot die now" is that it has become immortal.  Another is "cannot be deleted 
from the table".  

> I would ideally prefer not to rename report_corruption(). The old name
> is clearer, and changing it produces a bunch of churn that I'd rather
> avoid. Perhaps the common helper function could be called
> report_corruption_internal(), and the callers could be
> report_corruption() and report_toast_corruption().

Yes, hence the commit message in the previous patch set, "This patch can 
probably be left out if the committer believes it creates more git churn than 
it is worth."  I've removed this patch from this next patch set, and used the 
function names you suggest.

> Regarding 0001 and 0002, I think the logic in 0002 looks a lot closer
> to correct now, but I want to go through it in more detail. I think,
> though, that you've made some of my comments worse. For example, I
> wrote: "It should be impossible for xvac to still be running, since
> we've removed all that code, but even if it were, it ought to be safe
> to read the tuple, since the original inserter must have committed.
> But, if the xvac transaction committed, this tuple (and its associated
> TOAST tuples) could be pruned at any time." You changed that to read
> "We don't bother comparing against safe_xmin because the VACUUM FULL
> must have committed prior to an upgrade and can't still be running."
> Your comment is shorter, which is a point in its favor, but what I was
> trying to emphasize is that the logic would be correct EVEN IF we
> again started to use HEAP_MOVED_OFF and HEAP_MOVED_IN again. Your
> version makes it sound like the code would need to be revised in that
> case. If that's true, then my comment was wrong, but I didn't think it
> was true, or I wouldn't have written the comment in that way.

I think the logic would have to change if we brought back the old VACUUM FULL 
behavior.

I'm not looking at the old VACUUM FULL code, but my assumption is that if the 
xvac code were resurrected, then when a tuple is moved off by a VACUUM FULL, 
the old tuple and associated toast cannot be pruned until concurrent 
transactions end.  So, if amcheck is running more-or-less 

Re: extra semicolon in postgres_fdw test cases

2021-03-30 Thread Amit Kapila
On Wed, Mar 31, 2021 at 9:35 AM Michael Paquier  wrote:
>
> On Tue, Mar 30, 2021 at 05:00:53PM +0530, Amit Kapila wrote:
> > Looks good to me as well but I think one can choose not to backpatch
> > as there is no functional impact but OTOH, there is some value in
> > keeping tests/code consistent.
>
> FWIW, I would not bother with the back branches for just that, but if
> you feel that this is better, of course feel free.
>

Fair enough. I'll push this just for HEAD.

-- 
With Regards,
Amit Kapila.




Re: unconstrained memory growth in long running procedure stored procedure after upgrading 11-12

2021-03-30 Thread Justin Pryzby
On Tue, Mar 30, 2021 at 04:17:03PM -0500, Merlin Moncure wrote:
> Instructions:
> 1. run the attached script in psql, pgtask_test.sql. It will create a
> database, initialize it, and run the main procedure. dblink must be
> available
> 2. in another window, run SELECT CreateTaskChain('test', 'DEV');

For your reproducer, I needed to:  
1.1) comment this:
|INSERT INTO Task SELECT
|  -- 'test',
1.2) then run: CALL MAIN();

Anyway I reproduced this without an extension this time:

CREATE OR REPLACE FUNCTION cfn() RETURNS void LANGUAGE plpgsql AS $$
declare
  a record;
begin
  FOR a IN SELECT generate_series(1,99) LOOP
    PERFORM format('select 1');
  END LOOP;
end $$;

$ yes 'SET jit_above_cost=0; SET jit_inline_above_cost=0; SET jit=on; SET client_min_messages=debug; SET log_executor_stats=on; SELECT cfn();' |head -11 |psql 2>&1 |grep 'max resident'
!   33708 kB max resident size
!   35956 kB max resident size
!   37800 kB max resident size
!   40300 kB max resident size
!   41928 kB max resident size
!   43928 kB max resident size
!   48496 kB max resident size
!   48964 kB max resident size
!   50460 kB max resident size
!   52272 kB max resident size
!   53740 kB max resident size

There's also a relatively microscopic leak even if inlining is off.  It may be
that this is what I reproduced last time - I couldn't see how a few hundred kB
leak was causing our process to be GB sized.  It may or may not be a separate
issue, though.

-- 
Justin




Re: What to call an executor node which lazily caches tuples in a hash table?

2021-03-30 Thread David Rowley
On Wed, 31 Mar 2021 at 14:43, Andy Fan  wrote:
>  At last, I still want to vote for "Tuple(s) Cache", which sounds simple and 
> enough.
> I was thinking if we need to put "Lazy" in the node name since we do build 
> cache
> lazily,  then I found we didn't call "Materialize"  as "Lazy Materialize", so 
> I think we
> can keep consistent.

I thought about this a little more and I can see now why I put the
word "Cache" in the original name. I do now agree we really need to
keep the word "Cache" in the name.

The EXPLAIN ANALYZE talks about "hits", "misses" and "evictions", all
of those are things that caches do.

   ->  Result Cache (actual rows=1 loops=403)
         Cache Key: c1.relkind
         Hits: 398  Misses: 5  Evictions: 0  Overflows: 0  Memory Usage: 1kB
         ->  Aggregate (actual rows=1 loops=5)

I don't think there's any need to put the word "Lazy" in the name as
if we're keeping "Cache", then most caches do only cache results of
values that have been looked for.

I'm just not sure if "Tuple" is the best word or not.  I primarily
think of "tuple" as the word we use internally, but a quick grep of
the docs reminds me that's not the case. The word is used all over the
documents. We have GUCs like parallel_tuple_cost and cpu_tuple_cost.
So it does seem like the sort of thing anyone who is interested in
looking at the EXPLAIN output should know about. I'm just not
massively keen on using that word in the name.  The only other options
that come to mind are "Result" and "Parameterized". However, I think
"Parameterized" does not add much meaning.  I think most people would
expect a cache to have a key.  I sort of see why I went with "Result
Cache" now.

Does anyone else like the name "Tuple Cache"?

David




Re: extra semicolon in postgres_fdw test cases

2021-03-30 Thread Michael Paquier
On Tue, Mar 30, 2021 at 05:00:53PM +0530, Amit Kapila wrote:
> Looks good to me as well but I think one can choose not to backpatch
> as there is no functional impact but OTOH, there is some value in
> keeping tests/code consistent.

FWIW, I would not bother with the back branches for just that, but if
you feel that this is better, of course feel free.
--
Michael


signature.asc
Description: PGP signature


Re: Issue with point_ops and NaN

2021-03-30 Thread Julien Rouhaud
On Tue, Mar 30, 2021 at 11:39:40PM +0800, Julien Rouhaud wrote:
> On Tue, Mar 30, 2021 at 11:02:32AM -0400, Tom Lane wrote:
> > Julien Rouhaud  writes:
> > > On Tue, Mar 30, 2021 at 02:47:05PM +0200, Laurenz Albe wrote:
> > >> I'd say that this is certainly wrong:
> > >> SELECT point('NaN','NaN') <@ polygon('(0,0),(1,0),(1,1),(0,0)');
> > >> 
> > >> ?column? 
> > >> --
> > >>  t
> > >>  (1 row)
> > 
> > > Yeah that's what I think too, but I wanted to have confirmation.
> > 
> > Agreed --- one could make an argument for either 'false' or NULL
> > result, but surely not 'true'.
> 
> I would think that it should return NULL since it's neither inside nor outside
> the polygon, but I'm fine with false.
> 
> > I wonder if Horiguchi-san's patch [1] improves this case.
> 
> Oh I totally missed that patch :(
> 
> After a quick look I see this addition in point_inside():
> 
> + /* NaN makes the point cannot be inside the polygon */
> + if (unlikely(isnan(x) || isnan(y)))
> + return 0;
> 
> So I would assume that it should fix this case too.  I'll check tomorrow.

I confirm that this patch fixes the issue, and after looking a bit more at the
thread it's unsurprising since Jesse initially reported the exact same problem.

I'll try to review it as soon as I'm done with my work duties.




Re: Proposal: Save user's original authenticated identity for logging

2021-03-30 Thread Michael Paquier
On Tue, Mar 30, 2021 at 05:06:51PM +, Jacob Champion wrote:
> The key there is "if there is a failure" -- false positives need to be
> debugged too. Tests I've worked with recently for the NSS work were
> succeeding for the wrong reasons. Overly generic logfile matches ("SSL
> error"), for example.

Indeed, so that's a test stability issue.  It looks like a good idea
to make those tests more picky about the sub-errors they expect.  I see
mostly "certificate verify failed", two "sslv3 alert certificate
revoked" and one "tlsv1 alert unknown ca" with 1.1.1, but it is not
something that this patch has to address IMO.

> modules/ssl_passphrase_callback/t/001_testfunc.pl is where I pulled
> this pattern from.

I see.  For this case, I see no issue as the input caught is from
_PG_init() so that seems better than a wait on the logs generated.

> Does unilateral log truncation play any nicer with parallel test runs?
> I understand not wanting to make an existing problem worse, but it
> doesn't seem like the existing tests were written for general
> parallelism.

TAP tests running in parallel use their own isolated backend, with
dedicated paths and ports.

> Would it be acceptable to adjust the tests for live rotation using the
> logging collector, rather than a full restart? It would unfortunately
> mean that we have to somehow wait for the rotation to complete, since
> that's asynchronous.
> 
> (Speaking of asynchronous: how does the existing check-and-truncate
> code make sure that the log entries it's looking for have been flushed
> to disk? Shutting down the server guarantees it.)

stderr redirection looks to be working pretty well with
issues_sql_like().

> I took a stab at this in v13: "Causes each attempted connection to the
> server to be logged, as well as successful completion of both client
> authentication (if necessary) and authorization." (IMO any further in-
> depth explanation of authn/z and user mapping probably belongs in the
> auth method documentation, and this patch doesn't change any authn/z
> behavior.)
>
> v13 also incorporates the latest SSL cert changes, so it's just a
> single patch now. Tests now cover the CN and DN clientname modes. I
> have not changed the log capture method yet; I'll take a look at it
> next.

Thanks, I am looking into that and I am digging into the code now.
--
Michael


signature.asc
Description: PGP signature


Re: Outdated comment for CreateStmt.inhRelations

2021-03-30 Thread Julien Rouhaud
On Wed, Mar 31, 2021 at 09:36:51AM +0900, Michael Paquier wrote:
> On Tue, Mar 30, 2021 at 08:30:15PM +0800, Julien Rouhaud wrote:
> > I just noticed that the comment for CreateStmt.inhRelations says that it's a
> > List of inhRelation, which hasn't been the case for a very long time.
> 
> Thanks, applied.

Thanks!




Re: extra semicolon in postgres_fdw test cases

2021-03-30 Thread vignesh C
On Tue, Mar 30, 2021 at 3:21 PM Suraj Kharage
 wrote:
>
> Hi,
>
> Noticed an extra semicolon in a couple of test cases related to
> postgres_fdw. Removed that in the attached patch. It can be backpatched to
> v11, where we added those test cases.
>

Thanks for identifying this, the changes look fine to me.

Regards,
Vignesh




Re: Lowering the ever-growing heap->pd_lower

2021-03-30 Thread Peter Geoghegan
On Wed, Mar 10, 2021 at 6:01 AM Matthias van de Meent
 wrote:
> > The case I was concerned about back when is that there are various bits of
> > code that may visit a page with a predetermined TID in mind to look at.
> > An index lookup is an obvious example, and another one is chasing an
> > update chain's t_ctid link.  You might argue that if the tuple was dead
> > enough to be removed, there should be no such in-flight references to
> > worry about, but I think such an assumption is unsafe.  There is not, for
> > example, any interlock that ensures that a process that has obtained a TID
> > from an index will have completed its heap visit before a VACUUM that
> > subsequently removed that index entry also removes the heap tuple.
>
> I am aware of this problem. I will admit that I did not detect all
> potentially problematic accesses, so I'll show you my work.

Attached is a trivial rebase of your v3, which I've called v4. I am
interested in getting this patch into Postgres 14.

> In my search for problematic accesses, I make the following assumptions:
> * PageRepairFragmentation as found in bufpage is only applicable to
> heap pages; this function is not applied to other pages in core
> postgres. So, any problems that occur are due to access with an
> OffsetNumber > PageGetMaxOffsetNumber.
> * Items [on heap pages] are only extracted after using PageGetItemId
> for that item on the page, whilst holding a lock.

> The 3 problem cases were classified based on the origin of the
> potentially invalid pointer.
>
> Problem case 1: table scan; heapam.c lines 678 and 811, in heapgettup

I think that it boils down to this: 100% of the cases where this could
be a problem involve either old TIDs or old line pointers -- in
principle these could be invalidated in some way, like your backwards
scan example. But that's it. Bear in mind that we always call
PageRepairFragmentation() with a super-exclusive lock.

> This is in the replay of transaction logs, in heap_xlog_freeze_page.
> As I am unaware whether or not pages to which these transaction logs
> are applied can contain changes from the xlog-generating instance, I
> flagged this as a potential problem. The application of the xlogs is
> probably safe (it assumes the existence of a HeapTupleHeader for that
> ItemId), but my limited knowledge put this on the 'potential problem'
> list.
>
> Please advise on this; I cannot find the right documentation

Are you aware of wal_consistency_checking?

I think that this should be fine. There are differences that are
possible between a replica and primary, but they're very minor and
don't seem relevant.

> Problem case 3: invalid redirect pointers; pruneheap.c line 949, in
> heap_get_root_tuples
>
> The heap pruning mechanism currently assumes that all redirects are
> valid. Currently, if a HOT root points to !ItemIdIsNormal, it breaks
> out of the loop, but doesn't actually fail. This code is already
> potentially problematic because it has no bounds or sanity checking
> for the nextoffnum, but with shrinking pd_linp it would also add the
> failure mode of HOT tuples pointing into what is now arbitrary data.

heap_prune_chain() is less trusting than heap_get_root_tuples(),
though -- it doesn't trust redirects (because there is a generic
offnum sanity check at the start of its loop). I think that the
inconsistency between these two functions probably isn't hugely
significant.

Ideally it would be 100% clear which of the defenses in code like this
are merely extra hardening. The assumptions should be formalized. There
is nothing wrong with hardening, but we should know it when we see it.

> This failure mode is now accompanied by an Assert, which fails when
> the redirect is to an invalid OffsetNumber. This is not enough to not
> exhibit arbitrary behaviour when accessing corrupted data with
> non-assert builds, but I think that that is fine; we already do not
> have a guaranteed behaviour for this failure mode.

What about my "set would-be LP_UNUSED space to all-0x7F bytes in
CLOBBER_FREED_MEMORY builds" idea? Did you think about that?

--
Peter Geoghegan


v4-0001-Truncate-a-pages-line-pointer-array-when-it-has-t.patch
Description: Binary data


Re: DROP INDEX docs - explicit lock naming

2021-03-30 Thread Greg Rychlewski
Thanks for pointing that out. I've attached a new patch with several other
updates where I felt confident the docs were referring to an ACCESS
EXCLUSIVE lock.

On Tue, Mar 30, 2021 at 8:47 PM Michael Paquier  wrote:

> On Tue, Mar 30, 2021 at 10:33:46AM -0400, Greg Rychlewski wrote:
> > While reading the documentation for DROP INDEX[1], I noticed the lock was
> > described colloquially as an "exclusive" lock, which made me pause for a
> > second because it's the same name as the EXCLUSIVE table lock.
> >
> > The attached patch explicitly states that an ACCESS EXCLUSIVE lock is
> > acquired.
>
> Indeed, this could be read as ACCESS SHARE being allowed, but that's
> never the case for any of the index code paths, except if CONCURRENTLY
> is involved.  It is not the only place in the docs where we could do
> more clarification.  For instance, reindex.sgml mentions twice an
> exclusive lock but that should be an access exclusive lock.  To be
> exact, I can spot 27 places under doc/ that could be improved.  Such
> changes depend on the surrounding context, of course.
> --
> Michael
>


v2-docs-access-exclusive-lock.patch
Description: Binary data


Re: making update/delete of inheritance trees scale better

2021-03-30 Thread Amit Langote
On Wed, Mar 31, 2021 at 11:56 AM Tom Lane  wrote:
> I noticed something else interesting.  If you try an actually-useful
> UPDATE, ie one that has to do some computation in the target list,
> you can get a plan like this if it's a partitioned table:
>
> EXPLAIN (verbose, costs off) UPDATE parent SET f2 = f2 + 1;
> QUERY PLAN
> ---
>  Update on public.parent
>Update on public.child1 parent_1
>Update on public.child2 parent_2
>Update on public.child3 parent_3
>->  Append
>  ->  Seq Scan on public.child1 parent_1
>Output: (parent_1.f2 + 1), parent_1.tableoid, parent_1.ctid
>  ->  Seq Scan on public.child2 parent_2
>Output: (parent_2.f2 + 1), parent_2.tableoid, parent_2.ctid
>  ->  Seq Scan on public.child3 parent_3
>Output: (parent_3.f2 + 1), parent_3.tableoid, parent_3.ctid
>
> But when using traditional inheritance, it looks more like:
>
> EXPLAIN (verbose, costs off) UPDATE parent SET f2 = f2 + 1;
> QUERY PLAN
> ---
>  Update on public.parent
>Update on public.parent parent_1
>Update on public.child1 parent_2
>Update on public.child2 parent_3
>Update on public.child3 parent_4
>->  Result
>  Output: (parent.f2 + 1), parent.tableoid, parent.ctid
>  ->  Append
>->  Seq Scan on public.parent parent_1
>  Output: parent_1.f2, parent_1.tableoid, parent_1.ctid
>->  Seq Scan on public.child1 parent_2
>  Output: parent_2.f2, parent_2.tableoid, parent_2.ctid
>->  Seq Scan on public.child2 parent_3
>  Output: parent_3.f2, parent_3.tableoid, parent_3.ctid
>->  Seq Scan on public.child3 parent_4
>  Output: parent_4.f2, parent_4.tableoid, parent_4.ctid
>
> That is, instead of shoving the "f2 + 1" computation down to the table
> scans, it gets done in a separate Result node, implying yet another
> extra node in the plan with resultant slowdown.  The reason for this
> seems to be that apply_scanjoin_target_to_paths has special logic
> to push the target down to members of a partitioned table, but it
> doesn't do that for other sorts of appendrels.  That isn't new
> with this patch, you can see the same behavior in SELECT.

I've noticed this too when investigating why
find_modifytable_subplan() needed to deal with a Result node in some
cases.

> Given the distinct whiff of second-class citizenship that traditional
> inheritance has today, I'm not sure how excited people will be about
> fixing this.  I've complained before that apply_scanjoin_target_to_paths
> is brute-force and needs to be rewritten, but I don't really want to
> undertake that task right now.

I remember having *unsuccessfully* tried to make
apply_scanjoin_target_to_paths() do the targetlist pushdown for the
traditional inheritance cases as well.  I agree that rethinking the
whole apply_scanjoin_target_to_paths() approach might be a better use
of our time.  It has a looping-over-the-whole-partition-array
bottleneck for simple lookup queries that I have long wanted to
propose doing something about.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2021-03-30 Thread Julien Rouhaud
On Thu, Mar 25, 2021 at 10:36:38AM +0800, Julien Rouhaud wrote:
> On Wed, Mar 24, 2021 at 01:02:00PM -0300, Alvaro Herrera wrote:
> > On 2021-Mar-24, Julien Rouhaud wrote:
> > 
> > > From e08c9d5fc86ba722844d97000798de868890aba3 Mon Sep 17 00:00:00 2001
> > > From: Bruce Momjian 
> > > Date: Mon, 22 Mar 2021 17:43:23 -0400
> > > Subject: [PATCH v20 2/3] Expose queryid in pg_stat_activity and
> > 
> > >  src/backend/executor/execMain.c   |   9 ++
> > >  src/backend/executor/execParallel.c   |  14 ++-
> > >  src/backend/executor/nodeGather.c |   3 +-
> > >  src/backend/executor/nodeGatherMerge.c|   4 +-
> > 
> > Hmm...
> > 
> > I find it odd that there's executor code that acquires the current query
> > ID from pgstat, after having been put there by planner or ExecutorStart
> > itself.  Seems like a modularity violation.  I wonder if it would make
> > more sense to have the value maybe in struct EState (or perhaps there's
> > a better place -- but I don't think they have a way to reach the
> > QueryDesc anyhow), put there by ExecutorStart, so that places such as
> > execParallel, nodeGather etc don't have to fetch it from pgstat but from
> > EState.
> 
> The current queryid is already available in the Estate, as the underlying
> PlannedStmt contains it.  The problem is that we want to display the top level
> queryid, not the current query one, and the top level queryid is held in
> pgstat.

So is the current approach ok?  If not I'm afraid that detecting and caching
the top level queryid in the executor parts would lead to some code
duplication.




Re: Add missing function abs (interval)

2021-03-30 Thread Isaac Morland
I've attached a patch for this. Turns out there was a comment in the source
explaining that there is no interval_abs because it's not clear what to
return; but I think it's clear that if i is an interval the larger of i and
-i should be considered to be the absolute value, the same as would be done
for any type; essentially, if the type is orderable and has a meaningful
definition of unary minus, the definition of abs follows from those.
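
Just for illustration, the intended semantics could be sketched at the SQL
level like this (the function name is hypothetical, and the attached patch
presumably implements this in C rather than SQL):

-- abs(i) is simply the larger of i and -i
CREATE FUNCTION interval_abs_sketch(i interval) RETURNS interval
  LANGUAGE sql IMMUTABLE
  AS $$ SELECT greatest(i, -i) $$;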

This does have some odd effects, as was observed in the previous discussion
pointed at by John Naylor above (for which thanks!). But those odd effects
are not due to abs(interval) itself but rather due to the odd behaviour of
interval, where values which compare equal to '0'::interval can change a
timestamp when added to it. This in turn comes from what the interval data
type is trying to do combined with the inherently complicated nature of our
timekeeping system.

I have included in the test case some testing of what happens with '1 month
-30 days'::interval, which is "equal" to '0'::interval.

At least one thing concerns me about my code: Given an interval i, I
palloc() space to calculate -i; then either return that or the original
input depending on the result of a comparison. Will I leak space as a
result? Should I free the value if I don't return it?

In addition to adding abs(interval) and related @ operator, I would like to
update interval_smaller and interval_larger to change < and > to <= and >=
respectively. This is to make the min(interval) and max(interval)
aggregates return the first of multiple distinct "equal" intervals,
contrary to the current behaviour:

odyssey=> select max (i) from (values ('1 month -30 days'::interval), ('-1 month 30 days'))t(i);
   max
--
 -1 mons +30 days
(1 row)

odyssey=> select min (i) from (values ('1 month -30 days'::interval), ('-1 month 30 days'))t(i);
   min
--
 -1 mons +30 days
(1 row)

odyssey=>

GREATEST and LEAST already take the first value:

odyssey=> select greatest ('1 month -30 days'::interval, '-1 month 30 days');
greatest

 1 mon -30 days
(1 row)

odyssey=> select least ('1 month -30 days'::interval, '-1 month 30 days');
 least

 1 mon -30 days
(1 row)

odyssey=>


On Mon, 29 Mar 2021 at 21:06, Michael Paquier  wrote:

> On Mon, Mar 29, 2021 at 07:15:19PM -0400, John Naylor wrote:
> > Looking in the archives, I see this attempt that you can build upon:
> >
> https://www.postgresql.org/message-id/flat/CAHE3wggpj%2Bk-zXLUdcBDRe3oahkb21pSMPDm-HzPjZxJn4vMMw%40mail.gmail.com
>
> I see no problem with doing something more here.  If you can get a
> patch, please feel free to add it to the next commit fest, for
> Postgres 15:
> https://commitfest.postgresql.org/33/
> --
> Michael
>


0001-Add-abs-interval-function-and-related-operator.patch
Description: Binary data


Re: multi-install PostgresNode fails with older postgres versions

2021-03-30 Thread Michael Paquier
On Tue, Mar 30, 2021 at 11:06:55PM -0300, Alvaro Herrera wrote:
> On 2021-Mar-31, Michael Paquier wrote:
>> There is already TestLib::check_pg_config().  Shouldn't you leverage
>> that with PG_VERSION_NUM or equivalent?
> 
> hmm, I wonder if we shouldn't take the stance that it is not TestLib's
> business to be calling any Pg binaries.  So that routine should be moved
> to PostgresNode.

Hmm.  I can see the intention, but I am not sure it is completely
correct either to assume that we require a backend node when tests
just exercise frontends standalone, aka with
TestLib::program_*.
--
Michael


signature.asc
Description: PGP signature


Re: What to call an executor node which lazily caches tuples in a hash table?

2021-03-30 Thread David Rowley
On Wed, 31 Mar 2021 at 14:43, Andy Fan  wrote:
> When naming it, we may also think about non-native English speakers, so
> overly advanced words may make them uncomfortable.  Actually, when I read
> "Reactive", I googled to find out what it means. I knew about reactive
> programming, but I
> do not truly understand "reactive hash".

The origin of that idea came from "reactive" being the opposite of
"proactive". If that's not clear then it's likely a bad choice for a
name.

I had thought proactive would mean "do things beforehand", i.e., not on
demand.  Basically, fill the hash table only with records that we need
to put in it rather than all records that we might need, the latter
being what Hash Join does and the former being what the new node does.

David




Re: making update/delete of inheritance trees scale better

2021-03-30 Thread Tom Lane
I noticed something else interesting.  If you try an actually-useful
UPDATE, ie one that has to do some computation in the target list,
you can get a plan like this if it's a partitioned table:

EXPLAIN (verbose, costs off) UPDATE parent SET f2 = f2 + 1;
QUERY PLAN 
---
 Update on public.parent
   Update on public.child1 parent_1
   Update on public.child2 parent_2
   Update on public.child3 parent_3
   ->  Append
 ->  Seq Scan on public.child1 parent_1
   Output: (parent_1.f2 + 1), parent_1.tableoid, parent_1.ctid
 ->  Seq Scan on public.child2 parent_2
   Output: (parent_2.f2 + 1), parent_2.tableoid, parent_2.ctid
 ->  Seq Scan on public.child3 parent_3
   Output: (parent_3.f2 + 1), parent_3.tableoid, parent_3.ctid

But when using traditional inheritance, it looks more like:

EXPLAIN (verbose, costs off) UPDATE parent SET f2 = f2 + 1;
QUERY PLAN 
---
 Update on public.parent
   Update on public.parent parent_1
   Update on public.child1 parent_2
   Update on public.child2 parent_3
   Update on public.child3 parent_4
   ->  Result
 Output: (parent.f2 + 1), parent.tableoid, parent.ctid
 ->  Append
   ->  Seq Scan on public.parent parent_1
 Output: parent_1.f2, parent_1.tableoid, parent_1.ctid
   ->  Seq Scan on public.child1 parent_2
 Output: parent_2.f2, parent_2.tableoid, parent_2.ctid
   ->  Seq Scan on public.child2 parent_3
 Output: parent_3.f2, parent_3.tableoid, parent_3.ctid
   ->  Seq Scan on public.child3 parent_4
 Output: parent_4.f2, parent_4.tableoid, parent_4.ctid

That is, instead of shoving the "f2 + 1" computation down to the table
scans, it gets done in a separate Result node, implying yet another
extra node in the plan with resultant slowdown.  The reason for this
seems to be that apply_scanjoin_target_to_paths has special logic
to push the target down to members of a partitioned table, but it
doesn't do that for other sorts of appendrels.  That isn't new
with this patch, you can see the same behavior in SELECT.
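
(For anyone wanting to reproduce this, a minimal sketch of the two setups
follows; the column list is assumed from the plan output above, and the two
variants would need separate databases or schemas since they reuse the same
table names:)

-- partitioned variant (assumed column list)
CREATE TABLE parent (f1 int, f2 int) PARTITION BY RANGE (f1);
CREATE TABLE child1 PARTITION OF parent FOR VALUES FROM (0) TO (10);
CREATE TABLE child2 PARTITION OF parent FOR VALUES FROM (10) TO (20);
CREATE TABLE child3 PARTITION OF parent FOR VALUES FROM (20) TO (30);

-- traditional-inheritance variant (assumed column list)
CREATE TABLE parent (f1 int, f2 int);
CREATE TABLE child1 () INHERITS (parent);
CREATE TABLE child2 () INHERITS (parent);
CREATE TABLE child3 () INHERITS (parent);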

Given the distinct whiff of second-class citizenship that traditional
inheritance has today, I'm not sure how excited people will be about
fixing this.  I've complained before that apply_scanjoin_target_to_paths
is brute-force and needs to be rewritten, but I don't really want to
undertake that task right now.

regards, tom lane




Re: making update/delete of inheritance trees scale better

2021-03-30 Thread Amit Langote
On Wed, Mar 31, 2021 at 7:13 AM Tom Lane  wrote:
> I wrote:
> > However, I then tried a partitioned equivalent of the 6-column case
> > (script also attached), and it looks like
> > 6 columns 16551   19097   15637   18201
> > which is really noticeably worse, 16% or so.
>
> ... and on the third hand, that might just be some weird compiler-
> and platform-specific artifact.
>
> Using the exact same compiler (RHEL8's gcc 8.3.1) on a different
> x86_64 machine, I measure the same case as about 7% slowdown not
> 16%.  That's still not great, but it calls the original measurement
> into question, for sure.
>
> Using Apple's clang 12.0.0 on an M1 mini, the patch actually clocks
> in a couple percent *faster* than HEAD, for both the partitioned and
> unpartitioned 6-column test cases.
>
> So I'm not sure what to make of these results, but my level of concern
> is less than it was earlier today.  I might've just gotten trapped by
> the usual bugaboo of micro-benchmarking, ie putting too much stock in
> only one test case.

FWIW, I ran the scripts you shared and see the following (median of 3
runs) times in ms for the UPDATE in each script:

heikki6.sql

master: 19139 (jit=off) 18404 (jit=on)
patched: 20202 (jit=off) 19290 (jit=on)

hekki10.sql

master: 21686 (jit=off) 21435 (jit=on)
patched: 20953 (jit=off) 20161 (jit=on)

Patch shows a win for 10 columns here.

part6.sql

master: 20321 (jit=off) 19580 (jit=on)
patched: 22661 (jit=off) 21636 (jit=on)

I wrote part10.sql and ran that too, with these results:

master: 22280 (jit=off) 21876 (jit=on)
patched: 23466 (jit=off) 22905 (jit=on)

The partitioned case slowdown is roughly 10% with 6 columns, 5% with
10.  I would agree that that's not too bad for a worst-case test case,
nor something we couldn't optimize.  I have yet to look closely at
where the problem lies, though.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: "box" type description

2021-03-30 Thread Kyotaro Horiguchi
At Mon, 29 Mar 2021 22:44:29 +0200, Christoph Berg  wrote in 
> I believe the "box" type description is slightly incorrect:
> 
> # \dT box
>  Liste der Datentypen
>Schema   │ Name │   Beschreibung
> ┼──┼──
>  pg_catalog │ box  │ geometric box '(lower left,upper right)'
> 
> While the syntax '((3,4),(1,2))'::box works, the canonical spelling is
> '(3,4),(1,2)' and hence the description should be:
> geometric box '(lower left),(upper right)'

Maybe the reason you think so is that a box is printed in that format.

postgres=# select '((1,1),(2,2))'::box;
 box 
-
 (2,2),(1,1)
(1 row)

It doesn't use the word "canonical", but the documentation says
that it is the output format.  So I think you're right on that point.


https://www.postgresql.org/docs/13/datatype-geometric.html

Table 8.20. Geometric Types

point     16 bytes      Point on a plane                    (x,y)
line      32 bytes      Infinite line                       {A,B,C}
lseg      32 bytes      Finite line segment                 ((x1,y1),(x2,y2))
box       32 bytes      Rectangular box                     ((x1,y1),(x2,y2))
path      16+16n bytes  Closed path (similar to polygon)    ((x1,y1),...)
path      16+16n bytes  Open path                           [(x1,y1),...]
polygon   40+16n bytes  Polygon (similar to closed path)    ((x1,y1),...)
circle    24 bytes      Circle                              <(x,y),r> (center point and radius)

Similarly, lseg seems inconsistent... (It is correctly described in
later sections.)

select '(1,1),(2,2)'::lseg; => [(1,1),(2,2)]

Surely it would be better for the documentation to be consistent with
the output function. Perhaps we prefer to fix the documentation rather
than the implementation, to avoid impacting applications that may
exist.  (I don't like the notation, since the representation of box
doesn't look like one object, though...)


Returning to the description of pg_type, it should be changed like
this, following the discussion here.

- pg_catalog | box  | geometric box '(lower left,upper right)'
+ pg_catalog | box  | geometric box 'lower left,upper right'

But I find it hard to read. I instead changed it as follows in the
attached. However, it might be better not to change it at all.

+ pg_catalog | box  | geometric box 'pt-lower-left,pt-upper-right'

I added a space after the commas, since point has one and (I think) it
is easier to read with them.

Are there any opinions?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 7c341c8e3f..e118a689c8 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -3264,13 +3264,13 @@ SELECT person.name, holidays.num_weeks FROM person, holidays
 lseg
 32 bytes
 Finite line segment
-((x1,y1),(x2,y2))
+[(x1,y1),(x2,y2)]


 box
 32 bytes
 Rectangular box
-((x1,y1),(x2,y2))
+(x1,y1),(x2,y2)


 path
diff --git a/src/include/catalog/pg_type.dat b/src/include/catalog/pg_type.dat
index 8c145c00be..42adc184d7 100644
--- a/src/include/catalog/pg_type.dat
+++ b/src/include/catalog/pg_type.dat
@@ -185,24 +185,24 @@
   typinput => 'point_in', typoutput => 'point_out', typreceive => 'point_recv',
   typsend => 'point_send', typalign => 'd' },
 { oid => '601', array_type_oid => '1018',
-  descr => 'geometric line segment \'(pt1,pt2)\'',
+  descr => 'geometric line segment \'[pt1, pt2]\'',
   typname => 'lseg', typlen => '32', typbyval => 'f', typcategory => 'G',
   typsubscript => 'raw_array_subscript_handler', typelem => 'point',
   typinput => 'lseg_in', typoutput => 'lseg_out', typreceive => 'lseg_recv',
   typsend => 'lseg_send', typalign => 'd' },
 { oid => '602', array_type_oid => '1019',
-  descr => 'geometric path \'(pt1,...)\'',
+  descr => 'geometric path \'(pt1, ...)\'',
   typname => 'path', typlen => '-1', typbyval => 'f', typcategory => 'G',
   typinput => 'path_in', typoutput => 'path_out', typreceive => 'path_recv',
   typsend => 'path_send', typalign => 'd', typstorage => 'x' },
 { oid => '603', array_type_oid => '1020',
-  descr => 'geometric box \'(lower left,upper right)\'',
+  descr => 'geometric box \'pt-lower-left, pt-upper-right\'',
   typname => 'box', typlen => '32', typbyval => 'f', typcategory => 'G',
   typdelim => ';', typsubscript => 'raw_array_subscript_handler',
   typelem => 'point', typinput => 'box_in', typoutput => 'box_out',
   typreceive => 'box_recv', typsend => 'box_send', typalign => 'd' },
 { oid => '604', array_type_oid => '1027',
-  descr => 'geometric polygon \'(pt1,...)\'',
+  descr => 'geometric polygon \'(pt1, ...)\'',
   typname => 'polygon', typlen => '-1', typbyval => 'f', typcategory => 'G',
   typinput => 'poly_in', typoutput => 'poly_out', typreceive => 'poly_recv',
   typsend => 

Re: [PATCH v3 1/1] Fix detection of preadv/pwritev support for OSX.

2021-03-30 Thread James Hilliard
On Tue, Mar 30, 2021 at 7:51 PM Tom Lane  wrote:
>
> Thomas Munro  writes:
> > Personally I'm more concerned about making it easy for new
> > contributors to get a working dev system going on a super common
> > platform without dealing with hard-to-diagnose errors than about people
> > who actually want a different target as a deliberate choice.  Do I
> > understand correctly that there is a period of time each year when major
> > upgrades come out of sync and lots of people end up running a
> > toolchain and OS with this problem for a while due to the default
> > target not matching?   If so I wonder if other projects are running
> > into this with AC_REPLACE_FUNCS and what they're doing.
>
> Yeah, we've seen this happen at least a couple of times, though
> it was only during this past cycle that we (I anyway) entirely
> understood what was happening.
>
> The patches we committed in January (4823621db, 9d23c15a0, 50bebc1ae)
> to improve our PG_SYSROOT selection heuristics should theoretically
> improve the situation ... though I admit I won't have a lot of
> confidence in them till we've been through a couple more rounds of
> asynchronous-XCode-and-macOS releases.  Still, I feel that we
> ought to leave that code alone until we see how it does.

I mean, we know that it will still break under a number of common
circumstances, so I think we should be fixing the root cause (improper
detection of target deployment API availability in the probes) in some
way, as this will probably continue to be an issue otherwise. We already
know that improving PG_SYSROOT selection cannot fix the root issue
but rather works around it in a way that is pretty much guaranteed
to be brittle.

Is there any approach to fixing the root cause of this issue that you think
would be acceptable?

>
> regards, tom lane




RE: libpq debug log

2021-03-30 Thread tsunakawa.ta...@fujitsu.com
From: 'alvhe...@alvh.no-ip.org' 
> > got expected ERROR:  cannot insert multiple commands into a prepared
> statement
> > got expected division-by-zero
> 
> Eh?  this is not at all expected, of course, but clearly not PQtrace's
> fault.  I'll look tomorrow.

Iwata-san and I were starting to investigate the issue.  I guessed the first
message was not expected.  Please let us know if there's something we can help with.

(I was amazed that you finally committed this great feature, libpq pipeline,
while taking care of so many patches.)


Regards
Takayuki Tsunakawa




Re: multi-install PostgresNode fails with older postgres versions

2021-03-30 Thread Alvaro Herrera
On 2021-Mar-31, Michael Paquier wrote:

> There is already TestLib::check_pg_config().  Shouldn't you leverage
> that with PG_VERSION_NUM or equivalent?

hmm, I wonder if we shouldn't take the stance that it is not TestLib's
business to be calling any Pg binaries.  So that routine should be moved
to PostgresNode.



-- 
Álvaro Herrera39°49'30"S 73°17'W




Re: libpq debug log

2021-03-30 Thread 'alvhe...@alvh.no-ip.org'
On 2021-Mar-30, 'alvhe...@alvh.no-ip.org' wrote:

> got expected ERROR:  cannot insert multiple commands into a prepared statement
> got expected division-by-zero

Eh?  this is not at all expected, of course, but clearly not PQtrace's
fault.  I'll look tomorrow.

-- 
Álvaro Herrera39°49'30"S 73°17'W
"El sentido de las cosas no viene de las cosas, sino de
las inteligencias que las aplican a sus problemas diarios
en busca del progreso." (Ernesto Hernández-Novich)




Re: Trouble with initdb trying to run regression tests

2021-03-30 Thread Isaac Morland
On Tue, 30 Mar 2021 at 18:39, Tom Lane  wrote:

> Isaac Morland  writes:
> > I've built Postgres inside a Ubuntu Vagrant VM. When I try to "make
> check",
> > I get a complaint about the permissions on the data directory:
>
[]

> Further up in initdb.log, there was probably some useful information
> about whether it found an existing directory there or not.
>

Sorry for the noise. Turns out that directory creation in /vagrant does not
respect umask:

vagrant@ubuntu-focal:/vagrant$ umask
0027
vagrant@ubuntu-focal:/vagrant$ mkdir test-umask
vagrant@ubuntu-focal:/vagrant$ ls -ld test-umask/
drwxr-xr-x 1 vagrant vagrant 64 Mar 31 01:12 test-umask/
vagrant@ubuntu-focal:/vagrant$

I knew that file ownership changes are not processed in /vagrant, and
I remembered that I had checked whether permission mode changes were
accepted, but I didn't think to check whether umask worked. When I tried
again (git clone, build, make check) in another directory it worked fine.

I was then able to get the tests to run (and pass) in /vagrant by changing
the --temp-instance setting in src/Makefile.global to a location outside of
/vagrant (and it looks like I should be able to edit src/Makefile.global.in
and re-run configure instead). Is there a way to tell configure to override
the setting? I ask mostly because src/Makefile.global.in says users shouldn't
need to edit it. Otherwise my fix will most likely be to maintain a
one-line update to this file in my checkout.


Re: [PATCH v3 1/1] Fix detection of preadv/pwritev support for OSX.

2021-03-30 Thread Tom Lane
Thomas Munro  writes:
> Personally I'm more concerned about making it easy for new
> contributors to get a working dev system going on a super common
> platform without dealing with hard-to-diagnose errors than about people
> who actually want a different target as a deliberate choice.  Do I
> understand correctly that there is a period of time each year when major
> upgrades come out of sync and lots of people end up running a
> toolchain and OS with this problem for a while due to the default
> target not matching?   If so I wonder if other projects are running
> into this with AC_REPLACE_FUNCS and what they're doing.

Yeah, we've seen this happen at least a couple of times, though
it was only during this past cycle that we (I anyway) entirely
understood what was happening.

The patches we committed in January (4823621db, 9d23c15a0, 50bebc1ae)
to improve our PG_SYSROOT selection heuristics should theoretically
improve the situation ... though I admit I won't have a lot of
confidence in them till we've been through a couple more rounds of
asynchronous-XCode-and-macOS releases.  Still, I feel that we
ought to leave that code alone until we see how it does.

regards, tom lane




Re: row filtering for logical replication

2021-03-30 Thread Euler Taveira
On Tue, Mar 30, 2021, at 8:23 AM, Amit Kapila wrote:
> On Mon, Mar 29, 2021 at 6:47 PM Euler Taveira  > wrote:
> >
> Few comments:
> ==
> 1. How can we specify row filters for multiple tables for a
> publication? Consider a case as below:
It is not possible. The row filter is a per-table option. Isn't that clear from
the synopsis? The current design allows a different row filter for each table
in the same publication. It is more flexible than a single row filter for a
set of tables (even if we supported such a variant, there are cases where the
condition would have to be different because the column names are not the same). You
can easily build a CREATE PUBLICATION command that adds the same row filter
multiple times using a DO block, or use a similar approach in your favorite
language.
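
For illustration, a hedged sketch of such a DO block (the table names and
the filter column here are hypothetical, and assume the tables share the
filtered column):

DO $$
DECLARE
    tbl   text;
    parts text[] := '{}';
BEGIN
    FOREACH tbl IN ARRAY ARRAY['t1', 't2'] LOOP
        -- repeat the same row filter for every table
        parts := parts || format('%I WHERE (id > 1000)', tbl);
    END LOOP;
    EXECUTE 'CREATE PUBLICATION pub_same_filter FOR TABLE '
            || array_to_string(parts, ', ');
END
$$;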

> postgres=# CREATE TABLE tab_rowfilter_1 (a int primary key, b text);
> CREATE TABLE
> postgres=# CREATE TABLE tab_rowfilter_2 (c int primary key);
> CREATE TABLE
> 
> postgres=# CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1,
> tab_rowfilter_2 WHERE (a > 1000 AND b <> 'filtered');
> ERROR:  column "a" does not exist
> LINE 1: ...FOR TABLE tab_rowfilter_1, tab_rowfilter_2 WHERE (a > 1000 A...
> 
>  ^
> 
> postgres=# CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1,
> tab_rowfilter_2  WHERE (c > 1000);
> CREATE PUBLICATION
> 
> It gives an error when I tried to specify the columns corresponding to
> the first relation but is fine for columns for the second relation.
> Then, I tried few more combinations like below but that didn't work.
> CREATE PUBLICATION tap_pub_1 FOR TABLE tab_rowfilter_1 As t1,
> tab_rowfilter_2 As t2 WHERE (t1.a > 1000 AND t1.b <> 'filtered');
> 
> Will users be allowed to specify join conditions among columns from
> multiple tables?
It seems you are envisioning the row filter as a publication property instead of
a publication-relation property. Due to the flexibility that the latter approach
provides, I decided to use it because it covers more use cases. Regarding
allowing joins, it could possibly slow down a critical path, no? This code path
is executed for every change. If there is interest in join support, we
might add it in a future patch.

> 2.
> + /*
> + * Although ALTER PUBLICATION grammar allows WHERE clause to be specified
> + * for DROP TABLE action, it doesn't make sense to allow it. We implement
> + * this restriction here, instead of complicating the grammar to enforce
> + * it.
> + */
> + if (stmt->tableAction == DEFELEM_DROP)
> + {
> + ListCell   *lc;
> +
> + foreach(lc, stmt->tables)
> + {
> + PublicationTable *t = lfirst(lc);
> +
> + if (t->whereClause)
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("cannot use a WHERE clause when removing table from
> publication \"%s\"",
> + NameStr(pubform->pubname;
> + }
> + }
> 
> Is there a reason to deal with this here separately rather than in the
> ALTER PUBLICATION grammar?
Good question. IIRC the issue is that AlterPublicationStmt->tables has a list
element that was a relation_expr_list and was converted to
publication_table_list. If we share 'tables' with relation_expr_list (for ALTER
PUBLICATION ... DROP TABLE) and publication_table_list (for the other ALTER
PUBLICATION ... ADD|SET TABLE), the OpenTableList() has to know what list
element it is dealing with. I think I came to the conclusion that it is less
ugly to avoid changing OpenTableList() and CloseTableList().

[Doing some experimentation...]

Here is a patch that removes the referred-to code. It uses 2 distinct list
elements: relation_expr_list for ALTER PUBLICATION ... DROP TABLE and
publication_table_list for ALTER PUBLICATION ... ADD|SET TABLE. A new
parameter was introduced to deal with the different elements of the list
'tables'.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From 99d36d0f5cb5f706c73fcdbb05772580f6814fe6 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Mon, 18 Jan 2021 12:07:51 -0300
Subject: [PATCH] Row filter for logical replication

This feature adds a row filter for publication tables. When you define or modify
a publication you can optionally filter out rows that do not satisfy a WHERE
condition. It allows you to partially replicate a database or a set of tables.
The row filter is per table, which means that you can define different row
filters for different tables. A new row filter can be added simply by
specifying the WHERE clause after the table name. The WHERE expression must be
enclosed in parentheses.

The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, DELETEs
won't be replicated. DELETE uses the old row version (which is limited to the
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter 

Re: What to call an executor node which lazily caches tuples in a hash table?

2021-03-30 Thread Andy Fan
On Wed, Mar 31, 2021 at 7:45 AM Zhihong Yu  wrote:

> Hi,
> I was reading this part of the description:
>
> the Result Cache's
> hash table is much smaller than the hash join's due to result cache only
> caching useful values rather than all tuples from the inner side of the
> join.
>
> I think the word 'Result' should be part of the cache name considering the
> above.
>
> Cheers
>
> On Tue, Mar 30, 2021 at 4:30 PM David Rowley  wrote:
>
>> Hackers,
>>
>> Over on [1] I've been working on adding a new type of executor node
>> which caches tuples in a hash table belonging to a given cache key.
>>
>> The current sole use of this node type is to go between a
>> parameterized nested loop and the inner node in order to cache
>> previously seen sets of parameters so that we can skip scanning the
>> inner scan for parameter values that we've already cached.  The node
>> could also be used to cache results from correlated subqueries,
>> although that's not done yet.
>>
>> The cache limits itself to not use more than hash_mem by evicting the
>> least recently used entries whenever more space is needed for new
>> entries.
>>
>> Currently, in the patch, the node is named "Result Cache".  That name
>> was not carefully thought out. I just needed to pick something when
>> writing the code.
>>
>> Here's an EXPLAIN output with the current name:
>>
>> postgres=# explain (costs off) select relkind,c from pg_class c1,
>> lateral (select count(*) c from pg_class c2 where c1.relkind =
>> c2.relkind) c2;
>>  QUERY PLAN
>> 
>>  Nested Loop
>>->  Seq Scan on pg_class c1
>>->  Result Cache
>>  Cache Key: c1.relkind
>>  ->  Aggregate
>>->  Seq Scan on pg_class c2
>>  Filter: (c1.relkind = relkind)
>> (7 rows)
>>
>> I just got off a team call with Andres, Thomas and Melanie. During the
>> call I mentioned that I didn't like the name "Result Cache". Many name
>> suggestions followed:
>>
>> Here's a list of a few that were mentioned:
>>
>> Probe Cache
>> Tuple Cache
>> Keyed Materialize
>> Hash Materialize
>> Result Cache
>> Cache
>> Hash Cache
>> Lazy Hash
>> Reactive Hash
>> Parameterized Hash
>> Parameterized Cache
>> Keyed Inner Cache
>> MRU Cache
>> MRU Hash
>>
>> I was hoping to commit the final patch pretty soon, but thought I'd
>> have another go at seeing if we can get some consensus on a name
>> before doing that. Otherwise, I'd sort of assumed that we'd just reach
>> some consensus after everyone complained about the current name after
>> the feature is committed.
>>
>> My personal preference is "Lazy Hash", but I feel it might be better
>> to use the word "Reactive" instead of "Lazy".
>>
>> There was some previous discussion on the name in [2]. I suggested
>> some other names in [3]. Andy voted for "Tuple Cache" in [4]
>>
>> Votes? Other suggestions?
>>
>> (I've included all the people who have shown some previous interest in
>> naming this node.)
>>
>> David
>>
>> [1]
>> https://www.postgresql.org/message-id/flat/CAApHDvrPcQyQdWERGYWx8J%2B2DLUNgXu%2BfOSbQ1UscxrunyXyrQ%40mail.gmail.com
>> [2]
>> https://www.postgresql.org/message-id/CA%2BTgmoZMxLeanqrS00_p3xNsU3g1v3EKjNZ4dM02ShRxxLiDBw%40mail.gmail.com
>> [3]
>> https://www.postgresql.org/message-id/CAApHDvoj_sH1H3JVXgHuwnxf1FQbjRVOqqgxzOgJX13NiA9-cg%40mail.gmail.com
>> [4]
>> https://www.postgresql.org/message-id/CAKU4AWoshM0JoymwBK6PKOFDMKg-OO6qtSVU_Piqb0dynxeL5w%40mail.gmail.com
>>
>>
>>
I want to share some feelings about the other keywords.  "Materialize" is used
for the Materialize executor node, which writes data to disk when memory is not
enough, and it is used in "Materialized View", where all the data is stored on
disk.  This gives me the feeling that "Materialize" usually has something to do
with disk, but our result cache node doesn't.

And I think DBAs check plans more than PostgreSQL developers do.  So something
like MRU might be too internal for them. As for developers, if they want to
know such details, they can just read the source code.

When naming it, we may also think about non-native English speakers, so some
too-advanced words may make them uncomfortable.  Actually, when I read
"Reactive", I googled to find out what it means. I knew about reactive
programming, but I do not truly understand "reactive hash".  And compared with
Hash Join, "Hash" may mislead people into thinking the result may be spilled
to disk as well, so I prefer "Cache" over "Hash".

At last, I still want to vote for "Tuple(s) Cache", which sounds simple and
sufficient.  I was wondering whether we need to put "Lazy" in the node name
since we do build the cache lazily, but then I found we don't call
"Materialize" "Lazy Materialize", so I think we can keep it consistent.

> I was hoping to commit the final patch pretty soon

Very glad to see it,  thanks for the great feature.

-- 
Best Regards
Andy Fan (https://www.aliyun.com/)


Re: Refactor SSL test framework to support multiple TLS libraries

2021-03-30 Thread Michael Paquier
On Tue, Mar 30, 2021 at 12:15:07PM -0300, Alvaro Herrera wrote:
> The only complain I have is that "the given node" is nonsensical in
> PostgresNode.  I suggest to delete the word "given".  Also "This is
> expected to fail with a message that matches the regular expression
> $expected_stderr".

Your suggestions are better, indeed.

> The POD doc for connect_fails uses order: ($connstr, $testname, 
> $expected_stderr)
> but the routine has:
>   +   my ($self, $connstr, $expected_stderr, $testname) = @_;
> 
> these should match.

Fixed.

> (There's quite an inconsistency in the existing test code about
> expected_stderr being a string or a regex; and some regexes are quite
> generic: just qr/SSL error/.  Not this patch responsibility to fix that.)

Jacob has just raised this as an issue for an integration with NLS,
because it may be possible that things fail with "SSL error" but a
different error pattern, causing false positives:
https://www.postgresql.org/message-id/e0f0484a1815b26bb99ef9ddc7a110dfd6425931.ca...@vmware.com

I agree that those matches should be much more picky.  We may need to
be careful across all versions of OpenSSL supported though :/

> As I understand, our perlcriticrc no longer requires 'return' at the end
> of routines (commit 0516f94d18c5), so you can omit that.

Fixed.  Thanks.

With all the comments addressed, with updates to use a single scalar
for all the connection strings and with a proper indentation, I finish
with the attached.  Does that look fine?
--
Michael
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 1e96357d7e..d6e10544bb 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -1511,6 +1511,11 @@ the B parameter is also given.
 If B is set and this parameter is given, the scalar it references
 is set to true if the psql call times out.
 
+=item connstr => B
+
+If set, use this as the connection string for the connection to the
+backend.
+
 =item replication => B
 
 If set, add B to the conninfo string.
@@ -1550,14 +1555,20 @@ sub psql
 	my $replication   = $params{replication};
 	my $timeout   = undef;
 	my $timeout_exception = 'psql timed out';
-	my @psql_params   = (
-		'psql',
-		'-XAtq',
-		'-d',
-		$self->connstr($dbname)
-		  . (defined $replication ? " replication=$replication" : ""),
-		'-f',
-		'-');
+
+	# Build the connection string.
+	my $psql_connstr;
+	if (defined $params{connstr})
+	{
+		$psql_connstr = $params{connstr};
+	}
+	else
+	{
+		$psql_connstr = $self->connstr($dbname);
+	}
+	$psql_connstr .= defined $replication ? " replication=$replication" : "";
+
+	my @psql_params = ('psql', '-XAtq', '-d', $psql_connstr, '-f', '-');
 
 	# If the caller wants an array and hasn't passed stdout/stderr
 	# references, allocate temporary ones to capture them so we
@@ -1849,6 +1860,51 @@ sub interactive_psql
 
 =pod
 
+=item $node->connect_ok($connstr, $test_name)
+
+Attempt a connection with a custom connection string.  This is expected
+to succeed.
+
+=cut
+
+sub connect_ok
+{
+	local $Test::Builder::Level = $Test::Builder::Level + 1;
+	my ($self, $connstr, $test_name) = @_;
+	my ($ret,  $stdout,  $stderr)= $self->psql(
+		'postgres',
+		"SELECT \$\$connected with $connstr\$\$",
+		connstr   => "$connstr",
+		on_error_stop => 0);
+
+	ok($ret == 0, $test_name);
+}
+
+=pod
+
+=item $node->connect_fails($connstr, $expected_stderr, $test_name)
+
+Attempt a connection with a custom connection string.  This is expected
+to fail with a message that matches the regular expression
+$expected_stderr.
+
+=cut
+
+sub connect_fails
+{
+	local $Test::Builder::Level = $Test::Builder::Level + 1;
+	my ($self, $connstr, $expected_stderr, $test_name) = @_;
+	my ($ret, $stdout, $stderr) = $self->psql(
+		'postgres',
+		"SELECT \$\$connected with $connstr\$\$",
+		connstr => "$connstr");
+
+	ok($ret != 0, $test_name);
+	like($stderr, $expected_stderr, "$test_name: matches");
+}
+
+=pod
+
 =item $node->poll_query_until($dbname, $query [, $expected ])
 
 Run B<$query> repeatedly, until it returns the B<$expected> result
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index adaa1b4e9b..b1a63f279c 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -135,103 +135,94 @@ $common_connstr =
   "user=ssltestuser dbname=trustdb sslcert=invalid hostaddr=$SERVERHOSTADDR host=common-name.pg-ssltest.test";
 
 # The server should not accept non-SSL connections.
-test_connect_fails(
-	$common_connstr, "sslmode=disable",
+$node->connect_fails(
+	"$common_connstr sslmode=disable",
 	qr/\Qno pg_hba.conf entry\E/,
 	"server doesn't accept non-SSL connections");
 
 # Try without a root cert. In sslmode=require, this should work. In verify-ca
 # or verify-full mode it should fail.
-test_connect_ok(
-	$common_connstr,
-	"sslrootcert=invalid sslmode=require",
+$node->connect_ok(
+	"$common_connstr sslrootcert=invalid sslmode=require",
 	"connect without server root 

Re: [PATCH v3 1/1] Fix detection of preadv/pwritev support for OSX.

2021-03-30 Thread James Hilliard
On Tue, Mar 30, 2021 at 6:43 PM Thomas Munro  wrote:
>
> On Tue, Mar 30, 2021 at 7:39 PM James Hilliard
>  wrote:
> > On Mon, Mar 29, 2021 at 11:58 PM Tom Lane  wrote:
> > > We haven't claimed in the past to support MACOSX_DEPLOYMENT_TARGET,
> > > and I'm not sure we should start now.  How many people actually care
> > > about that?
> >
> > Seems kinda important for anyone who wants to build postgres
> > compatible with targets older than the host system.
>
> Personally I'm mostly concerned about making it easy for new
> contributors to get a working dev system going on a super common
> platform without dealing with hard-to-diagnose errors, than people who
> actually want a different target as a deliberate choice.

Yeah, that's where I was running into this. Currently we build for the max
deployment target available in the SDK, regardless of whether that is the
deployment target actually set; the compiler by default sets the deployment
target to the build host, but if the SDK supports newer deployment targets
that's where things break down.

> Do I
> understand correctly that there is a period of time each year when major
> upgrades come out of sync and lots of people finish up running a
> toolchain and OS with this problem for a while due to the default
> target not matching?

Well, you can hit this pretty easily if you try to build against a toolchain
that supports targets newer than the host, although I think postgres tries to
use the CLI tools SDK by default, which appears somewhat less prone to this
issue (although I don't think this behavior is guaranteed).

> If so I wonder if other projects are running
> into this with AC_REPLACE_FUNCS and what they're doing.

Well, I did come up with another approach, which uses AC_LANG_PROGRAM
instead of AC_CHECK_DECLS and might be better:
https://lists.gnu.org/archive/html/autoconf-patches/2021-02/msg7.html

However, I didn't submit that version here since it uses a custom probe via
AC_LANG_PROGRAM instead of a standard probe like AC_CHECK_DECLS,
which Tom Lane said would be a maintenance issue. At least with this
AC_CHECK_DECLS method we can avoid using any non-standard probes:
https://postgr.es/m/915981.1611254324%40sss.pgh.pa.us


>
> I suppose an alternative strategy would be to try to detect the
> mismatch and spit out a friendlier warning, if we decide we're not
> going to support such builds.




Re: [PATCH] add concurrent_abort callback for output plugin

2021-03-30 Thread Ajin Cherian
On Tue, Mar 30, 2021 at 10:29 PM Markus Wanner  wrote:

>
> I just noticed as of PG13, concurrent_abort is part of ReorderBufferTXN,
> so it seems the prepare_cb (or stream_prepare_cb) can actually figure a
> concurrent abort happened (and the transaction may be incomplete).
> That's good and indeed makes an additional callback unnecessary.
>
> I recommend giving a hint to that field in the documentation as well.
>
> > diff --git a/doc/src/sgml/logicaldecoding.sgml
> b/doc/src/sgml/logicaldecoding.sgml
> > index 80eb96d..d2f8d39 100644
> > --- a/doc/src/sgml/logicaldecoding.sgml
> > +++ b/doc/src/sgml/logicaldecoding.sgml
> > @@ -545,12 +545,14 @@ CREATE TABLE another_catalog_table(data text) WITH
> (user_catalog_table = true);
> >   executed within that transaction. A transaction that is prepared
> for
> >   a two-phase commit using PREPARE TRANSACTION
> will
> >   also be decoded if the output plugin callbacks needed for decoding
> > - them are provided. It is possible that the current transaction
> which
> > + them are provided. It is possible that the current prepared
> transaction which
> >   is being decoded is aborted concurrently via a ROLLBACK
> PREPARED
> >   command. In that case, the logical decoding of this transaction
> will
> > - be aborted too. We will skip all the changes of such a transaction
> once
> > - the abort is detected and abort the transaction when we read WAL
> for
> > - ROLLBACK PREPARED.
> > + be aborted too. All the changes of such a transaction is skipped
> once
>
> typo: changes [..] *are* skipped, plural.
>

Updated.


>
> > + the abort is detected and the prepare_cb
> callback is invoked.
> > + This could result in a prepared transaction with incomplete
> changes.
>
> ... "in which case the concurrent_abort field of the
> passed ReorderBufferTXN struct is set.", as a proposal?
>
> > + This is done so that eventually when the ROLLBACK
> PREPARED
> > + is decoded, there is a corresponding prepared transaction with a
> matching gid.
> >  
> >
> >  
>
> Everything else sounds good to me.
>

Updated.

regards,
Ajin Cherian
Fujitsu Australia


v3-0001-Make-sure-a-prepare-is-sent-when-decoder-detects-.patch
Description: Binary data


Re: Asynchronous Append on postgres_fdw nodes.

2021-03-30 Thread Kyotaro Horiguchi
At Tue, 30 Mar 2021 20:40:35 +0900, Etsuro Fujita  
wrote in 
> On Mon, Mar 29, 2021 at 6:50 PM Etsuro Fujita  wrote:
> > I think the patch would be committable.
> 
> Here is a new version of the patch.
> 
> * Rebased the patch against HEAD.
> * Tweaked docs/comments a bit further.
> * Added the commit message.  Does that make sense?
> 
> I'm happy with the patch, so I'll commit it if there are no objections.

Thanks for the patch.

May I ask some questions?

+ async_capable
+ 
+  
+   This option controls whether postgres_fdw allows
+   foreign tables to be scanned concurrently for asynchronous execution.
+   It can be specified for a foreign table or a foreign server.

Isn't it strange that an option named "async_capable" *allows* async?

+* We'll prefer to consider this join async-capable if any 
table from
+* either side of the join is considered async-capable.
+*/
+   fpinfo->async_capable = fpinfo_o->async_capable ||
+   fpinfo_i->async_capable;

We need to explain this behavior in the documentation.
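
For context, here is roughly how I understand the option is meant to be used,
going by the docs quoted above (a sketch; the server and table names are made
up):

-- allow asynchronous scans for all foreign tables of this server ...
ALTER SERVER loopback OPTIONS (ADD async_capable 'true');
-- ... or set/override it for an individual foreign table
ALTER FOREIGN TABLE ft1 OPTIONS (ADD async_capable 'false');

With the code above, a join over such tables is still treated as async-capable
when only one of its sides has the option set, which is the behavior that
needs documenting.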

Regarding the wording "async capable": if it literally represents the
capability to run asynchronously, then when any one element of a combined path
doesn't have the capability, the whole path cannot be async-capable.  If it
represents permission for an element to run asynchronously, then the whole
path is prevented from running asynchronously unless all elements are allowed
to do so.  If it represents enforcement of, or a suggestion to, run
asynchronously, then enforcing asynchrony on an element would lead to running
the whole path asynchronously, since all elements of postgres_fdw are by
nature capable of running asynchronously.

It looks somewhat inconsistent to be restrictive in the default value
of "async_capable", but aggressive in merging?

If I'm wrong in the understanding, please feel free to go ahead.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Use consistent terminology for tablesync slots.

2021-03-30 Thread Peter Smith
On Tue, Mar 30, 2021 at 8:14 PM Amit Kapila  wrote:
>
> On Tue, Mar 30, 2021 at 2:21 PM Peter Smith  wrote:
> >
> > Hi,
> >
> > The logical replication tablesync worker creates tablesync slots.
> >
> > Previously some PG docs pages were referring to these as "tablesync
> > slots", but other pages called them as "table synchronization slots".
> >
> > PSA a trivial patch which (for consistency) now calls them all the
> > same -  "tablesync slots"
> >
>
> +1 for the consistency. But I think it better to use "table
> synchronization slots" in the user-facing docs as that makes it easier
> for users to understand.
>

PSA patch version 2 updated to use "table synchronization slots" as suggested.

--
Kind Regards,
Peter Smith.
Fujitsu Australia


v2-0001-Use-consistent-terminology-for-tablesync-slots.patch
Description: Binary data


Re: Refactor SSL test framework to support multiple TLS libraries

2021-03-30 Thread Michael Paquier
On Tue, Mar 30, 2021 at 07:14:55PM -0300, Alvaro Herrera wrote:
> On 2021-Mar-30, Daniel Gustafsson wrote:
>> This double concatenation could be a single concat, or just use scalar value
>> interpolation in the string to make it even more readable.  As it isn't using
>> the same line broken pattern of the others the concat looks a bit weird as a
>> result.
> 
> +1 for using a single scalar.

Agreed.  I posted things this way to make a lookup at the diffs easier
for the eye, but that was not intended for the final patch.
--
Michael


signature.asc
Description: PGP signature


Re: multi-install PostgresNode fails with older postgres versions

2021-03-30 Thread Mark Dilger



> On Mar 30, 2021, at 5:52 PM, Michael Paquier  wrote:
> 
> On Tue, Mar 30, 2021 at 08:44:26PM -0400, Andrew Dunstan wrote:
>> Yeah, it should be validated. All things considered I think just calling
>> 'pg_config --version' is probably the simplest validation, and likely to
>> be sufficient.
>> 
>> I'll try to come up with something tomorrow.
> 
> There is already TestLib::check_pg_config().  Shouldn't you leverage
> that with PG_VERSION_NUM or equivalent?

Only if you change that function.  It doesn't currently do anything special to 
run the *right* pg_config.

The patch I sent takes the view that once the install_path has been sanity 
checked and the *right* pg_config executed, relying on the environment's path 
variables thereafter is safe.  But that means the initial pg_config execution 
is unique in not being able to rely on the path.  There really isn't enough 
motivation for changing TestLib, I don't think, because subsequent calls to 
pg_config don't need to be paranoid, just the first call.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: multi-install PostgresNode fails with older postgres versions

2021-03-30 Thread Michael Paquier
On Tue, Mar 30, 2021 at 08:44:26PM -0400, Andrew Dunstan wrote:
> Yeah, it should be validated. All things considered I think just calling
> 'pg_config --version' is probably the simplest validation, and likely to
> be sufficient.
> 
> I'll try to come up with something tomorrow.

There is already TestLib::check_pg_config().  Shouldn't you leverage
that with PG_VERSION_NUM or equivalent?
--
Michael


signature.asc
Description: PGP signature


Re: multi-install PostgresNode fails with older postgres versions

2021-03-30 Thread Mark Dilger



> On Mar 30, 2021, at 5:44 PM, Andrew Dunstan  wrote:
> 
> I'll try to come up with something tomorrow.

I hope the patch I sent is useful, at least as a starting point.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







RE: libpq debug log

2021-03-30 Thread tsunakawa.ta...@fujitsu.com
From: 'alvhe...@alvh.no-ip.org' 
> Okay, pushed this patch and the new testing for it based on
> libpq_pipeline.  We'll see how the buildfarm likes it.

Thank you very much!  I appreciate you taking your valuable time while I 
imagine you must be pretty busy with taking care of other (possibly more 
important) patches.

TBH, when Tom-san suggested a drastic change, I was afraid we might not be able to 
complete this in PG 14.  But in the end, I'm very happy that the patch has 
become much leaner and cleaner.

And congratulations on your first commit, Iwata-san!  I hope you can find the time 
and energy to try adding a connection parameter to enable tracing, which would 
eliminate the need to modify applications.


> I didn't like the idea of silently skipping the redacted fields, so I
> changed the code to print  or  instead.  I also made the
> redacting occur in the last mile (pqTraceOutputInt32 / String) rather
> that in their callers: it seemed quite odd to advance the cursor in the
> "else" branch.
> 
> I refactored the duplicate code that appeared for Notice and Error.
> In that function, we redact not only the 'L' field (what Iwata-san was
> doing) but also 'F' (file) and 'R' (routine) because those things can
> move around for reasons that are not interesting to testing this code.
> 
> In the libpq_pipeline commit I added 'pipeline_abort' and 'transaction'
> to the cases that generate traces, which adds coverage for
> NoticeResponse and ErrorResponse.

These make sense to me.  Thank you for repeatedly polishing the patch and 
making it much better.



Regards
Takayuki Tsunakawa




Re: DROP INDEX docs - explicit lock naming

2021-03-30 Thread Michael Paquier
On Tue, Mar 30, 2021 at 10:33:46AM -0400, Greg Rychlewski wrote:
> While reading the documentation for DROP INDEX[1], I noticed the lock was
> described colloquially as an "exclusive" lock, which made me pause for a
> second because it's the same name as the EXCLUSIVE table lock.
> 
> The attached patch explicitly states that an ACCESS EXCLUSIVE lock is
> acquired.

Indeed, this could be read as ACCESS SHARE being allowed, but that's
never the case for any of the index code paths, except if CONCURRENTLY
is involved.  It is not the only place in the docs that could use
more clarification.  For instance, reindex.sgml mentions an exclusive
lock twice where it should say an access exclusive lock.  To be
exact, I can spot 27 places under doc/ that could be improved.  Such
changes depend on the surrounding context, of course.
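
For readers of this thread, the practical difference is roughly the following
(a sketch with a made-up index name; this is existing behavior, not something
the patch changes):

DROP INDEX idx_foo;               -- takes ACCESS EXCLUSIVE on the index's table,
                                  -- blocking even SELECTs until it finishes
DROP INDEX CONCURRENTLY idx_foo;  -- avoids the full table lock, but cannot be
                                  -- run inside a transaction block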
--
Michael


signature.asc
Description: PGP signature


Re: Keep notnullattrs in RelOptInfo (Was part of UniqueKey patch series)

2021-03-30 Thread Andy Fan
>
> I assume we want to know if a Var is nullable with a function like.
> is_var_notnullable(Var *var,  Relids relids).  If so, we can define the
> data as below:
>
> struct RelOptInfo {
>
> Bitmapset** notnullattrs;
> ..
> };
>
> After this we can implement the function as:
>
> bool
> is_var_notnullable(Var* var, Relids relids)
> {
>   RelOptInfo *rel = find_rel_by_relids(reldis);
>   return bms_is_member(var->varattno, rel->notnullattrs[var->varno]);
> }
>
> Probably we can make some hackers to reduce the notnullattrs's memory usage
> overhead.
>
>
To be more precise, to make rel->notnullattrs shorter, we can use the
following methods:
1). When the Relids has only a single element, we can always use a 1-element
array rather than rel->varno elements.
2). For multi-element relids, we use max(varno) as the length of
rel->notnullattrs.
3). In some cases the notnullattrs of a baserel is not changed in later
stages, so we can just reuse the same Bitmapset * in later stages.

-- 
Best Regards
Andy Fan (https://www.aliyun.com/)


Re: multi-install PostgresNode fails with older postgres versions

2021-03-30 Thread Andrew Dunstan


On 3/30/21 6:22 PM, Alvaro Herrera wrote:
> On 2021-Mar-30, Mark Dilger wrote:
>
>> Once you have a node running, you can query the version using
>> safe_psql, but that clearly doesn't work soon enough, since we need
>> the information prior to running initdb.
> I was thinking something like examining some file in the install dir --
> say, include/server/pg_config.h, but that seems messier than just
> running pg_config.
>
>> One of the things I noticed while playing with this new toy (thanks,
>> Andrew!) 


(I'm really happy someone is playing with it so soon.)



>> is that if you pass a completely insane install_path, you
>> don't get any errors.  In fact, you get executables and libraries from
>> whatever PATH="/no/such/postgres:$PATH" gets you, probably the
>> executables and libraries of your latest development branch.  By
>> forcing get_new_node to call the pg_config of the path you pass in,
>> you'd fix that problem.  I didn't do that, mind you, but you could.  I
>> just executed pg_config, which means you'll still get the wrong
>> version owing to the PATH confusion.
> Hmm, I think it should complain if you give it a path that doesn't
> actually contain a valid installation.
>


Yeah, it should be validated. All things considered I think just calling
'pg_config --version' is probably the simplest validation, and likely to
be sufficient.


I'll try to come up with something tomorrow.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [PATCH v3 1/1] Fix detection of preadv/pwritev support for OSX.

2021-03-30 Thread Thomas Munro
On Tue, Mar 30, 2021 at 7:39 PM James Hilliard
 wrote:
> On Mon, Mar 29, 2021 at 11:58 PM Tom Lane  wrote:
> > We haven't claimed in the past to support MACOSX_DEPLOYMENT_TARGET,
> > and I'm not sure we should start now.  How many people actually care
> > about that?
>
> Seems kinda important for anyone who wants to build postgres
> compatible with targets older than the host system.

Personally I'm mostly concerned about making it easy for new
contributors to get a working dev system going on a super common
platform without dealing with hard-to-diagnose errors, than people who
actually want a different target as a deliberate choice.  Do I
understand correctly that there is a period of time each year when major
upgrades come out of sync and lots of people finish up running a
toolchain and OS with this problem for a while due to the default
target not matching?   If so I wonder if other projects are running
into this with AC_REPLACE_FUNCS and what they're doing.

I suppose an alternative strategy would be to try to detect the
mismatch and spit out a friendlier warning, if we decide we're not
going to support such builds.




Re: multi-install PostgresNode fails with older postgres versions

2021-03-30 Thread Mark Dilger


> On Mar 30, 2021, at 3:22 PM, Alvaro Herrera  wrote:
> 
>> One of the things I noticed while playing with this new toy (thanks,
>> Andrew!) is that if you pass a completely insane install_path, you
>> don't get any errors.  In fact, you get executables and libraries from
>> whatever PATH="/no/such/postgres:$PATH" gets you, probably the
>> executables and libraries of your latest development branch.  By
>> forcing get_new_node to call the pg_config of the path you pass in,
>> you'd fix that problem.  I didn't do that, mind you, but you could.  I
>> just executed pg_config, which means you'll still get the wrong
>> version owing to the PATH confusion.
> 
> Hmm, I think it should complain if you give it a path that doesn't
> actually contain a valid installation.

I felt the same way, but wondered if Andrew had set path variables without 
sanity checking the install_path argument for some specific reason, and didn't 
want to break something he did intentionally.  If that wasn't intentional, then 
there are two open bugs/infelicities against master:

1) PostgresNode::init() doesn't work for older server versions

2) PostgresNode::get_new_node() doesn't reject invalid paths, resulting in 
confusion about which binaries subsequently get executed

I think this next version of the patch addresses both issues.  The first issue 
was already fixed in the previous patch.  The second issue is also now fixed by 
forcing the usage of the install_path qualified pg_config executable, rather 
than using whatever pg_config happens to be found in the path.

There is an existing issue that if you configure with 
--bindir=$SOMEWHERE_UNEXPECTED, PostgresNode won't work.  It inserts 
${install_path}/bin and ${install_path}/lib into the environment without regard 
for whether "bin" and "lib" are correct.  That's a pre-existing limitation, and 
I'm not complaining, but just commenting that I didn't do anything to fix it.

Keeping the WIP marking on the patch until we hear Andrew's opinion on all this.



v2-0001-Extending-PostgresNode-cross-version-functionalit.patch.WIP
Description: Binary data


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





cursor already in use, UPDATE RETURNING bug?

2021-03-30 Thread Jaime Casanova
Hi,

Just noted an interesting behaviour when using a cursor in a function
in an UPDATE RETURNING (note that INSERT RETURNING has no problem).

I have seen this problem in all versions I tested (9.4 thru master).
Steps to reproduce:

prepare the test
```
create table t1 as select random() * foo i from generate_series(1, 100) foo;
create table t2 as select random() * foo i from generate_series(1, 100) foo;

CREATE OR REPLACE FUNCTION cursor_bug()
 RETURNS integer
 LANGUAGE plpgsql
AS $function$
declare
  c1 cursor (p1 int) for select count(*) from t1 where i = p1;
  n int4;
begin
  open c1 (77);
  fetch c1 into n;
  return n;
end $function$
;
```

-- this ends fine
insert into t2 values(5) returning cursor_bug() as c1;
 c1

  0
(1 row)

-- this fails
update t2 set i = 5 returning cursor_bug() as c1;
ERROR:  cursor "c1" already in use
CONTEXT:  PL/pgSQL function cursor_bug() line 6 at OPEN

-- 
Jaime Casanova
Director de Servicios Profesionales
SYSTEMGUARDS - Consultores de PostgreSQL




Re: Extensions not dumped when --schema is used

2021-03-30 Thread Michael Paquier
On Tue, Mar 30, 2021 at 12:02:45PM +0900, Michael Paquier wrote:
> Okay.  So I have looked at that stuff in details, and after fixing
> all the issues reported upthread in the code, docs and tests I am
> finishing with the attached.  The tests have been moved out of
> src/bin/pg_dump/ to src/test/modules/test_pg_dump/, and include both
> positive and negative tests (used the trick with plpgsql for the
> latter to avoid the dump of the extension test_pg_dump or any data
> related to it).

I have double-checked this stuff this morning, and did not notice any
issues.  So, applied.
--
Michael


signature.asc
Description: PGP signature


Re: Outdated comment for CreateStmt.inhRelations

2021-03-30 Thread Michael Paquier
On Tue, Mar 30, 2021 at 08:30:15PM +0800, Julien Rouhaud wrote:
> I just noticed that the comment for CreateStmt.inhRelations says that it's a
> List of inhRelation, which hasn't been the case for a very long time.

Thanks, applied.
--
Michael


signature.asc
Description: PGP signature


Re: Issue with point_ops and NaN

2021-03-30 Thread Michael Paquier
On Tue, Mar 30, 2021 at 11:39:40PM +0800, Julien Rouhaud wrote:
> On Tue, Mar 30, 2021 at 11:02:32AM -0400, Tom Lane wrote:
>> Agreed --- one could make an argument for either 'false' or NULL
>> result, but surely not 'true'.
> 
> I would think that it should return NULL since it's not inside nor outside the
> polygon, but I'm fine with false.

Yeah, this is trying to make an undefined point fit into a box that
has a definition, so "false" does not make sense to me here either, as
it implies that the point exists.  NULL seems appropriate here.
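
For anyone wanting to reproduce the case being discussed, a query along these
lines exercises it (a sketch; the thread may involve other operators of the
same point_ops family):

SELECT point('NaN'::float8, 'NaN'::float8) <@ '((0,0),(0,3),(3,3),(3,0))'::polygon;
-- The question is what this should return for a NaN point: reportedly it
-- comes back true today, while false or NULL are the defensible answers.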
--
Michael


signature.asc
Description: PGP signature


Re: libpq debug log

2021-03-30 Thread 'alvhe...@alvh.no-ip.org'


So crake failed.  The failure is that it doesn't print the DataRow
messages.  That's quite odd.  We see this in the trace log:

Mar 30 20:05:15 # F 54  Parse"" "SELECT 1.0/g FROM 
generate_series(3, -1, -1) g" 0
Mar 30 20:05:15 # F 12  Bind "" "" 0 0 0
Mar 30 20:05:15 # F 6   Describe P ""
Mar 30 20:05:15 # F 9   Execute  "" 0
Mar 30 20:05:15 # F 4   Sync
Mar 30 20:05:15 # B 4   ParseComplete
Mar 30 20:05:15 # B 4   BindComplete
Mar 30 20:05:15 # B 33  RowDescription   1 "?column?"  0  65535 
-1 0
Mar 30 20:05:15 # B 70  ErrorResponseS "ERROR" V "ERROR" C "22012" 
M "division by zero" F "" L "" R "" \\x00
Mar 30 20:05:15 # B 5   ReadyForQueryI

and the expected is this:

Mar 30 20:05:15 # F 54  Parse"" "SELECT 1.0/g FROM 
generate_series(3, -1, -1) g" 0
Mar 30 20:05:15 # F 12  Bind "" "" 0 0 0
Mar 30 20:05:15 # F 6   Describe P ""
Mar 30 20:05:15 # F 9   Execute  "" 0
Mar 30 20:05:15 # F 4   Sync
Mar 30 20:05:15 # B 4   ParseComplete
Mar 30 20:05:15 # B 4   BindComplete
Mar 30 20:05:15 # B 33  RowDescription   1 "?column?"  0  65535 
-1 0
Mar 30 20:05:15 # B 32  DataRow  1 22 '0.'
Mar 30 20:05:15 # B 32  DataRow  1 22 '0.5000'
Mar 30 20:05:15 # B 32  DataRow  1 22 '1.'
Mar 30 20:05:15 # B 70  ErrorResponseS "ERROR" V "ERROR" C "22012" 
M "division by zero" F "" L "" R "" \\x00
Mar 30 20:05:15 # B 5   ReadyForQueryI


...

I wonder if this is a libpq pipeline problem rather than a PQtrace
problem.  In that test case we're using the libpq singlerow mode, so we
should see three rows before the error is sent, but for some reason
crake is not doing that.

aborted pipeline... NOTICE:  table "pq_pipeline_demo" does not exist, skipping
ok
got expected ERROR:  cannot insert multiple commands into a prepared statement
got expected division-by-zero
ok 5 - libpq_pipeline pipeline_abort
not ok 6 - pipeline_abort trace match


Other hosts seem to get it right:

# Running: libpq_pipeline -t 
/Users/buildfarm/bf-data/HEAD/pgsql.build/src/test/modules/libpq_pipeline/tmp_check/log/pipeline_abort.trace
 pipeline_abort port=49797 
host=/var/folders/md/8mp8j5m5169ccgy11plb4w_8gp/T/iA9YP9_IHa 
dbname='postgres' 1
aborted pipeline... NOTICE:  table "pq_pipeline_demo" does not exist, skipping
ok
got expected ERROR:  cannot insert multiple commands into a prepared statement
got row: 0.
got row: 0.5000
got row: 1.
got expected division-by-zero
ok 5 - libpq_pipeline pipeline_abort
ok 6 - pipeline_abort trace match


-- 
Álvaro Herrera   Valdivia, Chile
"This is what I like so much about PostgreSQL.  Most of the surprises
are of the "oh wow!  That's cool" Not the "oh shit!" kind.  :)"
Scott Marlowe, http://archives.postgresql.org/pgsql-admin/2008-10/msg00152.php




Re: "has_column_privilege()" issue with attnums and non-existent columns

2021-03-30 Thread Joe Conway

On 3/30/21 6:22 PM, Tom Lane wrote:
> Joe Conway  writes:
>> Heh, I missed the forest for the trees it seems.
>> That version undid the changes fixing what Ian was originally complaining about.
>
> Duh, right.  It would be a good idea for there to be a code comment
> explaining this, because it's *far* from obvious.  Say like
>
> * Check for column-level privileges first.  This serves in
> * part as a check on whether the column even exists, so we
> * need to do it before checking table-level privilege.

Will do.

> My gripe about providing API-spec comments for the new aclchk.c
> entry points still stands.  Other than that, I think it's good
> to go.


Yeah, I was planning to put something akin to this in all four spots:
8<---
/*
 * Exported routine for checking a user's access privileges to a table
 *
 * Does the bulk of the work for pg_class_aclcheck(), and allows other
 * callers to avoid the missing relation ERROR when is_missing is non-NULL.
 */
AclResult
pg_class_aclcheck_ext(Oid table_oid, Oid roleid,
  AclMode mode, bool *is_missing)
...
8<---

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: unconstrained memory growth in long running procedure stored procedure after upgrading 11-12

2021-03-30 Thread Tom Lane
Justin Pryzby  writes:
> On Tue, Mar 30, 2021 at 04:17:03PM -0500, Merlin Moncure wrote:
>> We just upgraded from postgres 11 to 12.6 and our server is running
>> out of memory and rebooting about 1-2 times a day.

> I haven't tried your test, but this sounds a lot like the issue I reported 
> with
> JIT, which is enabled by default in v12.

FWIW, I just finished failing to reproduce any problem with that
test case ... but I was using a non-JIT-enabled build.

regards, tom lane




Re: libpq debug log

2021-03-30 Thread 'alvhe...@alvh.no-ip.org'
Okay, pushed this patch and the new testing for it based on
libpq_pipeline.  We'll see how the buildfarm likes it.

I made some further changes to the last version; user-visibly, I split
the trace flags in two, keeping the timestamp suppression separate from
the redacting feature for regression testing.

I didn't like the idea of silently skipping the redacted fields, so I
changed the code to print  or  instead.  I also made the
redacting occur in the last mile (pqTraceOutputInt32 / String) rather
that in their callers: it seemed quite odd to advance the cursor in the
"else" branch.

I refactored the duplicate code that appeared for Notice and Error.
In that function, we redact not only the 'L' field (what Iwata-san was
doing) but also 'F' (file) and 'R' (routine) because those things can
move around for reasons that are not interesting to testing this code.

In the libpq_pipeline commit I added 'pipeline_abort' and 'transaction'
to the cases that generate traces, which adds coverage for
NoticeResponse and ErrorResponse.

-- 
Álvaro Herrera39°49'30"S 73°17'W




Re: unconstrained memory growth in long running procedure stored procedure after upgrading 11-12

2021-03-30 Thread Justin Pryzby
On Tue, Mar 30, 2021 at 04:17:03PM -0500, Merlin Moncure wrote:
> Hello all,
> 
> We just upgraded from postgres 11 to 12.6 and our server is running
> out of memory and rebooting about 1-2 times a day.  Application
> architecture is a single threaded stored procedure, executed with CALL
> that loops and never terminates. With postgres 11 we had no memory
> issues.  Ultimately the crash looks like this:
> 
> terminate called after throwing an instance of 'std::bad_alloc'
>   what():  std::bad_alloc
> 2021-03-29 04:34:31.262 CDT [1413] LOG:  server process (PID 9792) was
> terminated by signal 6: Aborted

I haven't tried your test, but this sounds a lot like the issue I reported with
JIT, which is enabled by default in v12.

https://www.postgresql.org/docs/12/release-12.html
Enable Just-in-Time (JIT) compilation by default, if the server has been built 
with support for it (Andres Freund)
Note that this support is not built by default, but has to be selected 
explicitly while configuring the build.

https://www.postgresql.org/message-id/20201001021609.GC8476%40telsasoft.com
terminate called after throwing an instance of 'std::bad_alloc'

I suggest trying: ALTER SYSTEM SET jit_inline_above_cost=-1; SELECT
pg_reload_conf();
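
If it helps to confirm whether JIT is in play before changing anything,
something like this shows the relevant settings (a sketch; GUC names as in the
v12 docs):

SHOW jit;
SELECT name, setting
  FROM pg_settings
 WHERE name IN ('jit', 'jit_above_cost',
                'jit_inline_above_cost', 'jit_optimize_above_cost');
-- EXPLAIN (ANALYZE) of one of the procedure's queries will also print a
-- "JIT:" section at the bottom whenever JIT was actually used.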

> memory growth immediately.   It's about a gigabyte an hour in my test.
> Sorry for the large-ish attachment.

Your reproducer is probably much better than mine was.

-- 
Justin




Re: What to call an executor node which lazily caches tuples in a hash table?

2021-03-30 Thread Zhihong Yu
Hi,
I was reading this part of the description:

the Result Cache's
hash table is much smaller than the hash join's due to result cache only
caching useful values rather than all tuples from the inner side of the
join.

I think the word 'Result' should be part of the cache name considering the
above.

Cheers

On Tue, Mar 30, 2021 at 4:30 PM David Rowley  wrote:

> Hackers,
>
> Over on [1] I've been working on adding a new type of executor node
> which caches tuples in a hash table belonging to a given cache key.
>
> The current sole use of this node type is to go between a
> parameterized nested loop and the inner node in order to cache
> previously seen sets of parameters so that we can skip scanning the
> inner scan for parameter values that we've already cached.  The node
> could also be used to cache results from correlated subqueries,
> although that's not done yet.
>
> The cache limits itself to not use more than hash_mem by evicting the
> least recently used entries whenever more space is needed for new
> entries.
>
> Currently, in the patch, the node is named "Result Cache".  That name
> was not carefully thought out. I just needed to pick something when
> writing the code.
>
> Here's an EXPLAIN output with the current name:
>
> postgres=# explain (costs off) select relkind,c from pg_class c1,
> lateral (select count(*) c from pg_class c2 where c1.relkind =
> c2.relkind) c2;
>  QUERY PLAN
> 
>  Nested Loop
>->  Seq Scan on pg_class c1
>->  Result Cache
>  Cache Key: c1.relkind
>  ->  Aggregate
>->  Seq Scan on pg_class c2
>  Filter: (c1.relkind = relkind)
> (7 rows)
>
> I just got off a team call with Andres, Thomas and Melanie. During the
> call I mentioned that I didn't like the name "Result Cache". Many name
> suggestions followed:
>
> Here's a list of a few that were mentioned:
>
> Probe Cache
> Tuple Cache
> Keyed Materialize
> Hash Materialize
> Result Cache
> Cache
> Hash Cache
> Lazy Hash
> Reactive Hash
> Parameterized Hash
> Parameterized Cache
> Keyed Inner Cache
> MRU Cache
> MRU Hash
>
> I was hoping to commit the final patch pretty soon, but thought I'd
> have another go at seeing if we can get some consensus on a name
> before doing that. Otherwise, I'd sort of assumed that we'd just reach
> some consensus after everyone complained about the current name after
> the feature is committed.
>
> My personal preference is "Lazy Hash", but I feel it might be better
> to use the word "Reactive" instead of "Lazy".
>
> There was some previous discussion on the name in [2]. I suggested
> some other names in [3]. Andy voted for "Tuple Cache" in [4]
>
> Votes? Other suggestions?
>
> (I've included all the people who have shown some previous interest in
> naming this node.)
>
> David
>
> [1]
> https://www.postgresql.org/message-id/flat/CAApHDvrPcQyQdWERGYWx8J%2B2DLUNgXu%2BfOSbQ1UscxrunyXyrQ%40mail.gmail.com
> [2]
> https://www.postgresql.org/message-id/CA%2BTgmoZMxLeanqrS00_p3xNsU3g1v3EKjNZ4dM02ShRxxLiDBw%40mail.gmail.com
> [3]
> https://www.postgresql.org/message-id/CAApHDvoj_sH1H3JVXgHuwnxf1FQbjRVOqqgxzOgJX13NiA9-cg%40mail.gmail.com
> [4]
> https://www.postgresql.org/message-id/CAKU4AWoshM0JoymwBK6PKOFDMKg-OO6qtSVU_Piqb0dynxeL5w%40mail.gmail.com
>
>
>


What to call an executor node which lazily caches tuples in a hash table?

2021-03-30 Thread David Rowley
Hackers,

Over on [1] I've been working on adding a new type of executor node
which caches tuples in a hash table belonging to a given cache key.

The current sole use of this node type is to go between a
parameterized nested loop and the inner node in order to cache
previously seen sets of parameters so that we can skip scanning the
inner scan for parameter values that we've already cached.  The node
could also be used to cache results from correlated subqueries,
although that's not done yet.

The cache limits itself to not use more than hash_mem by evicting the
least recently used entries whenever more space is needed for new
entries.

Currently, in the patch, the node is named "Result Cache".  That name
was not carefully thought out. I just needed to pick something when
writing the code.

Here's an EXPLAIN output with the current name:

postgres=# explain (costs off) select relkind,c from pg_class c1,
lateral (select count(*) c from pg_class c2 where c1.relkind =
c2.relkind) c2;
 QUERY PLAN

 Nested Loop
   ->  Seq Scan on pg_class c1
   ->  Result Cache
 Cache Key: c1.relkind
 ->  Aggregate
   ->  Seq Scan on pg_class c2
 Filter: (c1.relkind = relkind)
(7 rows)

I just got off a team call with Andres, Thomas and Melanie. During the
call I mentioned that I didn't like the name "Result Cache". Many name
suggestions followed:

Here's a list of a few that were mentioned:

Probe Cache
Tuple Cache
Keyed Materialize
Hash Materialize
Result Cache
Cache
Hash Cache
Lazy Hash
Reactive Hash
Parameterized Hash
Parameterized Cache
Keyed Inner Cache
MRU Cache
MRU Hash

I was hoping to commit the final patch pretty soon, but thought I'd
have another go at seeing if we can get some consensus on a name
before doing that. Otherwise, I'd sort of assumed that we'd just reach
some consensus after everyone complained about the current name after
the feature is committed.

My personal preference is "Lazy Hash", but I feel it might be better
to use the word "Reactive" instead of "Lazy".

There was some previous discussion on the name in [2]. I suggested
some other names in [3]. Andy voted for "Tuple Cache" in [4]

Votes? Other suggestions?

(I've included all the people who have shown some previous interest in
naming this node.)

David

[1] 
https://www.postgresql.org/message-id/flat/CAApHDvrPcQyQdWERGYWx8J%2B2DLUNgXu%2BfOSbQ1UscxrunyXyrQ%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/CA%2BTgmoZMxLeanqrS00_p3xNsU3g1v3EKjNZ4dM02ShRxxLiDBw%40mail.gmail.com
[3] 
https://www.postgresql.org/message-id/CAApHDvoj_sH1H3JVXgHuwnxf1FQbjRVOqqgxzOgJX13NiA9-cg%40mail.gmail.com
[4] 
https://www.postgresql.org/message-id/CAKU4AWoshM0JoymwBK6PKOFDMKg-OO6qtSVU_Piqb0dynxeL5w%40mail.gmail.com




Re: Proposal: Save user's original authenticated identity for logging

2021-03-30 Thread Jacob Champion
On Tue, 2021-03-30 at 17:06 +, Jacob Champion wrote:
> Would it be acceptable to adjust the tests for live rotation using the
> logging collector, rather than a full restart? It would unfortunately
> mean that we have to somehow wait for the rotation to complete, since
> that's asynchronous.

I wasn't able to make live rotation work in a sane way. So, v14 tries
to thread the needle with a riff on your earlier idea:

> If you want to keep this information around
> for debugging, I guess that we could just print the contents of the
> backend logs to regress_log_001_password instead?  This could be done
> with a simple wrapper routine that prints the past contents of the log
> file before truncating them.

Rather than putting Postgres log data into the Perl logs, I rotate the
logs exactly once at the beginning -- so that there's an
old 001_ssltests_primary.log, and a new 001_ssltests_primary_1.log --
and then every time we truncate the logfile, I shuffle the bits from
the new logfile into the old one. So no one has to learn to find the
log entries in a new place, we don't get an explosion of rotated logs,
we don't lose the log data, we don't match incorrect portions of the
logs, and we only pay the restart price once. This is wrapped into a
small Perl module, LogCollector.

WDYT?

--Jacob
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index 9fac0689ee..b5fb15f794 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -10,6 +10,7 @@ use FindBin;
 use lib $FindBin::RealBin;
 
 use SSLServer;
+use LogCollector;
 
 if ($ENV{with_ssl} ne 'openssl')
 {
@@ -454,8 +455,7 @@ test_connect_fails(
 );
 
 
-my $log = $node->rotate_logfile();
-$node->restart;
+my $log = LogCollector->new($node);
 
 # correct client cert using whole DN
 my $dn_connstr = "$common_connstr dbname=certdb_dn";
@@ -466,16 +466,10 @@ test_connect_ok(
"certificate authorization succeeds with DN mapping"
 );
 
-$node->stop('fast');
-my $log_contents = slurp_file($log);
-
-like(
-   $log_contents,
+$log->like(
qr/connection authenticated: 
identity="CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG" method=cert/,
"authenticated DNs are logged");
 
-$node->start;
-
 # same thing but with a regex
 $dn_connstr = "$common_connstr dbname=certdb_dn_re";
 
@@ -485,8 +479,7 @@ test_connect_ok(
"certificate authorization succeeds with DN regex mapping"
 );
 
-$log = $node->rotate_logfile();
-$node->restart;
+$log->rotate;
 
 # same thing but using explicit CN
 $dn_connstr = "$common_connstr dbname=certdb_cn";
@@ -497,17 +490,10 @@ test_connect_ok(
"certificate authorization succeeds with CN mapping"
 );
 
-$node->stop('fast');
-$log_contents = slurp_file($log);
-
-like(
-   $log_contents,
+$log->like(
qr/connection authenticated: 
identity="CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG" method=cert/,
"full authenticated DNs are logged even in CN mapping mode");
 
-$node->start;
-
-
 
 TODO:
 {
@@ -565,8 +551,7 @@ SKIP:
"certificate authorization fails because of file permissions");
 }
 
-$log = $node->rotate_logfile();
-$node->restart;
+$log->rotate;
 
 # client cert belonging to another user
 test_connect_fails(
@@ -576,17 +561,10 @@ test_connect_fails(
"certificate authorization fails with client cert belonging to another 
user"
 );
 
-$node->stop('fast');
-$log_contents = slurp_file($log);
-
-like(
-   $log_contents,
+$log->like(
qr/connection authenticated: identity="CN=ssltestuser" method=cert/,
"cert authentication succeeds even if authorization fails");
 
-$log = $node->rotate_logfile();
-$node->restart;
-
 # revoked client cert
 test_connect_fails(
$common_connstr,
@@ -594,25 +572,16 @@ test_connect_fails(
qr/SSL error/,
"certificate authorization fails with revoked client cert");
 
-$node->stop('fast');
-$log_contents = slurp_file($log);
-
-unlike(
-   $log_contents,
+$log->unlike(
qr/connection authenticated:/,
"revoked certs do not authenticate users");
 
-$node->start;
-
 # Check that connecting with auth-option verify-full in pg_hba:
 # works, iff username matches Common Name
 # fails, iff username doesn't match Common Name.
 $common_connstr =
   "sslrootcert=ssl/root+server_ca.crt sslmode=require dbname=verifydb 
hostaddr=$SERVERHOSTADDR";
 
-$log = $node->rotate_logfile();
-$node->restart;
-
 test_connect_ok(
$common_connstr,
"user=ssltestuser sslcert=ssl/client.crt sslkey=ssl/client_tmp.key",
@@ -634,18 +603,12 @@ test_connect_ok(
"auth_option clientcert=verify-ca succeeds with mismatching username 
and Common Name"
 );
 
-$node->stop('fast');
-$log_contents = slurp_file($log);
-
 # None of the above connections to verifydb should have resulted in
 # authentication.
-unlike(
-   $log_contents,
+$log->unlike(
qr/connection authenticated:/,
"trust auth method does not set 

Re: Idea: Avoid JOINs by using path expressions to follow FKs

2021-03-30 Thread Joel Jacobson
On Tue, Mar 30, 2021, at 22:01, Isaac Morland wrote:
> On Tue, 30 Mar 2021 at 15:33, Joel Jacobson  wrote:
>>> Also, should the join be a left join, which would therefore return a NULL 
>>> when there is no matching record? Or could we have a variation such as ->? 
>>> to give a left join (NULL when no matching record) with -> using an inner 
>>> join (record is not included in result when no matching record).
>> 
>> Interesting idea, but I think we can keep it simple, and still support the 
>> case you mention:
>> 
>> If we only have -> and you want to exclude records where the column is NULL 
>> (i.e. INNER JOIN),
>> I think we should just use the WHERE clause and filter on such condition.
>> 
> 
> Just to be clear, it will always be a left join? Agreed that getting the 
> inner join behaviour can be done in the WHERE clause. I think this is a case 
> where simple is good. As long as the left join case is supported I'm happy.

Hmm, I guess so: technically, if all the foreign key column(s) are declared 
NOT NULL, we know for sure that a matching row exists, so a LEFT JOIN and an 
INNER JOIN would always produce the same result.
I'm not sure whether the query planner could produce different plans though, or 
whether an INNER JOIN could be more efficient. If it matters, then I think we 
should generate an INNER JOIN for the "all column(s) NOT NULL" case.
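
Spelling out the equivalence with a tiny made-up example:

-- customer_id is NOT NULL and references customers(id), so every order has
-- exactly one matching customer and both queries return the same rows:
SELECT o.id, c.name FROM orders o LEFT JOIN customers c ON c.id = o.customer_id;
SELECT o.id, c.name FROM orders o JOIN customers c ON c.id = o.customer_id;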

>  
>> Thanks for the encouraging words. I have exactly the same experience myself 
>> and share your view.
>> 
>> I look forward to continued discussion on this matter.
> 
> I had another idea: maybe the default name of a foreign key constraint to a 
> primary key should simply be the name of the target table? That is, if I say:
> 
> FOREIGN KEY (...) REFERENCES t
> 
> ... then unless the table name t is already in use as a constraint name, it 
> will be used as the constraint name. It would be nice not to have to keep 
> repeating, like this:
> 
> CONSTRAINT t FOREIGN KEY (...) REFERENCES t
> 

I suggested earlier in the thread allowing the default name format to be 
user-definable, since, according to the comment in pg_constraint.c, some users 
might depend on apps that rely on the name being unique within the namespace 
and not just within the table.

Here is the commit that implemented this:

commit 45616f587745e0e82b00e77562d6502aa042
Author: Tom Lane 
Date:   Thu Jun 10 17:56:03 2004 +

Clean up generation of default names for constraints, indexes, and serial
sequences, as per recent discussion.  All these names are now of the
form table_column_type, with digits added if needed to make them unique.
Default constraint names are chosen to be unique across their whole schema,
not just within the parent object, so as to be more SQL-spec-compatible
and make the information schema views more useful.

So if nothing has changed since then, I don't think we should change the 
default name for all users.
But like I said earlier, I think it would be good if users who know what they 
are doing could override the default name format.
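
For reference, what the default naming from that commit yields today can be
checked quickly (a sketch with made-up tables):

CREATE TABLE t (id int PRIMARY KEY);
CREATE TABLE r (t_id int REFERENCES t);
SELECT conname FROM pg_constraint
 WHERE conrelid = 'r'::regclass AND contype = 'f';
-- returns "r_t_id_fkey", i.e. table_column_fkey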

/Joel

Re: Trouble with initdb trying to run regression tests

2021-03-30 Thread Tom Lane
Isaac Morland  writes:
> I've built Postgres inside a Ubuntu Vagrant VM. When I try to "make check",
> I get a complaint about the permissions on the data directory:

> vagrant@ubuntu-focal:/vagrant$ tail /vagrant/src/test/regress/log/initdb.log
> creating subdirectories ... ok
> selecting dynamic shared memory implementation ... posix
> selecting default max_connections ... 20
> selecting default shared_buffers ... 400kB
> selecting default time zone ... Etc/UTC
> creating configuration files ... ok
> running bootstrap script ... 2021-03-30 21:38:32.746 UTC [23154] FATAL:
>  data directory "/vagrant/src/test/regress/./tmp_check/data" has invalid
> permissions
> 2021-03-30 21:38:32.746 UTC [23154] DETAIL:  Permissions should be u=rwx
> (0700) or u=rwx,g=rx (0750).
> child process exited with exit code 1

Further up in initdb.log, there was probably some useful information
about whether it found an existing directory there or not.

regards, tom lane




Re: "has_column_privilege()" issue with attnums and non-existent columns

2021-03-30 Thread Tom Lane
Joe Conway  writes:
> Heh, I missed the forest for the trees it seems.
> That version undid the changes fixing what Ian was originally complaining 
> about.

Duh, right.  It would be a good idea for there to be a code comment
explaining this, because it's *far* from obvious.  Say like

* Check for column-level privileges first.  This serves in
* part as a check on whether the column even exists, so we
* need to do it before checking table-level privilege.
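
In user-visible terms, that ordering is what keeps a probe with a bogus attnum
from reporting a privilege off the table-level ACL; roughly (table name
invented, and the expected result is NULL rather than an error):

CREATE TABLE priv_demo (a int);
SELECT has_column_privilege('priv_demo', 999::int2, 'SELECT');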

My gripe about providing API-spec comments for the new aclchk.c
entry points still stands.  Other than that, I think it's good
to go.

regards, tom lane




Re: multi-install PostgresNode fails with older postgres versions

2021-03-30 Thread Alvaro Herrera
On 2021-Mar-30, Mark Dilger wrote:

> Once you have a node running, you can query the version using
> safe_psql, but that clearly doesn't work soon enough, since we need
> the information prior to running initdb.

I was thinking something like examining some file in the install dir --
say, include/server/pg_config.h, but that seems messier than just
running pg_config.

> One of the things I noticed while playing with this new toy (thanks,
> Andrew!) is that if you pass a completely insane install_path, you
> don't get any errors.  In fact, you get executables and libraries from
> whatever PATH="/no/such/postgres:$PATH" gets you, probably the
> executables and libraries of your latest development branch.  By
> forcing get_new_node to call the pg_config of the path you pass in,
> you'd fix that problem.  I didn't do that, mind you, but you could.  I
> just executed pg_config, which means you'll still get the wrong
> version owing to the PATH confusion.

Hmm, I think it should complain if you give it a path that doesn't
actually contain a valid installation.

-- 
Álvaro Herrera   Valdivia, Chile
"But static content is just dynamic content that isn't moving!"
http://smylers.hates-software.com/2007/08/15/fe244d0c.html




Re: multi-install PostgresNode fails with older postgres versions

2021-03-30 Thread Mark Dilger



> On Mar 30, 2021, at 3:12 PM, Alvaro Herrera  wrote:
> 
> On 2021-Mar-30, Mark Dilger wrote:
> 
>> The problem is clear enough; -N/--nosync was added in 9.3, and
>> PostgresNode::init is passing -N to initdb unconditionally. I wonder
>> if during PostgresNode::new a call should be made to pg_config and the
>> version information grep'd out so that version specific options to
>> various functions (init, backup, etc) could hinge on the version of
>> postgres being used?
> 
> Yeah, I think making it backwards-compatible would be good.  Is running
> pg_config to obtain the version the best way to do it?  I'm not sure --
> what other ways are there?  I can't think of anything.  (Asking the user seems
> right out.)

Once you have a node running, you can query the version using safe_psql, but 
that clearly doesn't work soon enough, since we need the information prior to 
running initdb.

One of the things I noticed while playing with this new toy (thanks, Andrew!) 
is that if you pass a completely insane install_path, you don't get any errors. 
 In fact, you get executables and libraries from whatever 
PATH="/no/such/postgres:$PATH" gets you, probably the executables and libraries 
of your latest development branch.  By forcing get_new_node to call the 
pg_config of the path you pass in, you'd fix that problem.  I didn't do that, 
mind you, but you could.  I just executed pg_config, which means you'll still 
get the wrong version owing to the PATH confusion.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Refactor SSL test framework to support multiple TLS libraries

2021-03-30 Thread Alvaro Herrera
On 2021-Mar-30, Daniel Gustafsson wrote:

> +$node->connect_ok($common_connstr . " " . "user=ssltestuser",
> 
> This double concatenation could be a single concat, or just use scalar value
> interpolation in the string to make it even more readable.  As it isn't using
> the same line broken pattern of the others the concat looks a bit weird as a
> result.

+1 for using a single scalar.

-- 
Álvaro Herrera   Valdivia, Chile




Re: making update/delete of inheritance trees scale better

2021-03-30 Thread Tom Lane
I wrote:
> However, I then tried a partitioned equivalent of the 6-column case
> (script also attached), and it looks like
> 6 columns 16551   19097   15637   18201
> which is really noticeably worse, 16% or so.

... and on the third hand, that might just be some weird compiler-
and platform-specific artifact.

Using the exact same compiler (RHEL8's gcc 8.3.1) on a different
x86_64 machine, I measure the same case as about 7% slowdown not
16%.  That's still not great, but it calls the original measurement
into question, for sure.

Using Apple's clang 12.0.0 on an M1 mini, the patch actually clocks
in a couple percent *faster* than HEAD, for both the partitioned and
unpartitioned 6-column test cases.

So I'm not sure what to make of these results, but my level of concern
is less than it was earlier today.  I might've just gotten trapped by
the usual bugaboo of micro-benchmarking, ie putting too much stock in
only one test case.

regards, tom lane




Re: multi-install PostgresNode fails with older postgres versions

2021-03-30 Thread Alvaro Herrera
On 2021-Mar-30, Mark Dilger wrote:

> The problem is clear enough; -N/--nosync was added in 9.3, and
> PostgresNode::init is passing -N to initdb unconditionally. I wonder
> if during PostgresNode::new a call should be made to pg_config and the
> version information grep'd out so that version specific options to
> various functions (init, backup, etc) could hinge on the version of
> postgres being used?

Yeah, I think making it backwards-compatible would be good.  Is running
pg_config to obtain the version the best way to do it?  I'm not sure --
what other ways are there?  I can't think of anything.  (Asking the user seems
right out.)

-- 
Álvaro Herrera39°49'30"S 73°17'W
Officer Krupke, what are we to do?
Gee, officer Krupke, Krup you! (West Side Story, "Gee, Officer Krupke")




Re: Refactor SSL test framework to support multiple TLS libraries

2021-03-30 Thread Daniel Gustafsson
> On 30 Mar 2021, at 11:53, Michael Paquier  wrote:
> 
> On Tue, Mar 30, 2021 at 03:50:28PM +0900, Michael Paquier wrote:
>> The test_*() ones are just wrappers for psql able to use a customized
>> connection string.  It seems to me that it would make sense to move
>> those two into PostgresNode::psql itself and extend it to be able to
>> handle custom connection strings?
> 
> Looking at this part, I think that this is a win in terms of future
> changes for SSLServer.pm as it would become a facility only in charge
> of managing the backend's SSL configuration.  This has also the
> advantage to make the error handling with psql more consistent with
> the other tests.
> 
> So, attached is a patch to do this simplification.  The bulk of the
> changes is within the tests themselves to adapt to the merge of
> $common_connstr and $connstr for the new routines of PostgresNode.pm,
> and I have done things this way to ease the patch lookup.  Thoughts?

LGTM with the findings that Alvaro reported.

+$node->connect_ok($common_connstr . " " . "user=ssltestuser",

This double concatenation could be a single concat, or just use scalar value
interpolation in the string to make it even more readable.  As it isn't using
the same line broken pattern of the others the concat looks a bit weird as a
result.

Thanks for picking it up, as I have very limited time for hacking right now.

--
Daniel Gustafsson   https://vmware.com/




Re: SQL/JSON: JSON_TABLE

2021-03-30 Thread Erik Rijkers


> On 2021.03.30. 22:25 Nikita Glukhov  wrote:
> 
>  
> On 30.03.2021 19:56, Erik Rijkers wrote:
> 
> >> On 2021.03.27. 02:12 Nikita Glukhov  wrote:
> >>
> >> Attached 47th version of the patches.
> > Hi,
> >
> > Apply, build all fine.  It also works quite well, and according to 
> > specification, as far as I can tell.
> >
> > But today I ran into:
> >
> > ERROR:  function ExecEvalJson not in llvmjit_types.c
> >
> > I think that it is caused by:
> >
> > set enable_bitmapscan = off;
> >
> > (I installed llvm a few days ago. llvm-3.9-dev on this debian stretch).
> >
> >
> > This is the test sql I concocted, which runs fine with enable_bitmapscan on 
> > (the default):
> >
> > select jt1.*
> > from myjsonfile100k as t(js, id)
> >, json_table(
> >t.js
> > , '$' columns (
> >  "lastname"   textpath  '$. "lastname" '
> >, "firstname"  textpath  '$. "firstname"'
> >, "date"   textpath  '$. "date" '
> >, "city"   textpath  '$. "city" '
> >, "country"textpath  '$. "country"  '
> >, "name 0(1)"  textpath  '$. "array"[0] '
> >, "name 4(5)"  textpath  '$. "array"[4] '
> >, "names"  text[]  path  '$. "array"'
> >, "randfloat"  float   path  '$. "random float" '
> >  )
> > ) as jt1
> > where  js @> ('[ { "city": "Santiago de Cuba" } ]')
> > and js[0]->>'firstname' = 'Gilda'
> > ;
> > ERROR:  function ExecEvalJson not in llvmjit_types.c
> >
> > That statement only errors out if the table is large enough. I have no time 
> > now to make a sample table but if no-one understands the problem off-hand, 
> > I'll try to construct such a table later this week (the one I'm using is 
> > large, 1.5 GB).
> 
> Thank you for testing.
> 
> 
> I think you can try to add 3 missing functions references to the end of
> src/backend/jit/llvm/llvmjit_types.c:
> 
>   void       *referenced_functions[] =
>   {
>   ...
>   ExecEvalXmlExpr,
> +    ExecEvalJsonConstructor,
> +    ExecEvalIsJsonPredicate,
> +    ExecEvalJson,
>   MakeExpandedObjectReadOnlyInternal,
>   ...
>   };
> 
> 
> If this fixes problem, I will add this to the new version of the patches.

It does almost fix it, but there is a typo in the above:
+  ExecEvalIsJsonPredicate should be changed to:
+  ExecEvalJsonIsPredicate.

With that change the problem vanishes.

Thanks!

Erik Rijkers








> 
> -- 
> Nikita Glukhov
> Postgres Professional: http://www.postgrespro.co
> The Russian Postgres Company




Trouble with initdb trying to run regression tests

2021-03-30 Thread Isaac Morland
I've built Postgres inside a Ubuntu Vagrant VM. When I try to "make check",
I get a complaint about the permissions on the data directory:

[]
pg_regress: initdb failed
Examine /vagrant/src/test/regress/log/initdb.log for the reason.
Command was: "initdb" -D "/vagrant/src/test/regress/./tmp_check/data"
--no-clean --no-sync > "/vagrant/src/test/regress/log/initdb.log" 2>&1
make[1]: *** [GNUmakefile:125: check] Error 2
make[1]: Leaving directory '/vagrant/src/test/regress'
make: *** [GNUmakefile:69: check] Error 2
vagrant@ubuntu-focal:/vagrant$ tail /vagrant/src/test/regress/log/initdb.log
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 20
selecting default shared_buffers ... 400kB
selecting default time zone ... Etc/UTC
creating configuration files ... ok
running bootstrap script ... 2021-03-30 21:38:32.746 UTC [23154] FATAL:
 data directory "/vagrant/src/test/regress/./tmp_check/data" has invalid
permissions
2021-03-30 21:38:32.746 UTC [23154] DETAIL:  Permissions should be u=rwx
(0700) or u=rwx,g=rx (0750).
child process exited with exit code 1
initdb: data directory "/vagrant/src/test/regress/./tmp_check/data" not
removed at user's request
vagrant@ubuntu-focal:/vagrant$

Has anybody had this problem? The directory in question is created by the
make check activities so I would have thought that it would set the
permissions; and if not, then everybody trying to run regression tests
would bump into this.


Re: multi-install PostgresNode fails with older postgres versions

2021-03-30 Thread Mark Dilger


> On Mar 30, 2021, at 10:39 AM, Mark Dilger  
> wrote:
> 
> Andrew,
> 
> While developing some cross version tests, I noticed that PostgresNode::init 
> fails for postgres versions older than 9.3, like so:
> 
> # Checking port 52814
> # Found port 52814
> Name: 9.2.24
> Data directory: 
> /Users/mark.dilger/hydra/postgresnode.review/src/test/modules/test_cross_version/tmp_check/t_001_verify_paths_9.2.24_data/pgdata
> Backup directory: 
> /Users/mark.dilger/hydra/postgresnode.review/src/test/modules/test_cross_version/tmp_check/t_001_verify_paths_9.2.24_data/backup
> Archive directory: 
> /Users/mark.dilger/hydra/postgresnode.review/src/test/modules/test_cross_version/tmp_check/t_001_verify_paths_9.2.24_data/archives
> Connection string: port=52814 
> host=/var/folders/6n/3f4vwbnn7fz5qk0xqhgbjrkwgp/T/L_A2w1x7qb
> Log file: 
> /Users/mark.dilger/hydra/postgresnode.review/src/test/modules/test_cross_version/tmp_check/log/001_verify_paths_9.2.24.log
> # Running: initdb -D 
> /Users/mark.dilger/hydra/postgresnode.review/src/test/modules/test_cross_version/tmp_check/t_001_verify_paths_9.2.24_data/pgdata
>  -A trust -N
> initdb: invalid option -- N
> Try "initdb --help" for more information.
> Bail out!  system initdb failed
> 
> The problem is clear enough; -N/--nosync was added in 9.3, and 
> PostgresNode::init is passing -N to initdb unconditionally. I wonder if 
> during PostgresNode::new a call should be made to pg_config and the version 
> information grep'd out so that version specific options to various functions 
> (init, backup, etc) could hinge on the version of postgres being used?
> 
> You could also just remove the -N option, but that would slow down all tests 
> for everybody, so I'm not keen to do that.  Or you could remove -N in cases 
> where $node->{_install_path} is defined, which would be far more acceptable.  
> I'm leaning towards using the output of pg_config, though, since this problem 
> is likely to come up again with other options/commands.
> 
> Thoughts?

I fixed this largely as outlined above, introducing a few new functions which 
ease test development and using one of them to condition the behavior of init() 
on the postgres version.

In the tests I have been developing (not included), the developer (or some 
buildfarm script) has to list all postgresql installations in a configuration 
file, like so:

/Users/mark.dilger/install/8.4
/Users/mark.dilger/install/9.0.23
/Users/mark.dilger/install/9.1.24
/Users/mark.dilger/install/9.2.24
/Users/mark.dilger/install/9.3.25
/Users/mark.dilger/install/9.4.26
/Users/mark.dilger/install/9.5.25
/Users/mark.dilger/install/9.6
/Users/mark.dilger/install/10
/Users/mark.dilger/install/11
/Users/mark.dilger/install/12
/Users/mark.dilger/install/13

The tests can't be hardcoded to know anything about which specific postgres 
versions will be installed, or what version of postgres exists in any 
particular install directory.  It makes the tests easier to maintain if they 
can do stuff like:

   $node{$_} = PostgresNode->get_new_node(...) for (@installed_versions);

   if ($node{a}->newer_than_version($node{b}))
   {
       # check that newer version A's whatever can connect to and work with older server B
   }

I therefore included functions of that sort in the patch along with the 
$node->at_least_version(version) function that the fix uses.



v1-0001-Extending-PostgresNode-cross-version-functionalit.patch.WIP
Description: Binary data


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





unconstrained memory growth in long running procedure stored procedure after upgrading 11-12

2021-03-30 Thread Merlin Moncure
Hello all,

We just upgraded from postgres 11 to 12.6 and our server is running
out of memory and rebooting about 1-2 times a day. Application
architecture is a single threaded stored procedure, executed with CALL
that loops and never terminates. With postgres 11 we had no memory
issues.  Ultimately the crash looks like this:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
2021-03-29 04:34:31.262 CDT [1413] LOG:  server process (PID 9792) was
terminated by signal 6: Aborted
2021-03-29 04:34:31.262 CDT [1413] DETAIL:  Failed process was
running: CALL Main()
2021-03-29 04:34:31.262 CDT [1413] LOG:  terminating any other active
server processes
2021-03-29 04:34:31.264 CDT [9741] WARNING:  terminating connection
because of crash of another server process
2021-03-29 04:34:31.264 CDT [9741] DETAIL:  The postmaster has
commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.
2021-03-29 04:34:31.264 CDT [9741] HINT:  In a moment you should be
able to reconnect to the database and repeat your command.
2021-03-29 04:34:31.267 CDT [1413] LOG:  archiver process (PID 9742)
exited with exit code 1
2021-03-29 04:34:31.267 CDT [1413] LOG:  all server processes
terminated; reinitializing

Attached is a self-contained test case which reproduces the problem.

Instructions:
1. run the attached script in psql, pgtask_test.sql. It will create a
database, initialize it, and run the main procedure. dblink must be
available
2. in another window, run SELECT CreateTaskChain('test', 'DEV');

In the console that ran main(), you should see output that the
procedure began to do work. Once it does, a 'top' should show resident
memory growth immediately.   It's about a gigabyte an hour in my test.
Sorry for the large-ish attachment.

merlin


pgtask_test.sql
Description: Binary data


pgtask.sql
Description: Binary data


Re: Remove page-read callback from XLogReaderState.

2021-03-30 Thread Thomas Munro
On Thu, Mar 4, 2021 at 3:29 PM Kyotaro Horiguchi
 wrote:
> A recent commit about LSN_FORMAT_ARGS conflicted with this.
> Just rebased.

FYI I've been looking at this, and I think it's a very nice
improvement.  I'll post some review comments and a rebase shortly.




Re: SQL/JSON: JSON_TABLE

2021-03-30 Thread Nikita Glukhov

On 30.03.2021 19:56, Erik Rijkers wrote:


On 2021.03.27. 02:12 Nikita Glukhov  wrote:

Attached 47th version of the patches.

Hi,

Apply, build all fine.  It also works quite well, and according to 
specification, as far as I can tell.

But today I ran into:

ERROR:  function ExecEvalJson not in llvmjit_types.c

I think that it is caused by:

set enable_bitmapscan = off;

(I installed llvm a few days ago. llvm-3.9-dev on this debian stretch).


This is the test sql I concocted, which runs fine with enable_bitmapscan on 
(the default):

select jt1.*
from myjsonfile100k as t(js, id)
   , json_table(
   t.js
, '$' columns (
 "lastname"   textpath  '$. "lastname" '
   , "firstname"  textpath  '$. "firstname"'
   , "date"   textpath  '$. "date" '
   , "city"   textpath  '$. "city" '
   , "country"textpath  '$. "country"  '
   , "name 0(1)"  textpath  '$. "array"[0] '
   , "name 4(5)"  textpath  '$. "array"[4] '
   , "names"  text[]  path  '$. "array"'
   , "randfloat"  float   path  '$. "random float" '
 )
) as jt1
where  js @> ('[ { "city": "Santiago de Cuba" } ]')
and js[0]->>'firstname' = 'Gilda'
;
ERROR:  function ExecEvalJson not in llvmjit_types.c

That statement only errors out if the table is large enough. I have no time now 
to make a sample table but if no-one understands the problem off-hand, I'll try 
to construct such a table later this week (the one I'm using is large, 1.5 GB).


Thank you for testing.


I think you can try to add 3 missing functions references to the end of
src/backend/jit/llvm/llvmjit_types.c:

 void       *referenced_functions[] =
 {
 ...
 ExecEvalXmlExpr,
+    ExecEvalJsonConstructor,
+    ExecEvalIsJsonPredicate,
+    ExecEvalJson,
 MakeExpandedObjectReadOnlyInternal,
 ...
 };


If this fixes problem, I will add this to the new version of the patches.


--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.co
The Russian Postgres Company



Re: "has_column_privilege()" issue with attnums and non-existent columns

2021-03-30 Thread Joe Conway

On 3/30/21 3:37 PM, Joe Conway wrote:

On 3/21/21 12:27 PM, Tom Lane wrote:

I think we may have to adjust the acl.c APIs, or maybe better provide new
entry points, so that we can have variants of pg_xxx_aclcheck that won't
throw a hard error upon not finding the row.  We cheesily tried to avoid
adjusting those APIs to support the semantics we need here, and we now see
that it didn't really work.


Ok, I took a shot at that; see attached.


Heh, I missed the forest for the trees it seems.

That version undid the changes fixing what Ian was originally complaining about.

New version attached that preserves the fixed behavior.


Questions:
1. I confined the changes to just pg_class_aclcheck/mask
 and pg_attribute_aclcheck/mask -- did you intend
 that we do this same change across the board? Or
 perhaps do the rest of them once we open up pg15
 development?

2. This seems more invasive than something we would want
 to back patch -- agreed?


The questions remain...

Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index add3d14..b5a6b3a 100644
*** a/src/backend/catalog/aclchk.c
--- b/src/backend/catalog/aclchk.c
*** AclMode
*** 3706,3711 
--- 3706,3719 
  pg_attribute_aclmask(Oid table_oid, AttrNumber attnum, Oid roleid,
  	 AclMode mask, AclMaskHow how)
  {
+ 	return pg_attribute_aclmask_ext(table_oid, attnum, roleid,
+ 	mask, how, NULL);
+ }
+ 
+ AclMode
+ pg_attribute_aclmask_ext(Oid table_oid, AttrNumber attnum, Oid roleid,
+ 		 AclMode mask, AclMaskHow how, bool *is_missing)
+ {
  	AclMode		result;
  	HeapTuple	classTuple;
  	HeapTuple	attTuple;
*** pg_attribute_aclmask(Oid table_oid, Attr
*** 3723,3740 
  			   ObjectIdGetDatum(table_oid),
  			   Int16GetDatum(attnum));
  	if (!HeapTupleIsValid(attTuple))
! 		ereport(ERROR,
! (errcode(ERRCODE_UNDEFINED_COLUMN),
!  errmsg("attribute %d of relation with OID %u does not exist",
! 		attnum, table_oid)));
  	attributeForm = (Form_pg_attribute) GETSTRUCT(attTuple);
  
! 	/* Throw error on dropped columns, too */
  	if (attributeForm->attisdropped)
! 		ereport(ERROR,
! (errcode(ERRCODE_UNDEFINED_COLUMN),
!  errmsg("attribute %d of relation with OID %u does not exist",
! 		attnum, table_oid)));
  
  	aclDatum = SysCacheGetAttr(ATTNUM, attTuple, Anum_pg_attribute_attacl,
  			   );
--- 3731,3768 
  			   ObjectIdGetDatum(table_oid),
  			   Int16GetDatum(attnum));
  	if (!HeapTupleIsValid(attTuple))
! 	{
! 		if (is_missing != NULL)
! 		{
! 			/* return "no privileges" instead of throwing an error */
! 			*is_missing = true;
! 			return 0;
! 		}
! 		else
! 			ereport(ERROR,
! 	(errcode(ERRCODE_UNDEFINED_COLUMN),
! 	 errmsg("attribute %d of relation with OID %u does not exist",
! 			attnum, table_oid)));
! 	}
! 
  	attributeForm = (Form_pg_attribute) GETSTRUCT(attTuple);
  
! 	/* Check dropped columns, too */
  	if (attributeForm->attisdropped)
! 	{
! 		if (is_missing != NULL)
! 		{
! 			/* return "no privileges" instead of throwing an error */
! 			*is_missing = true;
! 			ReleaseSysCache(attTuple);
! 			return 0;
! 		}
! 		else
! 			ereport(ERROR,
! 	(errcode(ERRCODE_UNDEFINED_COLUMN),
! 	 errmsg("attribute %d of relation with OID %u does not exist",
! 			attnum, table_oid)));
! 	}
  
  	aclDatum = SysCacheGetAttr(ATTNUM, attTuple, Anum_pg_attribute_attacl,
  			   );
*** AclMode
*** 3791,3796 
--- 3819,3831 
  pg_class_aclmask(Oid table_oid, Oid roleid,
   AclMode mask, AclMaskHow how)
  {
+ 	return pg_class_aclmask_ext(table_oid, roleid, mask, how, NULL);
+ }
+ 
+ AclMode
+ pg_class_aclmask_ext(Oid table_oid, Oid roleid, AclMode mask,
+ 	 AclMaskHow how, bool *is_missing)
+ {
  	AclMode		result;
  	HeapTuple	tuple;
  	Form_pg_class classForm;
*** pg_class_aclmask(Oid table_oid, Oid role
*** 3804,3813 
  	 */
  	tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(table_oid));
  	if (!HeapTupleIsValid(tuple))
! 		ereport(ERROR,
! (errcode(ERRCODE_UNDEFINED_TABLE),
!  errmsg("relation with OID %u does not exist",
! 		table_oid)));
  	classForm = (Form_pg_class) GETSTRUCT(tuple);
  
  	/*
--- 3839,3858 
  	 */
  	tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(table_oid));
  	if (!HeapTupleIsValid(tuple))
! 	{
! 		if (is_missing != NULL)
! 		{
! 			/* return "no privileges" instead of throwing an error */
! 			*is_missing = true;
! 			return 0;
! 		}
! 		else
! 			ereport(ERROR,
! 	(errcode(ERRCODE_UNDEFINED_TABLE),
! 	 errmsg("relation with OID %u does not exist",
! 			table_oid)));
! 	}
! 
  	classForm = (Form_pg_class) GETSTRUCT(tuple);
  
  	/*
*** AclResult
*** 4468,4474 
  pg_attribute_aclcheck(Oid table_oid, AttrNumber attnum,
  	  Oid roleid, AclMode mode)
  {
! 	

Re: "has_column_privilege()" issue with attnums and non-existent columns

2021-03-30 Thread Tom Lane
Joe Conway  writes:
> On 3/21/21 12:27 PM, Tom Lane wrote:
>> I think we may have to adjust the acl.c APIs, or maybe better provide new
>> entry points, so that we can have variants of pg_xxx_aclcheck that won't
>> throw a hard error upon not finding the row.  We cheesily tried to avoid
>> adjusting those APIs to support the semantics we need here, and we now see
>> that it didn't really work.

> Ok, I took a shot at that; see attached.

Looks generally reasonable by eyeball.  The lack of any documentation
comment for the new functions makes me itch, although the ones that
are there are hardly better.

> 1. I confined the changes to just pg_class_aclcheck/mask
> and pg_attribute_aclcheck/mask -- did you intend
> that we do this same change across the board? Or
> perhaps do the rest of them once we open up pg15
> development?

In principle, it might be nice to fix all of those functions in acl.c
to be implemented similarly --- you could get rid of the initial
SearchSysCacheExists calls in the ones that are trying not to throw
error for is-missing cases.  In practice, as long as there's no
reachable bug for the other cases, there are probably better things
to spend time on.

> 2. This seems more invasive than something we would want
> to back patch -- agreed?

You could make an argument either way, but given the limited number
of complaints about this, I'd lean to no back-patch.

regards, tom lane




Re: Idea: Avoid JOINs by using path expressions to follow FKs

2021-03-30 Thread Isaac Morland
On Tue, 30 Mar 2021 at 15:33, Joel Jacobson  wrote:

> Also, should the join be a left join, which would therefore return a NULL
> when there is no matching record? Or could we have a variation such as ->?
> to give a left join (NULL when no matching record) with -> using an inner
> join (record is not included in result when no matching record).
>
>
> Interesting idea, but I think we can keep it simple, and still support the
> case you mention:
>
> If we only have -> and you want to exclude records where the column is
> NULL (i.e. INNER JOIN),
> I think we should just use the WHERE clause and filter on such condition.
>

Just to be clear, it will always be a left join? Agreed that getting the
inner join behaviour can be done in the WHERE clause. I think this is a
case where simple is good. As long as the left join case is supported I'm
happy.


> Thanks for the encouraging words. I have exactly the same experience
> myself and share your view.
>
> I look forward to continued discussion on this matter.
>

I had another idea: maybe the default name of a foreign key constraint to a
primary key should simply be the name of the target table? That is, if I
say:

FOREIGN KEY (...) REFERENCES t

... then unless the table name t is already in use as a constraint name, it
will be used as the constraint name. It would be nice not to have to keep
repeating, like this:

CONSTRAINT t FOREIGN KEY (...) REFERENCES t


Re: pg_amcheck contrib application

2021-03-30 Thread Robert Haas
On Mon, Mar 29, 2021 at 7:16 PM Mark Dilger
 wrote:
> Sure, here are four patches which do the same as the single v12 patch did.

Thanks. Here are some comments on 0003 and 0004:

When you posted v11, you said that "Rather than print out all four
toast pointer fields for each toast failure, va_rawsize, va_extsize,
and va_toastrelid are only mentioned in the corruption message if they
are related to the specific corruption.  Otherwise, just the
va_valueid is mentioned in the corruption message." I like that
principle; in fact, as you know, I suggested it. But, with the v13
patches applied, exactly zero of the callers to
report_toast_corruption() appear to be following it, because none of
them include the value ID. I think you need to revise the messages,
e.g. "toasted value for attribute %u missing from toast table" ->
"toast value %u not found in toast table"; "final toast chunk number
%u differs from expected value %u" -> "toast value %u was expected to
end at chunk %u, but ended at chunk %u"; "toast chunk sequence number
is null" -> "toast value %u has toast chunk with null sequence
number". In the first of those example cases, I think you need not
mention the attribute number because it's already there in its own
column.

On a related note, it doesn't look like you are actually checking
va_toastrelid here. Doing so seems like it would be a good idea. It
also seems like it would be good to check that the compressed size is
less than or equal to the uncompressed size.

I do not like the name tuple_is_volatile, because volatile has a
couple of meanings already, and this isn't one of them. A
SQL-callable function is volatile if it might return different outputs
given the same inputs, even within the same SQL statement. A C
variable is volatile if it might be magically modified in ways not
known to the compiler. I had suggested tuple_cannot_die_now, which is
closer to the mark. If you want to be even more precise, you could
talk about whether the tuple is potentially prunable (e.g.
tuple_can_be_pruned, which inverts the sense). That's really what
we're worried about: whether MVCC rules would permit the tuple to be
pruned after we release the buffer lock and before we check TOAST.

I would ideally prefer not to rename report_corruption(). The old name
is clearer, and changing it produces a bunch of churn that I'd rather
avoid. Perhaps the common helper function could be called
report_corruption_internal(), and the callers could be
report_corruption() and report_toast_corruption().

Regarding 0001 and 0002, I think the logic in 0002 looks a lot closer
to correct now, but I want to go through it in more detail. I think,
though, that you've made some of my comments worse. For example, I
wrote: "It should be impossible for xvac to still be running, since
we've removed all that code, but even if it were, it ought to be safe
to read the tuple, since the original inserter must have committed.
But, if the xvac transaction committed, this tuple (and its associated
TOAST tuples) could be pruned at any time." You changed that to read
"We don't bother comparing against safe_xmin because the VACUUM FULL
must have committed prior to an upgrade and can't still be running."
Your comment is shorter, which is a point in its favor, but what I was
trying to emphasize is that the logic would be correct EVEN IF we
again started to use HEAP_MOVED_OFF and HEAP_MOVED_IN again. Your
version makes it sound like the code would need to be revised in that
case. If that's true, then my comment was wrong, but I didn't think it
was true, or I wouldn't have written the comment in that way.

Also, and maybe this is a job for a separate patch, but then again
maybe not, I wonder if it's really a good idea for get_xid_status to
return both a XidBoundsViolation and an XidCommitStatus. It seems to
me (without checking all that carefully) that it might be better to
just flatten all of that into a single enum, because right now it
seems like you often end up with two consecutive switch statements
where, perhaps, just one would suffice.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pgbench - add pseudo-random permutation function

2021-03-30 Thread Dean Rasheed
On Tue, 30 Mar 2021 at 20:31, Dean Rasheed  wrote:
>
> Yeah, that's probably a fair point. However, all the existing pgbench
> random functions are using it, so I think it's fair enough for
> permute() to do the same (and actually 2^48 is pretty huge). Switching
> to a 64-bit PRNG might not be a bad idea, but I think that's something
> we'd want to do across the board, and so I think it should be out of
> scope for this patch.
>

Of course the immediate counter-argument to changing the existing
random functions would be that doing so would break lots of people's
tests, and no one would thank us for that. Still, I think that, since
the existing random functions use a 48-bit PRNG, it's not unreasonable
for permute() to do the same.

Regards,
Dean




Re: "has_column_privilege()" issue with attnums and non-existent columns

2021-03-30 Thread Joe Conway

On 3/21/21 12:27 PM, Tom Lane wrote:

I think we may have to adjust the acl.c APIs, or maybe better provide new
entry points, so that we can have variants of pg_xxx_aclcheck that won't
throw a hard error upon not finding the row.  We cheesily tried to avoid
adjusting those APIs to support the semantics we need here, and we now see
that it didn't really work.


Ok, I took a shot at that; see attached.

Questions:

1. I confined the changes to just pg_class_aclcheck/mask
   and pg_attribute_aclcheck/mask -- did you intend
   that we do this same change across the board? Or
   perhaps do the rest of them once we open up pg15
   development?

2. This seems more invasive than something we would want
   to back patch -- agreed?

Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index add3d14..b5a6b3a 100644
*** a/src/backend/catalog/aclchk.c
--- b/src/backend/catalog/aclchk.c
*** AclMode
*** 3706,3711 
--- 3706,3719 
  pg_attribute_aclmask(Oid table_oid, AttrNumber attnum, Oid roleid,
  	 AclMode mask, AclMaskHow how)
  {
+ 	return pg_attribute_aclmask_ext(table_oid, attnum, roleid,
+ 	mask, how, NULL);
+ }
+ 
+ AclMode
+ pg_attribute_aclmask_ext(Oid table_oid, AttrNumber attnum, Oid roleid,
+ 		 AclMode mask, AclMaskHow how, bool *is_missing)
+ {
  	AclMode		result;
  	HeapTuple	classTuple;
  	HeapTuple	attTuple;
*** pg_attribute_aclmask(Oid table_oid, Attr
*** 3723,3740 
  			   ObjectIdGetDatum(table_oid),
  			   Int16GetDatum(attnum));
  	if (!HeapTupleIsValid(attTuple))
! 		ereport(ERROR,
! (errcode(ERRCODE_UNDEFINED_COLUMN),
!  errmsg("attribute %d of relation with OID %u does not exist",
! 		attnum, table_oid)));
  	attributeForm = (Form_pg_attribute) GETSTRUCT(attTuple);
  
! 	/* Throw error on dropped columns, too */
  	if (attributeForm->attisdropped)
! 		ereport(ERROR,
! (errcode(ERRCODE_UNDEFINED_COLUMN),
!  errmsg("attribute %d of relation with OID %u does not exist",
! 		attnum, table_oid)));
  
  	aclDatum = SysCacheGetAttr(ATTNUM, attTuple, Anum_pg_attribute_attacl,
  			   );
--- 3731,3768 
  			   ObjectIdGetDatum(table_oid),
  			   Int16GetDatum(attnum));
  	if (!HeapTupleIsValid(attTuple))
! 	{
! 		if (is_missing != NULL)
! 		{
! 			/* return "no privileges" instead of throwing an error */
! 			*is_missing = true;
! 			return 0;
! 		}
! 		else
! 			ereport(ERROR,
! 	(errcode(ERRCODE_UNDEFINED_COLUMN),
! 	 errmsg("attribute %d of relation with OID %u does not exist",
! 			attnum, table_oid)));
! 	}
! 
  	attributeForm = (Form_pg_attribute) GETSTRUCT(attTuple);
  
! 	/* Check dropped columns, too */
  	if (attributeForm->attisdropped)
! 	{
! 		if (is_missing != NULL)
! 		{
! 			/* return "no privileges" instead of throwing an error */
! 			*is_missing = true;
! 			ReleaseSysCache(attTuple);
! 			return 0;
! 		}
! 		else
! 			ereport(ERROR,
! 	(errcode(ERRCODE_UNDEFINED_COLUMN),
! 	 errmsg("attribute %d of relation with OID %u does not exist",
! 			attnum, table_oid)));
! 	}
  
  	aclDatum = SysCacheGetAttr(ATTNUM, attTuple, Anum_pg_attribute_attacl,
  			   );
*** AclMode
*** 3791,3796 
--- 3819,3831 
  pg_class_aclmask(Oid table_oid, Oid roleid,
   AclMode mask, AclMaskHow how)
  {
+ 	return pg_class_aclmask_ext(table_oid, roleid, mask, how, NULL);
+ }
+ 
+ AclMode
+ pg_class_aclmask_ext(Oid table_oid, Oid roleid, AclMode mask,
+ 	 AclMaskHow how, bool *is_missing)
+ {
  	AclMode		result;
  	HeapTuple	tuple;
  	Form_pg_class classForm;
*** pg_class_aclmask(Oid table_oid, Oid role
*** 3804,3813 
  	 */
  	tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(table_oid));
  	if (!HeapTupleIsValid(tuple))
! 		ereport(ERROR,
! (errcode(ERRCODE_UNDEFINED_TABLE),
!  errmsg("relation with OID %u does not exist",
! 		table_oid)));
  	classForm = (Form_pg_class) GETSTRUCT(tuple);
  
  	/*
--- 3839,3858 
  	 */
  	tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(table_oid));
  	if (!HeapTupleIsValid(tuple))
! 	{
! 		if (is_missing != NULL)
! 		{
! 			/* return "no privileges" instead of throwing an error */
! 			*is_missing = true;
! 			return 0;
! 		}
! 		else
! 			ereport(ERROR,
! 	(errcode(ERRCODE_UNDEFINED_TABLE),
! 	 errmsg("relation with OID %u does not exist",
! 			table_oid)));
! 	}
! 
  	classForm = (Form_pg_class) GETSTRUCT(tuple);
  
  	/*
*** AclResult
*** 4468,4474 
  pg_attribute_aclcheck(Oid table_oid, AttrNumber attnum,
  	  Oid roleid, AclMode mode)
  {
! 	if (pg_attribute_aclmask(table_oid, attnum, roleid, mode, ACLMASK_ANY) != 0)
  		return ACLCHECK_OK;
  	else
  		return ACLCHECK_NO_PRIV;
--- 4513,4527 
  pg_attribute_aclcheck(Oid table_oid, AttrNumber attnum,
  	  Oid roleid, AclMode mode)
  {
! 	return 

Re: Get memory contexts of an arbitrary backend process

2021-03-30 Thread Fujii Masao



On 2021/03/30 22:06, torikoshia wrote:

Modified the patch according to the suggestions.


Thanks for updating the patch!

I applied the cosmetic changes to the patch and added the example of
the function call into the document. Attached is the updated version
of the patch. Could you check this version?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index fbf6062d0a..aa9080bddc 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -24917,6 +24917,23 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_log_backend_memory_contexts
+
+pg_log_backend_memory_contexts ( 
pid integer )
+boolean
+   
+   
+Logs the memory contexts whose backend process has the specified
+process ID.
+Memory contexts will be logged based on the log configuration set.
+See  for more information.
+Only superusers can log the memory contexts.
+   
+  
+
   

 
@@ -24987,6 +25004,35 @@ SELECT collation for ('foo' COLLATE "de_DE");
 pg_stat_activity view.

 
+   
+pg_log_backend_memory_contexts can be used
+to log the memory contexts of the backend process. For example,
+
+postgres=# SELECT pg_log_backend_memory_contexts(pg_backend_pid());
+ pg_log_backend_memory_contexts 
+
+ t
+(1 row)
+
+The memory contexts will be logged in the log file. For example:
+LOG:  logging memory contexts of PID 10377
+STATEMENT:  SELECT pg_log_backend_memory_contexts(pg_backend_pid());
+LOG:  level: 0; TopMemoryContext: 80800 total in 6 blocks; 14432 free (5 
chunks); 66368 used
+LOG:  level: 1; pgstat TabStatusArray lookup hash table: 8192 total in 1 
blocks; 1408 free (0 chunks); 6784 used
+LOG:  level: 1; TopTransactionContext: 8192 total in 1 blocks; 7720 free (1 
chunks); 472 used
+LOG:  level: 1; RowDescriptionContext: 8192 total in 1 blocks; 6880 free (0 
chunks); 1312 used
+LOG:  level: 1; MessageContext: 16384 total in 2 blocks; 5152 free (0 chunks); 
11232 used
+LOG:  level: 1; Operator class cache: 8192 total in 1 blocks; 512 free (0 
chunks); 7680 used
+LOG:  level: 1; smgr relation table: 16384 total in 2 blocks; 4544 free (3 
chunks); 11840 used
+LOG:  level: 1; TransactionAbortContext: 32768 total in 1 blocks; 32504 free 
(0 chunks); 264 used
+...
+LOG:  level: 1; ErrorContext: 8192 total in 1 blocks; 7928 free (3 chunks); 
264 used
+LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 
1029560 used
+
+For more than 100 child contexts under the same parent one,
+100 child contexts and a summary of the remaining ones will be logged.
+   
+
   
 
   
diff --git a/src/backend/storage/ipc/procsignal.c 
b/src/backend/storage/ipc/procsignal.c
index c6a8d4611e..eac6895141 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -30,6 +30,7 @@
 #include "storage/shmem.h"
 #include "storage/sinval.h"
 #include "tcop/tcopprot.h"
+#include "utils/memutils.h"
 
 /*
  * The SIGUSR1 signal is multiplexed to support signaling multiple event
@@ -657,6 +658,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_BARRIER))
HandleProcSignalBarrierInterrupt();
 
+   if (CheckProcSignal(PROCSIG_LOG_MEMORY_CONTEXT))
+   HandleLogMemoryContextInterrupt();
+
if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_DATABASE))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_DATABASE);
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 2b1b68109f..afaf3b1cce 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3295,6 +3295,9 @@ ProcessInterrupts(void)
 
if (ParallelMessagePending)
HandleParallelMessages();
+
+   if (LogMemoryContextPending)
+   ProcessLogMemoryContextInterrupt();
 }
 
 
diff --git a/src/backend/utils/adt/mcxtfuncs.c 
b/src/backend/utils/adt/mcxtfuncs.c
index c02fa47550..fe9b7979e2 100644
--- a/src/backend/utils/adt/mcxtfuncs.c
+++ b/src/backend/utils/adt/mcxtfuncs.c
@@ -18,6 +18,8 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "mb/pg_wchar.h"
+#include "storage/proc.h"
+#include "storage/procarray.h"
 #include "utils/builtins.h"
 
 /* --
@@ -61,7 +63,7 @@ PutMemoryContextsStatsTupleStore(Tuplestorestate *tupstore,
 
/* Examine the context itself */
memset(, 0, sizeof(stat));
-   (*context->methods->stats) (context, NULL, (void *) , );
+   (*context->methods->stats) (context, NULL, (void *) , , 
true);
 
memset(values, 0, sizeof(values));
memset(nulls, 0, sizeof(nulls));
@@ -155,3 +157,59 @@ pg_get_backend_memory_contexts(PG_FUNCTION_ARGS)
 
return (Datum) 0;
 }
+
+/*
+ * pg_log_backend_memory_contexts
+ 

Re: Idea: Avoid JOINs by using path expressions to follow FKs

2021-03-30 Thread Joel Jacobson
On Tue, Mar 30, 2021, at 21:02, Isaac Morland wrote:
> On Tue, 30 Mar 2021 at 14:30, Joel Jacobson  wrote:
> 
>> __
>> If the expression ends with a column_name,
>> you get the value for the column.
>> 
>> If the expression ends with a constraint_name,
>> you get the referenced table as a record.
> 
> Can’t you just leave off the “ends with a column_name” part? If you want one 
> of its columns, just put .column_name:
> 
> table -> constraint -> ... -> constraint . column_name
> 
> Then you know that -> expects a constraint_name and only that to its right.

+1

Of course! Much simpler. Thanks.

> 
> Also, should the join be a left join, which would therefore return a NULL 
> when there is no matching record? Or could we have a variation such as ->? to 
> give a left join (NULL when no matching record) with -> using an inner join 
> (record is not included in result when no matching record).

Interesting idea, but I think we can keep it simple, and still support the case 
you mention:

If we only have -> and you want to exclude records where the column is NULL 
(i.e. INNER JOIN),
I think we should just use the WHERE clause and filter on such condition.

> 
> For the record I would find something like this quite useful. I constantly 
> find myself joining in code lookup tables and the like, and while from a 
> mathematical view it’s just another join, explicitly listing the table in the 
> FROM clause of a large query does not assist with readability to say the 
> least.

Thanks for the encouraging words. I have exactly the same experience myself and 
share your view.

I look forward to continued discussion on this matter.

/Joel

Re: pgbench - add pseudo-random permutation function

2021-03-30 Thread Dean Rasheed
On Tue, 30 Mar 2021 at 19:26, Fabien COELHO  wrote:
>
> First, I have a thing against erand48.

Yeah, that's probably a fair point. However, all the existing pgbench
random functions are using it, so I think it's fair enough for
permute() to do the same (and actually 2^48 is pretty huge). Switching
to a 64-bit PRNG might not be a bad idea, but I think that's something
we'd want to do across the board, and so I think it should be out of
scope for this patch.

> Second, I have a significant reservation about the very structure of the
> transformation in this version:
>
>loop 4 times :
>
>  // FIRST HALF STEER
>  m/r = pseudo randoms
>  if v in first "half"
> v = ((v * m) ^ r) & mask;
> rotate1(v)
>
>  // FULL SHIFT 1
>  r = pseudo random
>  v = (v + r) % size
>
>  // SECOND HALF STEER
>  m/r = pseudo randoms
>  if v in second "half"
> same as previous on second half
>
>  // FULL SHIFT 2
>  r = pseudo random
>  v = (v + r) % size
>
> I'm really at odds with FULL SHIFT 1, because it means that up to 1/256 of
> values are kept out of STEERING. Whole chunks of values could be kept
> unshuffled because they would only have SHIFTS apply to them and each time
> fall in the not steered half. It should be an essential part of the design
> that at least one steer is applied on a value at each round, and if two
> are applied then fine, but certainly not zero. So basically I think that
> the design would be significantly improved by removing "FULL SHIFT 1".

Ah, that's a good point. Something else that also concerned me there
was that it might lead to 2 consecutive full shifts with nothing in
between, which would lead to less uniform randomness (like the
Irwin-Hall distribution).

I just did a quick test without the first full shift, and the results
do appear to be better, so removing that looks like a good idea.

> Third, I think that the rotate code can be simplified, in particular the
> ?: should be avoided because it may induce branches quite damaging to
> processor performance.

Yeah, I wondered about that. Perhaps there's a "trick" that can be
used to simplify it. Pre-computing the number of bits in the mask
would probably help. I'll give it some thought.

Regards,
Dean




Bug? pg_identify_object_as_address() et al doesn't work with pg_enum.oid

2021-03-30 Thread Joel Jacobson
Hi,

Some catalog oid values originate from other catalogs,
such as pg_aggregate.aggfnoid -> pg_proc.oid
or pg_attribute.attrelid -> pg_class.oid.

For such oid values, the foreign catalog is the regclass
which should be passed as the first argument to
all the functions taking (classid oid, objid oid, objsubid integer)
as input, i.e. pg_describe_object(), pg_identify_object() and
pg_identify_object_as_address().

All oid values in all catalogs,
can be used with these functions,
as long as the correct regclass is passed as the first argument,
*except* pg_enum.oid.

(This is not a problem for pg_enum.enumtypid,
its regclass is 'pg_type' and works fine.)

I would have expected the regclass to be 'pg_enum'::regclass,
since there is no foreign key on pg_enum.oid.

In a way, pg_enum is similar to pg_attribute,
   pg_enum.enumtypid -> pg_type.oid
reminds me of
   pg_attribute.attrelid -> pg_class.oid

But pg_enum has its own oid column as primary key,
whereas in pg_attribute we only have a multi-column primary key (attrelid, 
attnum).

Is this a bug? I.e. should we add support to deal with pg_enum.oid?

Or is this by design?
If so, wouldn't it be good to mention this corner-case
somewhere in the documentation for pg_identify_object_as_address() et al?
That is, to explain that these functions work for almost all oid values, except
pg_enum.oid.
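
To make that concrete, a sketch of the calls involved (assuming at least one
enum type exists in the database):

-- Works: pg_attribute-style addressing, with the "foreign" catalog pg_class as classid:
SELECT pg_identify_object('pg_class'::regclass, 'pg_class'::regclass::oid, 1);

-- Works: pg_enum.enumtypid addressed through pg_type:
SELECT pg_identify_object('pg_type'::regclass, enumtypid, 0) FROM pg_enum LIMIT 1;

-- Reported not to work: pg_enum.oid addressed through pg_enum itself:
SELECT pg_identify_object('pg_enum'::regclass, oid, 0) FROM pg_enum LIMIT 1;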

/Joel




Re: SELECT INTO deprecation

2021-03-30 Thread Jan Wieck

On 12/15/20 5:13 PM, Bruce Momjian wrote:

On Wed, Dec  9, 2020 at 09:48:54PM +0100, Peter Eisentraut wrote:

On 2020-12-03 20:26, Peter Eisentraut wrote:
> On 2020-12-03 16:34, Tom Lane wrote:
> > As I recall, a whole lot of the pain we have with INTO has to do
> > with the semantics we've chosen for INTO in a set-operation nest.
> > We think you can write something like
> > 
> >  SELECT ... INTO foo FROM ... UNION SELECT ... FROM ...
> > 
> > but we insist on the INTO being in the first component SELECT.

> > I'd like to know exactly how much of that messiness is shared
> > by SQL Server.
> 
> On sqlfiddle.com, this works:
> 
> select a into t3 from t1 union select a from t2;
> 
> but this gets an error:
> 
> select a from t1 union select a into t4 from t2;
> 
> SELECT INTO must be the first query in a statement containing a UNION,

> INTERSECT or EXCEPT operator.

So, with that in mind, here is an alternative proposal that points out that
SELECT INTO has some use for compatibility.


Do we really want to carry around confusing syntax for compatibility?  I
doubt we would ever add INTO now, even for compatibility.



If memory serves, the INTO syntax is a result of the first incarnation 
of PL/pgSQL being based on the Oracle PL/SQL syntax. I think it has been 
there from the very beginning, which makes it likely that by now a lot 
of migrants are using it in rather old code.


I don't think it should be our business to throw wrenches into their 
existing applications.



Regards, Jan

--
Jan Wieck
Principal Database Engineer
Amazon Web Services




Re: Support tab completion for upper character inputs in psql

2021-03-30 Thread David Zhang

Hi Tang,

Thanks a lot for the patch.

I did a quick test based on the latest patch V3 on latest master branch 
"commit 4753ef37e0eda4ba0af614022d18fcbc5a946cc9".


Case 1: before patch

  1 postgres=# set a
  2 all  allow_system_table_mods 
application_name array_nulls

  3 postgres=# set A
  4
  5 postgres=# create TABLE tbl (data text);
  6 CREATE TABLE
  7 postgres=# update tbl SET DATA =
  8
  9 postgres=# update T
 10
 11 postgres=#

Case 2: after patch

  1 postgres=# set a
  2 all  allow_system_table_mods 
application_name array_nulls

  3 postgres=# set A
  4 ALL  ALLOW_SYSTEM_TABLE_MODS 
APPLICATION_NAME ARRAY_NULLS

  5 postgres=# create TABLE tbl (data text);
  6 CREATE TABLE
  7
  8 postgres=# update tbl SET DATA =
  9
 10 postgres=# update TBL SET
 11
 12 postgres=#

So, as you can see, the difference is between lines 8 and 10 in case 2. It 
looks like lowercase input can auto-complete more than uppercase; 
secondly, if you could add some test cases, it would be great.


Best regards,
David

On 2021-03-22 5:41 a.m., tanghy.f...@fujitsu.com wrote:

On Tuesday, March 16, 2021 5:20 AM, Peter Eisentraut 
 wrote:


The cases that complete with a query
result are not case insensitive right now.  This affects things like

UPDATE T

as well.  I think your first patch was basically right.  But we need to
understand that this affects all completions with query results, not
just the one you wanted to fix.  So you should analyze all the callers
and explain why the proposed change is appropriate.

Thanks for your review and suggestion. Please find attached patch V3 which was 
based on the first patch[1].
Difference from the first patch is:

Add tab completion support for all query results in psql.
complete_from_query
+complete_from_versioned_query
+complete_from_schema_query
+complete_from_versioned_schema_query

[1] 
https://www.postgresql.org/message-id/a63cbd45e3884cf9b3961c2a6a95dcb7%40G08CNEXMBPEKD05.g08.fujitsu.local

The modification to support case-insensitive matching in "_complete_from_query" is based on 
"complete_from_const" and "complete_from_list".
Please let me know if you find anything insufficient.

Regards,
Tang



--
David

Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca




Re: Idea: Avoid JOINs by using path expressions to follow FKs

2021-03-30 Thread Isaac Morland
On Tue, 30 Mar 2021 at 14:30, Joel Jacobson  wrote:

If the expression ends with a column_name,
> you get the value for the column.
>
> If the expression ends with a constraint_name,
> you get the referenced table as a record.
>

Can’t you just leave off the “ends with a column_name” part? If you want
one of its columns, just put .column_name:

table -> constraint -> ... -> constraint . column_name

Then you know that -> expects a constraint_name and only that to its right.

Also, should the join be a left join, which would therefore return a NULL
when there is no matching record? Or could we have a variation such as ->?
to give a left join (NULL when no matching record) with -> using an inner
join (record is not included in result when no matching record).

For the record I would find something like this quite useful. I constantly
find myself joining in code lookup tables and the like, and while from a
mathematical view it’s just another join, explicitly listing the table in
the FROM clause of a large query does not assist with readability to say
the least.


Re: Failed assertion on standby while shutdown

2021-03-30 Thread igor levshin

Never mind: Masha said it has already been paid :)


On Tue 30.03.21 20:44, Maxim Orlov wrote:

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

All the tests passed successfully.

The new status of this patch is: Ready for Committer





Re: making update/delete of inheritance trees scale better

2021-03-30 Thread Tom Lane
I wrote:
> ... I also tried variants
> of that involving updating two columns of a 6-column table and of a
> 10-column table, figuring that those cases might be a bit more
> representative of typical usage (see attached scripts).

Argh, I managed to attach the wrong file for the 10-column test
case.  For the archives' sake, here's the right one.

regards, tom lane

drop table if exists tab;
create unlogged table tab (a int4, b int4, f3 int, f4 int, f5 text,
  f6 float8, f7 int, f8 int, f9 int, f10 int);
\timing on
insert into tab select g, g, g, g, g::text,
  g, g, g, g, g
  from generate_series(1, 1000) g;
vacuum tab;
explain verbose update tab set b = b, f6 = f6;
update tab set b = b, f6 = f6;


Re: making update/delete of inheritance trees scale better

2021-03-30 Thread Tom Lane
I wrote:
> I've not made any attempt to do performance testing on this,
> but I think that's about the only thing standing between us
> and committing this thing.

I think the main gating condition for committing this is "does it
make things worse for simple non-partitioned updates?".  The need
for an extra tuple fetch per row certainly makes it seem like there
could be a slowdown there.  However, in some tests here with current
HEAD (54bb91c30), I concur with Amit's findings that there's barely
any slowdown in that case.  I re-did Heikki's worst-case example [1]
of updating both columns of a 2-column table.  I also tried variants
of that involving updating two columns of a 6-column table and of a
10-column table, figuring that those cases might be a bit more
representative of typical usage (see attached scripts).  What I got
was

Times in ms, for the median of 3 runs:

Table width     HEAD    patch   HEAD    patch
                --- jit on ---  -- jit off --

2 columns   12528   13329   12574   12922
6 columns   15861   15862   14775   15249
10 columns  18399   16985   16994   16907

So even with the worst case, it's not too bad, just a few percent
worse, and once you get to a reasonable number of columns the v13
patch is starting to win.

However, I then tried a partitioned equivalent of the 6-column case
(script also attached), and it looks like

6 columns   16551   19097   15637   18201

which is really noticeably worse, 16% or so.  I poked at it with
"perf" to see if there were any clear bottlenecks, and didn't find
a smoking gun.  As best I can tell, the extra overhead must come
from the fact that all the tuples are now passing through an Append
node that's not there in the old-style plan tree.  I find this
surprising, because a (non-parallelized) Append doesn't really *do*
much; it certainly doesn't add any data copying or the like.
Maybe it's not so much the Append as that the rules for what kind of
tuple slot can be used have changed somehow?  Andres would have a
clearer idea about that than I do.

Anyway, I'm satisfied that this patch isn't likely to seriously
hurt non-partitioned cases.  There may be some micro-optimization
that could help simple partitioned cases, though.

This leaves us with a question whether to commit this patch now or
delay it till we have a better grip on why cases like this one are
slower.  I'm inclined to think that since there are a lot of clear
wins for users of partitioning, we shouldn't let the fact that there
are also some losses block committing.  But it's a judgment call.

regards, tom lane

[1] 
https://www.postgresql.org/message-id/2e50d782-36f9-e723-0c4b-d133e63c6127%40iki.fi

drop table if exists tab;
create unlogged table tab (a int4, b int4, f3 int, f4 int, f5 text, f6 float8);
\timing on
insert into tab select g, g, g, g, g::text, g from generate_series(1, 1000) g;
vacuum tab;
explain verbose update tab set b = b, f6 = f6;
update tab set b = b, f6 = f6;
drop table tab;
create unlogged table tab (a int4, b int4, f3 int, f4 int, f5 text, f6 float8);
\timing on
insert into tab select g, g, g, g, g::text, g from generate_series(1, 1000) g;
vacuum tab;
explain update tab set b = b, a = a;
update tab set b = b, a = a;
drop table if exists tab;
create unlogged table tab (a int4, b int4, f3 int, f4 int, f5 text, f6 float8)
partition by range(a);
do $$
begin
for i in 0..9 loop
  execute 'create unlogged table tab'||i||' partition of tab for values from ('||i*100||') to ('||(i+1)*100||');';
end loop;
end$$;

\timing on
insert into tab select g, g, g, g, g::text, g from generate_series(1, 1000-1) g;
vacuum tab;
explain verbose update tab set b = b, f6 = f6;
update tab set b = b, f6 = f6;


Re: Idea: Avoid JOINs by using path expressions to follow FKs

2021-03-30 Thread Joel Jacobson
On Mon, Mar 29, 2021, at 16:17, Vik Fearing wrote:
> SELECT DISTINCT order_details."order"->customer->company_name
> FROM order_details
> WHERE order_details.product->product_name = 'Chocolade';

I like the idea of using -> instead of . (dot),
since name resolution is already complicated,
so overloading the dot operator feels like a bad idea.

I therefore propose the following syntax:

{ table_name | alias } -> constraint_name [ [ -> constraint_name ... ] -> column_name ]

It's necessary to start with the table name or its alias,
since two tables/aliases used in the same query
might have different constraints with the same name.

If the expression ends with a column_name,
you get the value for the column.

If the expression ends with a constraint_name,
you get the referenced table as a record.
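
To make that concrete (this is only a sketch of the proposed syntax, nothing
that parses today, and the constraint names below are invented for the
example), the Chocolade query quoted above could then be written as:

    SELECT DISTINCT od->order_details_order_id_fkey->orders_customer_id_fkey->company_name
    FROM order_details AS od
    WHERE od->order_details_product_id_fkey->product_name = 'Chocolade';

Each arrow step before the last names a foreign key constraint on the table
reached so far, and the final step names a column of the last referenced table.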

I also have a new idea on how we can use
the nullability of the foreign key's column(s)
as a rule to determine whether you get
a LEFT JOIN or an INNER JOIN:

If ALL of the foreign key column(s) are declared as NOT NULL,
then you would get an INNER JOIN.

If ANY of the foreign key column(s) are nullable (i.e. not declared NOT NULL),
then you would get a LEFT JOIN.
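
A self-contained sketch of what that rule would mean in practice (the tables
are invented for illustration, and the constraint names are simply the
defaults PostgreSQL would generate):

    CREATE TABLE customers (customer_id bigint PRIMARY KEY, company_name text);
    CREATE TABLE shippers  (shipper_id  bigint PRIMARY KEY, company_name text);
    CREATE TABLE orders (
        order_id    bigint PRIMARY KEY,
        customer_id bigint NOT NULL REFERENCES customers,  -- NOT NULL FK column
        shipper_id  bigint REFERENCES shippers             -- nullable FK column
    );

    -- Under the proposed rule, orders->orders_customer_id_fkey->company_name
    -- would expand to an INNER JOIN against customers, while
    -- orders->orders_shipper_id_fkey->company_name would expand to a
    -- LEFT JOIN against shippers.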

Thoughts?

/Joel

Re: pgbench - add pseudo-random permutation function

2021-03-30 Thread Fabien COELHO



Hello Dean,

Thanks a lot for this work. This version looks much better than the 
previous one you sent, and has significant advantages over the one I sent, 
in particular avoiding the prime number stuff and large modular multiply.

So this looks good!

I'm happy that halves-xoring is back because it was an essential part of 
the design. ISTM that the previous version you sent only had linear/affine 
transforms which was a bad idea.


The linear transform is moved inside the halves, why not, and the trick 
that any odd multiplier is necessarily coprime with a power-of-two size 
is neat there.


However I still have some reservations:

First, I have a thing against erand48. The erand48 design looks like 
something that made sense when computer architectures were 16/32 bits and 
large integers were 32 bits, so a 16-bit margin looked good enough. Using 
an int16[3] type now is really depressing and probably costly on modern 
architectures. With 64-bit architectures, and given that we are 
manipulating 64-bit integers (well, 63 really), ISTM that the minimal 
state size for a PRNG should be at least 64 bits. Is there a good reason 
NOT to use a generator with at least 64 bits of state? I took Knuth's 
because it is cheap and 64 bits. I'd accept anything which is at least 
64 bits. I looked at xor-shift things, but there was some controversy 
around these, so I kept it simple and just shifted small values to get 
rid of the obvious cycles in Knuth's generator.


Second, I have a significant reservation about the very structure of the 
transformation in this version:


  loop 4 times :

    // FIRST HALF STEER
    m/r = pseudo randoms
    if v in first "half"
      v = ((v * m) ^ r) & mask;
      rotate1(v)

    // FULL SHIFT 1
    r = pseudo random
    v = (v + r) % size

    // SECOND HALF STEER
    m/r = pseudo randoms
    if v in second "half"
      same as previous on second half

    // FULL SHIFT 2
    r = pseudo random
    v = (v + r) % size

I'm really at odds with FULL SHIFT 1, because it means that up to 1/256 of 
values are kept out of STEERING. Whole chunks of values could be kept 
unshuffled because they would only have SHIFTS applied to them and each time 
fall in the non-steered half. It should be an essential part of the design 
that at least one steer is applied on a value at each round, and if two 
are applied then fine, but certainly not zero. So basically I think that 
the design would be significantly improved by removing "FULL SHIFT 1".


Third, I think that the rotate code can be simplified, in particular the 
?: should be avoided because it may induce branches quite damaging to 
processor performance.


I'll give it some more thought.

--
Fabien.




Re: Calendar support in localization

2021-03-30 Thread Daniel Verite
Surafel Temesgen wrote:

> > About intervals, if there were locale-aware functions like
> >  add_interval(timestamptz, interval [, locale]) returns timestamptz
> > or
> >  sub_timestamp(timestamptz, timestamptz [,locale]) returns interval
> > that would use ICU to compute the results according to the locale,
> > wouldn't it be good enough?
> >
> >
> Yes it can be enough for now but there are patches proposed to support the
> system and application time period which are in SQL standard 

To clarify, these function signatures are not meant to oppose
a core vs extension implementation, nor an ICU vs non-ICU
implementation. They're meant to illustrate the case of using
specific functions instead of adding specific data types.
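
(A purely hypothetical usage sketch of that function-based approach follows;
neither function exists, and the locale string is just an ICU-style example:)

    SELECT add_interval(timestamptz '2021-03-21 00:00+00',
                        interval '1 month',
                        'am_ET@calendar=ethiopic');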
AFAIU, adding data types comes from the idea that since
(non-gregorian-date + interval) doesn't have the same result as
(gregorian-date + interval), we could use a different type for
non-gregorian-date and so a different "+" operator, maybe
even a specific interval type.

For the case of temporal tables, I'm not quite familiar with the
feature, but I notice that the patch says:

+When system versioning is specified two columns are added which
+record the start timestamp and end timestamp of each row verson.
+The data type of these columns will be TIMESTAMP WITH TIME ZONE.

The user doesn't get to choose the data type, so if we'd require to
use specific data types for non-gregorian calendars, that would
seemingly complicate things for this feature. This is consistent
with the remark upthread that the SQL standard assumes the
gregorian calendar.


> what it takes to support calendar locally is input/output function
> and a converter from and to julian calendar and that may not be that
> much hard since most of the world calendar is based on julian or
> gregorian calendar[0]

The conversions from julian dates are not necessarily hard, but the
I/O functions mean having localized names for all days, months, eras
of all calendars in all supported languages. If you're thinking of
implementing this from scratch (without the ICU dependency), where
would these names come from? OTOH if we're using ICU, then why
bother reinventing the julian-to-calendars conversions that ICU
already does?


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite




Re: Idea: Avoid JOINs by using path expressions to follow FKs

2021-03-30 Thread Joel Jacobson
On Tue, Mar 30, 2021, at 16:25, Rod Taylor wrote:
> On Sat, 27 Mar 2021 at 16:28, Joel Jacobson  wrote:
>> __Imagine if we could simply write the SQL query like this:
>> 
>> SELECT DISTINCT od.order_id.customer_id.company_name
>> FROM order_details AS od
>> WHERE od.product_id.product_name = 'Chocolade';
> 
> I regularly do this type of thing via views. It's a bit confusing as writes 
> go to one set of tables while selects often go through the view with all the 
> details readily available.
> 
> I think I'd want these shortcuts to be well defined and obvious to someone 
> exploring via psql. I can also see uses where a foreign key might not be 
> available (left join rather than join).
> 
> I wonder if GENERATED ... VIRTUAL might be a way of defining this type of 
> added record.
> 
> ALTER TABLE order ADD customer record GENERATED JOIN customer USING 
> (customer_id) VIRTUAL;
> ALTER TABLE order_detail ADD order record GENERATED JOIN order USING 
> (order_id) VIRTUAL;
> 
> SELECT order.customer.company_name FROM order_detail;
> 
> Of course, if they don't reference the GENERATED column then the join isn't 
> added to the query.

Interesting idea, but not sure I like it, since you would need twice as many 
columns,
and you would still need the foreign keys, right?

/Joel

Re: Failed assertion on standby while shutdown

2021-03-30 Thread Maxim Orlov
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

All the tests passed successfully.

The new status of this patch is: Ready for Committer


multi-install PostgresNode fails with older postgres versions

2021-03-30 Thread Mark Dilger
Andrew,

While developing some cross version tests, I noticed that PostgresNode::init 
fails for postgres versions older than 9.3, like so:

# Checking port 52814
# Found port 52814
Name: 9.2.24
Data directory: 
/Users/mark.dilger/hydra/postgresnode.review/src/test/modules/test_cross_version/tmp_check/t_001_verify_paths_9.2.24_data/pgdata
Backup directory: 
/Users/mark.dilger/hydra/postgresnode.review/src/test/modules/test_cross_version/tmp_check/t_001_verify_paths_9.2.24_data/backup
Archive directory: 
/Users/mark.dilger/hydra/postgresnode.review/src/test/modules/test_cross_version/tmp_check/t_001_verify_paths_9.2.24_data/archives
Connection string: port=52814 
host=/var/folders/6n/3f4vwbnn7fz5qk0xqhgbjrkwgp/T/L_A2w1x7qb
Log file: 
/Users/mark.dilger/hydra/postgresnode.review/src/test/modules/test_cross_version/tmp_check/log/001_verify_paths_9.2.24.log
# Running: initdb -D 
/Users/mark.dilger/hydra/postgresnode.review/src/test/modules/test_cross_version/tmp_check/t_001_verify_paths_9.2.24_data/pgdata
 -A trust -N
initdb: invalid option -- N
Try "initdb --help" for more information.
Bail out!  system initdb failed

The problem is clear enough; -N/--nosync was added in 9.3, and 
PostgresNode::init is passing -N to initdb unconditionally. I wonder if during 
PostgresNode::new a call should be made to pg_config and the version 
information grep'd out so that version-specific options to various functions 
(init, backup, etc.) could hinge on the version of postgres being used?

You could also just remove the -N option, but that would slow down all tests 
for everybody, so I'm not keen to do that.  Or you could remove -N in cases 
where $node->{_install_path} is defined, which would be far more acceptable.  
I'm leaning towards using the output of pg_config, though, since this problem 
is likely to come up again with other options/commands.

Thoughts?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Stronger safeguard for archive recovery not to miss data

2021-03-30 Thread Fujii Masao




On 2021/03/26 22:14, David Steele wrote:

On 3/25/21 9:23 PM, Fujii Masao wrote:


On 2021/03/25 23:21, David Steele wrote:

On 1/25/21 3:55 AM, Laurenz Albe wrote:

On Mon, 2021-01-25 at 08:19 +, osumi.takami...@fujitsu.com wrote:

I think you should pst another patch where the second, now superfluous, error
message is removed.


Updated. This patch showed no failure during regression tests
and has been aligned by pgindent.


Looks good to me.
I'll set it to "ready for committer" again.


Fujii, does the new patch in [1] address your concerns?


No. I'm still not sure if this patch is good idea... I understand
why this safeguard is necessary. OTOH I'm afraid it increases
a bit the risk that users get unstartable database, i.e., lose whole database.
But maybe I'm concerned about rare case and my opinion is minority one.
So I'd like to hear more opinions about this patch.


After reviewing the patch I am +1. I think allowing corruption in recovery by 
default is not a good idea. There is currently a warning but I don't think that 
is nearly strong enough and is easily missed.

Also, "data may be missing" makes this sound like an acceptable situation. What 
is really happening is corruption is being introduced that may make the cluster unusable 
or at the very least lead to errors during normal operation.


Ok, now you, Osumi-san and Laurenz agree to this change
while I'm only the person not to like this. So unless we can hear
any other objections to this change, probably we should commit the patch.


If we want to allow recovery to play past this point I think it would make more 
sense to have a GUC (like ignore_invalid_pages [1]) that allows recovery to 
proceed and emits a warning instead of fatal.

Looking the patch, I see a few things:

1) Typo in the tests:

This protection shold apply to recovery mode

should be:

This protection should apply to recovery mode

2) I don't think it should be the job of this patch to restructure the if 
conditions, even if it is more efficient. It just obscures the purpose of the 
patch.


+1


So, I would revert all the changes in xlog.c except changing the warning to an 
error:

-    ereport(WARNING,
-    (errmsg("WAL was generated with wal_level=minimal, data may be missing"),
- errhint("This happens if you temporarily set wal_level=minimal without taking a new base backup.")));
+    ereport(FATAL,
+    (errmsg("WAL was generated with wal_level=minimal, cannot continue recovering"),
+ errdetail("This happens if you temporarily set wal_level=minimal on the server."),
+ errhint("Run recovery again from a new base backup taken after setting wal_level higher than minimal")));

I guess that users usually encounter this error because they have not
taken base backups yet after setting wal_level to higher than minimal
and have to use the old base backup for archive recovery. So I'm not sure
how helpful this HINT alone is for them. Isn't it better to append
something like "If there is no such backup, recover to the point in time
before wal_level was set to minimal, even though that causes data loss,
in order to start the server." to the HINT?

The following error will never be thrown because of the patch.
So IMO the following code should be removed.

if (ControlFile->wal_level < WAL_LEVEL_REPLICA)
    ereport(ERROR,
            (errmsg("hot standby is not possible because wal_level was not set to \"replica\" or higher on the primary server"),
             errhint("Either set wal_level to \"replica\" on the primary, or turn off hot_standby here.")));

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Proposal: Save user's original authenticated identity for logging

2021-03-30 Thread Jacob Champion
On Tue, 2021-03-30 at 09:55 +0900, Michael Paquier wrote:
> On Mon, Mar 29, 2021 at 11:53:03PM +, Jacob Champion wrote:
> > It's not a matter of the tests being stable, but of the tests needing
> > to change and evolve as the implementation changes. A big part of that
> > is visibility into what the tests are doing, so that you can debug
> > them.
> 
> Sure, but I still don't quite see why this applies here?  At the point
> of any test, like() or unlink() print the contents of the comparison
> if there is a failure, so there is no actual loss of data.  That's
> what issues_sql_like() does, for one.

The key there is "if there is a failure" -- false positives need to be
debugged too. Tests I've worked with recently for the NSS work were
succeeding for the wrong reasons. Overly generic logfile matches ("SSL
error"), for example.

> > Keep in mind that the rotate-restart-slurp method comes from an
> > existing test. I assume Andrew chose that method for the same reasons I
> > did -- it works with what we currently have.
> 
> PostgresNode::rotate_logfile got introduced in c098509, and it is just
> used in t/017_shm.pl on HEAD.  There could be more simplifications
> with 019_replslot_limit.pl, I certainly agree with that.

modules/ssl_passphrase_callback/t/001_testfunc.pl is where I pulled
this pattern from.

> > Is the additional effort to create (and maintain) that new system worth
> > two seconds per run? I feel like it's not -- but if you feel strongly
> > then I can definitely look into it.
> 
> I fear that heavily parallelized runs could feel the difference.  Ask
> Andres about that, he has been able to trigger in parallel a failure
> with pg_upgrade wiping out testtablespace while the main regression
> test suite just began :) 

Does unilateral log truncation play any nicer with parallel test runs?
I understand not wanting to make an existing problem worse, but it
doesn't seem like the existing tests were written for general
parallelism.

Would it be acceptable to adjust the tests for live rotation using the
logging collector, rather than a full restart? It would unfortunately
mean that we have to somehow wait for the rotation to complete, since
that's asynchronous.

(Speaking of asynchronous: how does the existing check-and-truncate
code make sure that the log entries it's looking for have been flushed
to disk? Shutting down the server guarantees it.)

> > Which parts would you consider confusing/in need of change? I'm happy
> > to expand where needed. Would an inline sample be more helpful than a
> > textual explanation?
> 
> That's with the use of "authentication and authorization".  How can
> users make the difference between what one or this other is without
> some explanation with the name maps?  It seems that there is no place
> in the existing docs where this difference is explained.  I am
> wondering if it would be better to not change this paragraph, or
> reword it slightly to outline that this may cause more than one log
> entry, say:
> "Causes each attempted connection to the server, and each
> authentication activity to be logged."

I took a stab at this in v13: "Causes each attempted connection to the
server to be logged, as well as successful completion of both client
authentication (if necessary) and authorization." (IMO any further in-
depth explanation of authn/z and user mapping probably belongs in the
auth method documentation, and this patch doesn't change any authn/z
behavior.)

v13 also incorporates the latest SSL cert changes, so it's just a
single patch now. Tests now cover the CN and DN clientname modes. I
have not changed the log capture method yet; I'll take a look at it
next.

--Jacob
From 48cfecfb71ac276bede9abe3e374ebee8f428117 Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Wed, 3 Feb 2021 11:42:05 -0800
Subject: [PATCH v13] Log authenticated identity from all auth backends

The "authenticated identity" is the string used by an auth method to
identify a particular user. In many common cases this is the same as the
Postgres username, but for some third-party auth methods, the identifier
in use may be shortened or otherwise translated (e.g. through pg_ident
mappings) before the server stores it.

To help DBAs see who has actually interacted with the system, store the
original identity when authentication succeeds, and expose it via the
log_connections setting. The log entries look something like this
example (where a local user named "pchampion" is connecting to the
database as the "admin" user):

LOG:  connection received: host=[local]
LOG:  connection authenticated: identity="pchampion" method=peer (/data/pg_hba.conf:88)
LOG:  connection authorized: user=admin database=postgres application_name=psql

port->authn_id is set according to the auth method:

bsd: the Postgres username (which is the local username)
cert: the client's Subject DN
gss: the user principal
ident: the remote username
ldap: the final bind DN
pam: the 

Re: SQL/JSON: JSON_TABLE

2021-03-30 Thread Erik Rijkers
> On 2021.03.27. 02:12 Nikita Glukhov  wrote:
> 
> Attached 47th version of the patches.
> 
[..]
> 
> I have added forgotten files and fixed the first patch.
> 
> [0001-SQL-JSON-functions-v47.patch]
> [0002-JSON_TABLE-v47.patch]
> [0003-JSON_TABLE-PLAN-DEFAULT-clause-v47.patch]
> [0004-JSON_TABLE-PLAN-clause-v47.patch]

Hi,

Apply, build all fine.  It also works quite well, and according to 
specification, as far as I can tell.

But today I ran into:

ERROR:  function ExecEvalJson not in llvmjit_types.c

I think that it is caused by:

set enable_bitmapscan = off;

(I installed llvm a few days ago. llvm-3.9-dev on this debian stretch).


This is the test sql I concocted, which runs fine with enable_bitmapscan on 
(the default):

select jt1.* 
from myjsonfile100k as t(js, id) 
  , json_table(
  t.js
   , '$' columns (
"lastname"   textpath  '$. "lastname" '
  , "firstname"  textpath  '$. "firstname"'
  , "date"   textpath  '$. "date" '
  , "city"   textpath  '$. "city" '
  , "country"textpath  '$. "country"  '
  , "name 0(1)"  textpath  '$. "array"[0] '
  , "name 4(5)"  textpath  '$. "array"[4] '
  , "names"  text[]  path  '$. "array"'
  , "randfloat"  float   path  '$. "random float" '
)
) as jt1
where  js @> ('[ { "city": "Santiago de Cuba" } ]')
   and js[0]->>'firstname' = 'Gilda'
;
ERROR:  function ExecEvalJson not in llvmjit_types.c

That statement only errors out if the table is large enough. I have no time now 
to make a sample table but if no-one understands the problem off-hand, I'll try 
to construct such a table later this week (the one I'm using is large, 1.5 GB).
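
In case it helps with reproduction in the meantime, a generator along the
following lines (the JSON shape is only a guess from the query above, and the
row count will likely need scaling up before the planner behaves the same way)
should produce a comparable table:

    create table myjsonfile100k as
    select jsonb_build_array(jsonb_build_object(
             'lastname',     'lastname_'  || g,
             'firstname',    case when g % 5000 = 0 then 'Gilda' else 'firstname_' || g end,
             'date',         '2021-03-30',
             'city',         case when g % 1000 = 0 then 'Santiago de Cuba' else 'city_' || g end,
             'country',      'country_'   || g,
             'array',        jsonb_build_array('n0', 'n1', 'n2', 'n3', 'n4'),
             'random float', random()
           )) as js,
           g as id
    from generate_series(1, 100000) g;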


Erik Rijkers



