Optimize update of tables with generated columns

2019-12-20 Thread Peter Eisentraut
When updating a table row with generated columns, we only need to 
recompute those generated columns whose base columns have changed in 
this update and keep the rest unchanged.  This can result in a 
significant performance benefit (easy to reproduce for example with a 
tsvector column).  The required information was already kept in 
RangeTblEntry.extraUpdatedCols; we just have to make use of it.
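
For illustration (a minimal example; table and column names are made up, not
part of the patch): with a stored generated tsvector column, an update that
touches only an unrelated column no longer has to re-run to_tsvector().

    -- "body" is the only base column of "body_tsv", so updating "view_count"
    -- alone does not require recomputing the generated column.
    CREATE TABLE doc (
        id         bigint PRIMARY KEY,
        body       text,
        view_count bigint DEFAULT 0,
        body_tsv   tsvector GENERATED ALWAYS AS (to_tsvector('english', body)) STORED
    );

    UPDATE doc SET view_count = view_count + 1 WHERE id = 1;  -- body_tsv kept as-is
    UPDATE doc SET body = 'new text' WHERE id = 1;            -- body_tsv recomputed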


A small problem is that right now ExecSimpleRelationUpdate() does not 
populate extraUpdatedCols.  That needs fixing first.  This is also 
related to the issue discussed in "logical replication does not fire 
per-column triggers"[0].  I'll leave my patch here while that issue is 
being resolved.



[0]: 
https://www.postgresql.org/message-id/flat/21673e2d-597c-6afe-637e-e8b10425b240%402ndquadrant.com


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 3b80f3147c0e983a2e8ad41be7df54c6480c0d8f Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 20 Dec 2019 22:52:31 +0100
Subject: [PATCH] Optimize update of tables with generated columns

When updating a table row with generated columns, only recompute those
generated columns whose base columns have changed in this update and
keep the rest unchanged.  This can result in a significant performance
benefit.  The required information was already kept in
RangeTblEntry.extraUpdatedCols; we just have to make use of it.

FIXME: ExecSimpleRelationUpdate() does not currently populate
extraUpdatedCols.  That needs fixing first.
---
 src/backend/commands/copy.c            |  2 +-
 src/backend/executor/execReplication.c |  4 +--
 src/backend/executor/nodeModifyTable.c | 37 +-
 src/include/executor/nodeModifyTable.h |  2 +-
 src/include/nodes/execnodes.h          |  3 +++
 5 files changed, 38 insertions(+), 10 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 42a147b67d..a0758cd39c 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -3221,7 +3221,7 @@ CopyFrom(CopyState cstate)
 			/* Compute stored generated columns */
 			if (resultRelInfo->ri_RelationDesc->rd_att->constr &&
 				resultRelInfo->ri_RelationDesc->rd_att->constr->has_generated_stored)
-				ExecComputeStoredGenerated(estate, myslot);
+				ExecComputeStoredGenerated(estate, myslot, CMD_INSERT);
 
 			/*
 			 * If the target is a plain table, check the constraints of
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 95e027c970..8cbec36b18 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -416,7 +416,7 @@ ExecSimpleRelationInsert(EState *estate, TupleTableSlot *slot)
 	/* Compute stored generated columns */
 	if (rel->rd_att->constr &&
 		rel->rd_att->constr->has_generated_stored)
-		ExecComputeStoredGenerated(estate, slot);
+		ExecComputeStoredGenerated(estate, slot, CMD_INSERT);
 
 	/* Check the constraints of the tuple */
 	if (rel->rd_att->constr)
@@ -482,7 +482,7 @@ ExecSimpleRelationUpdate(EState *estate, EPQState *epqstate,
 	/* Compute stored generated columns */
 	if (rel->rd_att->constr &&
 		rel->rd_att->constr->has_generated_stored)
-		ExecComputeStoredGenerated(estate, slot);
+		ExecComputeStoredGenerated(estate, slot, CMD_UPDATE);
 
 	/* Check the constraints of the tuple */
 	if (rel->rd_att->constr)
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 9ba1d78344..5878041ee6 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -246,7 +246,7 @@ ExecCheckTIDVisible(EState *estate,
  * Compute stored generated columns for a tuple
  */
 void
-ExecComputeStoredGenerated(EState *estate, TupleTableSlot *slot)
+ExecComputeStoredGenerated(EState *estate, TupleTableSlot *slot, CmdType cmdtype)
 {
 	ResultRelInfo *resultRelInfo = estate->es_result_relation_info;
 	Relation	rel = resultRelInfo->ri_RelationDesc;
@@ -269,6 +269,7 @@ ExecComputeStoredGenerated(EState *estate, TupleTableSlot *slot)
 
 		resultRelInfo->ri_GeneratedExprs =
 			(ExprState **) palloc(natts * sizeof(ExprState *));
+		resultRelInfo->ri_NumGeneratedNeeded = 0;
 
 		for (int i = 0; i < natts; i++)
 		{
@@ -276,18 +277,41 @@ ExecComputeStoredGenerated(EState *estate, TupleTableSlot *slot)
 		{
 

Re: Preserve versions of initdb-created collations in pg_upgrade

2019-12-20 Thread Peter Eisentraut

On 2019-10-29 03:33, Thomas Munro wrote:

Seems to work as described with -E UTF-8, but it fails with clusters
using -E SQL_ASCII.  That causes the pg_upgrade check to fail on
machines where that is the default encoding chosen by initdb (where
unpatched master succeeds):

pg_restore: creating COLLATION "pg_catalog.af-NA-x-icu"
pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 1700; 3456 12683 COLLATION af-NA-x-icu tmunro
pg_restore: error: could not execute query: ERROR:  collation
"pg_catalog.af-NA-x-icu" for encoding "SQL_ASCII" does not exist
Command was: ALTER COLLATION pg_catalog."af-NA-x-icu" OWNER TO tmunro;


This could be addressed by using is_encoding_supported_by_icu() in 
pg_dump to filter out collations with unsupported encodings.


However, the more I look at this whole problem, the more I wonder whether it 
wouldn't be preferable to avoid this whole mess by just not creating any 
collations in initdb.  What do you think?
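
For reference, a sketch of what "not creating them in initdb" could leave users
with: the existing import function can populate them on demand (shown as an
illustration only, not as something this thread's patch does).

    -- Imports the collations known to libc/ICU into the given schema, as
    -- initdb does today; returns the number of collations added.
    SELECT pg_import_system_collations('pg_catalog');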


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pgsql: Superuser can permit passwordless connections on postgres_fdw

2019-12-20 Thread Tom Lane
Michael Paquier  writes:
> On Fri, Dec 20, 2019 at 02:42:22PM -0500, Tom Lane wrote:
>> I notice in testing this that the "nosuper" business added by
>> 6136e94dc is broken in more ways than what the buildfarm is
>> complaining about: it leaves the role around at the end of the
>> test.

> Roles left behind at the end of a test are annoying.  Here is an idea:
> make pg_regress check if any roles prefixed by "regress_" are left
> behind at the end of a test.  This will not work until test_pg_dump is
> cleaned up, just a thought.

Yeah, it's sort of annoying that the buildfarm didn't notice this
aspect of things.  I'm not sure I want to spend cycles on checking
it in every test run, though.

Maybe we could have -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS
enable a check for that aspect along with what it does now?

regards, tom lane




Re: automating pg_config.h.win32 maintenance

2019-12-20 Thread Michael Paquier
On Fri, Dec 20, 2019 at 09:17:14AM +0100, Peter Eisentraut wrote:
> committed with that comment removed

Yeah, thanks!
--
Michael




Re: [PATCH] Increase the maximum value track_activity_query_size

2019-12-20 Thread Michael Paquier
On Fri, Dec 20, 2019 at 08:57:04AM -0500, Bruce Momjian wrote:
> I can imagine using larger pgstat_track_activity_query_size values for
> data warehouse queries, where they are long and there are only a few of
> them.

Why are those queries that long anyway?  An overly long IN clause with an
insane number of parameters that could be replaced by an ANY clause
with an array?
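
As an illustration of that rewrite (table and column names made up), the whole
parameter list can be passed as a single array instead of a huge IN list:

    -- Instead of: SELECT * FROM orders WHERE id IN (1, 2, 3, ... thousands more ...);
    PREPARE fetch_orders(bigint[]) AS
        SELECT * FROM orders WHERE id = ANY ($1);

    EXECUTE fetch_orders(ARRAY[1, 2, 3]::bigint[]);  -- the statement text stays short
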
--
Michael




Re: MarkBufferDirtyHint() and LSN update

2019-12-20 Thread Michael Paquier
On Fri, Dec 20, 2019 at 04:30:38PM +0100, Antonin Houska wrote:
> I wanted to register the patch for the next CF so it's not forgotten, but see
> it's already there. Why have you set the status to "withdrawn"?

Because my patch was incorrect, and I did not have enough bandwidth to
think more on the matter.  I am actually not sure that what you are
proposing is better..  If you wish to get that reviewed, could you add
a new CF entry?
--
Michael




Re: Disallow cancellation of waiting for synchronous replication

2019-12-20 Thread Michael Paquier
On Fri, Dec 20, 2019 at 03:07:26PM +0500, Andrey Borodin wrote:
>> Sending a cancellation is currently the only way to resume after
>> disabling synchronous replication. Some HA solutions (e.g.
>> pg_auto_failover) rely on this behaviour. Would it be worth checking
>> whether synchronous replication is still required?
> 
> I think changing synchronous_standby_names to some available
> standbys will resume all backends waiting for synchronous
> replication.  Do we need to check necessity of synchronous
> replication in any other case? 

Yeah, I am not on board with the concept of this thread.  Depending
on your HA configuration you can also reset synchronous_standby_names
after a certain small-ish WAL threshold has been reached, getting the
same result by disabling synchronous replication, though your cluster
then cannot safely perform a failover, so you need to keep track of
that state.  Something which would be useful is to improve the cases
where you still want to use synchronous replication by switching to a
different standby.  I recall that sometimes that can be rather slow..
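
For reference, the knob both paragraphs refer to looks like this (a sketch
only; whether and when to do it is exactly the HA policy question at hand):

    -- Waiting backends resume once synchronous_standby_names no longer
    -- requires an unreachable standby; an empty value disables sync rep.
    ALTER SYSTEM SET synchronous_standby_names = '';
    SELECT pg_reload_conf();
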
--
Michael




Re: Hooks for session start and end, take two

2019-12-20 Thread Michael Paquier
On Fri, Dec 20, 2019 at 11:33:02AM -0300, Alvaro Herrera wrote:
> Fair enough.

And done:
https://commitfest.postgresql.org/25/2251/
Sorry for the late reply.
--
Michael




Re: pgsql: Superuser can permit passwordless connections on postgres_fdw

2019-12-20 Thread Michael Paquier
On Fri, Dec 20, 2019 at 02:42:22PM -0500, Tom Lane wrote:
> Concretely, I think we ought to do (and back-patch) the attached.

Thanks for the fix, I have not been able to look at that.

> I notice in testing this that the "nosuper" business added by
> 6136e94dc is broken in more ways than what the buildfarm is
> complaining about: it leaves the role around at the end of the
> test.  That's a HUGE violation of project policy, for security
> reasons as well as the fact that it makes it impossible to run
> "make installcheck" twice without getting different results.

Roles left behind at the end of a test are annoying.  Here is an idea:
make pg_regress check if any roles prefixed by "regress_" are left
behind at the end of a test.  This will not work until test_pg_dump is
cleaned up, just a thought.
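
The check could boil down to something like this query, run by pg_regress
after the suite finishes (a sketch of the idea only):

    -- Any regression-test role that survives the run shows up here.
    SELECT rolname FROM pg_roles WHERE rolname LIKE 'regress\_%';
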
--
Michael




Re: archive status ".ready" files may be created too early

2019-12-20 Thread Bossart, Nathan
On 12/18/19, 8:34 AM, "Bossart, Nathan"  wrote:
> On 12/17/19, 2:26 AM, "Kyotaro Horiguchi"  wrote:
>> If so, we could amend also that case by marking the last segment as
>> .ready when XLogWrite writes the first bytes of the next segment. (As
>> the further corner-case, it still doesn't work if a continuation record
>> spans over three or more segments.. But I don't (or don't want to) think
>> we need to consider that case..)
>
> I'm working on a new version of the patch that will actually look at
> the WAL page metadata to determine when it is safe to mark a segment
> as ready for archival.  It seems relatively easy to figure out whether
> a page is the last one for the current WAL record.

I stand corrected.  My attempts to add logic to check the WAL records
added quite a bit more complexity than seemed reasonable to maintain
in this code path.  For example, I didn't anticipate things like
XLOG_SWITCH records.

I am still concerned about the corner case you noted, but I have yet
to find a practical way to handle it.  You suggested waiting until
writing the first bytes of the next segment before marking a segment
as ready, but I'm not sure that fixes this problem either, and I
wonder if it could result in waiting arbitrarily long before creating
a ".ready" file in some cases.  Perhaps I am misunderstanding your
suggestion.

Another thing I noticed is that any changes in this area could impact
archive_timeout.  If we reset the archive_timeout timer when we mark
the segments ready, we could force WAL switches more often.  If we do
not move the timer logic, we could be resetting it before the file is
ready for the archiver.  However, these differences might be subtle
enough to be okay.

Nathan



Re: Session WAL activity

2019-12-20 Thread Andres Freund
Hi,

On 2019-12-20 16:38:32 -0500, Bruce Momjian wrote:
> On Thu, Dec 12, 2019 at 09:31:22AM +0800, Craig Ringer wrote:
> > On Fri, 6 Dec 2019 at 09:57, Michael Paquier  wrote:
> > 
> > On Thu, Dec 05, 2019 at 12:23:40PM +0300, Konstantin Knizhnik wrote:
> > > Concerning keeping PGPROC size as small as possible, I agree that it is
> > > a reasonable argument.
> > > But even now it is very large (816 bytes) and adding extra 8 bytes will
> > > increase it by less than 1%.
> > 
> > It does not mean that we should add all kind of things to PGPROC as
> > that's a structure sensitive enough already.

Well, but we don't keep other stats in PGPROC, even when we have them in
shared memory? It seems like PgBackendStatus or such might be a better
place?


> > Right. It's not as critical as PGXACT, but PGPROC is still significant for
> > scalability and connection count limits.
> > 
> > It's a shame we can't really keep most of it in backend-private memory and 
> > copy
> > it to requestors when they want it, say into a temporary DSA or over
> > a shm_mq.

I don't understand what that would buy? Commonly accessed fields are just
going to be in L2 or such, with the cacheline being in
modified/exclusive state.  The problem isn't that fields / cachelines
*can* be accessed by other backends, it's only a problem *if* they're
frequently accessed. And even if accessed by multiple backends, it's
only really a problem if there are multiple fields *and* they're also
modified (otherwise they can just stay in shared state across
cpus/sockets).

There *is* an argument for grouping fields in PGPROC by their access
patterns. E.g. something like ->procArrayGroup* is a lot more commonly
accessed by different backends than e.g. this proposed field.


> > But our single threaded model means we just cannot rely on backends being
> > responsive in a timely enough manner to supply data on-demand. That doesn't
> > mean we have to push it to PGPROC though: we could be sending the parts that
> > don't have to be super-fresh to the stats collector or a new related process
> > for active session stats and letting it aggregate them.

We should definitely *NOT* do that. Ferrying things through the stats
collector is really expensive, and everyone pays the price for an
increase in size, not just code accessing the field.  In fact, no
reasonable quantity that's known at server start should ever go through
a mechanism as expensive as pgstat - the only reason it exists is that
the number of tables obviously can grow over time.

There's a thread somewhere about a patchset to move all of pgstat into
dynamic shared memory, actually. Because the writes / reads needed by
pgstat are really bad on some systems.


> > That's way beyond the scope of this patch though. So personally I'm OK with 
> > the
> > new PGPROC field. Visibility into Pg's activity is woefully limited and
> > something we need to prioritize more.
> 
> Uh, how much does the new field get us near the CPU cache line max size
> for a single PGPROC entry?

It's like ~13x the common size of a cache line (64 bytes).

Greetings,

Andres Freund




Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'

2019-12-20 Thread Mark Lorenz

while preparing the patch for the Commitfest, I found a bug in the
to_char() function that is quite correlated with this issue:



SELECT to_char('1997-02-01'::date, 'YYYY-WW-D')

returns: 1997-05-7 -> which is ok, I believe. Feb 1st was on Saturday,
so counting from Sundays, it was day 7 of week 5.


SELECT to_char('1997-02-03'::date, 'YYYY-WW-D')

returns: 1997-05-2 -> This cannot be.


Why not?  These format codes are specified as

D   day of the week, Sunday (1) to Saturday (7)
WW  week number of year (1–53) (the first week starts on the first day
of the year)



Because 1997-05-2 is earlier than 1997-05-7. But 1997-02-03 is later 
than 1997-02-01. From my point of view, this is confusing.



I don't see anything there that says that "D" is correlated with "WW".
We do have a connection between "ID" and "IW", so that ID ought to
specify a day within an IW week, but there's no connection between "D"
and either "W" or "WW" week numbering.  It's a day of the week, as
per the regular calendar.  Trying to define it as something else is
just going to break stuff.

The only way to make "D" as it stands compatible with a week-numbering
system is to ensure that your weeks always start on Sundays, that is,
just as confusing as ISO weeks but slightly differently confusing.

Perhaps it would be worth inventing format codes that do have the
same relationship to "W" and/or "WW" as "ID" does to "IW".  But
repurposing "D" for that is a bad idea.

regards, tom lane


I don't want to create any connection here. The day is calculated 
correctly. But the week number is wrong. 1997-02-03 was in week number 
6, as well as 1997-02-04. But Postgres returns 5. The problem with 
to_char() is that the week number considers only the number of days 
in the year and divides them by 7. So, there is no difference whether the 
year starts on Sunday or any other week day. So, an offset is missing, 
which yields wrong week numbers, as far as I can see...





Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'

2019-12-20 Thread Mark Lorenz

Hi Tom,

thanks for answering!

I committed two different patches:

---

The first one is for the strange behaviour of to_char(), which could be 
seen as a bug, I believe. As described earlier, to_char() with the 
'WW-D' pattern could return wrong week numbers.


The non-ISO week number is defined for weeks beginning on Sunday and 
ending on Saturday.  The first week of the year is the week containing 
January 1st.


For example:

postgres=# SELECT to_char('1997-01-01'::date, 'YYYY-WW-D');
  to_char
-----------
 1997-01-4
(1 row)

1997-01-01 was a Wednesday. So the first week in 1997 was from Jan 1st 
to Jan 4th (Saturday). Week 2 started on Jan 5th. But to_char() gives 
out week number 1 until Tuesday (!), Jan 7th.


postgres=# SELECT to_char('1997-01-07'::date, 'YYYY-WW-D');
  to_char
-----------
 1997-01-3
(1 row)

After that, on Jan 8th, the output switches from 01-3 to 02-4, which 
makes no sense in my personal opinion. The week number should be 
consistent from Sun to Sat and should not switch during any day in the 
week. Furthermore, it is not clear why Jan 7th should return an earlier 
week day (3) than Jan 1st (4).


The bug is that the calculation of the week number only considers the 
number of days elapsed in the current year, but ignores the weekday on 
which the year starts, which defines an offset.  This has been fixed in 
the first patch.
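
To make the missing offset concrete, here is one way to express the intended
Sunday-based week number in plain SQL (a sketch of the semantics, not the
patch itself):

    -- Week 1 starts on January 1st; each following week starts on a Sunday.
    -- extract(dow) is 0 for Sunday, so the dow of Jan 1st is the offset.
    SELECT d,
           ((extract(doy FROM d)::int
             + extract(dow FROM date_trunc('year', d))::int - 1) / 7) + 1 AS week
    FROM (VALUES (date '1997-01-01'),
                 (date '1997-01-05'),
                 (date '1997-02-01'),
                 (date '1997-02-03')) AS t(d);
    -- gives weeks 1, 2, 5 and 6 for these dates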


---

Second patch:

As you stated correctly, this is not a bug fix, because the current 
behaviour is documented and it works as the documentation states. I 
tried to describe my confusion in the very first post of this thread:


I was wondering why the D part is not recognized in the non-ISO week 
pattern while the ISO day works very well. Although this is documented, 
there was a chance that it simply had not been implemented yet - so I 
tried.


The main aspect, I believe, is that to_date() or to_timestamp() is some 
kind of inverse operation of the to_char() function. So, a new definition 
should simply recognize the week day as the to_char() function does, 
instead of fixing the day part to an arbitrary number (please see the 
examples in the very first post for that).


---

Combining both patches, the to_char() fix and the to_date() change, it 
is possible to calculate the non-ISO week pattern in both directions:


SELECT to_date(to_char(anydate, 'YYYY-WW-D'), 'YYYY-WW-D')

would result in "anydate". Currently it does not:

postgres=# SELECT to_date(to_char('1997-01-07'::date, 'YYYY-WW-D'), 'YYYY-WW-D');
  to_date
------------
 1997-01-01
(1 row)

postgres=# SELECT to_char(to_date('1997-01-07', 'YYYY-WW-D'), 'YYYY-WW-D');
  to_char
------------
 1997-01-04
(1 row)

On the other hand, the ISO week calculations work as expected; in 
particular, the round-trip operation returns the original value:


postgres=# SELECT to_date(to_char('1997-01-07'::date, 'IYYY-IW-ID'), 'IYYY-IW-ID');
  to_date
------------
 1997-01-07
(1 row)

postgres=# SELECT to_char(to_date('1997-01-07', 'IYYY-IW-ID'), 'IYYY-IW-ID');
  to_char
------------
 1997-01-7
(1 row)

The only difference between ISO and non-ISO weeks is that ISO weeks 
begin on Mondays and define the first week differently. But this cannot 
be the reason why one operation yields correct values (compared with a 
calendar) and the other one does not.


Does this explanation make it clearer?




Re: Session WAL activity

2019-12-20 Thread Bruce Momjian
On Thu, Dec 12, 2019 at 09:31:22AM +0800, Craig Ringer wrote:
> On Fri, 6 Dec 2019 at 09:57, Michael Paquier  wrote:
> 
> On Thu, Dec 05, 2019 at 12:23:40PM +0300, Konstantin Knizhnik wrote:
> > Concerning keeping PGPROC size as small as possible, I agree that it is
> > a reasonable argument.
> > But even now it is very large (816 bytes) and adding extra 8 bytes will
> > increase it by less than 1%.
> 
> It does not mean that we should add all kind of things to PGPROC as
> that's a structure sensitive enough already.
> 
> 
> Right. It's not as critical as PGXACT, but PGPROC is still significant for
> scalability and connection count limits.
> 
> It's a shame we can't really keep most of it in backend-private memory and 
> copy
> it to requestors when they want it, say into a temporary DSA or over a shm_mq.
> But our single threaded model means we just cannot rely on backends being
> responsive in a timely enough manner to supply data on-demand. That doesn't
> mean we have to push it to PGPROC though: we could be sending the parts that
> don't have to be super-fresh to the stats collector or a new related process
> for active session stats and letting it aggregate them.
> 
> That's way beyond the scope of this patch though. So personally I'm OK with 
> the
> new PGPROC field. Visibility into Pg's activity is woefully limited and
> something we need to prioritize more.

Uh, how much does the new field get us near the CPU cache line max size
for a single PGPROC entry?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: More issues with expressions always false (no patch)

2019-12-20 Thread Tom Lane
Andreas Karlsson  writes:
> On 12/20/19 1:54 AM, Andreas Karlsson wrote:
>> On 12/20/19 1:01 AM, Ranier Vilela wrote:
>>> First case:
>>> Third case:
>>> src\backend\utils\cache\relcache.c (line 5190)
>>> if (relation->rd_pubactions)
>>> 
>>> It will never be executed, because if relation->rd_pubactions is 
>>> true, the function returns on line 5154.

>> I have not looked into this one in detail, but the free at line 5192 
>> looks like potentially dead code.

> I have looked at it now and it seems like this code has been dead since 
> the function was originally implemented in 665d1fad99e.

I would not put a whole lot of faith in that.  This argument supposes
that nothing else can touch the relcache entry while we are doing
GetRelationPublications and the pg_publication syscache accesses inside
the foreach loop.  Now in practice, yeah, it's somewhat unlikely that
anything down inside there would take an interest in our relation's
publication actions, especially if our relation isn't a system catalog.
But there are closely related situations in other relcache functions
that compute cached values like this where we *do* have to worry about
reentrant/recursive use of the function.  I think the "useless" free
is cheap insurance against a permanent memory leak, as well as more
like the coding in nearby functions like RelationGetIndexAttrBitmap.
I wouldn't change it.

regards, tom lane




Re: Disallow cancellation of waiting for synchronous replication

2019-12-20 Thread Tom Lane
Andrey Borodin  writes:
> I think proper solution here would be to add GUC to disallow cancellation of 
> synchronous replication.

This sounds entirely insane to me.  There is no possibility that you
can prevent a failure from occurring at this step.

> There is still a problem when the backend is not canceled, but terminated [2].

Exactly.  If you don't have a fix that handles that case, you don't have
anything.  In fact, you've arguably made things worse, by increasing the
temptation to terminate or "kill -9" the nonresponsive session.

regards, tom lane




Re: range_agg

2019-12-20 Thread Tom Lane
Alvaro Herrera  writes:
> On 2019-Dec-20, Paul A Jungwirth wrote:
>> Is it permitted to add casts with polymorphic inputs & outputs? Is
>> that something that we would actually want to do? I'd probably need
>> both the polymorphic and concrete casts so that you could still say
>> `int4range(1,2)::int4multirange`.

> I'm embarrased to admit that I don't grok the type system well enough
> (yet) to answer this question.

I would say no; if you want behavior like that you'd have to add code for
it into the coercion machinery, much like the casts around, say, types
record and record[] versus named composites and arrays of same.  Expecting
the generic cast machinery to do the right thing would be foolhardy.

In any case, even if it did do the right thing, you'd still need
some additional polymorphic type to express the behavior you wanted,
no?  So it's not clear there'd be any net savings of effort.

regards, tom lane




Re: pg_publication repetitious code

2019-12-20 Thread Tom Lane
Alvaro Herrera  writes:
> This very small patch removes some duplicated code in pg_publication.

Seems like the extra test on missing_ok is unnecessary:

+   oid = get_publication_oid(pubname, missing_ok);
+   if (!OidIsValid(oid) && missing_ok)
+   return NULL;

As coded, it's get_publication_oid's job to deal with that.

Otherwise +1

regards, tom lane




pg_publication repetitious code

2019-12-20 Thread Alvaro Herrera
This very small patch removes some duplicated code in pg_publication.

-- 
Álvaro Herrera                http://www.linkedin.com/in/alvherre
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index d442c8e0bb..347324f320 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -403,17 +403,9 @@ GetPublicationByName(const char *pubname, bool missing_ok)
 {
 	Oid			oid;
 
-	oid = GetSysCacheOid1(PUBLICATIONNAME, Anum_pg_publication_oid,
-		  CStringGetDatum(pubname));
-	if (!OidIsValid(oid))
-	{
-		if (missing_ok)
-			return NULL;
-
-		ereport(ERROR,
-(errcode(ERRCODE_UNDEFINED_OBJECT),
- errmsg("publication \"%s\" does not exist", pubname)));
-	}
+	oid = get_publication_oid(pubname, missing_ok);
+	if (!OidIsValid(oid) && missing_ok)
+		return NULL;
 
 	return GetPublication(oid);
 }


Re: pgsql: Superuser can permit passwordless connections on postgres_fdw

2019-12-20 Thread Tom Lane
I wrote:
> This is a bit messier.  But I think that the discrepancy is not
> really the fault of this patch: rather, it's a bug in the way the
> GSS support was put into libpq.  I thought we had a policy that
> all builds would recognize all possible parameters and then
> perhaps fail later.  Certainly the SSL parameters are implemented
> that way.  The #if's disabling GSS stuff in PQconninfoOptions[]
> are just broken, according to that policy.

Concretely, I think we ought to do (and back-patch) the attached.

I notice in testing this that the "nosuper" business added by
6136e94dc is broken in more ways than what the buildfarm is
complaining about: it leaves the role around at the end of the
test.  That's a HUGE violation of project policy, for security
reasons as well as the fact that it makes it impossible to run
"make installcheck" twice without getting different results.

regards, tom lane

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 3d6e4ee..ed52ddd 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -132,7 +132,7 @@ CREATE FOREIGN TABLE ft6 (
 -- ===
 -- tests for validator
 -- ===
--- requiressl, krbsrvname and gsslib are omitted because they depend on
+-- requiressl is omitted because valid values depend on
 -- configure options
 ALTER SERVER testserver1 OPTIONS (
 	use_remote_estimate 'false',
@@ -158,10 +158,10 @@ ALTER SERVER testserver1 OPTIONS (
 	sslcert 'value',
 	sslkey 'value',
 	sslrootcert 'value',
-	sslcrl 'value'
+	sslcrl 'value',
 	--requirepeer 'value',
-	-- krbsrvname 'value',
-	-- gsslib 'value',
+	krbsrvname 'value',
+	gsslib 'value'
 	--replication 'value'
 );
 -- Error, invalid list syntax
@@ -8855,7 +8855,7 @@ DO $d$
 END;
 $d$;
 ERROR:  invalid option "password"
-HINT:  Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, requirepeer, gssencmode, krbsrvname, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, fetch_size
+HINT:  Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, requirepeer, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, fetch_size
 CONTEXT:  SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
 PL/pgSQL function inline_code_block line 3 at EXECUTE
 -- If we add a password for our user mapping instead, we should get a different
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 33307d6..e0f4c72 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -145,7 +145,7 @@ CREATE FOREIGN TABLE ft6 (
 -- ===
 -- tests for validator
 -- ===
--- requiressl, krbsrvname and gsslib are omitted because they depend on
+-- requiressl is omitted because valid values depend on
 -- configure options
 ALTER SERVER testserver1 OPTIONS (
 	use_remote_estimate 'false',
@@ -171,10 +171,10 @@ ALTER SERVER testserver1 OPTIONS (
 	sslcert 'value',
 	sslkey 'value',
 	sslrootcert 'value',
-	sslcrl 'value'
+	sslcrl 'value',
 	--requirepeer 'value',
-	-- krbsrvname 'value',
-	-- gsslib 'value',
+	krbsrvname 'value',
+	gsslib 'value'
 	--replication 'value'
 );
 
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 4325946..66b09da 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1747,8 +1747,10 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
   gsslib
   

-GSS library to use for GSSAPI authentication. Only used on Windows.
-Set to gssapi to force libpq to use the GSSAPI
+GSS library to use for GSSAPI authentication.
+Currently this is disregarded except on Windows builds that include
+both GSSAPI and SSPI support.  In that case, set
+this to gssapi to cause libpq to use the GSSAPI
 library for authentication instead of the default SSPI.

   
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 3ca7e05..cb3c431 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ 

Re: range_agg

2019-12-20 Thread Alvaro Herrera
On 2019-Dec-20, Paul A Jungwirth wrote:

> Is it permitted to add casts with polymorphic inputs & outputs? Is
> that something that we would actually want to do? I'd probably need
> both the polymorphic and concrete casts so that you could still say
> `int4range(1,2)::int4multirange`.

I'm embarrased to admit that I don't grok the type system well enough
(yet) to answer this question.

> Should I change the coerce code to look for casts among concrete types
> when the function has polymorphic types? I'm pretty scared to do
> something like that though, both because of the complexity and lest I
> cause unintended effects.

Yeah, I suggest to stay away from that.  I think this multirange thing
is groundbreaking enough that we don't need to cause additional
potential breakage.

> Should I just give up on implicit casts and require you to specify
> one? That makes it a little more annoying to mix range & multirange
> types, but personally I'm okay with that. This is my preferred
> approach.

+1

> I have some time over the holidays to work on the other changes Alvaro
> has suggested.

I hope not to have made things worse by posting a rebase.  Anyway,
that's the reason I posted my other changes separately.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: range_agg

2019-12-20 Thread Paul A Jungwirth
On Fri, Dec 20, 2019 at 10:19 AM Pavel Stehule  wrote:
> I had a talk with Paul about possible simplification of designed operators. 
> Last message from Paul was - he is working on new version.

Thanks Alvaro & Pavel for helping move this forward. I've added the
casts but they aren't used automatically for things like
`range_union(r, mr)` or `mr + r`, even though they are implicit.
That's because the casts are for concrete types (e.g. int4range ->
int4multirange) but the functions & operators are for polymorphic
types (anymultirange + anymultirange). So I'd like to get some
feedback about the best way to proceed.

Is it permitted to add casts with polymorphic inputs & outputs? Is
that something that we would actually want to do? I'd probably need
both the polymorphic and concrete casts so that you could still say
`int4range(1,2)::int4multirange`.

Should I change the coerce code to look for casts among concrete types
when the function has polymorphic types? I'm pretty scared to do
something like that though, both because of the complexity and lest I
cause unintended effects.

Should I just give up on implicit casts and require you to specify
one? That makes it a little more annoying to mix range & multirange
types, but personally I'm okay with that. This is my preferred
approach.

I have some time over the holidays to work on the other changes Alvaro
has suggested.

Thanks,
Paul




Re: range_agg

2019-12-20 Thread Alvaro Herrera
On 2019-Dec-20, Alvaro Herrera wrote:

> I am not convinced that adding TYPTYPE_MULTIRANGE is really necessary.
> Why can't we just treat those types as TYPTYPE_RANGE and distinguish
> them using TYPCATEGORY_MULTIRANGE?  That's what we do for arrays.  I'll
> try to do that next.

I think this can be simplified if we make the multirange's
pg_type.typelem carry the base range's OID (the link in the other
direction already appears as pg_range.mltrngtypid, though I'd recommend
renaming that to pg_range.rngmultitypid to maintain the "rng" prefix
convention).  Then we can distinguish a multirange from a plain range
easily, both of which have typtype as TYPTYPE_RANGE, because typelem !=
0 in a multi.  That knowledge can be encapsulated easily in
type_is_multirange and pg_dump's getTypes.
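
At the SQL level, that encapsulated check would boil down to something like
this (a sketch assuming the proposed convention; on current catalogs it simply
returns no rows):

    -- Under the proposal, a multirange is a range-typed pg_type entry whose
    -- typelem links back to its base range type.
    SELECT t.typname, t.typelem::regtype AS base_range
    FROM pg_type AS t
    WHERE t.typtype = 'r'      -- TYPTYPE_RANGE
      AND t.typelem <> 0;      -- nonzero typelem would mark a multirange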

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pgsql: Superuser can permit passwordless connections on postgres_fdw

2019-12-20 Thread Tom Lane
[ redirecting to -hackers ]

Michael Paquier  writes:
> On Fri, Dec 20, 2019 at 05:55:10AM +, Andrew Dunstan wrote:
>> Superuser can permit passwordless connections on postgres_fdw

> After this commit a couple of buildfarm animals are unhappy with the
> regression tests of postgres_fdw:

Yeah, the buildfarm is *very* unhappy with this.

>  CREATE ROLE nosuper NOSUPERUSER;
> +WARNING:  roles created by regression test cases should have names starting with "regress_"

That one is just failure to follow the guidelines, and is easily
fixed by adjusting the test case.

> There is also a second type of failure:
> -HINT:  Valid options in this context are: [...] krbsrvname [...]
> +HINT:  Valid options in this context are: [...]
> The diff here is that krbsrvname is not part of the list of valid
> options.  Anyway, as this list is build-dependent, I think that this
> test needs some more design effort.

This is a bit messier.  But I think that the discrepancy is not
really the fault of this patch: rather, it's a bug in the way the
GSS support was put into libpq.  I thought we had a policy that
all builds would recognize all possible parameters and then
perhaps fail later.  Certainly the SSL parameters are implemented
that way.  The #if's disabling GSS stuff in PQconninfoOptions[]
are just broken, according to that policy.

regards, tom lane




Re: range_agg

2019-12-20 Thread Pavel Stehule
pá 20. 12. 2019 v 18:43 odesílatel Alvaro Herrera 
napsal:

> I took the liberty of rebasing this series on top of recent branch
> master.  The first four are mostly Paul's originals, except for conflict
> fixes; the rest are changes I'm proposing as I go along figuring out the
> whole thing.  (I would post just my proposed changes, if it weren't for
> the rebasing; apologies for the messiness.)
>
> I am not convinced that adding TYPTYPE_MULTIRANGE is really necessary.
> Why can't we just treat those types as TYPTYPE_RANGE and distinguish
> them using TYPCATEGORY_MULTIRANGE?  That's what we do for arrays.  I'll
> try to do that next.
>
> I think the algorithm for coming up with the multirange name is
> suboptimal.  It works fine when the name is short enough that we can add
> a few extra letters, but otherwise the result looks pretty silly.  I
> think we can still improve on that.  I propose to make
> makeUniqueTypeName accept a suffix, and truncate the letters that appear
> *before* the suffix rather than truncating after it's been appended.
>
> There's a number of ereport() calls that should become elog(); and a
> bunch of others that should probably acquire errcode() and be
> reformatted per our style.
>
>
> Regarding Pavel's documentation markup issue,
>
> > I am not sure how much is correct to use <literallayout class="monospaced">
> > in doc. It is used for ranges, and multiranges, but not in other places
>
> I looked at the generated PDF and the table looks pretty bad; the words
> in those entries overlap the words in the cell to their right.  But that
> also happens with entries that do not use <literallayout>!
> See [1] for an example of the existing docs being badly formatted.  The
> docbook documentation [2] seems to suggest that what Paul used is the
> appropriate way to do this.
>
> Maybe a way is to make each entry have more than one row -- so the
> example would appear below the other three fields in its own row, and
> would be able to use the whole width of the table.
>

I had a talk with Paul about a possible simplification of the designed operators.
The last message from Paul was that he is working on a new version.

Regards

Pavel



> [1] https://twitter.com/alvherre/status/1205563468595781633
> [2] https://tdg.docbook.org/tdg/5.1/literallayout.html
>
> --
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: Protocol problem with GSSAPI encryption?

2019-12-20 Thread Bruce Momjian
On Fri, Dec 20, 2019 at 06:14:09PM +, Andrew Gierth wrote:
> > "Bruce" == Bruce Momjian  writes:
> 
>  >> This came up recently on IRC, not sure if the report there was
>  >> passed on at all.
>  >> 
>  >> ProcessStartupPacket assumes that there will be only one negotiation
>  >> request for an encrypted connection, but libpq is capable of issuing
>  >> two: it will ask for GSS encryption first, if it looks like it will
>  >> be able to do GSSAPI, and if the server refuses that it will ask (on
>  >> the same connection) for SSL.
> 
>  Bruce> Are you saying that there is an additional round-trip for
>  Bruce> starting all SSL connections because we now support GSSAPI, or
>  Bruce> this only happens if libpq asks for GSSAPI?
> 
> The problem only occurs if libpq thinks it might be able to do GSSAPI,
> but the server does not. Without the patch I proposed or something like
> it, this case fails to connect at all; with it, there will be an extra
> round-trip. Explicitly disabling GSSAPI encryption in the connection
> string or environment avoids the issue.
> 
> The exact condition for libpq seems to be a successful call to
> gss_acquire_cred, but I'm not familiar with GSS in general.

Thanks for the clarification from you and Stephen.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: Clarifying/rationalizing Vars' varno/varattno/varnoold/varoattno

2019-12-20 Thread Tom Lane
Robert Haas  writes:
> On Fri, Dec 20, 2019 at 11:13 AM Tom Lane  wrote:
>> The alternatives that seem plausible at this point are
>> ...
>> (2) Explicitly mark Vars as being nullable by some outer join.  I think
>> we could probably get this down to one additional integer field in
>> struct Var, so it wouldn't be too much of a penalty.

> It might be useful to store 'Relids' with each Var rather than just 'bool'. In
> other words, based on where the reference to the Var is in the
> original query text, figure out the set of joins where (1) the Var is
> syntactically above the join and (2) on the nullable side, and then
> put the relations on the other sides of those joins into the Relids.
> Then if you later determine that A LEFT JOIN B actually can't make
> anything go to null, you can just ignore the presence of A in this set
> for the rest of planning. I feel like this kind of idea might have
> other applications too, although I admit that it also has a cost.

Yeah, a bitmapset might be a better idea than just recording the topmost
relevant join's relid.  But it's also more expensive, and I think we ought
to keep the representation of Vars as cheap as possible.  (On the third
hand, an empty BMS is cheap, while if the alternative to a non-empty BMS
is to put a separate wrapper node around the Var, that's hardly better.)

The key advantage of a BMS, really, is that it dodges the issue of needing
to re-mark Vars when you re-order two outer joins using the outer join
identities.  You really don't want that to result in having to consider
Vars above the two joins to be different depending on the order you chose
for the OJs, because that'd enormously complicate considering both sorts
of Paths at the same time.  The rough idea I'd had about coping with that
issue with just a single relid is that maybe it doesn't matter --- maybe
we can always mark Vars according to the *syntactically* highest nulling
OJ, regardless of the actual join order.  But I'm not totally sure that
can work.

In any case, what the planner likes to work with is sets of baserel
relids covered by a join, not the relid(s) of the join RTEs themselves.
So maybe there's going to be a conversion step required anyhow.

regards, tom lane




Re: Protocol problem with GSSAPI encryption?

2019-12-20 Thread Andrew Gierth
> "Bruce" == Bruce Momjian  writes:

 >> This came up recently on IRC, not sure if the report there was
 >> passed on at all.
 >> 
 >> ProcessStartupPacket assumes that there will be only one negotiation
 >> request for an encrypted connection, but libpq is capable of issuing
 >> two: it will ask for GSS encryption first, if it looks like it will
 >> be able to do GSSAPI, and if the server refuses that it will ask (on
 >> the same connection) for SSL.

 Bruce> Are you saying that there is an additional round-trip for
 Bruce> starting all SSL connections because we now support GSSAPI, or
 Bruce> this only happens if libpq asks for GSSAPI?

The problem only occurs if libpq thinks it might be able to do GSSAPI,
but the server does not. Without the patch I proposed or something like
it, this case fails to connect at all; with it, there will be an extra
round-trip. Explicitly disabling GSSAPI encryption in the connection
string or environment avoids the issue.

The exact condition for libpq seems to be a successful call to
gss_acquire_cred, but I'm not familiar with GSS in general.

-- 
Andrew (irc:RhodiumToad)




Re: Protocol problem with GSSAPI encryption?

2019-12-20 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Sun, Dec  1, 2019 at 01:13:31AM +, Andrew Gierth wrote:
> > This came up recently on IRC, not sure if the report there was passed on
> > at all.
> > 
> > ProcessStartupPacket assumes that there will be only one negotiation
> > request for an encrypted connection, but libpq is capable of issuing
> > two: it will ask for GSS encryption first, if it looks like it will be
> > able to do GSSAPI, and if the server refuses that it will ask (on the
> > same connection) for SSL.
> 
> Are you saying that there is an additional round-trip for starting all
> SSL connections because we now support GSSAPI, or this only happens if
> libpq asks for GSSAPI?

The way that this is intended to work is: if, and only if, there is a
valid GSS credential cache (on the client side) will GSSAPI encryption
be attempted, and then if that fails because the server doesn't support
GSSAPI encryption or it's not possible to acquire credentials for
whatever reason, we'll fall back to other methods.

I have heard, however, that the Apple GSS libraries are both outright
broken (they lie about a valid credential cache existing - claiming one
does even when that's clearly not the case, based on klist..), and
deprecated (so they aren't likely going to fix them either..).  We're
currently looking to see if there's a way to basically detect the Apple
GSS libraries and refuse to build if we discover that's what we're
building against.  I'm not sure what other choice we really have...

If you gdb psql, without a Kerberos credential cache, on a system that
has a working GSS library, you'll note that pg_GSS_have_cred_cache()
returns false, meaning we skip over the GSS startup code in
PQconnectPoll() (and drop down to trying to do SSL next).

Thanks,

Stephen




Re: Optimizing TransactionIdIsCurrentTransactionId()

2019-12-20 Thread Simon Riggs
On Fri, 20 Dec 2019 at 17:46, Tom Lane  wrote:

> Simon Riggs  writes:
> > On Fri, 20 Dec 2019 at 13:07, Robert Haas  wrote:
> >> With regard to this point, I second Tomas's comments.
>
> > I also agree with Tomas' comments. I am explaining *why* it will be an
> > improvement, expanding on my earlier notes.
> > This function is called extremely frequently in query processing and is
> > fairly efficient. I'm pointing out cases where making it even quicker makes
> > sense.
>
> I think the point is that you haven't demonstrated that this particular
> patch makes it quicker.
>

Not yet, but I was trying to agree on what an appropriate test would be before
running it.

-- 
Simon Riggs                http://www.2ndQuadrant.com/

PostgreSQL Solutions for the Enterprise


Re: Optimizing TransactionIdIsCurrentTransactionId()

2019-12-20 Thread Tom Lane
Simon Riggs  writes:
> On Fri, 20 Dec 2019 at 13:07, Robert Haas  wrote:
>> With regard to this point, I second Tomas's comments.

> I also agree with Tomas' comments. I am explaining *why* it will be an
> improvement, expanding on my earlier notes.
> This function is called extremely frequently in query processing and is
> fairly efficient. I'm pointing out cases where making it even quicker makes
> sense.

I think the point is that you haven't demonstrated that this particular
patch makes it quicker.

regards, tom lane




Re: Protocol problem with GSSAPI encryption?

2019-12-20 Thread Bruce Momjian
On Sun, Dec  1, 2019 at 01:13:31AM +, Andrew Gierth wrote:
> This came up recently on IRC, not sure if the report there was passed on
> at all.
> 
> ProcessStartupPacket assumes that there will be only one negotiation
> request for an encrypted connection, but libpq is capable of issuing
> two: it will ask for GSS encryption first, if it looks like it will be
> able to do GSSAPI, and if the server refuses that it will ask (on the
> same connection) for SSL.

Are you saying that there is an additional round-trip for starting all
SSL connections because we now support GSSAPI, or this only happens if
libpq asks for GSSAPI?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'

2019-12-20 Thread Tom Lane
Mark Lorenz  writes:
> while preparing the patch for the Commitfest, I found a bug in the 
> to_char() function that is quite correlated with this issue:

> SELECT to_char('1997-02-01'::date, 'YYYY-WW-D')

> returns: 1997-05-7 -> which is ok, I believe. Feb 1st was on Saturday, 
> so counting from Sundays, it was day 7 of week 5.

> SELECT to_char('1997-02-03'::date, 'YYYY-WW-D')

> returns: 1997-05-2 -> This cannot be.

Why not?  These format codes are specified as

D   day of the week, Sunday (1) to Saturday (7)
WW  week number of year (1–53) (the first week starts on the first day of the year)

I don't see anything there that says that "D" is correlated with "WW".
We do have a connection between "ID" and "IW", so that ID ought to
specify a day within an IW week, but there's no connection between "D"
and either "W" or "WW" week numbering.  It's a day of the week, as
per the regular calendar.  Trying to define it as something else is
just going to break stuff.
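
A small illustration of that reading of "D", using the dates from upthread
(1997-02-01 was a Saturday, 1997-02-03 a Monday):

    SELECT d,
           to_char(d, 'D')     AS d_code,  -- Sunday=1 .. Saturday=7
           extract(dow FROM d) AS dow      -- Sunday=0 .. Saturday=6
    FROM (VALUES (date '1997-02-01'), (date '1997-02-03')) AS t(d);
    -- d_code is 7 and 2: the weekday of the regular calendar, independent of WW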

The only way to make "D" as it stands compatible with a week-numbering
system is to ensure that your weeks always start on Sundays, that is,
just as confusing as ISO weeks but slightly differently confusing.

Perhaps it would be worth inventing format codes that do have the
same relationship to "W" and/or "WW" as "ID" does to "IW".  But
repurposing "D" for that is a bad idea.

regards, tom lane




Re: Optimizing TransactionIdIsCurrentTransactionId()

2019-12-20 Thread Simon Riggs
On Fri, 20 Dec 2019 at 13:07, Robert Haas  wrote:

>
> > Read only transactions should have a fast path thru this function since
> > they frequently read more data than write transactions.
>
> With regard to this point, I second Tomas's comments.
>

I also agree with Tomas' comments. I am explaining *why* it will be an
improvement, expanding on my earlier notes.

This function is called extremely frequently in query processing and is
fairly efficient. I'm pointing out cases where making it even quicker makes
sense.

The TopXid is assigned in very few calls. Write transactions perform
searching before the xid is assigned, so UPDATE and DELETE transactions
will call this with TopXid unassigned in many small transactions, e.g.
simple pgbench. In almost all read-only cases and especially on standby
nodes there will be no TopXid assigned, so I estimate that 90-99% of calls
will be made with TopXid invalid. In this case it makes a great deal of
sense to have a fastpath out of this function, by testing
TransactionIdIsNormal(topxid).

I also now notice that on entry the xid provided is hardly ever
InvalidTransactionId. Once, it might have been called repeatedly with
FrozenTransactionId, but that is no longer the case since we no longer
reset the xid on freezing. So the test for TransactionIdIsNormal(xid)
appears to need rethinking since it is now mostly redundant.

So if adding a test is considered heavy, I would swap the test for
TransactionIdIsNormal(xid) and replace with a test for
TransactionIdIsNormal(topxid).

Such a frequently used function is worth discussing, just as we previously
optimised TransactionIdIsInProgress() and MVCC visibility routines, where
we discussed what the most common routes through the functions were before
deciding how to optimize them.

-- 
Simon Riggs                http://www.2ndQuadrant.com/

PostgreSQL Solutions for the Enterprise


Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'

2019-12-20 Thread Tom Lane
Mark Lorenz  writes:
> I got the advice to split the patches for:
> - fixing the to_char() function
> - changing the to_date()/to_timestamp() behaviour
> So I appended the split patches.

I'm a bit skeptical of the premise here.  The fine manual says

In to_timestamp and to_date, weekday names or numbers (DAY, D, and
related field types) are accepted but are ignored for purposes of
computing the result. The same is true for quarter (Q) fields.
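
For instance, varying the D field does not change the computed date, exactly
as documented (a quick check of the behavior, not an endorsement of it):

    -- Both calls produce the same date, because D is accepted but ignored.
    SELECT to_date('1997-07-1', 'YYYY-WW-D') AS with_d1,
           to_date('1997-07-5', 'YYYY-WW-D') AS with_d5;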

You appear to be trying to change that, but it's not at all clear
what behavior you're changing it to, or whether the result is going
to be any more sensible than it was before.  In any case, this is
certainly not a "bug fix", because the code is working as documented.
It's a redefinition, and you haven't specified the new definition.

Another point is that these functions are meant to be Oracle-compatible,
so I wonder what Oracle does in not-terribly-well-defined cases like
these.

regards, tom lane




vacuum verbose detail logs are unclear (show debug lines at *start* of each stage?)

2019-12-20 Thread Justin Pryzby
This is a usability complaint.  If one knows enough about vacuum and/or
logging, I'm sure there's no issue.

Right now vacuum shows:

|  1  postgres=# VACUUM t; 
|  2  DEBUG:  vacuuming "public.t"
|  3  DEBUG:  scanned index "t_i_key" to remove 999 row versions
|  4  DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
|  5  DEBUG:  "t": removed 999 row versions in 5 pages
|  6  DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
|  7  DEBUG:  index "t_i_key" now contains 999 row versions in 11 pages
|  8  DETAIL:  999 index row versions were removed.
|  9  0 index pages have been deleted, 0 are currently reusable.
| 10  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
| 11  DEBUG:  "t": found 999 removable, 999 nonremovable row versions in 9 out of 9 pages
| 12  DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 130886944
| 13  There were 0 unused item identifiers.
| 14  Skipped 0 pages due to buffer pins, 0 frozen pages.
| 15  0 pages are entirely empty.
| 16  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
| 17  VACUUM

2: line showing action to be performed on table - good
3-4: line showing action which WAS performed on index, but only after it's done
5-6: line showing actions conditionally performed on table, but only after it's done
7-10: line showing status of an index, but only after it's done
11-16: line showing status of table; unconditional; good

I'm proposing to output a message before 3, 5, and 7, like in the attached.
The messages are just placeholders; if there's any agreement this is an
improvement, I'll accept suggestions for better content.

This is confusing, at least to me.  For example, the rusage output is shown
numerous times (depending on the number of indices and dead tuples).  I (at
least) tend to think that a past-tense message followed by an "elapsed" time
indicates that the process is nearly done, and maybe just waiting on an fsync or
maybe some other synchronization.  If one sees multiple indexes output quickly,
you can infer that the process is looping over them.  When it's done with the
indexes, it starts another phase, but doesn't say that (or what).
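
(As an aside, not part of this proposal: the phase that the verbose output
never announces is at least visible from another session via the progress
view.)

    -- Shows which phase a running VACUUM is currently in.
    SELECT pid, relid::regclass, phase, heap_blks_scanned, heap_blks_total
    FROM pg_stat_progress_vacuum;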

#2 "vacuuming" line shows no rusage, and it's not very clear that the rusage
"DETAIL" in line#3 (in this example) applies to line#2 "scanning index", so
it's easy to think that the output is reporting that the whole command took
0.00s elapsed, which is irritating when the command hasn't yet finished.

Another example from CSV logs, to show log times (keep in mind that VACUUM
VERBOSE is less clear than the logfile, which has the advantage of a separate
column for DETAIL).

|  1   2019-12-16 09:59:22.568+10 | vacuuming "public.alarms" |
|  2   2019-12-16 09:59:47.662+10 | scanned index "alarms_active_idx" to remove 211746 row versions | CPU: user: 0.22 s, system: 0.00 s, elapsed: 0.46 s
|  3   2019-12-16 09:59:48.036+10 | scanned index "alarms_displayable_idx" to remove 211746 row versions | CPU: user: 0.22 s, system: 0.00 s, elapsed: 0.37 s
|  4   2019-12-16 09:59:48.788+10 | scanned index "alarms_raw_current_idx" to remove 211746 row versions | CPU: user: 0.28 s, system: 0.00 s, elapsed: 0.75 s
|  5   2019-12-16 09:59:51.379+10 | scanned index "alarms_alarm_id_linkage_back_idx" to remove 211746 row versions | CPU: user: 1.04 s, system: 0.05 s, elapsed: 2.59 s
|  6   2019-12-16 09:59:53.75+10  | scanned index "alarms_alarm_id_linkage_idx" to remove 211746 row versions | CPU: user: 0.99 s, system: 0.08 s, elapsed: 2.37 s
|  7   2019-12-16 09:59:56.473+10 | scanned index "alarms_pkey" to remove 211746 row versions | CPU: user: 1.11 s, system: 0.08 s, elapsed: 2.72 s
|  8   2019-12-16 10:00:35.142+10 | scanned index "alarms_alarm_time_idx" to remove 211746 row versions | CPU: user: 0.94 s, system: 0.08 s, elapsed: 38.66 s
|  9   2019-12-16 10:00:37.002+10 | scanned index "alarms_alarm_clear_time_idx" to remove 211746 row versions | CPU: user: 0.72 s, system: 0.08 s, elapsed: 1.85 s
| 10   2019-12-16 10:03:57.42+10  | "alarms": removed 211746 row versions in 83923 pages | CPU: user: 10.24 s, system: 2.28 s, elapsed: 200.41 s
| 11   2019-12-16 10:03:57.425+10 | index "alarms_active_idx" now contains 32 row versions in 1077 pages | 57251 index row versions were removed. +
| 13   2019-12-16 10:03:57.426+10 | index "alarms_raw_current_idx" now contains 1495 row versions in 1753 pages | 96957 index row versions were removed. +
| 15   2019-12-16 10:03:57.426+10 | index "alarms_displayable_idx" now contains 32 row versions in 1129 

Re: Clarifying/rationalizing Vars' varno/varattno/varnoold/varoattno

2019-12-20 Thread Robert Haas
On Fri, Dec 20, 2019 at 11:13 AM Tom Lane  wrote:
> The alternatives that seem plausible at this point are
>
> (1) Create some sort of wrapper node indicating "the contents of this
> expression might be replaced by NULL".  This is basically what the
> planner's PlaceHolderVars do, so maybe we'd just be talking about
> introducing those at some earlier stage.
>
> (2) Explicitly mark Vars as being nullable by some outer join.  I think
> we could probably get this down to one additional integer field in
> struct Var, so it wouldn't be too much of a penalty.
>
> The wrapper approach is more general since you can wrap something
> that's not necessarily a plain Var; but it's also bulkier and so
> probably a bit less efficient.  I'm not sure which idea I like better.

I'm not sure which is better, either, although I would like to note in
passing that the name PlaceHolderVar seems to me to be confusing and
terrible. It took me years to understand it, and I've never been
totally sure that I actually do. Why is it not called
MightBeNullWrapper or something?

If you chose to track it in the Var, maybe you could do better than to
track whether it might have gone to NULL. For example, perhaps you
could track the set of baserels that are syntactically below the Var
location and have the Var on the nullable side of a join, rather than
just have a Boolean that indicates whether there are any. I don't know
whether the additional effort would be worth the cost of maintaining
the information, but it seems like it might be.

> With either approach, we could either make parse analysis inject the
> nullability markings, or wait to do it in the planner.  On a purely
> abstract system structural level, I like the former better: it is
> exactly the province of parse analysis to decide what are the semantics
> of what the user typed, and surely what a Var means is part of that.
> OTOH, if we do it that way, the planner potentially has to rearrange the
> markings after it does join strength reduction; so maybe it's best to
> just wait till after that planning phase to address this at all.
>
> Any thoughts about that?

Generally, I like the idea of driving this off the parse tree, because
it seems to me that, ultimately, whether a Var is *potentially*
nullable or not depends on the query as provided by the user. And, if
we replan the query, these determinations don't change, at least as
long as they are only driven by the query syntax and not, say,
attisnull or opclass details. It would be nice not to redo the work
unnecessarily. However, that seems to require some way of segregating
the information we derive as a preliminary and syntactical judgement
from subsequent inferences made during query planning, because the
latter CAN change during replanning.

It might be useful to track a 'Relids' with each Var rather than just a 'bool'. In
other words, based on where the reference to the Var is in the
original query text, figure out the set of joins where (1) the Var is
syntactically above the join and (2) on the nullable side, and then
put the relations on the other sides of those joins into the Relids.
Then if you later determine that A LEFT JOIN B actually can't make
anything go to null, you can just ignore the presence of A in this set
for the rest of planning. I feel like this kind of idea might have
other applications too, although I admit that it also has a cost.
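
To make that concrete, a rough sketch of the idea - the field and helper names
below are invented purely for illustration, nothing like this exists today:

/* hypothetical addition to struct Var in primnodes.h */
    Relids      varnullableby;  /* base relids on the far side of outer joins
                                 * syntactically below this Var's location,
                                 * i.e. rels whose join could null this Var */

/*
 * Then, if the planner later proves that "A LEFT JOIN B" is really an
 * inner join, it can simply drop A's relid from the set:
 */
    var->varnullableby = bms_del_member(var->varnullableby, a_relid);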

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Clarifying/rationalizing Vars' varno/varattno/varnoold/varoattno

2019-12-20 Thread Tom Lane
I wrote:
> I started to think a little harder about the rough ideas I sketched
> yesterday in [1] about making the planner deal with outer joins in
> a less ad-hoc manner.
> [1] https://www.postgresql.org/message-id/7771.1576452845%40sss.pgh.pa.us

After further study, the idea of using join alias vars to track
outer-join semantics basically doesn't work at all.  Join alias vars in
their current form (ie, references to the output columns of a JOIN RTE)
aren't suitable for the purpose of representing possibly-nulled inputs
to that same RTE.  There are two big stumbling blocks:

* In the presence of JOIN USING, we don't necessarily have a JOIN output
column that is equivalent to either input column.  The output is
definitely not equal to the nullable side of an OJ, since it won't go to
NULL; and it might not be equivalent to the non-nullable side either,
because JOIN USING might've coerced it to some common datatype.

* We also don't have any output column that could represent a whole-row
reference to either input table.  I thought about representing that with
a RowExpr of join output Vars, but that fails to preserve the existing
semantics: a whole-row reference to the nullable side goes to NULL, not
to a row of NULLs, when we're null-extending the join.

So that kind of crashed and burned.  We could maybe fake it by inventing
some new conventions about magic attnums of the join RTE that correspond
to the values we want, but that seems really messy and error-prone.

The alternatives that seem plausible at this point are

(1) Create some sort of wrapper node indicating "the contents of this
expression might be replaced by NULL".  This is basically what the
planner's PlaceHolderVars do, so maybe we'd just be talking about
introducing those at some earlier stage.

(2) Explicitly mark Vars as being nullable by some outer join.  I think
we could probably get this down to one additional integer field in
struct Var, so it wouldn't be too much of a penalty.

The wrapper approach is more general since you can wrap something
that's not necessarily a plain Var; but it's also bulkier and so
probably a bit less efficient.  I'm not sure which idea I like better.

With either approach, we could either make parse analysis inject the
nullability markings, or wait to do it in the planner.  On a purely
abstract system structural level, I like the former better: it is
exactly the province of parse analysis to decide what are the semantics
of what the user typed, and surely what a Var means is part of that.
OTOH, if we do it that way, the planner potentially has to rearrange the
markings after it does join strength reduction; so maybe it's best to
just wait till after that planning phase to address this at all.

Any thoughts about that?

Anyway, I had started to work on getting parse analysis to label
outer-join-nullable Vars properly, and soon decided that no matter how
we do it, there's not enough information available at the point where
parse analysis makes a Var.  The referenced RTE is not, in itself,
enough info, and I don't think we want to decorate RTEs with more info
that's only needed during parse analysis.  What would be saner is to add
any extra info to the ParseNamespaceItem structs.  But that requires
some refactoring to allow the ParseNamespaceItems, not just the
referenced RTEs, to be passed down through Var lookup/construction.
So attached is a patch that refactors things that way.  As proof of
concept, I added the rangetable index to ParseNamespaceItem, and used
that to get rid of the RTERangeTablePosn() searches that we formerly had
in a bunch of places.  Now, RTERangeTablePosn() isn't likely to be all
that expensive, but still this should be a little faster and cleaner.
Also, I was able to confine the fuzzy-lookup heuristic stuff to within
parse_relation.c instead of letting it bleed out to the rest of the
parser.
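
For reference, the shape of the struct change is roughly as below (the field
name here is illustrative; see the attached patch for the real thing):

typedef struct ParseNamespaceItem
{
    RangeTblEntry *p_rte;       /* The relation's rangetable entry */
    int         p_rtindex;      /* NEW: its index in pstate->p_rtable, so
                                 * callers need not search with
                                 * RTERangeTablePosn() anymore */
    bool        p_rel_visible;  /* Relation name is visible? */
    bool        p_cols_visible; /* Column names visible as unqualified refs? */
    bool        p_lateral_only; /* Is only visible to LATERAL expressions? */
    bool        p_lateral_ok;   /* If so, does join type allow use? */
} ParseNamespaceItem;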

This seems to me to be good cleanup regardless of whether we ever
ask parse analysis to label outer-join-nullable Vars.  So, barring
objection, I'd like to push it soon.

regards, tom lane

diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 85d7a96..0656279 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -1344,7 +1344,7 @@ transformValuesClause(ParseState *pstate, SelectStmt *stmt)
 	int			sublist_length = -1;
 	bool		lateral = false;
 	RangeTblEntry *rte;
-	int			rtindex;
+	ParseNamespaceItem *nsitem;
 	ListCell   *lc;
 	ListCell   *lc2;
 	int			i;
@@ -1516,15 +1516,15 @@ transformValuesClause(ParseState *pstate, SelectStmt *stmt)
 	  NULL, lateral, true);
 	addRTEtoQuery(pstate, rte, true, true, true);
 
-	/* assume new rte is at end */
-	rtindex = list_length(pstate->p_rtable);
-	Assert(rte == rt_fetch(rtindex, pstate->p_rtable));
+	/* grab the namespace item made by addRTEtoQuery */
+	nsitem = (ParseNamespaceItem *) llast(pstate->p_namespace);
+	Assert(rte == nsitem->p_rte);
 
 	/*
 	 * Generate a 

Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'

2019-12-20 Thread Mark Lorenz

Hi,

I got the advice to split the patches for:
- fixing the to_char() function
- changing the to_date()/to_timestamp() behaviour

So I appended the split patches.

Kind regards,
Mark Lorenz
From 4e35bd88bef1916e7d11ad0776b3075e3183f7d0 Mon Sep 17 00:00:00 2001
From: Mark Lorenz 
Date: Fri, 20 Dec 2019 14:30:41 +0100
Subject: [PATCH] change to_date()/to_timestamp() behaviour with 'YYYY-WW-D'
 pattern

Currently, the D part is ignored with the non-ISO week pattern.  Now the D
part works as well.

Added regression tests
---
 src/backend/utils/adt/formatting.c | 44 +++---
 src/backend/utils/adt/timestamp.c  | 84 ++
 src/include/utils/timestamp.h  |  6 ++
 src/test/regress/expected/horology.out | 26 +++-
 src/test/regress/sql/horology.sql  |  4 ++
 5 files changed, 156 insertions(+), 8 deletions(-)

diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 8fcbc22..603c51c 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -4494,21 +4494,51 @@ do_to_timestamp(text *date_txt, text *fmt, bool std,
 			fmask |= DTK_DATE_M;
 		}
 		else
-			tmfc.ddd = (tmfc.ww - 1) * 7 + 1;
+		{
+			/*
+			 * If tmfc.d is not set, then the date is left at the beginning of
+			 * the week (Sunday).
+			 */
+			if (tmfc.d)
+				weekdate2date(tmfc.ww, tmfc.d, &tm->tm_year, &tm->tm_mon, &tm->tm_mday);
+			else
+				week2date(tmfc.ww, &tm->tm_year, &tm->tm_mon, &tm->tm_mday);
+			fmask |= DTK_DATE_M;
+		}
+	}
+
+	if (tmfc.mm)
+	{
+		tm->tm_mon = tmfc.mm;
+		fmask |= DTK_M(MONTH);
 	}
 
 	if (tmfc.w)
-		tmfc.dd = (tmfc.w - 1) * 7 + 1;
+	{
+		/* if tmfc.mm is set, the date can be calculated */
+		if (tmfc.mm)
+		{
+			/*
+			 * If tmfc.d is not set, then the date is left at the beginning of
+			 * the week (Sunday).
+			 */
+			if (tmfc.d)
+				monthweekdate2date(tmfc.mm, tmfc.w, tmfc.d, &tm->tm_year, &tm->tm_mon, &tm->tm_mday);
+			else
+				monthweek2date(tmfc.mm, tmfc.w, &tm->tm_year, &tm->tm_mon, &tm->tm_mday);
+
+			fmask |= DTK_DATE_M;
+			tmfc.dd = tm->tm_mday;
+		}
+		else
+			tmfc.dd = (tmfc.w - 1) * 7 + 1;
+	}
+
 	if (tmfc.dd)
 	{
 		tm->tm_mday = tmfc.dd;
 		fmask |= DTK_M(DAY);
 	}
-	if (tmfc.mm)
-	{
-		tm->tm_mon = tmfc.mm;
-		fmask |= DTK_M(MONTH);
-	}
 
 	if (tmfc.ddd && (tm->tm_mon <= 1 || tm->tm_mday <= 1))
 	{
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 945b8f8..3e2f499 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -4264,6 +4264,90 @@ interval_trunc(PG_FUNCTION_ARGS)
 	PG_RETURN_INTERVAL_P(result);
 }
 
+/* monthweek2j()
+ *
+ *	Return the Julian day which corresponds to the first day (Sunday) of the given month/year and week.
+ *	Julian days are used to convert between ISO week dates and Gregorian dates.
+ */
+int
+monthweek2j(int year, int month, int week)
+{
+	int			day0,
+day1;
+
+	/* first day of given month */
+	day1 = date2j(year, month, 1);
+
+	// day0 == offset to first day of week (Sunday)
+	day0 = j2day(day1);
+
+	return ((week - 1) * 7) + (day1 - day0);
+}
+
+/* monthweek2date()
+ * Convert week of month and year number to date.
+ */
+void
+monthweek2date(int month, int wom, int *year, int *mon, int *mday)
+{
+	j2date(monthweek2j(*year, month, wom), year, mon, mday);
+}
+
+/* monthweekdate2date()
+ *
+ *	Convert a week of month date (year, month, week of month) into a Gregorian date.
+ *	Gregorian day of week sent so weekday strings can be supplied.
+ *	Populates year, mon, and mday with the correct Gregorian values.
+ */
+void
+monthweekdate2date(int month, int wom, int wday, int *year, int *mon, int *mday)
+{
+	int 		jday;
+
+	jday = monthweek2j(*year, month, wom);
+	jday += wday - 1;
+
+	j2date(jday, year, mon, mday);
+}
+
+/* week2j()
+ *
+ *	Return the Julian day which corresponds to the first day (Sunday) of the given year and week.
+ *	Julian days are used to convert between ISO week dates and Gregorian dates.
+ */
+int
+week2j(int year, int week)
+{
+	/* calculating the Julian Day from first month of current year */
+	return monthweek2j(year, 1, week);
+}
+
+/* week2date()
+ * Convert week of year number to date.
+ */
+void
+week2date(int woy, int *year, int *mon, int *mday)
+{
+	j2date(week2j(*year, woy), year, mon, mday);
+}
+
+/* weekdate2date()
+ *
+ *	Convert a week date (year, week) into a Gregorian date.
+ *	Gregorian day of week sent so weekday strings can be supplied.
+ *	Populates year, mon, and mday with the correct Gregorian values.
+ */
+void
+weekdate2date(int woy, int wday, int *year, int *mon, int *mday)
+{
+	int			jday;
+
+	jday = week2j(*year, woy);
+	jday += wday - 1;
+
+	j2date(jday, year, mon, mday);
+}
+
 /* isoweek2j()
  *
  *	Return the Julian day which corresponds to the first day (Monday) of the given ISO 8601 year and week.
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index 7652b41..4c417fc 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -100,6 

Re: backup manifests

2019-12-20 Thread Robert Haas
On Fri, Dec 20, 2019 at 8:24 AM Suraj Kharage
 wrote:
> Thank you for review comments.

Thanks for the new version.

+  --verify-backup 

Whitespace.

+struct manifesthash_hash *hashtab;

Uh, I had it in mind that you would nuke this line completely, not
just remove "typedef" from it. You shouldn't need a global variable
here.

+ if (buf == NULL)

pg_malloc seems to have an internal check such that it never returns
NULL. I don't see anything like this test in other callers.

The order of operations in create_manifest_hash() seems unusual:

+ fd = open(manifest_path, O_RDONLY, 0);
+ if (fstat(fd, &stat))
+ buf = pg_malloc(stat.st_size);
+ hashtab = manifesthash_create(1024, NULL);
...
+ entry = manifesthash_insert(hashtab, filename, &found);
...
+ close(fd);

I would have expected open-fstat-read-close to be consecutive, and the
manifesthash stuff all done afterwards. In fact, it seems like reading
the file could be a separate function.
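
Something along these lines is what I'd expect - just a sketch, with
illustrative names and error handling, not the exact code to use:

/* needs <fcntl.h>, <stdlib.h>, <sys/stat.h>, <unistd.h>,
 * "common/logging.h" and "common/fe_memutils.h" */
static char *
read_whole_manifest(const char *manifest_path, size_t *len)
{
    int         fd;
    struct stat st;
    char       *buf;

    fd = open(manifest_path, O_RDONLY, 0);
    if (fd < 0)
    {
        pg_log_error("could not open file \"%s\": %m", manifest_path);
        exit(1);
    }
    if (fstat(fd, &st) != 0)
    {
        pg_log_error("could not stat file \"%s\": %m", manifest_path);
        exit(1);
    }
    buf = pg_malloc(st.st_size + 1);
    if (read(fd, buf, st.st_size) != st.st_size)
    {
        pg_log_error("could not read file \"%s\": %m", manifest_path);
        exit(1);
    }
    buf[st.st_size] = '\0';
    close(fd);

    *len = (size_t) st.st_size;
    return buf;
}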

+ if (strncmp(checksum, "SHA256", 6) == 0)

This isn't really right; it would give a false match if we had a
checksum algorithm with a name like SHA2560 or SHA256C or
SHA256ExceptWayBetter. The right thing to do is find the colon first,
and then probably overwrite it with '\0' so that you have a string
that you can pass to parse_checksum_algorithm().
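
In other words, roughly this (variable names illustrative; I'm assuming
parse_checksum_algorithm() hands back the ChecksumAlgorithm enum):

char       *colon = strchr(checksum_with_type, ':');

if (colon == NULL)
    pg_log_error("invalid checksum entry in manifest: \"%s\"",
                 checksum_with_type);
else
{
    *colon = '\0';              /* now just the algorithm name */
    checksum_type = parse_checksum_algorithm(checksum_with_type);
    checksum = colon + 1;       /* the digest itself follows the colon */
}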

+ /*
+ * we don't have checksum type in the header, so need to
+ * read through the first file enttry to find the checksum
+ * type for the manifest file and initilize the checksum
+ * for the manifest file itself.
+ */

This seems to be proceeding on the assumption that the checksum type
for the manifest itself will always be the same as the checksum type
for the first file in the manifest. I don't think that's the right
approach. I think the manifest should always have a SHA256 checksum,
regardless of what type of checksum is used for the individual files
within the manifest. Since the volume of data in the manifest is
presumably very small compared to the size of the database cluster
itself, I don't think there should be any performance problem there.

+ filesize = atol(size);

Using strtol() would allow for some error checking.
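
For example (sketch only; needs <errno.h> and <stdlib.h>):

char       *endptr;

errno = 0;
filesize = strtol(size, &endptr, 10);
if (errno != 0 || endptr == size || *endptr != '\0' || filesize < 0)
    pg_log_error("invalid file size \"%s\" in manifest", size);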

+ * Increase the checksum by its lable length so that we can
+ checksum = checksum + checksum_lable_length;

Spelling.

+ pg_log_error("invalid record found in \"%s\"", manifest_path);

Error message needs work.

+VerifyBackup(void)
+create_manifest_hash(char *manifest_path)
+nextLine(char *buf)

Your function names should be consistent with the surrounding style,
and with each other, as far as possible. Three different conventions
within the same patch and source file seems over the top.

Also keep in mind that you're not writing code in a vacuum. There's a
whole file of code here, and around that, a whole project.
scan_data_directory() is a good example of a function whose name is
clearly too generic. It's not a general-purpose function for scanning
the data directory; it's specifically a support function for verifying
a backup. Yet, the name gives no hint of this.

+verify_file(struct dirent *de, char fn[MAXPGPATH], struct stat st,
+ char relative_path[MAXPGPATH], manifesthash_hash *hashtab)

I think I commented on the use of char[] parameters in my previous review.

+ /* Skip backup manifest file. */
+ if (strcmp(de->d_name, "backup_manifest") == 0)
+ return;

Still looks like this will be skipped at any level of the directory
hierarchy, not just the top. And why are we skipping backup_manifest
here but pg_wal in scan_data_directory? That's a rhetorical question,
because I know the answer: verify_file() is only getting called for
files, so you can't use it to skip directories. But that's not a good
excuse for putting closely-related checks in different parts of the
code. It's just going to result in the checks being inconsistent and
each one having its own bugs that have to be fixed separately from the
other one, as here. Please try to reorganize this code so that it can
be done in a consistent way.

I think this is related to the way you're traversing the directory
tree, which somehow looks a bit awkward to me. At the top of
scan_data_directory(), you've got code that uses basedir and
subdirpath to construct path and relative_path. I was initially
surprised to see that this was the job of this function, rather than
the caller, but then I thought: well, as long as it makes life easy
for the caller, it's probably fine. However, I notice that the only
non-trivial caller is the scan_data_directory() itself, and it has to
go and construct newsubdirpath from subdirpath and the directory name.

It seems to me that this would get easier if you defined
scan_data_directory() -- or whatever we end up calling it -- to take
two pathname-related arguments:

- basepath, which would be $PGDATA and would never change as we
recurse down, so same as what you're now calling basedir
- pathsuffix, which would be an empty string at the top level and at
each recursive level we'd add a slash and then de->d_name.

So at the 

Re: More issues with expressions always false (no patch)

2019-12-20 Thread Andreas Karlsson

On 12/20/19 1:54 AM, Andreas Karlsson wrote:

On 12/20/19 1:01 AM, Ranier Vilela wrote:
> First case:

Third case:
src\backend\utils\cache\relcache.c (line 5190)
if (relation->rd_pubactions)

It will never be executed, because if relation->rd_pubactions is 
true, the function returns on line 5154.


I have not looked into this one in detail, but the free at line 5192 
looks like potentially dead code.


I have looked at it now and it seems like this code has been dead since 
the function was originally implemented in 665d1fad99e.


Peter, what do you think?

Andreas
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 50f8912c13..1ae41f776d 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -5187,12 +5187,6 @@ GetRelationPublicationActions(Relation relation)
 			break;
 	}
 
-	if (relation->rd_pubactions)
-	{
-		pfree(relation->rd_pubactions);
-		relation->rd_pubactions = NULL;
-	}
-
 	/* Now save copy of the actions in the relcache entry. */
 	oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
 	relation->rd_pubactions = palloc(sizeof(PublicationActions));


Re: MarkBufferDirtyHint() and LSN update

2019-12-20 Thread Antonin Houska
Antonin Houska  wrote:

> Michael Paquier  wrote:
> 
> > On Mon, Nov 11, 2019 at 10:03:14AM +0100, Antonin Houska wrote:
> > > This looks good to me.
> > 
> > Actually, no, this is not good.  I have been studying more the patch,
> > and after stressing more this code path with a cluster having
> > checksums enabled and shared_buffers at 1MB, I have been able to make
> > a couple of page's LSNs go backwards with pgbench -s 100.  The cause
> > was simply that the page got flushed with a newer LSN than what was
> > returned by XLogSaveBufferForHint() before taking the buffer header
> > lock, so updating only the LSN for a non-dirty page was simply
> > guarding against that.
> 
> Interesting. Now that I know about the problem, I could have reproduced it
> using gdb: MarkBufferDirtyHint() was called by 2 backends concurrently in such
> a way that the first backend generates the LSN, but before it manages to
> assign it to the page, another backend generates another LSN and sets it.
> 
> Can't we just apply the attached diff on the top of your patch?

I wanted to register the patch for the next CF so it's not forgotten, but see
it's already there. Why have you set the status to "withdrawn"?

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com




Re: backup manifests

2019-12-20 Thread Suraj Kharage
Fixed some typos in attached v5-0002 patch. Please consider this patch for
review.

On Fri, Dec 20, 2019 at 6:54 PM Suraj Kharage <
suraj.khar...@enterprisedb.com> wrote:

> Thank you for review comments.
>
> On Thu, Dec 19, 2019 at 2:54 AM Robert Haas  wrote:
>
>> On Tue, Dec 17, 2019 at 12:54 AM Suraj Kharage
>>  wrote:
>> > I have implemented the simplehash in backup validator patch as Robert
>> suggested. Please find attached 0002 patch for the same.
>> >
>> > kindly review and let me know your thoughts.
>>
>> +#define CHECKSUM_LENGTH 256
>>
>> This seems wrong. Not all checksums are the same length, and none of
>> the ones we're using are 256 bytes in length, and if we've got to have
>> a constant someplace for the maximum checksum length, it should
>> probably be in the new header file, not here. But I don't think we
>> should need this in the first place; see comments below about how to
>> revise the parsing of the manifest file.
>>
>
> I agree. Removed.
>
+char    filetype[10];
>>
>> A mysterious 10-byte field with no comments explaining what it
>> means... and the same magic number 10 appears in at least one other
>> place in the patch.
>>
>
> with current logic, we don't need this anymore.
> I have removed the filetype from the structure as we are not doing any
> comparison anywhere.
>
>
>>
>> +typedef struct manifesthash_hash *hashtab;
>>
>> This declares a new *type* called hashtab, not a variable called
>> hashtab. The new type is not used anywhere, but later, you have
>> several variables of the same type that have this name. Just remove
>> this: it's wrong and unused.
>>
>>
> corrected.
>
>
>> +static enum ChecksumAlgorithm checksum_type = MC_NONE;
>>
>> Remove "enum". Not needed, because you have a typedef for it in the
>> header, and not per style.
>>
>> corrected.
>
>
>> +static  manifesthash_hash *create_manifest_hash(char
>> manifest_path[MAXPGPATH]);
>>
>> Whitespace is wrong. The whole patch needs a visit from pgindent with
>> a properly-updated typedefs.list.
>>
>> Also, you will struggle to find anywhere else in the code base where
>> pass a character array as a function argument. I don't know why this
>> isn't just char *.
>>
>
> Corrected.
>
>
>>
>> +if(verify_backup)
>>
>> Whitespace wrong here, too.
>>
>>
> Fixed
>
>
>>
>> It's also pretty unhelpful, here and elsewhere, to refer to "the hash
>> table" as if there were only one, and as if the reader were supposed
>> to know something about it when you haven't told them anything about
>> it.
>>
>> +if (!entry->matched)
>> +{
>> +pg_log_info("missing file: %s", entry->filename);
>> +}
>> +
>>
>> The braces here are not project style. We usually omit braces when
>> only a single line of code is present.
>>
>
> fixed
>
>
>>
>> I think some work needs to be done to standardize and improve the
>> messages that get produced here.  You have:
>>
>> 1. missing file: %s
>> 2. duplicate file present: %s
>> 3. size changed for file: %s, original size: %d, current size: %zu
>> 4. checksum difference for file: %s
>> 5. extra file found: %s
>>
>> I suggest:
>>
>> 1. file \"%s\" is present in manifest but missing from the backup
>> 2. file \"%s\" has multiple manifest entries
>> (this one should probably be pg_log_error(), not pg_log_info(), as it
>> represents a corrupt-manifest problem)
>> 3. file \"%s" has size %lu in manifest but size %lu in backup
>> 4. file \"%s" has checksum %s in manifest but checksum %s in backup
>> 5. file \"%s" is present in backup but not in manifest
>>
>
> Corrected.
>
>
>>
>> Your patch actually doesn't compile on my system, because for the
>> third message above, it uses %zu to print the size. But %zu is for
>> size_t, not off_t. I went looking for other places in the code where
>> we print off_t; based on that, I think the right thing to do is to
>> print it using %lu and write (unsigned long) st.st_size.
>>
>
> Corrected.
>
+char    file_checksum[256];
> +char    header[1024];
>>
>> More arbitrary constants.
>
>
>
>>
>> +if (!file)
>> +{
>> +pg_log_error("could not open backup_manifest");
>>
>> That's bad error reporting.  See e.g. readfile() in initdb.c.
>>
>
> Corrected.
>
>
>>
>> +if (fscanf(file, "%1023[^\n]\n", header) != 1)
>> +{
>> +pg_log_error("error while reading the header from
>> backup_manifest");
>>
>> That's also bad error reporting. It is only a slight step up from
>> "ERROR: error".
>>
>> And we have another magic number (1023).
>>
>
> With current logic, we don't need this anymore.
>
>
>>
>> +appendPQExpBufferStr(manifest, header);
>> +appendPQExpBufferStr(manifest, "\n");
>> ...
>> +appendPQExpBuffer(manifest, "File\t%s\t%d\t%s\t%s\n", filename,
>> +  filesize, mtime, checksum_with_type);
>>
>> This whole thing seems completely crazy to me. Basically, you're
>> trying to use fscanf() to parse the file. But then, because fscanf()
>> doesn't give you the 

Re: Hooks for session start and end, take two

2019-12-20 Thread Alvaro Herrera
On 2019-Dec-20, Tom Lane wrote:

> Alvaro Herrera  writes:
> > On 2019-Dec-20, Michael Paquier wrote:
> >> The patch has been committed once as of e788bd9, then reverted as of
> >> 9555cc8 because it had a couple of fundamental issues and many people
> >> were not happy with it.
> 
> > Hmm, should we mark the commitfest entry as rejected then?  Having it be
> > marked committed seems pretty confusing.  The next version of the patch
> > would have its own CF entry, I presume.
> 
> RWF seems appropriate.  We haven't rejected the concept altogether,
> AFAICT.

Fair enough.

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Hooks for session start and end, take two

2019-12-20 Thread Tom Lane
Alvaro Herrera  writes:
> On 2019-Dec-20, Michael Paquier wrote:
>> The patch has been committed once as of e788bd9, then reverted as of
>> 9555cc8 because it had a couple of fundamental issues and many people
>> were not happy with it.

> Hmm, should we mark the commitfest entry as rejected then?  Having it be
> marked committed seems pretty confusing.  The next version of the patch
> would have its own CF entry, I presume.

RWF seems appropriate.  We haven't rejected the concept altogether,
AFAICT.

regards, tom lane




Re: Hooks for session start and end, take two

2019-12-20 Thread Alvaro Herrera
On 2019-Dec-20, Michael Paquier wrote:

> On Fri, Dec 20, 2019 at 02:45:26AM +, tsunakawa.ta...@fujitsu.com wrote:
> > I've got interested in this.  What's the current status of this
> > patch?  The CF entry shows it was committed.
> >
> > But I understood not, because the relevant code doesn't appear in
> > HEAD, and Git log shows that it was reverted.  Am I correct? 
> 
> The patch has been committed once as of e788bd9, then reverted as of
> 9555cc8 because it had a couple of fundamental issues and many people
> were not happy with it.

Hmm, should we mark the commitfest entry as rejected then?  Having it be
marked committed seems pretty confusing.  The next version of the patch
would have its own CF entry, I presume.

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'

2019-12-20 Thread Mark Lorenz

Hi,

I fixed the described issue in the to char() function.

The output of the current version is:

postgres=# SELECT to_char('1997-02-01'::date, 'YYYY-WW-D');
  to_char
-----------
 1997-05-7
(1 row)

postgres=# SELECT to_char('1997-02-03'::date, 'YYYY-WW-D');
  to_char
-----------
 1997-05-2
(1 row)

postgres=# SELECT to_char('1997-02-10'::date, 'YYYY-WW-D');
  to_char
-----------
 1997-06-2
(1 row)

As you can see, Feb 3rd - which is two days AFTER Feb 1st - yields a 
week/day combination that is 5 days BEFORE the one for the earlier date, 
which obviously cannot be right. Furthermore, using the Gregorian calendar, 
Feb 3rd is in week 6, so Feb 10th cannot be in week 6 as well.


The bug was that the week day of Jan 1st was not considered in the 
calculation of the week number, so a possible offset was never applied.
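
In essence, the corrected week number is computed relative to the Sunday on 
or before January 1st rather than relative to January 1st itself. Roughly 
like this (the attached patch's date2week() may differ in details):

static int
date2week(int year, int mon, int mday)
{
    int         jday = date2j(year, mon, mday);
    int         jan1 = date2j(year, 1, 1);
    int         jan1_dow = j2day(jan1);     /* 0 = Sunday */

    /* count weeks from the Sunday on or before January 1st */
    return (jday - (jan1 - jan1_dow)) / 7 + 1;
}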


New output:

postgres=# SELECT to_char('1997-02-03'::date, 'YYYY-WW-D');
  to_char
-----------
 1997-06-2
(1 row)

postgres=# SELECT to_char('1997-02-01'::date, 'YYYY-WW-D');
  to_char
-----------
 1997-05-7
(1 row)

postgres=# SELECT to_char('1997-02-10'::date, 'YYYY-WW-D');
  to_char
-----------
 1997-07-2
(1 row)

---

Furthermore, I adjusted the to_date() functionality for the WW-D pattern 
as well. As said before in the thread, I know that ignoring the D part is 
documented behaviour, but I think that if the ISO format recognizes the day 
part, the non-ISO format should as well - especially when the reverse 
operation (to_char()) already does:


Output in the current version:

postgres=# SELECT to_date('2019-1-1', 'YYYY-WW-D');
  to_date
------------
 2019-01-01
(1 row)

postgres=# SELECT to_date('2019-1-2', 'YYYY-WW-D');
  to_date
------------
 2019-01-01
(1 row)

postgres=# SELECT to_date('2019-1-3', 'YYYY-WW-D');
  to_date
------------
 2019-01-01
(1 row)

postgres=# SELECT to_date('2019-1-7', 'YYYY-WW-D');
  to_date
------------
 2019-01-01
(1 row)

postgres=# SELECT to_date('2019-2-1', 'YYYY-WW-D');
  to_date
------------
 2019-01-08
(1 row)

New output:

postgres=# SELECT to_date('2019-1-1', 'YYYY-WW-D');
  to_date
------------
 2018-12-30
(1 row)

postgres=# SELECT to_date('2019-1-2', 'YYYY-WW-D');
  to_date
------------
 2018-12-31
(1 row)

postgres=# SELECT to_date('2019-1-3', 'YYYY-WW-D');
  to_date
------------
 2019-01-01
(1 row)

postgres=# SELECT to_date('2019-1-7', 'YYYY-WW-D');
  to_date
------------
 2019-01-05
(1 row)

postgres=# SELECT to_date('2019-2-1', 'YYYY-WW-D');
  to_date
------------
 2019-01-06
(1 row)

I added the patch as plain text attachment. It contains the code and, of 
course, the regression tests. Some existing tests failed, because they 
worked with the old output. I have changed their expected output.


Hope you'll find it helpful.

Best regards,
Mark LorenzFrom 39b759221a827c7730d940ac14e7a28a7a76 Mon Sep 17 00:00:00 2001
From: Mark Lorenz 
Date: Fri, 20 Dec 2019 14:44:42 +0100
Subject: [PATCH] fix issues with date format YYYY-WW-D in to_date() and
 to_char(); adjusted and added regression tests

---
 src/backend/utils/adt/formatting.c|  46 +++-
 src/backend/utils/adt/timestamp.c | 121 +++
 src/include/utils/timestamp.h |   7 +
 src/test/regress/expected/horology.out|  66 +-
 src/test/regress/expected/timestamp.out   | 238 -
 src/test/regress/expected/timestamptz.out | 242 +-
 src/test/regress/sql/horology.sql |   4 +
 src/test/regress/sql/timestamp.sql|   2 +
 src/test/regress/sql/timestamptz.sql  |   2 +
 9 files changed, 514 insertions(+), 214 deletions(-)

diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 8fcbc22..ea6e45d 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -3017,7 +3017,7 @@ DCH_to_char(FormatNode *node, bool is_interval, TmToChar *in, char *out, Oid col
 break;
 			case DCH_WW:
 sprintf(s, "%0*d", S_FM(n->suffix) ? 0 : 2,
-		(tm->tm_yday - 1) / 7 + 1);
+		date2week(tm->tm_year, tm->tm_mon, tm->tm_mday));
 if (S_THth(n->suffix))
 	str_numth(s, s, S_TH_TYPE(n->suffix));
 s += strlen(s);
@@ -4494,21 +4494,51 @@ do_to_timestamp(text *date_txt, text *fmt, bool std,
 			fmask |= DTK_DATE_M;
 		}
 		else
-			tmfc.ddd = (tmfc.ww - 1) * 7 + 1;
+		{
+			/*
+			 * If tmfc.d is not set, then the date is left at the beginning of
+			 * the week (Sunday).
+			 */
+			if (tmfc.d)
+				weekdate2date(tmfc.ww, tmfc.d, &tm->tm_year, &tm->tm_mon, &tm->tm_mday);
+			else
+				week2date(tmfc.ww, &tm->tm_year, &tm->tm_mon, &tm->tm_mday);
+			fmask |= DTK_DATE_M;
+		}
+	}
+
+	if (tmfc.mm)
+	{
+		tm->tm_mon = tmfc.mm;
+		fmask |= DTK_M(MONTH);
 	}
 
 	if (tmfc.w)
-		tmfc.dd = (tmfc.w - 1) * 7 + 1;
+	{
+		/* if tmfc.mm is set, the date can be calculated */
+		if (tmfc.mm)
+		{
+			/*
+			 * If tmfc.d is not set, then the date is left at the beginning of
+			 * the week (Sunday).
+			 */
+			if (tmfc.d)
+				monthweekdate2date(tmfc.mm, tmfc.w, tmfc.d, &tm->tm_year, 

Re: problem with read-only user

2019-12-20 Thread Tom Lane
ROS Didier  writes:
> I created a read-only role as follows:
> psql -p 5434 kidsdpn03
> CREATE ROLE kidsdpn03_ro PASSWORD 'xxx';
> ALTER ROLE kidsdpn03_ro WITH LOGIN;
> GRANT CONNECT ON DATABASE kidsdpn03 TO kidsdpn03_ro;
> GRANT USAGE ON SCHEMA kidsdpn03 TO kidsdpn03_ro;
> GRANT SELECT ON ALL TABLES IN SCHEMA kidsdpn03 TO kidsdpn03_ro;
> GRANT SELECT ON ALL SEQUENCES IN SCHEMA kidsdpn03 TO kidsdpn03_ro;
> ALTER DEFAULT PRIVILEGES IN SCHEMA kidsdpn03 GRANT SELECT ON TABLES TO 
> kidsdpn03_ro;
> ALTER ROLE kidsdpn03_ro SET search_path TO kidsdpn03;

> but when i create new tables, i don't have read access to those new  tables. 

You only showed us part of what you did ... but IIRC, 
ALTER DEFAULT PRIVILEGES only affects privileges for objects
subsequently made by the same user that issued the command.
(Otherwise it'd be a security issue.)  So maybe you didn't
make the tables as the same user?

regards, tom lane




Re: [PATCH] Increase the maximum value track_activity_query_size

2019-12-20 Thread Bruce Momjian
On Fri, Dec 20, 2019 at 02:35:37PM +0300, Alexey Kondratov wrote:
> On 19.12.2019 20:52, Robert Haas wrote:
> > On Thu, Dec 19, 2019 at 10:59 AM Tom Lane  wrote:
> > > Bruce Momjian  writes:
> > > > Good question.  I am in favor of allowing a larger value if no one
> > > > objects.  I don't think adding the min/max is helpful.
> > > 
> > > The original poster.
> 
> 
> And probably anyone else, who debugs stuck queries of yet another crazy ORM.
> Yes, one could use log_min_duration_statement, but having a possibility to
> directly get it from pg_stat_activity without eyeballing the logs is nice.
> Also, IIRC log_min_duration_statement applies only to completed statements.

Yes, you would need log_statement = true.

> > I think there are pretty obvious performance and memory-consumption
> > penalties to very large track_activity_query_size values.  Who exactly
> > are we really helping if we let them set it to huge values?
> > 
> > (wanders away wondering if we have suitable integer-overflow checks
> > in relevant code paths...)
> 
> 
> The value of pgstat_track_activity_query_size is in bytes, so setting it to
> any value below INT_MAX seems to be safe from that perspective. However,
> being multiplied by NumBackendStatSlots its reasonable value should be far
> below INT_MAX (~2 GB).
> 
> Sincerely, It does not look for me like something badly needed, but still.
> We already have hundreds of GUCs and it is easy for a user to build a
> sub-optimal configuration, so does this overprotection make sense?

I can imagine using larger pgstat_track_activity_query_size values for
data warehouse queries, where they are long and there are only a few of
them.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2019-12-20 Thread Amit Kapila
On Fri, Dec 20, 2019 at 2:00 PM Kyotaro Horiguchi
 wrote:
>
> Hello.
>
> At Fri, 13 Dec 2019 14:46:20 +0530, Amit Kapila  
> wrote in
> > On Wed, Dec 11, 2019 at 11:46 PM Robert Haas  wrote:
> > >
> > > On Mon, Dec 2, 2019 at 3:32 AM Dilip Kumar  wrote:
> > > > I have rebased the patch set on the latest head.
> > >
> > > 0001 looks like a clever approach, but are you sure it doesn't hurt
> > > performance when many small XLOG records are being inserted? I think
> > > XLogRecordAssemble() can get pretty hot in some workloads.
> > >
> > > With regard to 0002, logging a separate WAL record for each
> > > invalidation seems painful; I think most operations that generate
> > > invalidations generate a bunch of them all at once. Perhaps you could
> > > just queue up invalidations as they happen, and then force anything
> > > that's been queued up to be emitted into WAL just before you emit any
> > > WAL record that might need to be decoded.
> > >
> >
> > I feel we can log the invalidations of the entire command at one go if
> > we log at CommandEndInvalidationMessages.  We already have all the
> > invalidations of current command in
> > transInvalInfo->CurrentCmdInvalidMsgs.  This can save us the effort of
> > maintaining a new separate list/queue for invalidations and to a good
> > extent, it will ameliorate your concern of logging each invalidation
> > separately.
>
> I have a question on this. Does that mean that the current logical
> decoder (or reorderbuffer)
>

What does "currently" refer to here?  Is it about HEAD or about the
patch?   Without the patch, we decode only at commit time and by that
time we have all invalidations (logged with commit WAL record), so we
just execute them at each catalog change (see the actions in
REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID).  The patch has to
separately WAL log each invalidation because we can decode the
intermittent changes, so we can't wait till commit.  The above is just
an optimization for the patch.  AFAIK, there is no correctness issue
here, but let me know if you see any.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2019-12-20 Thread Amit Kapila
On Fri, Dec 20, 2019 at 11:47 AM Masahiko Sawada
 wrote:
>
> On Mon, 2 Dec 2019 at 17:32, Dilip Kumar  wrote:
> >
> > On Sun, Dec 1, 2019 at 7:58 AM Michael Paquier  wrote:
> > >
> > > On Fri, Nov 22, 2019 at 01:18:11PM +0530, Dilip Kumar wrote:
> > > > I have rebased the patch on the latest head and also fix the issue of
> > > > "concurrent abort handling of the (sub)transaction." and attached as
> > > > (v1-0013-Extend-handling-of-concurrent-aborts-for-streamin) along with
> > > > the complete patch set.  I have added the version number so that we
> > > > can track the changes.
> > >
> > > The patch has rotten a bit and does not apply anymore.  Could you
> > > please send a rebased version?  I have moved it to next CF, waiting on
> > > author.
> >
> > I have rebased the patch set on the latest head.
>
> Thank you for working on this.
>
> This might have already been discussed but I have a question about the
> changes of logical replication worker. In the current logical
> replication there is a problem that the response time are doubled when
> using synchronous replication because wal senders send changes after
> commit. It's worse especially when a transaction makes a lot of
> changes. So I expected this feature to reduce the response time by
> sending changes even while the transaction is progressing but it
> doesn't seem to be. The logical replication worker writes changes to
> temporary files and applies these changes when the worker received
> commit record (STREAM COMMIT). Since the worker sends the LSN of
> commit record as flush LSN to the publisher after applying all
> changes, the publisher must wait for all changes are applied to the
> subscriber.
>

The main aim of this feature is to reduce apply lag.  Because if we
send all the changes together, it can delay their apply because of
network delay, whereas if most of the changes are already sent, then
we will save the effort of sending the entire data at commit time.
This in itself gives us decent benefits.  Sure, we can further improve
it by having separate workers (dedicated to apply the changes) as you
are suggesting and, in fact, there is a patch for that as well (see the
performance results and bgworker patch at [1]), but if we try to shove in
all the things in one go, then it will be difficult to get this patch
committed (there are already enough things and the patch is quite big
that to get it right takes a lot of energy).  So, the plan is
something like that first we get the basic feature and then try to
improve by having dedicated workers or things like that.  Does this
make sense to you?

[1] - 
https://www.postgresql.org/message-id/8eda5118-2dd0-79a1-4fe9-eec7e334de17%40postgrespro.ru

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: backup manifests

2019-12-20 Thread Suraj Kharage
Thank you for review comments.

On Thu, Dec 19, 2019 at 2:54 AM Robert Haas  wrote:

> On Tue, Dec 17, 2019 at 12:54 AM Suraj Kharage
>  wrote:
> > I have implemented the simplehash in backup validator patch as Robert
> suggested. Please find attached 0002 patch for the same.
> >
> > kindly review and let me know your thoughts.
>
> +#define CHECKSUM_LENGTH 256
>
> This seems wrong. Not all checksums are the same length, and none of
> the ones we're using are 256 bytes in length, and if we've got to have
> a constant someplace for the maximum checksum length, it should
> probably be in the new header file, not here. But I don't think we
> should need this in the first place; see comments below about how to
> revise the parsing of the manifest file.
>

I agree. Removed.

+char    filetype[10];
>
> A mysterious 10-byte field with no comments explaining what it
> means... and the same magic number 10 appears in at least one other
> place in the patch.
>

with current logic, we don't need this anymore.
I have removed the filetype from the structure as we are not doing any
comparison anywhere.


>
> +typedef struct manifesthash_hash *hashtab;
>
> This declares a new *type* called hashtab, not a variable called
> hashtab. The new type is not used anywhere, but later, you have
> several variables of the same type that have this name. Just remove
> this: it's wrong and unused.
>
>
corrected.


> +static enum ChecksumAlgorithm checksum_type = MC_NONE;
>
> Remove "enum". Not needed, because you have a typedef for it in the
> header, and not per style.
>
> corrected.


> +static  manifesthash_hash *create_manifest_hash(char
> manifest_path[MAXPGPATH]);
>
> Whitespace is wrong. The whole patch needs a visit from pgindent with
> a properly-updated typedefs.list.
>
> Also, you will struggle to find anywhere else in the code base where
> pass a character array as a function argument. I don't know why this
> isn't just char *.
>

Corrected.


>
> +if(verify_backup)
>
> Whitespace wrong here, too.
>
>
Fixed


>
> It's also pretty unhelpful, here and elsewhere, to refer to "the hash
> table" as if there were only one, and as if the reader were supposed
> to know something about it when you haven't told them anything about
> it.
>
> +if (!entry->matched)
> +{
> +pg_log_info("missing file: %s", entry->filename);
> +}
> +
>
> The braces here are not project style. We usually omit braces when
> only a single line of code is present.
>

fixed


>
> I think some work needs to be done to standardize and improve the
> messages that get produced here.  You have:
>
> 1. missing file: %s
> 2. duplicate file present: %s
> 3. size changed for file: %s, original size: %d, current size: %zu
> 4. checksum difference for file: %s
> 5. extra file found: %s
>
> I suggest:
>
> 1. file \"%s\" is present in manifest but missing from the backup
> 2. file \"%s\" has multiple manifest entries
> (this one should probably be pg_log_error(), not pg_log_info(), as it
> represents a corrupt-manifest problem)
> 3. file \"%s" has size %lu in manifest but size %lu in backup
> 4. file \"%s" has checksum %s in manifest but checksum %s in backup
> 5. file \"%s" is present in backup but not in manifest
>

Corrected.


>
> Your patch actually doesn't compile on my system, because for the
> third message above, it uses %zu to print the size. But %zu is for
> size_t, not off_t. I went looking for other places in the code where
> we print off_t; based on that, I think the right thing to do is to
> print it using %lu and write (unsigned long) st.st_size.
>

Corrected.

+char    file_checksum[256];
> +char    header[1024];
>
> More arbitrary constants.



>
> +if (!file)
> +{
> +pg_log_error("could not open backup_manifest");
>
> That's bad error reporting.  See e.g. readfile() in initdb.c.
>

Corrected.


>
> +if (fscanf(file, "%1023[^\n]\n", header) != 1)
> +{
> +pg_log_error("error while reading the header from
> backup_manifest");
>
> That's also bad error reporting. It is only a slight step up from
> "ERROR: error".
>
> And we have another magic number (1023).
>

With current logic, we don't need this anymore.


>
> +appendPQExpBufferStr(manifest, header);
> +appendPQExpBufferStr(manifest, "\n");
> ...
> +appendPQExpBuffer(manifest, "File\t%s\t%d\t%s\t%s\n", filename,
> +  filesize, mtime, checksum_with_type);
>
> This whole thing seems completely crazy to me. Basically, you're
> trying to use fscanf() to parse the file. But then, because fscanf()
> doesn't give you the original bytes back, you're trying to reassemble
> the data that you parsed to recover the original line, so that you can
> stuff it in the buffer and eventually checksum it. However, that's
> highly error-prone. You're basically duplicating server code, and thus
> risking getting out of sync in the server code, to work around a
> problem that is entirely 

Re: Optimizing TransactionIdIsCurrentTransactionId()

2019-12-20 Thread Robert Haas
On Fri, Dec 20, 2019 at 12:46 AM Simon Riggs  wrote:
> If the TopTransactionId is not assigned, we can leave the whole function more 
> quickly, not just avoid a test.

Those things are not really any different from each other. You leave
the function when you've done all the necessary tests.

> Read only transactions should have a fast path thru this function since they 
> frequently read more data than write transactions.

With regard to this point, I second Tomas's comments.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




problem with read-only user

2019-12-20 Thread ROS Didier
Hi
I created a read-only role as follows:
psql -p 5434 kidsdpn03
CREATE ROLE kidsdpn03_ro PASSWORD 'xxx';
ALTER ROLE kidsdpn03_ro WITH LOGIN;
GRANT CONNECT ON DATABASE kidsdpn03 TO kidsdpn03_ro;
GRANT USAGE ON SCHEMA kidsdpn03 TO kidsdpn03_ro;
GRANT SELECT ON ALL TABLES IN SCHEMA kidsdpn03 TO kidsdpn03_ro;
GRANT SELECT ON ALL SEQUENCES IN SCHEMA kidsdpn03 TO kidsdpn03_ro;
ALTER DEFAULT PRIVILEGES IN SCHEMA kidsdpn03 GRANT SELECT ON TABLES TO 
kidsdpn03_ro;
ALTER ROLE kidsdpn03_ro SET search_path TO kidsdpn03;

but when I create new tables, I don't have read access to those new tables.
Can anybody help to solve this problem?
Thank you in advance

Didier ROS
didier@edf.fr









Re: [HACKERS] Block level parallel vacuum

2019-12-20 Thread Prabhat Sahu
Hi,

While testing this feature with parallel vacuum on "TEMPORARY TABLE", I got
a server crash on PG Head+V36_patch.
Changed configuration parameters and Stack trace are as below:

autovacuum = on
max_worker_processes = 4
shared_buffers = 10MB
max_parallel_workers = 8
max_parallel_maintenance_workers = 8
vacuum_cost_limit = 2000
vacuum_cost_delay = 10
min_parallel_table_scan_size = 8MB
min_parallel_index_scan_size = 0

-- Stack trace:
[centos@parallel-vacuum-testing bin]$ gdb -q -c data/core.1399 postgres
Reading symbols from
/home/centos/BLP_Vacuum/postgresql/inst/bin/postgres...done.
[New LWP 1399]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: autovacuum worker   postgres
   '.
Program terminated with signal 6, Aborted.
#0  0x7f4517d80337 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install
glibc-2.17-292.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64
krb5-libs-1.15.1-37.el7_7.2.x86_64 libcom_err-1.42.9-16.el7.x86_64
libgcc-4.8.5-39.el7.x86_64 libselinux-2.5-14.1.el7.x86_64
openssl-libs-1.0.2k-19.el7.x86_64 pcre-8.32-17.el7.x86_64
zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0  0x7f4517d80337 in raise () from /lib64/libc.so.6
#1  0x7f4517d81a28 in abort () from /lib64/libc.so.6
#2  0x00a96341 in ExceptionalCondition (conditionName=0xd18efb
"strvalue != NULL", errorType=0xd18eeb "FailedAssertion",
fileName=0xd18ee0 "snprintf.c", lineNumber=442) at assert.c:67
#3  0x00b02522 in dopr (target=0x7ffdb0e38450, format=0xc5fa95
".%s\"", args=0x7ffdb0e38538) at snprintf.c:442
#4  0x00b01ea6 in pg_vsnprintf (str=0x256df50 "autovacuum: dropping
orphan temp table \"postgres.", '\177' ..., count=1024,
fmt=0xc5fa68 "autovacuum: dropping orphan temp table \"%s.%s.%s\"",
args=0x7ffdb0e38538) at snprintf.c:195
#5  0x00afbadf in pvsnprintf (buf=0x256df50 "autovacuum: dropping
orphan temp table \"postgres.", '\177' ..., len=1024,
fmt=0xc5fa68 "autovacuum: dropping orphan temp table \"%s.%s.%s\"",
args=0x7ffdb0e38538) at psprintf.c:110
#6  0x00afd34b in appendStringInfoVA (str=0x7ffdb0e38550,
fmt=0xc5fa68 "autovacuum: dropping orphan temp table \"%s.%s.%s\"",
args=0x7ffdb0e38538)
at stringinfo.c:149
#7  0x00a970fd in errmsg (fmt=0xc5fa68 "autovacuum: dropping orphan
temp table \"%s.%s.%s\"") at elog.c:832
#8  0x008588d2 in do_autovacuum () at autovacuum.c:2249
#9  0x00857b29 in AutoVacWorkerMain (argc=0, argv=0x0) at
autovacuum.c:1689
#10 0x0085772f in StartAutoVacWorker () at autovacuum.c:1483
#11 0x0086e64f in StartAutovacuumWorker () at postmaster.c:5562
#12 0x0086e106 in sigusr1_handler (postgres_signal_arg=10) at
postmaster.c:5279
#13 
#14 0x7f4517e3f933 in __select_nocancel () from /lib64/libc.so.6
#15 0x00869838 in ServerLoop () at postmaster.c:1691
#16 0x00869212 in PostmasterMain (argc=3, argv=0x256bd70) at
postmaster.c:1400
#17 0x0077855d in main (argc=3, argv=0x256bd70) at main.c:210
(gdb)

I have tried to reproduce the crash with all previously executed queries, but
now I am not able to reproduce it.


On Thu, Dec 19, 2019 at 11:26 AM Mahendra Singh  wrote:

> On Wed, 18 Dec 2019 at 12:07, Amit Kapila  wrote:
> >
> > [please trim extra text before responding]
> >
> > On Wed, Dec 18, 2019 at 12:01 PM Mahendra Singh 
> wrote:
> > >
> > > On Tue, 10 Dec 2019 at 00:30, Mahendra Singh 
> wrote:
> > > >
> > > >
> > > > 3.
> > > > After v35 patch, vacuum.sql regression test is taking too much time
> due to large number of inserts so by reducing number of tuples, we can
> reduce that time.
> > > > +INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM
> generate_series(1,10) i;
> > > >
> > > > here, instead of 10, we can make 1000 to reduce time of this
> test case because we only want to test code and functionality.
> > >
> > > As we added check of min_parallel_index_scan_size in v36 patch set to
> > > decide parallel vacuum, 1000 tuples are not enough to do parallel
> > > vacuum. I can see that we are not launching any workers in vacuum.sql
> > > test case and hence, code coverage also decreased. I am not sure that
> > > how to fix this.
> > >
> >
> > Try by setting min_parallel_index_scan_size to 0 in test case.
>
> Thanks Amit for the fix.
>
> Yes, we can add "set min_parallel_index_scan_size = 0;" in vacuum.sql
> test case. I tested by setting min_parallel_index_scan_size=0 and it
> is working fine.
>
> @Masahiko san, please add above line in vacuum.sql test case.
>
> Thanks and Regards
> Mahendra Thalor
> EnterpriseDB: http://www.enterprisedb.com
>
>
>

-- 

With Regards,

Prabhat Kumar Sahu
Skype ID: prabhat.sahu1984
EnterpriseDB Software India Pvt. Ltd.

The Postgres Database Company


Re: [PATCH] Increase the maximum value track_activity_query_size

2019-12-20 Thread Alexey Kondratov

On 19.12.2019 20:52, Robert Haas wrote:

On Thu, Dec 19, 2019 at 10:59 AM Tom Lane  wrote:

Bruce Momjian  writes:

Good question.  I am in favor of allowing a larger value if no one
objects.  I don't think adding the min/max is helpful.


The original poster.



And probably anyone else, who debugs stuck queries of yet another crazy 
ORM. Yes, one could use log_min_duration_statement, but having a 
possibility to directly get it from pg_stat_activity without eyeballing 
the logs is nice. Also, IIRC log_min_duration_statement applies only to 
completed statements.



I think there are pretty obvious performance and memory-consumption
penalties to very large track_activity_query_size values.  Who exactly
are we really helping if we let them set it to huge values?

(wanders away wondering if we have suitable integer-overflow checks
in relevant code paths...)



The value of pgstat_track_activity_query_size is in bytes, so setting it 
to any value below INT_MAX seems to be safe from that perspective. 
However, being multiplied by NumBackendStatSlots its reasonable value 
should be far below INT_MAX (~2 GB).


Sincerely, It does not look for me like something badly needed, but 
still. We already have hundreds of GUCs and it is easy for a user to 
build a sub-optimal configuration, so does this overprotection make sense?



Regards

--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company





Re: Fetching timeline during recovery

2019-12-20 Thread Jehan-Guillaume de Rorthais
On Fri, 20 Dec 2019 13:41:25 +0900 (JST)
Kyotaro Horiguchi  wrote:

> At Fri, 20 Dec 2019 00:35:19 +0100, Jehan-Guillaume de Rorthais
>  wrote in 
> > On Fri, 13 Dec 2019 16:12:55 +0900
> > Michael Paquier  wrote:  
> 
> The first one;
> 
> > > I mentioned a SRF function which takes an input argument, but that
> > > makes no sense.  What I would prefer is just one function, returning
> > > one row (LSN, TLI), taking one input argument to select which WAL
> > > information the caller wants, with five possible cases
> > > (write, insert, flush, receive, replay).  
> > 
> > It looks odd when we look at the other five existing functions of the same
> > family, which have no TLI. And this user interaction with an admin function
> > is quite different from what we are used to with other admin funcs. But
> > mostly, when I think of such a function, I keep thinking this parameter
> > should be a WHERE clause on top of a SRF function.
> > 
> > -1  
> 
> It is related to the third one; it may be annoying that the case names
> cannot get the aid of psql tab-completion.

indeed.

> The second one;
> 
> > > Then, what you are referring to is one function which returns all
> > > (LSN,TLI) for the five cases (write, insert, etc.), so it would return
> > > one row with 10 columns, with NULL mapping to the values which have no
> > > meaning (like replay on a primary).  
> > 
> > This would look like some other pg_stat_* functions, e.g.
> > pg_stat_get_archiver. I'm OK with this. This could even be turned into a
> > catalog view.
> > 
> > However, what's the point of gathering all the values, e.g. from a production
> > cluster? Is it really useful to compare the current/insert/flush LSNs from the
> > WAL writer?  
> 
> There is a period where pg_controldata shows the previous TLI after
> promotion. It's useful if we can read the up-to-date TLI from a live
> standby. I thought that this project was for that case.

I was not asking about the usefulness of LSN+TLI itself.
I was wondering about the use case of gathering all 6 columns (current+tli,
insert+tli and flush+tli) from a production/primary cluster.

[...]
> > As a fourth possibility (I explained my last implementation details badly,
> > so I still hope we can keep it in the loop here): just overload the
> > existing functions with ones that take a boolean parameter and add the
> > TLI as a second field, e.g.:
> > 
> > Name   | Result type  | Argument data types
> > ---+--+---
> > pg_current_wal_lsn | pg_lsn   |
> > pg_current_wal_lsn | SETOF record | with_tli bool, OUT lsn pg_lsn, OUT tli int  
> 
> I prefer this one, in the sense of similarity with existing functions.

thanks
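
For what it's worth, this is how the overload above would read in use (purely
hypothetical; the two-column form exists only in the quoted proposal, not in
core):

    -- existing behaviour, unchanged
    SELECT pg_current_wal_lsn();

    -- proposed overload: same name, returning (lsn, tli)
    SELECT lsn, tli FROM pg_current_wal_lsn(true);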

> > And the fifth one, implementing brand new functions:
> > 
> >  pg_current_wal_lsn_tli
> >  pg_current_wal_insert_lsn_tli
> >  pg_current_wal_flush_lsn_tli
> >  pg_last_wal_receive_lsn_tli
> >  pg_last_wal_replay_lsn_tli  
> 
> Mmm. Should we remove the existing ones instead? (Of course we don't,
> though.)

Yes, that would be great, but sadly it would introduce a regression in various
tools relying on them. At least the ones doing "select *" or, more
probably, "select func()".

But anyway, adding 5 functions is not a big deal either. Too bad they are so
close to the existing ones, though.

> > > I actually prefer the first one, and you mentioned the second.  But
> > > there could be a point in doing the third one.  An advantage of the
> > > second and third ones is that you may be able to get a consistent view
> > > of all the data, but it means holding locks to look at the values a
> > > bit longer.  Let's see what others think.  
> > 
> > I like the fourth one, but I was not able to return only one field if the
> > given parameter is false or NULL. Giving false as an argument to these
> > functions has no meaning compared to the original one without an argument. I
> > ended up with this solution because I was worried about adding five more
> > functions really close to some existing ones.  
> 
> Right. It is a restriction of polymorphic functions. It is the same
> situation as with pg_stop_backup() and pg_stop_backup(true).

indeed.





Re: Disallow cancellation of waiting for synchronous replication

2019-12-20 Thread Andrey Borodin



> On 20 Dec 2019, at 12:23, Marco Slot  wrote:
> 
> On Fri, Dec 20, 2019 at 6:04 AM Andrey Borodin  wrote:
>> I think the proper solution here would be to add a GUC to disallow cancellation
>> of waiting for synchronous replication. Retrying step 3 will then wait on locks
>> after step 1 hangs, and the data will be consistent.
>> There is still a problem when the backend is not canceled but terminated [2].
>> The ideal solution would be to keep the locks on the changed data. Some well-known
>> databases treat termination of synchronous replication as a system failure
>> and refuse to operate until standbys appear (see Maximum Protection mode).
>> From my point of view it's enough to PANIC once, so that the HA tool is informed
>> that something is going wrong.
> 
> Sending a cancellation is currently the only way to resume after
> disabling synchronous replication. Some HA solutions (e.g.
> pg_auto_failover) rely on this behaviour. Would it be worth checking
> whether synchronous replication is still required?

I think changing synchronous_standby_names to some available standbys will
resume all backends waiting for synchronous replication.
Do we need to check the necessity of synchronous replication in any other case?
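
A minimal sketch of that resume path (the standby names are placeholders;
synchronous_standby_names only needs a reload, not a restart):

    -- Point synchronous_standby_names at standbys that are actually reachable
    -- (or set it to '' to stop waiting for sync rep entirely), then reload.
    ALTER SYSTEM SET synchronous_standby_names = 'ANY 1 (standby1, standby2)';
    SELECT pg_reload_conf();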

Best regards, Andrey Borodin.



Re: How is this possible "publication does not exist"

2019-12-20 Thread Jehan-Guillaume de Rorthais
On Thu, 19 Dec 2019 19:19:56 +0100
Peter Eisentraut  wrote:

> On 2019-12-19 19:15, Dave Cramer wrote:
> > It seems that if you drop the publication on an existing slot it needs 
> > to be recreated. Is this expected behaviour?  
> 
> A publication is not associated with a slot.  Only a subscription is 
> associated with a slot.

Couldn't it be the same scenario I reported back in October? See:

  Subject: Logical replication dead but synching
  Date: Thu, 10 Oct 2019 11:57:52 +0200
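
For reference (my own illustration, not from the thread), the
subscription-to-slot association Peter describes can be checked on the
subscriber side with:

    SELECT subname, subslotname, subpublications
    FROM pg_subscription;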





Re: Memory-Bounded Hash Aggregation

2019-12-20 Thread Adam Lee
On Sat, Dec 14, 2019 at 06:32:25PM +0100, Tomas Vondra wrote:
> I've done a bit more testing on this, after resolving a couple of minor
> conflicts due to recent commits (rebased version attached).
> 
> In particular, I've made a comparison with different dataset sizes,
> group sizes, GUC settings etc. The script and results from two different
> machines are available here:
> 
> The script essentially runs a simple grouping query with different
> number of rows, groups, work_mem and parallelism settings. There's
> nothing particularly magical about it.

Nice!

> I did run it both on master and patched code, allowing us to compare
> results and assess the impact of the patch. Overall, the changes are
> expected and either neutral or beneficial, i.e. the timings are the same
> or faster.
> 
> The number of cases that regressed is fairly small, but sometimes the
> regressions are annoyingly large - up to 2x in some cases. Consider for
> example this trivial example with 100M rows:

I suppose this is because the patch has no costing changes yet. I hacked
it a little to give hash agg a spilling penalty, just some value based on
(groups_in_hashtable * num_of_input_tuples)/num_groups_from_planner, and
with that it would not choose hash aggregate in this case.

However, that penalty is wrong, because compared to the external sort
algorithm, hash aggregate has to re-spill, which involves even more
I/O, especially with a very large number of groups but a very small
number of tuples in a single group, like in the test you did. Getting
the costing right will be a challenge.

BTW, Jeff, Greenplum has a test for hash agg spilling; I modified it a
little to check how many batches a query uses. It's attached, not sure
if it would help.

-- 
Adam Lee
create schema hashagg_spill;
set search_path to hashagg_spill;

-- start_ignore
CREATE EXTENSION plpythonu;
-- end_ignore

-- Create a function which checks how many batches there are
-- output example: "Memory Usage: 4096kB  Batches: 68  Disk Usage:8625kB"
create or replace function hashagg_spill.num_batches(explain_query text)
returns setof int as
$$
import re
rv = plpy.execute(explain_query)
search_text = 'batches'
result = []
for i in range(len(rv)):
    cur_line = rv[i]['QUERY PLAN']
    if search_text.lower() in cur_line.lower():
        p = re.compile('.*Memory Usage: (\d+).* Batches: (\d+)  Disk.*')
        m = p.match(cur_line)
        batches = int(m.group(2))
        result.append(batches)
return result
$$
language plpythonu;

create table testhagg (i1 int, i2 int, i3 int, i4 int);
insert into testhagg select i,i,i,i from generate_series(1, 3)i;

set work_mem="1800";

select * from (select max(i1) from testhagg group by i2) foo
  order by 1 limit 10;
select num_batches > 0 from hashagg_spill.num_batches(
  'explain (analyze, verbose) select max(i1) from testhagg group by i2;') num_batches;
select num_batches > 3 from hashagg_spill.num_batches(
  'explain (analyze, verbose) select max(i1) from testhagg group by i2 limit 9;') num_batches;

reset all;
set search_path to hashagg_spill;

-- Test agg spilling scenarios
create table aggspill (i int, j int, t text);
insert into aggspill select i, i*2, i::text from generate_series(1, 1) i;
insert into aggspill select i, i*2, i::text from generate_series(1, 10) i;
insert into aggspill select i, i*2, i::text from generate_series(1, 100) i;

-- No spill with large statement memory
set work_mem = '125MB';
select * from hashagg_spill.num_batches('explain (analyze, verbose)
  select count(*) from (select i, count(*) from aggspill group by i,j
  having count(*) = 1) g;');

-- Reduce the statement memory to induce spilling
set work_mem = '10MB';
select num_batches > 32 from hashagg_spill.num_batches('explain (analyze, verbose)
  select count(*) from (select i, count(*) from aggspill group by i,j
  having count(*) = 2) g;') num_batches;

-- Reduce the statement memory, more batches
set work_mem = '5MB';

select num_batches > 64 from hashagg_spill.num_batches('explain (analyze, verbose)
  select count(*) from (select i, count(*) from aggspill group by i,j
  having count(*) = 3) g;') num_batches;

-- Check spilling to a temp tablespace
SET work_mem='1000kB';

CREATE TABLE hashagg_spill(col1 numeric, col2 int);
INSERT INTO hashagg_spill SELECT id, 1 FROM generate_series(1,2) id;
ANALYZE hashagg_spill;

CREATE TABLE spill_temptblspace (a numeric);
SET temp_tablespaces=pg_default;
INSERT INTO spill_temptblspace SELECT avg(col2) col2 FROM hashagg_spill
  GROUP BY col1 HAVING (sum(col1)) < 0;
RESET temp_tablespaces;
RESET work_mem;

drop schema hashagg_spill cascade;


Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'

2019-12-20 Thread Mark Lorenz

Hi,

while preparing the patch for the Commitfest, I found a bug in the
to_char() function that is closely related to this issue:


SELECT to_char('1997-02-01'::date, 'YYYY-WW-D')

returns: 1997-05-7 -> which is ok, I believe. Feb 1st was a Saturday,
so counting weeks from Sunday, it was day 7 of week 5.


SELECT to_char('1997-02-03'::date, 'YYYY-WW-D')

returns: 1997-05-2 -> This cannot be. The input date is two days later,
but the result is 5 days earlier. I'd expect 1997-06-2 as the result, but
that only shows up another week later:


SELECT to_char('1997-02-10'::date, 'YYYY-WW-D')

This is wrong, because this should be week 7 instead. On the other hand, 
the ISO week formats work very well.
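
(For comparison, my own quick check with the ISO patterns, which behave
consistently; the commented results are what I expect to_char to return for
these dates:)

    SELECT to_char('1997-02-01'::date, 'IYYY-IW-ID');  -- 1997-05-6 (Sat, ISO week 5)
    SELECT to_char('1997-02-03'::date, 'IYYY-IW-ID');  -- 1997-06-1 (Mon, ISO week 6)
    SELECT to_char('1997-02-10'::date, 'IYYY-IW-ID');  -- 1997-07-1 (Mon, ISO week 7)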


I'll have a look at the code and try to fix it in the patch as well.

Kind regards,
Mark


Am 2019-10-08 17:49, schrieb Mark Lorenz:

Hi,

I apologize for the mistake.

For the mailing list correspondence I created this mail account. But I
forgot to change the sender name. So, the "postgres" name appeared as
sender name in the mailing list. I changed it.

Kind regards,
Mark/S-Man42


Hi,

some days ago I ran into a problem with the to_date() function. I
originally described it on StackExchange:
https://dba.stackexchange.com/questions/250111/unexpected-behaviour-for-to-date-with-week-number-and-week-day

The problem:

If you want to parse a date string with year, week and day of week,
you can do this using the ISO week pattern: 'IYYY-IW-ID'. This works
as expected:

date string |  to_date()
+
'2019-1-1'  |  2018-12-31  -> Monday of the first week of the year
(defined as the week that includes the 4th of January)
'2019-1-2'  |  2019-01-01
'2019-1-3'  |  2019-01-02
'2019-1-4'  |  2019-01-03
'2019-1-5'  |  2019-01-04
'2019-1-6'  |  2019-01-05
'2019-1-7'  |  2019-01-06

'2019-2-1'  |  2019-01-07
'2019-2-2'  |  2019-01-08

But if you try this with the non-ISO pattern 'YYYY-WW-D', the
result is not what you would expect:

date string |  to_date()
-
'2019-1-1'  |  2019-01-01
'2019-1-2'  |  2019-01-01
'2019-1-3'  |  2019-01-01
'2019-1-4'  |  2019-01-01
'2019-1-5'  |  2019-01-01
'2019-1-6'  |  2019-01-01
'2019-1-7'  |  2019-01-01

'2019-2-1'  |  2019-01-08
'2019-2-2'  |  2019-01-08

As you can see, the 'D' part of the pattern doesn't influence the
resulting date.

The answer from Laurenz Albe pointed me to a part of the documentation I
had missed so far:

"In to_timestamp and to_date, weekday names or numbers (DAY, D, and
related field types) are accepted but are ignored for purposes of
computing the result. The same is true for quarter (Q) fields."
(https://www.postgresql.org/docs/12/functions-formatting.html)

So, I had a look at the relevant code and decided to try a patch
myself. Now it works as I would expect:

date string |  to_date()
-
'2019-1-1'  |  2018-12-30 -> Sunday (!) of the first week of the year
(the first week is at the first day of year)
'2019-1-2'  |  2018-12-31
'2019-1-3'  |  2019-01-01
'2019-1-4'  |  2019-01-02
'2019-1-5'  |  2019-01-03
'2019-1-6'  |  2019-01-04
'2019-1-7'  |  2019-01-05

'2019-2-1'  |  2019-01-06
'2019-2-2'  |  2019-01-07

Furthermore, if you leave out the 'D' part, the date is always set to
the first day of the corresponding week (in this case Sunday, in
contrast to the ISO week, which starts on Monday).

To be consistent, I added similar code for the week-of-month pattern
('W'). So, using the pattern 'YYYY-MM-W-D' yields:

date string   |  to_date()
---
'2018-12-5-1' |  2018-12-23
'2018-12-6-1' |  2018-12-30
'2019-1-1-1'  |  2018-12-30 -> First day (Su) of the first week of the
first month of the year
'2019-2-2-1'  |  2019-02-03 -> First day (Su) of the second week of 
February
'2019-10-3-5' |  2019-10-17 -> Fifth day (Th) of the third week of 
October


If you leave out the 'D', it is set to 1 as well.

The code can be seen here:
https://github.com/S-Man42/postgres/commit/534e6bd70e23864f385d60ecf187496c7f4387c9

I hope, keeping the code style of the surrounding code (especially the
ISO code) is ok for you.

Now the questions:
1. Although the fact that the 'D' pattern is ignored is well documented,
might the new behaviour be interesting for you?
2. Does it work as you'd expect it to?
3. Because this could be my very first contribution to the PostgreSQL
code base, I really want you to be as critical as possible. I am not
quite sure whether I missed something important.
4. Currently something like '2019-1-8' does not throw an exception but
gives the same result as '2019-2-1' (the 8th day is the same as the 1st
of the next week). On the other hand, the ISO week conversion currently
gives the result of '2019-1-7' for every 'D' >= 7. I am not sure if this
is better. I think consistent exception handling should be discussed
separately (date rollover vs. out-of-range exception vs. ISO week
behaviour).

So far, I am very curious about your opinions!

Kind regards,
Mark/S-Man42





Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2019-12-20 Thread Kyotaro Horiguchi
Hello.

At Fri, 13 Dec 2019 14:46:20 +0530, Amit Kapila  wrote 
in 
> On Wed, Dec 11, 2019 at 11:46 PM Robert Haas  wrote:
> >
> > On Mon, Dec 2, 2019 at 3:32 AM Dilip Kumar  wrote:
> > > I have rebased the patch set on the latest head.
> >
> > 0001 looks like a clever approach, but are you sure it doesn't hurt
> > performance when many small XLOG records are being inserted? I think
> > XLogRecordAssemble() can get pretty hot in some workloads.
> >
> > With regard to 0002, logging a separate WAL record for each
> > invalidation seems painful; I think most operations that generate
> > invalidations generate a bunch of them all at once. Perhaps you could
> > just queue up invalidations as they happen, and then force anything
> > that's been queued up to be emitted into WAL just before you emit any
> > WAL record that might need to be decoded.
> >
> 
> I feel we can log the invalidations of the entire command at one go if
> we log at CommandEndInvalidationMessages.  We already have all the
> invalidations of current command in
> transInvalInfo->CurrentCmdInvalidMsgs.  This can save us the effort of
> maintaining a new separate list/queue for invalidations and to a good
> extent, it will ameliorate your concern of logging each invalidation
> separately.

I have a question on this. Does that mean that the current logical
decoder (or reorderbuffer) may emit incorrect results if the transaction
being decoded made a catalog change? If so, this is not a feature but a
bug fix.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: automating pg_config.h.win32 maintenance

2019-12-20 Thread Peter Eisentraut

On 2019-12-19 08:49, Michael Paquier wrote:

On Thu, Dec 19, 2019 at 08:31:05AM +0100, Peter Eisentraut wrote:

On 2019-12-19 04:59, Michael Paquier wrote:

This part needs a comment.  Like it is the equivalent of what
src/common/'s Makefile does or something like that?


This was meant to be addressed by 
,
but that discussion has not concluded yet.  Perhaps it makes more sense in
this context.


Hmm.  Your patch does not really change the generation of
VAL_CONFIGURE in pg_config.h, so I am not sure that this other thread
is an actual barrier to the improvement discussed here.  I would
actually be fine with just removing the XXX and still using
GetFakeConfigure() with VAL_CONFIGURE for now.  It would definitely be
a good thing to get rid of pg_config.h.win32.


Committed with that comment removed.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Implementing Incremental View Maintenance

2019-12-20 Thread Tatsuo Ishii
> I'm starting to take a closer look at this feature.  I've just finished 
> reading the discussion, excluding other referenced materials.

Thank you!

> The following IVM wiki page returns an error.  Does anybody know what's wrong?
> 
> https://wiki.postgresql.org/wiki/Incremental_View_Maintenance

I don't have any problem with the page. Maybe a temporary error?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp