### Re: [HACKERS] Order-preserving function transforms and EquivalenceClass

```Mat Arye  writes:
> We are trying to extend/hook into the planner so that it understands that
> date_trunc('minute', time) has the same ordering as time (or rather that a
> sort ordering on the latter is always a valid sort ordering on the former).
> But this question really applies to any order-preserving transform such as
> (time+1) vs (time).

Hmm ... it seems like the thing that is missing from your statement of the
problem is the notion of one ordering being a refinement of another.  You
can't just say that "date_trunc('minute', time) has the same ordering as
time", or that for a float value x "floor(x) has the same ordering as x".
You can say that a data series ordered by x is also ordered by floor(x),
*but not vice versa*.  So "same" seems like the wrong word.  A related
concept that the planner does understand today is that a data series
ordered by x,y is also ordered by x ... but not vice versa.

The EquivalenceClass mechanism doesn't seem to me to be up to representing
such considerations.  Maybe it could be extended, but I think there's some
pretty fundamental design work needed here.  And I'm not sure it should be
that mechanism in particular: ECs are mostly about making transitivity
deductions, ie a=b and b=c implies a=c.  I'm not very sure what we'd want
an ordering-refinements notion to act like, but I'm pretty sure it's not
like equality.

Thinking about the "ORDER BY x,y" vs "ORDER BY x" precedent, I wonder
whether you could do something in the PathKey mechanism.  But PathKeys
currently depend on per-column EquivalenceClasses, so it's not real
obvious how to proceed there either.

If you've got ideas about what this should look like, I'm all ears.
But it's going to take some pretty basic work; you're not going to
just plug into an already-existing mechanism.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

```

### Re: [HACKERS] comments in hash_alloc_buckets

```On Tue, Mar 21, 2017 at 10:11 AM, Ashutosh Sharma  wrote:
> Hi,
>
> While working on - [1], I realised that the following comments in
> _hash_alloc_buckets() need to be corrected.
>
> /*
>  * Initialize the freed overflow page.  Just zeroing the page won't work,
>  * See _hash_freeovflpage for similar usage.
>  */
> _hash_pageinit(page, BLCKSZ);
>
> Attached is the patch that corrects above comment. Thanks.
>

- * Initialize the freed overflow page.  Just zeroing the page won't work,
+ * Initialize the last page in hash index.

I think saying ".. last page in hash index" sounds slightly awkward, as
this is the last page for the current split point. How about just
"Initialize the page. ..."?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


```

### Re: [HACKERS] Add pgstathashindex() to get hash index table statistics.

```On Thu, Mar 23, 2017 at 11:24 PM, Ashutosh Sharma  wrote:
> Hi,
>
> On Tue, Feb 7, 2017 at 9:23 AM, Robert Haas  wrote:
>> On Mon, Feb 6, 2017 at 10:40 PM, Amit Kapila  wrote:
>>>> Maybe we should call them "unused pages".
>>>
>>> +1.  If we consider some more names for that column then probably one
>>> alternative could be "empty pages".
>>
>> Yeah, but I think "unused" might be better.  Because a page could be
>> in use (as an overflow page or primary bucket page) and still be
>> empty.
>>
>
> Based on the earlier discussions, I have prepared a patch that would
> allow pgstathashindex() to show the number of unused pages in hash
> index. Please find the attached patch for the same. Thanks.
>

else if (opaque->hasho_flag & LH_BITMAP_PAGE)
stats.bitmap_pages++;
+ else if (PageIsEmpty(page))
+ stats.unused_pages++;

I think having this check after PageIsNew() makes more sense than
having it at the place where you currently have it; other than that,
code-wise your patch looks okay, although I haven't tested it.

I think this should also be tracked under PostgreSQL 10 open items,
but as this is not directly a bug, I am not sure. What do others think?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


```

### Re: [HACKERS] Declarative partitioning optimization for large amount of partitions

```On Fri, Mar 24, 2017 at 11:06 PM, Simon Riggs  wrote:
> On 1 March 2017 at 01:36, Amit Langote  wrote:
>
>> I don't know which way you're thinking of fixing this, but a planner patch
>> to implement faster partition-pruning will have taken care of this, I
>> think.  As you may know, even declarative partitioned tables currently
>> depend on constraint exclusion for partition-pruning and planner's current
>> approach of handling inheritance requires opening all the child tables
>> (partitions), whereas the new approach hopefully shouldn't need to do
>> that.  I am not sure if looking for a more localized fix for this would be
>> worthwhile, although I may be wrong.
>
> What "new approach" are we discussing?
>
> Is there a patch or design discussion?

Neither at the moment.  As Aleksander said in his reply I was
referring to Dmitry Ivanov's plan of porting pg_pathman's planner
functionality to core that he mentioned on the declarative
partitioning thread back in December [1].

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/426b2b01-61e0-43aa-bd84-c6fcf516f1c3%40postgrespro.ru


```

### Re: [HACKERS] scram and \password

```On Fri, Mar 24, 2017 at 10:12 PM, Heikki Linnakangas  wrote:
> On 03/24/2017 03:02 PM, Michael Paquier wrote:
>>
>> In order to close this thread, I propose to reuse the patches I sent
>> here to make scram_build_verifier() available to frontends:
>>
>> https://www.postgresql.org/message-id/CAB7nPqT4yc3u8wspYkWbG088Ndp6asMH3=zb___ck89ctvz...@mail.gmail.com
>>
>> And on top of it, modify \password so that it generates an MD5 verifier
>> for pre-9.6 servers and a SCRAM one for post-10 servers by looking at
>> the backend version of the current connection. What do you think?
>
> Yep, sounds like a plan.

And attached is a set of rebased patches, with createuser and psql's
\password extended to do that.
--
Michael
From 21e73b1bc1cf1985b7ca0e638b0e752ffd4b89b1 Mon Sep 17 00:00:00 2001
From: Michael Paquier
Date: Sat, 25 Mar 2017 13:31:38 +0900
Subject: [PATCH 1/5] Use base64-based encoding for stored and server keys in
SCRAM verifiers

In order to be able to generate a SCRAM verifier even for frontends, let's
simplify the tools used to generate it and switch all the elements of the
verifiers to be base64-encoded using the routines already in place in
src/common/.
---
doc/src/sgml/catalogs.sgml |  2 +-
src/backend/libpq/auth-scram.c | 30 +-
4 files changed, 26 insertions(+), 22 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index ac39c639ed..30a288ab91 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1371,7 +1371,7 @@
identify the password as a SCRAM-SHA-256 verifier. The second field is a
salt, Base64-encoded, and the third field is the number of iterations used
to generate the password.  The fourth field and fifth field are the stored
-   key and server key, respectively, in hexadecimal format. A password that
+   key and server key, respectively, in Base64 format. A password that
does not follow either of those formats is assumed to be unencrypted.

diff --git a/src/backend/libpq/auth-scram.c b/src/backend/libpq/auth-scram.c
index bcc8d03ef5..e8068d3b7a 100644
--- a/src/backend/libpq/auth-scram.c
+++ b/src/backend/libpq/auth-scram.c
@@ -339,8 +339,8 @@ scram_build_verifier(const char *username, const char *password,
int iterations)
{
uint8		keybuf[SCRAM_KEY_LEN + 1];
-	char		storedkey_hex[SCRAM_KEY_LEN * 2 + 1];
-	char		serverkey_hex[SCRAM_KEY_LEN * 2 + 1];
+	char	   *encoded_storedkey;
+	char	   *encoded_serverkey;
char		salt[SCRAM_SALT_LEN];
char	   *encoded_salt;
int			encoded_len;
@@ -360,20 +360,25 @@ scram_build_verifier(const char *username, const char *password,
encoded_len = pg_b64_encode(salt, SCRAM_SALT_LEN, encoded_salt);
encoded_salt[encoded_len] = '\0';

-	/* Calculate StoredKey, and encode it in hex */
+	/* Calculate StoredKey, and encode it in base64 */
+	encoded_storedkey = palloc(pg_b64_enc_len(SCRAM_KEY_LEN) + 1);
scram_ClientOrServerKey(password, salt, SCRAM_SALT_LEN,
iterations, SCRAM_CLIENT_KEY_NAME, keybuf);
scram_H(keybuf, SCRAM_KEY_LEN, keybuf);		/* StoredKey */
-	(void) hex_encode((const char *) keybuf, SCRAM_KEY_LEN, storedkey_hex);
-	storedkey_hex[SCRAM_KEY_LEN * 2] = '\0';
+	encoded_len = pg_b64_encode((const char *) keybuf, SCRAM_KEY_LEN,
+encoded_storedkey);
+	encoded_storedkey[encoded_len] = '\0';

/* And same for ServerKey */
+	encoded_serverkey = palloc(pg_b64_enc_len(SCRAM_KEY_LEN) + 1);
scram_ClientOrServerKey(password, salt, SCRAM_SALT_LEN, iterations,
SCRAM_SERVER_KEY_NAME, keybuf);
-	(void) hex_encode((const char *) keybuf, SCRAM_KEY_LEN, serverkey_hex);
-	serverkey_hex[SCRAM_KEY_LEN * 2] = '\0';
+	encoded_len = pg_b64_encode((const char *) keybuf, SCRAM_KEY_LEN,
+encoded_serverkey);
+	encoded_serverkey[encoded_len] = '\0';

-	return psprintf("scram-sha-256:%s:%d:%s:%s", encoded_salt, iterations, storedkey_hex, serverkey_hex);
+	return psprintf("scram-sha-256:%s:%d:%s:%s", encoded_salt, iterations,
+	encoded_storedkey, encoded_serverkey);
}

/*
@@ -483,17 +488,16 @@ parse_scram_verifier(const char *verifier, char **salt, int *iterations,
/* storedkey */
if ((p = strtok(NULL, ":")) == NULL)
goto invalid_verifier;
-	if (strlen(p) != SCRAM_KEY_LEN * 2)
+	if (strlen(p) != pg_b64_enc_len(SCRAM_KEY_LEN) - 1)
goto invalid_verifier;
-
-	hex_decode(p, SCRAM_KEY_LEN * 2, (char *) stored_key);
+	pg_b64_decode(p, pg_b64_enc_len(SCRAM_KEY_LEN), (char *) stored_key);

/* serverkey */
if ((p = strtok(NULL, ":")) == NULL)
goto invalid_verifier;
-	if (strlen(p) != SCRAM_KEY_LEN * 2)
+	if (strlen(p) != pg_b64_enc_len(SCRAM_KEY_LEN) - 1)
goto invalid_verifier;
-	hex_decode(p, SCRAM_KEY_LEN * 2, (char *) server_key);
+	pg_b64_decode(p, pg_b64_enc_len(SCRAM_KEY_LEN), (char *) server_key);

pfree(v);
return true;
```

### Re: [HACKERS] Problem in Parallel Bitmap Heap Scan?

```On Tue, Mar 21, 2017 at 5:51 PM, Dilip Kumar  wrote:
> On Tue, Mar 21, 2017 at 4:47 PM, Thomas Munro
>  wrote:
>> I noticed a failure in the inet.sql test while running the regression
>> tests with parallelism cranked up, and can reproduce it interactively
>> as follows.  After an spgist index is created and the plan changes to
>> the one shown below, the query returns no rows.
>
> Thanks for reporting.  Seems like we are getting issues related to
> TBM_ONE_PAGE and TBM_EMPTY.
>
> I think in this area we need more testing; the reason these are not tested
> properly is that these are not the natural cases for parallel bitmap.
> I think in the next few days I will test more such cases by forcing
> parallel bitmap.
>

Okay, is your testing complete?

> Here is the patch to fix the issue in hand.  I have also run the
> regress suit with force_parallel_mode=regress and all the test are
> passing.
>

Thomas, did you get a chance to verify Dilip's latest patch?

I have added this issue in PostgreSQL 10 Open Items list so that we
don't lose track of this issue.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


```

### Re: [HACKERS] crashes due to setting max_parallel_workers=0

```On Sat, Mar 25, 2017 at 7:40 AM, Tomas Vondra
wrote:
> Hi,
>
> while working on a patch I ran into some crashes that seem to be caused by
> inconsistent handling of max_parallel_workers - queries still seem to be
> planned with parallel plans enabled, but then crash at execution.
>
> The attached script reproduces the issue on a simple query, causing crashes
> in GatherMerge, but I assume the issue is more general.
>

I could see other parallel plans like Gather also being picked at
max_parallel_workers=0.  I suspect there are multiple issues here, and
this needs some investigation.  I have added this to the open items list.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


```

### Re: [HACKERS] Logical replication existing data copy

```On 3/25/17 00:45, Mark Kirkwood wrote:
> Minor niggle:
>
> bench=# DROP PUBLICATION pgbench;
> DROP STATISTICS   <===
>
> I'm guessing that notification is wrong.

Fixed.

--
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


```

### Re: [HACKERS] Logical replication existing data copy

```

On 25/03/17 07:52, Peter Eisentraut wrote:

On 3/24/17 10:09, Petr Jelinek wrote:

On 24/03/17 15:05, Peter Eisentraut wrote:

On 3/23/17 19:32, Petr Jelinek wrote:

Yes, I also forgot to check if the table actually exists on subscriber
when fetching them in CREATE SUBSCRIPTION (we have check during
replication but not there).

I think for this we can probably just change the missing_ok argument of
RangeVarGetRelid() to false.

Unless we want the custom error message, in which case we need to change
AlterSubscription_refresh(), because right now it errors out because
missing_ok = false.

You are right, stupid me.

Committed this version.

Minor niggle:

bench=# DROP PUBLICATION pgbench;
DROP STATISTICS   <===

I'm guessing that notification is wrong.

regards

Mark


```

### Re: [HACKERS] [POC] A better way to expand hash indexes.

```On Tue, Mar 21, 2017 at 8:16 AM, Amit Kapila  wrote:
> Sure, I was telling you based on that.  If you are implicitly treating
> it as 2-dimensional array, it might be easier to compute the array
>offsets.

I think calculating the spares offset will not become much simpler either
way; we still need to calculate the split group and split phase separately.
I have added a patch to show how a 2-dimensional spares code looks and
where changes would be needed.

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

expand_hashbucket_efficiently_06_spares_2dimesion.patch
Description: Binary data

expand_hashbucket_efficiently_06_spares_1dimension.patch
Description: Binary data


```

### Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)

```> I'm wondering if this is a perl version/platform issue around
>
> \$tx->pump until \$stdout =~ /[[:digit:]]+[\r\n]\$/;
>
> where we're not recognising the required output from psql when we get it.
>
> What's in src/test/recovery/tmp_check/log/regress_log_011* ?
>
> I couldn't use PostgresNode::psql or PostgresNode::safe_psql here
> because the session must stay open.

The problem was that psql needs to be called with -X.  Fix committed.

--
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


```

### Re: [HACKERS] WIP: Faster Expression Processing v4

```btw ... I just got around to looking at a code coverage report for this
patched version, and that reminded me of something I'd already suspected:
EEOP_INNER_SYSVAR and EEOP_OUTER_SYSVAR seem to be dead code.  That's
unsurprising, because we never try to access a tuple's system columns
above the scan level.  If a query asks for system columns, those get
passed up to upper query levels as ordinary user-side columns.

We could keep the execution support for those opcodes, or we could rip it
out and throw an error in execExpr.c if one would need to be generated.
I'm a little inclined to the latter, because it seems like the plan is
to grow more support for every opcode in the future.  We don't need to
be adding support for unreachable opcodes.

regards, tom lane


```

### Re: [HACKERS] dsm.c API tweak

```On Sat, Mar 25, 2017 at 1:59 PM, Thomas Munro
wrote:
> On Sat, Mar 25, 2017 at 12:35 PM, Alvaro Herrera
>  wrote:
>> Alvaro Herrera wrote:
>>> Per
>>> https://postgr.es/m/CAEepm=11ma_Z1HoPxPcSCANnh5ykHORa=hca1u1v1+5s_jw...@mail.gmail.com
>>> it seems that the dsm.c API is a bit inconvenient right now.  I proposed
>>> in the first patch in that thread to change the API so that a segment is
>>> marked as "pinned" if created with no ResourceOwner set as current;
>>> which is essentially the same as creating a fake one, then marking the
>>> segment as pinned.
>>>
>>> Thomas agrees with me, so I propose this patch as preparatory work for
>>> the autovacuum/BRIN stuff I'm interested in.
>>
>> Here's the proposed patch.
>
> +1

-   seg->resowner = CurrentResourceOwner;
-   ResourceOwnerRememberDSM(CurrentResourceOwner, seg);
+   if (CurrentResourceOwner)
+   {
+   seg->resowner = CurrentResourceOwner;
+   ResourceOwnerRememberDSM(CurrentResourceOwner, seg);
+   }

You need to assign seg->resowner = CurrentResourceOwner
unconditionally here.  Otherwise seg->resowner is uninitialised junk.

--
Thomas Munro
http://www.enterprisedb.com


```

### [HACKERS] crashes due to setting max_parallel_workers=0

```
Hi,

while working on a patch I ran into some crashes that seem to be caused
by inconsistent handling of max_parallel_workers - queries still seem to
be planned with parallel plans enabled, but then crash at execution.

The attached script reproduces the issue on a simple query, causing
crashes in GatherMerge, but I assume the issue is more general.

regards

--
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

crash-parallel.sql
Description: application/sql


```

### [HACKERS] pg_get_statisticsextdef() is not quite the full shilling

```Hi,

Seems pg_get_statisticsextdef() has a couple of things wrong:

1. HeapTupleIsValid() called on the wrong tuple.
2. Did not schema-qualify names.

Both of which were my mistakes.

The attached fixes.

I've also added some tests to give the function a bit of coverage.

I've purposefully left out the WITH syntax. We'll want to add some
logic around that once we have more than one statistic type supported.
I'd suggest not appending WITH if all supported types are present, and
only appending it if a true subset is present. That'll mean pg_dump
from v10 and import into v11 will get all types, if they did in v10,
and the same subset that they did in v10 when only a subset were
originally defined.

Since we support only 1 type now, nothing needs to happen there yet.

--
PostgreSQL Development, 24x7 Support, Training & Services

pg_get_statisticsextdef_fix.patch
Description: Binary data


```

### Re: [HACKERS] dsm.c API tweak

```On Sat, Mar 25, 2017 at 12:35 PM, Alvaro Herrera
wrote:
> Alvaro Herrera wrote:
>> Per
>> https://postgr.es/m/CAEepm=11ma_Z1HoPxPcSCANnh5ykHORa=hca1u1v1+5s_jw...@mail.gmail.com
>> it seems that the dsm.c API is a bit inconvenient right now.  I proposed
>> in the first patch in that thread to change the API so that a segment is
>> marked as "pinned" if created with no ResourceOwner set as current;
>> which is essentially the same as creating a fake one, then marking the
>> segment as pinned.
>>
>> Thomas agrees with me, so I propose this patch as preparatory work for
>> the autovacuum/BRIN stuff I'm interested in.
>
> Here's the proposed patch.

+1

I'd word this slightly differently:

+ * If there is a CurrentResourceOwner, the new segment is born unpinned and the
+ * resource owner is in charge of destroying it (and will be blamed if it
+ * doesn't).  If there's no current resource owner, then the segment starts in
+ * the pinned state, so it'll stay alive until explicitly unpinned.

It's confusing that we've overloaded the term 'pin'.  When we 'pin the
mapping', we're disassociating it from the resource owner so that it
will remain attached for longer than the current resource owner scope.
When we 'pin the segment' we are making it survive even if all
backends detach (= an extra secret refcount).  What we're talking
about here is not pinning the *segment*, but pinning the *mapping* in
this backend.

How about: "If there is a non-NULL CurrentResourceOwner, the new
segment is associated with it and will be automatically detached when
the resource owner releases.  Otherwise it will remain attached until
explicitly detached.  Creating or attaching with a NULL
CurrentResourceOwner is equivalent to creating or attaching with a
non-NULL CurrentResourceOwner and then calling dsm_pin_mapping()."
Then dsm_attach() needs to say "See the note above dsm_create() about
the CurrentResourceOwner.", since it's the same deal there.

--
Thomas Munro
http://www.enterprisedb.com


```

### Re: [HACKERS] ICU integration

```On Thu, Mar 23, 2017 at 6:31 PM, Craig Ringer  wrote:
> Congratulations on getting this done. It's great work, and it'll make
> a whole class of potential bugs and platform portability warts go away
> if widely adopted.

+1

I would like to see us rigorously define a standard for when Postgres
installations are binary compatible. There should be a simple tool.

--
Peter Geoghegan


```

### Re: [HACKERS] Fetch JSONB Value for UNIQUE Constraint

```On Mar 24, 2017, at 5:00 PM, Peter Geoghegan  wrote:

>> So it’s a fine workaround, but maybe there’s something missing from the
>> parsing of the CREATE TABLE statement? This is on 9.6.1.
>
> Unique constraints don't support expressions, or a predicate (partial-ness).

Oh. Okay. I assumed the syntax would be identical to a unique index, since
that’s ultimately what a unique constraint is, IIUC. My mistake.

Thanks Peter!

Best,

David


```

### Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)

```On 25 March 2017 at 07:31, Peter Eisentraut
wrote:
> On 3/24/17 02:27, Craig Ringer wrote:
>> On 24 March 2017 at 02:29, Robert Haas  wrote:
>>> On Tue, Mar 21, 2017 at 11:35 PM, Craig Ringer
>>> wrote:
Changes made per discussion.
>>>
>>> Committed 0001.
>>
>> Much appreciated.
>>
>> Here's the 2nd patch rebased on top of master, with the TAP test
>> included this time. Looks ready to go.
>
> I'm experiencing hangs in the new t/011_crash_recovery.pl test.  It
> seems to hang after the first call to SELECT txid_current();.

If you add

note "txid is \$xid";

after

+chomp(\$xid);

does it report the xid?

Alternately, can you see a 'psql' process and a backend doing a SELECT
when it stops progressing?

I'm wondering if this is a perl version/platform issue around

\$tx->pump until \$stdout =~ /[[:digit:]]+[\r\n]\$/;

where we're not recognising the required output from psql when we get it.

What's in src/test/recovery/tmp_check/log/regress_log_011* ?

I couldn't use PostgresNode::psql or PostgresNode::safe_psql here
because the session must stay open.

--
PostgreSQL Development, 24x7 Support, Training & Services


```

### Re: [HACKERS] Fetch JSONB Value for UNIQUE Constraint

```On Fri, Mar 24, 2017 at 4:57 PM, David E. Wheeler  wrote:
> So it’s a fine workaround, but maybe there’s something missing from the
> parsing of the CREATE TABLE statement? This is on 9.6.1.

Unique constraints don't support expressions, or a predicate (partial-ness).

--
Peter Geoghegan


```

### [HACKERS] Fetch JSONB Value for UNIQUE Constraint

```Dear Hackers,

Should this work?

CREATE TABLE things (
user_id  INTEGER NOT NULL,
document JSONB   NOT NULL,
UNIQUE (user_id, document->>'name')
);
ERROR:  syntax error at or near "->>"
LINE 4: UNIQUE (user_id, document->>’name')

I tried adding parens, but that didn’t work, either:

CREATE TABLE things (
user_id  INTEGER NOT NULL,
document JSONB   NOT NULL,
UNIQUE (user_id, (document->>'name'))
);
ERROR:  syntax error at or near "("
LINE 4: UNIQUE (user_id, (document->>'name'))

It works fine to create a unique index, though:

CREATE TABLE things (
user_id  INTEGER NOT NULL,
document JSONB   NOT NULL
);
CREATE UNIQUE INDEX ON things(user_id, (document->>'name'));

So it’s a fine workaround, but maybe there’s something missing from the parsing
of the CREATE TABLE statement? This is on 9.6.1.

Best,

David


```

### Re: [HACKERS] dsm.c API tweak

```Alvaro Herrera wrote:
> Per
> https://postgr.es/m/CAEepm=11ma_Z1HoPxPcSCANnh5ykHORa=hca1u1v1+5s_jw...@mail.gmail.com
> it seems that the dsm.c API is a bit inconvenient right now.  I proposed
> in the first patch in that thread to change the API so that a segment is
> marked as "pinned" if created with no ResourceOwner set as current;
> which is essentially the same as creating a fake one, then marking the
> segment as pinned.
>
> Thomas agrees with me, so I propose this patch as preparatory work for
> the autovacuum/BRIN stuff I'm interested in.

Here's the proposed patch.

--
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From ded46066cb634f7714706c8d347af2acf267da17 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera
Date: Fri, 24 Mar 2017 19:27:22 -0300
Subject: [PATCH] Change dsm.c API so that mappings are pinned if created with
no resowner

---
src/backend/storage/ipc/dsm.c | 15 ---
1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 54378bc..061a457 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -453,6 +453,11 @@ dsm_set_control_handle(dsm_handle h)

/*
* Create a new dynamic shared memory segment.
+ *
+ * If there is a CurrentResourceOwner, the new segment is born unpinned and the
+ * resource owner is in charge of destroying it (and will be blamed if it
+ * doesn't).  If there's no current resource owner, then the segment starts in
+ * the pinned state, so it'll stay alive until explicitly unpinned.
*/
dsm_segment *
dsm_create(Size size, int flags)
@@ -1095,7 +1100,8 @@ dsm_create_descriptor(void)
{
dsm_segment *seg;

-   ResourceOwnerEnlargeDSMs(CurrentResourceOwner);
+   if (CurrentResourceOwner)
+   ResourceOwnerEnlargeDSMs(CurrentResourceOwner);

seg = MemoryContextAlloc(TopMemoryContext, sizeof(dsm_segment));
@@ -1106,8 +1112,11 @@ dsm_create_descriptor(void)
seg->mapped_size = 0;

-   seg->resowner = CurrentResourceOwner;
-   ResourceOwnerRememberDSM(CurrentResourceOwner, seg);
+   if (CurrentResourceOwner)
+   {
+   seg->resowner = CurrentResourceOwner;
+   ResourceOwnerRememberDSM(CurrentResourceOwner, seg);
+   }

slist_init(&seg->on_detach);

--
2.1.4


```

### Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)

```On 3/24/17 02:27, Craig Ringer wrote:
> On 24 March 2017 at 02:29, Robert Haas  wrote:
>> On Tue, Mar 21, 2017 at 11:35 PM, Craig Ringer  wrote:
>>> Changes made per discussion.
>>
>> Committed 0001.
>
> Much appreciated.
>
> Here's the 2nd patch rebased on top of master, with the TAP test
> included this time. Looks ready to go.

I'm experiencing hangs in the new t/011_crash_recovery.pl test.  It
seems to hang after the first call to SELECT txid_current();.

--
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


```

### [HACKERS] dsm.c API tweak

```Per
https://postgr.es/m/CAEepm=11ma_Z1HoPxPcSCANnh5ykHORa=hca1u1v1+5s_jw...@mail.gmail.com
it seems that the dsm.c API is a bit inconvenient right now.  I proposed
in the first patch in that thread to change the API so that a segment is
marked as "pinned" if created with no ResourceOwner set as current;
which is essentially the same as creating a fake one, then marking the
segment as pinned.

Thomas agrees with me, so I propose this patch as preparatory work for
the autovacuum/BRIN stuff I'm interested in.

--
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


```

### Re: [HACKERS] Potential data loss of 2PC files

```On Fri, Mar 24, 2017 at 11:36 PM, Teodor Sigaev  wrote:
>> And the renaming of pg_clog to pg_xact is also my fault. Attached is
>> an updated patch.
>
>
> Thank you. One more question: what about symlinks? If a DBA moves, for
> example, pg_xact to another disk and leaves the symlink in the data
> directory, presumably fsync on the symlink will actually do nothing.

I did not think of this case, but is that really common? There is not
even an option to configure that at the command level. And surely we cannot
support any fancy scenario that people figure out using PGDATA.
Existing callers of fsync_fname don't bother about this case either;
take, for example, the calls related to pg_logical and pg_replslot.

If something should be done in this area, that would be surely in
fsync_fname directly to centralize all the calls, and I would think of
that as a separate patch, and a separate discussion.
--
Michael


```

### Re: [HACKERS] increasing the default WAL segment size

```
Hi Robert,

On 3/24/17 3:00 PM, Robert Haas wrote:

On Wed, Mar 22, 2017 at 6:05 PM, David Steele  wrote:

Wait, really?  I thought you abandoned this approach because there's
then no principled way to handle WAL segments of less than the default
size.

I did say that, but I thought I had hit on a compromise.

But, as I originally pointed out the hex characters in the filename are not
aligned correctly for > 8 bits (< 16MB segments) and using different
alignments just made it less consistent.

I don't think I understand what the compromise is.  Are you saying we
should have one rule for segments < 16MB and another rule for segments
>= 16MB?  I think using two different rules for file naming depending
on the segment size will be a negative for both tool authors and
ordinary users.

Sorry for the confusion, I meant to say that if we want to keep LSNs in
the filenames and not change alignment for the current default, then we
would need to drop support for segment sizes < 16MB (more or less what I
said below).  Bad editing on my part.

It would be OK if we were willing to drop the 1,2,4,8 segment sizes because
then the alignment would make sense and not change the current 16MB
sequence.

Well, that is true.  But the thing I'm trying to do here is to keep
this patch down to what actually needs to be changed in order to
accomplish the original purpose, not squeeze more and more changes
into it.

Attached is a patch to be applied on top of Beena's v8 patch that
preserves LSNs in the file naming for all segment sizes.  It's not quite
complete because it doesn't modify the lower size limit everywhere, but
I think it's enough so you can see what I'm getting at.  This passes
check-world and I've poked at it in other segment sizes as well manually.

Behavior for the current default of 16MB is unchanged, and all other
sizes go through a logical progression.

1GB:
000000010000000000000040
000000010000000000000080
0000000100000000000000C0
000000010000000100000000

256MB:

000000010000000000000010
000000010000000000000020
000000010000000000000030
...
0000000100000000000000E0
0000000100000000000000F0
000000010000000100000000

64MB:

000000010000000100000004
000000010000000100000008
00000001000000010000000C
...
0000000100000001000000F8
0000000100000001000000FC
000000010000000200000000

I believe that maintaining an easy correspondence between LSN and
filename is important.  The cluster will not always be up to help with
these calculations and tools that do the job may not be present or may
have issues.

I'm happy to merge this with Beena's patch (and tidy my patch up) if
this looks like an improvement to everyone.

--
-David
da...@pgmasters.net
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index e3f616a..08d6494 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -93,10 +93,12 @@ typedef XLogLongPageHeaderData *XLogLongPageHeader;
extern uint32 XLogSegSize;
#define XLOG_SEG_SIZE XLogSegSize

-/* XLogSegSize can range from 1MB to 1GB */
-#define XLogSegMinSize 1024 * 1024
+/* XLogSegSize can range from 16MB to 1GB */
+#define XLogSegMinSize 1024 * 1024 * 16
#define XLogSegMaxSize 1024 * 1024 * 1024

+#define XLogSegMinSizeBits 24
+
/* Default number of min and max wal segments */
#define DEFAULT_MIN_WAL_SEGS 5
#define DEFAULT_MAX_WAL_SEGS 64
@@ -158,10 +160,14 @@ extern uint32 XLogSegSize;
/* Length of XLog file name */
#define XLOG_FNAME_LEN	24

+#define XLogSegNoLower(logSegNo) \
+   (((logSegNo) % XLogSegmentsPerXLogId) * \
+   (XLogSegSize >> XLogSegMinSizeBits))
+
#define XLogFileName(fname, tli, logSegNo) \
snprintf(fname, MAXFNAMELEN, "%08X%08X%08X", tli,   \
(uint32) ((logSegNo) / XLogSegmentsPerXLogId), \
-(uint32) ((logSegNo) % XLogSegmentsPerXLogId))
+(uint32) XLogSegNoLower(logSegNo))

#define XLogFileNameById(fname, tli, log, seg) \
snprintf(fname, MAXFNAMELEN, "%08X%08X%08X", tli, log, seg)
@@ -181,17 +187,18 @@ extern uint32 XLogSegSize;
strcmp((fname) + XLOG_FNAME_LEN, ".partial") == 0)

#define XLogFromFileName(fname, tli, logSegNo) \
-	do {												\
-		uint32 log;										\
-		uint32 seg;										\
-		sscanf(fname, "%08X%08X%08X", tli, &log, &seg); \
-		*logSegNo = (uint64) log * XLogSegmentsPerXLogId + seg; \
+	do {												\
+		uint32 log;										\
+		uint32 seg;```

### Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

```v25

- PQExpBuffer on gather_boolean_expression()
- convenience/clarity functions: ignore_slash_option(),
ignore_2_slash_options(), ignore_slash_line()
- remove inaccurate test of variable expansion in a false block
- added kitchen-sink test of parsing slash commands in a false block
- removed spurious file that shouldn't have been in v24
- removed any potential free(NULL) calls *that I introduced*, others remain
from master branch

NOT done:
- grouping all branching commands into one function - can be done in a
later patch for clarity
- combining _ef / _ev or _sf / _sv - can be done in a later patch for
clarity

On Fri, Mar 24, 2017 at 4:33 PM, Corey Huinker
wrote:

> On Fri, Mar 24, 2017 at 4:10 PM, Fabien COELHO
> wrote:
>
>>
>> Hello Corey,
>>
>> I wished for the same thing, happy to use one if it is made known to me.
>>> I pulled that pattern from somewhere else in the code, and given that the
>>> max number of args for a command is around 4, I'm not too worried about
>>> scaling.
>>>
>>
>> If there are expressions one day like pgbench, the number of arguments
>> becomes arbitrary. Have you looked at PQExpBuffer?
>
>
> I will look into it.
>
>
>>
>>>> There seems to be pattern repetition for _ev _ef and _sf _sv. Would it
>>>> make sense to create a function instead of keeping the initial
>>>> copy-paste?

>>>
>>> Yes, and a few things like that, but I wanted this patch to keep as much
>>> code as-is as possible.
>>>
>>
>> If you put the generic function at the same place, one would be more or
>> less kept and the other would be just removed?
>
>
>> "git diff --patience -w" gives a rather convenient output for looking at
>> the patch.
>
>
> Good to know about that option.
>
> As for a function for digested ignored slash options, it seems like I can
> disregard the true/false value of the "semicolon" parameter. Is that
> correct?
>
>
>>>> I would suggest to put together all if-related backslash command, so that
>>>> the stack management is all in one function instead of 4.

>>>
>>> I recognize the urge to group them together, but would there be any
>>> actual
>>> code sharing between them? Wouldn't I be either re-checking the string
>>> "cmd" again, or otherwise setting an enum that I immediately re-check
>>> inside the all_branching_commands() function?
>>>
>>
>> I do not see that as a significant issue, especially compared to the
>> benefit of having the automaton transition management in a single place.
>
>
> I'm still struggling to see how this would add any clarity to the code
> beyond what I can achieve by clustering the exec_command_(if/elif/else/endif)
> near one another.
>
>
>
>

0001.if_endif.v25.patch
Description: Binary data


```

### Re: [HACKERS] delta relations in AFTER triggers

```On Fri, Mar 24, 2017 at 5:36 PM, Thomas Munro
wrote:
> On Fri, Mar 24, 2017 at 1:14 PM, Thomas Munro
>  wrote:
>> If that's fixed and the permissions question can be waved away by
>> saying it's the same as the per-row situation, my only other comment
>> would be a bikeshed issue: Enr isn't a great name for a struct.
>
> One more thought: should this be allowed?
>
> postgres=# create table mytab (i int) partition by list (i);
> CREATE TABLE
> postgres=# create table mytab1 partition of mytab for values in (42);
> CREATE TABLE
> postgres=# create function my_trigger_function() returns trigger as $$
> begin end; $$ language plpgsql;
> CREATE FUNCTION
> postgres=# create trigger my_trigger after update on mytab referencing
> old table as my_old for each statement execute procedure
> my_trigger_function();
> CREATE TRIGGER

On second thoughts, that's actually arguably a bug in committed code.
What do you think about the attached patch?

--
Thomas Munro
http://www.enterprisedb.com

no-transition-tables-for-partitioned-tables.patch
Description: Binary data


```

### Re: [HACKERS] comment/security label for publication/subscription

```* Robert Haas (robertmh...@gmail.com) wrote:
> On Fri, Mar 24, 2017 at 12:18 AM, Peter Eisentraut
>  wrote:
> > Here is a patch to add COMMENT support for publications and subscriptions.
> >
> > On a similar issue, do we need SECURITY LABEL support for those?  Does
> > that make sense?
>
> IMHO, it's good to have COMMENT and SECURITY LABEL support for pretty
> much everything.

+1

Thanks!

Stephen

signature.asc
Description: Digital signature

```

### Re: [HACKERS] Guidelines for GSoC student proposals / Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

```On Wed, Mar 22, 2017 at 2:24 AM, Mengxing Liu
wrote:

> I've finished a draft proposal for "Eliminate O(N^2) scaling from
> rw-conflict tracking in serializable transactions".

I've attached some comments to the document; let me know if they
don't show up for you or you need clarification.

Overall, if the comments are addressed, I think it is great.

--
Kevin Grittner


```

### Re: [HACKERS] comment/security label for publication/subscription

```On Fri, Mar 24, 2017 at 12:18 AM, Peter Eisentraut
wrote:
> Here is a patch to add COMMENT support for publications and subscriptions.
>
> On a similar issue, do we need SECURITY LABEL support for those?  Does
> that make sense?

IMHO, it's good to have COMMENT and SECURITY LABEL support for pretty
much everything.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```

### Re: [HACKERS] Logical replication existing data copy

```Am Freitag, den 24.03.2017, 14:57 -0400 schrieb Peter Eisentraut:
> On 3/24/17 05:22, Michael Banck wrote:
> > However, replication also seems to not work, I'm using the following
> > script right now:
>
> The problem is that your publication does not include any tables.

Oops, of course. That part must have gotten lost in a copy-paste or when I
tried to dig further into why the CREATE SUBSCRIPTION segfaulted; sorry for
the noise.

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer


```

### Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

```On Fri, Mar 24, 2017 at 4:10 PM, Fabien COELHO  wrote:

>
> Hello Corey,
>
> I wished for the same thing, happy to use one if it is made known to me.
>> I pulled that pattern from somewhere else in the code, and given that the
>> max number of args for a command is around 4, I'm not too worried about
>> scaling.
>>
>
> If there are expressions one day like pgbench, the number of arguments
> becomes arbitrary. Have you looked at PQExpBuffer?

I will look into it.

>
>>> There seems to be pattern repetition for _ev _ef and _sf _sv. Would it
>>> make sense to create a function instead of keeping the initial
>>> copy-paste?
>>>
>>
>> Yes, and a few things like that, but I wanted this patch to keep as much
>> code as-is as possible.
>>
>
> If you put the generic function at the same place, one would be more or
> less kept and the other would be just removed?

> "git diff --patience -w" gives a rather convenient output for looking at
> the patch.

Good to know about that option.

As for a function for digested ignored slash options, it seems like I can
disregard the true/false value of the "semicolon" parameter. Is that
correct?

>>> I would suggest to put together all if-related backslash command, so that
>>> the stack management is all in one function instead of 4.
>>>
>>
>> I recognize the urge to group them together, but would there be any actual
>> code sharing between them? Wouldn't I be either re-checking the string
>> "cmd" again, or otherwise setting an enum that I immediately re-check
>> inside the all_branching_commands() function?
>>
>
> I do not see that as a significant issue, especially compared to the
> benefit of having the automaton transition management in a single place.

I'm still struggling to see how this would add any clarity to the code
beyond what I can achieve by clustering the
exec_command_(if/elif/else/endif) near one another.

```

### Re: [HACKERS] Page Scan Mode in Hash Index

```On Thu, Mar 23, 2017 at 2:35 PM, Ashutosh Sharma  wrote:
> I take your suggestion. Please find the attachments.

I think you should consider refactoring this so that it doesn't need
to use goto.  Maybe move the while (offnum <= maxoff) logic into a
helper function and have it return itemIndex.  If itemIndex == 0, you
can call it again.  Notice that the code in the "else" branch of the
if (itemIndex == 0) stuff could actually be removed from the else
block without changing anything, because the "if" block always either
returns or does a goto.  That's worthy of a little more work to try to
make things simple and clear.

+ *  We find the first item(or, if backward scan, the last item) in

Missing space.

In _hash_dropscanbuf(), the handling of hashso_bucket_buf is now
inconsistent with the handling of hashso_split_bucket_buf, which seems
suspicious. Suppose we enter this function with so->hashso_bucket_buf
and so->currPos.buf both being valid buffers, but not the same one.
It looks to me as if we'll release the first pin but not the second
one.  My guess (which could be wrong) is that so->hashso_bucket_buf =
InvalidBuffer should be moved back up higher in the function where it
was before, just after the first if statement, and that the new
condition so->hashso_bucket_buf == so->currPos.buf on the last
_hash_dropbuf() should be removed.  If that's not right, then I think
I need somebody to explain why not.

I am suspicious that _hash_kill_items() is going to have problems if
the overflow page is freed before it reacquires the lock.
_bt_killitems() contains safeguards against similar cases.

This is not a full review, but I'm out of time for the moment.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```

### Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

```

Hello Corey,

> I wished for the same thing, happy to use one if it is made known to me.
> I pulled that pattern from somewhere else in the code, and given that the
> max number of args for a command is around 4, I'm not too worried about
> scaling.

If there are expressions one day like pgbench, the number of arguments
becomes arbitrary. Have you looked at PQExpBuffer?

>> However there is an impact on testing because of all these changes. ISTM
>> that test cases should reflect this situation and test that \cd, \edit, ...
>> are indeed ignored properly, taking into account their expected args...

> I think one grand
>
> \if false
> \a
> \c some_connect_string
> ...
> \z some_table_name
> \endif
>
> should do the trick,

Yes. Maybe some commands could be on the same line as well.

> but it wouldn't detect memory leaks.

No miracle...

>> There seems to be pattern repetition for _ev _ef and _sf _sv. Would it
>> make sense to create a function instead of keeping the initial copy-paste?

> Yes, and a few things like that, but I wanted this patch to keep as much
> code as-is as possible.

If you put the generic function at the same place, one would be more or
less kept and the other would be just removed?

"git diff --patience -w" gives a rather convenient output for looking at
the patch.

>> I would suggest to put together all if-related backslash command, so that
>> the stack management is all in one function instead of 4.

> I recognize the urge to group them together, but would there be any actual
> code sharing between them? Wouldn't I be either re-checking the string
> "cmd" again, or otherwise setting an enum that I immediately re-check
> inside the all_branching_commands() function?

I do not see that as a significant issue, especially compared to the
benefit of having the automaton transition management in a single place.

--
Fabien.


```

### Re: [HACKERS] UPDATE of partition key

Thanks Amit for your review comments. I am yet to handle all of your
comments, but meanwhile, attached is an updated patch that handles
RETURNING.

Earlier it was not working because ExecInsert() did not return any
RETURNING clause. This is because the setup needed to create RETURNING
projection info for leaf partitions is done in ExecInitModifyTable()
only in case of INSERT. But because it is an UPDATE operation, we have
to do this explicitly as a one-time operation when it is determined
that row movement is required. This is similar to how we do the one-time
setup of mt_partition_dispatch_info. So in the patch, I have moved
this code into a new function, ExecInitPartitionReturningProjection(),
which is now called in ExecInitModifyTable() as well as during row
movement when ExecInsert() processes the returning clause.

Basically we need to do all that is done in ExecInitModifyTable() for
INSERT. There are a couple of other things that I suspect that might
need to be done as part of the missing initialization for Execinsert()
during row-movement :
1. Junk filter handling
2. WITH CHECK OPTION

Also, ExecDelete() during row movement still returns the RETURNING
result redundantly, which I am yet to handle.

On 23 March 2017 at 07:04, Amit Langote  wrote:
> Hi Amit,
>
> Thanks for the updated patch.
>
> On 2017/03/23 3:09, Amit Khandekar wrote:
>> Attached is v2 patch which implements the above optimization.
>
> Would it be better to have at least some new tests?  Also, there are a few
> places in the documentation mentioning that such updates cause error,
> which will need to be updated.  Perhaps also add some explanatory notes
> about the mechanism (delete+insert), trigger behavior, caveats, etc.
> There were some points discussed upthread that could be mentioned in the
> documentation.

Yeah, agreed. Will do this in the subsequent patch.

>
> @@ -633,6 +634,9 @@ ExecDelete(ItemPointer tupleid,
>  HeapUpdateFailureData hufd;
>  TupleTableSlot *slot = NULL;
>
> +*already_deleted = false;
> +
>
> concurrently_deleted?

Done.

>
> @@ -962,7 +969,7 @@ ExecUpdate(ItemPointer tupleid,
>  }
>  else
>  {
> -LockTupleMode lockmode;
> +LockTupleMode   lockmode;
>
> Useless hunk.
Removed.

I am yet to handle your other comments; still working on them, but
till then, attached is the updated patch.

update-partition-key_v3.patch
Description: Binary data


```

### Re: [HACKERS] pageinspect and hash indexes

>>>> Thanks for reviewing my patch. I have removed the extra white space.
>>>> Attached are both the patches.
>>>
>>> Sorry, I have mistakenly attached wrong patch. Here are the correct
>>> set of patches.
>>
>> The latest version of patches looks fine to me.
>
>
> --- a/contrib/pageinspect/hashfuncs.c
> +++ b/contrib/pageinspect/hashfuncs.c
> @@ -80,7 +80,8 @@ verify_hash_page(bytea *raw_page, int flags)
>  /* Check that page type is sane. */
>  pagetype = pageopaque->hasho_flag & LH_PAGE_TYPE;
>  if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE &&
> -pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE)
> +pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE &&
> +pagetype != LH_UNUSED_PAGE)
>  ereport(ERROR,
>  (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>   errmsg("invalid hash page type %08x", pagetype)));
>
> The advantage of that is (1) it won't get confused if in the future we
> have an unused page that has some flag bit not in LH_PAGE_TYPE set,
> and (2) if in the future we want to add any other checks to this
> function which should apply to unused pages also, they won't get
> bypassed by an early return statement.

Agreed. Moreover, the previous approach might even have changed the
current behaviour of functions like hash_page_items() and
hash_page_stats() when we pass UNUSED pages to them. Attached is
the newer version of the patch.
From 5c54fa58895cd279ff44424a3eb1feafdaca69d5 Mon Sep 17 00:00:00 2001
From: ashu
Date: Sat, 25 Mar 2017 01:02:35 +0530
Subject: [PATCH] Allow pageinspect to handle UNUSED hash pages v3

---
contrib/pageinspect/hashfuncs.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/contrib/pageinspect/hashfuncs.c b/contrib/pageinspect/hashfuncs.c
index 812a03f..2632287 100644
--- a/contrib/pageinspect/hashfuncs.c
+++ b/contrib/pageinspect/hashfuncs.c
@@ -80,7 +80,8 @@ verify_hash_page(bytea *raw_page, int flags)
/* Check that page type is sane. */
pagetype = pageopaque->hasho_flag & LH_PAGE_TYPE;
if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE &&
-		pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE)
+		pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE &&
+		pagetype != LH_UNUSED_PAGE)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("invalid hash page type %08x", pagetype)));
--
1.8.3.1


```

### Re: [HACKERS] Re: [COMMITTERS] pgsql: Implement multivariate n-distinct coefficients

```Alvaro Herrera  writes:
> Tom Lane wrote:
>> Why not use COSTS OFF?  Or I'll put that even more strongly: all the
>> existing regression tests use COSTS OFF, exactly to avoid this sort of
>> machine-dependent output.  There had better be a really damn good
>> reason not to use it here.

> If we use COSTS OFF, the test is completely pointless, as the plans look
> identical regardless of whether the multivariate stats are being used or
> not.

Well, I think you are going to find that the exact costs are far too
fragile to have in the regression test output.  Just because you wish
you could test them this way doesn't mean you can.

> If we had a ROWS option to ANALYZE that showed estimated number of rows
> but not the cost, that would be an option.

Unlikely to be any better.  All these numbers are subject to lots of
noise, eg due to auto-analyze happening at unexpected times, random
sampling during analyze, etc.  If you try to constrain the test case
enough that none of that happens, I wonder how useful it will really be.

What I would suggest is devising a test case whereby you actually
get a different plan shape now than you did before.  That shouldn't
be too terribly hard, or else what was the point?

regards, tom lane


```

### Re: [HACKERS] pg_stat_wal_write statistics view

```On Wed, Feb 15, 2017 at 12:53 PM, Haribabu Kommi
wrote:
>
>
> On Wed, Feb 8, 2017 at 9:36 PM, Amit Kapila  wrote:
>>
>> On Tue, Feb 7, 2017 at 11:47 AM, Haribabu Kommi
>>  wrote:
>> > Hi Hackers,
>> >
>> > I just want to discuss adding of a new statistics view that provides
>> > the information of wal writing details as follows
>> >
>>
>> +1.  I think it will be useful to observe WAL activity.
>
>
> Thanks for your opinion.
>
>> > postgres=# \d pg_stat_wal_writer
>> >                  View "pg_catalog.pg_stat_wal_writer"
>> >        Column       |           Type           | Collation | Nullable | Default
>> > --------------------+--------------------------+-----------+----------+---------
>> >  num_backend_writes | bigint                   |           |          |
>> >  num_total_writes   | bigint                   |           |          |
>> >  num_blocks         | bigint                   |           |          |
>> >  total_write_time   | bigint                   |           |          |
>> >  stats_reset        | timestamp with time zone |           |          |
>> >
>> > The columns of the view are
>> > 1. Total number of xlog writes that are called from the backend.
>> > 2. Total number of xlog writes that are called from both backend
>> >  and background workers. (This column can be changed to just
>> >  display on the background writes).
>> > 3. The number of the blocks that are written.
>> > 4. Total write_time of the IO operation it took, this variable data is
>> > filled only when the track_io_timing GUC is enabled.
>>
>> So, here is *write_time* the total time system has spent in WAL
>> writing before the last reset?
>
>
> total write_time spent in WAL writing "after" the last reset in
> milliseconds.
>
>> I think there should be a separate column for write and sync time.
>>
>
>
>>
>> > Or it is possible to integrate the new columns into the existing
>> > pg_stat_bgwriter view also.
>> >
>>
>> I feel separate view is better.
>
>
> Ok.
>
> Following the sample out of the view after regress run.
>
> postgres=# select * from pg_stat_walwrites;
> -[ RECORD 1 ]--+--
> backend_writes | 19092
> writes | 663
> write_blocks   | 56116
> write_time | 0
> sync_time  | 3064
> stats_reset| 2017-02-15 13:37:09.454314+11
>
> Currently, writer, walwriter and checkpointer processes
> are considered as background processes that can do
> the wal write mainly.

I'm not sure this categorization is good. You said at the top of this
thread that this view is useful for tuning walwriter parameters. If so,
ISTM that the information about the walwriter's activity should be
separated from the others.

What about other processes which *can* write WAL, for example walsender
(e.g., BASE_BACKUP can cause a WAL record), the startup process (e.g.,
the end-of-recovery checkpoint) and logical replication workers (not sure
if they always work with synchronous_commit=off, though)? There might be
other processes which can write WAL.

Why didn't you separate "write_blocks", "write_time" and "sync_time" per
the process category, like "backend_writes" and "writes"?

This view doesn't seem to take into consideration the WAL writing and
flushing that happen when a new WAL segment file is created.

I think that it's better to make this view also report the number of WAL
pages which are written when wal_buffers is full. This information is
useful for tuning the size of wal_buffers. This was proposed by Nagayasu
before.
https://www.postgresql.org/message-id/4ff824f3.5090...@uptime.jp

Regards,

--
Fujii Masao


```

### Re: [HACKERS] ANALYZE command progress checker

```On Fri, Mar 24, 2017 at 3:41 AM, vinayak
wrote:
> I have updated the patch.

You can't change the definition of AcquireSampleRowsFunc without
updating the documentation in fdwhandler.sgml, but I think I don't
immediately understand why that's a thing we want to do at all if
neither file_fdw nor postgres_fdw are going to use the additional
argument.  It seems to be entirely pointless because the new value
being passed down is always zero and even if the callee were to update
it, do_analyze_rel() would just ignore the change on return.  Am I
missing something here?  Also, the whitespace-only changes to fdwapi.h
related to AcquireSampleRowsFunc going to get undone by pgindent, so
even if there's some reason to change that you should leave the lines
that don't need changing untouched.

+    /* Reset rows processed to zero for the next column */
+    pgstat_progress_update_param(PROGRESS_ANALYZE_NUM_ROWS_PROCESSED, 0);

This seems like a bad design.  If you see a value other than zero in
this field, you still won't know how much progress analyze has made,
because you don't know which column you're processing.  I also feel
like you're not paying enough attention to a key point here that I
made in my last review, which is that you need to look at where
ANALYZE actually spends most of the time.  If the process of computing
the per-row statistics consumes significant CPU time, then maybe
there's some point in trying to instrument it, but does it?  Unless
you know the answer to that question, you don't know whether there's
even a point to trying to instrument this.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```

### Re: [HACKERS] Re: [COMMITTERS] pgsql: Implement multivariate n-distinct coefficients

```Tom Lane wrote:

> Why not use COSTS OFF?  Or I'll put that even more strongly: all the
> existing regression tests use COSTS OFF, exactly to avoid this sort of
> machine-dependent output.  There had better be a really damn good
> reason not to use it here.

If we use COSTS OFF, the test is completely pointless, as the plans look
identical regardless of whether the multivariate stats are being used or
not.

If we had a ROWS option to ANALYZE that showed estimated number of rows
but not the cost, that would be an option.

--
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


```

### Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

```>
>
> A few comments about the patch.
>
> Patch applies. "make check" ok.
>
> As already pointed out, there is one useless file in the patch.
>
> Although currently there is only one expected argument for boolean
> expressions, the n² concatenation algorithm in gather_boolean_expression is
> not very elegant. Is there some string buffer data structure which could be
>

I wished for the same thing, happy to use one if it is made known to me.
I pulled that pattern from somewhere else in the code, and given that the
max number of args for a command is around 4, I'm not too worried about
scaling.

>
> ISTM that ignore_boolean_expression may call free on a NULL pointer if the
> expression is empty?
>

True. The psql code is actually littered with a lot of un-checked free(p)
calls, so I started to wonder if maybe we had a wrapper on free() that
checked for NULL. I'll fix this one just to be consistent.

>
> Generally I find the per-command functions rather an improvement.
>

I did too. I tried to split this patch up into two parts, one that broke
out the functions, and one that added if-then, and found that the first
patch was just as unwieldily without the if-then stuff as with.

>
> However there is an impact on testing because of all these changes. ISTM
> that test cases should reflect this situation and test that \cd, \edit, ...
> are indeed ignored properly, taking into account their expected args...
>

I think one grand

\if false
\a
\c some_connect_string
...
\z some_table_name
\endif

should do the trick, but it wouldn't detect memory leaks.

>
> In "exec_command_connect" an argument is changed from "-reuse-previous" to
> "-reuse-previous=", not sure why.
>

It shouldn't have been. Good catch. Most commands were able to be migrated
with simple changes (status => *status, strcmp() if-block becomes
active-if-block, etc), but that one was slightly different.

>
> There seems to be pattern repetition for _ev _ef and _sf _sv. Would it
> make sense to create a function instead of keeping the initial copy-paste?
>

Yes, and a few things like that, but I wanted this patch to keep as much
code as-is as possible.

>
> I think that some functions could be used for some repeated cases such as
> "discard one arg", "discard one or two arg", "discard whole line", for the
> various inactive branches, so as to factor out code.
>

I'd be in favor of that as well

>
> I would suggest to put together all if-related backslash command, so that
> the stack management is all in one function instead of 4.
>

I recognize the urge to group them together, but would there be any actual
code sharing between them? Wouldn't I be either re-checking the string
"cmd" again, or otherwise setting an enum that I immediately re-check
inside the all_branching_commands() function?

>
> For pset the inactive branch does OT_NORMAL instead of OT_NOT_EVAL, not
> sure why.

An oversight. Good catch.

```

### Re: [HACKERS] pageinspect and hash indexes

```On Fri, Mar 24, 2017 at 3:54 AM, Amit Kapila  wrote:
> On Fri, Mar 24, 2017 at 9:50 AM, Ashutosh Sharma
> wrote:
>> On Fri, Mar 24, 2017 at 9:46 AM, Ashutosh Sharma
>> wrote:
>>>
>>> Thanks for reviewing my patch. I have removed the extra white space.
>>> Attached are both the patches.
>>
>> Sorry, I have mistakenly attached wrong patch. Here are the correct
>> set of patches.
>
> The latest version of patches looks fine to me.

--- a/contrib/pageinspect/hashfuncs.c
+++ b/contrib/pageinspect/hashfuncs.c
@@ -80,7 +80,8 @@ verify_hash_page(bytea *raw_page, int flags)
/* Check that page type is sane. */
pagetype = pageopaque->hasho_flag & LH_PAGE_TYPE;
if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE &&
-pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE)
+pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE &&
+pagetype != LH_UNUSED_PAGE)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("invalid hash page type %08x", pagetype)));

The advantage of that is (1) it won't get confused if in the future we
have an unused page that has some flag bit not in LH_PAGE_TYPE set,
and (2) if in the future we want to add any other checks to this
function which should apply to unused pages also, they won't get
bypassed by an early return statement.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

```
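
Robert's two points about masking with `LH_PAGE_TYPE` (rather than returning early for unused pages) can be illustrated with a small sketch. This is Python for brevity, not the contrib C code; the `LH_*` values mirror `src/include/access/hash.h`, and `HYPOTHETICAL_FLAG` is an invented future flag bit.

```python
# Sketch: why masking the flags and listing LH_UNUSED_PAGE among the valid
# page types is robust.  Flag values mirror hash.h (LH_UNUSED_PAGE is 0).
LH_UNUSED_PAGE   = 0
LH_OVERFLOW_PAGE = 1 << 0
LH_BUCKET_PAGE   = 1 << 1
LH_BITMAP_PAGE   = 1 << 2
LH_META_PAGE     = 1 << 3
LH_PAGE_TYPE = (LH_OVERFLOW_PAGE | LH_BUCKET_PAGE |
                LH_BITMAP_PAGE | LH_META_PAGE)

VALID_TYPES = {LH_UNUSED_PAGE, LH_OVERFLOW_PAGE, LH_BUCKET_PAGE,
               LH_BITMAP_PAGE, LH_META_PAGE}

def page_type_is_sane(hasho_flag):
    # Mask first: a future non-type flag bit set on an unused page is
    # ignored, and any later checks in the function are not bypassed by
    # an early "return" for unused pages.
    return (hasho_flag & LH_PAGE_TYPE) in VALID_TYPES

HYPOTHETICAL_FLAG = 1 << 7                    # invented, not in hash.h
assert page_type_is_sane(LH_UNUSED_PAGE | HYPOTHETICAL_FLAG)
assert not page_type_is_sane(LH_OVERFLOW_PAGE | LH_META_PAGE)  # two type bits
```

The second assertion shows the check still rejects genuinely corrupt flag combinations after masking.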

### Re: [HACKERS] Re: [COMMITTERS] pgsql: Implement multivariate n-distinct coefficients

```Alvaro Herrera  writes:
> Robert Haas wrote:
>> dromedary and arapaima have failures like this, which seems likely
>> related to this commit:
>>
>> EXPLAIN
>> SELECT COUNT(*) FROM ndistinct GROUP BY a, d;
>> QUERY PLAN
>> -
>> !  HashAggregate  (cost=225.00..235.00 rows=1000 width=16)
>> Group Key: a, d
>> !->  Seq Scan on ndistinct  (cost=0.00..150.00 rows=1 width=8)
>> (3 rows)

> Yes.  What seems to be going on here, is that both arapaima and
> dromedary are 32 bit machines; all the 64 bit ones are passing (except
> for prion which showed a real relcache bug, which I already stomped).
> Now, the difference is that the total cost in those machines for seqscan
> is 155 instead of 150.  Tomas suggests that this happens because
> MAXALIGN is different, leading to packing tuples differently: the
> expected cost (on our laptop's 64 bit) is 155, and the cost we get in 32
> bit arch is 150 -- so 5 pages of difference.  We insert 1000 rows on the
> table; 4 bytes per tuple would amount to 40 kB, which is exactly 5
> pages.

> I'll push an alternate expected file for this test, which we think is
> the simplest fix.

Why not use COSTS OFF?  Or I'll put that even more strongly: all the
existing regression tests use COSTS OFF, exactly to avoid this sort of
machine-dependent output.  There had better be a really damn good
reason not to use it here.

regards, tom lane


```

### Re: [HACKERS] WIP: Faster Expression Processing v4

```Andres Freund  writes:
> On 2017-03-24 11:26:27 -0400, Tom Lane wrote:
>> Another modest proposal:
>>
>> I'm not really sold on the approach of using EEOP_FETCHSOME opcodes to
>> trigger initial tupleslot de-forming.  Certainly we want to have a single
>> slot_getsomeattrs call per source slot, but as-is, we need a separate
>> traversal over the expression tree just to precompute the max attribute
>> number needed.  That can't be good for expression compile speed, and
>> it introduces nontrivial bug risks because the code that does that
>> is completely decoupled from the code that emits the EEOP_VAR opcodes
>> (which are what's really relying on the de-forming to have happened).

> Hm.  We had the separate traversal for projections for a long while, and
> I don't think there've been a lot of changes to the extraction of the
> last attribute number.

That's easily disproven just by looking at the code:

/*
* Don't examine the arguments or filters of Aggrefs or WindowFuncs,
* because those do not represent expressions to be evaluated within the
* calling expression's econtext.  GroupingFunc arguments are never
* evaluated at all.
*/
if (IsA(node, Aggref))
return false;
if (IsA(node, WindowFunc))
return false;
if (IsA(node, GroupingFunc))
return false;
return expression_tree_walker(node, get_last_attnums_walker,
(void *) info);

The WindowFunc exception hasn't been there so long, and the GroupingFunc
one is very new.  And who's to say whether e.g. the recent XMLTABLE patch
got this right at all?  We could easily be extracting columns we don't
need to.

I'm willing to leave this as-is for the moment, but I really think we
should look into changing it (after the main patch is in).

regards, tom lane


```

### [HACKERS] Re: [COMMITTERS] pgsql: Implement multivariate n-distinct coefficients

```Robert Haas wrote:
> On Fri, Mar 24, 2017 at 1:16 PM, Alvaro Herrera
> wrote:
> > Implement multivariate n-distinct coefficients
>
> dromedary and arapaima have failures like this, which seems likely
> related to this commit:
>
>   EXPLAIN
>SELECT COUNT(*) FROM ndistinct GROUP BY a, d;
>QUERY PLAN
>   -
> !  HashAggregate  (cost=225.00..235.00 rows=1000 width=16)
>  Group Key: a, d
> !->  Seq Scan on ndistinct  (cost=0.00..150.00 rows=1 width=8)
>   (3 rows)

Yes.  What seems to be going on here, is that both arapaima and
dromedary are 32 bit machines; all the 64 bit ones are passing (except
for prion which showed a real relcache bug, which I already stomped).
Now, the difference is that the total cost in those machines for seqscan
is 155 instead of 150.  Tomas suggests that this happens because
MAXALIGN is different, leading to packing tuples differently: the
expected cost (on our laptop's 64 bit) is 155, and the cost we get in 32
bit arch is 150 -- so 5 pages of difference.  We insert 1000 rows on the
table; 4 bytes per tuple would amount to 40 kB, which is exactly 5
pages.

I'll push an alternate expected file for this test, which we think is
the simplest fix.

--
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


```

### Re: [HACKERS] increasing the default WAL segment size

```On Wed, Mar 22, 2017 at 6:05 PM, David Steele  wrote:
>> Wait, really?  I thought you abandoned this approach because there's
>> then no principled way to handle WAL segments of less than the default
>> size.
>
> I did say that, but I thought I had hit on a compromise.
>
> But, as I originally pointed out the hex characters in the filename are not
> aligned correctly for > 8 bits (< 16MB segments) and using different
> alignments just made it less consistent.

I don't think I understand what the compromise is.  Are you saying we
should have one rule for segments < 16MB and another rule for segments
> 16MB?  I think using two different rules for file naming depending
on the segment size will be a negative for both tool authors and
ordinary users.

> It would be OK if we were willing to drop the 1,2,4,8 segment sizes because
> then the alignment would make sense and not change the current 16MB
> sequence.

Well, that is true.  But the thing I'm trying to do here is to keep
this patch down to what actually needs to be changed in order to
accomplish the original purpose, not squeeze more and more changes
into it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```
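
The alignment concern in the thread above comes from how WAL file names are derived from the segment number. Below is an illustrative Python sketch of the existing naming convention (three 8-digit hex fields: timeline, "log id", segment within the 4GB log id, as in `XLogFileName`); the choice of 1MB as the small-segment example is mine.

```python
# Sketch of WAL segment file naming: timeline, log id, and segment-in-id,
# where one "log id" covers 0x100000000 bytes of WAL.
def wal_file_name(tli, segno, seg_size):
    segs_per_id = 0x100000000 // seg_size
    return "%08X%08X%08X" % (tli, segno // segs_per_id, segno % segs_per_id)

MB = 1024 * 1024

# With the default 16MB segments, the last field cycles 00..FF:
assert wal_file_name(1, 0, 16 * MB) == "000000010000000000000000"
assert wal_file_name(1, 255, 16 * MB) == "0000000100000000000000FF"
assert wal_file_name(1, 256, 16 * MB) == "000000010000000100000000"

# With 1MB segments there are 4096 segments per log id, so the last field
# runs 000..FFF and the hex digits no longer line up the way tools written
# for 16MB segments expect:
assert wal_file_name(1, 4095, 1 * MB) == "000000010000000000000FFF"
```

With segment sizes below 16MB the segment counter needs more than the low byte of the last field, which is the misalignment David points out.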

### Re: [HACKERS] Logical replication existing data copy

```On 3/24/17 05:22, Michael Banck wrote:
> However, replication also seems to not work, I'm using the following
> script right now:

The problem is that your publication does not include any tables.

--
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


```

### Re: [HACKERS] Logical replication existing data copy

```On 3/24/17 10:09, Petr Jelinek wrote:
> On 24/03/17 15:05, Peter Eisentraut wrote:
>> On 3/23/17 19:32, Petr Jelinek wrote:
>>> Yes, I also forgot to check if the table actually exists on subscriber
>>> when fetching them in CREATE SUBSCRIPTION (we have check during
>>> replication but not there).
>>
>> I think for this we can probably just change the missing_ok argument of
>> RangeVarGetRelid() to false.
>>
>> Unless we want the custom error message, in which case we need to change
>> AlterSubscription_refresh(), because right now it errors out because
>> missing_ok = false.
>>
>
> You are right, stupid me.

Committed this version.

--
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


```

### Re: [HACKERS] Enabling parallelism for queries coming from SQL or other PL functions

```On Fri, Mar 24, 2017 at 5:57 AM, Rafia Sabih
wrote:
>> I suspect that code fails to achieve its goals anyway.  At the top of
>> exec_eval_expr(), you call exec_prepare_plan() and unconditionally
>> pass CURSOR_OPT_PARALLEL_OK, so when that function returns, expr->plan
>> might now be a parallel plan.  If we reach the call to
>> exec_run_select() further down in that function, and if we happen to
>> pass false, it's not going to matter, because exec_run_select() is
>> going to find the plan already initialized.
>>
> True, fixed.
> The attached patch is to be applied over [1].

After some scrutiny I didn't find anything particularly wrong with
this, with the exception that exec_eval_expr() was passing false as
the parallelOK argument to exec_run_select(), which is inconsistent
with that function's earlier use of CURSOR_OPT_PARALLEL_OK to plan the
same query.  I fixed that by ripping out the parallelOK argument
altogether and just passing CURSOR_OPT_PARALLEL_OK when portalP ==
NULL.  The only reason I added parallelOK in the first place was
because of that RETURN QUERY stuff which subsequent study has shown to
be misguided.

Committed that way; please let me know if you see any problems.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```
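
The hazard Robert describes above — a "parallel OK" flag that is silently ignored whenever the plan is already cached — is easy to reproduce in miniature. This is an illustrative Python sketch with hypothetical names, not the actual plpgsql C code.

```python
# Sketch: once exec_prepare_plan() has cached a plan, a later cursor-options
# flag has no effect, because the planner is never consulted again.
CURSOR_OPT_PARALLEL_OK = 0x0100

class Expr:
    def __init__(self):
        self.plan = None          # cached plan, built on first use

def exec_prepare_plan(expr, cursor_options):
    if expr.plan is None:
        expr.plan = {"parallel": bool(cursor_options & CURSOR_OPT_PARALLEL_OK)}

def exec_run_select(expr, cursor_options):
    exec_prepare_plan(expr, cursor_options)   # no-op if already planned
    return expr.plan

expr = Expr()
exec_prepare_plan(expr, CURSOR_OPT_PARALLEL_OK)   # top of exec_eval_expr()
# Passing "parallel not OK" later changes nothing - the cached plan wins:
assert exec_run_select(expr, 0)["parallel"] is True
```

That inconsistency is why ripping out the separate `parallelOK` argument, as committed, is the cleaner fix: the decision is made once, at plan time.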

### Re: [HACKERS] [COMMITTERS] pgsql: Implement multivariate n-distinct coefficients

```On Fri, Mar 24, 2017 at 1:16 PM, Alvaro Herrera  wrote:
> Implement multivariate n-distinct coefficients

dromedary and arapaima have failures like this, which seems likely
related to this commit:

EXPLAIN
SELECT COUNT(*) FROM ndistinct GROUP BY a, d;
QUERY PLAN
-
!  HashAggregate  (cost=225.00..235.00 rows=1000 width=16)
Group Key: a, d
!->  Seq Scan on ndistinct  (cost=0.00..150.00 rows=1 width=8)
(3 rows)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```

### Re: [HACKERS] multivariate statistics (v25)

```Alvaro Herrera wrote:
> Here's a rebased series on top of today's a3eac988c267.  I call this
> v28.
>
> I put David's pg_dump and COMMENT patches as second in line, just after
> the initial infrastructure patch.  I suppose those three have to be
> committed together, while the others (which add support for additional
> statistic types) can rightly remain as separate commits.

As I said in another thread, I pushed parts 0002,0003,0004.  Tomas said
he would try to rebase patches 0001,0005,0006 on top of what was
committed.  My intention is to give that one a look as soon as it is
available.  So we will have n-distinct and functional dependencies in
PG10.  It sounds unlikely that we will get MCVs and histograms in, since
they're each a lot of code.

I suppose we need 0011 too (psql tab completion), but that can wait.

--
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


```

### Re: [HACKERS] Create replication slot in pg_basebackup if requested and not yet present

```Hi,

On Thu, Mar 23, 2017 at 12:41:54PM +0100, Magnus Hagander wrote:
> On Tue, Mar 21, 2017 at 8:34 PM, Robert Haas  wrote:
> > So I tend to think that there should always be some explicit user
> > action to cause the creation of a slot, like --create-slot-if-needed
> > or --create-slot=name.  That still won't prevent careless use of that
> > option but it's less dangerous than assuming that a user who refers to
> > a nonexistent slot intended to create it when, perhaps, they just
> > typo'd it.
>
> Well, the explicit user action would be "--slot". But sure, I can
> definitely live with a --create-if-not-exists.

Can we just make that option create the slot and not worry if one exists
already? CreateReplicationSlot() can be told to not mind about already
existing slots, and I don't see a huge point in erroring out if the slot
exists already, unless somebody can show that this leads to data loss
somehow.

> The important thing, to me, is that you can run it as a single
> command, which makes it a lot easier to work with. (And not like we
> currently have for pg_receivewal which requires a separate command to
> create the slot)

Oh, that is how it works with pg_receivewal, I have to admit I've never
used it so was a bit confused about this when I read its code.

So in that case I think we don't necessarily need to have the same user
interface at all. I first thought about just adding "-C, --create" (as
in "--create --slot=foo"), but on second thought this looked a bit
shortsighted - who knows what flashy thing pg_basebackup might create in
5 years... So I settled on --create-slot, which is only slightly more to
type (albeit repetitive, "--create-slot --slot=foo"), but adding a short
option "-C" would be fine, I think ("-C -S foo").

So attached is a patch which adds that option. If people really think it
should be --create-slot-if-not-exists instead I can update the patch, of
course.

I again added a second patch with some further refactoring which makes
it print a message on temporary slot creation in verbose mode.

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
>From 4e37e110ac402e67874f729832b330a837284d4b Mon Sep 17 00:00:00 2001
From: Michael Banck
Date: Fri, 24 Mar 2017 18:27:47 +0100
Subject: [PATCH 1/2] Add option to create a replication slot in pg_basebackup
if not yet present.

When requesting a particular replication slot, the new option --create tries to
create it before starting to replicate from it, in case it does not exist.
---
doc/src/sgml/ref/pg_basebackup.sgml  | 12
src/bin/pg_basebackup/pg_basebackup.c| 44 +++-
src/bin/pg_basebackup/t/010_pg_basebackup.pl | 23 +--
3 files changed, 75 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index e1cec9d..789f649 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -269,6 +269,18 @@ PostgreSQL documentation

+  -C
+  --create-slot
+
+
+This option can only be used together with the --slot
+	option.  It causes the specified slot name to be created if it does not
+
+
+
+
+
-T olddir=newdir
--tablespace-mapping=olddir=newdir

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 0a4944d..2af2e22 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -92,6 +92,7 @@ static pg_time_t last_progress_report = 0;
static int32 maxrate = 0;		/* no limit by default */
static char *replication_slot = NULL;
static bool temp_replication_slot = true;
+static bool create_slot = false;

static bool success = false;
static bool made_new_pgdata = false;
@@ -337,6 +338,7 @@ usage(void)
" write recovery.conf for replication\n"));
printf(_("  -S, --slot=SLOTNAMEreplication slot to use\n"));
printf(_("  --no-slot  prevent creation of temporary replication slot\n"));
+	printf(_("  -C, --create-slot  create replication slot if not present already\n"));
printf(_("  -T, --tablespace-mapping=OLDDIR=NEWDIR\n"
" relocate tablespace in OLDDIR to NEWDIR\n"));
printf(_("  -X, --wal-method=none|fetch|stream\n"
@@ -581,6 +583,19 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier)
else
param->temp_slot = temp_replication_slot;

+	/*
+	 * Create permanent physical replication slot if requested.
+	 */
+	if (replication_slot && create_slot)
+	{
+		if (verbose)
+			fprintf(stderr, ```

### Re: [HACKERS] WIP: Faster Expression Processing v4

```Hi,

On 2017-03-24 11:26:27 -0400, Tom Lane wrote:
> Another modest proposal:
>
> I'm not really sold on the approach of using EEOP_FETCHSOME opcodes to
> trigger initial tupleslot de-forming.  Certainly we want to have a single
> slot_getsomeattrs call per source slot, but as-is, we need a separate
> traversal over the expression tree just to precompute the max attribute
> number needed.  That can't be good for expression compile speed, and
> it introduces nontrivial bug risks because the code that does that
> is completely decoupled from the code that emits the EEOP_VAR opcodes
> (which are what's really relying on the de-forming to have happened).

Hm.  We had the separate traversal for projections for a long while, and
I don't think there've been a lot of changes to the extraction of the
last attribute number.  I'm very doubtful that the cost of traversing
the expression twice is meaningful in comparison to the other costs.

> What do you think about a design like this:
>
> * Drop the FETCHSOME opcodes.
>
> * Add fields to struct ExprState that will hold the maximum inner,
> outer, and scan attribute numbers needed.
>
> * ExecInitExpr initializes those fields to zero, and then during
> ExecInitExprRec, whenever we generate an EEOP_VAR opcode, we do e.g.
>
>   state->last_inner_attno = Max(state->last_inner_attno,
> variable->varattno);
>
> * ExecInitExprSlots, get_last_attnums_walker, etc all go away.
>
> * In the startup segment of ExecInterpExpr, add
>
>   if (state->last_inner_attno > 0)
>   slot_getsomeattrs(innerslot, state->last_inner_attno);
>   if (state->last_outer_attno > 0)
>   slot_getsomeattrs(outerslot, state->last_outer_attno);
>   if (state->last_scan_attno > 0)
>   slot_getsomeattrs(scanslot, state->last_scan_attno);
>
> This would be a little different from the existing code as far as runtime
> branch-prediction behavior goes, but it's not apparent to me that it'd be
> any worse.

I'd be surprised if it weren't.

I'm not super strongly against this setup, but I fail to see the benefit
of whacking this around.  I've benchmarked the previous/current setup
fairly extensively, and I'd rather not redo that.  In contrast to the
other changes you've talked about, this definitely is in the hot-path...

> Also, for JIT purposes it'd still be entirely possible to compile the
> slot_getsomeattrs calls in-line; you'd just be looking to a different
> part of the ExprState struct to find out what to do.

Yea, we could do that.

Greetings,

Andres Freund


```
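
Tom's proposed bookkeeping in the quoted design above — fold the max-attno tracking into `EEOP_VAR` emission instead of a separate tree walk — can be sketched like this. Illustrative Python, not the actual executor C; slot names are simplified to strings.

```python
# Sketch: track the highest attribute number needed per slot while emitting
# EEOP_VAR steps, then de-form each slot once at interpretation startup.
class ExprState:
    def __init__(self):
        self.steps = []
        self.last_inner_attno = 0   # 0 means "no attributes needed"
        self.last_outer_attno = 0

def emit_var(state, slot, attno):
    # The code that emits the opcode is the same code that records the
    # de-forming requirement, so the two can't drift apart.
    if slot == "inner":
        state.last_inner_attno = max(state.last_inner_attno, attno)
    else:
        state.last_outer_attno = max(state.last_outer_attno, attno)
    state.steps.append(("EEOP_VAR", slot, attno))

state = ExprState()
emit_var(state, "inner", 3)
emit_var(state, "outer", 7)
emit_var(state, "inner", 1)

# Startup segment of the interpreter would then do, per slot:
#   if state.last_inner_attno > 0:
#       slot_getsomeattrs(innerslot, state.last_inner_attno)
assert state.last_inner_attno == 3
assert state.last_outer_attno == 7
```

The attraction is exactly the coupling noted in the comment: no separate walker (`get_last_attnums_walker`) that has to mirror the emission logic's exceptions for Aggref, WindowFunc, and friends.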

### Re: [HACKERS] pg_serial early wraparound

```On Sat, Mar 25, 2017 at 3:11 AM, Anastasia Lubennikova
wrote:
> Hi, I've tried to review this patch, but it seems that I miss something
> essential.

Hi Anastasia,

Thanks for looking at this.

> You claim that SLRUs now support five digit segment name, while in slru.h
> at current master I see the following:
>
>  * Note: slru.c currently assumes that segment file names will be four hex
>  * digits.  This sets a lower bound on the segment size (64K transactions
>  * for 32-bit TransactionIds).
>  */
> #define SLRU_PAGES_PER_SEGMENT  32
>
> /* Maximum length of an SLRU name */
> #define SLRU_MAX_NAME_LENGTH32

That comment is out of date.  Commit 638cf09e extended SLRUs to
support 5 digit names, to support pg_multixact.  And I see now that
commit 73c986ad more recently created the possibility of 6-character
SLRU file names for pg_commit_ts.

> Could you please clarify the idea of the patch? Is it still relevant?

The idea is simply to remove some strange old code including scary
error messages that is no longer needed.  In my study of predicate.c
for other reasons, I noticed this in passing and thought I'd tidy it
up.  Because I have tangled with pg_multixact and seen 5-character
SLRU files with my own eyes, I knew that the restriction that
motivated this code was no longer valid.

> I've also run your test script.
> pg_clog was renamed to pg_xact, so it need to be changed accordingly
> echo "Contents of pg_clog:"
>   ls \$PGDATA/pg_xact/

Right.

> The test shows failed assertion:
>
> == setting next xid to 1073741824 =
> Transaction log reset
> waiting for server to start2017-03-24 17:05:19.897 MSK [1181] LOG:
> listening on IPv4 address "127.0.0.1", port 5432
> 2017-03-24 17:05:19.981 MSK [1181] LOG:  listening on Unix socket
> "/tmp/.s.PGSQL.5432"
> 2017-03-24 17:05:20.081 MSK [1183] LOG:  database system was shut down at
> 2017-03-24 17:05:19 MSK
> 2017-03-24 17:05:20.221 MSK [1181] LOG:  database system is ready to accept
> connections
>  done
> server started
> vacuumdb: vacuuming database "postgres"
> vacuumdb: vacuuming database "template0"
> vacuumdb: vacuuming database "template1"
> TRAP: FailedAssertion("!(TransactionIdPrecedesOrEquals(oldestXact,
> ShmemVariableCache->oldestXid))", File: "clog.c", Line: 669)
> vacuumdb: vacuuming of database "template1" failed: server closed the
> connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> 2017-03-24 17:05:21.541 MSK [1181] LOG:  server process (PID 1202) was
> terminated by signal 6: Aborted
> 2017-03-24 17:05:21.541 MSK [1181] DETAIL:  Failed process was running:
> VACUUM (FREEZE);

My cheap trick for moving the xid around the clock quickly to test
wraparound scenarios no longer works, since this new assertion was
added in ea42cc18.  That was committed just a few hours before you
tested this.  Bad luck for me!

> The new status of this patch is: Waiting on Author

It's not urgent, it's just cleanup work, so I've now moved it to the
next commitfest.  I will try to figure out a new way to demonstrate
that it works correctly without having to ask a reviewer to disable
any assertions.  Thanks again.

--
Thomas Munro
http://www.enterprisedb.com


```
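
The way five- and six-digit SLRU names arise is worth spelling out: `%04X` in a printf-style format is a *minimum* field width, so the name simply grows once the segment number exceeds 0xFFFF. A quick sketch (this assumes the `%04X` convention of `SlruFileName`; the specific segment numbers are illustrative):

```python
# Sketch of SLRU segment file naming: "%04X" pads small segment numbers to
# four hex digits but does not truncate larger ones, so no separate code
# path is needed for 5- or 6-digit names.
def slru_file_name(segno):
    return "%04X" % segno

assert slru_file_name(0x0000) == "0000"
assert slru_file_name(0xFFFF) == "FFFF"
assert slru_file_name(0x10000) == "10000"      # 5 digits (seen in pg_multixact)
assert slru_file_name(0x100000) == "100000"    # 6 digits (possible for pg_commit_ts)
```

That's why the stale "four hex digits" comment in slru.h no longer reflects reality.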

### Re: [HACKERS] extended statistics: n-distinct

```Pushed this after some more tweaking.

--
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


```

### Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

```On Fri, Mar 24, 2017 at 6:46 PM, Amit Kapila
wrote:

>
>
> I was worried for the case if the index is created non-default
> collation, will the datumIsEqual() suffice the need.  Now again
> thinking about it, I think it will because in the index tuple we are
> storing the value as in heap tuple.  However today it occurred to me
> how will this work for toasted index values (index value >
> TOAST_INDEX_TARGET).  It is mentioned on top of datumIsEqual() that it
> probably won't work for toasted values.  Have you considered that
> point?
>
>
No, I haven't, and thanks for bringing that up. Now that I think more
about it and look at the code, I believe the naive way of just comparing
index attribute values against heap values is probably wrong. The
TOAST_INDEX_TARGET case is one example, but I wonder if there are other
varlena attributes that we might store differently in heap and index;
index_form_tuple() -> heap_fill_tuple() seems to do some churning for
varlenas. It's not clear to me that index_getattr() will return values
which are binary comparable to the heap values. I wonder if calling
index_form_tuple() on the heap values, then fetching attributes via
index_getattr() on both index tuples and doing a binary compare, is a
more robust idea. Or maybe that's just duplicating effort.

While looking at this problem, it occurred to me that the assumptions made
for hash indexes are also wrong :-( Hash indexes have the same problem as
expression indexes do: a change in the heap value may not necessarily cause
a change in the hash key. If we don't detect that, we will end up having
two identical hash keys with the same TID pointer. This will cause the
duplicate key scans problem, since hashrecheck will return true for both
hash entries. That's a bummer as far as supporting WARM for hash indexes is
concerned, unless we find a way to avoid duplicate index entries.

Thanks,
Pavan

--
PostgreSQL Development, 24x7 Support, Training & Services

```
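
The duplicate-entry hazard Pavan describes can be shown with a toy hash index. This is an illustrative Python sketch: `weak_hash` is a deliberately collision-prone stand-in for `hash_any()`, and the list-of-tuples "index" is hypothetical, not the real on-disk structure.

```python
# Toy illustration: a WARM update changes the heap value in some column,
# but the hash key of the indexed column happens to collide, so a naive
# re-insert yields two identical (hashkey, tid) entries - and a recheck
# based on the hash key alone cannot tell them apart.
def weak_hash(value, nbuckets=8):
    return len(value) % nbuckets      # collision-prone stand-in for hash_any()

index = []                            # list of (hashkey, tid) entries
tid = (0, 1)                          # heap TID reused by the WARM update chain

old_val, new_val = "apple", "lemon"   # different heap values...
assert weak_hash(old_val) == weak_hash(new_val)   # ...same hash key

index.append((weak_hash(old_val), tid))
index.append((weak_hash(new_val), tid))   # naive insert: identical entry

# A recheck that only compares hash keys accepts both entries, so a scan
# would return the same heap tuple twice:
matches = [e for e in index if e[0] == weak_hash(new_val)]
assert len(matches) == 2
```

Avoiding the second insert requires detecting that the key is unchanged, which is exactly the "avoid duplicate index entries" problem the email ends on.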

### Re: [HACKERS] [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

```On 2017-03-24 13:50:54 -0400, Robert Haas wrote:
> On Fri, Mar 24, 2017 at 12:27 PM, Robert Haas  wrote:
> > On Fri, Mar 24, 2017 at 12:14 PM, Robert Haas  wrote:
> >> On Fri, Mar 24, 2017 at 10:23 AM, Simon Riggs
> >> wrote:
> >>> Avoid SnapshotResetXmin() during AtEOXact_Snapshot()
> >>>
> >>> For normal commits and aborts we already reset PgXact->xmin
> >>> Avoiding touching highly contented shmem improves concurrent
> >>> performance.
> >>>
> >>> Simon Riggs
> >>
> >> I'm getting occasional crashes with backtraces that look like this:
> >>
> >> #4  0x000107e4be2b in AtEOXact_Snapshot (isCommit=<value temporarily unavailable, due to optimizations>, isPrepare=0 '\0') at
> >> snapmgr.c:1154
> >> #5  0x000107a76c06 in CleanupTransaction () at xact.c:2643
> >>
> >> I suspect that is the fault of this patch.  Please fix or revert.
> >
> > Also, the entire buildfarm is turning red.
> >
> > longfin, spurfowl, and magpie all show this assertion failure in the
> > log.  I haven't checked the others.
> >
> > TRAP: FailedAssertion("!(MyPgXact->xmin == 0)", File: "snapmgr.c", Line:
> > 1154)
>
> Another thing that is interesting is that when I run make -j8
> check-world, the overall tests appear to succeed even though there are
> failures mid-way through:
>
> test tablefunc... FAILED (test process exited with exit code
> 2)
>
> ...but then later we end with:
>
> ok
> All tests successful.
> Files=11, Tests=80, 251 wallclock secs ( 0.07 usr  0.02 sys + 19.77
> cusr 14.45 csys = 34.31 CPU)
> Result: PASS

> real4m27.421s
> user3m50.047s
> sys1m31.937s

> That's unrelated to the current problem of course, but it seems to
> suggest that make's -j option doesn't entirely do what you'd expect
> when used with make check-world.
>

That's likely the output of a different test from the one that failed.
It's a lot easier to see the result if you're doing
&& echo success || echo failure

- Andres


```

### Re: [HACKERS] [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

```On 24 March 2017 at 16:14, Robert Haas  wrote:

> I suspect that is the fault of this patch.  Please fix or revert.

Will revert then fix.

--
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


```

### Re: [HACKERS] [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

```On Fri, Mar 24, 2017 at 12:27 PM, Robert Haas  wrote:
> On Fri, Mar 24, 2017 at 12:14 PM, Robert Haas  wrote:
>> On Fri, Mar 24, 2017 at 10:23 AM, Simon Riggs  wrote:
>>> Avoid SnapshotResetXmin() during AtEOXact_Snapshot()
>>>
>>> For normal commits and aborts we already reset PgXact->xmin
>>> Avoiding touching highly contented shmem improves concurrent
>>> performance.
>>>
>>> Simon Riggs
>>
>> I'm getting occasional crashes with backtraces that look like this:
>>
>> #4  0x000107e4be2b in AtEOXact_Snapshot (isCommit=<value temporarily unavailable, due to optimizations>, isPrepare=0 '\0') at
>> snapmgr.c:1154
>> #5  0x000107a76c06 in CleanupTransaction () at xact.c:2643
>>
>> I suspect that is the fault of this patch.  Please fix or revert.
>
> Also, the entire buildfarm is turning red.
>
> longfin, spurfowl, and magpie all show this assertion failure in the
> log.  I haven't checked the others.
>
> TRAP: FailedAssertion("!(MyPgXact->xmin == 0)", File: "snapmgr.c", Line: 1154)

Another thing that is interesting is that when I run make -j8
check-world, the overall tests appear to succeed even though there are
failures mid-way through:

test tablefunc... FAILED (test process exited with exit code 2)

...but then later we end with:

ok
All tests successful.
Files=11, Tests=80, 251 wallclock secs ( 0.07 usr  0.02 sys + 19.77
cusr 14.45 csys = 34.31 CPU)
Result: PASS

real4m27.421s
user3m50.047s
sys1m31.937s

That's unrelated to the current problem of course, but it seems to
suggest that make's -j option doesn't entirely do what you'd expect
when used with make check-world.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```

### Re: [HACKERS] Report the number of skipped frozen pages by manual VACUUM

```On Thu, Mar 23, 2017 at 4:28 AM, Fujii Masao  wrote:
> On Wed, Mar 15, 2017 at 7:51 PM, Masahiko Sawada
> wrote:
>> On Wed, Mar 15, 2017 at 1:09 PM, Yugo Nagata  wrote:
>>> On Fri, 10 Mar 2017 20:08:42 +0900
>>> Masahiko Sawada  wrote:
>>>
On Fri, Mar 10, 2017 at 3:58 PM, Jim Nasby  wrote:
> On 3/6/17 8:34 PM, Masahiko Sawada wrote:
>>
>> I don't think it can say "1 frozen pages" because the number of
>> skipped pages according to visibility map is always more than 32
>> (SKIP_PAGES_THRESHOLD).
>
>
> That's just an artifact of how the VM currently works. I'm not a fan of
> cross dependencies like that unless there's a pretty good reason.

Thank you for the comment.
Agreed. Attached fixed version patch.
>>>
>>> It seems that it says "frozen pages" when pinskipped_pages is non-zero,
>>> rather than based on frozenskipped_pages.
>>>
>>> How about writing as below?
>>>
>>> appendStringInfo(&buf, ngettext("Skipped %u page due to buffer pins, ",
>>>                                 "Skipped %u pages due to buffer pins, ",
>>>                                 vacrelstats->pinskipped_pages),
>>>                  vacrelstats->pinskipped_pages);
>>> appendStringInfo(&buf, ngettext("%u frozen page.\n",
>>>                                 "%u frozen pages.\n",
>>>                                 vacrelstats->frozenskipped_pages),
>>>                  vacrelstats->frozenskipped_pages);
>>>
>>
>> Yeah, you're right. I've attached the updated version patch
>> incorporated your comment and fixing documentation.
>
> The patch looks very simple and good to me.
> Barring objection, I will commit this.

Pushed.

Regards,

--
Fujii Masao


```

### Re: [HACKERS] increasing the default WAL segment size

```Hello,

On Wed, Mar 22, 2017 at 9:41 PM, Kuntal Ghosh
wrote:

> On Wed, Mar 22, 2017 at 3:14 PM, Beena Emerson
> wrote:
> > PFA an updated patch which fixes a minor bug I found. It only increases
> > the string size in pretty_wal_size function.
> > The 01-add-XLogSegmentOffset-macro.patch has also been rebased.
> Thanks for the updated versions. Here is a partial review of the patch:
>
> In pg_standby.c and pg_waldump.c,
> + XLogPageHeader hdr = (XLogPageHeader) buf;
> + XLogLongPageHeader NewLongPage = (XLogLongPageHeader) hdr;
> +
> + XLogSegSize = NewLongPage->xlp_seg_size;
> It waits until the file is at least XLOG_BLCKSZ, then gets the
> expected final size from XLogPageHeader. This looks really clean
> compared to the previous approach.
>

thank you for testing. PFA the rebased patch incorporating your comments.

>
> + * Verify that the min and max wal_size meet the minimum requirements.
> Better to write min_wal_size and max_wal_size.
>

Updated wherever applicable.

>
> + errmsg("Insufficient value for \"min_wal_size\"")));
> "min_wal_size %d is too low" may be? Use lower case for error
> messages. Same for max_wal_size.
>

updated.

>
> + /* Set the XLogSegSize */
> + XLogSegSize = ControlFile->xlog_seg_size;
> +
> A call to IsValidXLogSegSize() will be good after this, no?
>

Is it necessary? We already have the CRC check for ControlFiles. Is that
not enough?

>
> + /* Update variables using XLogSegSize */
> + check_wal_size();
> The method name looks somewhat misleading compared to the comment for
> it, doesn't it?
>

The method name is correct since it only checks whether the value provided
is sufficient (at least twice the segment size). I have updated the comment
to say "Check and update variables dependent on XLogSegSize".

> This patch introduces a new guc_unit having values in MB for
> max_wal_size and min_wal_size. I'm not sure about the upper limit
> which is set to INT_MAX for 32-bit systems as well. Is it needed to
> define something like MAX_MEGABYTES similar to MAX_KILOBYTES?
> It is worth mentioning that GUC_UNIT_KB can't be used in this case
> since MAX_KILOBYTES is INT_MAX / 1024(<2GB) on 32-bit systems. That's
> not a sufficient value for min_wal_size/max_wal_size.
>

You are right. I can use the same value as upper limit for GUC_UNIT_MB, we
could probably change the name of MAX_KILOBYTES to something more general
for both GUC_UNIT_MB and GUC_UNIT_KB. The max size in 32-bit systems would
be 2TB.

> While testing with pg_waldump, I got the following error.
> bin/pg_waldump -p master/pg_wal/ -s 0/0100
> Floating point exception (core dumped)
> Stack:
> #0  0x004039d6 in ReadPageInternal ()
> #1  0x00404c84 in XLogFindNextRecord ()
> #2  0x00401e08 in main ()
> I think that the problem is in following code:
> /* parse files as start/end boundaries, extract path if not specified */
> if (optind < argc)
> {
>
> + if (!RetrieveXLogSegSize(full_path))
> ...
> }
> In this case, RetrieveXLogSegSize is conditionally called. So, if the
> condition is false, XLogSegSize will not be initialized.
>
>
pg_waldump code has been updated.

--

Beena Emerson

EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

02-initdb-walsegsize-v8.patch
Description: Binary data


```

### Re: [HACKERS] Monitoring roles patch

```On Fri, Mar 24, 2017 at 12:46 PM, Dave Page  wrote:
> On Fri, Mar 24, 2017 at 4:24 PM, Robert Haas  wrote:
>>> If we make the users run all the statements individually then they'll
>>> also have to get an updated script for the next version of PG too
>>> because we will have added things that the tools will want access to.
>>
>> That's possible, but it really depends on the tool, not on core
>> PostgreSQL.  The tool should be the one providing the script, because
>> the tool is what knows its own permissions requirements.  Users who
>> care about security won't want to grant every privilege given by a
>> pg_monitor role to a tool that only needs a subset of those
>> privileges.
>
> The upshot of this would be that every time there's a database server
> upgrade that changes the permissions required somehow, the user has to
> log in to every server they have and run a script. It is no longer a
> one-time thing, which makes it vastly more painful to deal with.

So, I might be all wet here, but I would have expected that changes on
the TOOL side would be vastly more frequent.  I mean, we do not change
what a certain builtin permission does very often.  If we add
pg_read_all_settings, what is the likelihood that the remit of that
role is ever going to change?  I would judge that what is vastly more
likely is that a new version of some tool would start needing that
privilege (or some other) where it didn't before.  If that happens,
and the script is on the tool side, then you just add a new line to
the script.  If the script is built into initdb, then you've got to
wait for the next major release before you can update the definition
of pg_monitor - and maybe argue with other tool authors with different
requirements.

>>> With the approach that Dave and I are advocating, we can avoid all of
>>> that.  Contrib modules can bake-in GRANTs to the appropriate roles,
>>> upgrades can be handled smoothly even when we add new capabilities which
>>> are appropriate, users have a simple and straight-forward way to set up
>>> good monitoring, and tool authors will know what permissions are
>>> available and can finally have a better answer than "well, just make the
>>> monitor user superuser if you want to avoid all these complexities."
>>
>> They can have that anyway.  They just have to run a script provided by
>> the tool rather than one provided by the core project as a
>> one-size-fits-all solution for every tool.
>
> Do you object to having individual default roles specifically for
> cases where there are superuser-only checks at the moment that prevent
> GRANT? e.g. pg_show_all_settings for SHOW, pg_show_all_stats for
> pg_tablespace_size and friends, pg_stat_statements for, well,
> pg_stat_statements and so on? It would be an inferior solution in my
> opinion, given that it would demonstrably cause users more work, but
> at least we could do something.

I think pg_show_all_settings is brilliant.  It's narrowly targeted and
in no way redundant with what can anyway be done with GRANT otherwise.
As far as the others, I think that we should just let people GRANT
privileges one by one whenever that's possible, and use built-in roles
where it isn't.  So I'm fine with this:

-if (tblspcOid != MyDatabaseTableSpace)
+if (tblspcOid != MyDatabaseTableSpace &&

But I don't like this:

+GRANT EXECUTE ON FUNCTION pgstattuple(text) TO pg_read_all_stats;

My position is that execute permission on a function is already
grantable, so granting it by default to a built-in role is just
removing flexibility which would otherwise be available to the user.
I do understand that in some cases that may result in a long list of
GRANT commands to make a particular monitoring tool work, but I think
the right solution is to package those commands in an extension or
script distributed with that tool.  When there's a permission that is
reasonably necessary for monitoring tools (or some other reasonable
purpose) and we can't arrange for that permission to be given via a
GRANT EXECUTE ON FUNCTION, then I support adding built-in roles for
those things.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```

### Re: [HACKERS] Supporting huge pages on Windows

```Hi,

On Fri, Mar 24, 2017 at 9:00 PM, David Steele  wrote:
> Hi Ashutosh,
>
> On 3/22/17 8:52 AM, Amit Kapila wrote:
>>
>> On Wed, Mar 22, 2017 at 12:07 AM, David Steele
>> wrote:
>>>
>>>
>>> Amit, Magnus, you are signed up as reviewers for this patch.  Do you know
>>> when you'll have a chance to take a look?
>>>
>>
>> I have provided my feedback and I could not test it on my machine.  I
>> think Ashutosh Sharma has tested it.  I can give it another look, but
>> not sure if it helps.
>
>
> I know you are not signed up as a reviewer, but have you taken a look at this
> patch?

Well, I had a quick look at the patch just because I wanted to test
it, as I have a Windows setup. But I never looked at the patch from a
reviewer's perspective. The intention was to re-verify the test
results shared by Tsunakawa.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com


```

### Re: [HACKERS] BUG: pg_dump generates corrupted gzip file in Windows

```On Fri, Mar 24, 2017 at 6:44 AM, Kuntal Ghosh
wrote:
>> As far as I understand, if we use binary mode, output is stored bit by bit. In ASCII
>> mode, cmd pokes its nose and does CR / LF conversions on its own. So,
>> whenever we want compression on a plain-text dump file, we can set the
>> stdout mode to O_BINARY. Is it a wrong approach?
> With the help from Ashutosh Sharma, I tested this in Windows
> environment. Sadly, it still doesn't work. :( IMHO, we should document
> the issue somewhere.

Why not?  I mean, if there's code there to force the output into
binary mode, does that not work for the -Fc case?  And if it does work
for the -Fc case, then why doesn't it also work for -Z9?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```

### Re: [HACKERS] Monitoring roles patch

```On Fri, Mar 24, 2017 at 4:24 PM, Robert Haas  wrote:
>
>> If we make the users run all the statements individually then they'll
>> also have to get an updated script for the next version of PG too
>> because we will have added things that the tools will want access to.
>
> That's possible, but it really depends on the tool, not on core
> PostgreSQL.  The tool should be the one providing the script, because
> the tool is what knows its own permissions requirements.  Users who
> care about security won't want to grant every privilege given by a
> pg_monitor role to a tool that only needs a subset of those
> privileges.

The upshot of this would be that every time there's a database server
upgrade that changes the permissions required somehow, the user has to
log in to every server they have and run a script. It is no longer a
one-time thing, which makes it vastly more painful to deal with.

>> With the approach that Dave and I are advocating, we can avoid all of
>> that.  Contrib modules can bake-in GRANTs to the appropriate roles,
>> upgrades can be handled smoothly even when we add new capabilities which
>> are appropriate, users have a simple and straight-forward way to set up
>> good monitoring, and tool authors will know what permissions are
>> available and can finally have a better answer than "well, just make the
>> monitor user superuser if you want to avoid all these complexities."
>
> They can have that anyway.  They just have to run a script provided by
> the tool rather than one provided by the core project as a
> one-size-fits-all solution for every tool.

Do you object to having individual default roles specifically for
cases where there are superuser-only checks at the moment that prevent
GRANT? e.g. pg_show_all_settings for SHOW, pg_show_all_stats for
pg_tablespace_size and friends, pg_stat_statements for, well,
pg_stat_statements and so on? It would be an inferior solution in my
opinion, given that it would demonstrably cause users more work, but
at least we could do something.

--
Dave Page
Blog: http://pgsnake.blogspot.com

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```

### Re: [HACKERS] \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)

```

Hello Corey,

v24 highlights:

- finally using git format-patch
- all conditional slash commands broken out into their own functions
(exec_command_$NAME), each one tests if it's in an active branch, and if
it's not it consumes the same number of parameters, but discards them.
comments for each slash-command family were left as-is, may need to be
bulked up.
- TAP tests discarded in favor of one ON_ERROR_STOP test for invalid \elif
placement
- documentation recommending ON_ERROR_STOP removed, because invalid
expressions no longer throw off if-endif balance
- documentation updated to reflect that contextually-correct-but-invalid
boolean expressions are treated as false
- psql_get_variable has a passthrough void pointer now, but I ended up not
needing it. Instead, all slash commands in false blocks either fetch
OT_NO_EVAL or OT_WHOLE_LINE options. If I'm missing something, let me know.

Patch applies. "make check" ok.

As already pointed out, there is one useless file in the patch.

Although currently there is only one expected argument for boolean
expressions, the n² concatenation algorithm in gather_boolean_expression
is not very elegant. Is there some string buffer data structure which
could be used instead?

ISTM that ignore_boolean_expression may call free on a NULL pointer if the
expression is empty?

Generally I find the per-command functions rather an improvement.

However there is an impact on testing because of all these changes. ISTM
that test cases should reflect this situation and test that \cd, \edit,
... are indeed ignored properly, taking into account their expected args...

In "exec_command_connect" an argument is changed from "-reuse-previous" to
"-reuse-previous=", not sure why.

There seems to be pattern repetition for _ev _ef and _sf _sv. Would it
make sense to create a function instead of keeping the initial copy-paste?

I think that some functions could be used for some repeated cases such as
"discard one arg", "discard one or two arg", "discard whole line", for the
various inactive branches, so as to factor out code.

I would suggest putting together all if-related backslash commands,
so that the stack management is all in one function instead of 4.

For pset the inactive branch does OT_NORMAL instead of OT_NOT_EVAL, not
sure why.

--
Fabien.

```

### Re: [HACKERS] parallel "return query" is no good

```On Thu, Mar 23, 2017 at 1:53 PM, Alvaro Herrera
wrote:
> Robert Haas wrote:
>> I guess the downside of back-patching this is that it could cause a
>> plan change for somebody which ends up being worse.  On the whole,
>> serial execution of queries intended to be run in parallel isn't
>> likely to work out well, but it's always possible somebody has a case
>> where it happens to be winning, and this could break it.  So maybe I
>> should do this only in master?  Thoughts?
>
> I think that the chances of someone depending on a parallel plan running
> serially by accident which is better than the non-parallel plan, are
> pretty slim.
>
> +1 for back-patching.

All right, done.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```

### Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)

```On Fri, Mar 24, 2017 at 2:27 AM, Craig Ringer  wrote:
> On 24 March 2017 at 02:29, Robert Haas  wrote:
>> On Tue, Mar 21, 2017 at 11:35 PM, Craig Ringer  wrote:
>>> Changes made per discussion.
>>
>> Committed 0001.
>
> Much appreciated.
>
> Here's the 2nd patch rebased on top of master, with the TAP test
> included this time. Looks ready to go.

Committed.

> I really appreciate the time you've taken to look at this. Point me at
> anything from your team you want some outside review on.

Thanks, that is a valuable offer which I will be pleased to accept!

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```

### Re: [HACKERS] [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

```On Fri, Mar 24, 2017 at 12:14 PM, Robert Haas  wrote:
> On Fri, Mar 24, 2017 at 10:23 AM, Simon Riggs  wrote:
>> Avoid SnapshotResetXmin() during AtEOXact_Snapshot()
>>
>> For normal commits and aborts we already reset PgXact->xmin
>> Avoiding touching highly contented shmem improves concurrent
>> performance.
>>
>> Simon Riggs
>
> I'm getting occasional crashes with backtraces that look like this:
>
> #4  0x000107e4be2b in AtEOXact_Snapshot (isCommit=<value temporarily unavailable, due to optimizations>, isPrepare=0 '\0') at
> snapmgr.c:1154
> #5  0x000107a76c06 in CleanupTransaction () at xact.c:2643
>
> I suspect that is the fault of this patch.  Please fix or revert.

Also, the entire buildfarm is turning red.

longfin, spurfowl, and magpie all show this assertion failure in the
log.  I haven't checked the others.

TRAP: FailedAssertion("!(MyPgXact->xmin == 0)", File: "snapmgr.c", Line: 1154)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```

### Re: [HACKERS] Monitoring roles patch

```On Fri, Mar 24, 2017 at 8:30 AM, Stephen Frost  wrote:
>> But why not do it with GRANTs in the first place then?
>
> This is akin to asking why do we need GRANT ALL and ALTER DEFAULT PRIVs.

Not really.  ALTER DEFAULT PRIVILEGES affects what happens for future
objects, which isn't necessary here at all.  The monitoring tool only
needs to be granted the minimum set of privileges that it currently
needs, not future privileges which it or other monitoring tools may
need in future versions.  GRANT ALL is closer, but GRANT .. ON ALL
TABLES IN SCHEMA really is just a convenience function.  You would
still have the capability to do the exact same thing without that
function; it would just be more work.  And the same is true here.  I
understand why you and Dave want to make this a single command: that's
easier.  But, like GRANT ON ALL TABLES, it's also less flexible.  If
you want to grant on all tables except one, you can't use that magic
syntax any more, and so here.  pg_monitor will either target one
particular monitoring solution (hey, maybe it'll be the one my
employer produces!) or the judgement of one particular person
(whatever Stephen Frost thinks it SHOULD do in a given release,
independent of what tools on the ground do) and those aren't good
definitions.

> If we make the users run all the statements individually then they'll
> also have to get an updated script for the next version of PG too
> because we will have added things that the tools will want access to.

That's possible, but it really depends on the tool, not on core
PostgreSQL.  The tool should be the one providing the script, because
the tool is what knows its own permissions requirements.  Users who
care about security won't want to grant every privilege given by a
pg_monitor role to a tool that only needs a subset of those
privileges.

> Further, they'll have to make sure to install all the contrib modules
> they want to use before running the GRANT script which is provided, or
> deal with GRANT'ing the rights out after installing some extension.

That doesn't sound very hard.

> With the approach that Dave and I are advocating, we can avoid all of
> that.  Contrib modules can bake-in GRANTs to the appropriate roles,
> upgrades can be handled smoothly even when we add new capabilities which
> are appropriate, users have a simple and straight-forward way to set up
> good monitoring, and tool authors will know what permissions are
> available and can finally have a better answer than "well, just make the
> monitor user superuser if you want to avoid all these complexities."

They can have that anyway.  They just have to run a script provided by
the tool rather than one provided by the core project as a
one-size-fits-all solution for every tool.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```

### Re: [HACKERS] Re: [BUGS] Problem in using pgbench's --connect(-C) and --rate=rate(-R rate) options together.

```
No, it is really needed so that the lag measure is correct.

Thank you, pushed

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
WWW: http://www.sigaev.ru/


```

### Re: [HACKERS] Re: Declarative partitioning optimization for large amount of partitions

```Hi Teodor,

Thanks a lot for a review!

> > step1 In pgstat_report_stat() you remove one by one entries from hash and
> > remove them all. Isn't it better to hash_destroy/hash_create or even let
> > hash
> > lives in separate memory context and just resets it?

Agree, fixed.

> > step1 Again, pgstat_report_stat(), all-zero entries aren't deleted from hash
> > although they will be free from point of view of pgStatTabList.

Good point! Fixed.

--
Best regards,
Aleksander Alekseev
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index b704788eb5..4060f241e2 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -161,6 +161,20 @@ typedef struct TabStatusArray
static TabStatusArray *pgStatTabList = NULL;

/*
+ * pgStatTabHash entry
+ */
+typedef struct TabStatHashEntry
+{
+	Oid t_id;
+	PgStat_TableStatus* tsa_entry;
+} TabStatHashEntry;
+
+/*
+ * Hash table for O(1) t_id -> tsa_entry lookup
+ */
+static HTAB *pgStatTabHash = NULL;
+
+/*
* Backends store per-function info that's waiting to be sent to the collector
* in this hash table (indexed by function OID).
*/
@@ -824,6 +838,12 @@ pgstat_report_stat(bool force)
}

/*
+	 * pgStatTabHash is outdated on this point so we have to clean it.
+	 */
+	hash_destroy(pgStatTabHash);
+	pgStatTabHash = NULL;
+
+	/*
* Send partial messages.  Make sure that any pending xact commit/abort
* gets counted, even if there are no table stats to send.
*/
@@ -1668,59 +1688,87 @@ pgstat_initstats(Relation rel)
}

/*
+ * Make sure pgStatTabList and pgStatTabHash are initialized.
+ */
+static void
+make_sure_stat_tab_initialized()
+{
+	HASHCTL			ctl;
+	MemoryContext	new_ctx;
+
+	if(!pgStatTabList)
+	{
+		/* This is first time procedure is called */
+		pgStatTabList = (TabStatusArray *) MemoryContextAllocZero(TopMemoryContext,
+sizeof(TabStatusArray));
+	}
+
+	if(pgStatTabHash)
+		return;
+
+	/* Hash table was freed or never existed.  */
+
+	new_ctx = AllocSetContextCreate(
+		TopMemoryContext,
+		"PGStatLookupHashTableContext",
+		ALLOCSET_DEFAULT_SIZES);
+
+	memset(&ctl, 0, sizeof(ctl));
+	ctl.keysize = sizeof(Oid);
+	ctl.entrysize = sizeof(TabStatHashEntry);
+	ctl.hcxt = new_ctx;
+
+	pgStatTabHash = hash_create("pgstat t_id to tsa_entry lookup hash table",
+		TABSTAT_QUANTUM, &ctl, HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+}
+
+/*
* get_tabstat_entry - find or create a PgStat_TableStatus entry for rel
*/
static PgStat_TableStatus *
get_tabstat_entry(Oid rel_id, bool isshared)
{
+	TabStatHashEntry* hash_entry;
PgStat_TableStatus *entry;
TabStatusArray *tsa;
-	TabStatusArray *prev_tsa;
-	int			i;
+	bool found;
+
+	make_sure_stat_tab_initialized();

/*
-	 * Search the already-used tabstat slots for this relation.
+	 * Find an entry or create a new one.
*/
-	prev_tsa = NULL;
-	for (tsa = pgStatTabList; tsa != NULL; prev_tsa = tsa, tsa = tsa->tsa_next)
+	hash_entry = hash_search(pgStatTabHash, &rel_id, HASH_ENTER, &found);
+	if(found)
+		return hash_entry->tsa_entry;
+
+	/*
+	 * `hash_entry` was just created and now we have to fill it.
+	 * First make sure there is a free space in a last element of pgStatTabList.
+	 */
+	tsa = pgStatTabList;
+	while(tsa->tsa_used == TABSTAT_QUANTUM)
{
-		for (i = 0; i < tsa->tsa_used; i++)
+		if(tsa->tsa_next == NULL)
{
-			entry = &tsa->tsa_entries[i];
-			if (entry->t_id == rel_id)
-				return entry;
+			tsa->tsa_next = (TabStatusArray *) MemoryContextAllocZero(TopMemoryContext,
+		sizeof(TabStatusArray));
}

-		if (tsa->tsa_used < TABSTAT_QUANTUM)
-		{
-			/*
-			 * It must not be present, but we found a free slot instead. Fine,
-			 * let's use this one.  We assume the entry was already zeroed,
-			 * either at creation or after last use.
-			 */
-			entry = &tsa->tsa_entries[tsa->tsa_used++];
-			entry->t_id = rel_id;
-			entry->t_shared = isshared;
-			return entry;
-		}
+		tsa = tsa->tsa_next;
}

/*
-	 * We ran out of tabstat slots, so allocate more.  Be sure they're zeroed.
-	 */
-	tsa = (TabStatusArray *) MemoryContextAllocZero(TopMemoryContext,
-	sizeof(TabStatusArray));
-	if (prev_tsa)
-		prev_tsa->tsa_next = tsa;
-	else
-		pgStatTabList = tsa;
-
-	/*
-	 * Use the first entry of the new TabStatusArray.
+	 * Add an entry.
*/
	entry = &tsa->tsa_entries[tsa->tsa_used++];
entry->t_id = rel_id;
entry->t_shared = isshared;
+
+	/*
+	 * Add a corresponding entry to pgStatTabHash.
+	 */
+	hash_entry->tsa_entry = entry;
return entry;
}

@@ -1732,22 +1780,19 @@ get_tabstat_entry(Oid rel_id, bool isshared)
PgStat_TableStatus *
find_tabstat_entry(Oid rel_id)
{
-	PgStat_TableStatus *entry;
-	TabStatusArray *tsa;
-	int			i;
+	TabStatHashEntry* hash_entry;

-	for (tsa = pgStatTabList; tsa != NULL; tsa = tsa->tsa_next)
-	{
-		for (i = 0; i < tsa->tsa_used; i++)
-		{
-			entry = &tsa->tsa_entries[i];
-			if (entry->t_id == rel_id)
-				return entry;
-		}
-	}
+	/*
+	 * There are no entries at all.
+	 */
+	if(!pgStatTabHash)
+		```

### Re: [HACKERS] [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

```On Fri, Mar 24, 2017 at 10:23 AM, Simon Riggs  wrote:
> Avoid SnapshotResetXmin() during AtEOXact_Snapshot()
>
> For normal commits and aborts we already reset PgXact->xmin
> Avoiding touching highly contented shmem improves concurrent
> performance.
>
> Simon Riggs

I'm getting occasional crashes with backtraces that look like this:

#0  0x7fff9679c286 in __pthread_kill ()
#1  0x7fff94e1a9f9 in pthread_kill ()
#2  0x7fff9253a9a3 in abort ()
#3  0x000107e0659e in ExceptionalCondition (conditionName=, errorType=0x6 , fileName=, lineNumber=) at assert.c:54
#4  0x000107e4be2b in AtEOXact_Snapshot (isCommit=<value temporarily unavailable, due to optimizations>, isPrepare=0 '\0') at
snapmgr.c:1154
#5  0x000107a76c06 in CleanupTransaction () at xact.c:2643
#6  0x000107a76267 in CommitTransactionCommand () at xact.c:2818
#7  0x000107cecfc2 in exec_simple_query
(query_string=0x7f975481e640 "ABORT TRANSACTION") at postgres.c:2461
#8  0x000107ceabb7 in PostgresMain (argc=, argv=, dbname=, username=) at postgres.c:4071
#9  0x000107c6bb58 in PostmasterMain (argc=, argv=) at postmaster.c:4317
#10 0x000107be5cdd in main (argc=, argv=) at main.c:228

I suspect that is the fault of this patch.  Please fix or revert.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```

### Re: [HACKERS] Parallel Append implementation

```On 24 March 2017 at 00:38, Amit Khandekar  wrote:
> On 23 March 2017 at 16:26, Amit Khandekar  wrote:
>> On 23 March 2017 at 05:55, Robert Haas  wrote:
>>>
>>> So in your example we do this:
>>>
>>> C[0] += 20;
>>> C[1] += 16;
>>> C[2] += 10;
>>> /* C[2] is smaller than C[0] or C[1] at this point, so we add the next
>>> path to C[2] */
>>> C[2] += 8;
>>> /* after the previous line, C[1] is now the smallest, so add to that
>>> entry next */
>>> C[1] += 3;
>>> /* now we've got C[0] = 20, C[1] = 19, C[2] = 18, so add to C[2] */
>>> C[2] += 1;
>>> /* final result: C[0] = 20, C[1] = 19, C[2] = 19 */
>>>
>>> Now we just take the highest entry that appears in any array, which in
>>> this case is C[0], as the total cost.
>>
>> Wow. The way your final result exactly tallies with my algorithm
>> result is very interesting. This looks like some maths or computer
>> science theory that I am not aware of.
>>
>> I am currently coding the algorithm using your method.
>

> While I was coding this, I was considering if Path->rows also should
> be calculated similar to total cost for non-partial subpath and total
> cost for partial subpaths. I think for rows, we can just take
> total_rows divided by workers for non-partial paths, and this
> approximation should suffice. It looks odd that it be treated with the
> same algorithm we chose for total cost for non-partial paths.

Attached is patch v12, where the Path->rows calculation for non-partial
paths is kept separate from the way their total cost is computed. rows
for non-partial paths is calculated as total_rows divided by workers,
as an approximation, and then rows for partial paths are just added
one by one.

>
> Meanwhile, attached is a WIP patch v10. The only change in this patch
> w.r.t. the last patch (v9) is that this one has a new function defined
> append_nonpartial_cost(). Just sending this to show how the algorithm
> looks like; haven't yet called it.

Now append_nonpartial_cost() is used, and it is tested.

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

ParallelAppend_v12.patch
Description: Binary data


```

### Re: [HACKERS] exposing wait events for non-backends (was: Tracking wait event for latches)

```On Thu, Mar 23, 2017 at 7:29 AM, Michael Paquier
wrote:
> On Thu, Mar 23, 2017 at 8:19 PM, Kuntal Ghosh
>  wrote:
>> Hence, to be consistent with others, bgworker processes can be
>> initialized from BackgroundWorkerInitializeConnectionBy[Oid].
>
> Yeah, I am fine with that. Thanks for the updated versions.

I think this is still not good.  The places where pgstat_bestart() has
been added are not even correct.  For example, the call added to
BackgroundWriterMain() occurs after the section that does
error-recovery, so it would get repeated every time the background
writer recovers from an error.  There are similar problems elsewhere.
Furthermore, although in theory there's an idea here that we're making
it no longer the responsibility of InitPostgres() to call
pgstat_bestart(), the patch as proposed only removes one of the two
calls, so we really don't even have a consistent practice.  I think
it's better to go with the idea of having InitPostgres() be
responsible for calling this for regular backends, and
AuxiliaryProcessMain() for auxiliary backends.  That involves
substantially fewer calls to pgstat_bestart() and they are spread
across only two functions, which IMHO makes bugs of omission a lot
less likely.

Updated patch with that change attached.  In addition to that
modification, I made some other alterations:

- I changed the elog() for the can't-happen case in pgstat_bestart()
from PANIC to FATAL.  The contents of shared memory aren't corrupted,
so I think PANIC is overkill.

- I tweaked the comment in WalSndLoop() just before the
pgstat_report_activity() call to accurately reflect what the effect of
that call now is.

- I changed the column ordering in pg_stat_get_activity() to put
backend_type with the other columns that appear in pg_stat_activity,
rather than (as the patch did) grouping it with the ones that appear
in pg_stat_ssl.

- I modified the code to tolerate a NULL return from
AuxiliaryPidGetProc().  I am pretty sure that without that there's a
race condition that could lead to a crash if somebody tried to call
this function just as an auxiliary process was terminating.

- I updated the documentation slightly.

- I rebased over some conflicting commits.

If there aren't objections, I plan to commit this version.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

more-pg-stat-activity.patch
Description: Binary data


```
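Robert's point about tolerating a NULL return from AuxiliaryPidGetProc() is the usual defense against a process exiting between lookup and use. A minimal standalone sketch of the pattern, with hypothetical stand-in types rather than the real PGPROC machinery:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a PGPROC-style entry. */
typedef struct FakeProc { int pid; } FakeProc;

/* Returns NULL when no auxiliary process with that pid exists any more. */
static FakeProc *
lookup_aux_proc(FakeProc *table, int nprocs, int pid)
{
    for (int i = 0; i < nprocs; i++)
        if (table[i].pid == pid)
            return &table[i];
    return NULL;   /* the process may have exited concurrently */
}

/* The caller must tolerate NULL instead of dereferencing blindly. */
static int
report_backend_type(FakeProc *table, int nprocs, int pid)
{
    FakeProc *proc = lookup_aux_proc(table, nprocs, pid);

    if (proc == NULL)
        return 0;  /* skip the row rather than crash */
    return 1;      /* would emit the pg_stat_activity row here */
}
```

The race is exactly the one described above: between obtaining the PID and looking it up, the auxiliary process can terminate, so the lookup must be treated as fallible.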

### Re: [HACKERS] Re: Declarative partitioning optimization for large amount of partitions

```
Sorry, 1) and 4) are my fault; the comment in hsearch.h says:
 * ... The hash key
 * is expected to be at the start of the caller's hash entry data structure.

Oops, forgot that.

Teodor Sigaev wrote:

things in order I'm attaching the previous patch as well.

Patches look good, but I have some notices:

1 step1 Why do you need TabStatHashEntry at all? TabStatHashEntry.t_id is
never read, so the hash entry could be just a pointer to PgStat_TableStatus.

2 step1 In pgstat_report_stat() you remove entries from the hash one by one
until all are removed. Wouldn't it be better to hash_destroy/hash_create, or
even to let the hash live in a separate memory context and just reset it?

3 step1 Again in pgstat_report_stat(): all-zero entries aren't deleted from
the hash although they will be freed from the point of view of pgStatTabList.

4 step2 The same as 1) applies to SeenRelsEntry->rel_id, but that field
isn't even initialized anywhere.

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
WWW: http://www.sigaev.ru/


```
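The hsearch.h rule quoted above (the key must sit at the start of the entry) is what makes Teodor's point 1 work: the entry can be just the key plus one pointer. A standalone sketch with fake types standing in for the real pgstat structures:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch of a dynahash-style entry as suggested in point 1 above: the
 * key must be the first field of the entry, and the rest can be a
 * single pointer to the real per-table data, instead of copying the
 * whole struct into the hash.  Fake types, not the actual pgstat code.
 */
typedef unsigned int Oid;

typedef struct FakeTableStatus
{
    Oid t_id;       /* table OID */
    int changes;    /* counters would live here */
} FakeTableStatus;

typedef struct TabStatHashEntry
{
    Oid              t_id;       /* hash key: must be first */
    FakeTableStatus *tsa_entry;  /* points into pgStatTabList's storage */
} TabStatHashEntry;
```

The key-at-start requirement means `offsetof(TabStatHashEntry, t_id)` must be zero, which is what the hash implementation relies on when comparing keys.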

### Re: [HACKERS] [PATCH] Generic type subscripting

```
On 24.03.2017 18:29, Tom Lane wrote:

David Steele  writes:

Do you have an idea when you will have a patch ready?  We are now into
the last week of the commitfest.  I see one question for Tom, but it's
not clear that this would prevent you from producing a new patch.

FWIW, I'm up to my eyeballs in Andres' faster-expressions patch, and
won't have time to think about this one for at least a couple more
days.

regards, tom lane

I can try to review the new version of the patch when Dmitry sends it, if
no one objects. Besides, I've seen the previous versions of the patch from
the previous commitfest.

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


```

### Re: [HACKERS] Should we cacheline align PGXACT?

```On Fri, Mar 24, 2017 at 10:12 AM, Simon Riggs  wrote:
> On 10 March 2017 at 13:08, Alexander Korotkov
> wrote:
>> Results look good for me.  Idea of committing both of patches looks
>> attractive.
>
> I'll commit mine since I understand what it does. I'll look at the other one
> also, but won't commit yet.

Might've been nice to include reviewers, testers, and a link to the
mailing list discussion in the commit message.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


```

### Re: [HACKERS] Supporting huge pages on Windows

```
Hi Ashutosh,

On 3/22/17 8:52 AM, Amit Kapila wrote:

On Wed, Mar 22, 2017 at 12:07 AM, David Steele  wrote:

Amit, Magnus, you are signed up as reviewers for this patch.  Do you know
when you'll have a chance to take a look?

I have provided my feedback, but I could not test it on my machine.  I
think Ashutosh Sharma has tested it.  I can give it another look, but I'm
not sure if it helps.

I know you are not signed up as a reviewer, but have you taken a look at
this patch?

Thanks,
--
-David
da...@pgmasters.net


```

### Re: [HACKERS] pgbench - allow to store select results into variables

```

Hello Rafia,

if (my_command->argc > 2)
+ syntax_error(source, lineno, my_command->line, my_command->argv[0],
+ "at most on argument expected", NULL, -1);

I suppose you mean 'one' argument here.

Indeed.

Apart from that indentation is not correct as per pgindent, please check.

I guess that you are referring to the switch/case indentation, which my
emacs does not do as expected.

Please find attached a v8 which hopefully fixes these two issues.

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 1eee8dc..1108fcb 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -816,6 +816,51 @@ pgbench  options  dbname

+
+
+ \gcset [prefix] or
+ \gset [prefix]
+
+
+
+
+  These commands may be used to end SQL queries, replacing a semicolon.
+  \gcset replaces an embedded semicolon (\;) within
+  a compound SQL command, and \gset replaces the final
+  semicolon (;) that ends the SQL command.
+
+
+
+  When these commands are used, the preceding SQL query is expected to
+  return one row, the columns of which are stored into variables named after
+  column names, and prefixed with prefix if provided.
+
+
+
+  The following example puts the final account balance from the first query
+  into variable abalance, and fills variables
+  one, two and p_three with
+  integers from a compound query.
+
+UPDATE pgbench_accounts
+  SET abalance = abalance + :delta
+  WHERE aid = :aid
+  RETURNING abalance \gset
+-- compound of two queries
+SELECT 1 AS one, 2 AS two \gcset
+SELECT 3 AS three \gset p_
+
+
+
+
+
+\gcset and \gset commands do not work when
+empty SQL queries appear within a compound SQL command.
+
+
+
+
+

\set varname expression
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 78f1e6b..ac9289b 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -375,11 +375,14 @@ static const char *QUERYMODE[] = {"simple", "extended", "prepared"};

typedef struct
{
-	char	   *line;			/* text of command line */
+	char	   *line;			/* first line for short display */
+	char	   *lines;			/* full multi-line text of command */
int			command_num;	/* unique index of this Command struct */
int			type;			/* command type (SQL_COMMAND or META_COMMAND) */
int			argc;			/* number of command words */
char	   *argv[MAX_ARGS]; /* command word list */
+	int			compound;   /* last compound command (number of \;) */
+	char	  **gset;   /* per-compound command prefix */
PgBenchExpr *expr;			/* parsed expression, if needed */
SimpleStats stats;			/* time spent in this command */
} Command;
@@ -1225,6 +1228,105 @@ getQueryParams(CState *st, const Command *command, const char **params)
params[i] = getVariable(st, command->argv[i + 1]);
}

+/* read all responses from backend */
+static bool
+read_response(CState *st, char **gset)
+{
+	PGresult   *res;
+	int			compound = -1;
+
+	while ((res = PQgetResult(st->con)) != NULL)
+	{
+		compound += 1;
+
+		switch (PQresultStatus(res))
+		{
+			case PGRES_COMMAND_OK: /* non-SELECT commands */
+			case PGRES_EMPTY_QUERY: /* may be used for testing no-op overhead */
+if (gset[compound] != NULL)
+{
+	fprintf(stderr,
+			"client %d file %d command %d compound %d: "
+			"\\gset expects a row\n",
+			st->id, st->use_file, st->command, compound);
+	st->ecnt++;
+	return false;
+}
+break; /* OK */
+
+			case PGRES_TUPLES_OK:
+if (gset[compound] != NULL)
+{
+	/* store result into variables */
+	int ntuples = PQntuples(res),
+		nfields = PQnfields(res),
+		f;
+
+	if (ntuples != 1)
+	{
+		fprintf(stderr,
+"client %d file %d command %d compound %d: "
+"expecting one row, got %d\n",
+st->id, st->use_file, st->command, compound, ntuples);
+		st->ecnt++;
+		PQclear(res);
+		return false;
+	}
+
+	for (f = 0; f < nfields ; f++)
+	{
+		char *varname = PQfname(res, f);
+		if (*gset[compound] != '\0')
+			varname = psprintf("%s%s", gset[compound], varname);
+
+		/* store result as a string */
+		if (!putVariable(st, "gset", varname,
+		 PQgetvalue(res, 0, f)))
+		{
+			/* internal error, should it rather abort? */
+			fprintf(stderr,
+	"client %d file %d command %d compound %d: "
+	"error storing into var %s\n",
+	st->id, st->use_file, st->command, compound,
+	varname);
+			st->ecnt++;
+			PQclear(res);
+			return false;
+		}
+
+		if (*gset[compound] != '\0')
+			free(varname);
+	}
+}
+break;	/* OK */
+
+			default:
+/* everything else is unexpected, so probably an error */
+fprintf(stderr,
+		"client %d file %d aborted in command ```
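The variable-naming rule in the hunk above (an empty prefix means the column name is used as-is, otherwise a prefixed copy is allocated and later freed) can be sketched standalone. pgbench itself uses psprintf(); this illustration uses plain malloc/snprintf:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build the pgbench variable name for a result column, mirroring the
 * \gset handling above: an empty prefix means the column name is used
 * as-is (nothing to free), otherwise the caller gets a freshly
 * allocated "prefixname" string and must free it, as the patch does.
 * Standalone sketch, not the actual pgbench.c code.
 */
static char *
gset_varname(const char *prefix, const char *colname, int *allocated)
{
    if (*prefix == '\0')
    {
        *allocated = 0;
        return (char *) colname;    /* column name used directly */
    }
    else
    {
        size_t len = strlen(prefix) + strlen(colname) + 1;
        char  *name = malloc(len);

        if (name != NULL)
            snprintf(name, len, "%s%s", prefix, colname);
        *allocated = 1;
        return name;
    }
}
```

So `SELECT 3 AS three \gset p_` stores the value into a variable named `p_three`, matching the documentation example above.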

### Re: [HACKERS] Re: Declarative partitioning optimization for large amount of partitions

```
things in order I'm attaching the previous patch as well.

Patches look good, but I have some notices:

1 step1 Why do you need TabStatHashEntry at all? TabStatHashEntry.t_id is
never read, so the hash entry could be just a pointer to PgStat_TableStatus.

2 step1 In pgstat_report_stat() you remove entries from the hash one by one
until all are removed. Wouldn't it be better to hash_destroy/hash_create, or
even to let the hash live in a separate memory context and just reset it?

3 step1 Again in pgstat_report_stat(): all-zero entries aren't deleted from
the hash although they will be freed from the point of view of pgStatTabList.

4 step2 The same as 1) applies to SeenRelsEntry->rel_id, but that field
isn't even initialized anywhere.

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
WWW: http://www.sigaev.ru/


```

### Re: [HACKERS] [PATCH] Generic type subscripting

```David Steele  writes:
> Do you have an idea when you will have a patch ready?  We are now into
> the last week of the commitfest.  I see one question for Tom, but it's
> not clear that this would prevent you from producing a new patch.

FWIW, I'm up to my eyeballs in Andres' faster-expressions patch, and
won't have time to think about this one for at least a couple more
days.

regards, tom lane


```

### Re: [HACKERS] WIP: Faster Expression Processing v4

```Another modest proposal:

I'm not really sold on the approach of using EEOP_FETCHSOME opcodes to
trigger initial tupleslot de-forming.  Certainly we want to have a single
slot_getsomeattrs call per source slot, but as-is, we need a separate
traversal over the expression tree just to precompute the max attribute
number needed.  That can't be good for expression compile speed, and
it introduces nontrivial bug risks because the code that does that
is completely decoupled from the code that emits the EEOP_VAR opcodes
(which are what's really relying on the de-forming to have happened).

What do you think about a design like this:

* Drop the FETCHSOME opcodes.

* Add fields to struct ExprState that will hold the maximum inner,
outer, and scan attribute numbers needed.

* ExecInitExpr initializes those fields to zero, and then during
ExecInitExprRec, whenever we generate an EEOP_VAR opcode, we do e.g.

	state->last_inner_attno = Max(state->last_inner_attno,
								  variable->varattno);

* ExecInitExprSlots, get_last_attnums_walker, etc all go away.

* In the startup segment of ExecInterpExpr, add

	if (state->last_inner_attno > 0)
		slot_getsomeattrs(innerslot, state->last_inner_attno);
	if (state->last_outer_attno > 0)
		slot_getsomeattrs(outerslot, state->last_outer_attno);
	if (state->last_scan_attno > 0)
		slot_getsomeattrs(scanslot, state->last_scan_attno);

This would be a little different from the existing code as far as runtime
branch-prediction behavior goes, but it's not apparent to me that it'd be
any worse.  Also, for JIT purposes it'd still be entirely possible to
compile the slot_getsomeattrs calls in-line; you'd just be looking to a
different part of the ExprState struct to find out what to do.

regards, tom lane


```
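Tom's proposal replaces a separate tree walk with max-tracking at the point where each Var step is emitted. A standalone miniature of that bookkeeping, using hypothetical types rather than the real ExprState:

```c
#include <assert.h>

#define Max(a,b) ((a) > (b) ? (a) : (b))

/* Hypothetical miniature of the ExprState fields proposed above. */
typedef struct MiniExprState
{
    int last_inner_attno;
    int last_outer_attno;
    int last_scan_attno;
} MiniExprState;

enum VarKind { VAR_INNER, VAR_OUTER, VAR_SCAN };

/*
 * Called wherever an EEOP_VAR-style step would be emitted during
 * expression compilation: just fold the attribute number into the
 * per-slot running maximum, so no separate tree walk is needed.
 */
static void
note_var(MiniExprState *state, enum VarKind kind, int attno)
{
    switch (kind)
    {
        case VAR_INNER:
            state->last_inner_attno = Max(state->last_inner_attno, attno);
            break;
        case VAR_OUTER:
            state->last_outer_attno = Max(state->last_outer_attno, attno);
            break;
        case VAR_SCAN:
            state->last_scan_attno = Max(state->last_scan_attno, attno);
            break;
    }
}
```

At execution start, one slot_getsomeattrs()-style call per slot whose recorded maximum is above zero then de-forms everything the compiled expression will touch, exactly as in the startup segment sketched above.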

### Re: [HACKERS] cast result of copyNode()

```
On 3/21/17 6:52 PM, Mark Dilger wrote:

On Mar 21, 2017, at 2:13 PM, David Steele  wrote:

Hi Mark,

On 3/9/17 3:34 PM, Peter Eisentraut wrote:

On 3/7/17 18:27, Mark Dilger wrote:

You appear to be using a #define macro to wrap a function of the same name
with the code:

#define copyObject(obj) ((typeof(obj)) copyObject(obj))

Yeah, that's a bit silly.  Here is an updated version that changes that.

Do you know when you'll have a chance to take a look at the updated patch?

The patch applies cleanly, compiles, and passes all the regression tests
for me on my laptop.  Peter appears to have renamed the function
copyObject as copyObjectImpl, which struck me as odd when I first saw it,
but I don't have a better name in mind, so that seems OK.

If the purpose of this patch is to avoid casting so many things down to
(Node *), perhaps some additional work along the lines of the patch I'm
attaching is appropriate.  (This patch applies on top of Peter's v2
patch.)  The idea is to keep objects as (Expr *) where appropriate,
rather than casting through (Node *) quite so much.

I'm not certain whether this is (a) merely a bad idea, (b) a different
idea than what Peter is proposing, and as such should be submitted
independently, or (c) something that ought to be included in Peter's
patch prior to commit.  I only applied this idea to one file, and maybe
not completely in that file, because I'd like feedback before going any
further along these lines.

I have marked this "Waiting on Author" pending Peter's input.

--
-David
da...@pgmasters.net


```

### Re: [HACKERS] [PATCH] Generic type subscripting

```On 24 March 2017 at 15:39, David Steele  wrote:
>
> Do you have an idea when you will have a patch ready?

Yes, I'll prepare a new version with most important changes in two days.

```

### Re: [HACKERS] Declarative partitioning optimization for large amount of partitions

```Hi Simon,

> > I don't know which way you're thinking of fixing this, but a planner patch
> > to implement faster partition-pruning will have taken care of this, I
> > think.  As you may know, even declarative partitioned tables currently
> > depend on constraint exclusion for partition-pruning and planner's current
> > approach of handling inheritance requires to open all the child tables
> > (partitions), whereas the new approach hopefully shouldn't need to do
> > that.  I am not sure if looking for a more localized fix for this would be
> > worthwhile, although I may be wrong.
>
> What "new approach" are we discussing?
> Is there a patch or design discussion?

I think what was meant were the plans of my colleague Dmitry Ivanov in
this matter. Unless someone else is already working on this optimization,
we would like to work on it together. Unfortunately there is no patch or
design discussion of partition-pruning in this commitfest.

--
Best regards,
Aleksander Alekseev

signature.asc
Description: PGP signature

```

### Re: [HACKERS] PATCH: Batch/pipelining support for libpq

```
Hi Vaishnavi,

On 3/19/17 9:32 PM, Vaishnavi Prabakaran wrote:

On Fri, Mar 17, 2017 at 12:37 AM, Daniel Verite wrote:

I would also like to hear Craig's opinion on it before applying this fix
to the original patch, just to make sure I am not missing anything here.

Craig?

I am going to include the test you shared in the test patch. Please
let me know if you want to cover any more specific cases to gain
confidence.

I have marked this submission "Waiting for Author" since it appears a
new patch is required based on input and adding a new test.

--
-David
da...@pgmasters.net


```

### Re: [HACKERS] LWLock optimization for multicore Power machines

```
Hi Alexander,

On 3/16/17 1:35 PM, David Steele wrote:

On 2/21/17 9:54 AM, Bernd Helmle wrote:

Am Dienstag, den 14.02.2017, 15:53 +0300 schrieb Alexander Korotkov:

+1
And you could try to use pg_wait_sampling
to sampling of wait
events.

I've tried this with your example from your blog post[1] and got this:

(pgbench scale 1000)

pgbench -Mprepared -S -n -c 100 -j 100 -T 300 -P2 pgbench2

SELECT-only:

SELECT * FROM profile_log ;
             ts             |  event_type   |     event     | count
----------------------------+---------------+---------------+-------
 2017-02-21 15:21:52.45719  | LWLockNamed   | ProcArrayLock |     8
 2017-02-21 15:22:11.19594  | LWLockTranche | lock_manager  |     1
 2017-02-21 15:22:11.19594  | LWLockNamed   | ProcArrayLock |    24
 2017-02-21 15:22:31.220803 | LWLockNamed   | ProcArrayLock |     1
 2017-02-21 15:23:01.255969 | LWLockNamed   | ProcArrayLock |     1
 2017-02-21 15:23:11.272254 | LWLockNamed   | ProcArrayLock |     2
 2017-02-21 15:23:41.313069 | LWLockNamed   | ProcArrayLock |     1
 2017-02-21 15:24:31.37512  | LWLockNamed   | ProcArrayLock |    19
 2017-02-21 15:24:41.386974 | LWLockNamed   | ProcArrayLock |     1
 2017-02-21 15:26:41.530399 | LWLockNamed   | ProcArrayLock |     1
(10 rows)

Write pgbench runs have far more events logged; see the attached text
file. Maybe this is of interest...

[1] http://akorotkov.github.io/blog/2016/03/25/wait_monitoring_9_6/

This patch applies cleanly at cccbdde.  It doesn't break compilation on
amd64, but I can't test on a Power-based machine.

Alexander, have you had a chance to look at Bernd's results?

I'm marking this submission "Waiting for Author" as your input seems to
be required.

This thread has been idle for a week.  Please respond and/or post a new
patch by 2017-03-28 00:00 AoE (UTC-12) or this submission will be marked
"Returned with Feedback".

--
-David
da...@pgmasters.net


```

### Re: [HACKERS] Re: [BUGS] Problem in using pgbench's --connect(-C) and --rate=rate(-R rate) options together.

```

Hello Teodor,

Hi, the patch looks good, except: why do you remove the initialization of
is_throttled? I suppose it's just a typo?

No, it is really needed so that the lag measure is correct.

Without the is_throttled change:

sh> ./pgbench -T 3 -R 10 -C -S -P 1
starting vacuum...end.
progress: 1.0 s, 9.0 tps, lat 8.278 ms stddev 9.134, lag 2049638230412146.250 ms
progress: 2.0 s, 12.0 tps, lat 4.897 ms stddev 3.961, lag 2.219 ms
transaction type:
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 3 s
number of transactions actually processed: 33
latency average = 5.602 ms
latency stddev = 5.820 ms
rate limit schedule lag: avg 558992244657859.500 (max 18446744073709264.000) ms
tps = 11.024559 (including connections establishing)
tps = 12.183456 (excluding connections establishing)

With the is_throttled change:

./pgbench -T 3 -R 10 -C -S -P 1
starting vacuum...end.
progress: 1.0 s, 11.0 tps, lat 3.742 ms stddev 2.161, lag 1.658 ms
progress: 2.0 s, 7.0 tps, lat 2.985 ms stddev 0.496, lag 0.276 ms
transaction type:
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 3 s
number of transactions actually processed: 25
latency average = 3.312 ms
latency stddev = 1.518 ms
rate limit schedule lag: avg 0.894 (max 7.031) ms
tps = 8.398353 (including connections establishing)
tps = 9.069456 (excluding connections establishing)

--
Fabien.


```
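The absurd lag figures in the first run come from measuring lag against a throttle deadline that was never re-armed after reconnecting with -C. A standalone sketch of the invariant the is_throttled initialization protects (illustrative, not pgbench's actual state machine):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustration of the lag accounting above: lag is only meaningful
 * when a throttle deadline was actually scheduled for the current
 * transaction.  If the "throttled" flag is not reset when the client
 * reconnects (-C), a stale deadline produces the garbage lag values
 * shown in the first run.
 */
typedef struct ThrottleState
{
    int     is_throttled;   /* was a deadline scheduled for this txn? */
    int64_t txn_scheduled;  /* deadline, in microseconds */
} ThrottleState;

static int64_t
transaction_lag(const ThrottleState *st, int64_t now)
{
    if (!st->is_throttled)
        return 0;           /* no deadline, so no lag to report */
    return now - st->txn_scheduled;
}
```

Resetting is_throttled on reconnect, as the patch does, is what keeps the second run's lag in the millisecond range instead of the nonsense values above.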

### Re: [HACKERS] pg_stat_wal_write statistics view

```
On 3/16/17 1:54 AM, vinayak wrote:

On 2017/03/16 14:46, Haribabu Kommi wrote:

As the view name already contains WAL, I am not sure whether it is
required to include WAL in every column name. I am fine with changing it
if others share the opinion that WAL should be added to the column names.

Ok.

So what is the status of this patch now?  Should it be marked "Ready for
Committer"?

--
-David
da...@pgmasters.net


```

### [HACKERS] Logical replication SnapBuildInitalSnapshot spelling

```Hi,

Commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920 seems to have introduced an
alternative spelling of "initial".  Fixed in the attached.

.m

logical_inital.patch
Description: Binary data


```

### Re: [HACKERS] Index usage for elem-contained-by-const-range clauses

```
Hi Pritam,

On 3/17/17 5:41 PM, Pritam Baral wrote:

So sorry. I'm attaching the correct version of the original with this,
in case you want to test the limited implementation, because I still
have to go through Tom's list of suggestions.

BTW, the patch is for applying on top of REL9_6_2, and while I
suspect it may work on master too, I haven't tested it since the
original submission (Feb 23).

> Also, I noticed that the patch doesn't have regression tests. Some
> mention of this optimization in the docs would also be nice to have.
>
> --
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company

This thread has been idle for a week.  Please respond and/or post a new
patch by 2017-03-28 00:00 AoE (UTC-12) or this submission will be marked
"Returned with Feedback".

Thanks,
--
-David
da...@pgmasters.net


```

### Re: [HACKERS] logical replication apply to run with sync commit off by default

```On 21/03/17 22:37, Petr Jelinek wrote:
> On 21/03/17 18:54, Robert Haas wrote:
>> On Mon, Mar 20, 2017 at 7:56 PM, Petr Jelinek
>>  wrote:
>>> On 18/03/17 13:31, Petr Jelinek wrote:
>>>> On 07/03/17 06:23, Petr Jelinek wrote:
>>>>> there has been discussion at the logical replication initial copy thread
>>>>> [1] about making apply work with sync commit off by default for
>>>>> performance reasons and adding an option to change that per subscription.
>>>>>
>>>>> Here I propose a patch to implement this - it adds a boolean column
>>>>> subssynccommit to the pg_subscription catalog which decides whether
>>>>> synchronous_commit should be off or local for apply. And it adds
>>>>> SYNCHRONOUS_COMMIT = boolean to the list of WITH options for CREATE and
>>>>> ALTER SUBSCRIPTION. When nothing is specified it will be set to false.
>>>>>
>>>>> The patch is built on top of the copy patch currently, as there are
>>>>> conflicts between the two and this helps a bit with testing of the
>>>>> copy patch.
>>>>>
>>>>> [1]
>>>>
>>>> I rebased this patch against recent changes and the latest version of
>>>> the copy patch.
>>>

Rebase after table copy patch got committed.

--
PostgreSQL Development, 24x7 Support, Training & Services
>From 4e56f693a0b82d116930a03d63c7d034ef004f25 Mon Sep 17 00:00:00 2001
From: Petr Jelinek
Date: Mon, 6 Mar 2017 13:07:45 +0100
Subject: [PATCH] Add option to modify sync commit per subscription

This also changes default behaviour of subscription workers to
synchronous_commit = off
---
doc/src/sgml/catalogs.sgml | 11 +++
doc/src/sgml/ref/alter_subscription.sgml   |  2 ++
doc/src/sgml/ref/create_subscription.sgml  | 12
src/backend/catalog/pg_subscription.c  |  1 +
src/backend/commands/subscriptioncmds.c| 49 --
src/backend/replication/logical/launcher.c |  2 +-
src/backend/replication/logical/worker.c   | 10 ++
src/bin/pg_dump/pg_dump.c  | 12 +++-
src/bin/pg_dump/pg_dump.h  |  1 +
src/bin/pg_dump/t/002_pg_dump.pl   |  2 +-
src/bin/psql/describe.c|  5 ++-
src/include/catalog/pg_subscription.h  | 11 ---
src/test/regress/expected/subscription.out | 27
src/test/regress/sql/subscription.sql  |  3 +-
14 files changed, 116 insertions(+), 32 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index c531c73..9e072ea 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6414,6 +6414,17 @@

+  subsynccommit
+  bool
+
+
+   If true, the apply worker for the subscription will run with
+   synchronous_commit set to local.
+   Otherwise it will have it set to off.
+
+
+
+
subconninfo
text

diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 6f94247..572e370 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -26,6 +26,8 @@ ALTER SUBSCRIPTION name WITH ( where suboption can be:

SLOT NAME = slot_name
+SLOT NAME = slot_name
+| SYNCHRONOUS_COMMIT = boolean

ALTER SUBSCRIPTION name SET PUBLICATION publication_name [, ...] { REFRESH WITH ( puboption [, ... ] ) | NOREFRESH }
ALTER SUBSCRIPTION name REFRESH PUBLICATION WITH ( puboption [, ... ] )
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 8f3c30b..479f4e4 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -32,6 +32,7 @@ CREATE SUBSCRIPTION subscription_nameslot_name
| COPY DATA | NOCOPY DATA
+| SYNCHRONOUS_COMMIT = boolean
| NOCONNECT

@@ -148,6 +149,17 @@ CREATE SUBSCRIPTION subscription_name

+SYNCHRONOUS_COMMIT = boolean
+
+
+  The value of this parameter overrides the
+  synchronous_commit setting.
+  The default value is false.
+
+
+
+
+
NOCONNECT

diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index e420ec1..e7a1634 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -68,6 +68,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->name = pstrdup(NameStr(subform->subname));
sub->owner = subform->subowner;
sub->enabled = subform->subenabled;
+	sub->synccommit = subform->subsynccommit;

/* Get conninfo */
datum = SysCacheGetAttr(SUBSCRIPTIONOID,
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 0784ca7..8f201b2 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ ```
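The boolean-to-GUC mapping the patch implements is tiny but easy to get backwards; a hedged standalone sketch (the real code sets the synchronous_commit GUC in the apply worker):

```c
#include <assert.h>
#include <string.h>

/*
 * Sketch of the mapping described above: the subscription's boolean
 * subsynccommit selects the synchronous_commit value used by the
 * apply worker -- "local" when true, "off" (the new default) when
 * false.  Illustrative stand-in, not the actual worker.c code.
 */
static const char *
apply_sync_commit_setting(int subsynccommit)
{
    return subsynccommit ? "local" : "off";
}
```

So a subscription created without SYNCHRONOUS_COMMIT runs its apply worker with synchronous_commit = off, which is the performance-motivated default the patch introduces.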

### Re: [HACKERS] Logical replication origin tracking fix

```On 10/03/17 05:59, Petr Jelinek wrote:
> Hi,
>
> while discussing with Craig issues around restarting logical replication
> stream related to the patch he posted [1], I realized that we track
> wrong origin LSN in the logical replication apply.
>
> We currently track commit_lsn which is *start* of commit record, what we
> need to track is end_lsn which is *end* of commit record otherwise we
> might request transaction that was already replayed if the subscription
> instance has crashed right after commit.
>
> Attached patch fixes that.
>

Rebase after table copy patch got committed.

--
PostgreSQL Development, 24x7 Support, Training & Services
>From 2a312a4ecce09aab6c72f5f039c49b2300b3f3a4 Mon Sep 17 00:00:00 2001
From: Petr Jelinek
Date: Thu, 9 Mar 2017 12:54:43 +0100
Subject: [PATCH] Fix remote position tracking in logical replication

We need to set the origin remote position to end_lsn, not commit_lsn, as
commit_lsn is the start of the commit record and we use the origin remote
position as the start position when restarting the replication stream. If
we used commit_lsn, we could request data that we had already received
from the remote server after a crash of the downstream server.
---
src/backend/replication/logical/worker.c | 13 +++--
1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index bbf3506..c2ccab7 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -421,9 +421,6 @@ apply_handle_begin(StringInfo s)

-	replorigin_session_origin_timestamp = begin_data.committime;
-	replorigin_session_origin_lsn = begin_data.final_lsn;
-
remote_final_lsn = begin_data.final_lsn;

in_remote_transaction = true;
@@ -443,14 +440,18 @@ apply_handle_commit(StringInfo s)

-	Assert(commit_data.commit_lsn == replorigin_session_origin_lsn);
-	Assert(commit_data.committime == replorigin_session_origin_timestamp);
-
Assert(commit_data.commit_lsn == remote_final_lsn);

/* The synchronization worker runs in single transaction. */
if (IsTransactionState() && !am_tablesync_worker())
{
+		/*
+		 * Update origin state so we can restart streaming from correct
+		 * position in case of crash.
+		 */
+		replorigin_session_origin_lsn = commit_data.end_lsn;
+		replorigin_session_origin_timestamp = commit_data.committime;
+
CommitTransactionCommand();

store_flush_position(commit_data.end_lsn);
--
2.7.4


```
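The distinction the commit message draws (restart from the end of the commit record, not from its start) can be sketched with toy LSNs; these are illustrative types, not the real replorigin code:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

typedef struct CommitData
{
    XLogRecPtr commit_lsn;  /* start of the commit record */
    XLogRecPtr end_lsn;     /* first position after the commit record */
} CommitData;

/*
 * On commit, record end_lsn as the restart position, as in the fix
 * above.  Recording commit_lsn instead would make a restart re-request
 * the commit record of a transaction that was already applied.
 */
static XLogRecPtr
origin_position_after_commit(const CommitData *commit)
{
    return commit->end_lsn;
}

/* A transaction needs replay only if it ends past the stored position. */
static int
needs_replay(const CommitData *commit, XLogRecPtr start_pos)
{
    return commit->end_lsn > start_pos;
}
```

With the end_lsn rule, a transaction whose commit record spans [100, 140) is not re-requested after a crash, while the next one at [150, 190) still is.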

### [HACKERS] Fix slot name change handling for subscriptions

```Hi,

ALTER SUBSCRIPTION ... WITH (SLOT NAME = foo) will make the worker die
with an error about the subscription changing unexpectedly. It's my
oversight in the original logical replication patch set. The attached
patch fixes it to behave the same way as other changes to subscription
options.

--
PostgreSQL Development, 24x7 Support, Training & Services
>From aaa54e5a608987449542da65fa029afa73d4373b Mon Sep 17 00:00:00 2001
From: Petr Jelinek
Date: Tue, 21 Mar 2017 22:04:57 +0100
Subject: [PATCH] Handle change of slotname in logical replication apply

---
src/backend/replication/logical/worker.c | 18 --
1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index c2ccab7..fc01cd3 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -1352,6 +1352,21 @@ reread_subscription(void)
}

/*
+	 * We need to make a new connection to the new slot if the slot name
+	 * has changed, so exit here as well in that case.
+	 */
+	if (strcmp(newsub->slotname, MySubscription->slotname) != 0)
+	{
+		ereport(LOG,
+(errmsg("logical replication worker for subscription \"%s\" will "
+		"restart because the replication slot name was changed",
+		MySubscription->name)));
+
+		walrcv_disconnect(wrconn);
+		proc_exit(0);
+	}
+
+	/*
* Exit if publication list was changed. The launcher will start
* new worker.
*/
@@ -1383,8 +1398,7 @@ reread_subscription(void)
}

/* Check for other changes that should never happen too. */
-	if (newsub->dbid != MySubscription->dbid ||
-		strcmp(newsub->slotname, MySubscription->slotname) != 0)
+	if (newsub->dbid != MySubscription->dbid)
{
elog(ERROR, "subscription %u changed unexpectedly",
MyLogicalRepWorker->subid);
--
2.7.4


```

### Re: [HACKERS] [PATCH] Generic type subscripting

```
Hi Dmitry,

On 3/21/17 4:42 PM, Dmitry Dolgov wrote:

On 21 March 2017 at 18:16, David Steele wrote:

This thread has been idle for over a week.

Yes, sorry for the late reply. I'm still trying to find a better solution
for some of the problems that arose in this patch.

On 15 March 2017 at 00:10, Tom Lane wrote:

Dmitry Dolgov <9erthali...@gmail.com> writes:

I looked through this a bit.

Do you have an idea when you will have a patch ready?  We are now into
the last week of the commitfest.  I see one question for Tom, but it's
not clear that this would prevent you from producing a new patch.

Please post a new patch by 2017-03-28 00:00 AoE (UTC-12) or this
submission will be marked "Returned with Feedback".

Thanks,
--
-David
da...@pgmasters.net


```

### Re: [HACKERS] Potential data loss of 2PC files

```
And the renaming of pg_clog to pg_xact is also my fault. Attached is
an updated patch.

Thank you. One more question: what about symlinks? If a DBA moves, for
example, pg_xact to another disk and leaves a symlink in the data directory,
I suppose fsync on the symlink will actually do nothing.

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
WWW: http://www.sigaev.ru/


```

### Re: [HACKERS] PassDownLimitBound for ForeignScan/CustomScan [take-2]

```
Hi Kaigai,

On 3/21/17 1:11 PM, David Steele wrote:

On 3/13/17 3:25 AM, Jeevan Chalke wrote:

I have reviewed this patch further and here are my comments:

This thread has been idle for over a week.  Please respond and/or post a
new patch by 2017-03-24 00:00 AoE (UTC-12) or this submission will be
marked "Returned with Feedback".

This submission has been marked "Returned with Feedback".  Please feel
free to resubmit to a future commitfest.

Regards,
--
-David
da...@pgmasters.net


```

### Re: [HACKERS] make async slave to wait for lsn to be replayed

```
Hi Ivan,

On 3/21/17 1:06 PM, David Steele wrote:

Hi Ivan,

On 3/12/17 10:20 PM, Thomas Munro wrote:

On Fri, Mar 10, 2017 at 1:49 AM, Ivan Kartyshov
wrote:

Here I attached rebased patch waitlsn_10dev_v3 (core feature)
I will leave the choice of implementation (core/contrib) to the
discretion
of the community.

Will be glad to hear your suggestion about syntax, code and patch.

Hi Ivan,

Here is some feedback based on a first read-through of the v4 patch.
I haven't tested it yet.

This thread has been idle for over a week.  Please respond and/or post a
new patch by 2017-03-24 00:00 AoE (UTC-12) or this submission will be
marked "Returned with Feedback".

This submission has been marked "Returned with Feedback".  Please feel
free to resubmit to a future commitfest.

Regards,
--
-David
da...@pgmasters.net


```