Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization

2017-08-03 Thread Masahiko Sawada
On Thu, Aug 3, 2017 at 2:00 AM, Robert Haas  wrote:
> On Wed, Aug 2, 2017 at 12:34 PM, Tom Lane  wrote:
>> Of course.  It's also a heck of a lot more flexible.  Adding on another
>> ad-hoc option that does the minimum possible amount of work needed to
>> address one specific problem is always going to be less work; but after
>> we repeat that process five or ten times, we're going to have a mess.
>
> Well, I still like Masahiko-san's proposal, but I'm not prepared to
> keep arguing about it right now.  Maybe some other people will weigh
> in with an opinion.
>

My motivation of this proposal is same as what Robert has. I
understand that ad-hoc option can solve only the part of big problem
and it could be cause of mess. However It seems me that the script
especially for table initialization will not be flexible than we
expected. I mean, even if we provide some meta commands for table
initialization or data loading, these meta commands work for only
pgbench tables (i.g., pgbench_accounts, pgbench_branches and so on).
If we want to create other tables and load data to them as we want we
can do that using psql -f. So an alternative ways is having a flexible
style option for example --custom-initialize = { [load, create_pkey,
create_fkey, vacuum], ... }. That would solve this in a better way.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-08-03 Thread Michael Paquier
On Thu, Aug 3, 2017 at 4:29 AM, Stephen Frost  wrote:
> I'll provide another update tomorrow.  Hopefully Michael is able to produce
> a 9.6 patch, otherwise I'll do it.

I have sent an updated version of the patch, with something that can
be used for 9.6 as well. It would be nice to get something into the
next set of minor releases.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusing error message in pgbench

2017-08-03 Thread Fabien COELHO



Indeed.  It doesn't look that hard: AFAICS the problem is just that
process_sql_command() is making premature decisions about whether to
extract parameters from the SQL commands.  Proposed patch attached.


Great. Patch looks good to me.


Too me as well: code looks ok, patch applies, compiles, make check 
ok, manual tests with pgbench ok.


That is one more patch about pgbench in the queue.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] map_partition_varattnos() and whole-row vars

2017-08-03 Thread Amit Langote
Fujita-san,

Thanks for the review.

On 2017/08/03 16:01, Etsuro Fujita wrote:
> On 2017/08/02 15:21, Amit Langote wrote:
>> On 2017/08/02 1:33, Amit Khandekar wrote:
>>> ---
>>>
>>> Few more comments :
>>>
>>> @@ -1240,7 +1247,7 @@ map_variable_attnos_mutator(Node *node,
>>>   var->varlevelsup == context->sublevels_up)
>>>   {
>>>   /* Found a matching variable, make the substitution */
>>>
>>> - Var*newvar = (Var *) palloc(sizeof(Var));
>>> + Var*newvar = copyObject(var);
>>>int attno = var->varattno;
>>>
>>>   *newvar = *var;
>>>
>>> Here, "*newvar = *var" should be removed.
>>
>> Done.
> 
> I'm not sure this change is a good idea, because the copy by "*newvar =
> *var" would be more efficient than the copyObject().  (We have this
> optimization in other places as well.  See eg, copyVar() in setrefs.c.)

OK, done.

> Here are some other comments:
> 
> +/* If the callers expects us to convert the same, do so. */
> +if (OidIsValid(context->to_rowtype))
> +{
> +ConvertRowtypeExpr *r;
> +
> +/* Var itself is converted to the requested rowtype. */
> +newvar->vartype = context->to_rowtype;
> +
> +/*
> + * And a conversion step on top to convert back to the
> + * original type.
> + */
> +r = makeNode(ConvertRowtypeExpr);
> +r->arg = (Expr *) newvar;
> +r->resulttype = var->vartype;
> +r->convertformat = COERCE_IMPLICIT_CAST;
> +r->location = -1;
> +
> +return (Node *) r;
> +}
> 
> Why not do this conversion if to_rowtype is valid and it's different from
> the rowtype of the original whole-row Var like the previous patch? Also, I
> think it's better to add an assertion that the rowtype of the original
> whole-row Var is a named one.  So, what I have in mind is:
> 
>   if (OidIsValid(context->to_rowtype))
>   {
> Assert(var->vartype != RECORDOID)
> if (var->vartype != context->to_rowtype)
> {
>   ConvertRowtypeExpr *r;
> 
>   /* Var itself is converted to the requested rowtype. */
>   ...
>   /* And a conversion step on top to convert back to the ... */
>   ...
>   return (Node *) r;
> }
>   }

Sounds good, so done.

> Here is the modification to the map_variable_attnos()'s API:
> 
>  map_variable_attnos(Node *node,
> int target_varno, int sublevels_up,
> const AttrNumber *attno_map, int
> map_length,
> -   bool *found_whole_row)
> +   bool *found_whole_row, Oid
> to_rowtype)
> 
> This is nitpicking, but I think it would be better to put the new argument
> to_rowtype right before the output argument found_whole_row.

I consider this a good suggestion.  I guess we tend to list all the input
arguments before any output arguments.  So done as you suggest.

> + * RelationGetRelType
> + *Returns the rel's pg_type OID.
> + */
> +#define RelationGetRelType(relation) \
> +((relation)->rd_rel->reltype)
> 
> This macro is used in only one place.  Do we really need that?  (This
> macro would be useful for other places such as expand_inherited_rtentry,
> but I think it's better to introduce this in a separate patch, maybe for
> PG11.)

Alright, dropped RelationGetRelType from the patch.

> +-- check that wholerow vars in the RETUNING list work with partitioned
> tables
> 
> Typo: s/RETUNING/RETURNING/

Fixed.

Attached updated patches.

Thanks,
Amit
From 9b2d16ec4c8eadd7261849d5aa0f34ee2577b405 Mon Sep 17 00:00:00 2001
From: amit 
Date: Wed, 26 Jul 2017 16:45:46 +0900
Subject: [PATCH 1/2] Fix map_partition_varattnos to not error on
 found_whole_row

It was designed assuming that the expressions passed to it can never
contain whole-row vars, but it's wrong since it's called from places
that pass it WCO constraint expressions and RETURNING target list
expressions, which can very well contain whole-row vars.

Move the responsibility of error'ing out to the callers, because they
are in position to know that finding whole-row vars is an error
condition.

Adds test in insert.sql and updatable_views.sql.

Reported by: Rajkumar Raghuwanshi
Report: 
https://postgr.es/m/CAKcux6%3Dz38gH4K6YAFi%2BYvo5tHTwBL4tam4VM33CAPZ5dDMk1Q%40mail.gmail.com
---
 src/backend/catalog/partition.c   | 20 ++--
 src/backend/commands/tablecmds.c  |  8 +++-
 src/backend/executor/nodeModifyTable.c| 18 ++
 src/include/catalog/partition.h   |  3 ++-
 src/test/regress/expected/insert.out  | 10 ++
 src/test/regress/expected/updatable_views.out | 10 ++
 

Re: [HACKERS] map_partition_varattnos() and whole-row vars

2017-08-03 Thread Etsuro Fujita

On 2017/08/02 15:21, Amit Langote wrote:

Thanks Fuita-san and Amit for reviewing.

On 2017/08/02 1:33, Amit Khandekar wrote:

On 1 August 2017 at 15:11, Etsuro Fujita  wrote:

On 2017/07/31 18:56, Amit Langote wrote:

Yes, that's what's needed here.  So we need to teach
map_variable_attnos_mutator() to convert whole-row vars just like it's
done in adjust_appendrel_attrs_mutator().



Seems reasonable.  (Though I think it might be better to do this kind of
conversion in the planner, not the executor, because that would increase the
efficiency of cached plans.)


That's a good point, although it sounds like a bigger project that, IMHO,
should be undertaken separately, because that would involve designing for
considerations of expanding inheritance even in the INSERT case.


Agreed.  I think that would be definitely a material for PG11.


I think the work of shifting to planner should be taken as a different
task when we shift the preparation of DispatchInfo to the planner.


Yeah, I think it'd be a good idea to do those projects together.  That is,
doing what Fujita-san suggested and expanding partitioned tables in
partition bound order in the planner.


OK


---

Few more comments :

@@ -1240,7 +1247,7 @@ map_variable_attnos_mutator(Node *node,
  var->varlevelsup == context->sublevels_up)
  {
  /* Found a matching variable, make the substitution */

- Var*newvar = (Var *) palloc(sizeof(Var));
+ Var*newvar = copyObject(var);
   int attno = var->varattno;

  *newvar = *var;

Here, "*newvar = *var" should be removed.


Done.


I'm not sure this change is a good idea, because the copy by "*newvar = 
*var" would be more efficient than the copyObject().  (We have this 
optimization in other places as well.  See eg, copyVar() in setrefs.c.)


Here are some other comments:

+   /* If the callers expects us to convert the 
same, do so. */
+   if (OidIsValid(context->to_rowtype))
+   {
+   ConvertRowtypeExpr *r;
+
+   /* Var itself is converted to the 
requested rowtype. */
+   newvar->vartype = context->to_rowtype;
+
+   /*
+* And a conversion step on top to 
convert back to the
+* original type.
+*/
+   r = makeNode(ConvertRowtypeExpr);
+   r->arg = (Expr *) newvar;
+   r->resulttype = var->vartype;
+   r->convertformat = COERCE_IMPLICIT_CAST;
+   r->location = -1;
+
+   return (Node *) r;
+   }

Why not do this conversion if to_rowtype is valid and it's different 
from the rowtype of the original whole-row Var like the previous patch? 
Also, I think it's better to add an assertion that the rowtype of the 
original whole-row Var is a named one.  So, what I have in mind is:


  if (OidIsValid(context->to_rowtype))
  {
Assert(var->vartype != RECORDOID)
if (var->vartype != context->to_rowtype)
{
  ConvertRowtypeExpr *r;

  /* Var itself is converted to the requested rowtype. */
  ...
  /* And a conversion step on top to convert back to the ... */
  ...
  return (Node *) r;
}
  }

Here is the modification to the map_variable_attnos()'s API:

 map_variable_attnos(Node *node,
int target_varno, int sublevels_up,
const AttrNumber *attno_map, 
int map_length,

-   bool *found_whole_row)
+   bool *found_whole_row, Oid 
to_rowtype)


This is nitpicking, but I think it would be better to put the new 
argument to_rowtype right before the output argument found_whole_row.


+ * RelationGetRelType
+ * Returns the rel's pg_type OID.
+ */
+#define RelationGetRelType(relation) \
+   ((relation)->rd_rel->reltype)

This macro is used in only one place.  Do we really need that?  (This 
macro would be useful for other places such as expand_inherited_rtentry, 
but I think it's better to introduce this in a separate patch, maybe for 
PG11.)


+-- check that wholerow vars in the RETUNING list work with partitioned 
tables


Typo: s/RETUNING/RETURNING/

Sorry for the delay.

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-08-03 Thread Michael Paquier
On Wed, Aug 2, 2017 at 7:58 PM, Stephen Frost  wrote:
> * Michael Paquier (michael.paqu...@gmail.com) wrote:
>> Do you need a back-patchable version for 9.6? I could get one out of
>> my pocket if necessary.
>
> I was just trying to find a bit of time to generate exactly that- if
> you have a couple spare cycles, it would certainly help.

OK, here you go. Even if archive_mode = always has been introduced in
9.5, but non-exclusive mode is a 9.6 feature, so here are patches down
to this version. I am pretty satisfied by this, and I included all the
comments and error message corrections reviewed up to now. I noticed
some incorrect comments, doc typos and an incorrect indentation as
well for the WARNING reported to the user when waiting for the
archiving.
-- 
Michael


pg_stop_backup_on_standby_v6_96.patch
Description: Binary data


pg_stop_backup_on_standby_v6_master.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Red-Black tree traversal tests

2017-08-03 Thread Aleksander Alekseev
Hi Victor,

> I forgot to attach the patch. Sorry.
> Here it is.

I would say that this patch is in a pretty good shape now. And I see a
99% code coverage of rbtree.c. Let's see what committers think.

-- 
Best regards,
Aleksander Alekseev


signature.asc
Description: PGP signature


Re: [HACKERS] map_partition_varattnos() and whole-row vars

2017-08-03 Thread Etsuro Fujita

On 2017/08/03 17:12, Amit Langote wrote:

Attached updated patches.


Thanks for the patch!  That looks good to me.

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ON CONFLICT and partitioned tables

2017-08-03 Thread Jeevan Ladhe
Thanks Amit for addressing the comment.

The patch looks good to me. I have no more comments.
Verified that v2 patch applies cleanly and make check passes.

Thanks,
Jeevan Ladhe


Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization

2017-08-03 Thread Fabien COELHO


Hello,


My motivation of this proposal is same as what Robert has. I
understand that ad-hoc option can solve only the part of big problem
and it could be cause of mess. However It seems me that the script
especially for table initialization will not be flexible than we
expected. I mean, even if we provide some meta commands for table
initialization or data loading, these meta commands work for only
pgbench tables (i.g., pgbench_accounts, pgbench_branches and so on).
If we want to create other tables and load data to them as we want we
can do that using psql -f. So an alternative ways is having a flexible
style option for example --custom-initialize = { [load, create_pkey,
create_fkey, vacuum], ... }. That would solve this in a better way.


Personnaly, I could be fine with a limited number of long options to 
adjust pgbench initialization to various needs, eg --use-hash-index, 
--skip-whetever-index, etc.


The flexible --custom-init idea outlined above looks nice as well.


As for a more generic solution, the easy part are the "CREATE" stuff and 
the transaction script stuff (existing pgbench scripts).


For the CREATE stuff, the script language is SQL, the command to use it is 
"psql"...


The real and hard part is to fill tables with meaningful pseudo-random 
test data which do not violate constraints for any non trivial schema 
involving foreign keys and various unique constraints.


The solution for this is SQL for trivial cases, think of:

  "INSERT INTO Foo() SELECT ... FROM generate_series(...);"

For instance the pgbench initialization is really close to:

 psql -Dscale=10 

Re: [HACKERS] Change in "policy" on dump ordering?

2017-08-03 Thread Michael Banck
Am Donnerstag, den 27.07.2017, 15:52 -0400 schrieb Tom Lane:
> So I'm thinking we should consider the multi-pass patch I posted as
> a short-term, backpatchable workaround, which we could hope to
> replace with dependency logic later.

+1, it would be really nice if this could be fixed in the back-branches 
before v11.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for Secure Transport SSL library on macOS as OpenSSL alternative

2017-08-03 Thread Heikki Linnakangas

On 08/03/2017 01:02 PM, Daniel Gustafsson wrote:

In https://postgr.es/m/69db7657-3f9d-4d30-8a4b-e06034251...@yesql.se I
presented a WIP patch for adding support for the Apple Secure Transport SSL
library on macOS as, an alternative to OpenSSL.  That patch got put on the
backburner for a bit, but I’ve now found the time to make enough progress to
warrant a new submission for discussions on this (and hopefully help hacking).


Hooray!


Keychains
=
The frontend has support for using PEM files for certificates and keys.  It can
also look up the key for the certificate in a Keychain, or both certificate and
key in a Keychain.  The root certificate is still just read from a PEM file.


Why can't you have the root certificate in the keychain, too? Just not 
implemented yet, or is there some fundamental problem with it?



The existence of an sslcert config trumps a keychain, but when a keychain is
used I’m currently using the sslcert parameter (awaiting a discussion on how to
properly do this) for the certificate label required to search the keychain.

There is a new frontend parameter called “keychain” with which a path to a
custom Keychain file can be passed.  If set, this Keychain will be searched as
well as the default.  If not, only the default user Keychain is used.  There is
nothing that modifies the Keychain in this patch, it can read identities
(certificates and its key) but not alter them in any way.


OpenSSL also has a mechanism somewhat similar to the keychain, called 
"engines". You can e.g. keep the private key corresponding a certificate 
on a smart card, and speak to it with an OpenSSL "smart card reader" 
engine. If you do that, the 'sslkey' syntax is ":name>". Perhaps we should adopt that syntax here as well? For example, 
to read the client certificate from the key chain, you would use libpq 
options like "keychain=/home/heikki/my_keychain 
sslcert=keychain:my_client_cert".



“keychain” is obviously a very Secure Transport specific name, and I personally
think we should avoid that.  Any new configuration added here should take
future potential implementation into consideration such that avoid the need for
lots of backend specific knobs.  “sslcertstore” comes to mind as an
alternative, but we’d also need parameters to point into the certstore for
finding what we need.  Another option would be to do a URL based scheme
perhaps.


I wouldn't actually mind using implementation-specific terms like 
"keychain" here. It makes it clear that it's implementation-specific. I 
think it would be confusing, to use the same generic option name, like 
"sslcertstore", for both a macOS keychain and e.g. the private key store 
in Windows. Or GNU Keyring. In the worst case, you might even have 
multiple such "key stores" on the same system, so you'd anyways need a 
way to specify which one you mean.


Actually, perhaps it should be made even more explicit, and call it 
"secure_transport_keychain". That's quite long, though.


Wrt. keychains, is there a system-global or per-user keychain in macOS? 
And does this patch use them? If you load a CRL into a global keychain, 
does it take effect?



Testing
===
In order to test this we need to provide an alternative to the openssl calls in
src/test/ssl/Makefile for Secure Transport.


Those openssl commands are only needed to re-generate the test 
certificates. The certificates are included in the git repository, so 
you only need to re-generate them if you want to modify them or add new 
ones. I think it's OK to require the openssl tool for that, it won't be 
needed just to run the test suite.



Documentation
=
I have started fiddling with this a little, but to avoid spending time on the
wrong thing I have done very little awaiting the outcome of discussions here.
I have tried to add lots of comments to the code however, to explain the quirks
of Secure Transport.


I think this patch in general is in very good shape, and the next step 
is to write the documentation. In particular, I'd like to see 
documentation on how the keychain stuff should work. It'll be easier to 
discuss the behavior and the interactions, once it's documented.


In fact, better to write the documentation for that now, and not bother 
to change the code, until after we've discussed and are happy with the 
documented behavior.



I went into this thinking I would write a README for how to implement a new SSL
library.  But in the end, turns out the refactoring that went into our SSL code
earlier made that part almost too easy to warrant that.  It’s really quite
straightforward.


That's nice to hear!

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clarification in pg10's pgupgrade.html step 10 (upgrading standby servers)

2017-08-03 Thread Michael Paquier
On Mon, Jul 31, 2017 at 6:13 PM, Robert Haas  wrote:
> On Fri, Jul 28, 2017 at 10:35 AM, Andreas Joseph Krogh
>  wrote:
>> I'm reading https://www.postgresql.org/docs/10/static/pgupgrade.html to try
>> to understand how to upgrade standby-servers using pg_upgrade with pg10.
>>
>> The text in step 10 sais:
>> "You will not be running pg_upgrade on the standby servers, but rather
>> rsync", which to me sounds like rsync, in step 10-f, should be issued on the
>> standy servers. Is this the case? If so I don't understand how the standby's
>> data is upgraded and what "remote_dir" is. If rsync is supposed to be issued
>> on the primary then I think it should be explicitly mentioned, and step 10-f
>> should provide a clarer example with more detailed values for the
>> directory-structures involved.
>>
>> I really think section 10 needs improvement as I'm certainly not comfortable
>> upgrading standbys following the existing procedure.
>
> Yeah, I don't understand it either, and I have never been convinced
> that there's any safe way to do it other than recloning the standbys
> from the upgraded master.

Here are my 2c on the matter. 10-f means that the upgraded node may
have generated WAL with wal_level = minimal, which, at least it seems
to me, that we have a risk of having inconsistent data pages if only a
rsync is used on the old standbys. Like Robert, the flow we used in
the products I work on is to re-create standbys from scratch after the
upgrade using a fresh backup, with a VM cloning. An upgrade here is an
in-place process not only linked to Postgres, so standby VMs are made
of many services, some are being linked to Postgres. So this choice is
mainly decided by those dependencies, still it feels safer anyway.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update comments in nodeModifyTable.c

2017-08-03 Thread Etsuro Fujita

On 2017/08/02 4:07, Robert Haas wrote:

On Tue, Aug 1, 2017 at 12:31 AM, Etsuro Fujita
 wrote:

Maybe I'm missing something, but I'm not sure that's a good idea because the
change says like we might have 'wholerow' only for the FDW case, but that
isn't correct because we would have 'wholerow' for a view as well. ISTM that
views should be one of the typical cases, so I'd like to propose to modify
the sentence starting with 'Typically' to something like this: "Typically,
this will be a 'ctid' or 'wholerow' attribute, but in the case of a foreign
data wrapper it might be a set of junk attributes sufficient to identify the
remote row."  What do you think about that?


Works for me.


I updated the patch that way.  Attached is a new version of the patch.

Best regards,
Etsuro Fujita
diff --git a/src/backend/executor/nodeModifyTable.c 
b/src/backend/executor/nodeModifyTable.c
index 0dde0ed..0199c9d 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -1696,7 +1696,7 @@ ExecModifyTable(PlanState *pstate)
 * the old relation tuple.
 *
 * Foreign table updates have a wholerow 
attribute when the
-* relation has an AFTER ROW trigger.  Note 
that the wholerow
+* relation has a row-level trigger.  Note that 
the wholerow
 * attribute does not carry system columns.  
Foreign table
 * triggers miss seeing those, except that we 
know enough here
 * to set t_tableOid.  Quite separately from 
this, the FDW may
@@ -2182,8 +2182,11 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, 
int eflags)
/*
 * Initialize the junk filter(s) if needed.  INSERT queries need a 
filter
 * if there are any junk attrs in the tlist.  UPDATE and DELETE always
-* need a filter, since there's always a junk 'ctid' or 'wholerow'
-* attribute present --- no need to look first.
+* need a filter, since there's always at least one junk attribute 
present
+* --- no need to look first.  Typically, this will be a 'ctid' or
+* 'wholerow' attribute, but in the case of a foreign data wrapper it
+* might be a set of junk attributes sufficient to identify the remote
+* row.
 *
 * If there are multiple result relations, each one needs its own junk
 * filter.  Note multiple rels are only possible for UPDATE/DELETE, so 
we
@@ -2251,7 +2254,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, 
int eflags)
else if (relkind == 
RELKIND_FOREIGN_TABLE)
{
/*
-* When there is an AFTER 
trigger, there should be a
+* When there is a row-level 
trigger, there should be a
 * wholerow attribute.
 */
j->jf_junkAttNo = 
ExecFindJunkAttribute(j, "wholerow");

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transactions involving multiple postgres foreign servers

2017-08-03 Thread Michael Paquier
On Mon, Jul 31, 2017 at 7:27 PM, Alvaro Herrera
 wrote:
> Robert Haas wrote:
>
>> An alternative approach is to have some kind of other identifier,
>> let's call it a distributed transaction ID (DXID) which is mapped by
>> each node onto a local XID.
>
> Postgres-XL seems to manage this problem by using a transaction manager
> node, which is in charge of assigning snapshots.  I don't know how that
> works, but perhaps adding that concept here could be useful too.  One
> critical point to that design is that the app connects not directly to
> the underlying Postgres server but instead to some other node which is
> or connects to the node that manages the snapshots.
>
> Maybe Michael can explain in better detail how it works, and/or how (and
> if) it could be applied here.

XL (and XC) use a transaction ID that plugs in directly with the
internal XID assigned by Postgres, actually bypassing what Postgres
assigns to each backend if a transaction needs one. So if transactions
are not heavenly shared among multiple nodes, performance gets
impacted. Now when we worked on this project we noticed that we gained
in performance by reducing the number of requests and grouping them
together, so a proxy layer has been added between the global
transaction manager and Postgres to group those requests. This does
not change the fact that read-committed transactions still need
snapshots for each query, which is consuming. So this approach hurts
less with analytic queries, and more with OLTP.

2PC transaction status was tracked as well in the GTM. This allows
fancy things like being able to prepare a transaction on node 1, and
commit it on node 2 for example. I am not honestly sure that you need
to add anything at clog level for example, but I think that having at
the FDW level the meta data of a transaction stored as a rather
correct approach on the matter. That's what greenplum actually does if
I recall correctly (Heikki save me!): it has one coordinator with such
metadata handling, and bunch of underlying nodes that store the data.
Citus does also that if I recall correctly. So instead of
decentralizing this information, this gets stored in a Postgres
coordinator instance.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reload-through-the-top-parent switch the partition table

2017-08-03 Thread Ashutosh Bapat
On Wed, Aug 2, 2017 at 11:47 PM, David G. Johnston
 wrote:
> On Wed, Aug 2, 2017 at 10:58 AM, Tom Lane  wrote:
>>
>> Robert Haas  writes:
>> > On Wed, Aug 2, 2017 at 1:08 PM, Tom Lane  wrote:
>> >> --restore-via-partition-root ?
>>
>> > I worry someone will think that pg_dump is now restoring stuff, but it
>> > isn't.
>>
>> Well, the point is that the commands it emits will cause the eventual
>> restore to go through the root.  Anyway, I think trying to avoid using
>> a verb altogether is going to result in a very stilted option name.
>>
>> I notice that the option list already includes some references to
>> "insert", so maybe "--insert-via-partition-root"?  Although you could
>> argue that that's confusing when we're using COPY.
>
>
> --use-partitioned-table [partition-name, ...]  # if names are omitted it
> defaults to all partitioned tables

I like this idea since it allows using this feature for selected
tables e.g. hash tables. Otherwise, users will be forced to use this
option even when there is only  a single hash partitioned table and
many other list/range partitioned tables.

What we are trying to do here is dump the data in a partitioned table
as if it's not partitioned. Combine that with --data-only dumps, and
one could use it to load partitioned data into unpartitioned or
differently partitioned table. So, how about naming the option as

--unpartition-partitioned-table [partitioned-table-name, ] # if
names are omitted it defaults to all the partitioned tables.

That really says what dump is really doing without focusing on how the
data will be used like restoring/inserting/copying etc.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Support for Secure Transport SSL library on macOS as OpenSSL alternative

2017-08-03 Thread Daniel Gustafsson
In https://postgr.es/m/69db7657-3f9d-4d30-8a4b-e06034251...@yesql.se I
presented a WIP patch for adding support for the Apple Secure Transport SSL
library on macOS as, an alternative to OpenSSL.  That patch got put on the
backburner for a bit, but I’ve now found the time to make enough progress to
warrant a new submission for discussions on this (and hopefully help hacking).

It is a drop-in replacement for the OpenSSL code, and supports all the same
features and options, except for two things: compression is not supported and
the CRL cannot be loaded from a plain PEM file.  A Keychain must be used for
that instead.


Current state
=
The frontend and backend are implemented more or less fully, the two main
things missing being the CRL support (further details below) and loading DH
files (to support the GUC added in c0a15e07cd).  All non-CRL tests but one are
passing.  The failing test is in the frontend and it also fails when running
against an OpenSSL backend, but I can’t find where the logic is flawed and
could do with some help there.


Threads
===
Just like the CFLocaleCopyCurrent() call referenced in postmaster.c, the Secure
Transport APIs makes the process multithreaded.  To keep this out of the
postmaster, and contained in the backend, I’ve moved all functionality to
open_server and left init empty.  I could definitely need some clues on how to
properly handle this, or if it’s a complete showstopper.


Keychains
=
The frontend has support for using PEM files for certificates and keys.  It can
also look up the key for the certificate in a Keychain, or both certificate and
key in a Keychain.  The root certificate is still just read from a PEM file.
The existence of an sslcert config trumps a keychain, but when a keychain is
used I’m currently using the sslcert parameter (awaiting a discussion on how to
properly do this) for the certificate label required to search the keychain.

There is a new frontend parameter called “keychain” with which a path to a
custom Keychain file can be passed.  If set, this Keychain will be searched as
well as the default.  If not, only the default user Keychain is used.  There is
nothing that modifies the Keychain in this patch, it can read identities
(certificates and its key) but not alter them in any way.

The backend is only supporting PEM files at this point.

Once we have support for Keychains, we can however use them for additionally
supporting other Secure Transport features like OCSP etc.

“keychain” is obviously a very Secure Transport specific name, and I personally
think we should avoid that.  Any new configuration added here should take
future potential implementation into consideration such that avoid the need for
lots of backend specific knobs.  “sslcertstore” comes to mind as an
alternative, but we’d also need parameters to point into the certstore for
finding what we need.  Another option would be to do a URL based scheme
perhaps.


Certificate Revocation
==
Secure Transport supports loading CRL lists into Keychain files, the command
line certtool can for example do that.  When doing the trust evaluation on the
connection, a policy can be added which enables revocation checking via CRL.  I
have however been unable to figure out how to load a CRL list programmatically,
and as far as I can tell there is no API for this.  The certtool utility is
using the low-level CSSM APIs for this it seems, but adding that level of
complexity doesn’t seem worth it for us to maintain (I did a test and it turned
big, ugly and messy).

Unless someone knows how to implement this, we’d be limited to requiring the
use of a Keychain file for CRL support.  This would break drop-in replacement
support, but supporting certificate revocation seems more important.


Platform Support

I’ve tested this on 10.11 El Capitan and 10.12 Sierra, which are the systems I
have access to.  Supporting prairiedog and dromedary in the buildfarm will
probably be too hard (if at all possible), but down to 10.9 Mavericks should be
quite possible (if someone has the required systems to test on).  It adds a
dependency on Core Foundation on top of Secure Transport, no other macOS APIs
are used.


Testing
===
In order to test this we need to provide an alternative to the openssl calls in
src/test/ssl/Makefile for Secure Transport.  On top of that, code to generate
Keychains is required.  The certtool application can do all the Keychain
generations (I think) but this is still left open.  The main thing to figure
out here is how to avoid having to type in the Keychain password in a modal GUI
that Secure Transport pops up.  Since a Keychain can be passwordless it should
be doable, but the right certtool incantations for that is still evading me.
I’ve included a show-and-tell patch for this which I’ve used locally for
testing during hacking.


Documentation
=
I have started fiddling with this a little, but to avoid spending time on 

Re: [HACKERS] Macros bundling RELKIND_* conditions

2017-08-03 Thread Joe Conway
On 08/02/2017 10:52 PM, Ashutosh Bapat wrote:
> On Wed, Aug 2, 2017 at 11:15 PM, Alvaro Herrera
>  wrote:
>> Alvaro Herrera wrote:
>>> I think pg_class is a reasonable place to put more generic relkind lists
>>> alongside a matching error message for each, rather than specialized
>>> "does this relkind have storage" macros.  What about something like a
>>> struct list in pg_class.h,
>>
>> I just noticed that this doesn't help at all with the initial problem
>> statement, which is that some of the relkind checks failed to notice
>> that partitioned tables needed to be added to the set.  Maybe it still
>> helps because you have something to grep for, as Tom proposed elsewhere.
> 
> Having something like relkind_i_t_T, though correct, doesn't make the
> test readable. That's another improvement because of using such
> macros. The macro name tells the purpose of the test, which is what a
> reader would be interested in, rather than the actual values used.

+1

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] reload-through-the-top-parent switch the partition table

2017-08-03 Thread Robert Haas
On Wed, Aug 2, 2017 at 4:08 PM, Peter Eisentraut
 wrote:
> On 8/2/17 13:58, Tom Lane wrote:
>> I notice that the option list already includes some references to
>> "insert", so maybe "--insert-via-partition-root"?  Although you could
>> argue that that's confusing when we're using COPY.
>
> "load" could be more general.  But I'm also OK with "restore".

"load" seems better than "restore" to me, both because it's shorter
and because it sounds less like pg_dump will be doing the job of
pg_restore.

So maybe --load-via-partition-root if nobody likes my previous
suggestion of --partition-data-via-root ?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reload-through-the-top-parent switch the partition table

2017-08-03 Thread David G. Johnston
On Thursday, August 3, 2017, Rushabh Lathia 
wrote:

>
>
> --use-partitioned-table [partitioned_name, ...] # if
> names are omitted it defaults to all the partitioned tables.
>
> Here user need to specify the root relation name in the option - and any
> partition table have that as a ROOT, will load the data through
> top-parent-relation.
>
>
This is indeed what I intended to convey.  Only specify "root" names.

David J.


Re: [HACKERS] intermittent failures in Cygwin from select_parallel tests

2017-08-03 Thread Robert Haas
On Wed, Aug 2, 2017 at 11:47 PM, Noah Misch  wrote:
> postmaster algorithms rely on the PG_SETMASK() calls preventing that.  Without
> such protection, duplicate bgworkers are an understandable result.  I caught
> several other assertions; the PMChildFlags failure is another case of
> duplicate postmaster children:
>
>   6 TRAP: FailedAssertion("!(entry->trans == ((void *)0))", File: 
> "pgstat.c", Line: 871)
>   3 TRAP: FailedAssertion("!(PMSignalState->PMChildFlags[slot] == 1)", 
> File: "pmsignal.c", Line: 229)
>  20 TRAP: FailedAssertion("!(RefCountErrors == 0)", File: "bufmgr.c", 
> Line: 2523)
>  21 TRAP: FailedAssertion("!(vmq->mq_sender == ((void *)0))", File: 
> "shm_mq.c", Line: 221)
>  Also, got a few "select() failed in postmaster: Bad address"
>
> I suspect a Cygwin signals bug.  I'll try to distill a self-contained test
> case for the Cygwin hackers.  The lack of failures on buildfarm member brolga
> argues that older Cygwin is not affected.

Nice detective work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Macros bundling RELKIND_* conditions

2017-08-03 Thread Alvaro Herrera
Joe Conway wrote:
> On 08/02/2017 10:52 PM, Ashutosh Bapat wrote:
> > On Wed, Aug 2, 2017 at 11:15 PM, Alvaro Herrera
> >  wrote:
> >> Alvaro Herrera wrote:
> >>> I think pg_class is a reasonable place to put more generic relkind lists
> >>> alongside a matching error message for each, rather than specialized
> >>> "does this relkind have storage" macros.  What about something like a
> >>> struct list in pg_class.h,
> >>
> >> I just noticed that this doesn't help at all with the initial problem
> >> statement, which is that some of the relkind checks failed to notice
> >> that partitioned tables needed to be added to the set.  Maybe it still
> >> helps because you have something to grep for, as Tom proposed elsewhere.
> > 
> > Having something like relkind_i_t_T, though correct, doesn't make the
> > test readable. That's another improvement because of using such
> > macros. The macro name tells the purpose of the test, which is what a
> > reader would be interested in, rather than the actual values used.
> 
> +1

So add another layer:

#define RELKIND_HAS_STORAGE relkind_i_t_T

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reload-through-the-top-parent switch the partition table

2017-08-03 Thread Tom Lane
Robert Haas  writes:
> So maybe --load-via-partition-root if nobody likes my previous
> suggestion of --partition-data-via-root ?

WFM.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reload-through-the-top-parent switch the partition table

2017-08-03 Thread David G. Johnston
On Thu, Aug 3, 2017 at 8:53 AM, Tom Lane  wrote:

> Robert Haas  writes:
> > So maybe --load-via-partition-root if nobody likes my previous
> > suggestion of --partition-data-via-root ?
>
> WFM.
>

​+1

David J.​


Re: [HACKERS] Row Level Security Documentation

2017-08-03 Thread Rod Taylor
On Thu, Jul 13, 2017 at 5:49 AM, Fabien COELHO  wrote:

>
> Hello Rod,
>
> This version of the table attempts to stipulate which section of the
>> process the rule applies to.
>>
>
> The table should be referenced from the description, something like "Table
> xxx summarizes the ..."
>

Added the below which seemed consistent with other "see something else"
messages.

A summary of the application of policies to a command is found in .



> ISTM that it would be clearer to split the Policy column into "FOR xxx
> ..." and "USING" or "WITH CHECK", and to merge the rows which have the same
> "FOR xxx ..." contents, something like:
>
>POLICY |
>   ---++-
>  | USING  | ...
>   FOR ALL ...++-
>  | WITH CHECK | ...
>   ---++-
>   FOR SELECT ... | USING  | ...
>
> So that it is clear that only ALL & UPDATE can get both USING & WITH
> CHECK. This can be done with "morerows=1" on an entry so that it spans more
> rows.
>

Done. I couldn't figure out a morecols=1 equivalent to keep everything
under the Policy heading without a full colspec.



> For empty cells, maybe a dash would be clearer. Not sure.


Looked cluttered to me. Tried N/A first which was even worse.

-- 
Rod Taylor
diff --git a/doc/src/sgml/ref/create_policy.sgml b/doc/src/sgml/ref/create_policy.sgml
index c0dfe1ea4b..52a868e65d 100644
--- a/doc/src/sgml/ref/create_policy.sgml
+++ b/doc/src/sgml/ref/create_policy.sgml
@@ -94,6 +94,11 @@ CREATE POLICY name ON default deny policy is assumed, so that no rows will
be visible or updatable.
   
+
+  
+   A summary of the application of policies to a command is found
+   in .
+  
  
 
  
@@ -389,6 +394,80 @@ CREATE POLICY name ON 
 

+
+   Policies Applied During Statement
+
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+  
+   Policy
+   SELECT
+   INSERT
+   UPDATE
+   DELETE
+  
+ 
+ 
+  
+   FOR ALL ...
+   USING
+   WHERE clause
+   RETURNING clause
+   WHERE and RETURNING clause
+   WHERE and RETURNING clause
+  
+  
+   WITH CHECK
+   
+   new tuple
+   new tuple
+   new tuple
+  
+  
+   FOR SELECT ... USING
+   WHERE clause
+   RETURNING clause
+   WHERE and RETURNING clause
+   WHERE and RETURNING clause
+  
+  
+   FOR INSERT ... WITH CHECK
+   
+   new tuple
+   
+   
+  
+  
+   FOR UPDATE ...
+   USING
+   
+   
+   WHERE clause
+   
+  
+  
+   WITH CHECK
+   
+   
+   new tuple
+   
+  
+  
+   FOR DELETE ... USING
+   
+   
+   
+   WHERE clause
+  
+ 
+
+   
+
   
  
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Macros bundling RELKIND_* conditions

2017-08-03 Thread Joe Conway
On 08/02/2017 10:30 PM, Ashutosh Bapat wrote:
> On Wed, Aug 2, 2017 at 8:47 PM, Robert Haas  wrote:
>> 0001-RELKIND_HAS_VISIBILITY_MAP.patch - one place
>> 0002-RELKIND_HAS_STORAGE.patch - one place
>> 0003-RELKIND_HAS_XIDS-macro.patch - one place
>> 0004-RELKIND_HAS_COMPOSITE_TYPE-macro.patch - one place
>> 0005-RELKIND_CAN_HAVE_TOAST_TABLE-macro.patch - one place
>> 0006-RELKIND_CAN_HAVE_COLUMN_COMMENT-macro.patch - one place
>> 0007-RELKIND_CAN_HAVE_INDEX-macro.patch - two places
>> 0008-RELKIND_CAN_HAVE_COLUMN_SECLABEL-macro.patch - one place
>> 0009-RELKIND_CAN_HAVE_STATS-macro.patch - two places
>>
>> I'm totally cool with doing this where we can use the macro in more
>> than one place, but otherwise I don't think it helps.

I disagree.

> Can we say that any relation that has visibility map will also have
> xids? If yes, what's that common property?

Perhaps there are better ways to achieve the same goal (e.g. nearby
comments), but I think this is the salient point. The macro names allow
whoever is looking at the code to focus on the relevant properties of
the relkind without having to arrive at the conclusion themselves that
relkind "A" has property "B". Makes the code easier to read and
understand IMHO.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] On Complex Source Code Reading Strategy

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 1:55 AM, Zeray Kalayu  wrote:
> Therefore, I feel and think that I am a bit in a hurry to be DB
> hacker. But given time, indefatigability and PG community support, I
> believe that I will manage to achieve my dream.

Indefatigability is key.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization

2017-08-03 Thread Tom Lane
Masahiko Sawada  writes:
> If we want to create other tables and load data to them as we want we
> can do that using psql -f. So an alternative ways is having a flexible
> style option for example --custom-initialize = { [load, create_pkey,
> create_fkey, vacuum], ... }. That would solve this in a better way.

FWIW, I like that significantly better than your original proposal.
It'd allow people to execute parts of pgbench's standard initialization
sequence and then do other things in between (in psql).  Realistically,
that's probably about as much win as we need here --- if you're veering
far enough away from the standard scenario that that doesn't do it for
you, you might as well just write an all-custom setup script in psql.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reload-through-the-top-parent switch the partition table

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 9:01 AM, Rushabh Lathia  wrote:
> --unpartition-partitioned-table is very confusing.

+1.

> I liked the previous option which is --use-partitioned-table
> [partition-name, ...].
> The Only problem with --use-partitioned-table is, a user needs to specify
> the
> partition-name in the options list. Imagine if someone having 100's of
> partitions then specifying those name is pg_dump option is a pain.

Yeah, that won't work.

> Rather than that:
>
> --use-partitioned-table [partitioned_name, ...] # if
> names are omitted it defaults to all the partitioned tables.
>
> Here user need to specify the root relation name in the option - and any
> partition table have that as a ROOT, will load the data through
> top-parent-relation.

We could do that, but I'm not sure it's a good idea to use
getopt_long() with optional options.  Sometimes that creates confusion
-- is pg_dump --use-partitioned-table salad an attempt to dump the
salad database with the --use-partitioned-table option, or an attempt
to apply --use-partitioned-table only to partitions whose parent is
the salad table?  getopt_long() has an answer, but some people may
guess incorrectly about what it is.

I would be more inclined to make this a global option than something
that modifies the behavior for certain tables; the only per-table
flags we have right now are just to include/exclude individual tables.
You could make --inserts or --no-unlogged-table-data apply to some but
not all tables, but we didn't; why start here?

I don't like the specific name --use-partitioned-table much either.
Use it for what?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-08-03 Thread Ashutosh Bapat
On Thu, Aug 3, 2017 at 2:10 AM, Robert Haas  wrote:
> On Mon, Jul 31, 2017 at 7:59 AM, Ashutosh Bapat
>  wrote:
>> Adding AppendRelInfos to root->append_rel_list as and when they are
>> created would keep parent AppendRelInfos before those of children. But
>> that function throws away the AppendRelInfo it created when their are
>> no real children i.e. in partitioned table's case when has no leaf
>> partitions. So, we can't do that. Hence, I chose to change the API to
>> return the list of AppendRelInfos when the given RTE has real
>> children.
>
> So, IIUC, the case you're concerned about is when you have a hierarchy
> of only partitioned tables, with no plain tables.  For example, if B
> is a partitioned table and a partition of A, and that's all there is,
> A will recurse to B and B will return NIL.
>
> Is it necessary to get rid of the extra AppendRelInfos, or are they
> harmless like the duplicate RTE and PlanRowMark nodes?
>

Actually there are two sides to this:

If there are no leaf partitions, without the patch two things happen
1. rte->inh is cleared and 2 no appinfo is added to the
root->append_rel_list, even though harmless RTE and PlanRowMark nodes
are created. The first avoids treating the relation as the inheritance
parent and thus avoids creating any child relations and paths, saving
a lot of work. Ultimately set_rel_size() marks such a relation as
dummy
 352 else if (rte->relkind == RELKIND_PARTITIONED_TABLE)
 353 {
 354 /*
 355  * A partitioned table without leaf
partitions is marked
 356  * as a dummy rel.
 357  */
 358 set_dummy_rel_pathlist(rel);
 359 }

Since root->append_rel_list is traversed for every inheritance parent,
not adding needless AppendRelInfos improves performance and saves
memory, (FWIW or consider a case where there are thousands of
partitioned partitions without any leaf partition.).

My initial thought was to keep both these properties intact. But then
removing such AppendRelInfos would have a problem when such a table is
on the inner side of the join as described in [1]. So I wrote the
patch not to do either of those things when there are partitioned
partitions without leaf partitions. So, it looks like you are correct,
we could just go ahead and add those AppendRelInfos directly to
root->append_rel_list.

> /*
>  * If all the children were temp tables or a partitioned parent did not
>  * have any leaf partitions, pretend it's a non-inheritance situation; we
>  * don't need Append node in that case.  The duplicate RTE we added for
>  * the parent table is harmless, so we don't bother to get rid of it;
>  * ditto for the useless PlanRowMark node.
>  */
> if (!need_append)
> {
> /* Clear flag before returning */
> rte->inh = false;
> return;
> }
>
> If we do need to get rid of the extra AppendRelInfos, maybe a less
> invasive solution would be to change the if (!need_append) case to do
> root->append_rel_list = list_truncate(root->append_rel_list,
> original_append_rel_length).

We might require this for non-partitioned tables. I will try to
implement it this way in the next set of patches.

>
>> The code avoids creating AppendRelInfos for a child which represents
>> the parent in its role as a simple member of inheritance set.
>
> OK, I suggest we rewrite the whole comment like this: "We need an
> AppendRelInfo if paths will be built for the child RTE.  If
> childrte->inh is true, then we'll always need to generate append paths
> for it.  If childrte->inh is false, we must scan it if it's not a
> partitioned table; but if it is a partitioned table, then it never has
> any data of its own and need not be scanned.  It does, however, need
> to be locked, so note the OID for inclusion in the
> PartitionedChildRelInfo we're going to build."

Done.

>
> It looks like you also need to update the header comment for
> AppendRelInfo itself, in nodes/relation.h.

Done. Thanks for pointing it out.

>
> + * PlannerInfo for every child is obtained by translating relevant 
> members
>
> Insert "The" at the start of the sentence.

Done.

>
> -subroot->parse = (Query *)
> -adjust_appendrel_attrs(root,
> -   (Node *) parse,
> -   appinfo);
> +subroot->parse = (Query *) adjust_appendrel_attrs(parent_root,
> +  (Node *)
> parent_parse,
> +  1, );
>
> I suggest that you don't remove the line break after the cast.

This is part of 0001 patch, fixed there.

>
> + * If the child is further partitioned, remember it as a parent. 
> Since
> + * partitioned tables do not have any data, we don't need to create
> 

Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization

2017-08-03 Thread Tom Lane
Fabien COELHO  writes:
> As for a more generic solution, the easy part are the "CREATE" stuff and 
> the transaction script stuff (existing pgbench scripts).

> For the CREATE stuff, the script language is SQL, the command to use it is 
> "psql"...

> The real and hard part is to fill tables with meaningful pseudo-random 
> test data which do not violate constraints for any non trivial schema 
> involving foreign keys and various unique constraints.

> The solution for this is SQL for trivial cases, think of:
>"INSERT INTO Foo() SELECT ... FROM generate_series(...);"

Yeah.  I was also thinking that complicated data-generation requirements
could be handled with plpgsql DO blocks, avoiding the need for hard-wired
code inside pgbench.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unused field of ProjectionPath

2017-08-03 Thread Tom Lane
Antonin Houska  writes:
> I've noticed that the dummypp field of ProjectionPath is set but never read.

I do not think removing that field is a good idea, even if it's not used
in the current form of the logic.  It's useful for debugging purposes at
least.  See original commit comment at

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8b9d323cb9810109e3e5aab1ead427cbbb7aa77e

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Reminder: back-branch releases next week

2017-08-03 Thread Tom Lane
[ Shoulda got this out sooner, but better late than never ]

We'll be doing routine quarterly releases of supported Postgres
back branches next week (tarballs wrapped Monday 7th, public
announcement Thursday 10th).  We'll release v10 beta3 at the
same time.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] map_partition_varattnos() and whole-row vars

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 4:40 AM, Etsuro Fujita
 wrote:
> On 2017/08/03 17:12, Amit Langote wrote:
>> Attached updated patches.
>
> Thanks for the patch!  That looks good to me.

Committed with some comment changes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update comments in nodeModifyTable.c

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 5:55 AM, Etsuro Fujita
 wrote:
> I updated the patch that way.  Attached is a new version of the patch.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Tightly packing expressions

2017-08-03 Thread Andres Freund
Hi Doug,

On 2017-08-03 18:01:06 +, Douglas Doole wrote:
> Back when you were getting ready to commit your faster expressions, I
> volunteered to look at switching from an array of fixed sized steps to
> tightly packed structures. (
> https://www.postgresql.org/message-id/20170314221648.jrcgh5n7ld4ej...@alap3.anarazel.de).
> I've finally found time to make it happen.

Thanks for working on it!


> Using tightly packed structures makes the expressions a lot smaller. I
> instrumented some before and after builds and ran "make check" just to see
> how much memory expressions were using. What I found was:
> 
> There were ~104K expressions.
> 
> Memory - bytes needed to hold the steps of the expressions
> Allocated Memory - memory is allocated in blocks, this is the bytes
> allocated to hold the expressions
> Wasted Memory - the difference between the allocated and used memory
> 
> Original code:
> Memory: min=64 max=9984 total=28232512 average=265
> Allocated Memory: min=1024 max=16384 total=71584 average=1045
> Wasted Memory: min=0 max=6400 total=82939072 average=780
> 
> New code:
> Memory: min=32 (50%) max=8128 (82%) total=18266712 (65%) average=175 (66%)
> Allocated Memory: min=192 (19%) max=9408 (57%) total=29386176 (26%)
> average=282 (27%)
> Wasted Memory: min=0 (0%) max=1280 (20%) total=9464 (13%) average=106
> (14%)

That's actually not *that* big of a saving...


> Another benefit of the way I did this is that the expression memory is
> never reallocated. This means that it is safe for one step to point
> directly to a field in another step without needing to separately palloc
> storage. That should allow us to simplify the code and reduce pointer
> traversals. (I haven't exploited this in the patch. I figured it would be a
> task for round 2 assuming you like what I've done here.)

Yes, that's a neat benefit. Although I think it'd even better if we
could work to the point where the mutable and the unchanging data is
separately allocated, so we can at some point avoid redundant expression
"compilations".


> The work was mostly mechanical. The only tricky bit was dealing with the
> places where you jump to step n+1 while building step n. Since we can't
> tell the address of step n+1 until it is actually built, I had to defer
> resolution of the jumps. So all the interesting bits of this patch are in
> ExprEvalPushStep().

I was wondering about a more general "symbol resolution" stage
anyway. Then we'd allocate individual steps during ExecInitExprRec, and
allocate the linear array after we know the exact size.


I think it'd be important to see some performance measurements,
especially for larger queries. It'd not be too surprising if the
different method of dereffing the next expression actually has a
negative performance effect, but I'm absolutely not sure of that.

Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign table creation and NOT VALID check constraints

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 12:35 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Aug 2, 2017 at 9:41 PM, Amit Langote
>>  wrote:
>>> Attached is a patch.  I think this could be considered a bug-fix,
>>> backpatchable to 9.6 which introduced this behavior change [1].
>
>> I could go either way on that.  It's not inconceivable somebody could
>> be unhappy about seeing this behavior change in a minor release.
>
> FWIW, I vote with the camp that this is a clear bug and needs to be
> fixed.  9.6 broke a behavior that could be relied on before that.
> We do not normally hesitate to fix regressions in minor releases.
>
> (That's not a vote for the patch as submitted; I haven't reviewed it.
> But we need to fix this.)

OK.  I'm going to commit and back-patch the substantive fix with a
comment change, but I'm not going to include Amit's documentation
changes for now because I'm not sure they are going to be sufficiently
clear.  There's not a lot of context for them where he put them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization

2017-08-03 Thread Fabien COELHO



For the CREATE stuff, the script language is SQL, the command to use it is
"psql"...



The real and hard part is to fill tables with meaningful pseudo-random
test data which do not violate constraints for any non trivial schema
involving foreign keys and various unique constraints.



The solution for this is SQL for trivial cases, think of:
   "INSERT INTO Foo() SELECT ... FROM generate_series(...);"


Yeah.  I was also thinking that complicated data-generation requirements
could be handled with plpgsql DO blocks, avoiding the need for hard-wired
code inside pgbench.


I do not thing that it is really be needed for what pgbench does, though. 
See attached attempt, including a no_foreign_keys option.


The only tricky thing is to have the elapsed/remaining advancement report 
on stdout, maybe with some PL/pgSQL.


Timings are very similar compared to "pgbench -i".

--
Fabien.

pgbench_init.sql
Description: application/sql

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for Secure Transport SSL library on macOS as OpenSSL alternative

2017-08-03 Thread Michael Paquier
On Thu, Aug 3, 2017 at 12:02 PM, Daniel Gustafsson  wrote:
> In https://postgr.es/m/69db7657-3f9d-4d30-8a4b-e06034251...@yesql.se I
> presented a WIP patch for adding support for the Apple Secure Transport SSL
> library on macOS as, an alternative to OpenSSL.  That patch got put on the
> backburner for a bit, but I’ve now found the time to make enough progress to
> warrant a new submission for discussions on this (and hopefully help hacking).
>
> It is a drop-in replacement for the OpenSSL code, and supports all the same
> features and options, except for two things: compression is not supported and
> the CRL cannot be loaded from a plain PEM file.  A Keychain must be used for
> that instead.

Is there a set of APIs to be able to get server certificate for the
frontend and the backend, and generate a hash of it? That matters for
channel binding support of SCRAM for tls-server-end-point. There were
no APIs to get the TLS finish message last time I looked at OSX stuff,
which mattered for tls-unique. It would be nice if we could get one.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 9:25 AM, Oliver Ford  wrote:
> Adds to the to_number() function the ability to convert Roman numerals
> to a number. This feature is on the formatting.c TODO list. It is not
> currently implemented in either Oracle, MSSQL or MySQL so gives
> PostgreSQL an edge :-)

I kind of put my head in my hands when I saw this.  I'm not really
sure it's worth complicating the code for something that has so little
practical utility, but maybe other people will feel differently.  I
can't deny the profound advantages associated with having a leg up on
Oracle.

The error reporting is a little wonky, although maybe no wonkier than
anything else about these conversion routines.

rhaas=# select to_number('q', 'rn');
ERROR:  invalid character "q"

(hmm, no position)

rhaas=# select to_number('dd', 'rn');
ERROR:  invalid character "D" at position 1

(now i get a position, but it's not really the right position; and the
problem isn't really that the character is invalid but that you don't
like me including it twice, and I said 'd' not 'D')

rhaas=# select to_number('à', 'rn');
ERROR:  invalid character "?"

(eh?)

How much call is there for a format that can only represent values up to 3999?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unused variable scanned_tuples in LVRelStats

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 1:10 AM, Masahiko Sawada  wrote:
> So we can remove scanned_tuples from LVRelStats struct and change the
> variable name num_tuples to scanned_tuples. Attached updated patch.

On second thought, I think we should just leave this alone.
scanned_tuples is closely tied to tupcount_pages, so it's a little
confusing to pull one out and not the other.  And we can't pull
tupcount_pages out of LVRelStats because lazy_cleanup_index needs it.
The current situation isn't doing any harm, so I'm not seeing much
point in changing it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A bug in mapping attributes in ATExecAttachPartition()

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 1:04 AM, Amit Langote
 wrote:
> Alright, attached updated 0001 does that.

Committed 0001 and 0002.  0003 needs a rebase.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-08-03 Thread David Fetter
On Thu, Aug 03, 2017 at 01:45:02PM -0400, Robert Haas wrote:
> On Thu, Aug 3, 2017 at 9:25 AM, Oliver Ford  wrote:
> > Adds to the to_number() function the ability to convert Roman numerals
> > to a number. This feature is on the formatting.c TODO list. It is not
> > currently implemented in either Oracle, MSSQL or MySQL so gives
> > PostgreSQL an edge :-)
> 
> I kind of put my head in my hands when I saw this.  I'm not really
> sure it's worth complicating the code for something that has so little
> practical utility, but maybe other people will feel differently.  I
> can't deny the profound advantages associated with having a leg up on
> Oracle.
> 
> The error reporting is a little wonky, although maybe no wonkier than
> anything else about these conversion routines.
> 
> rhaas=# select to_number('q', 'rn');
> ERROR:  invalid character "q"
> 
> (hmm, no position)
> 
> rhaas=# select to_number('dd', 'rn');
> ERROR:  invalid character "D" at position 1
> 
> (now i get a position, but it's not really the right position; and the
> problem isn't really that the character is invalid but that you don't
> like me including it twice, and I said 'd' not 'D')
> 
> rhaas=# select to_number('à', 'rn');
> ERROR:  invalid character "?"
> 
> (eh?)
> 
> How much call is there for a format that can only represent values up to 3999?

There are ways to represent much larger numbers, possibly bigger than
INT_MAX.  https://en.wikipedia.org/wiki/Roman_numerals#Large_numbers
https://en.wikipedia.org/wiki/Numerals_in_Unicode#Roman_numerals

As nearly as I can tell, this patch is late by 124 days.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cache lookup errors with functions manipulation object addresses

2017-08-03 Thread Michael Paquier
On Fri, Jul 21, 2017 at 8:53 AM, Michael Paquier
 wrote:
> I can see your point. The v1 proposed above adds quite a lot of error
> code churn to deal with the lack of missing_ok flag in
> getObjectDescription, getObjectIdentity and getObjectIdentityParts.
> And the cache lookup error messages cannot be object-specific either,
> so I fell back to using %u/%u with the class as first identifier.
> Let's go with what you suggest here then.

Thinking more on that, I'll go with the flag Alvaro suggested.

> Before producing any v2, I would still need some sort of consensus
> about a couple of points though to grab object details. Here is what I
> think should be done:
> 1) For functions, let's remove format_procedure_qualified, and replace
> it with format_procedure_extended which replaces
> format_procedure_internal.

format_procedure_qualified is only used for objaddr.c, so I am going
here for not defining a compatibility set of macros.

> 2) For relation attributes, we have now get_relid_attribute_name()
> which cannot tolerate failures, and get_attname which returns NULL on
> errors. My suggestion here is to remove get_relid_attribute_name, and
> add a missing_ok flag to get_attname. Let's do this as a refactoring
> patch. One thing that may matter is modules calling the existing APIs.
> We could provide a compatibility macro.

But here get_relid_attribute_name is old enough to have a
compatibility macro. So I'll add one in one of the refactoring
patches.

> 3) For types, the existing interface is more a mess. HEAD has
> allow_invalid which is used by the SQL function format_type. My
> suggestion here would be to remove allow_invalid and replace it with
> missing_ok, to return NULL if the type has gone missing, or issue an
> error depending on what caller wants. oidvectortypes() needs to be
> modified as well. I would suggest to change this interface as a
> refactoring patch.

With compatibility macros.

> 4) GetForeignServer and GetForeignDataWrapper need to have a
> missing_ok. I suggest to refactor them as well with a separate patch.
> 5) For operators, there is format_operator_internal which has its own
> idea of invalid objects. I think that the existing API should be
> reworked.

No convinced much here, format_operator_qualified is not widely used
as far as I see, so I would just replace it with the _extended
version.

> So, while the work of this thread is largely possible and can be done
> incrementally. The devil is in the details of how to handle the
> existing APIs. Reaching an agreement about all the points above is key
> to get a clean implementation we are happy with.

Extra opinions always welcome.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Subscription code improvements

2017-08-03 Thread Peter Eisentraut
On 7/13/17 23:53, Masahiko Sawada wrote:
> To summary, I think we now have only one issue; ALTER SUBSCRIPTION is
> not transactional, 0004 patch is addressing this issue .

We have two competing patches for this issue.  This patch moves the
killing to the end of the DDL transaction.  Your earlier patch makes the
tablesync work itself responsible for exiting.  Do you wish to comment
which direction to pursue?  (Doing both might also be an option?)

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] xmltable SQL conformance

2017-08-03 Thread Pavel Stehule
2017-08-03 22:14 GMT+02:00 Peter Eisentraut <
peter.eisentr...@2ndquadrant.com>:

> I'm looking to update the SQL conformance list for the release.  I'm
> wondering whether the new xmltable feature fully completes the following
> items:
>
> X300XMLTable
> X301XMLTable: derived column list option
> X302XMLTable: ordinality column option
> X303XMLTable: column default option
> X304XMLTable: passing a context item
>
> It looks so to me, but maybe you know more.
>

I am not sure what X304 means? Other are not supported 100% .. are related
to unsupported XQuery lang.

Regards

Pavel




>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] Support for Secure Transport SSL library on macOS as OpenSSL alternative

2017-08-03 Thread Daniel Gustafsson
> On 03 Aug 2017, at 19:27, Michael Paquier  wrote:
> 
> On Thu, Aug 3, 2017 at 12:02 PM, Daniel Gustafsson  wrote:
>> In https://postgr.es/m/69db7657-3f9d-4d30-8a4b-e06034251...@yesql.se I
>> presented a WIP patch for adding support for the Apple Secure Transport SSL
>> library on macOS as, an alternative to OpenSSL.  That patch got put on the
>> backburner for a bit, but I’ve now found the time to make enough progress to
>> warrant a new submission for discussions on this (and hopefully help 
>> hacking).
>> 
>> It is a drop-in replacement for the OpenSSL code, and supports all the same
>> features and options, except for two things: compression is not supported and
>> the CRL cannot be loaded from a plain PEM file.  A Keychain must be used for
>> that instead.
> 
> Is there a set of APIs to be able to get server certificate for the
> frontend and the backend, and generate a hash of it? That matters for
> channel binding support of SCRAM for tls-server-end-point.

I believe we can use SSLCopyPeerTrust() for that.  Admittedly I haven’t looked
at that yet so need to get my head around channel binding, but it seems to fit
the bill.

> There were no APIs to get the TLS finish message last time I looked at OSX
> stuff, which mattered for tls-unique.  It would be nice if we could get one.


Yeah, AFAICT there is no API for that.

cheers ./daniel

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] elog vs errmsg_internal

2017-08-03 Thread Alvaro Herrera
Peter Eisentraut wrote:
> Is there a preferred method to select between using elog() and
> errmsg_internal()?
> 
> Attached is a patch that converts some DEBUG messages to one of those
> two to remove them from translation, but I'm not sure which one to pick
> other than by random aesthetics.

I think the contrib changes are probably not useful, since contrib is
not a target for nls anyway.

I'm not sure that all DEBUG messages should stay untranslated.  If main
log messages (i.e. NOTICE/LOG and above) are translated, why not debug?
For example the ones in index.c and indexcmds.c.  I agree that many of
these debugs are useless when translated, but not all.  Clearly, it's a
judgement call which ones to mark for translation.

Some other random thoughts in the same area:

* I think translated messages in the server log are mostly an
operational failure.  I think we should default to C, perhaps offer
translation as an option that's not enabled by default.  Of course,
messages sent to client should be translated just like currently.

* Much worse is the fact that we send messages to the log in the
database encoding.  This means that having a server log mixing lines
from databases with different encodings is a failure; it's just not
possible to read the log at all.  A simple fix would be to transliterate
all messages to some common encoding (UTF8) so that at the log
file can at least be opened.  Another fix would be to have multiple log
files, one per encoding; or maybe go for one per database.  Neither of
these proposals seem particularly great.  Simply keeping the server log
in C locale would fix this problem too.




-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Macros bundling RELKIND_* conditions

2017-08-03 Thread Alvaro Herrera
Tom Lane wrote:

> I think Peter's got the error and the detail backwards.  It should be
> more like
> 
> ERROR: "someview" cannot have constraints
> DETAIL: "someview" is a view.
> 
> If we do it like that, we need one ERROR message per error reason,
> and one DETAIL per relkind, which should be manageable.

Hmm ... this works for me.  Hopefully we'd have the "foo is a view"
messages all centrally in pg_class.h (or maybe objectaddress, or some
other central place).  Then the "cannot have constraints" part would
appear directly in whatever .c file is doing the check; no need to
centralize in that case.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-08-03 Thread Oliver Ford
On Thursday, 3 August 2017, Robert Haas  wrote:

> On Thu, Aug 3, 2017 at 9:25 AM, Oliver Ford  > wrote:
> > Adds to the to_number() function the ability to convert Roman numerals
> > to a number. This feature is on the formatting.c TODO list. It is not
> > currently implemented in either Oracle, MSSQL or MySQL so gives
> > PostgreSQL an edge :-)
>
> I kind of put my head in my hands when I saw this.  I'm not really
> sure it's worth complicating the code for something that has so little
> practical utility, but maybe other people will feel differently.  I
> can't deny the profound advantages associated with having a leg up on
> Oracle.


The formatting.c file specifies it as a TODO, so I thought implementing it
would be worthwhile. As there is a to_roman conversion having it the other
way is good for completeness.


>
> The error reporting is a little wonky, although maybe no wonkier than
> anything else about these conversion routines.
>
> rhaas=# select to_number('q', 'rn');
> ERROR:  invalid character "q"
>
> (hmm, no position)
>
> rhaas=# select to_number('dd', 'rn');
> ERROR:  invalid character "D" at position 1
>
> (now i get a position, but it's not really the right position; and the
> problem isn't really that the character is invalid but that you don't
> like me including it twice, and I said 'd' not 'D')
>
> rhaas=# select to_number('à', 'rn');
> ERROR:  invalid character "?"
>
> (eh?)
>
> How much call is there for a format that can only represent values up to
> 3999?
>
>
The existing int_to_roman code goes up to 3999 so this patch is consistent
with that. I could extend both to handle Unicode values for large numbers?


> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] elog vs errmsg_internal

2017-08-03 Thread Tom Lane
Peter Eisentraut  writes:
> Is there a preferred method to select between using elog() and
> errmsg_internal()?

ereport(... errmsg_internal() ...) can be a win for debug messages that
are in hot code paths, because the test for whether the message will
get printed is able to short-circuit more work.  In particular,
if you have moderately expensive functions like syscache lookups in
the argument list of elog(), I believe those functions get evaluated
even if we end up not printing anything.  ereport() skips the
arg-list evaluation in such cases.

But if that doesn't seem very relevant, I'd tend to go for elog()
just because it's less typing.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] elog vs errmsg_internal

2017-08-03 Thread Peter Eisentraut
Is there a preferred method to select between using elog() and
errmsg_internal()?

Attached is a patch that converts some DEBUG messages to one of those
two to remove them from translation, but I'm not sure which one to pick
other than by random aesthetics.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 52f9f20d4141f8537da1a012a7ab99207a92e016 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 2 May 2017 21:49:48 -0400
Subject: [PATCH] Remove DEBUG messages from translation

---
 contrib/amcheck/verify_nbtree.c| 12 ++--
 src/backend/access/transam/multixact.c | 10 +-
 src/backend/access/transam/slru.c  |  6 ++
 src/backend/access/transam/varsup.c|  2 +-
 src/backend/access/transam/xlog.c  | 28 ++--
 src/backend/catalog/dependency.c   |  2 +-
 src/backend/catalog/index.c|  5 ++---
 src/backend/commands/indexcmds.c   |  8 
 src/backend/commands/tablecmds.c   |  6 +++---
 src/backend/libpq/be-secure-openssl.c  |  2 +-
 src/backend/parser/parse_utilcmd.c |  2 +-
 src/backend/postmaster/autovacuum.c|  9 +++--
 src/backend/postmaster/bgworker.c  | 13 +
 src/backend/postmaster/checkpointer.c  |  5 ++---
 src/backend/postmaster/postmaster.c|  2 +-
 src/backend/postmaster/syslogger.c |  3 +--
 src/backend/replication/logical/launcher.c |  3 +--
 src/backend/replication/logical/worker.c   |  8 
 src/backend/replication/syncrep.c  |  4 ++--
 src/backend/replication/walsender.c|  4 ++--
 src/backend/storage/lmgr/predicate.c   |  2 +-
 src/backend/storage/lmgr/proc.c|  4 ++--
 src/backend/storage/smgr/md.c  |  6 +++---
 src/backend/tcop/postgres.c|  6 +++---
 src/backend/utils/init/miscinit.c  |  3 +--
 25 files changed, 71 insertions(+), 84 deletions(-)

diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 9ae83dc839..ef5fc947ce 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -315,7 +315,7 @@ bt_check_every_level(Relation rel, bool readonly)
if (metad->btm_fastroot != metad->btm_root)
ereport(DEBUG1,
(errcode(ERRCODE_NO_DATA),
-errmsg("harmless fast root mismatch in index 
%s",
+errmsg_internal("harmless fast root mismatch 
in index %s",
RelationGetRelationName(rel)),
 errdetail_internal("Fast root block %u (level 
%u) differs from true root block %u (level %u).",

metad->btm_fastroot, metad->btm_fastlevel,
@@ -415,8 +415,8 @@ bt_check_level_from_leftmost(BtreeCheckState *state, 
BtreeLevel level)
else
ereport(DEBUG1,
(errcode(ERRCODE_NO_DATA),
-errmsg("block %u of index 
\"%s\" ignored",
-   current, 
RelationGetRelationName(state->rel;
+errmsg_internal("block %u of 
index \"%s\" ignored",
+   
 current, RelationGetRelationName(state->rel;
goto nextpage;
}
else if (nextleveldown.leftmost == InvalidBlockNumber)
@@ -823,8 +823,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
targetnext = opaque->btpo_next;
ereport(DEBUG1,
(errcode(ERRCODE_NO_DATA),
-errmsg("level %u leftmost page of index \"%s\" 
was found deleted or half dead",
-   opaque->btpo.level, 
RelationGetRelationName(state->rel)),
+errmsg_internal("level %u leftmost page of 
index \"%s\" was found deleted or half dead",
+
opaque->btpo.level, RelationGetRelationName(state->rel)),
 errdetail_internal("Deleted page found when 
building scankey from right sibling.")));
 
/* Be slightly more pro-active in freeing this memory, just in 
case */
@@ -950,7 +950,7 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 */
ereport(DEBUG1,
(errcode(ERRCODE_NO_DATA),
-errmsg("%s block %u of index \"%s\" has no 
first data item",
+errmsg_internal("%s 

Re: [HACKERS] xmltable SQL conformance

2017-08-03 Thread Pavel Stehule
2017-08-03 22:54 GMT+02:00 Pavel Stehule :

>
>
> 2017-08-03 22:14 GMT+02:00 Peter Eisentraut  com>:
>
>> I'm looking to update the SQL conformance list for the release.  I'm
>> wondering whether the new xmltable feature fully completes the following
>> items:
>>
>> X300XMLTable
>> X301XMLTable: derived column list option
>> X302XMLTable: ordinality column option
>> X303XMLTable: column default option
>> X304XMLTable: passing a context item
>>
>> It looks so to me, but maybe you know more.
>>
>
> I am not sure what X304 means? Other are not supported 100% .. are related
> to unsupported XQuery lang.
>

Others than x300, 301, 302, 303 are unsupported. I don't understand to X304
well - related syntax is ignored in Postgres if I understand.

Pavel


>
> Regards
>
> Pavel
>
>
>
>
>>
>> --
>> Peter Eisentraut  http://www.2ndQuadrant.com/
>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>
>
>


[HACKERS] xmltable SQL conformance

2017-08-03 Thread Peter Eisentraut
I'm looking to update the SQL conformance list for the release.  I'm
wondering whether the new xmltable feature fully completes the following
items:

X300XMLTable
X301XMLTable: derived column list option
X302XMLTable: ordinality column option
X303XMLTable: column default option
X304XMLTable: passing a context item

It looks so to me, but maybe you know more.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL_stashcache, or, what's our minimum Perl version?

2017-08-03 Thread Andrew Dunstan


On 07/31/2017 06:54 PM, Tom Lane wrote:
> but could we do something like
> my $pflags = "PROVE_FLAGS='" . ($ENV{PROVE_FLAGS} || "--timer") . "'";
>
> to allow overriding this choice from the buildfarm config?
>
>


I have committed this in a slightly different form.

cheers

andrew


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Functions

2017-08-03 Thread Robert Haas
On Thu, Jun 1, 2017 at 2:25 PM, Andres Freund  wrote:
> Just to clarify: I don't think it's a problem to do so for integers and
> most other simple scalar types. There's plenty hash algorithms that are
> endianess independent, and the rest is just a bit of care.

Do you have any feeling for which of those endianness-independent hash
functions might be a reasonable choice for us?

https://github.com/markokr/pghashlib implements various hash functions
for PostgreSQL, and claims that, of those implemented, crc32, Jenkins,
lookup3be and lookup3le, md5, and siphash24 are endian-independent.

An interesting point here is that Jeff Davis asserted in the original
post on this thread that our existing hash_any() wasn't portable, but
our current hash_any seems to be the Jenkins algorithm -- so I'm
confused.  Part of the problem seems to be that, according to
https://en.wikipedia.org/wiki/Jenkins_hash_function there are 4 of
those.  I don't know whether the one in pghashlib is the same one
we've implemented.

Kennel Marshall suggested xxhash as an endian-independent algorithm
upthread.  Code for that is available under a 2-clause BSD license.

PostgreSQL page checksums use an algorithm based on, but not exactly,
FNV-1a.  See storage/checksum_impl.h.  The comments there say this
algorithm was chosen with speed in mind.  Our version is not
endian-independent because it folds in 4-byte integers rather than
1-byte integers, but plain old FNV-1a *is* endian-independent and
could be used.

We also have an implementation of CRC32C in core - see port/pg_crc32.h
and port/pg_crc32c_sb8.c.  It's not clear to me whether this is
Endian-independent or not, although there is stuff that depends on
WORDS_BIGENDIAN, so, uh, maybe?

Some other possibly-interesting links:
https://research.neustar.biz/2011/12/29/choosing-a-good-hash-function-part-2/
http://greenrobot.org/essentials/features/performant-hash-functions-for-java/comparison-of-hash-functions/
https://www.strchr.com/hash_functions

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump does not handle indirectly-granted permissions properly

2017-08-03 Thread Stephen Frost
Tom, all,

* Stephen Frost (sfr...@snowman.net) wrote:
> This needs more cleanup, testing, and comments explaining why we're
> doing this (and then perhaps comments, somewhere.. in the backend ACL
> code that explains that the ordering needs to be preserved), but the
> basic idea seems sound to me and the case you presented does work with
> this patch (for me, at least) whereas what's in master didn't.

Alright, here's an updated patch which cleans things up a bit and adds
comments to explain what's going on.  I also updated the comments in
acl.h to explain that ordering actually does matter.

I've tried a bit to break the ordering in the backend a bit but there
could probably be more effort put into that, if I'm being honest.
Still, this definitely fixes the case which was being complained about
and therefore is a step in the right direction.

It's a bit late here, so I'll push this in the morning and watch the
buildfarm.

Thanks!

Stephen
From cbca78a387ecfed9570bd079541ec7906c990f0f Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Thu, 3 Aug 2017 21:23:37 -0400
Subject: [PATCH] Fix ordering in pg_dump of GRANTs

The order in which GRANTs are output is important as GRANTs which have
been GRANT'd by individuals via WITH GRANT OPTION GRANTs have to come
after the GRANT which included the WITH GRANT OPTION.  This happens
naturally in the backend during normal operation as we only change
existing ACLs in-place, only add new ACLs to the end, and when removing
an ACL we remove any which depend on it also.

Also, adjust the comments in acl.h to make this clear.

Unfortunately, the updates to pg_dump to handle initial privileges
involved pulling apart ACLs and then combining them back together and
could end up putting them back together in an invalid order, leading to
dumps which wouldn't restore.

Fix this by adjusting the queries used by pg_dump to ensure that the
ACLs are rebuilt in the same order in which they were originally.

Back-patch to 9.6 where the changes for initial privileges were done.
---
 src/bin/pg_dump/dumputils.c | 51 -
 src/include/utils/acl.h | 14 ++---
 2 files changed, 47 insertions(+), 18 deletions(-)

diff --git a/src/bin/pg_dump/dumputils.c b/src/bin/pg_dump/dumputils.c
index dfc6118..67049a6 100644
--- a/src/bin/pg_dump/dumputils.c
+++ b/src/bin/pg_dump/dumputils.c
@@ -722,21 +722,36 @@ buildACLQueries(PQExpBuffer acl_subquery, PQExpBuffer racl_subquery,
 	 * We always perform this delta on all ACLs and expect that by the time
 	 * these are run the initial privileges will be in place, even in a binary
 	 * upgrade situation (see below).
+	 *
+	 * Finally, the order in which privileges are in the ACL string (the order
+	 * they been GRANT'd in, which the backend maintains) must be preserved to
+	 * ensure that GRANTs WITH GRANT OPTION and subsequent GRANTs based on
+	 * those are dumped in the correct order.
 	 */
-	printfPQExpBuffer(acl_subquery, "(SELECT pg_catalog.array_agg(acl) FROM "
-	  "(SELECT pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s))) AS acl "
-	  "EXCEPT "
-	  "SELECT pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s as foo)",
+	printfPQExpBuffer(acl_subquery,
+	  "(SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM "
+	  "(SELECT acl, row_n FROM "
+			  "pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s))) "
+	  "WITH ORDINALITY AS perm(acl,row_n) "
+	  "WHERE NOT EXISTS ( "
+	  "SELECT 1 FROM "
+	  "pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s))) "
+	  "AS init(init_acl) WHERE acl = init_acl)) as foo)",
 	  acl_column,
 	  obj_kind,
 	  acl_owner,
 	  obj_kind,
 	  acl_owner);
 
-	printfPQExpBuffer(racl_subquery, "(SELECT pg_catalog.array_agg(acl) FROM "
-	  "(SELECT pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s))) AS acl "
-	  "EXCEPT "
-	  "SELECT pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s as foo)",
+	printfPQExpBuffer(racl_subquery,
+	  "(SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM "
+	  "(SELECT acl, row_n FROM "
+	"pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s))) "
+	  "WITH ORDINALITY AS initp(acl,row_n) "
+	  "WHERE NOT EXISTS ( "
+	  "SELECT 1 FROM "
+			"pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s))) "
+	  "AS permp(orig_acl) WHERE acl = orig_acl)) as foo)",
 	  obj_kind,
 	  acl_owner,
 	  acl_column,
@@ -761,19 +776,25 @@ buildACLQueries(PQExpBuffer acl_subquery, PQExpBuffer racl_subquery,
 	{
 		printfPQExpBuffer(init_acl_subquery,
 		  "CASE WHEN privtype = 'e' THEN "
-		  "(SELECT pg_catalog.array_agg(acl) FROM "
-		  "(SELECT pg_catalog.unnest(pip.initprivs) AS acl "
-		  "EXCEPT "
-		  "SELECT pg_catalog.unnest(pg_catalog.acldefault(%s,%s))) as foo) END",
+	  "(SELECT pg_catalog.array_agg(acl 

Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?

2017-08-03 Thread Shay Rojansky
I tested the patch.

Doing SSL_CTX_set_session_cache_mode(context, SSL_SESS_CACHE_OFF) doesn't
have any effect whatsoever - I still have the same issue (session id
context uninitialized). I suspect session caching is an entirely different
feature from session tickets/RFC5077 (although it might still be a good
idea to disable).

Doing SSL_CTX_set_options(context, SSL_OP_NO_TICKET) indeed resolves the
issue, as expected. As I wrote above, I'd remove the #ifdef and execute it
always.

I'm still not convinced of the risk/problem of simply setting the session
id context as I explained above (rather than disabling the optimization),
but of course either solution resolves my problem.


Re: [HACKERS] Hash Functions

2017-08-03 Thread Andres Freund
Hi,

On 2017-08-03 17:43:44 -0400, Robert Haas wrote:
> For me, the basic point here is that we need a set of hash functions
> for hash partitioning that are different than what we use for hash
> indexes and hash joins -- otherwise when we hash partition a table and
> create hash indexes on each partition, those indexes will have nasty
> clustering.  Partitionwise hash joins will have similar problems.  So,
> a new set of hash functions specifically for hash partitioning is
> quite desirable.

Couldn't that just as well solved by being a bit smarter with an IV? I
doubt we want to end up with different hashfunctions for sharding,
partitioning, hashjoins (which seems to form a hierarchy). Having a
working hash-combine function, or even better a hash API that can
continue to use the hash's internal state, seems a more scalable
solution.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add Roman numeral conversion to to_number

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 2:54 PM, Oliver Ford  wrote:
> The formatting.c file specifies it as a TODO, so I thought implementing it
> would be worthwhile. As there is a to_roman conversion having it the other
> way is good for completeness.

Huh, didn't know about that.  Well, I'm not direly opposed to this or
anything, just not sure that it's worth spending a lot of time on.

> The existing int_to_roman code goes up to 3999 so this patch is consistent
> with that. I could extend both to handle Unicode values for large numbers?

Well, following what the existing code does is usually a good place to
start -- whether you want to try to extend it is up to you.  I'm not
very interested in Roman numeral handling personally, so you might
want to wait for some opinions from others.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Functions

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 5:32 PM, Andres Freund  wrote:
>> Do you have any feeling for which of those endianness-independent hash
>> functions might be a reasonable choice for us?
>
> Not a strong / very informed one, TBH.
>
> I'm not convinced it's worth trying to achieve this in the first place,
> now that we "nearly" have the insert-via-parent feature. Isn't this a
> lot of work, for very little practical gain? Having to select that when
> switching architectures still seems unproblematic. People will just
> about never switch endianess, so even a tiny performance & effort
> overhead doesn't seem worth it to me.

I kind of agree with you.  There are some advantages of being
endian-independent, like maybe your hash partitioning is really across
multiple shards, and not all the shards are the same machine
architecture, but it's not going to come up for most people.

For me, the basic point here is that we need a set of hash functions
for hash partitioning that are different than what we use for hash
indexes and hash joins -- otherwise when we hash partition a table and
create hash indexes on each partition, those indexes will have nasty
clustering.  Partitionwise hash joins will have similar problems.  So,
a new set of hash functions specifically for hash partitioning is
quite desirable.

Given that, if it's not a big problem to pick ones that have the
portability properties at least some people want, I'd be inclined to
do it.  If it results in a noticeable slowdown on macrobenchmarks,
then not so much, but otherwise, I'd rather do what people are asking
for than spend time arguing about it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Functions

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 6:08 PM, Andres Freund  wrote:
>> That's another way to go, but it requires inventing a way to thread
>> the IV through the hash opclass interface.
>
> Only if we really want to do it really well :P. Using a hash_combine()
> like
>
> /*
>  * Combine two hash values, resulting in another hash value, with decent bit
>  * mixing.
>  *
>  * Similar to boost's hash_combine().
>  */
> static inline uint32
> hash_combine(uint32 a, uint32 b)
> {
> a ^= b + 0x9e3779b9 + (a << 6) + (a >> 2);
> return a;
> }
>
> between hash(IV) and the hashfunction should do the trick (the IV needs
> to hashed once, otherwise the bit mix is bad).

That seems pretty lame, although it's sufficient to solve the
immediate problem, and I have to admit to a certain predilection for
things that solve the immediate problem without creating lots of
additional work.

>> We could:
>>
>> - Invent a new hash_partition AM that doesn't really make indexes but
>> supplies hash functions for hash partitioning.
>> - Add a new, optional support function 2 to the hash AM that takes a
>> value of the type *and* an IV as an argument.
>> - Something else.
>
> Not arguing for it, but one option could also have pg_type.hash*
> function(s).

True.  That is a bit less configurable because you can't then have
multiple functions for the same type.  Going through the opclass
interface means you can have hash_portable_ops and hash_fast_ops and
let people choose.  But this would be easy to implement and enough for
most people in practice.

> One thing that I think might be advisable to think about is that we're
> atm stuck with a relatively bad hash function for hash indexes (and hash
> joins/aggs), and we should probably evolve it at some point. At the same
> time there's currently people out there relying on the current hash
> functions remaining stable.

That to me is a bit of a separate problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] analyzeCTE is too strict about typmods?

2017-08-03 Thread Tom Lane
I wrote:
> In short, therefore, it's looking to me like analyzeCTE() is wrong here.
> It should allow the case where the recursive result has typmod -1 while
> the non-recursive output column has some more-specific typmod, so long
> as they match on type OID.  That would correspond to what we do in
> regular non-recursive UNION situations.

Oh, scratch that.  I was thinking that we could simply relax the error
check, but that really doesn't work at all.  The problem is that by now,
we have probably already generated Vars referencing the outputs of the
recursive CTE, and those will have the more-specific typmod, which is
wrong in this scenario.  Usually that wouldn't matter too much, but
there are cases where it would matter.

We could imagine dealing with this by re-parse-analyzing the recursive
term using typmod -1 for the CTE output column, but I don't much want
to go there.  It wouldn't be cheap, and I'm not quite sure it's guaranteed
to converge anyway.

What's seeming like an attractive compromise is to change the HINT
to recommend manually coercing the recursive term, instead of the
non-recursive one.  Adjusting the error cursor to point to that side
might be a bit painful, but it's probably doable.

Thoughts?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign table creation and NOT VALID check constraints

2017-08-03 Thread Amit Langote
On 2017/08/04 2:13, Robert Haas wrote:
> On Thu, Aug 3, 2017 at 12:35 AM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> On Wed, Aug 2, 2017 at 9:41 PM, Amit Langote
>>>  wrote:
 Attached is a patch.  I think this could be considered a bug-fix,
 backpatchable to 9.6 which introduced this behavior change [1].
>>
>>> I could go either way on that.  It's not inconceivable somebody could
>>> be unhappy about seeing this behavior change in a minor release.
>>
>> FWIW, I vote with the camp that this is a clear bug and needs to be
>> fixed.  9.6 broke a behavior that could be relied on before that.
>> We do not normally hesitate to fix regressions in minor releases.
>>
>> (That's not a vote for the patch as submitted; I haven't reviewed it.
>> But we need to fix this.)
> 
> OK.  I'm going to commit and back-patch the substantive fix with a
> comment change, but I'm not going to include Amit's documentation
> changes for now because I'm not sure they are going to be sufficiently
> clear.  There's not a lot of context for them where he put them.

Thanks for committing the code changes.

About the documentation changes, it seems that the only places where any
description of NOT VALID appears is ALTER TABLE, ALTER FOREIGN TABLE, and
ALTER DOMAIN references pages.  Even if the CREATE (FOREIGN) TABLE syntax
allows it, NOT VALID does not appear in the syntax synopsis, so it seems
kind of a hidden feature.  Maybe, we should fix that first (if at all).

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-08-03 Thread Stephen Frost
Michael,

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Wed, Aug 2, 2017 at 7:58 PM, Stephen Frost  wrote:
> > * Michael Paquier (michael.paqu...@gmail.com) wrote:
> >> Do you need a back-patchable version for 9.6? I could get one out of
> >> my pocket if necessary.
> >
> > I was just trying to find a bit of time to generate exactly that- if
> > you have a couple spare cycles, it would certainly help.
> 
> OK, here you go. Even if archive_mode = always has been introduced in
> 9.5, but non-exclusive mode is a 9.6 feature, so here are patches down
> to this version. I am pretty satisfied by this, and I included all the
> comments and error message corrections reviewed up to now. I noticed
> some incorrect comments, doc typos and an incorrect indentation as
> well for the WARNING reported to the user when waiting for the
> archiving.

Thanks much for this.  I've looked over it some and it looks pretty good
to me.  I'm going to play with it a bit more and, barring issues, will
push them tomorrow morning.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-08-03 Thread Masahiko Sawada
On Fri, Aug 4, 2017 at 10:48 AM, Stephen Frost  wrote:
> Michael,
>
> * Michael Paquier (michael.paqu...@gmail.com) wrote:
>> On Wed, Aug 2, 2017 at 7:58 PM, Stephen Frost  wrote:
>> > * Michael Paquier (michael.paqu...@gmail.com) wrote:
>> >> Do you need a back-patchable version for 9.6? I could get one out of
>> >> my pocket if necessary.
>> >
>> > I was just trying to find a bit of time to generate exactly that- if
>> > you have a couple spare cycles, it would certainly help.
>>
>> OK, here you go. Even if archive_mode = always has been introduced in
>> 9.5, but non-exclusive mode is a 9.6 feature, so here are patches down
>> to this version. I am pretty satisfied by this, and I included all the
>> comments and error message corrections reviewed up to now. I noticed
>> some incorrect comments, doc typos and an incorrect indentation as
>> well for the WARNING reported to the user when waiting for the
>> archiving.
>
> Thanks much for this.  I've looked over it some and it looks pretty good
> to me.  I'm going to play with it a bit more and, barring issues, will
> push them tomorrow morning.
>

Thank you for the patches, Michael-san! It looks good to me too.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reload-through-the-top-parent switch the partition table

2017-08-03 Thread Amit Langote
On 2017/08/04 1:08, David G. Johnston wrote:
> On Thu, Aug 3, 2017 at 8:53 AM, Tom Lane  wrote:
> 
>> Robert Haas  writes:
>>> So maybe --load-via-partition-root if nobody likes my previous
>>> suggestion of --partition-data-via-root ?
>>
>> WFM.
>>
> 
> ​+1

+1.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] POC: Cache data in GetSnapshotData()

2017-08-03 Thread Mithun Cy
On 3 Aug 2017 2:16 am, "Andres Freund"  wrote:

Hi Andres thanks for detailed review. I agree with all of the comments. I
am going for a vacation. Once I come back I will fix those comments and
will submit a new patch.


Re: [HACKERS] AlterUserStmt anmd RoleSpec rules in grammar.y

2017-08-03 Thread Peter Eisentraut
On 7/31/17 20:42, Peter Eisentraut wrote:
> That looks like a bug to me.  ALTER USER also does not support the IN
> DATABASE clause, so the code deviation might have started there already.
> 
> I propose the attached patch to clean this up.
> 
> For backpatching, I could develop some less invasive versions.

done and done

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-03 Thread Amit Kapila
On Wed, Aug 2, 2017 at 9:04 PM, Robert Haas  wrote:
> On Wed, Jul 12, 2017 at 1:10 AM, Amit Kapila  wrote:
 Yes, I also think the same idea can be used, in fact, I have mentioned
 it [1] as soon as you have committed that patch.  Do we want to do
 anything at this stage for PG-10?  I don't think we should attempt
 something this late unless people feel this is a show-stopper issue
 for usage of hash indexes.  If required, I think a separate function
 can be provided to allow users to perform squeeze operation.
>>>
>>> Sorry, I have no idea how critical this squeeze thing is for the
>>> newfangled hash indexes, so I cannot comment on that.  Does this make
>>> the indexes unusable in some way under some circumstances?
>>
>> It seems so.  Basically, in the case of a large number of duplicates,
>> we hit the maximum number of overflow pages.  There is a theoretical
>> possibility of hitting it but it could also happen that we are not
>> free the existing unused overflow pages due to which it keeps on
>> growing and hit the limit.  I have requested up thread to verify if
>> that is happening in this case and I am still waiting for same.  The
>> squeeze operation does free such unused overflow pages after cleaning
>> them.  As this is a costly operation and needs a cleanup lock, so we
>> currently perform it only during Vacuum and next split from the bucket
>> which can have redundant overflow pages.
>
> Oops.  It was rather short-sighted of us not to increase
> HASH_MAX_BITMAPS when we bumped HASH_VERSION.  Actually removing that
> limit is hard, but we could have easily bumped it for 128 to say 1024
> without (I think) causing any problem, which would have given us quite
> a bit of headroom here.

Yes, that sounds sensible, but I think it will just delay the problem
to happen.  I think here the actual problem is that we are not able to
perform squeeze operation often enough that it frees the overflow
pages.  Currently, we try to perform the squeeze only at the start of
next split of the bucket or during vacuum.  The reason for doing it
that way was that squeeze operation needs cleanup lock and we already
have that during the start of split and vacuum. Now, to solve it I
have already speculated few ways above [1] and among those, it is
feasible to either do this at end of split which can have performance
implications in some work loads, but will work fine for the case
reported in this thread and another is to some way (like we do for
Brin index in commit 7526e10224f0792201e99631567bbe44492bbde4) trigger
vacuum.

I think we can fix it in one of above ways and increase the value of
HASH_MAX_BITMAPS as well.

What do you say?

>  I suppose we could still try to jam that
> change in before beta3 (bumping HASH_VERSION again) but that might be
> asking for trouble.
>

I am not sure if we have any other option if we decide to increase the
value of HASH_MAX_BITMAPS. BTW, do you think some users will rely on
hash index built on some of the beta version?

Note - AP has off list shared the data dump and we (Ashutosh Sharma
and me) are able to reproduce the problem and we could see that if we
force vacuum via the debugger, then it is able to free overflow pages.
The exact numbers are not available at this stage as the test is not
complete.

[1] - 
https://www.postgresql.org/message-id/CAA4eK1KKq80BYOc%2BmcmHcQzV0Mcs3AHGjEEf--TnLaJbkeTgmg%40mail.gmail.com
-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Update comments in nodeModifyTable.c

2017-08-03 Thread Etsuro Fujita

On 2017/08/04 1:52, Robert Haas wrote:

On Thu, Aug 3, 2017 at 5:55 AM, Etsuro Fujita
 wrote:

I updated the patch that way.  Attached is a new version of the patch.


Committed.


Thanks again.

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] map_partition_varattnos() and whole-row vars

2017-08-03 Thread Etsuro Fujita

On 2017/08/04 10:00, Amit Langote wrote:

On 2017/08/04 1:06, Robert Haas wrote:

On Thu, Aug 3, 2017 at 4:40 AM, Etsuro Fujita
 wrote:

On 2017/08/03 17:12, Amit Langote wrote:

Attached updated patches.


Thanks for the patch!  That looks good to me.


Committed with some comment changes.


Thanks a lot.


Thanks!

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?

2017-08-03 Thread Tom Lane
Shay Rojansky  writes:
> I tested the patch.

Thanks!

> Doing SSL_CTX_set_session_cache_mode(context, SSL_SESS_CACHE_OFF) doesn't
> have any effect whatsoever - I still have the same issue (session id
> context uninitialized). I suspect session caching is an entirely different
> feature from session tickets/RFC5077 (although it might still be a good
> idea to disable).

Right, we expected that that would have no visible effect, because there
is no way to cache sessions in Postgres anyway.  The main point, if I
understand Heikki's concern correctly, is that this might save some
amount of low-level overhead from clients trying to cache connections.

> Doing SSL_CTX_set_options(context, SSL_OP_NO_TICKET) indeed resolves the
> issue, as expected.

Excellent.  I'll push this patch tomorrow sometime (too late/tired
right now).

> As I wrote above, I'd remove the #ifdef and execute it always.

The reason I put the #ifdef in is that according to my research the
SSL_OP_NO_TICKET symbol was introduced in openssl 0.9.8f, while we
claim to support back to 0.9.8.  I'd be the first to say that you're
nuts if you're running openssl versions that old; but this patch is not
something to move the compatibility goalposts for when it only takes
an #ifdef to avoid breaking older versions.

(I need to check how far back SSL_SESS_CACHE_OFF goes ... we might
need an #ifdef for that too.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-08-03 Thread Stephen Frost
* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Thu, Aug 3, 2017 at 4:29 AM, Stephen Frost  wrote:
> > I'll provide another update tomorrow.  Hopefully Michael is able to produce
> > a 9.6 patch, otherwise I'll do it.
> 
> I have sent an updated version of the patch, with something that can
> be used for 9.6 as well. It would be nice to get something into the
> next set of minor releases.

Thanks for the patches.  I'm planning to push them tomorrow morning
after a bit more review and testing.  I'll provide an update tomorrow.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?

2017-08-03 Thread Andres Freund
On 2017-08-04 07:22:42 +0300, Shay Rojansky wrote:
> I'm still not convinced of the risk/problem of simply setting the session
> id context as I explained above (rather than disabling the optimization),
> but of course either solution resolves my problem.

How would that do anything? Each backend has it's own local
memory. I.e. any cache state that openssl would maintain wouldn't be
useful. If you want to take advantage of features around this you really
need to cache tickets in shared memory...

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Change in "policy" on dump ordering?

2017-08-03 Thread Tom Lane
Michael Banck  writes:
> Am Donnerstag, den 27.07.2017, 15:52 -0400 schrieb Tom Lane:
>> So I'm thinking we should consider the multi-pass patch I posted as
>> a short-term, backpatchable workaround, which we could hope to
>> replace with dependency logic later.

> +1, it would be really nice if this could be fixed in the back-branches 
> before v11.

Done.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] map_partition_varattnos() and whole-row vars

2017-08-03 Thread Amit Langote
On 2017/08/04 1:06, Robert Haas wrote:
> On Thu, Aug 3, 2017 at 4:40 AM, Etsuro Fujita
>  wrote:
>> On 2017/08/03 17:12, Amit Langote wrote:
>>> Attached updated patches.
>>
>> Thanks for the patch!  That looks good to me.
> 
> Committed with some comment changes.

Thanks a lot.

Regards,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-08-03 Thread Mark Rofail
To better understand a limitation I ask 5 questions

What is the limitation?
Why is there a limitation?
Why is it a limitation?
What can we do?
Is it feasible?

Through some reading:

*What is the limitation?*
presupposes that count(distinct y) has exactly the same notion of equality
that the PK unique index has. In reality, count(distinct) will fall back to
the default btree opclass for the array element type.

the planner may choose an optimization of this sort when the index's
opclass matches the one
DISTINCT will use, ie the default for the data type.

*Why is there a limitation?*
necessary because ri_triggers.c relies on COUNT(DISTINCT x) on the element
type, as well as on array_eq() on the array type, and we need those
operations to have the same notion of equality that we're using otherwise.

*Why is it a limitation?*
That's wrong: DISTINCT should use the equality operator that corresponds
to the index' operator class instead, not the default one.

*What can we do ?*
I'm sure that we can replace array_eq() with a newer polymorphic version
but I don't know how we could get around COUNT(DISTINCT x)

*Is it feasible? *
I don't think I have the experience to answer that

Best Regards,
Mark Rofail


[HACKERS] analyzeCTE is too strict about typmods?

2017-08-03 Thread Tom Lane
I'm not sure why bug #7926 didn't get any love when filed,
but the issue came up again today:
https://www.postgresql.org/message-id/264036359.6712710.1501784552...@mail.yahoo.com
and it does seem like this is pretty curious behavior.
A minimal reproducer is

regression=# create table base (f1 numeric(7,3));
CREATE TABLE
regression=# with recursive foo as (
select f1 from base
zunion all
select f1+1 from foo
) select * from foo;
ERROR:  recursive query "foo" column 1 has type numeric(7,3) in non-recursive 
term but type numeric overall
LINE 2: select f1 from base
   ^
HINT:  Cast the output of the non-recursive term to the correct type.

Now the thing about that is that the HINT's advice doesn't work:

regression=# with recursive foo as (
select f1::numeric from base
union all
select f1+1 from foo
) select * from foo;
ERROR:  recursive query "foo" column 1 has type numeric(7,3) in non-recursive 
term but type numeric overall
LINE 2: select f1::numeric from base
   ^
HINT:  Cast the output of the non-recursive term to the correct type.

The reason for this is that parse_coerce.c treats casting a value that's
already of the required type to typmod -1 as a complete no-op (see first
check in coerce_type_typmod).  So the result is still just a Var for "f1".

We could imagine fixing this by insisting that a RelabelType with typmod
-1 should be plastered atop the expression in such cases.  But I'm worried
about the potential side-effects of that, and anyway I'm not convinced
that parse_coerce.c is wrong to be doing it this way: typmod -1 generally
means "unspecified typmod", so the bare Var seems like it ought to be
considered to satisfy the typmod spec.  Besdies, if you just do this:

select f1 from base
union all
select f1+1 from base;

it works, producing a UNION result deemed to have typmod -1, and there's
no extra decoration added to the Var in the first leaf SELECT.

In short, therefore, it's looking to me like analyzeCTE() is wrong here.
It should allow the case where the recursive result has typmod -1 while
the non-recursive output column has some more-specific typmod, so long
as they match on type OID.  That would correspond to what we do in
regular non-recursive UNION situations.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Functions

2017-08-03 Thread Andres Freund
Hi,

On 2017-08-03 17:09:41 -0400, Robert Haas wrote:
> On Thu, Jun 1, 2017 at 2:25 PM, Andres Freund  wrote:
> > Just to clarify: I don't think it's a problem to do so for integers and
> > most other simple scalar types. There's plenty hash algorithms that are
> > endianess independent, and the rest is just a bit of care.
> 
> Do you have any feeling for which of those endianness-independent hash
> functions might be a reasonable choice for us?

Not a strong / very informed one, TBH.

I'm not convinced it's worth trying to achieve this in the first place,
now that we "nearly" have the insert-via-parent feature. Isn't this a
lot of work, for very little practical gain? Having to select that when
switching architectures still seems unproblematic. People will just
about never switch endianess, so even a tiny performance & effort
overhead doesn't seem worth it to me.


Leaving that aside:


> https://github.com/markokr/pghashlib implements various hash functions
> for PostgreSQL, and claims that, of those implemented, crc32, Jenkins,
> lookup3be and lookup3le, md5, and siphash24 are endian-independent.

> An interesting point here is that Jeff Davis asserted in the original
> post on this thread that our existing hash_any() wasn't portable, but
> our current hash_any seems to be the Jenkins algorithm -- so I'm
> confused.  Part of the problem seems to be that, according to
> https://en.wikipedia.org/wiki/Jenkins_hash_function there are 4 of
> those.  I don't know whether the one in pghashlib is the same one
> we've implemented.

IIUC lookup3be/le from Marko's hashlib just has a endianess conversion
added.  I'd personally not go for jenkins3, it's not particularly fast,
nor does it balance that out w/ being cryptographicaly secure.


> Kennel Marshall suggested xxhash as an endian-independent algorithm
> upthread.  Code for that is available under a 2-clause BSD license.

Yea, that'd have been one of my suggestions too. Especially as I still
want to implement better compression using lz4, and that'll depend on
xxhash in turn.


> PostgreSQL page checksums use an algorithm based on, but not exactly,
> FNV-1a.  See storage/checksum_impl.h.  The comments there say this
> algorithm was chosen with speed in mind.  Our version is not
> endian-independent because it folds in 4-byte integers rather than
> 1-byte integers, but plain old FNV-1a *is* endian-independent and
> could be used.

Non-SIMDed (which we hope to achieve with our implementation, which is
why we have separate compiler flags for that file) implementations of
FNV are, to my knowledge, not particularly fast. And the SIMD tricks
are, to my knowledge, not really applicable to the case at hand here. So
I'm not a fan of choosing FNV.


> We also have an implementation of CRC32C in core - see port/pg_crc32.h
> and port/pg_crc32c_sb8.c.  It's not clear to me whether this is
> Endian-independent or not, although there is stuff that depends on
> WORDS_BIGENDIAN, so, uh, maybe?

The results should be endian independent. It depends on WORDS_BIGENDIAN
because we need different pre-computed tables depending on endianess.


Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Functions

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 5:50 PM, Andres Freund  wrote:
> On 2017-08-03 17:43:44 -0400, Robert Haas wrote:
>> For me, the basic point here is that we need a set of hash functions
>> for hash partitioning that are different than what we use for hash
>> indexes and hash joins -- otherwise when we hash partition a table and
>> create hash indexes on each partition, those indexes will have nasty
>> clustering.  Partitionwise hash joins will have similar problems.  So,
>> a new set of hash functions specifically for hash partitioning is
>> quite desirable.
>
> Couldn't that just as well solved by being a bit smarter with an IV? I
> doubt we want to end up with different hashfunctions for sharding,
> partitioning, hashjoins (which seems to form a hierarchy). Having a
> working hash-combine function, or even better a hash API that can
> continue to use the hash's internal state, seems a more scalable
> solution.

That's another way to go, but it requires inventing a way to thread
the IV through the hash opclass interface.  That's actually sort of a
problem anyway.  Maybe I ought to have started with the question of
how we're going to make that end of things work.  We could:

- Invent a new hash_partition AM that doesn't really make indexes but
supplies hash functions for hash partitioning.
- Add a new, optional support function 2 to the hash AM that takes a
value of the type *and* an IV as an argument.
- Something else.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hash Functions

2017-08-03 Thread Andres Freund
On 2017-08-03 17:57:37 -0400, Robert Haas wrote:
> On Thu, Aug 3, 2017 at 5:50 PM, Andres Freund  wrote:
> > On 2017-08-03 17:43:44 -0400, Robert Haas wrote:
> >> For me, the basic point here is that we need a set of hash functions
> >> for hash partitioning that are different than what we use for hash
> >> indexes and hash joins -- otherwise when we hash partition a table and
> >> create hash indexes on each partition, those indexes will have nasty
> >> clustering.  Partitionwise hash joins will have similar problems.  So,
> >> a new set of hash functions specifically for hash partitioning is
> >> quite desirable.
> >
> > Couldn't that just as well solved by being a bit smarter with an IV? I
> > doubt we want to end up with different hashfunctions for sharding,
> > partitioning, hashjoins (which seems to form a hierarchy). Having a
> > working hash-combine function, or even better a hash API that can
> > continue to use the hash's internal state, seems a more scalable
> > solution.
> 
> That's another way to go, but it requires inventing a way to thread
> the IV through the hash opclass interface.

Only if we really want to do it really well :P. Using a hash_combine()
like

/*
 * Combine two hash values, resulting in another hash value, with decent bit
 * mixing.
 *
 * Similar to boost's hash_combine().
 */
static inline uint32
hash_combine(uint32 a, uint32 b)
{
a ^= b + 0x9e3779b9 + (a << 6) + (a >> 2);
return a;
}

between hash(IV) and the hashfunction should do the trick (the IV needs
to hashed once, otherwise the bit mix is bad).


> That's actually sort of a
> problem anyway.  Maybe I ought to have started with the question of
> how we're going to make that end of things work.

+1 one for that plan.


> We could:
> 
> - Invent a new hash_partition AM that doesn't really make indexes but
> supplies hash functions for hash partitioning.
> - Add a new, optional support function 2 to the hash AM that takes a
> value of the type *and* an IV as an argument.
> - Something else.

Not arguing for it, but one option could also have pg_type.hash*
function(s).

One thing that I think might be advisable to think about is that we're
atm stuck with a relatively bad hash function for hash indexes (and hash
joins/aggs), and we should probably evolve it at some point. At the same
time there's currently people out there relying on the current hash
functions remaining stable.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL_stashcache, or, what's our minimum Perl version?

2017-08-03 Thread Tom Lane
Andrew Dunstan  writes:
> On 07/31/2017 06:54 PM, Tom Lane wrote:
>> but could we do something like
>> my $pflags = "PROVE_FLAGS='" . ($ENV{PROVE_FLAGS} || "--timer") . "'";
>> to allow overriding this choice from the buildfarm config?

> I have committed this in a slightly different form.

Thanks.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A bug in mapping attributes in ATExecAttachPartition()

2017-08-03 Thread Amit Langote
On 2017/08/04 3:29, Robert Haas wrote:
> On Thu, Aug 3, 2017 at 1:04 AM, Amit Langote
>  wrote:
>> Alright, attached updated 0001 does that.
> 
> Committed 0001 and 0002.

Thanks.

> 0003 needs a rebase.

Rebased patch attached.

Thanks,
Amit
From f069845c027acc36aab4790d6d6afbf50bba803e Mon Sep 17 00:00:00 2001
From: amit 
Date: Tue, 1 Aug 2017 10:58:38 +0900
Subject: [PATCH 1/2] Cope with differing attnos in ATExecAttachPartition code

If the table being attached has attnos different from the parent for
the partitioning columns which are present in the partition constraint
expressions, then predicate_implied_by() will prematurely return false
due to structural inequality of the corresponding Var expressions in the
the partition constraint and those in the table's check constraint
expressions.  Fix this by changing the partition constraint's expressions
to bear the partition's attnos.

Further, if the validation scan needs to be performed after all and
the table being attached is a partitioned table, we will need to map
the constraint expression again to change the attnos to the individual
leaf partition's attnos from those of the table being attached.

Reported by: Ashutosh Bapat
Report: 
https://postgr.es/m/CAFjFpReT_kq_uwU_B8aWDxR7jNGE%3DP0iELycdq5oupi%3DxSQTOw%40mail.gmail.com
---
 src/backend/commands/tablecmds.c  | 40 ++-
 src/test/regress/expected/alter_table.out | 45 +++
 src/test/regress/sql/alter_table.sql  | 38 ++
 3 files changed, 111 insertions(+), 12 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7859ef13ac..1b8d4b3d17 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -13433,6 +13433,7 @@ ATExecAttachPartition(List **wqueue, Relation rel, 
PartitionCmd *cmd)
boolskip_validate = false;
ObjectAddress address;
const char *trigger_name;
+   boolfound_whole_row;
 
attachrel = heap_openrv(cmd->name, AccessExclusiveLock);
 
@@ -13614,6 +13615,16 @@ ATExecAttachPartition(List **wqueue, Relation rel, 
PartitionCmd *cmd)
partConstraint = list_make1(make_ands_explicit(partConstraint));
 
/*
+* Adjust the generated constraint to match this partition's attribute
+* numbers.
+*/
+   partConstraint = map_partition_varattnos(partConstraint, 1, attachrel,
+   
 rel, _whole_row);
+   /* There can never be a whole-row reference here */
+   if (found_whole_row)
+   elog(ERROR, "unexpected whole-row reference found in partition 
key");
+
+   /*
 * Check if we can do away with having to scan the table being attached 
to
 * validate the partition constraint, by *proving* that the existing
 * constraints of the table *imply* the partition predicate.  We include
@@ -13712,8 +13723,7 @@ ATExecAttachPartition(List **wqueue, Relation rel, 
PartitionCmd *cmd)
AlteredTableInfo *tab;
Oid part_relid = lfirst_oid(lc);
Relationpart_rel;
-   Expr   *constr;
-   boolfound_whole_row;
+   List   *my_partconstr = partConstraint;
 
/* Lock already taken */
if (part_relid != RelationGetRelid(attachrel))
@@ -13732,18 +13742,24 @@ ATExecAttachPartition(List **wqueue, Relation rel, 
PartitionCmd *cmd)
continue;
}
 
+   if (part_rel != attachrel)
+   {
+   /*
+* Adjust the constraint that we constructed 
above for
+* attachRel so that it matches this 
partition's attribute
+* numbers.
+*/
+   my_partconstr = 
map_partition_varattnos(my_partconstr, 1,
+   
part_rel, attachrel,
+   
_whole_row);
+   /* There can never be a whole-row reference 
here */
+   if (found_whole_row)
+   elog(ERROR, "unexpected whole-row 
reference found in partition key");
+   }
+
/* Grab a work queue entry. */
tab = ATGetQueueEntry(wqueue, part_rel);
-
-   /* Adjust constraint to match this partition */
-  

Re: [HACKERS] Unused variable scanned_tuples in LVRelStats

2017-08-03 Thread Masahiko Sawada
On Fri, Aug 4, 2017 at 3:01 AM, Robert Haas  wrote:
> On Thu, Aug 3, 2017 at 1:10 AM, Masahiko Sawada  wrote:
>> So we can remove scanned_tuples from LVRelStats struct and change the
>> variable name num_tuples to scanned_tuples. Attached updated patch.
>
> On second thought, I think we should just leave this alone.
> scanned_tuples is closely tied to tupcount_pages, so it's a little
> confusing to pull one out and not the other.  And we can't pull
> tupcount_pages out of LVRelStats because lazy_cleanup_index needs it.
> The current situation isn't doing any harm, so I'm not seeing much
> point in changing it.
>

Hmm, since I agree the current situation isn't doing any harm actually
I don't push hard for this but I'd like to still propose at least the
original patch that fixes an inconsistent in the code.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reload-through-the-top-parent switch the partition table

2017-08-03 Thread Rushabh Lathia
On Thu, Aug 3, 2017 at 3:39 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> On Wed, Aug 2, 2017 at 11:47 PM, David G. Johnston
>  wrote:
> > On Wed, Aug 2, 2017 at 10:58 AM, Tom Lane  wrote:
> >>
> >> Robert Haas  writes:
> >> > On Wed, Aug 2, 2017 at 1:08 PM, Tom Lane  wrote:
> >> >> --restore-via-partition-root ?
> >>
> >> > I worry someone will think that pg_dump is now restoring stuff, but it
> >> > isn't.
> >>
> >> Well, the point is that the commands it emits will cause the eventual
> >> restore to go through the root.  Anyway, I think trying to avoid using
> >> a verb altogether is going to result in a very stilted option name.
> >>
> >> I notice that the option list already includes some references to
> >> "insert", so maybe "--insert-via-partition-root"?  Although you could
> >> argue that that's confusing when we're using COPY.
> >
> >
> > --use-partitioned-table [partition-name, ...]  # if names are omitted it
> > defaults to all partitioned tables
>
> I like this idea since it allows using this feature for selected
> tables e.g. hash tables. Otherwise, users will be forced to use this
> option even when there is only  a single hash partitioned table and
> many other list/range partitioned tables.
>
>
+1.


> What we are trying to do here is dump the data in a partitioned table
> as if it's not partitioned. Combine that with --data-only dumps, and
> one could use it to load partitioned data into unpartitioned or
> differently partitioned table. So, how about naming the option as
>
> --unpartition-partitioned-table [partitioned-table-name, ] # if
> names are omitted it defaults to all the partitioned tables.
>
>
--unpartition-partitioned-table is very confusing.

I liked the previous option which is --use-partitioned-table
[partition-name, ...].
The Only problem with --use-partitioned-table is, a user needs to specify
the
partition-name in the options list. Imagine if someone having 100's of
partitions then specifying those name is pg_dump option is a pain.

Rather than that:

--use-partitioned-table [partitioned_name, ...] # if
names are omitted it defaults to all the partitioned tables.

Here user need to specify the root relation name in the option - and any
partition table have that as a ROOT, will load the data through
top-parent-relation.


That really says what dump is really doing without focusing on how the
> data will be used like restoring/inserting/copying etc.
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>



-- 
Rushabh Lathia


Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-03 Thread AP
On Wed, Aug 02, 2017 at 11:34:13AM -0400, Robert Haas wrote:
> On Wed, Jul 12, 2017 at 1:10 AM, Amit Kapila  wrote:
> > It seems so.  Basically, in the case of a large number of duplicates,
> > we hit the maximum number of overflow pages.  There is a theoretical
> > possibility of hitting it but it could also happen that we are not
> > free the existing unused overflow pages due to which it keeps on
> > growing and hit the limit.  I have requested up thread to verify if
> > that is happening in this case and I am still waiting for same.  The
> > squeeze operation does free such unused overflow pages after cleaning
> > them.  As this is a costly operation and needs a cleanup lock, so we
> > currently perform it only during Vacuum and next split from the bucket
> > which can have redundant overflow pages.
> 
> Oops.  It was rather short-sighted of us not to increase
> HASH_MAX_BITMAPS when we bumped HASH_VERSION.  Actually removing that
> limit is hard, but we could have easily bumped it for 128 to say 1024
> without (I think) causing any problem, which would have given us quite
> a bit of headroom here.  I suppose we could still try to jam that
> change in before beta3 (bumping HASH_VERSION again) but that might be
> asking for trouble.

I, for one, would be grateful for such a bump (or better). Currently
having more fun than I wish trying to figure out where my inserts will
begin to fail so that I don't make a mess of things. Right now I can't
match data going in with sane partitioning points in-storage. If I can
go "3 months worth of data then partition" or the like and have enough
room for differences in the data then I can be pretty happy. ATM I'm
not getting anywhere near that and am tempted to chuck it all in, eat
the 3-4x disk space cost and go back to btree which'd cost me terrabytes.

AP


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve the performance of the standby server when dropping tables on the primary server

2017-08-03 Thread Tokuda, Takashi
Hi,

> * The previous coding allowed for a fast path to access the last
> unowned relation, which this patch removes. It seems a good idea to
> cache the last unowned relation, or if not explain why the comment
> that says why it worked that way is no longer true.
> 
> * We should only create the hash table when needed, i.e. on or after
> when we add an unowned relation, since that is not a typical case.
> 
> * The hash table is sized at 400 elements and will grow from there.
> The comments in dynahash say "An overly large nelem will penalize
> hash_seq_search speed without buying much." so this makes your patch
> suitable for the bulk case but likely to perform worse for fewer
> elements. So I'm guessing that you picked 400 because that's what the
> parameter is set to for the smgr relation table rather than because
> this has had good consideration.

I thought abount improving the above problems.
But I have no good ideas to improve every case.
Do you have any good ideas?

I suggest to apply this modification only for the startup process.
This is because the startup process has many unowned SMgrRelation objects.
In XLOG replay, statup process create fake relcaches.
Fake relcaches create unowned SMgrRelation objects.
So startup process has more unowned SMgrRelation objects than any other process.

Of cource, it is necessary to think about the problems such as hash size.
Do you think about it.

Regards, Takashi Tokuda


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for Secure Transport SSL library on macOS as OpenSSL alternative

2017-08-03 Thread Daniel Gustafsson
> On 03 Aug 2017, at 13:06, Heikki Linnakangas  wrote:
> 
> On 08/03/2017 01:02 PM, Daniel Gustafsson wrote:
>> 
>> The frontend has support for using PEM files for certificates and keys.  It 
>> can
>> also look up the key for the certificate in a Keychain, or both certificate 
>> and
>> key in a Keychain.  The root certificate is still just read from a PEM file.
> 
> Why can't you have the root certificate in the keychain, too? Just not 
> implemented yet, or is there some fundamental problem with it?

Just not implemented yet, awaiting Keychain discussions.

>> The existence of an sslcert config trumps a keychain, but when a keychain is
>> used I’m currently using the sslcert parameter (awaiting a discussion on how 
>> to
>> properly do this) for the certificate label required to search the keychain.
>> There is a new frontend parameter called “keychain” with which a path to a
>> custom Keychain file can be passed.  If set, this Keychain will be searched 
>> as
>> well as the default.  If not, only the default user Keychain is used.  There 
>> is
>> nothing that modifies the Keychain in this patch, it can read identities
>> (certificates and its key) but not alter them in any way.
> 
> OpenSSL also has a mechanism somewhat similar to the keychain, called 
> "engines". You can e.g. keep the private key corresponding a certificate on a 
> smart card, and speak to it with an OpenSSL "smart card reader" engine. If 
> you do that, the 'sslkey' syntax is ":". Perhaps we 
> should adopt that syntax here as well? For example, to read the client 
> certificate from the key chain, you would use libpq options like 
> "keychain=/home/heikki/my_keychain sslcert=keychain:my_client_cert”.

Thats definitely an option, although it carries a bit redudancy in this case
which can confuse users.  With “keychain=/foo/bar.keychain sslcert=my_cert”,
did the user mean a file called my_cert or is it a reference to a cert in the
keychain?  Nothing that strict parsing rules, good errormessages and
documentation can’t solve but needs careful thinking.

>> “keychain” is obviously a very Secure Transport specific name, and I 
>> personally
>> think we should avoid that.  Any new configuration added here should take
>> future potential implementation into consideration such that avoid the need 
>> for
>> lots of backend specific knobs.  “sslcertstore” comes to mind as an
>> alternative, but we’d also need parameters to point into the certstore for
>> finding what we need.  Another option would be to do a URL based scheme
>> perhaps.
> 
> I wouldn't actually mind using implementation-specific terms like "keychain" 
> here. It makes it clear that it's implementation-specific. I think it would 
> be confusing, to use the same generic option name, like "sslcertstore", for 
> both a macOS keychain and e.g. the private key store in Windows. Or GNU 
> Keyring. In the worst case, you might even have multiple such "key stores" on 
> the same system, so you'd anyways need a way to specify which one you mean.

Thats a good point.

> Actually, perhaps it should be made even more explicit, and call it 
> "secure_transport_keychain". That's quite long, though.

Yeah, secure_transport_ is a pretty long prefix.

> Wrt. keychains, is there a system-global or per-user keychain in macOS? And 
> does this patch use them? If you load a CRL into a global keychain, does it 
> take effect?

Each user has a default Keychain , and on top of that you can create as many
Keychains as you want (a Keychain is really just a database file containing
secret things).  The frontend use the default one for lookups unless the
keychain parameter is set in which case it uses both.

When evaluating trust, Secure Transport will use the default Keychain even if
not explicitly opened (as in the backend code).  It does however only search
for intermediate certificates and not root certs/CRLs.  Adding a policy object
for using CRLs should force it to afaik, but we’d need to support additional
keychains (if only to be able to test without polluting the default).

>> Testing
>> ===
>> In order to test this we need to provide an alternative to the openssl calls 
>> in
>> src/test/ssl/Makefile for Secure Transport.
> 
> Those openssl commands are only needed to re-generate the test certificates. 
> The certificates are included in the git repository, so you only need to 
> re-generate them if you want to modify them or add new ones. I think it's OK 
> to require the openssl tool for that, it won't be needed just to run the test 
> suite.

Ah, of course.  We still need support for re-generating Keychains (or at all
generate them in case we don’t want binaries in the repository).

>> Documentation
>> =
>> I have started fiddling with this a little, but to avoid spending time on the
>> wrong thing I have done very little awaiting the outcome of discussions here.
>> I have tried to add lots of comments to the code however, to explain the 
>> quirks
>> of 

[HACKERS] Unused field of ProjectionPath

2017-08-03 Thread Antonin Houska
I've noticed that the dummypp field of ProjectionPath is set but never read.

If the only possible change between the path and plan creation time is that
the projection path and the subpath targetlists become different, then dummypp
could be used this way:

diff --git a/src/backend/optimizer/plan/createplan.c 
b/src/backend/optimizer/plan/createplan.c
new file mode 100644
index 5c934f2..1910d3f
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
*** create_projection_plan(PlannerInfo *root
*** 1572,1578 
 * not using.)
 */
if (is_projection_capable_path(best_path->subpath) ||
!   tlist_same_exprs(tlist, subplan->targetlist))
{
/* Don't need a separate Result, just assign tlist to subplan */
plan = subplan;
--- 1572,1578 
 * not using.)
 */
if (is_projection_capable_path(best_path->subpath) ||
!   (best_path->dummypp && tlist_same_exprs(tlist, 
subplan->targetlist)))
{
/* Don't need a separate Result, just assign tlist to subplan */
plan = subplan;


On the other hand, if the targetlists can also be different at path creation
time and equal at plan creation time, the lists do always need comparison at
plan creation time and the dummypp field should probably be removed.


-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers