Re: [HACKERS] 64-bit queryId?

2017-10-03 Thread Andres Freund
On 2017-10-03 03:07:09 +0300, Alexander Korotkov wrote:
> On Tue, Oct 3, 2017 at 12:32 AM, Tom Lane  wrote:
> 
> > Peter Geoghegan  writes:
> > > You need to change the SQL interface as well, although I'm not sure
> > > exactly how. The problem is that you are now passing a uint64 queryId
> > > to Int64GetDatumFast() within pg_stat_statements_internal(). That
> > > worked when queryId was a uint32, because you can easily represent
> > > values <= UINT_MAX as an int64/int8. However, you cannot represent the
> > > second half of the range of uint64 within a int64/int8. I think that
> > > this will behave differently depending on USE_FLOAT8_BYVAL, if nothing
> > > else.
> >
> > Maybe intentionally drop the high-order bit, so that it's a 63-bit ID?
> >
> 
> +1,
> I see 3 options there:
> 1) Drop high-order bit, as you proposed.
> 2) Allow negative queryIds.
> 3) Implement unsigned 64-type.

4) use numeric; efficiency when querying is not a significant concern here
5) use a custom type that doesn't support arithmetic, similar to pg_lsn

FWIW, I think we should consider going for something like 5) for
pg_class.relpages.
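The representability hazard behind these options can be sketched outside C (a Python model of the two's-complement reinterpretation; the helper names here are mine, not PostgreSQL's):

```python
# Toy model of pushing a uint64 queryId through a signed int8 (int64) column.
# to_int64() simulates the reinterpretation an Int64GetDatumFast() round trip
# would imply; MASK63 models option 1 (drop the high-order bit).

MASK63 = (1 << 63) - 1

def to_int64(u):
    """Reinterpret an unsigned 64-bit value as signed (two's complement)."""
    assert 0 <= u < 1 << 64
    return u - (1 << 64) if u >= 1 << 63 else u

# Values <= 2^63-1 survive unchanged,
assert to_int64(2**63 - 1) == 2**63 - 1
# but the second half of the uint64 range shows up negative (option 2),
assert to_int64(2**63 + 1) == -(2**63) + 1
# while masking to 63 bits keeps every queryId non-negative (option 1).
assert to_int64(0xDEADBEEF00000000 & MASK63) >= 0
```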

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cache lookup errors for missing replication origins

2017-10-03 Thread Masahiko Sawada
On Wed, Sep 6, 2017 at 3:51 PM, Michael Paquier
 wrote:
> On Tue, Sep 5, 2017 at 12:59 PM, Michael Paquier
>  wrote:
>> ERROR:  42704: replication slot "%s" does not exist
>
> s/slot/origin/
>
>> As far as I can see, replorigin_by_oid makes no use of its missing_ok
>> = false in the backend code, so letting it untouched would have no
>> impact. replorigin_by_name with missing_ok = false is only used with
>> SQL-callable functions, so it could be changed without any impact
>> elsewhere (without considering externally-maintained replication
>> modules).
>
> As long as I don't forget, attached is a patch added as well to the
> next CF. replorigin_by_oid should have the same switch from elog to
> ereport in my opinion. Additional regression tests are included.

The patch passes the regression test and I found no problems in this
patch. I've marked it as Ready for Committer.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

2017-10-03 Thread Kyotaro HORIGUCHI
At Mon, 2 Oct 2017 12:43:19 +0200, Pavel Stehule  
wrote in 

[HACKERS] Add TOAST to system tables with ACL?

2017-10-03 Thread Alexander Korotkov
Hi!

This topic was already discussed (at least one time) in 2011.  See [1] for
details.  I'd like to raise that again.

Currently, we have table in system catalog with ACL, but without TOAST.
Thus, it might happen that we can't fit new ACL item, because row becomes
too large for in-line storage.

You can easily reproduce this situation in psql.

create table t (col int);
\copy (select 'create user u' || i || ';' from generate_series(1,10000) i) to 'script.sql'
\i script.sql
\copy (select 'grant all on t to u' || i || ';' from generate_series(1,10000) i) to 'script.sql'
\i script.sql

Eventually GRANT statements start to raise error.
psql:script.sql:2447: ERROR:  row is too big: size 8168, maximum size 8160

I understand that we shouldn't encourage users to behave like this.  We
should rather advise them to avoid attaching too many ACL items to a single
object by using roles, which would also be much easier to manage.

However, the current PostgreSQL behavior is unexpected and undocumented, so
I would call it a bug.  It would be nice to fix it even if long ACL lists
work slower than normal.

In the discussion of [1], Tom commented that he isn't excited about
out-of-line ACL entries unless it's proven that performance doesn't
completely tank in this case.

I've done some basic experiments in this field on my laptop.  The attached
draft patch adds TOAST to all system catalog tables with ACL.  I've run
pgbench with the custom script "SELECT * FROM t;" where t is an empty table
with a long ACL entry.  I've compared results with 1000 ACL items (in-line
storage) and 10000 ACL items (out-of-line storage).

Also, I've noticed performance degradation of the GRANT statements
themselves.  1000 GRANT statements are executed in 1.5 seconds, while 10000
GRANT statements are executed in 42 seconds.  On average, a single GRANT
statement becomes 2.8 times slower.  That's significant degradation, but it
doesn't seem fatal to me.

Results of pgbench are presented in the following table.

    Number of ACL items      | -M simple | -M prepared
-----------------------------+-----------+-------------
  1000 (in-line storage)     |     14498 |       17827
 10000 (out-of-line storage) |      6623 |        7242

So, it's a 2.1-2.4 times degradation in this case.  That's very significant
degradation, but I wouldn't say that performance "completely tanks".
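As a sanity check, the quoted figures are mutually consistent (plain arithmetic; the 10000-statement count is inferred from the 2.8x figure, and the larger tps numbers are taken to be the in-line configuration, as the stated degradation implies):

```python
# Per-statement GRANT cost: 1000 statements in 1.5 s vs 10000 in 42 s.
per_grant_inline = 1.5 / 1000        # seconds per GRANT, in-line ACL
per_grant_toast = 42.0 / 10000       # seconds per GRANT, out-of-line ACL
assert round(per_grant_toast / per_grant_inline, 1) == 2.8

# pgbench tps: in-line vs out-of-line ACL storage.
for inline_tps, toast_tps in [(14498, 6623), (17827, 7242)]:
    ratio = inline_tps / toast_tps
    assert 2.1 <= ratio <= 2.5       # roughly the 2.1-2.4x quoted
```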

Any thoughts?  Should we consider TOASTing ACL entries, or should we give
up on this?

Links:

 1. https://www.commandprompt.com/blog/grant_schema_usage_to_2500_users_no_can_do/

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


acl-toast-1.patch
Description: Binary data



Re: [HACKERS] Add TOAST to system tables with ACL?

2017-10-03 Thread Alexander Korotkov
On Tue, Oct 3, 2017 at 9:19 PM, Tom Lane  wrote:

> Alexander Korotkov  writes:
> > This topic was already discussed (at least one time) in 2011.  See [1]
> for
> > details.  I'd like to raise that again.
>
> I'm a bit worried about adding a toast table to pg_class, and more so
> about pg_database, because both of those have to be accessed in situations
> where it's not clear that we could successfully fetch from a toast table,
> because too little of the catalog access infrastructure is alive.
>
> pg_class is probably all right as long as only the ACL field could ever
> get toasted, since it's unlikely that any low-level accesses would be
> paying attention to that field anyway.
>
> For pg_database, you'd have to make sure that the startup-time check of
> database CONNECT privilege still works if the ACL's been pushed out of
> line.
>

Thank you for pointing that out.  I'll check this.

> > Also, I've noticed performance degradation of the GRANT statements
> > themselves.  1000 GRANT statements are executed in 1.5 seconds, while
> > 10000 GRANT statements are executed in 42 seconds.  On average, a single
> > GRANT statement becomes 2.8 times slower.  That's significant
> > degradation, but it doesn't seem fatal to me.
>
> Seems all right, since we could just say "we don't really recommend that
> usage pattern".
>

Yes, sure.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] UPDATE of partition key

2017-10-03 Thread Robert Haas
On Tue, Oct 3, 2017 at 8:16 AM, Amit Khandekar  wrote:
> While writing this down, I observed that after multi-level partition
> tree expansion was introduced, the child table expressions are not
> converted directly from the root. Instead, they are converted from
> their immediate parent. So there is a chain of conversions : to leaf
> from its parent, to that parent from its parent, and so on from the
> root. Effectively, during the first conversion, there are that many
> ConvertRowtypeExpr nodes one above the other already present in the
> UPDATE result rel expressions. But my patch handles the optimization
> only for the leaf partition conversions.
>
> If already has CRE : CRE(rr) -> wr(r)
> Parent-to-child conversion ::: CRE(p) -> wr(r)  ===>   CRE(rr) ->
> CRE(r) -> wr(c1)
> With patch : CRE(rr) -> CRE(r) -> wr(c1) ===> CRE(rr) -> CRE(r) -> wr(c2)

Maybe adjust_appendrel_attrs() should have a similar provision for
avoiding extra ConvertRowTypeExpr nodes?
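The flattening being discussed can be modeled abstractly (a toy sketch only; `CRE` stands in for ConvertRowtypeExpr, and nothing here is real planner code):

```python
# A conversion node is ("CRE", target, inner); a whole-row Var is ("Var", t).
# Instead of stacking a new ConvertRowtypeExpr on top of an existing one,
# retarget the existing node -- the optimization suggested above.

def convert(expr, target):
    if expr[0] == "CRE":
        return ("CRE", target, expr[2])   # collapse: reuse the inner expr
    return ("CRE", target, expr)          # plain Var: wrap once

wr = ("Var", "child1")
one = convert(wr, "parent")               # CRE(parent) -> wr(child1)
two = convert(one, "root")                # retargeted, not CRE -> CRE -> wr

assert two == ("CRE", "root", ("Var", "child1"))
```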

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] [PATCH] WIP Add ALWAYS DEFERRED option for constraints

2017-10-03 Thread Nico Williams
Attached are patches to add an ALWAYS DEFERRED option to CONSTRAINTs and
CONSTRAINT TRIGGERs, meaning: SET CONSTRAINTS .. IMMEDIATE will not make
immediate any constraint/trigger that is declared as ALWAYS DEFERRED.

I.e., the opposite of NOT DEFERRED.  Perhaps I should make this NOT
IMMEDIATE?  Making it NOT IMMEDIATE has the benefit of not having to
change the precedence of ALWAYS to avoid a shift/reduce conflict...  It
may also be more in keeping with NOT DEFERRED.

Motivation:

 - I have trigger procedures that must run at the end of the transaction
   (after the last statement prior to COMMIT sent by the client/user),
   which I make DEFERRABLE, INITIALLY DEFERRED CONSTRAINT TRIGGERs out
   of, but SET CONSTRAINTS can be used to foil my triggers.  I have
   written SQL code to detect that constraint triggers have fired too
   soon, but I'd rather not need it.

 - Symmetry.  If we can have NOT DEFERRABLE constraints, why not also
   NOT IMMEDIABLE?  :)  Naturally "immediable" is not a word, but you
   get the point.

 - To learn my way around PostgreSQL source code in preparation for
   other contributions.
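The proposed semantics can be summarized with a toy decision function (a sketch of the intended behavior only, with hypothetical field names; not the patch's C code):

```python
# When does a constraint trigger fire? SET CONSTRAINTS ... IMMEDIATE can
# override INITIALLY DEFERRED today; ALWAYS DEFERRED would forbid that.

def fires_at(deferrable, initdeferred, alwaysdeferred, set_immediate):
    if alwaysdeferred:
        return "commit"                   # SET CONSTRAINTS cannot override
    if deferrable and initdeferred and not set_immediate:
        return "commit"
    return "statement-end"

# Today, SET CONSTRAINTS ... IMMEDIATE foils an INITIALLY DEFERRED trigger:
assert fires_at(True, True, False, set_immediate=True) == "statement-end"
# With ALWAYS DEFERRED it would keep firing at COMMIT:
assert fires_at(True, True, True, set_immediate=True) == "commit"
# NOT DEFERRABLE constraints are checked immediately, as before:
assert fires_at(False, False, False, set_immediate=False) == "statement-end"
```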

Anyways, this patch is NOT passing tests at the moment, and I'm not sure
why.  I'm sure I can figure it out, but first I need to understand the
failures.  E.g., I see this sort of difference:

   \d testschema.test_index1
   Index "testschema.test_index1"
Column |  Type  | Definition
   ++
id | bigint | id
  -btree, for table "testschema.test_default_tab"
  +f, for table "testschema.btree", predicate (test_default_tab)

which means, I think, that I've screwed up in src/bin/psql/describe.c,
though it's not obvious to me yet how.

Some questions for experienced PostgreSQL developers:

Q0: Is this sort of patch welcomed?

Q1: Should new columns for pg_catalog.pg_constraint go at the end, or may
they be added in the middle?

Q2: Can I add new columns to information_schema tables, or are there
standards-compliance issues with that?

Q3: Is the C-style for PG documented somewhere?  (sorry if I missed this)

Q4: Any ideas what I'm doing wrong in this patch series?

Nico
-- 
From 02f2765bde7e7d4fd357853c33dac55e4bdc2732 Mon Sep 17 00:00:00 2001
From: Nicolas Williams 
Date: Tue, 3 Oct 2017 00:33:09 -0500
Subject: [PATCH 1/4] WIP: Add ALWAYS DEFERRED option for CONSTRAINTs

and CONSTRAINT TRIGGERs.

This is important so that one can have triggers and constraints that
must run after all of the user/client's statements in a transaction
(i.e., at COMMIT time), so that the user/client may make no further
changes (triggers, of course, still can).
---
 WIP| 25 +
 doc/src/sgml/catalogs.sgml |  7 +++
 doc/src/sgml/ref/alter_table.sgml  |  4 +-
 doc/src/sgml/ref/create_table.sgml | 10 ++--
 doc/src/sgml/ref/create_trigger.sgml   |  2 +-
 src/backend/catalog/heap.c |  1 +
 src/backend/catalog/index.c|  8 +++
 src/backend/catalog/information_schema.sql |  8 +++
 src/backend/catalog/pg_constraint.c|  2 +
 src/backend/catalog/toasting.c |  2 +-
 src/backend/commands/indexcmds.c   |  2 +-
 src/backend/commands/tablecmds.c   | 20 +++-
 src/backend/commands/trigger.c | 25 +++--
 src/backend/commands/typecmds.c|  3 ++
 src/backend/parser/gram.y  | 81 --
 src/backend/parser/parse_utilcmd.c |  1 +
 src/backend/utils/adt/ruleutils.c  |  2 +
 src/bin/pg_dump/pg_dump.c  | 23 +++--
 src/bin/pg_dump/pg_dump.h  |  2 +
 src/bin/psql/describe.c| 15 --
 src/include/catalog/index.h|  2 +
 src/include/catalog/pg_constraint.h|  4 +-
 src/include/catalog/pg_constraint_fn.h |  1 +
 src/include/catalog/pg_trigger.h   |  4 +-
 src/include/commands/trigger.h |  1 +
 src/include/nodes/parsenodes.h |  6 ++-
 src/include/utils/reltrigger.h |  1 +
 27 files changed, 221 insertions(+), 41 deletions(-)
 create mode 100644 WIP

diff --git a/WIP b/WIP
new file mode 100644
index 000..806df83
--- /dev/null
+++ b/WIP
@@ -0,0 +1,25 @@
+WIP notes for adding ALWAYS DEFERRED option for CONSTRAINTs and CONSTRAINT TRIGGERs
+===
+
+ - add ALWAYS DEFERRED syntax in src/backend/parser/gram.y (DONE)
+
+   (the existing NOT DEFERRABLE == ALWAYS IMMEDIATE, so we don't add that)
+
+ - add alwaysdeferred field to several structs -- wherever initdeferred
+   is defined, basically (DONE)
+
+ - add conalwaysdeferred column to pg_constraints
+
+ - in src/backend/commands/trigger.c modify places where all_isdeferred
+   and all_isset are used (DONE)
+
+ - add AFTER_TRIGGER_ALWAYSDEFERRED for AfterTriggerSharedData's
+   ats_event (struct) and fill it 

Re: [HACKERS] [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM

2017-10-03 Thread Thomas Munro
On Wed, Oct 4, 2017 at 7:32 AM, Petr Jelinek
 wrote:
> On 03/10/17 19:44, Tom Lane wrote:
>> I wrote:
>>>> So that's trouble waiting to happen, for sure.  At the very least we
>>>> need to do a single fetch of WalRcv->latch, not two.  I wonder whether
>>>> even that is sufficient, though: this coding requires an atomic fetch of
>>>> a pointer, which is something we generally don't assume to be safe.
>>
>> BTW, I had supposed that this bug was of long standing, but actually it's
>> new in v10, dating to 597a87ccc9a6fa8af7f3cf280b1e24e41807d555.  Before
>> that walreceiver start/stop just changed the owner of a long-lived shared
>> latch, and there was no question of stale pointers.
>>
>> I considered reverting that decision, but the reason for it seems to have
>> been to let libpqwalreceiver.c manipulate MyProc->procLatch rather than
>> having to know about a custom latch.  That's probably a sufficient reason
>> to justify some squishiness in the wakeup logic.  Still, we might want to
>> revisit it if we find any other problems here.
> That's correct, and because the other users of that library don't have
> special latch it seemed feasible to standardize on the procLatch. If we
> indeed wanted to change the decision about this I think we can simply
> give latch as a parameter to libpqrcv_connect and store it in the
> WalReceiverConn struct. The connection does not need to live past the
> latch (although it currently does, but that's just a matter of
> reordering the code in WalRcvDie() a little AFAICS).

I wonder if the new ConditionVariable mechanism would be worth
considering in future cases like this, where the signalling process
doesn't know the identity of the process to be woken up (or even how
many waiting processes there are), but instead any interested waiters
block on a particular ConditionVariable that models a specific
interesting condition.  In the end it's just latches anyway, but it
may be a better abstraction.  On the other hand I'm not sure how waits
on a ConditionVariable would be multiplexed with IO (a distinct wait
event, or leaky abstraction where the caller relies on the fact that
it's built on MyProc->latch, something else?).
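The broadcast pattern Thomas describes (signaller unaware of waiter identity or count) is the classic condition-variable shape; a minimal sketch using Python's threading primitives, not PostgreSQL's ConditionVariable API:

```python
import threading

# Several waiters block on one condition; the signaller just broadcasts,
# without knowing who (or how many) are waiting.

cond = threading.Condition()
state = {"walreceiver_ready": False}
woken = []

def waiter(name):
    with cond:
        while not state["walreceiver_ready"]:   # recheck: wakeups can be spurious
            cond.wait()
        woken.append(name)

threads = [threading.Thread(target=waiter, args=(f"w{i}",)) for i in range(3)]
for t in threads:
    t.start()

with cond:
    state["walreceiver_ready"] = True
    cond.notify_all()        # broadcast; no process identities involved

for t in threads:
    t.join()
assert sorted(woken) == ["w0", "w1", "w2"]
```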

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] [PATCH] WIP Add ALWAYS DEFERRED option for constraints

2017-10-03 Thread Andreas Joseph Krogh
På tirsdag 03. oktober 2017 kl. 21:51:30, skrev Nico Williams <
n...@cryptonector.com >:
Attached are patches to add an ALWAYS DEFERRED option to CONSTRAINTs and
 CONSTRAINT TRIGGERs, meaning: SET CONSTRAINTS .. IMMEDIATE will not make
 immediate any constraint/trigger that is declared as ALWAYS DEFERRED.

 I.e., the opposite of NOT DEFERRED.  Perhaps I should make this NOT
 IMMEDIATE?  Making it NOT IMMEDIATE has the benefit of not having to
 change the precedence of ALWAYS to avoid a shift/reduce conflict...  It
 may also be more in keeping with NOT DEFERRED.

 Motivation:

  - I have trigger procedures that must run at the end of the transaction
    (after the last statement prior to COMMIT sent by the client/user),
    which I make DEFERRABLE, INITIALLY DEFERRED CONSTRAINT TRIGGERs out
    of, but SET CONSTRAINTS can be used to foil my triggers.  I have
    written SQL code to detect that constraint triggers have fired too
    soon, but I'd rather not need it.

  - Symmetry.  If we can have NOT DEFERRABLE constraints, why not also
    NOT IMMEDIABLE?  :)  Naturally "immediable" is not a word, but you
    get the point.

  - To learn my way around PostgreSQL source code in preparation for
    other contributions.

 Anyways, this patch is NOT passing tests at the moment, and I'm not sure
 why.  I'm sure I can figure it out, but first I need to understand the
 failures.  E.g., I see this sort of difference:

    \d testschema.test_index1
    Index "testschema.test_index1"
     Column |  Type  | Definition
    ++
     id     | bigint | id
   -btree, for table "testschema.test_default_tab"
   +f, for table "testschema.btree", predicate (test_default_tab)

 which means, I think, that I've screwed up in src/bin/psql/describe.c,
 though it's not obvious to me yet how.

 Some questions for experienced PostgreSQL developers:

 Q0: Is this sort of patch welcomed?

 Q1: Should new columns for pg_catalog.pg_constraint go at the end, or may
     they be added in the middle?

 Q2: Can I add new columns to information_schema tables, or are there
     standards-compliance issues with that?

 Q3: Is the C-style for PG documented somewhere?  (sorry if I missed this)

 Q4: Any ideas what I'm doing wrong in this patch series?

 Nico
 
 
+1.
 
While we're in deferrable-constraints land...  I even more often need 
deferrable conditional unique indexes.
In PG you may now have:
ALTER TABLE email_folder ADD CONSTRAINT some_uk UNIQUE (owner_id,
folder_type, name) DEFERRABLE INITIALLY DEFERRED; 
 
But this isn't supported:
CREATE UNIQUE INDEX some_uk ON email_folder(owner_id, folder_type, name)
WHERE parent_id IS NULL DEFERRABLE INITIALLY DEFERRED;  

Are there any plans to support this?
 
Thanks.



 
--
 Andreas Joseph Krogh
 




Re: [HACKERS] [PATCH] Pageinspect - add functions on GIN and GiST indexes from gevel

2017-10-03 Thread Alexander Korotkov
On Wed, Sep 13, 2017 at 10:57 AM, Ashutosh Sharma 
wrote:

> On Wed, Sep 13, 2017 at 1:15 PM, Alexey Chernyshov
>  wrote:
> > On Sat, 9 Sep 2017 13:53:35 +0530
> > Ashutosh Sharma  wrote:
> >
> >>
> >> Finally, i got some time to look into this patch and surprisingly I
> >> didn't find any function returning information at page level instead
> >> all the SQL functions are returning information at index level.
> >> Therefore, i too feel that it should be either integrated with
> >> pgstattuple which could a better match as Tomas also mentioned or may
> >> be add a new contrib module itself. I think, adding a new contrib
> >> module looks like a better option. The reason being, it doesn't simply
> >> include the function for showing index level statistics (for e.g..
> >> index size, no of rows, values..e.t.c) like pgstattuple does but,
> >> also, displays information contained in a page for e.g. the object
> >> stored in the page,  it's tid e.t.c. So, more or less, I would say
> >> that, it's the mixture of pageinspect and pgstattuple module and
> >> therefore, i feel, adding it as a new extension would be a better
> >> choice. Thought?
> >
> > Thank you for your interest, I will add a new contrib module named,
> > say, indexstat. I think we can add statistics on other indexes in the
> > future.
> >
>
> I think we should wait for experts opinion and then take a call. I am
> not expert. I just gave my opinion as i have worked in this area
> earlier when working for hash index. Thanks.


I'm not sure that we should port these functions from gevel directly.  We
could try to re-implement similar functionality in a way that fits the
pageinspect approach better.  In particular, we can implement some
low-level functions which show detailed information at the page level, and
on top of them implement analogues of the gevel functions in SQL or
PL/pgSQL.  Would that be feasible?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] SendRowDescriptionMessage() is slow for queries with a lot of columns

2017-10-03 Thread Robert Haas
On Tue, Oct 3, 2017 at 12:23 PM, Andres Freund  wrote:
> Makes sense?

Yes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM

2017-10-03 Thread Petr Jelinek
On 03/10/17 19:44, Tom Lane wrote:
> I wrote:
>>> So that's trouble waiting to happen, for sure.  At the very least we
>>> need to do a single fetch of WalRcv->latch, not two.  I wonder whether
>>> even that is sufficient, though: this coding requires an atomic fetch of
>>> a pointer, which is something we generally don't assume to be safe.
> 
> BTW, I had supposed that this bug was of long standing, but actually it's
> new in v10, dating to 597a87ccc9a6fa8af7f3cf280b1e24e41807d555.  Before
> that walreceiver start/stop just changed the owner of a long-lived shared
> latch, and there was no question of stale pointers.
> 
> I considered reverting that decision, but the reason for it seems to have
> been to let libpqwalreceiver.c manipulate MyProc->procLatch rather than
> having to know about a custom latch.  That's probably a sufficient reason
> to justify some squishiness in the wakeup logic.  Still, we might want to
> revisit it if we find any other problems here.
That's correct, and because the other users of that library don't have
special latch it seemed feasible to standardize on the procLatch. If we
indeed wanted to change the decision about this I think we can simply
give latch as a parameter to libpqrcv_connect and store it in the
WalReceiverConn struct. The connection does not need to live past the
latch (although it currently does, but that's just a matter of
reordering the code in WalRcvDie() a little AFAICS).

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

2017-10-03 Thread Robert Haas
On Tue, Oct 3, 2017 at 8:57 AM, Ashutosh Bapat
 wrote:
>> Regarding nomenclature and my previous griping about wisdom, I was
>> wondering about just calling this a "partition join" like you have in
>> the regression test.  So the GUC would be enable_partition_join, you'd
>> have generate_partition_join_paths(), etc.  Basically just delete
>> "wise" throughout.
>
> Partition-wise join is standard term used in literature and in
> documentation of other popular DBMSes, so partition_wise makes more
> sense. But I am fine with partition_join as well. Do you want it
> partition_join or partitionjoin like enable_mergejoin/enable_hashjoin
> etc.?

Well, you're making me have second thoughts.  It's really just that
partition_wise looks a little awkward to me, and maybe that's not
enough reason to change anything.  I suppose if I commit it this way
and somebody really hates it, it can always be changed later.  We're
not getting a lot of input from anyone else at the moment.

> Attached the updated patch-set.

I decided to skip over 0001 for today and spend some time looking at
0002-0006.  Comments below.

0002:

Looks fine.

0003:

The commit message mentions estimate_num_groups but the patch doesn't touch it.

I am concerned that this patch might introduce some problem fixed by
commit dd4134ea56cb8855aad3988febc45eca28851cd8.  The comment in that
patch say, at one place, that "This protects against possible
incorrect matches to child expressions that contain no Vars."
However, if a child expression has no Vars, then I think em->em_relids
will be empty, so the bms_is_equal() test that is there now will fail
but your proposed bms_is_subset() test will pass.
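The hazard is easy to see with plain sets as Bitmapset stand-ins (illustration only; the em_relids of a no-Var child expression is empty):

```python
# A child expression with no Vars has an empty em_relids.
child_em_relids = set()
target_relids = {3}     # the relids we are trying to match

# The current bms_is_equal()-style test rejects the match...
assert child_em_relids != target_relids
# ...but a bms_is_subset()-style test accepts it, since the empty set is a
# subset of every set -- potentially reintroducing the dd4134e problem.
assert child_em_relids <= target_relids
```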

0004:

I suggest renaming get_wholerow_ref_from_convert_row_type to
is_converted_whole_row_reference and making it return a bool.

The coding of that function is a little strange; why not move the Var to
an inner scope?  Like this: if (IsA(convexpr->arg, Var)) { Var *var =
castNode(Var, convexpr->arg); if (var->varattno == 0) return var; }

Will the statement that "In case of multi-level partitioning, we will
have as many nested ConvertRowtypeExpr as there are levels in
partition hierarchy" be falsified by Amit Khandekar's pending patch to
avoid sticking a ConvertRowTypeExpr on top of another
ConvertRowTypeExpr?  Even if the answer is "no", I think it might be
better to drop this part of the comment; it would be easy for it to
become false in the future, because we might want to optimize that
case in the future and we'll probably forget to update this comment
when we do.

In fix_upper_expr_mutator(), you have an if statement whose entire
contents are another if statement.  I think you should use && instead,
and maybe reverse the order of the tests, since
context->subplan_itlist->has_conv_whole_rows is probably cheaper to
test than a function call.  It's also a little strange that this code
isn't adjacent too, or merged with, the existing has_non_vars case.
Maybe:

converted_whole_row = is_converted_whole_row_reference(node);
if (context->outer_itlist && (context->outer_itlist->has_non_vars ||
    (context->outer_itlist->has_conv_whole_rows && converted_whole_row)))
...
if (context->inner_itlist && (context->inner_itlist->has_non_vars ||
    (context->inner_itlist->has_conv_whole_rows && converted_whole_row)))
...

0005:

The comment explaining why the ParamPathInfo is allocated in the same
context as the RelOptInfo is a modified copy of an existing comment
that still reads like the original, a manner of commenting I find a
bit undesirable as it leads to filling up the source base with
duplicate comments.

I don't think I believe that comment, either.  In the case from which
that comment was copied (mark_dummy_rel), it was talking about a
RelOptInfo, and geqo_eval() takes care to remove any leftover pointers
to joinrels created during a GEQO cycle.  But there's no similar
logic for ppilist, so I think what will happen here is that you'll end
up with a freed node in the middle of the list.

I think reparameterize_path_by_child() could use a helper function
reparameterize_pathlist_by_child() that iterates over a list of paths
and returns a list of paths.  That would remove some of the loops.

I think the comments for reparameterize_path_by_child() need to be
expanded.  They don't explain how you decided which nodes need to be
handled here or which fields within those nodes need some kind of
handling other than a flat-copy.  I think these kinds of explanations
will be important for future maintenance of this code.  You know why you
did it this way, and I can mostly guess, but what about the next person
what about the next person who comes along who hasn't made a detailed
study of partition-wise join?

I don't see much point in the T_SubqueryScanPath and T_ResultPath
cases in reparameterize_path_by_child().  It's just falling through to
the default case.

I wonder if reparameterize_path_by_child() ought to default to
returning NULL rather than throwing an 

Re: [HACKERS] datetime.h defines like PM conflict with external libraries

2017-10-03 Thread Andrew Dunstan




On 10/03/2017 03:00 PM, Andres Freund wrote:
> Hi,
>
> In my llvm jit work I had to
>
> #undef PM
> /* include some llvm headers */
> #define PM 1
>
> because llvm has a number of functions which have an argument named PM.
> Now that works, but it's fairly ugly. Perhaps it would be a good idea to
> name these defines in a manner that's slightly less likely to conflict?
>
> Alternatively we could use #pragma push_macro() around the includes, but
> that'd be a new dependency.
>
> Better ideas?
>


AFAICT at a quick glance these are only used in a couple of files. Maybe
the defs need to be floated off to a different header with more limited
inclusion?

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] WIP Add ALWAYS DEFERRED option for constraints

2017-10-03 Thread Nico Williams
On Tue, Oct 03, 2017 at 02:51:30PM -0500, Nico Williams wrote:
> Anyways, this patch is NOT passing tests at the moment, and I'm not sure
> why.  I'm sure I can figure it out, but first I need to understand the
> failures.  E.g., I see this sort of difference:
> 
>\d testschema.test_index1
>Index "testschema.test_index1"
> Column |  Type  | Definition
>++
> id | bigint | id
>   -btree, for table "testschema.test_default_tab"
>   +f, for table "testschema.btree", predicate (test_default_tab)
> 
> which means, I think, that I've screwed up in src/bin/psql/describe.c,
> though it's not obvious to me yet how.

Ah, I needed to adjust references to results columns.  I suspect that
something similar relates to other remaining failures.




Re: [HACKERS] [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM

2017-10-03 Thread Petr Jelinek
On 03/10/17 22:01, Thomas Munro wrote:
> On Wed, Oct 4, 2017 at 7:32 AM, Petr Jelinek
>  wrote:
>> On 03/10/17 19:44, Tom Lane wrote:
>>> I wrote:
>>>>> So that's trouble waiting to happen, for sure.  At the very least we
>>>>> need to do a single fetch of WalRcv->latch, not two.  I wonder whether
>>>>> even that is sufficient, though: this coding requires an atomic fetch of
>>>>> a pointer, which is something we generally don't assume to be safe.
>>>
>>> BTW, I had supposed that this bug was of long standing, but actually it's
>>> new in v10, dating to 597a87ccc9a6fa8af7f3cf280b1e24e41807d555.  Before
>>> that walreceiver start/stop just changed the owner of a long-lived shared
>>> latch, and there was no question of stale pointers.
>>>
>>> I considered reverting that decision, but the reason for it seems to have
>>> been to let libpqwalreceiver.c manipulate MyProc->procLatch rather than
>>> having to know about a custom latch.  That's probably a sufficient reason
>>> to justify some squishiness in the wakeup logic.  Still, we might want to
>>> revisit it if we find any other problems here.
>> That's correct, and because the other users of that library don't have
>> special latch it seemed feasible to standardize on the procLatch. If we
>> indeed wanted to change the decision about this I think we can simply
>> give latch as a parameter to libpqrcv_connect and store it in the
>> WalReceiverConn struct. The connection does not need to live past the
>> latch (although it currently does, but that's just a matter of
>> reordering the code in WalRcvDie() a little AFAICS).
> 
> I wonder if the new ConditionVariable mechanism would be worth
> considering in future cases like this, where the signalling process
> doesn't know the identity of the process to be woken up (or even how
> many waiting processes there are), but instead any interested waiters
> block on a particular ConditionVariable that models a specific
> interesting condition.  In the end it's just latches anyway, but it
> may be a better abstraction.  On the other hand I'm not sure how waits
> on a ConditionVariable would be multiplexed with IO (a distinct wait
> event, or leaky abstraction where the caller relies on the fact that
> it's built on MyProc->latch, something else?).
> 

In this specific case the problem is that what we are waiting for is
indeed the Latch because we want the libpqwalreceiver calls to be
interruptible when waiting for I/O.

Otherwise, yes, ConditionVariable is useful for generic signaling; we use
it in slots and origins for "lock" waits (they don't have normal catalog
entries, so normal locking does not work).

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] datetime.h defines like PM conflict with external libraries

2017-10-03 Thread Andres Freund
Hi,

On 2017-10-03 16:34:38 -0400, Andrew Dunstan wrote:
> On 10/03/2017 03:00 PM, Andres Freund wrote:
> > Hi,
> >
> > In my llvm jit work I'd to
> >
> > #undef PM
> > /* include some llvm headers */
> > #define PM 1
> >
> > because llvm has a number of functions which have an argument named PM.
> > Now that works, but it's fairly ugly. Perhaps it would be a good idea to
> > name these defines in a manner that's slightly less likely to conflict?
> >
> > Alternatively we could use #pragma push_macro() around the includes, but
> > that'd be a new dependency.
> >
> > Better ideas?

> AFAICT at a quick glance these are only used in a couple of files. Maybe
> the defs need to be floated off to a different header with more limited
> inclusion?

Why not just rename them to PG_PM etc? If we force potential external
users to do some changes, we can use more unique names just as well -
the effort to adapt won't be meaningfully higher... IMNSHO there's not
much excuse for defining macros like PM globally.

- Andres




Re: [HACKERS] datetime.h defines like PM conflict with external libraries

2017-10-03 Thread Tom Lane
Andres Freund  writes:
> On 2017-10-03 16:34:38 -0400, Andrew Dunstan wrote:
>> AFAICT at a quick glance these are only used in a couple of files. Maybe
>> the defs need to be floated off to a different header with more limited
>> inclusion?

> Why not just rename them to PG_PM etc? If we force potential external
> users to do some changes, we can use more unique names just as well -
> the effort to adapt won't be meaningfully higher... IMNSHO there's not
> much excuse for defining macros like PM globally.

I like the new-header-file idea because it will result in minimal code
churn and thus minimal back-patching hazards.

I do *not* like "PG_PM".  For our own purposes that adds no uniqueness
at all.  If we're to touch these symbols then I'd go for names like
"DATETIME_PM".  Or maybe "DT_PM" ... there's a little bit of precedent
for the DT_ prefix already.

regards, tom lane




Re: [HACKERS] [PATCH] WIP Add ALWAYS DEFERRED option for constraints

2017-10-03 Thread Nico Williams
On Tue, Oct 03, 2017 at 10:10:59PM +0200, Andreas Joseph Krogh wrote:
> +1.
> 
> While we're in deferrable constraints land...;  I even more often need 
> deferrable conditional unique-indexes.
> In PG you now may have:
> ALTER TABLE email_folder ADD CONSTRAINT some_uk UNIQUE (owner_id, 
> folder_type, 
> name) DEFERRABLE INITIALLY DEFERRED; 
> 
> But this isn't supported:
> CREATE UNIQUE INDEX some_uk ON email_folder(owner_id, folder_type, name) 
> WHERE 
> parent_id IS NULL DEFERRABLE INITIALLY DEFERRED;
> 
> Are there any plans to support this?

Not by me, but I can take a look and, if it is trivial, do it.  At a
quick glance it does look like it should be easy enough to do it, at
least to get started on a patch.  

If I can get some help with my current patch, I'll take a look :)

But yes, I'd like to have full consistency between CREATE and ALTER.
Everything that one can do with CREATE should be possible to do with
ALTER, including IF NOT EXISTS.

Nico
-- 




Re: [HACKERS][BUG] Cache invalidation for queries that contains const of temporary composite type

2017-10-03 Thread Maksim Milyutin

Hi HORIGUCHI-san!


Thanks for the valuable comments.


03.10.17 4:30, Kyotaro HORIGUCHI wrote:


The first thought that patch gave me is that the problem is not
limited to constants. Actually the following sequence also
reproduces similar failure even with this patch.

create table t2 (x int , y int);
create type pair as (x int, y int);
prepare test as select row(x, y)::pair from t2;
drop type pair;
execute test;
| ERROR:  cache lookup failed for type 16410

In this case the causal expression is in the following form.

  TargetEntry (
    expr = (
      RowExpr:
        typeid = 16410,
        row_format = COERCE_EXPLICIT_CAST,
        args = List (Var(t2.x), Var(t2.y))
    )
  )


Yeah, RowExpr has no dependency on the type relation with oid=16410. I 
think the routine 'fix_expr_common' needs to be reworked to take into 
account any possible dependencies of an expression on a composite type.


In the November commitfest I'll submit a patch that covers your case.

--
Regards,
Maksim Milyutin





Re: [HACKERS] Add TOAST to system tables with ACL?

2017-10-03 Thread Tom Lane
Alexander Korotkov  writes:
> This topic was already discussed (at least one time) in 2011.  See [1] for
> details.  I'd like to raise that again.

I'm a bit worried about adding a toast table to pg_class, and more so
about pg_database, because both of those have to be accessed in situations
where it's not clear that we could successfully fetch from a toast table,
because too little of the catalog access infrastructure is alive.

pg_class is probably all right as long as only the ACL field could ever
get toasted, since it's unlikely that any low-level accesses would be
paying attention to that field anyway.

For pg_database, you'd have to make sure that the startup-time check of
database CONNECT privilege still works if the ACL's been pushed out of
line.

> Also, I've noticed performance degradation of GRANT statements themselves.
>  1000 GRANT statements are executed in 1.5 seconds while 10000 GRANT
> statements are executed in 42 seconds.  On average a single GRANT statement
> becomes 2.8 times slower.  That's significant degradation, but it doesn't
> seem to be fatal degradation for me.

Seems all right, since we could just say "we don't really recommend that
usage pattern".

regards, tom lane




Re: [HACKERS] Add TOAST to system tables with ACL?

2017-10-03 Thread Andres Freund
On 2017-10-03 14:19:09 -0400, Tom Lane wrote:
> Alexander Korotkov  writes:
> > This topic was already discussed (at least one time) in 2011.  See [1] for
> > details.  I'd like to raise that again.
> 
> I'm a bit worried about adding a toast table to pg_class, and more so
> about pg_database, because both of those have to be accessed in situations
> where it's not clear that we could successfully fetch from a toast table,
> because too little of the catalog access infrastructure is alive.
> 
> pg_class is probably all right as long as only the ACL field could ever
> get toasted, since it's unlikely that any low-level accesses would be
> paying attention to that field anyway.

I think relpartbound, reloptions are pretty likely to be toasted too if
the pg_class tuple is wide enough due to acls. But that seems ok.

It'd be a lot easier to test if there were an easier way to force
columns to be toasted. Sometimes I wonder about making
TOAST_TUPLE_THRESHOLD configurable...


> For pg_database, you'd have to make sure that the startup-time check of
> database CONNECT privilege still works if the ACL's been pushed out of
> line.

Yep.

Greetings,

Andres Freund




[HACKERS] datetime.h defines like PM conflict with external libraries

2017-10-03 Thread Andres Freund
Hi,

In my llvm jit work I'd to

#undef PM
/* include some llvm headers */
#define PM 1

because llvm has a number of functions which have an argument named PM.
Now that works, but it's fairly ugly. Perhaps it would be a good idea to
name these defines in a manner that's slightly less likely to conflict?

Alternatively we could use #pragma push_macro() around the includes, but
that'd be a new dependency.

Better ideas?

Greetings,

Andres Freund




Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-03 Thread Peter Geoghegan
On Tue, Oct 3, 2017 at 6:09 PM, Wood, Dan  wrote:
> I’ve just started looking at this again after a few weeks break.

> if (TransactionIdIsValid(priorXmax) &&
> !TransactionIdEquals(priorXmax, 
> HeapTupleHeaderGetXmin(htup)))
> break;

> We need to understand why these TXID equal checks exist.  Can we 
> differentiate the cases they are protecting against with the two exceptions 
> I’ve found?

I haven't read your remarks here in full, since I'm about to stop
working for the day, but I will point out that
src/backend/access/heap/README.HOT says a fair amount about this,
under "Abort Cases".


-- 
Peter Geoghegan




Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-03 Thread Wood, Dan
One minor side note…   Is it weird for xmin/xmax to go backwards in a hot row 
chain?

 lp | t_ctid | lp_off | t_infomask | t_infomask2 | t_xmin | t_xmax 
----+--------+--------+------------+-------------+--------+--------
  1 | (0,1)  |   8152 |       2818 |           3 |  36957 |      0
  2 |        |      5 |            |             |        |       
  3 |        |      0 |            |             |        |       
  4 |        |      0 |            |             |        |       
  5 | (0,6)  |   8112 |       9986 |       49155 |  36962 |  36963
  6 | (0,7)  |   8072 |       9986 |       49155 |  36963 |  36961
  7 | (0,7)  |   8032 |      11010 |       32771 |  36961 |      0
(7 rows)


On 10/3/17, 6:20 PM, "Peter Geoghegan"  wrote:

On Tue, Oct 3, 2017 at 6:09 PM, Wood, Dan  wrote:
> I’ve just started looking at this again after a few weeks break.

> if (TransactionIdIsValid(priorXmax) &&
> !TransactionIdEquals(priorXmax, 
HeapTupleHeaderGetXmin(htup)))
> break;

> We need to understand why these TXID equal checks exist.  Can we 
differentiate the cases they are protecting against with the two exceptions 
I’ve found?

I haven't read your remarks here in full, since I'm about to stop
working for the day, but I will point out that
src/backend/access/heap/README.HOT says a fair amount about this,
under "Abort Cases".


-- 
Peter Geoghegan






Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

2017-10-03 Thread Tom Lane
Andrew Dunstan  writes:
> Do you have any suggestion as to how we should transmit the blacklist to
> parallel workers?

Perhaps serialize the contents into an array in DSM, then rebuild a hash
table from that in the worker.  Robert might have a better idea though.

regards, tom lane




Re: [HACKERS] Warnings in objectaddress.c

2017-10-03 Thread Robert Haas
On Sun, Oct 1, 2017 at 10:33 AM, Дмитрий Воронин
 wrote:
> I'm building PostgreSQL 10 rc1 sources on Debian wheezy (gcc 4.7). I have 
> those warnings:
>
> objectaddress.c: In function ‘get_object_address’:
> objectaddress.c:1646:10: warning: ‘typeoids[1]’ may be used uninitialized in 
> this function [-Wmaybe-uninitialized]
> objectaddress.c:1578:8: note: ‘typeoids[1]’ was declared here
> objectaddress.c:1623:7: warning: ‘typenames[1]’ may be used uninitialized in 
> this function [-Wmaybe-uninitialized]
> objectaddress.c:1577:14: note: ‘typenames[1]’ was declared here
> objectaddress.c:1646:10: warning: ‘typeoids[0]’ may be used uninitialized in 
> this function [-Wmaybe-uninitialized]
> objectaddress.c:1578:8: note: ‘typeoids[0]’ was declared here
> objectaddress.c:1623:7: warning: ‘typenames[0]’ may be used uninitialized in 
> this function [-Wmaybe-uninitialized]
> objectaddress.c:1577:14: note: ‘typenames[0]’ was declared here
>
> Those variables typeoids and typenames are arrays and are not initialized 
> during definition.
>
> Hope this helps. Thank you!

In theory I think this isn't a problem because List *object should
always be a two-element list whose second element is itself a list of
at least 2 elements, but in practice this doesn't seem like very good
code, because if by chance that list isn't of the proper format, we'll
either crash (if the toplevel list is too short) or do syscache
lookups using undefined values (if the sub-list is too short).  The
latter possibility is, I believe, what prompts these warnings. Perhaps
we should apply some glorified version of this:

diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index c2ad7c675e..9c27ab9434 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -1579,6 +1579,9 @@ get_object_address_opf_member(ObjectType objtype,
 	int			membernum;
 	int			i;
 
+	if (list_length(object) < 2)
+		elog(ERROR, "fail");
+
 	/*
 	 * The last element of the object list contains the strategy or procedure
 	 * number.  We need to strip that out before getting the opclass/family
@@ -1603,6 +1606,9 @@ get_object_address_opf_member(ObjectType objtype,
 		break;
 	}
 
+	if (i < 2)
+		elog(ERROR, "somebody screwed up");
+
 	switch (objtype)
 	{
 		case OBJECT_AMOP:

However, I'm not 100% sure that would be sufficient to suppress these
warnings, because the compiler has got to be smart enough to know that
elog() doesn't return and that i >= 2 is sufficient to guarantee that
everything is initialized.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 64-bit queryId?

2017-10-03 Thread Michael Paquier
On Tue, Oct 3, 2017 at 11:55 PM, Robert Haas  wrote:
> It seems silly to me to throw away a perfectly good bit from the hash
> value just because of some risk of minor user confusion.  I do like
> Michael's suggestion of outputing hexa-like text, but changing the
> types of the user-visible output columns seems like a job for another
> patch.  Similarly, if we were to adopt Andres's suggestions of a new
> type or using numeric or Alexander's suggestion of implementing a
> 64-bit unsigned type, I think it should be a separate patch from this
> one.

Yeah, any of them would require a version bump of pg_stat_statements.
My suggestion would be actually to just document the use of to_hex in
pg_stat_statements if there is any issue with query ID signed-ness.
Still, I have yet to see any user complaints on the
matter, so even just adding a note in the docs may be overdoing it.

Thinking about that freshly today (new day, new thoughts of course), I
think that I would go with the 64-bit version. The patch gets more
simple, and there are ways to easily get around negative numbers by
applying anything like to_hex() on top of pg_stat_statements.

+/* Write an unsigned integer field (anything written with UINT64_FORMAT) */
+#define WRITE_UINT64_FIELD(fldname) \
+   appendStringInfo(str, " :" CppAsString(fldname) " " UINT64_FORMAT, \
+node->fldname)
Correct call here.

 {
-   return hash_any((const unsigned char *) str, len);
+   return hash_any_extended((const unsigned char *) str, len, 0);
 }
[...]
-   uint32  start_hash = hash_any(jumble, JUMBLE_SIZE);
+   uint64  start_hash = hash_any_extended(jumble, JUMBLE_SIZE, 0);
Missing two DatumGetUInt64() perhaps? HEAD looks wrong to me as well.
-- 
Michael




[HACKERS] Re: protocol version negotiation (Re: Libpq PGRES_COPY_BOTH - version compatibility)

2017-10-03 Thread Badrul Chowdhury
Hello,

Kindly review this patch that implements the proposal for pgwire protocol 
negotiation described in this thread.

A big thanks to Satyanarayana Narlapuram for his help and guidance in 
implementing the patch.

Please note:
1. Pgwire protocol v3.0 with negotiation is called v3.1.
2. There are 2 patches for the change: a BE-specific patch that will be 
backported and a FE-specific patch that is only for pg10 and above.

/*************************/
/* Feature In A Nutshell */
/*************************/
This feature enables a newer FE to connect to an older BE using a protocol 
negotiation phase in the connection setup. The FE sends its protocol version 
and a list of startup parameters to the BE to start a connection. If the FE 
pgwire version is newer than the BE's or if the BE could not recognize some of 
the startup parameters,
the BE simply ignores the parameters that it did not recognize and sends a 
ServerProtocolVersion message containing the minimum and maximum versions it 
supports as well as a list of the parameters that it did honour. The FE then 
decides whether it wants to continue connecting- in the patch, the FE continues 
only if the parameters that the BE did not honour were optional, where optional 
parameters are denoted by a proper prefix of "_pq_" in their names.

/************/
/* BE Patch */
/************/

Files modified:

  src/
  └── backend/
      ├── postmaster/
      │   └── postmaster.c
      └── utils/misc/
          └── guc.c

  src/
  └── include/postmaster/
      └── postmaster.h

Design decisions:

  1. The BE sends a message type of ServerProtocolVersion if the FE is
     newer or the FE is not using v3.0
     1.a Message structure: [ 'Y' | msgLength | min_version |
         max_version | param_list_len | list of param names ]
     1.b 'Y' was selected to denote ServerProtocolVersion messages as it
         is the first available character from the bottom of the list of
         available characters that is not being used to denote any
         message type at present.

  2. Added support for BE to accept optional parameters
     2.a An optional parameter is a startup parameter with "_pq_" as a
         proper prefix in its name. The check to add a placeholder in
         /src/backend/utils/misc/guc.c was modified to allow optional
         parameters.
     2.b The string comparison in is_optional() is encoding-aware as the
         BE's encoding might be different from the FE's.

/************/
/* FE Patch */
/************/

Files modified:

  src/
  └── Makefile

  src/
  └── Makefile.global.in

  src/
  └── bin/
      ├── pg_dump/
      │   └── pg_dump.c
      └── scripts/
          ├── clusterdb.c
          ├── createuser.c
          ├── reindexdb.c
          └── vacuumdb.c

  src/
  └── common/
      └── Makefile

  src/
  └── fe_utils/
      ├── Makefile
      └── simple_list.c

  src/
  └── include/
      ├── fe_utils/
      │   └── simple_list.h
      └── libpq/
          └── pqcomm.h

  src/
  └── interfaces/libpq/
      ├── fe-connect.c
      ├── fe-protocol3.c
      ├── libpq-fe.h
      └── libpq-int.h

  src/
  └── tools/msvc/
      └── Mkvcbuild.pm

Design decisions:

  1. Added mechanism to send startup parameters to BE:
     SimpleStringList startup_parameters in
     /src/interfaces/libpq/fe-connect.c enables sending parameters to
     the BE. The startup_parameters list is parsed in
     /src/interfaces/libpq/fe-protocol3.c and the arguments are sent in
     the startup packet to the BE.

     This is mostly used for testing at this point - one would send
     additional startup parameters like so in PQconnectStartParams():

       simple_string_list_append(&startup_parameters, "_pq_A");
       simple_string_list_append(&startup_parameters, "1");
       /* PQconnectStartParams() is called twice for the same connection;
          the immutable flag ensures that the same parameter is not
          added twice */
       startup_parameters.immutable = true;

  2. Added a CONNECTION_NEGOTIATING state 

Re: [HACKERS] 64-bit queryId?

2017-10-03 Thread Michael Paquier
On Wed, Oct 4, 2017 at 10:37 AM, Robert Haas  wrote:
> On Tue, Oct 3, 2017 at 9:34 PM, Michael Paquier
>  wrote:
>> +/* Write an unsigned integer field (anything written with UINT64_FORMAT) */
>> +#define WRITE_UINT64_FIELD(fldname) \
>> +   appendStringInfo(str, " :" CppAsString(fldname) " " UINT64_FORMAT, \
>> +node->fldname)
>> Correct call here.
>
> I'm sorry, but I don't understand this comment.

I just mean that your patch is correct here. I don't always complain :)
-- 
Michael




Re: [HACKERS] 64-bit queryId?

2017-10-03 Thread Robert Haas
On Tue, Oct 3, 2017 at 9:34 PM, Michael Paquier
 wrote:
> +/* Write an unsigned integer field (anything written with UINT64_FORMAT) */
> +#define WRITE_UINT64_FIELD(fldname) \
> +   appendStringInfo(str, " :" CppAsString(fldname) " " UINT64_FORMAT, \
> +node->fldname)
> Correct call here.

I'm sorry, but I don't understand this comment.

>  {
> -   return hash_any((const unsigned char *) str, len);
> +   return hash_any_extended((const unsigned char *) str, len, 0);
>  }
> [...]
> -   uint32  start_hash = hash_any(jumble, JUMBLE_SIZE);
> +   uint64  start_hash = hash_any_extended(jumble, JUMBLE_SIZE, 
> 0);
> Missing two DatumGetUInt64() perhaps? HEAD looks wrong to me as well.

Ah, yes, you're right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

2017-10-03 Thread Andres Freund
On 2017-10-03 19:53:41 -0400, Andrew Dunstan wrote:
> On 09/27/2017 02:52 PM, Tom Lane wrote:
> > Andrew Dunstan  writes:
> >> At this stage on reflection I agree it should be pulled :-(
> > That seems to be the consensus, so I'll go make it happen.
> >
> >> I'm not happy about the idea of marking an input function as not
> >> parallel safe, certainly not without a good deal of thought and
> >> discussion that we don't have time for this cycle.
> > I think the way forward is to do what we had as of HEAD (984c92074),
> > but add the ability to transmit the blacklist table to parallel
> > workers.  Since we expect the blacklist table would be empty most of
> > the time, this should be close to no overhead in practice.  I concur
> > that the idea of marking the relevant functions parallel-restricted is
> > probably not as safe a fix as I originally thought, and it's not a
> > very desirable restriction even if it did fix the problem.

> Do you have any suggestion as to how we should transmit the blacklist to
> parallel workers?

How about storing them in a dshash table instead of dynahash?
Similar to how we're now dealing with the shared typmod registry stuff?
It should then be fairly simple to add a new struct Session member
shared_enum_whatevs_table.

Greetings,

Andres Freund




Re: [HACKERS] Warnings in objectaddress.c

2017-10-03 Thread Tom Lane
Robert Haas  writes:
> Perhaps we should apply some glorified version of this:

> +if (list_length(object) < 2)
> +elog(ERROR, "fail");

> However, I'm not 100% sure that would be sufficient to suppress these
> warnings, because the compiler has got to be smart enough to know that
> elog() doesn't return and that i >= 2 is sufficient to guarantee that
> everything is initialized.

I'm betting it wouldn't help.  I was considering something along the line
of unrolling the loop:

Assert(list_length(object) == 2);

assign typenames[0] and typeoids[0] from linitial(object)

assign typenames[1] and typeoids[1] from lsecond(object)

This would involve duplicating the loop body, but that's only 3 lines,
so I think it wouldn't even net out as more code.

regards, tom lane




Re: [HACKERS] Re: protocol version negotiation (Re: Libpq PGRES_COPY_BOTH - version compatibility)

2017-10-03 Thread Tom Lane
Badrul Chowdhury  writes:
> 1. Pgwire protocol v3.0 with negotiation is called v3.1.
> 2. There are 2 patches for the change: a BE-specific patch that will be 
> backported and a FE-specific patch that is only for pg10 and above.

TBH, anything that presupposes a backported change in the backend is
broken by definition.  We expect libpq to be able to connect to older
servers, and that has to include servers that didn't get this memo.

It would be all right for libpq to make a second connection attempt
if its first one fails, as we did in the 2.0 -> 3.0 change.

regards, tom lane




Re: [HACKERS] datetime.h defines like PM conflict with external libraries

2017-10-03 Thread Andrew Dunstan


On 10/03/2017 04:43 PM, Tom Lane wrote:
> Andres Freund  writes:
>> On 2017-10-03 16:34:38 -0400, Andrew Dunstan wrote:
>>> AFAICT at a quick glance these are only used in a couple of files. Maybe
>>> the defs need to be floated off to a different header with more limited
>>> inclusion?
>> Why not just rename them to PG_PM etc? If we force potential external
>> users to do some changes, we can use more unique names just as well -
>> the effort to adapt won't be meaningfully higher... IMNSHO there's not
>> much excuse for defining macros like PM globally.
> I like the new-header-file idea because it will result in minimal code
> churn and thus minimal back-patching hazards.
>
> I do *not* like "PG_PM".  For our own purposes that adds no uniqueness
> at all.  If we're to touch these symbols then I'd go for names like
> "DATETIME_PM".  Or maybe "DT_PM" ... there's a little bit of precedent
> for the DT_ prefix already.
>
>   


Yeah. If we use a prefix +1 for DT_. If we do that then I think they
should *all* be prefixed, not just the ones we know of conflicts for.

cheers

andrew

-- 
Andrew Dunstan    https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-03 Thread Peter Geoghegan
On Tue, Oct 3, 2017 at 9:48 AM, Alvaro Herrera  wrote:
> But that still doesn't fix the problem;
> as far as I can see, vacuum removes the root of the chain, not yet sure
> why, and then things are just as corrupted as before.

Are you sure it's not opportunistic pruning? Another thing that I've
noticed with this problem is that the relevant IndexTuple will pretty
quickly vanish, presumably due to LP_DEAD setting (but maybe not
actually due to LP_DEAD setting).

(Studies the problem some more...)

I now think that it actually is a VACUUM problem, specifically a
problem with VACUUM pruning. You see the HOT xmin-to-xmax check
pattern that you mentioned within heap_prune_chain(), which looks like
where the incorrect tuple prune (or possibly, at times, redirect?)
takes place. (I refer to the prune/kill that you mentioned today, that
frustrated your first attempt at a fix -- "I modified the multixact
freeze code...".)

The attached patch "fixes" the problem -- I cannot get amcheck to
complain about corruption with this applied. And, "make check-world"
passes. Hopefully it goes without saying that this isn't actually my
proposed fix. It tells us something that this at least *masks* the
problem, though; it's a start.

FYI, the repro case page contents look like this with the patch applied:

postgres=# select lp, lp_flags, t_xmin, t_xmax, t_ctid,
to_hex(t_infomask) as infomask,
to_hex(t_infomask2) as infomask2
from heap_page_items(get_raw_page('t', 0));
 lp | lp_flags | t_xmin  | t_xmax | t_ctid | infomask | infomask2
----+----------+---------+--------+--------+----------+-----------
  1 |        1 | 1845995 |      0 | (0,1)  | b02      | 3
  2 |        2 |         |        |        |          |
  3 |        0 |         |        |        |          |
  4 |        0 |         |        |        |          |
  5 |        0 |         |        |        |          |
  6 |        0 |         |        |        |          |
  7 |        1 | 1846001 |      0 | (0,7)  | 2b02     | 8003
(7 rows)

-- 
Peter Geoghegan
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 52231ac..90eb39e 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -470,13 +470,6 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum,
 		ItemPointerSet(&(tup.t_self), BufferGetBlockNumber(buffer), offnum);
 
 		/*
-		 * Check the tuple XMIN against prior XMAX, if any
-		 */
-		if (TransactionIdIsValid(priorXmax) &&
-			!TransactionIdEquals(HeapTupleHeaderGetXmin(htup), priorXmax))
-			break;
-
-		/*
 		 * OK, this tuple is indeed a member of the chain.
 		 */
 		chainitems[nchain++] = offnum;



Re: [HACKERS] parallelize queries containing initplans

2017-10-03 Thread Robert Haas
On Tue, Oct 3, 2017 at 7:33 AM, Amit Kapila  wrote:
>> - Why do we even need contains_parallel_unsafe_param() and
>> is_initplan_is_below_current_query_level() in the first place, anyway?
>>  I think there's been some discussion of that on this thread, but I'm
>> not sure I completely understand it, and the comments in the patch
>> don't help me understand why we now need this restriction.
>
> This is to ensure that initplans are never below Gather node.  If we
> allow parallelism when initplan is below current query level, (a) then
> we need a way to pass the result of initplan from worker to other
> workers and to master backend (b) if initplan contains reference to
> some parameter above the current query level (undirect correlated
> plans), then we need a mechanism to pass such parameters (basically
> allow correlated initplans work).

Man, this stuff makes my head hurt.  Leaving aside the details of what this
particular patch tries to handle, and with due regard for the
possibility that I don't know what I'm talking about, I feel like
there are four cases: (1) the InitPlan refers to one or more
parameters whose value is computed below the Gather node, (2) the
InitPlan refers to one or more parameters whose value is computed
above the Gather node, (3) the InitPlan refers to both parameters
computed above the Gather and parameters computed below the Gather,
and (4) the InitPlan refers to no parameters at all.

In cases (1) and (3), we cannot compute the value at the Gather node
because some of the necessary information doesn't exist yet.  It
wouldn't make any sense to try to compute it at that point because
the value on which the InitPlan output depends can be different
in each worker and can change at different times in different workers.
However, in case (1), it's definitely OK to compute the value in the
worker itself, and in case (3), it's OK provided that the necessary
parameter values from above the Gather are passed down to the workers.

In cases (2) and (4), we can compute the value at the Gather node.  If
we tried to compute it within individual workers, then we'd run into a
few problems: (a) the plan might not be parallel-safe, (b) it would be
inefficient to recompute the same value in every worker, and (c) if
the expression is volatile, the workers might not all arrive at the
same answer.  Now, if the plan is parallel-safe, then we could also
choose NOT to compute the value at the Gather node and instead let the
first participant that needs the value compute it; while it's being
computed, it could fence off other participants from embarking on the
same computation, and once the computation is done, all participants
get the same computed value.  But AFAICS all of that is strictly
optional: in any case where the InitPlan doesn't depend on a value
produced below the Gather, it is legal (implementation difficulties
aside) to compute it at the Gather node.
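For illustration only, the four cases and the at-Gather evaluability rule can be sketched as a tiny classification routine (hypothetical names; this is a model of the reasoning above, not PostgreSQL code):

```python
def classify_initplan(params_below, params_above, referenced):
    """Classify an InitPlan by where the Params it references are computed.

    params_below / params_above: param IDs computed below / above the
    Gather node; referenced: param IDs the InitPlan refers to.
    """
    below = bool(referenced & params_below)
    above = bool(referenced & params_above)
    if below and above:
        return 3          # worker-only, needs above-Gather params passed down
    if below:
        return 1          # must be computed in the worker itself
    if above:
        return 2          # computable at the Gather node
    return 4              # no params at all: also computable at the Gather

def evaluable_at_gather(case):
    # Only cases (2) and (4) can be computed once at the Gather node.
    return case in (2, 4)
```

Under this model, only InitPlans in cases (2) and (4) qualify for evaluation above the Gather, which matches the subset the patch targets.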

Having said all that, I think that this patch only wants to handle the
subset of cases (2) and (4) where the relevant InitPlan is attached
ABOVE the Gather node -- which seems very reasonable, because
evaluating an InitPlan at a level of the plan tree above the level at
which it is defined sounds like it might be complex.  But I still
don't quite see why we need these tests.  I mean, if we only allow
Param references from a set of safe parameter IDs, and the safe
parameter IDs include only those IDs that can be generated in a
worker, then won't other InitPlans in the workers anyway be ruled out?
 I assume there won't be an InitPlan whose output Param is never
referenced anywhere -- and if there were, it wouldn't really matter
whether a worker thought it could compute it, because there'd be
nothing to trigger that computation to actually happen.

If I am all mixed up, please help straighten me out.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] WIP Add ALWAYS DEFERRED option for constraints

2017-10-03 Thread Vik Fearing
On 10/03/2017 10:10 PM, Andreas Joseph Krogh wrote:
> While we're in deferrable constraints land...;
> I even more often need deferrable *conditional* unique indexes.
> In PG you now may have:
> 
> ALTER TABLE email_folder ADD CONSTRAINT some_uk UNIQUE (owner_id, 
> folder_type, name) DEFERRABLE INITIALLY DEFERRED;
> 
>  
> But this isn't supported:
> 
> CREATE UNIQUE INDEX some_uk ON email_folder(owner_id, folder_type, name) 
> WHERE parent_id IS NULL DEFERRABLE INITIALLY DEFERRED;
> 
> Are there any plans to support this?

I don't want to hijack the thread, but you can do that with exclusion
constraints.
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] generated columns

2017-10-03 Thread Greg Stark
There are some unanswered questions with column grants too. Do we
allow granting access to a calculated column which accesses columns
the user doesn't have access to?

If so, then this is a suitable substitute for using updatable views to
handle things like granting users access to password hashes, or to
personal data with details censored, without giving them access to the
unhashed password or the full personal info.

-- 
greg




Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-10-03 Thread Tom Lane
"Bossart, Nathan"  writes:
> Since get_rel_oids() was altered in 19de0ab2, here is a new version of
> the patch.

I thought it would be a good idea to get this done before walking away
from the commitfest and letting all this info get swapped out of my
head.  So I've reviewed and pushed this.

I took out most of the infrastructure you'd put in for constructing
RangeVars for tables not explicitly named on the command line.  It
was buggy (eg you can't assume a relcache entry will stick around)
and I don't believe it's necessary.  I don't think that warnings
should be issued for any tables not explicitly named.

In any case, though, the extent to which we should add more warning
or log output seems like a fit topic for a new thread and a separate
patch.  Let's call this one done.

regards, tom lane




Re: [HACKERS] [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM

2017-10-03 Thread Andreas Seltenreich
Michael Paquier writes:

> I am attaching a patch that addresses the bugs for the spin lock sections.
> [2. text/x-diff; walreceiver-spin-calls.patch]

I haven't seen a spinlock PANIC since testing with the patch applied.
They occurred five times with the same amount of testing done earlier.

regards,
Andreas




Re: [HACKERS] parallelize queries containing initplans

2017-10-03 Thread Amit Kapila
On Mon, Oct 2, 2017 at 8:13 PM, Robert Haas  wrote:
> On Thu, Sep 14, 2017 at 10:45 PM, Amit Kapila  wrote:
>> The latest patch again needs to be rebased.  Find rebased patch
>> attached with this email.
>
> I read through this patch this morning.   Copying Tom in the hopes
> that he might chime in on the following two issues in particular:
>

It seems you forgot to include Tom, included now.

> 1. Is there any superior alternative to adding ptype to ParamExecData?
>  I think the reason why we don't have this already is because the plan
> tree that populates the Param must output the right type and the plan
> tree that reads the Param must expect the right type, and there's no
> need for anything else.  But for serialization and deserialization
> this seems to be necessary.  I wonder whether it would be better to
> try to capture this at the time the Param is generated (e.g.
> var->vartype in assign_param_for_var) rather than derived at execution
> time by applying exprType(), but I'm not sure how that would work
> exactly, or whether it's really any better.
>
> 2. Do max_parallel_hazard_walker and set_param_references() have the
> right idea about which parameters are acceptable candidates for this
> optimization?  The idea seems to be to collect the setParam sets for
> all initplans between the current query level and the root.  That
> looks correct to me but I'm not an expert on this parameter stuff.
>
> Some other comments:
>
> - I think parallel_hazard_walker should likely work by setting
> safe_param_ids to the right set of parameter IDs rather than testing
> whether the parameter is safe by checking either whether it is in
> safe_param_ids or some other condition is met.
>

Okay, I think it should work the way you are suggesting.  I will give it a try.

> - contains_parallel_unsafe_param() sounds like a function that merely
> returns true or false, but it actually has major side effects.  Those
> side effects also look unsafe; mutating previously-generated paths can
> corrupt the rel's pathlist, because it will no longer have the sort
> order and other characteristics that add_path() creates and upon which
> other code relies.
>
> - Can't is_initplan_is_below_current_query_level() be confused when
> there are multiple subqueries in the tree?  For example if the
> toplevel query has subqueries a and b and a has a sub-subquery aa
> which has an initplan, won't this function think that b has an
> initplan below the current query level?  If not, maybe a comment is in
> order explaining why not?
>

We can discuss the above two points once you are convinced about
whether we need any such functions, so let's first discuss that.

> - Why do we even need contains_parallel_unsafe_param() and
> is_initplan_is_below_current_query_level() in the first place, anyway?
>  I think there's been some discussion of that on this thread, but I'm
> not sure I completely understand it, and the comments in the patch
> don't help me understand why we now need this restriction.
>

This is to ensure that initplans are never below a Gather node.  If we
allow parallelism when an initplan is below the current query level,
then (a) we need a way to pass the result of the initplan from one
worker to other workers and to the master backend, and (b) if the
initplan contains a reference to some parameter above the current query
level (undirect correlated plans), we need a mechanism to pass such
parameters (basically, allow correlated initplans to work).

Now, one might think (a) and/or (b) are desirable so that this can be
used in more cases, but based on TPC-H analysis we have found that it
is quite useful even without those, and in fact after we support those
cases the additional benefit might not be significant.

The patch contains a few comments in generate_gather_paths and at the
top of the functions contains_parallel_unsafe_param and
is_initplan_is_below_current_query_level; if you feel it should be
explained in further detail, let me know.


> - The new code in ExplainPrintPlan() needs a comment.
>

There was a comment, but I have added a somewhat detailed comment now,
check if that makes it clear.

> - I have typically referred in comments to "Gather or Gather Merge"
> rather than "gather or gather merge".
>

Changed as per suggestion.

Changed the function name is_initplan_is_below_current_query_level to
is_initplan_below_current_query_level as well.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


pq_pushdown_initplan_v10.patch
Description: Binary data



Re: [HACKERS] list of credits for release notes

2017-10-03 Thread Kohei KaiGai
2017-10-03 6:09 GMT+09:00 Tatsuo Ishii :
>> On Wed, Sep 27, 2017 at 8:29 PM, Michael Paquier
>>  wrote:
>>> Looking at this list, the first name is followed by the family name,
>>> so there are inconsistencies with some Japanese names:
>>> - Fujii Masao should be Masao Fujii.
>>> - KaiGai Kohei should be Kohei Kaigai.
>>
>> But his emails say Fujii Masao, not Masao Fujii.
>
> Michael is correct.
>
> Sometimes people choose family name first in the emails.  However I am
> sure "Fujii" is the family name and "Masao" is the first name.
>
>> KaiGai's case is a bit trickier, as his email name has changed over time.
>
> Michael is correct about Kaigai too.
>
I set up my personal e-mail account in the ${FAMILYNAME} ${FIRSTNAME}
manner, following the eastern convention.
However, my last company enforced a centralized e-mail account policy,
so ${FIRSTNAME} ${LASTNAME} was shown when I posted a message
from the jp.nec.com domain.
That is the reason why my email name has changed.

Thanks,
-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 




Re: [HACKERS] Logging idle checkpoints

2017-10-03 Thread Stephen Frost
Greetings,

* Kyotaro HORIGUCHI (horiguchi.kyot...@lab.ntt.co.jp) wrote:
> At Tue, 3 Oct 2017 10:23:08 +0900, Michael Paquier 
>  wrote in 
> 
> > On Tue, Oct 3, 2017 at 12:01 AM, Stephen Frost  wrote:
> > > I certainly don't care for the idea of adding log messages saying we
> > > aren't doing anything just to match a count that's incorrectly claiming
> > > that checkpoints are happening when they aren't.
> > >
> > > The down-thread suggestion of keeping track of skipped checkpoints might
> > > be interesting, but I'm not entirely convinced it really is.  We have
> > > time to debate that, of course, but I don't really see how that's
> > > helpful.  At the moment, it seems like the suggestion to add that column
> > > is based on the assumption that we're going to start logging skipped
> > > checkpoints and having that column would allow us to match up the count
> > > between the new column and the "skipped checkpoint" messages in the logs
> > > and I can not help but feel that this is a ridiculous amount of effort
> > > being put into the analysis of something that *didn't* happen.
> > 
> > Being able to look at how many checkpoints are skipped can be used as
> > a tuning indicator of max_wal_size and checkpoint_timeout, or in short
> > increase them if those remain idle.
> 
> We usually adjust the GUCs based on how often checkpoints are
> *executed* and how many of the executed checkpoints have been
> triggered by xlog progress (or with shorter interval than
> timeout). It seems enough. Counting skipped checkpoints gives
> just a rough estimate of how long the system was getting no
> substantial updates. I doubt that users get something valuable by
> counting skipped checkpoints.

Yeah, I tend to agree.  I don't really see how counting skipped
checkpoints helps to size max_wal_size or even checkpoint_timeout.  The
whole point here is that nothing is happening and if nothing is
happening then there's no real need to adjust max_wal_size or
checkpoint_timeout or, well, much of anything really..

> > Since their introduction in
> > 335feca4, m_timed_checkpoints and m_requested_checkpoints track the
> > number of checkpoint requests, not if a checkpoint has been actually
> > executed or not, I am not sure that this should be changed after 10
> > years. So, to put it in other words, wouldn't we want a way to track
> > checkpoints that are *executed*, meaning that we could increment a
> > counter after doing the skip checks in CreateRestartPoint() and
> > CreateCheckPoint()?
> 
> This sounds reasonable to me.

I agree that tracking executed checkpoints is valuable, but, and perhaps
I'm missing something, isn't that the same as tracking non-skipped
checkpoints?  I suppose we could have both, if we really feel the need,
provided that doesn't result in more work or effort being done than
simply keeping the count.  I'd hate to end up in a situation where we're
writing things out unnecessarily just to keep track of checkpoints that
were requested but ultimately skipped because there wasn't anything to
do.

Thanks!

Stephen


signature.asc
Description: Digital signature


[HACKERS] PostgreSQL gives an error that a role does not exist while it exists

2017-10-03 Thread Nick Dro
Can someone assist with the issue posted on Stack Overflow?
 
https://stackoverflow.com/questions/46540537/postgresql-9-3-creation-of-group-role-causes-permission-problems
 
 
Creation of a new Group Role causes PostgreSQL to think that Login roles do not exist. I think it's a bug, or at least a wrong error message.
 

Re: [HACKERS] Repetitive code in RI triggers

2017-10-03 Thread Ildar Musin

Hi Maksim,

On 26.09.2017 11:51, Maksim Milyutin wrote:

On 19.09.2017 11:09, Daniel Gustafsson wrote:


Removing reviewer Maksim Milyutin from patch entry due to inactivity and
community account email bouncing.  Maksim: if you are indeed reviewing
this patch, then please update the community account and re-add to the
patch entry.

cheers ./daniel


Daniel, thanks for noticing. I have updated my account and re-added to
the patch entry.

Ildar, your patch is conflicting with the current HEAD of master branch,
please update it.



Thank you for checking the patch out. Yes, it seems that the original
code was reformatted, which led to merge conflicts. I've fixed that and
also introduced some minor improvements. The new version is attached.


Thanks!

--
Ildar Musin
i.mu...@postgrespro.ru
diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index c2891e6..25cdf7d 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -196,8 +196,9 @@ static int	ri_constraint_cache_valid_count = 0;
 static bool ri_Check_Pk_Match(Relation pk_rel, Relation fk_rel,
   HeapTuple old_row,
   const RI_ConstraintInfo *riinfo);
-static Datum ri_restrict_del(TriggerData *trigdata, bool is_no_action);
-static Datum ri_restrict_upd(TriggerData *trigdata, bool is_no_action);
+static Datum ri_restrict(TriggerData *trigdata, bool is_no_action);
+static Datum ri_setnull(TriggerData *trigdata);
+static Datum ri_setdefault(FunctionCallInfo fcinfo);
 static void quoteOneName(char *buffer, const char *name);
 static void quoteRelationName(char *buffer, Relation rel);
 static void ri_GenerateQual(StringInfo buf,
@@ -605,7 +606,7 @@ RI_FKey_noaction_del(PG_FUNCTION_ARGS)
 	/*
 	 * Share code with RESTRICT case.
 	 */
-	return ri_restrict_del((TriggerData *) fcinfo->context, true);
+	return ri_restrict((TriggerData *) fcinfo->context, true);
 }
 
 /* --
@@ -630,175 +631,10 @@ RI_FKey_restrict_del(PG_FUNCTION_ARGS)
 	/*
 	 * Share code with NO ACTION case.
 	 */
-	return ri_restrict_del((TriggerData *) fcinfo->context, false);
+	return ri_restrict((TriggerData *) fcinfo->context, false);
 }
 
 /* --
- * ri_restrict_del -
- *
- *	Common code for ON DELETE RESTRICT and ON DELETE NO ACTION.
- * --
- */
-static Datum
-ri_restrict_del(TriggerData *trigdata, bool is_no_action)
-{
-	const RI_ConstraintInfo *riinfo;
-	Relation	fk_rel;
-	Relation	pk_rel;
-	HeapTuple	old_row;
-	RI_QueryKey qkey;
-	SPIPlanPtr	qplan;
-	int			i;
-
-	/*
-	 * Get arguments.
-	 */
-	riinfo = ri_FetchConstraintInfo(trigdata->tg_trigger,
-	trigdata->tg_relation, true);
-
-	/*
-	 * Get the relation descriptors of the FK and PK tables and the old tuple.
-	 *
-	 * fk_rel is opened in RowShareLock mode since that's what our eventual
-	 * SELECT FOR KEY SHARE will get on it.
-	 */
-	fk_rel = heap_open(riinfo->fk_relid, RowShareLock);
-	pk_rel = trigdata->tg_relation;
-	old_row = trigdata->tg_trigtuple;
-
-	switch (riinfo->confmatchtype)
-	{
-			/* --
-			 * SQL:2008 15.17 <Execution of referential actions>
-			 *	General rules 9) a) iv):
-			 *		MATCH SIMPLE/FULL
-			 *			... ON DELETE RESTRICT
-			 * --
-			 */
-		case FKCONSTR_MATCH_SIMPLE:
-		case FKCONSTR_MATCH_FULL:
-			switch (ri_NullCheck(old_row, riinfo, true))
-			{
-case RI_KEYS_ALL_NULL:
-case RI_KEYS_SOME_NULL:
-
-	/*
-	 * No check needed - there cannot be any reference to old
-	 * key if it contains a NULL
-	 */
-	heap_close(fk_rel, RowShareLock);
-	return PointerGetDatum(NULL);
-
-case RI_KEYS_NONE_NULL:
-
-	/*
-	 * Have a full qualified key - continue below
-	 */
-	break;
-			}
-
-			/*
-			 * If another PK row now exists providing the old key values, we
-			 * should not do anything.  However, this check should only be
-			 * made in the NO ACTION case; in RESTRICT cases we don't wish to
-			 * allow another row to be substituted.
-			 */
-			if (is_no_action &&
-ri_Check_Pk_Match(pk_rel, fk_rel, old_row, riinfo))
-			{
-heap_close(fk_rel, RowShareLock);
-return PointerGetDatum(NULL);
-			}
-
-			if (SPI_connect() != SPI_OK_CONNECT)
-elog(ERROR, "SPI_connect failed");
-
-			/*
-			 * Fetch or prepare a saved plan for the restrict delete lookup
-			 */
-			ri_BuildQueryKey(&qkey, riinfo, RI_PLAN_RESTRICT_DEL_CHECKREF);
-
-			if ((qplan = ri_FetchPreparedPlan(&qkey)) == NULL)
-			{
-StringInfoData querybuf;
-char		fkrelname[MAX_QUOTED_REL_NAME_LEN];
-char		attname[MAX_QUOTED_NAME_LEN];
-char		paramname[16];
-const char *querysep;
-Oid			queryoids[RI_MAX_NUMKEYS];
-
-/* --
- * The query string built is
- *	SELECT 1 FROM ONLY  x WHERE $1 = fkatt1 [AND ...]
- *		   FOR KEY SHARE OF x
- * The type id's for the $ parameters are those of the
- * corresponding PK attributes.
- * --
- */
-initStringInfo(&querybuf);
-quoteRelationName(fkrelname, fk_rel);
-appendStringInfo(&querybuf, "SELECT 1 FROM ONLY %s x",
-	

Re: [HACKERS] [PATCH] Incremental sort

2017-10-03 Thread Robert Haas
On Mon, Oct 2, 2017 at 12:37 PM, Alexander Korotkov
 wrote:
> I've applied patch on top of c12d570f and rerun the same benchmarks.
> CSV-file with results is attached.  There is no dramatical changes.  There
> is still minority of performance regression cases while majority of cases
> has improvement.

Yes, I think these results look pretty good.  But are these times in
seconds?  You might need to do some testing with bigger sorts.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] UPDATE of partition key

2017-10-03 Thread Amit Khandekar
On 30 September 2017 at 01:26, Robert Haas  wrote:
> On Fri, Sep 29, 2017 at 3:53 PM, Robert Haas  wrote:
>> On Fri, Sep 22, 2017 at 1:57 AM, Amit Khandekar  
>> wrote:
>>> The patch for the above change is :
>>> 0002-Prevent-a-redundant-ConvertRowtypeExpr-node.patch
>>
>> Thinking about this a little more, I'm wondering about how this case
>> arises.  I think that for this patch to avoid multiple conversions,
>> we'd have to be calling map_variable_attnos on an expression and then
>> calling map_variable_attnos on that expression again.

We are not calling map_variable_attnos() twice. The first time it is
called, the ConvertRowtypeExpr node is already present if the
expression is a whole-row var; that node was added earlier, by
adjust_appendrel_attrs(). So the conversion is done by two different
functions.

For ConvertRowtypeExpr, map_variable_attnos_mutator() recursively
calls map_variable_attnos_mutator() for ConvertRowtypeExpr->arg with
coerced_var=true.

>
> I guess I didn't quite finish this thought, sorry.  Maybe it's
> obvious, but the point I was going for is: why would we do that, vs.
> just converting once?

The first time ConvertRowtypeExpr node gets added in the expression is
when adjust_appendrel_attrs() is called for each of the child tables.
Here, for each of the child table, when the parent parse tree is
converted into the child parse tree, the whole row var (in RETURNING
or WITH CHECK OPTIONS expr) is wrapped with ConvertRowtypeExpr(), so
child parse tree (or the child WCO expr) has this ConvertRowtypeExpr
node.

The second time this node is added is during update-tuple-routing in
ExecInitModifyTable(), when map_partition_varattnos() is called for
each of the partitions to convert from the first per-subplan
RETURNING/WCO expression to the RETURNING/WCO expression belonging to
the leaf partition. This second conversion happens for the leaf
partitions which are not already present in per-subplan UPDATE result
rels.

So the first conversion is from parent to child while building
per-subplan plans, and the second is from first per-subplan child to
another child for building expressions of the leaf partitions.

So suppose the root partitioned table RETURNING expression is a whole
row var wr(r) where r is its composite type representing the root
table type.
Then, one of its UPDATE child tables will have its RETURNING
expression converted like this :
wr(r)  ===>  CRE(r) -> wr(c1)
where CRE(r) represents ConvertRowtypeExpr of result type r, which has
its arg pointing to wr(c1) which is a whole row var of composite type
c1 for the child table c1. So this node converts from composite type
of child table to composite type of root table.

Now, when the second conversion occurs for the leaf partition (i.e.
during update-tuple-routing), the conversion looks like this :
CRE(r) -> wr(c1)  ===>  CRE(r) -> wr(c2)
But W/o the 0002*ConvertRowtypeExpr*.patch the conversion would have
looked like this :
CRE(r) -> wr(c1)  ===>  CRE(r) -> CRE(c1) -> wr(c2)
In short, we omit the intermediate CRE(c1) node.
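The omission can be modeled with a toy tree rewrite (illustrative only — the real logic lives in map_variable_attnos_mutator with its coerced_var flag; the node tuples here are invented for the sketch):

```python
def convert_whole_row(expr, old_type, new_type, under_cre=False):
    """Rewrite whole-row Vars of old_type into new_type.

    ('wr', t) is a whole-row Var of composite type t; ('cre', r, arg)
    is a ConvertRowtypeExpr producing type r.  Directly under an
    existing ConvertRowtypeExpr, the intermediate conversion node is
    omitted, since the outer node already converts to the target type.
    """
    if expr[0] == 'wr' and expr[1] == old_type:
        if under_cre:
            return ('wr', new_type)                  # omit CRE(old_type)
        return ('cre', old_type, ('wr', new_type))   # conversion needed
    if expr[0] == 'cre':
        return ('cre', expr[1],
                convert_whole_row(expr[2], old_type, new_type,
                                  under_cre=True))
    return expr

# CRE(r) -> wr(c1) becomes CRE(r) -> wr(c2),
# not CRE(r) -> CRE(c1) -> wr(c2)
converted = convert_whole_row(('cre', 'r', ('wr', 'c1')), 'c1', 'c2')
```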


While writing this down, I observed that after multi-level partition
tree expansion was introduced, the child table expressions are not
converted directly from the root. Instead, they are converted from
their immediate parent. So there is a chain of conversions : to leaf
from its parent, to that parent from its parent, and so on from the
root. Effectively, during the first conversion, there are that many
ConvertRowtypeExpr nodes one above the other already present in the
UPDATE result rel expressions. But my patch handles the optimization
only for the leaf partition conversions.

If it already has a CRE : CRE(rr) -> wr(r)
Parent-to-child conversion ::: CRE(p) -> wr(r)  ===>   CRE(rr) ->
CRE(r) -> wr(c1)
With patch : CRE(rr) -> CRE(r) -> wr(c1) ===> CRE(rr) -> CRE(r) -> wr(c2)




Re: [HACKERS] list of credits for release notes

2017-10-03 Thread Robert Haas
On Mon, Oct 2, 2017 at 5:09 PM, Tatsuo Ishii  wrote:
> Michael is correct.
>
> Sometimes people choose family name first in the emails.  However I am
> sure "Fujii" is the family name and "Masao" is the first name.

But I don't think that directly answers the question of how he would
prefer to be credited.  Since I am American, I prefer to be credited
using the style "${FIRSTNAME} ${LASTNAME}", but the preferences of
someone from another country might be the same or different.  I don't
think we should presume that someone prefers something other than the
style in their email name unless they say so.

The question of what to do about pseudonyms is a tricky one - I'm not
really keen to have such things in the release notes in lieu of actual
names.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM

2017-10-03 Thread Alvaro Herrera
Andreas Seltenreich wrote:
> Michael Paquier writes:
> 
> > I am attaching a patch that addresses the bugs for the spin lock sections.
> > [2. text/x-diff; walreceiver-spin-calls.patch]
> 
> I haven't seen a spinlock PANIC since testing with the patch applied.
> They occurred five times with the same amount of testing done earlier.

Thanks for testing.  I'm about to push something.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] 64-bit queryId?

2017-10-03 Thread Vladimir Sitnikov
>OK, so here's a patch.  Review appreciated.

Please correct typo "Write an unsigned integer field (anythign written with
UINT64_FORMAT)".  anythign ->  anything.

Vladimir


Re: [HACKERS] [sqlsmith] crash in RestoreLibraryState during low-memory testing

2017-10-03 Thread Andreas Seltenreich
Tom Lane writes:

> Presumably somebody could dig into the libc source code and prove or
> disprove this, though it would sure help to know exactly what platform
> and version Andreas is testing on.

This is the code in glibc-2.24 around the crash site:

,----[ glibc-2.24/elf/dl-load.c:442 ]
|   to_free = cp = expand_dynamic_string_token (l, cp, 1);
|
|   size_t len = strlen (cp);
`----

…while expand_dynamic_string_token will indeed return NULL on a failed
malloc.  Code in the most recent glibc looks the same, so I'll carry
this issue over to the glibc bugzilla then.

Sorry about the noise…
Andreas




Re: [HACKERS] 64-bit queryId?

2017-10-03 Thread Michael Paquier
On Tue, Oct 3, 2017 at 3:12 PM, Andres Freund  wrote:
> On 2017-10-03 03:07:09 +0300, Alexander Korotkov wrote:
>> On Tue, Oct 3, 2017 at 12:32 AM, Tom Lane  wrote:
>> +1,
>> I see 3 options there:
>> 1) Drop high-order bit, as you proposed.
>> 2) Allow negative queryIds.
>> 3) Implement unsigned 64-type.
>
> 4) use numeric, efficiency when querying is not a significant concern here
> 5) use a custom type that doesn't support arithmetic, similar to pg_lsn.

Why not just return hex-like text?
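For what it's worth, the options under discussion behave as follows for an ID with the high-order bit set (a Python sketch with invented helper names, purely to illustrate the trade-offs):

```python
def to_signed64(qid):
    """Option 2: reinterpret the uint64 ID as a signed int64; IDs at or
    above 2**63 show up as negative values."""
    assert 0 <= qid < 2**64
    return qid - 2**64 if qid >= 2**63 else qid

def drop_high_bit(qid):
    """Option 1: mask off the high-order bit, leaving a 63-bit ID."""
    return qid & ((1 << 63) - 1)

def to_hex_text(qid):
    """Hex-like text: lossless, and sidesteps signedness entirely."""
    return format(qid, '016x')

example = 0xDEADBEEFCAFEBABE   # an arbitrary 64-bit hash value
```

The text form keeps all 64 bits but gives up easy numeric comparison, which is roughly the trade-off a pg_lsn-like custom type would also make.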
-- 
Michael




Re: [HACKERS] [PATCH] Assert that the correct locks are held when calling PageGetLSN()

2017-10-03 Thread Alvaro Herrera
Michael Paquier wrote:
> On Tue, Oct 3, 2017 at 12:54 AM, Alvaro Herrera  
> wrote:

> > I'd argue about this in the same direction I argued about
> > BufferGetPage() needing an LSN check that's applied separately: if it's
> > too easy for a developer to do the wrong thing (i.e. fail to apply the
> > separate LSN check after reading the page) then the routine should be
> > patched to do the check itself, and another routine should be offered
> > for the cases that explicitly can do without the check.
> >
> > I was eventually outvoted in the BufferGetPage() saga, though, so I'm
> > not sure that that discussion sets precedent.
> 
> Oh... I don't recall this discussion. A quick lookup at the archives
> does not show me a specific thread either.

I mean Kevin patch for snapshot-too-old:
https://www.postgresql.org/message-id/CACjxUsPPCbov6DDPnuGpR%3DFmxHsjSn_MRi3rJYgvbRMCRfFz%2BA%40mail.gmail.com

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [PATCH] Improve geometric types

2017-10-03 Thread Kyotaro HORIGUCHI
Thanks.

At Mon, 2 Oct 2017 11:46:15 +0200, Emre Hasegeli  wrote in 

Re: [HACKERS] Possible SSL improvements for a newcomer to tackle

2017-10-03 Thread Magnus Hagander
On Tue, Oct 3, 2017 at 6:33 AM, Tom Lane  wrote:

> Zeus Kronion  writes:
> > 2) I was surprised to learn the following from the docs:
>
> >> By default, PostgreSQL will not perform any verification of the server
> >> certificate.
>
> > Is there a technical reason to perform no verification by default?
> Wouldn't
> > a safer default be desirable?
>
> I'm not an SSL expert, so insert appropriate grain of salt, but AIUI the
> question is what are you going to verify against?  You need some local
> notion of which are your trusted root certificates before you can verify
> anything.  So to default to verification would be to default to failing to
> connect at all until user has created a ~/.postgresql/root.crt file with
> valid, relevant entries.  That seems like a nonstarter.
>

Yes, you need something to verify against.

One way to do it would be to default to the "system global certificate
store", which is what most other SSL apps do. For example on a typical
debian/ubuntu, that'd be the store in /etc/ssl/certs/ca-certificates.crt.
Exactly where to find them would be distribution-specific though, and we
would need to actually add support for a second certificate store. But that
would probably be a useful feature in itself.



> It's possible that we could adopt some policy like "if the root.crt file
> exists then default to verify" ... but that seems messy and unreliable,
> so I'm not sure it would really add any security.
>

No, that's horrible. If it's unreliable, it doesn't provide any actual
benefit. We have a history of that in our default being prefer instead of
allow, but we definitely shouldn't make that situation even worse.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] Possible SSL improvements for a newcomer to tackle

2017-10-03 Thread Adrien Nayrat
Hi,

On 10/03/2017 06:15 AM, Zeus Kronion wrote:
> 2) I was surprised to learn the following from the docs:
> 
>> By default, PostgreSQL will not perform any verification of the server
> certificate. This means that it is possible to spoof the server identity (for
> example by modifying a DNS record or by taking over the server IP address)
> without the client knowing. In order to prevent spoofing, SSL certificate
> verification must be used.
> 
> Is there a technical reason to perform no verification by default? Wouldn't a
> safer default be desirable?

If you want to verify the server's certificate, you could use DANE [1] +
DNSSEC [2]? (I am not an SSL expert either.)

If I understand correctly, you can store your certificate in a DNS record
(TLSA). Then the client can check the certificate. You must trust your DNS
server (protection against spoofing), that's why you have to use DNSSEC.



1: https://en.wikipedia.org/wiki/DNS-based_Authentication_of_Named_Entities
2: https://en.wikipedia.org/wiki/Domain_Name_System_Security_Extensions

-- 
Adrien NAYRAT





Re: [HACKERS] Another oddity in handling of WCO constraints in postgres_fdw

2017-10-03 Thread Ashutosh Bapat
On Thu, Sep 28, 2017 at 5:45 PM, Etsuro Fujita
 wrote:
> Hi,
>
> Commit 7086be6e3627c1ad797e32ebbdd232905b5f577f addressed mishandling of WCO
> in direct foreign table modification by disabling it when we have WCO, but I
> noticed another oddity in postgres_fdw:
>
> postgres=# create table base_tbl (a int, b int);
> postgres=# create function row_before_insupd_trigfunc() returns trigger as
> $$begin new.a := new.a + 10; return new; end$$ language plpgsql;
> postgres=# create trigger row_before_insupd_trigger before insert or update
> on base_tbl for each row execute procedure row_before_insupd_trigfunc();
> postgres=# create server loopback foreign data wrapper postgres_fdw options
> (dbname 'postgres');
> postgres=# create user mapping for CURRENT_USER server loopback;
> postgres=# create foreign table foreign_tbl (a int, b int) server loopback
> options (table_name 'base_tbl');
> postgres=# create view rw_view as select * from foreign_tbl where a < b with
> check option;
>
> So, this should fail, but
>
> postgres=# insert into rw_view values (0, 5);
> INSERT 0 1
>
> The reason for that is: this is processed using postgres_fdw's non-direct
> foreign table modification (ie. ForeignModify), but unlike the RETURNING or
> local after trigger case, the ForeignModify doesn't take care that remote
> triggers might change the data in that case, so the WCO is evaluated using
> the data supplied, not the data actually inserted, which I think is wrong.
> (I should have noticed that as well while working on the fix, though.)  So,
> I'd propose to fix that by modifying postgresPlanForeignModify so that it
> handles WCO the same way as for the RETURNING case.  Attached is a patch for
> that.  I'll add the patch to the next commitfest.
>

Enforcing WCO constraints imposed by the local server on the row/DML
being passed to the foreign server is fine, but trying to impose them
on the row actually inserted/updated at the foreign server looks odd.
Maybe we should just leave this case as it is. I am comparing this case
with the way we handle constraints on a foreign table.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SendRowDescriptionMessage() is slow for queries with a lot of columns

2017-10-03 Thread Andres Freund
Hi,

Attached is a revised version of this patchset. I'd like to get some
input on a few points:

1) Does anybody have a better idea than the static buffer in
   SendRowDescriptionMessage()? That's not particularly pretty, but
   there's not really a convenient stringbuffer to use when called from
   exec_describe_portal_message(). We could instead create a local
   buffer for exec_describe_portal_message().

   An alternative idea would be to have one reusable buffer created for
   each transaction command, but I'm not sure that's really better.

2) There's a lot of remaining pq_sendint() callers in other parts of the
   tree. If others are ok with that, I'd do a separate pass over them.
   I'd say that even after doing that, we should keep pq_sendint(),
   because a lot of extension code is using that.

3) The use of restrict, with a configure-based fallback, is something
   we've not done before, but it's C99 and delivers significantly more
   efficient code. Any arguments against?

Regards,

Andres
>From ff8c4128a46199beab2beb09c1ad0627bbc18b94 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Wed, 20 Sep 2017 13:01:22 -0700
Subject: [PATCH 1/6] Add configure infrastructure to detect support for C99's
 restrict.

Will be used in later commits improving performance for a few key
routines where information about aliasing allows for significantly
better code generation.

This allows the C99 'restrict' keyword to be used without breaking C89,
or for that matter C++, compilers. If not supported, it's defined to be
empty.

Author: Andres Freund
---
 configure | 46 +++
 configure.in  |  1 +
 src/include/pg_config.h.in| 14 +
 src/include/pg_config.h.win32 |  6 ++
 4 files changed, 67 insertions(+)

diff --git a/configure b/configure
index 216447e739..5fa7a61025 100755
--- a/configure
+++ b/configure
@@ -11545,6 +11545,52 @@ _ACEOF
 ;;
 esac
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for C/C++ restrict keyword" >&5
+$as_echo_n "checking for C/C++ restrict keyword... " >&6; }
+if ${ac_cv_c_restrict+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_cv_c_restrict=no
+   # The order here caters to the fact that C++ does not require restrict.
+   for ac_kw in __restrict __restrict__ _Restrict restrict; do
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+typedef int * int_ptr;
+	int foo (int_ptr $ac_kw ip) {
+	return ip[0];
+   }
+int
+main ()
+{
+int s[1];
+	int * $ac_kw t = s;
+	t[0] = 0;
+	return foo(t)
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  ac_cv_c_restrict=$ac_kw
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+ test "$ac_cv_c_restrict" != no && break
+   done
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_c_restrict" >&5
+$as_echo "$ac_cv_c_restrict" >&6; }
+
+ case $ac_cv_c_restrict in
+   restrict) ;;
+   no) $as_echo "#define restrict /**/" >>confdefs.h
+ ;;
+   *)  cat >>confdefs.h <<_ACEOF
+#define restrict $ac_cv_c_restrict
+_ACEOF
+ ;;
+ esac
+
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for printf format archetype" >&5
 $as_echo_n "checking for printf format archetype... " >&6; }
 if ${pgac_cv_printf_archetype+:} false; then :
diff --git a/configure.in b/configure.in
index a2e3d8331a..bebbd11af9 100644
--- a/configure.in
+++ b/configure.in
@@ -1299,6 +1299,7 @@ fi
 m4_defun([AC_PROG_CC_STDC], []) dnl We don't want that.
 AC_C_BIGENDIAN
 AC_C_INLINE
+AC_C_RESTRICT
 PGAC_PRINTF_ARCHETYPE
 AC_C_FLEXIBLE_ARRAY_MEMBER
 PGAC_C_SIGNED
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 368a297e6d..b7ae9a0702 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -916,6 +916,20 @@
if such a type exists, and if the system does not define it. */
 #undef intptr_t
 
+/* Define to the equivalent of the C99 'restrict' keyword, or to
+   nothing if this is not supported.  Do not define if restrict is
+   supported directly.  */
+#undef restrict
+/* Work around a bug in Sun C++: it does not support _Restrict or
+   __restrict__, even though the corresponding Sun C compiler ends up with
+   "#define restrict _Restrict" or "#define restrict __restrict__" in the
+   previous line.  Perhaps some future version of Sun C++ will work with
+   restrict; if so, hopefully it defines __RESTRICT like Sun C does.  */
+#if defined __SUNPRO_CC && !defined __RESTRICT
+# define _Restrict
+# define __restrict__
+#endif
+
 /* Define to empty if the C compiler does not understand signed types. */
 #undef signed
 
diff --git a/src/include/pg_config.h.win32 b/src/include/pg_config.h.win32
index 3537b6f704..e6b3c5d551 100644
--- a/src/include/pg_config.h.win32
+++ b/src/include/pg_config.h.win32
@@ -674,6 +674,12 @@
 #define inline __inline
 #endif
 
+/* Define to the equivalent of the C99 'restrict' keyword, or to
+   nothing if this is not supported.  

Re: [HACKERS] 64-bit queryId?

2017-10-03 Thread Andres Freund
On 2017-10-03 17:06:20 +0900, Michael Paquier wrote:
> On Tue, Oct 3, 2017 at 3:12 PM, Andres Freund  wrote:
> > On 2017-10-03 03:07:09 +0300, Alexander Korotkov wrote:
> >> On Tue, Oct 3, 2017 at 12:32 AM, Tom Lane  wrote:
> >> +1,
> >> I see 3 options there:
> >> 1) Drop high-order bit, as you proposed.
> >> 2) Allow negative queryIds.
> >> 3) Implement unsigned 64-type.
> >
> > 4) use numeric, efficiency when querying is not a significant concern here
> > 5) use a custom type that doesn't support arithmetic, similar to pg_lsn.
> 
> Why not just return a hex-like text representation?

Two reasons: First, it'd look fairly different to before, whereas 4/5
would probably just continue to work fairly transparently in a lot of
cases. Secondly, what's the advantage in doing so over 4)?
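For illustration (my sketch, not code from the thread), the mapping that
makes options 1 and 2 behave the way they do is just the uint64-to-int64
reinterpretation; the helper names below are made up:

```c
#include <stdint.h>

/*
 * Option 2: reinterpret the uint64 queryId as int8.  Values with the
 * high-order bit set come out negative on two's-complement machines
 * (strictly, the conversion is implementation-defined in C).
 */
static int64_t
queryid_as_int64(uint64_t queryId)
{
	return (int64_t) queryId;
}

/*
 * Option 1: mask off the high-order bit so the result always fits in a
 * non-negative int64.
 */
static uint64_t
queryid_drop_high_bit(uint64_t queryId)
{
	return queryId & (UINT64_MAX >> 1);
}
```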

Greetings,

Andres Freund




Re: [HACKERS] pnstrdup considered armed and dangerous

2017-10-03 Thread Andres Freund
On 2016-10-03 14:55:24 -0700, Andres Freund wrote:
> Hi,
> 
> A colleague of mine just wrote innocent-looking code like
> char *shardRelationName = pnstrdup(relationName, NAMEDATALEN);
> which is at the moment wrong if relationName isn't preallocated to
> NAMEDATALEN size.
> 
> /*
>  * pnstrdup
>  *Like pstrdup(), but append null byte to a
>  *not-necessarily-null-terminated input string.
>  */
> char *
> pnstrdup(const char *in, Size len)
> {
>   char   *out = palloc(len + 1);
> 
>   memcpy(out, in, len);
>   out[len] = '\0';
>   return out;
> }
> 
> isn't that a somewhat weird behaviour / implementation? Not really like
> strndup(), which one might believe to be analogous...

I've since hit this bug again. Fixing it requires strnlen, the lack
of which I'd also independently hit twice.  So here's a patch adding
pg_strnlen() and using that to fix pnstrdup().
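To see the hazard concretely, here's a standalone sketch (names are
mine, not the patch's) of the strnlen fallback and a pnstrdup() that
stops at the first null byte instead of unconditionally copying len
bytes:

```c
#include <stdlib.h>
#include <string.h>

/* Fallback with the same contract as POSIX strnlen(). */
static size_t
my_strnlen(const char *str, size_t maxlen)
{
	const char *p = str;

	while (maxlen-- > 0 && *p)
		p++;
	return p - str;
}

/*
 * Like pstrdup(), but copy at most len bytes.  Unlike the old
 * implementation, stop at the first null byte, so we never read past
 * the end of a shorter, null-terminated input.
 */
static char *
my_pnstrdup(const char *in, size_t len)
{
	size_t		actual = my_strnlen(in, len);
	char	   *out = malloc(actual + 1);

	if (out == NULL)
		return NULL;
	memcpy(out, in, actual);
	out[actual] = '\0';
	return out;
}
```

With the old memcpy-of-len behaviour, `my_pnstrdup(relationName,
NAMEDATALEN)` would read NAMEDATALEN bytes even when the source string
is shorter than that.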

Greetings,

Andres Freund
>From 89ac4ce2cdad83806f83c0bc5ddac0e9ab1e038c Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Thu, 21 Sep 2017 11:43:26 -0700
Subject: [PATCH 1/2] Add pg_strnlen(), a portable implementation of strnlen.

As the OS version is likely to be more optimized, use it when
available, as detected by configure.
---
 configure |  2 +-
 configure.in  |  2 +-
 src/common/string.c   | 20 
 src/include/common/string.h   | 15 +++
 src/include/pg_config.h.in|  3 +++
 src/include/pg_config.h.win32 |  3 +++
 src/port/snprintf.c   | 12 ++--
 7 files changed, 45 insertions(+), 12 deletions(-)

diff --git a/configure b/configure
index 5fa7a61025..6584e47293 100755
--- a/configure
+++ b/configure
@@ -8777,7 +8777,7 @@ fi
 
 
 
-for ac_func in strerror_r getpwuid_r gethostbyname_r
+for ac_func in strerror_r getpwuid_r gethostbyname_r strnlen
 do :
   as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
diff --git a/configure.in b/configure.in
index bebbd11af9..15d4c85162 100644
--- a/configure.in
+++ b/configure.in
@@ -961,7 +961,7 @@ LIBS="$LIBS $PTHREAD_LIBS"
 AC_CHECK_HEADER(pthread.h, [], [AC_MSG_ERROR([
 pthread.h not found;  use --disable-thread-safety to disable thread safety])])
 
-AC_CHECK_FUNCS([strerror_r getpwuid_r gethostbyname_r])
+AC_CHECK_FUNCS([strerror_r getpwuid_r gethostbyname_r strnlen])
 
 # Do test here with the proper thread flags
 PGAC_FUNC_STRERROR_R_INT
diff --git a/src/common/string.c b/src/common/string.c
index 159d9ea7b6..901821f3d8 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -41,3 +41,23 @@ pg_str_endswith(const char *str, const char *end)
 	str += slen - elen;
 	return strcmp(str, end) == 0;
 }
+
+
+/*
+ * Portable version of posix' strnlen.
+ *
+ * Returns the number of characters before a null-byte in the string pointed
+ * to by str, unless there's no null-byte before maxlen. In the latter case
+ * maxlen is returned.
+ */
+#ifndef HAVE_STRNLEN
+size_t
+pg_strnlen(const char *str, size_t maxlen)
+{
+	const char *p = str;
+
+	while (maxlen-- > 0 && *p)
+		p++;
+	return p - str;
+}
+#endif
diff --git a/src/include/common/string.h b/src/include/common/string.h
index 5f3ea71d61..3d46b80918 100644
--- a/src/include/common/string.h
+++ b/src/include/common/string.h
@@ -12,4 +12,19 @@
 
 extern bool pg_str_endswith(const char *str, const char *end);
 
+/*
+ * Portable version of posix' strnlen.
+ *
+ * Returns the number of characters before a null-byte in the string pointed
+ * to by str, unless there's no null-byte before maxlen. In the latter case
+ * maxlen is returned.
+ *
+ * Use the system strnlen if provided, it's likely to be faster.
+ */
+#ifdef HAVE_STRNLEN
+#define pg_strnlen(str, maxlen) strnlen(str, maxlen)
+#else
+extern size_t pg_strnlen(const char *str, size_t maxlen);
+#endif
+
 #endif			/* COMMON_STRING_H */
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index b7ae9a0702..257262908c 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -496,6 +496,9 @@
 /* Define to 1 if you have the `strlcpy' function. */
 #undef HAVE_STRLCPY
 
+/* Define to 1 if you have the `strnlen' function. */
+#undef HAVE_STRNLEN
+
 /* Define to use have a strong random number source */
 #undef HAVE_STRONG_RANDOM
 
diff --git a/src/include/pg_config.h.win32 b/src/include/pg_config.h.win32
index e6b3c5d551..08ae2f9a86 100644
--- a/src/include/pg_config.h.win32
+++ b/src/include/pg_config.h.win32
@@ -345,6 +345,9 @@
 /* Define to 1 if you have the  header file. */
 #define HAVE_STRING_H 1
 
+/* Define to 1 if you have the `strnlen' function. */
+#define HAVE_STRNLEN
+
 /* Define to use have a strong random number source */
 #define HAVE_STRONG_RANDOM 1
 
diff --git a/src/port/snprintf.c b/src/port/snprintf.c
index 231e5d6bdb..531d2c5ee3 100644
--- a/src/port/snprintf.c
+++ b/src/port/snprintf.c
@@ -43,6 +43,8 @@
 #endif
 #include 
 

Re: [HACKERS] [PATCH] Improve geometric types

2017-10-03 Thread Kyotaro HORIGUCHI
At Mon, 2 Oct 2017 08:23:49 -0400, Robert Haas  wrote in 

> On Mon, Oct 2, 2017 at 4:23 AM, Kyotaro HORIGUCHI
>  wrote:
> > For other potential reviewers:
> >
> > I found the origin of the function here.
> >
> > https://www.postgresql.org/message-id/4a90bd76.7070...@netspace.net.au
> > https://www.postgresql.org/message-id/AANLkTim4cHELcGPf5w7Zd43_dQi_2RJ_b5_F_idSSbZI%40mail.gmail.com
> >
> > And the reason for pg_hypot is seen here.
> >
> > https://www.postgresql.org/message-id/407d949e0908222139t35ad3ad2q3e6b15646a27d...@mail.gmail.com
> >
> > I think the replacement is now acceptable according to the discussion.
> > ==
> 
> I think if we're going to do this it should be separated out as its
> own patch.

+1

> Also, I think someone should explain what the reasoning
> behind the change is.  Do we, for example, foresee that the built-in
> code might be faster or less likely to overflow?  Because we're
> clearly taking a risk -- most trivially, that the BF will break, or
> more seriously, that some machines will have versions of this function
> that don't actually behave quite the same.
> 
> That brings up a related point.  How good is our test case coverage
> for hypot(), especially in strange corner cases, like this one
> mentioned in pg_hypot()'s comment:
> 
>  * This implementation conforms to IEEE Std 1003.1 and GLIBC, in that the
>  * case of hypot(inf,nan) results in INF, and not NAN.

I'm not sure how precisely identical we practically need them to
be.  FWIW, as a rough confirmation on my box, I compared hypot and
pg_hypot for the following arbitrarily chosen pairs of parameters.


 {2.2e-308, 2.2e-308},
 {2.2e-308, 1.7e307},
 {1.7e307, 1.7e307},
 {1.7e308, 1.7e308},
 {2.2e-308, DBL_MAX},
 {1.7e308, DBL_MAX},
 {DBL_MIN, DBL_MAX},
 {DBL_MAX, DBL_MAX},
 {1.7e307, INFINITY},
 {2.2e-308, INFINITY},
 {0, INFINITY},
 {DBL_MIN, INFINITY},
 {INFINITY, INFINITY},
 {1, NAN},
 {INFINITY, NAN},
 {NAN, NAN},


Only the first pair gave slightly not-exactly-equal results, but
it seems to do no harm. hypot sets the underflow flag.

 0: hypot=3.111269837220809e-308 (== 0.0 is 0, < DBL_MIN is 0)
   pg_hypot=3.11126983722081e-308 (== 0.0 is 0, < DBL_MIN is 0)
   equal=0,
   hypot(errno:0, inval:0, div0:0, of=0, uf=1),
   pg_hypot(errno:0, inval:0, div0:0, of=0, uf=0)

But I'm not sure how both functions behave on other platforms.

> I'm potentially willing to commit a patch that just makes the
> pg_hypot() -> hypot() change and does nothing else, if there are not
> objections to that change, but I want to be sure that we'll know right
> away if that turns out to break.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <float.h>
#include <fenv.h>

double
pg_hypot(double x, double y)
{
double  yx;

/* Handle INF and NaN properly */
if (isinf(x) || isinf(y))
return (double) INFINITY;

if (isnan(x) || isnan(y))
return (double) NAN;

/* Else, drop any minus signs */
x = fabs(x);
y = fabs(y);

/* Swap x and y if needed to make x the larger one */
if (x < y)
{
double  temp = x;

x = y;
y = temp;
}

/*
 * If y is zero, the hypotenuse is x.  This test saves a few cycles in
 * such cases, but more importantly it also protects against
 * divide-by-zero errors, since now x >= y.
 */
if (y == 0.0)
return x;

/* Determine the hypotenuse */
yx = y / x;
return x * sqrt(1.0 + (yx * yx));
}

void
setfeflags(int *invalid, int *divzero, int *overflow, int *underflow)
{
int err = fetestexcept(FE_INVALID | FE_DIVBYZERO |
   FE_OVERFLOW | FE_UNDERFLOW);
*invalid = ((err & FE_INVALID) != 0);
*divzero = ((err & FE_DIVBYZERO) != 0);
*overflow = ((err & FE_OVERFLOW) != 0);
*underflow = ((err & FE_UNDERFLOW) != 0);
}

int
main(void)
{
double x;
double y;
double p[][2] =
{
{2.2e-308, 2.2e-308},
{2.2e-308, 1.7e307},
{1.7e307, 1.7e307},
{1.7e308, 1.7e308},
{2.2e-308, DBL_MAX},
{1.7e308, DBL_MAX},
{DBL_MIN, DBL_MAX},
{DBL_MAX, DBL_MAX},
{1.7e307, INFINITY},
{2.2e-308, INFINITY},
{0, INFINITY},
{DBL_MIN, INFINITY},
{INFINITY, INFINITY},
{1, NAN},
{INFINITY, NAN},
{NAN, NAN},
{0.0, 0.0}

Re: [HACKERS] [PATCH] Improve geometric types

2017-10-03 Thread Alvaro Herrera
Robert Haas wrote:

> I'm potentially willing to commit a patch that just makes the
> pg_hypot() -> hypot() change and does nothing else, if there are not
> objections to that change, but I want to be sure that we'll know right
> away if that turns out to break.

Uh, I thought pg_hypot() was still needed on our oldest supported
platforms.  Or is hypot() already supported there?  If not, I suppose we
can keep the HYPOT() macro, and have it use hypot() on platforms that
have it and pg_hypot() elsewhere?  That particular part of 0001 seemed
a bit dubious to me, as the largest potential source of platform
dependency issues.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Possible SSL improvements for a newcomer to tackle

2017-10-03 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Magnus Hagander  writes:
> > On Tue, Oct 3, 2017 at 6:33 AM, Tom Lane  wrote:
> >> I'm not an SSL expert, so insert appropriate grain of salt, but AIUI the
> >> question is what are you going to verify against?
> 
> > One way to do it would be to default to the "system global certificate
> > store", which is what most other SSL apps do. For example on a typical
> > debian/ubuntu, that'd be the store in /etc/ssl/certs/ca-certificates.crt.
> > Exactly where to find them would be distribution-specific though, and we
> > would need to actually add support for a second certificate store. But that
> > would probably be a useful feature in itself.
> 
> Maybe.  The impression I have is that it's very common for installations
> to use a locally-run CA to generate server and client certs.  I would not
> expect them to put such certs into /etc/ssl/certs.  But I suppose there
> might be cases where you would actually pay for a universally-valid cert
> for a DB server ...

In many larger enterprises, they actually deploy systems with their own
CA installed into the system global certificate store (possibly removing
certain other CAs from that set too, and distributing their own version
of the relevant package that maintains the CA set).

I agree with Magnus that most other SSL apps do default to the system
global cert store and it's generally what's expected.

Thanks!

Stephen




Re: [HACKERS] [PATCH] Improve geometric types

2017-10-03 Thread Emre Hasegeli
> I think if we're going to do this it should be separated out as its
> own patch.  Also, I think someone should explain what the reasoning
> behind the change is.  Do we, for example, foresee that the built-in
> code might be faster or less likely to overflow?  Because we're
> clearly taking a risk -- most trivially, that the BF will break, or
> more seriously, that some machines will have versions of this function
> that don't actually behave quite the same.

I included the removal of pg_hypot() in my patch simply because the
comment in the function header suggests it.  I thought that if we are
going to clean this module up, we had better deal with it first.  I
understand the risk.  The patches include more changes; it may be a
good idea to have those together.

> That brings up a related point.  How good is our test case coverage
> for hypot(), especially in strange corner cases, like this one
> mentioned in pg_hypot()'s comment:
>
>  * This implementation conforms to IEEE Std 1003.1 and GLIBC, in that the
>  * case of hypot(inf,nan) results in INF, and not NAN.

I don't see any tests of geometric types with INF or NaN.  Currently,
there isn't consistent behaviour for them.  I don't think we can
easily add portable tests in the current state, but we should be able
to do so with my patches.  I will look into it.

> I'm potentially willing to commit a patch that just makes the
> pg_hypot() -> hypot() change and does nothing else, if there are not
> objections to that change, but I want to be sure that we'll know right
> away if that turns out to break.

I can split this one into another patch.




Re: [HACKERS] [PATCH] Improve geometric types

2017-10-03 Thread Emre Hasegeli
> Uh, I thought pg_hypot() was still needed on our oldest supported
> platforms.  Or is hypot() already supported there?  If not, I suppose we
> can keep the HYPOT() macro, and have it use hypot() on platforms that
> have it and pg_hypot elsewhere?  That particular fraction of 0001 seemed
> a bit dubious to me, as the largest possible source of platform
> dependency issues.

What is our oldest supported platform?

We can also just keep pg_hypot().  I don't think getting rid of it
justifies much work.




Re: [HACKERS] Postgresql gives error that role goes not exists while it exists

2017-10-03 Thread Euler Taveira
2017-10-03 5:49 GMT-03:00 Nick Dro :
> Can someone assists with the issue posted on StackOverflow?
>
> https://stackoverflow.com/questions/46540537/postgresql-9-3-creation-of-group-role-causes-permission-problems
>
>
> Creation of new Group Role causes postgresql to think that Login roles does
> not exist. I think it's a bug? or at least a wrong error message
>
I'm not sure. I bet a dime that the role was created as "Iris" and you
are trying to assign "iris" (they are different). If you list the
roles, we can confirm that.
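For anyone hitting this, the usual cause is identifier case folding:
unquoted identifiers are folded to lower case, while double-quoted ones
keep their case. A sketch (role and group names below are made up for
illustration):

```sql
CREATE ROLE "Iris" LOGIN;        -- double-quoted: case is preserved
GRANT app_group TO iris;         -- ERROR: role "iris" does not exist
GRANT app_group TO "Iris";       -- works: matches the stored name
```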


-- 
   Euler Taveira   Timbira -
http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento




Re: [HACKERS] Possible SSL improvements for a newcomer to tackle

2017-10-03 Thread Tom Lane
Magnus Hagander  writes:
> On Tue, Oct 3, 2017 at 6:33 AM, Tom Lane  wrote:
>> I'm not an SSL expert, so insert appropriate grain of salt, but AIUI the
>> question is what are you going to verify against?

> One way to do it would be to default to the "system global certificate
> store", which is what most other SSL apps do. For example on a typical
> debian/ubuntu, that'd be the store in /etc/ssl/certs/ca-certificates.crt.
> Exactly where to find them would be distribution-specific though, and we
> would need to actually add support for a second certificate store. But that
> would probably be a useful feature in itself.

Maybe.  The impression I have is that it's very common for installations
to use a locally-run CA to generate server and client certs.  I would not
expect them to put such certs into /etc/ssl/certs.  But I suppose there
might be cases where you would actually pay for a universally-valid cert
for a DB server ...

regards, tom lane




Re: [HACKERS] list of credits for release notes

2017-10-03 Thread Tatsuo Ishii
> On Mon, Oct 2, 2017 at 5:09 PM, Tatsuo Ishii  wrote:
>> Michael is correct.
>>
>> Sometimes people choose family name first in the emails.  However I am
>> sure "Fujii" is the family name and "Masao" is the firstname.
> 
> But I don't think that directly answers the question of how he would
> prefer to be credited.

Of course. It's different story.

> Since I am American, I prefer to be credited
> using the style "${FIRSTNAME} ${LASTNAME}", but the preferences of
> someone from another country might be the same or different.  I don't
> think we should presume that someone prefers something other than the
> style in their email name unless they say so.

My concern is that the list seems to implicitly assume that each name
appears first name first, then last name. So people might misunderstand
that "Fujii" is the first name and "Masao" is the last name. I don't
know how he actually feels about that, but if I were him, I would feel
uncomfortable. If the list explicitly stated that we do not guarantee
the order of first and last names to be correct, then that kind of
misunderstanding could be avoided.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] Two pass CheckDeadlock in contentent case

2017-10-03 Thread Sokolov Yura

On 2017-10-03 17:30, Sokolov Yura wrote:

Good day, hackers.

During heavy workloads, a process sometimes reaches the deadlock
timeout even if no real deadlock occurred. It is easily reproducible
with pg_advisory_xact_lock on the same value + some time-consuming
operation (an update) and many clients.

When backend reaches deadlock timeout, it calls CheckDeadlock,
which locks all partitions of lock hash in exclusive mode to
walk through graph and search for deadlock.

If hundreds of backends reach this timeout trying to acquire an
advisory lock on the same value, it leads to a hard stall for many
seconds, because they all traverse the same huge lock graph under
exclusive locks.
During this stall there is no possibility of doing any meaningful
operation (no new transaction can begin).

The attached patch makes CheckDeadlock do two passes:
- The first pass uses LW_SHARED on the partitions of the lock hash.
  DeadLockCheck is called with the flag "readonly", so it doesn't
  modify anything.
- If a possibility of a "soft" or "hard" deadlock is detected,
  i.e. if the lock graph needs to be modified, then the partitions
  are relocked with LW_EXCLUSIVE and DeadLockCheck is called again.

This fixes the stall, because backends walk the lock graph under shared
locks and find that there is no real deadlock.

Test on 4 socket xeon machine:
pgbench_zipf -s 10 -c 800  -j 100 -M prepared -T 450 -f
./ycsb_read_zipf.sql@50 -f ./ycsb_update_lock2_zipf.sql@50 -P 5

ycsb_read_zipf.sql:
\set i random_zipfian(1, 10 * :scale, 1.01)
SELECT abalance FROM pgbench_accounts WHERE aid = :i;
ycsb_update_lock2_zipf.sql:
\set i random_zipfian(1, 10 * :scale, 1.01)
select lock_and_update( :i );

CREATE OR REPLACE FUNCTION public.lock_and_update(i integer)
 RETURNS void
 LANGUAGE sql
AS $function$
SELECT pg_advisory_xact_lock( $1 );
UPDATE pgbench_accounts SET abalance = 1 WHERE aid = $1;
$function$

Without attached patch:

progress: 5.0 s, 45707.0 tps, lat 15.599 ms stddev 83.757
progress: 10.0 s, 51124.4 tps, lat 15.681 ms stddev 78.353
progress: 15.0 s, 52293.8 tps, lat 15.327 ms stddev 77.017
progress: 20.0 s, 51280.4 tps, lat 15.603 ms stddev 78.199
progress: 25.0 s, 47278.6 tps, lat 16.795 ms stddev 83.570
progress: 30.0 s, 41792.9 tps, lat 18.535 ms stddev 93.697
progress: 35.0 s, 12393.7 tps, lat 33.757 ms stddev 169.116
progress: 40.0 s, 0.0 tps, lat -nan ms stddev -nan
progress: 45.0 s, 0.0 tps, lat -nan ms stddev -nan
progress: 50.0 s, 1.2 tps, lat 2497.734 ms stddev 5393.166
progress: 55.0 s, 0.0 tps, lat -nan ms stddev -nan
progress: 60.0 s, 27357.9 tps, lat 160.622 ms stddev 1866.625
progress: 65.0 s, 38770.8 tps, lat 20.829 ms stddev 104.453
progress: 70.0 s, 40553.2 tps, lat 19.809 ms stddev 99.741

(autovacuum caused the deadlock timeout to trigger,
 and therefore the stall)

Patched:

progress: 5.0 s, 56264.7 tps, lat 12.847 ms stddev 66.980
progress: 10.0 s, 55389.3 tps, lat 14.329 ms stddev 71.997
progress: 15.0 s, 50757.7 tps, lat 15.730 ms stddev 78.222
progress: 20.0 s, 50797.3 tps, lat 15.736 ms stddev 79.296
progress: 25.0 s, 48485.3 tps, lat 16.432 ms stddev 82.720
progress: 30.0 s, 45202.1 tps, lat 17.733 ms stddev 88.554
progress: 35.0 s, 40035.8 tps, lat 19.343 ms stddev 97.787
progress: 40.0 s, 14240.1 tps, lat 47.625 ms stddev 265.465
progress: 45.0 s, 30150.6 tps, lat 31.140 ms stddev 196.114
progress: 50.0 s, 38975.0 tps, lat 20.543 ms stddev 104.281
progress: 55.0 s, 40573.9 tps, lat 19.770 ms stddev 99.487
progress: 60.0 s, 38361.1 tps, lat 20.693 ms stddev 103.141
progress: 65.0 s, 39793.3 tps, lat 20.216 ms stddev 101.080
progress: 70.0 s, 38387.9 tps, lat 20.886 ms stddev 104.482

(autovacuum caused the deadlock timeout to trigger,
 but PostgreSQL did not stall)

I believe the patch will positively affect other heavy workloads
as well, although I have not collected benchmarks.

`make check-world` passes when configured with `--enable-tap-tests
--enable-cassert`.


Excuse me, the corrected version is attached.

--
Sokolov Yura
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company

diff --git a/src/backend/storage/lmgr/deadlock.c b/src/backend/storage/lmgr/deadlock.c
index 5e49c78905..6211d41f68 100644
--- a/src/backend/storage/lmgr/deadlock.c
+++ b/src/backend/storage/lmgr/deadlock.c
@@ -214,7 +214,7 @@ InitDeadLockChecking(void)
  * and (b) we are typically invoked inside a signal handler.
  */
 DeadLockState
-DeadLockCheck(PGPROC *proc)
+DeadLockCheck(PGPROC *proc, bool readonly)
 {
 	int			i,
 				j;
@@ -245,6 +245,16 @@ DeadLockCheck(PGPROC *proc)
 		return DS_HARD_DEADLOCK;	/* cannot find a non-deadlocked state */
 	}
 
+	if (readonly)
+	{
+		if (nWaitOrders > 0)
+			return DS_HARD_DEADLOCK;
+		else if (blocking_autovacuum_proc != NULL)
+			return DS_BLOCKED_BY_AUTOVACUUM;
+		else
+			return DS_NO_DEADLOCK;
+	}
+
 	/* Apply any needed rearrangements of wait queues */
 	for (i = 0; i < nWaitOrders; i++)
 	{
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 

Re: [HACKERS] Postgresql gives error that role goes not exists while it exists

2017-10-03 Thread Tom Lane
Craig Ringer  writes:
> We could only emit a useful HINT if we actually went and looked in the
> relevant catalog for a different-cased version. Which is pretty
> expensive.

There is actually a hint somewhat like that for the specific case of
misspelled column names in DML statements:

postgres=# create table foo ("Iris" int);
CREATE TABLE
postgres=# select iris from foo;
ERROR:  column "iris" does not exist
LINE 1: select iris from foo;
   ^
HINT:  Perhaps you meant to reference the column "foo.Iris".

but that's a bit different because the set of column names to be
considered is very constrained --- only columns belonging to tables
listed in FROM.  The parser has already sucked in the column name
lists for those tables, so no additional catalog fetches are needed.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Incremental sort

2017-10-03 Thread Alexander Korotkov
On Tue, Oct 3, 2017 at 2:52 PM, Robert Haas  wrote:

> On Mon, Oct 2, 2017 at 12:37 PM, Alexander Korotkov
>  wrote:
> > I've applied patch on top of c12d570f and rerun the same benchmarks.
> > CSV-file with results is attached.  There are no dramatic changes.
> > There is still a minority of performance-regression cases, while the
> > majority of cases show improvement.
>
> Yes, I think these results look pretty good.  But are these times in
> seconds?  You might need to do some testing with bigger sorts.


Good point.  I'll rerun the benchmarks with a larger dataset size.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] list of credits for release notes

2017-10-03 Thread Alvaro Herrera
Michael Paquier wrote:

> Looking at this list, the first name is followed by the family name,
> so there are inconsistencies with some Japanese names:
> - Fujii Masao should be Masao Fujii.
> - KaiGai Kohei should be Kohei Kaigai.

We've already been here:
https://www.postgresql.org/message-id/20150613231826.gy133...@postgresql.org

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Possible SSL improvements for a newcomer to tackle

2017-10-03 Thread Nico Williams
On Tue, Oct 03, 2017 at 09:44:01AM -0400, Tom Lane wrote:
> Magnus Hagander  writes:
> > On Tue, Oct 3, 2017 at 6:33 AM, Tom Lane  wrote:
> >> I'm not an SSL expert, so insert appropriate grain of salt, but AIUI the
> >> question is what are you going to verify against?
> 
> > One way to do it would be to default to the "system global certificate
> > store", which is what most other SSL apps do. For example on a typical
> > debian/ubuntu, that'd be the store in /etc/ssl/certs/ca-certificates.crt.
> > Exactly where to find them would be distribution-specific though, and we
> > would need to actually add support for a second certificate store. But that
> > would probably be a useful feature in itself.
> 
> Maybe.  The impression I have is that it's very common for installations
> to use a locally-run CA to generate server and client certs.  I would not
> expect them to put such certs into /etc/ssl/certs.  But I suppose there
> might be cases where you would actually pay for a universally-valid cert
> for a DB server ...

No, that is very common.  However, in non-enterprise uses it's also very
common for those to be Web PKI certificates, which would be very
inappropriate for use in PG, so I agree that PG should not use the
system trust anchor set by default.  PG should just require that a trust
anchor set be configured.

Nico
-- 




Re: [HACKERS] Possible SSL improvements for a newcomer to tackle

2017-10-03 Thread Nico Williams
On Tue, Oct 03, 2017 at 11:45:24AM +0200, Adrien Nayrat wrote:
> On 10/03/2017 06:15 AM, Zeus Kronion wrote:
> > 2) I was surprised to learn the following from the docs:
> > 
> >> By default, PostgreSQL will not perform any verification of the server
> > certificate. This means that it is possible to spoof the server identity 
> > (for
> > example by modifying a DNS record or by taking over the server IP address)
> > without the client knowing. In order to prevent spoofing, SSL certificate
> > verification must be used.
> > 
> > Is there a technical reason to perform no verification by default? Wouldn't 
> > a
> > safer default be desirable?
> 
> If you want to verify server's certificate you should use DANE [1] + DNSSEC 
> [2]
> ? (I am not an SSL expert too)
> 
> If I understand correctly, you can store your certificate in a DNS record
> (TLSA). Then the client can check the certificate. You must trust your DNS
> server (protection against spoofing), that's why you have to use DNSSEC.

+1, but it's trickier than you might think.  I can connect you with
Viktor Dukhovni, who has implemented DANE for OpenSSL, and done yeoman's
work getting DANE for SMTP working.

Nico
-- 




Re: [HACKERS] SendRowDescriptionMessage() is slow for queries with a lot of columns

2017-10-03 Thread Robert Haas
On Tue, Oct 3, 2017 at 3:55 AM, Andres Freund  wrote:
> Attached is a revised version of this patchset.

I don't much like the functions with "_pre" affixed to their names.
It's not at all clear that "pre" means "preallocated"; it sounds more
like you're doing something ahead of time.  I wonder about maybe
calling these e.g. pq_writeint16, with "write" meaning "assume
preallocation" and "send" meaning "don't assume preallocation".  There
could be other ideas, too.

> I'd like to get some
> input on two points:
>
> 1) Does anybody have a better idea than the static buffer in
>SendRowDescriptionMessage()? That's not particularly pretty, but
>there's not really a convenient stringbuffer to use when called from
>exec_describe_portal_message(). We could instead create a local
>buffer for exec_describe_portal_message().
>
>    An alternative idea would be to have one reusable buffer created for
>each transaction command, but I'm not sure that's really better.

I don't have a better idea.

> 2) There's a lot of remaining pq_sendint() callers in other parts of the
>tree. If others are ok with that, I'd do a separate pass over them.
>I'd say that even after doing that, we should keep pq_sendint(),
>because a lot of extension code is using that.

I think we should change everything to the new style and I wouldn't
object to removing pq_sendint() either.  However, if we want to keep
it with a note that only extension code should use it, that's OK with
me, too.

> 3) The use of restrict, with a configure based fallback, is something
>we've not done before, but it's C99 and delivers significantly more
>efficient code. Any arguments against?

It's pretty unobvious why it helps here.  I think you should add
comments.  Also, unless I'm missing something, there's nothing to keep
pq_sendintXX_pre from causing an alignment fault except unbridled
optimism...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] list of credits for release notes

2017-10-03 Thread Fujii Masao
On Tue, Oct 3, 2017 at 9:57 PM, Tatsuo Ishii  wrote:
>> On Mon, Oct 2, 2017 at 5:09 PM, Tatsuo Ishii  wrote:
>>> Michael is correct.
>>>
>>> Sometimes people choose family name first in the emails.  However I am
>>> sure "Fujii" is the family name and "Masao" is the firstname.
>>
>> But I don't think that directly answers the question of how he would
>> prefer to be credited.
>
> Of course. It's different story.
>
>> Since I am American, I prefer to be credited
>> using the style "${FIRSTNAME} ${LASTNAME}", but the preferences of
>> someone from another country might be the same or different.  I don't
>> think we should presume that someone prefers something other than the
>> style in their email name unless they say so.
>
> My concern is that the list seems to implicitly assume that each name
> appears first name then last name. So people might misunderstand that
> "Fujii" is the first name and "Masao" is the last name. I don't know how
> he actually feels about that, but if I were him, I would feel
> uncomfortable. If the list explicitly states that we do not guarantee
> that the order of the last names and the first names is correct, then
> that kind of misunderstanding could be avoided.

Yeah, your concern might be right, but I prefer Fujii Masao,
i.e., family-name-first style, so I have used that name in my email
and past release notes so far.

Anyway, thanks for your kind consideration!

Regards,

-- 
Fujii Masao




Re: [HACKERS] Possible SSL improvements for a newcomer to tackle

2017-10-03 Thread Nico Williams
On Tue, Oct 03, 2017 at 12:33:00AM -0400, Tom Lane wrote:
> Zeus Kronion  writes:
> > 2) I was surprised to learn the following from the docs:
> 
> >> By default, PostgreSQL will not perform any verification of the server
> >> certificate.
> 
> > Is there a technical reason to perform no verification by default? Wouldn't
> > a safer default be desirable?
> 
> I'm not an SSL expert, so insert appropriate grain of salt, but AIUI the
> question is what are you going to verify against?  You need some local
> notion of which are your trusted root certificates before you can verify
> anything.  So to default to verification would be to default to failing to
> connect at all until user has created a ~/.postgresql/root.crt file with
> valid, relevant entries.  That seems like a nonstarter.
> 
> It's possible that we could adopt some policy like "if the root.crt file
> exists then default to verify" ... but that seems messy and unreliable,
> so I'm not sure it would really add any security.

You do always need trust anchors in order to verify a peer's
certificate.

Usually there will be a system-wide trust anchor set, though it may not
be appropriate for use with PG...

Still, it would be safer to refuse to connect until the lack of trust
anchors is rectified than to connect without warning about the inability
to verify a server.  By forcing the user (admins) to take action to
remediate the problem, the problem then gets fixed, whereas plowing on
creates an invisible (for many users) security problem.

Now, the use of channel binding from authentication methods like GSS-API
helps a fair bit, though mostly it helps by leveraging some other
authentication infrastructure that the users (admins) have set up --
they could easily have setup PKI too then.  With SCRAM there is much
less infrastructure (but also SCRAM requires very good passwords; a PAKE
would be much better).

Nico
-- 




Re: [HACKERS] [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM

2017-10-03 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera  writes:

> > Tom Lane wrote:
> >> I don't especially like the Asserts inside spinlocks, either.
> 
> > I didn't change these.  It doesn't look to me that these asserts are
> > worth very much as production code.
> 
> OK.  If we ever see these hit in the buildfarm I might argue for
> reconsidering, but without some evidence of that sort it's not
> worth much concern.

Sure.  I would be very surprised if buildfarm ever exercises this code.

> > I think the latch is only used locally.  Seems that it was only put in
> > shmem to avoid a separate variable ...
> 
> Hm, I'm strongly tempted to move it to a separate static variable then.
> That's not a bug fix, so maybe it only belongs in HEAD, but is there
> value in keeping the branches in sync in this code?  It sounded from
> your commit message like they were pretty different already :-(

Well, there were conflicts in almost every branch, but they were pretty
minor.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] 64-bit queryId?

2017-10-03 Thread Robert Haas
On Mon, Oct 2, 2017 at 8:07 PM, Alexander Korotkov
 wrote:
> +1,
> I see 3 options there:
> 1) Drop high-order bit, as you proposed.
> 2) Allow negative queryIds.
> 3) Implement unsigned 64-type.
>
> #1 causes minor loss of precision which looks rather insignificant in the
> given context.
> #2 might be rather unexpected for users who previously had non-negative
> queryIds.  Changing queryId from 32-bit to 64-bit itself might require some
> adaptation from monitoring software. But queryIds are user-visible, and
> negative queryIds would look rather illogical.
> #3 would be attacking a hard, long-term problem for insufficient reason.
> Thus, #1 looks like the most harmless solution.

I think we should just allow negative queryIds.  I mean, the hash
functions have behaved that way for a long time:

rhaas=# select hashtext('');
  hashtext
-------------
 -1477818771
(1 row)

It seems silly to me to throw away a perfectly good bit from the hash
value just because of some risk of minor user confusion.  I do like
Michael's suggestion of outputting hex-like text, but changing the
types of the user-visible output columns seems like a job for another
patch.  Similarly, if we were to adopt Andres's suggestions of a new
type or using numeric or Alexander's suggestion of implementing a
64-bit unsigned type, I think it should be a separate patch from this
one.

I would note that such a patch will actually create real
incompatibility -- extension upgrades might fail if there are
dependent views, for example -- whereas merely having query IDs start
to sometimes be negative only creates an incompatibility for people
who assumed that the int64 type wouldn't actually use its full range
of allowable values.  I don't deny that someone may have done that, of
course, but I wouldn't guess that it's a common thing... maybe I'm
wrong.

Meanwhile, updated patch with a fix for the typo pointed out by
Vladimir Sitnikov attached.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


64-bit-queryid-v2.patch
Description: Binary data



[HACKERS] Two pass CheckDeadlock in contentent case

2017-10-03 Thread Sokolov Yura

Good day, hackers.

Under a heavy workload, a process sometimes reaches the deadlock
timeout even if no real deadlock occurred. It is easily reproducible
with pg_advisory_xact_lock on the same value plus some time-consuming
operation (an update) and many clients.

When a backend reaches the deadlock timeout, it calls CheckDeadlock,
which locks all partitions of the lock hash in exclusive mode to
walk the wait graph and search for a deadlock.

If hundreds of backends reach this timeout trying to acquire an
advisory lock on the same value, it leads to a hard stall for many
seconds, because they all traverse the same huge lock graph under
an exclusive lock.
During this stall there is no possibility to do any meaningful
operations (no new transaction can begin).

The attached patch makes CheckDeadlock do two passes:
- the first pass takes LW_SHARED on the partitions of the lock hash.
  DeadLockCheck is called with the flag "readonly", so it doesn't
  modify anything.
- if a possible "soft" or "hard" deadlock is detected, i.e. if the
  lock graph may need modification, the partitions are relocked
  with LW_EXCLUSIVE and DeadLockCheck is called again.

This fixes the stall, because the backends walk the lock graph under
a shared lock and find that there is no real deadlock.

Test on 4 socket xeon machine:
pgbench_zipf -s 10 -c 800  -j 100 -M prepared -T 450 -f 
./ycsb_read_zipf.sql@50 -f ./ycsb_update_lock2_zipf.sql@50 -P 5


ycsb_read_zipf.sql:
\set i random_zipfian(1, 10 * :scale, 1.01)
SELECT abalance FROM pgbench_accounts WHERE aid = :i;
ycsb_update_lock2_zipf.sql:
\set i random_zipfian(1, 10 * :scale, 1.01)
select lock_and_update( :i );

CREATE OR REPLACE FUNCTION public.lock_and_update(i integer)
 RETURNS void
 LANGUAGE sql
AS $function$
SELECT pg_advisory_xact_lock( $1 );
UPDATE pgbench_accounts SET abalance = 1 WHERE aid = $1;
$function$

Without attached patch:

progress: 5.0 s, 45707.0 tps, lat 15.599 ms stddev 83.757
progress: 10.0 s, 51124.4 tps, lat 15.681 ms stddev 78.353
progress: 15.0 s, 52293.8 tps, lat 15.327 ms stddev 77.017
progress: 20.0 s, 51280.4 tps, lat 15.603 ms stddev 78.199
progress: 25.0 s, 47278.6 tps, lat 16.795 ms stddev 83.570
progress: 30.0 s, 41792.9 tps, lat 18.535 ms stddev 93.697
progress: 35.0 s, 12393.7 tps, lat 33.757 ms stddev 169.116
progress: 40.0 s, 0.0 tps, lat -nan ms stddev -nan
progress: 45.0 s, 0.0 tps, lat -nan ms stddev -nan
progress: 50.0 s, 1.2 tps, lat 2497.734 ms stddev 5393.166
progress: 55.0 s, 0.0 tps, lat -nan ms stddev -nan
progress: 60.0 s, 27357.9 tps, lat 160.622 ms stddev 1866.625
progress: 65.0 s, 38770.8 tps, lat 20.829 ms stddev 104.453
progress: 70.0 s, 40553.2 tps, lat 19.809 ms stddev 99.741

(autovacuum triggered the deadlock timeout,
 and therefore caused the stall)

Patched:

progress: 5.0 s, 56264.7 tps, lat 12.847 ms stddev 66.980
progress: 10.0 s, 55389.3 tps, lat 14.329 ms stddev 71.997
progress: 15.0 s, 50757.7 tps, lat 15.730 ms stddev 78.222
progress: 20.0 s, 50797.3 tps, lat 15.736 ms stddev 79.296
progress: 25.0 s, 48485.3 tps, lat 16.432 ms stddev 82.720
progress: 30.0 s, 45202.1 tps, lat 17.733 ms stddev 88.554
progress: 35.0 s, 40035.8 tps, lat 19.343 ms stddev 97.787
progress: 40.0 s, 14240.1 tps, lat 47.625 ms stddev 265.465
progress: 45.0 s, 30150.6 tps, lat 31.140 ms stddev 196.114
progress: 50.0 s, 38975.0 tps, lat 20.543 ms stddev 104.281
progress: 55.0 s, 40573.9 tps, lat 19.770 ms stddev 99.487
progress: 60.0 s, 38361.1 tps, lat 20.693 ms stddev 103.141
progress: 65.0 s, 39793.3 tps, lat 20.216 ms stddev 101.080
progress: 70.0 s, 38387.9 tps, lat 20.886 ms stddev 104.482

(autovacuum triggered the deadlock timeout,
 but PostgreSQL did not stall)

I believe the patch will positively affect other heavy workloads
as well, although I have not collected benchmarks.

`make check-world` passes when configured with `--enable-tap-tests
--enable-cassert`.


--
Sokolov Yura
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company

diff --git a/src/backend/storage/lmgr/deadlock.c b/src/backend/storage/lmgr/deadlock.c
index 5e49c78905..2d6a282d0a 100644
--- a/src/backend/storage/lmgr/deadlock.c
+++ b/src/backend/storage/lmgr/deadlock.c
@@ -214,7 +214,7 @@ InitDeadLockChecking(void)
  * and (b) we are typically invoked inside a signal handler.
  */
 DeadLockState
-DeadLockCheck(PGPROC *proc)
+DeadLockCheck(PGPROC *proc, bool readonly)
 {
 	int			i,
 				j;
@@ -245,6 +245,16 @@ DeadLockCheck(PGPROC *proc)
 		return DS_HARD_DEADLOCK;	/* cannot find a non-deadlocked state */
 	}
 
+	if (readonly)
+	{
+		if (nWaitOrders > 0)
+			return DS_SOFT_DEADLOCK;
+		else if (blocking_autovacuum_proc != NULL)
+			return DS_BLOCKED_BY_AUTOVACUUM;
+		else
+			return DS_NO_DEADLOCK;
+	}
+
 	/* Apply any needed rearrangements of wait queues */
 	for (i = 0; i < nWaitOrders; i++)
 	{
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 5f6727d501..a95bd99f8f 100644
--- a/src/backend/storage/lmgr/proc.c
+++ 

Re: [HACKERS] [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM

2017-10-03 Thread Tom Lane
Alvaro Herrera  writes:
> Fixed the pstrdup problem by replacing with strlcpy() to stack-allocated
> variables (rather than palloc + memcpy as proposed in Michael's patch).

+1

> Tom Lane wrote:
>> I don't especially like the Asserts inside spinlocks, either.

> I didn't change these.  It doesn't look to me that these asserts are
> worth very much as production code.

OK.  If we ever see these hit in the buildfarm I might argue for
reconsidering, but without some evidence of that sort it's not
worth much concern.

>> I'm also rather befuddled by the fact that this code sets and clears
>> walrcv->latch outside the critical sections.  If that field is used
>> by any other process, surely that's completely unsafe.  If it isn't,
>> why is it being kept in shared memory?

> I think the latch is only used locally.  Seems that it was only put in
> shmem to avoid a separate variable ...

Hm, I'm strongly tempted to move it to a separate static variable then.
That's not a bug fix, so maybe it only belongs in HEAD, but is there
value in keeping the branches in sync in this code?  It sounded from
your commit message like they were pretty different already :-(

regards, tom lane




Re: [HACKERS] Postgresql gives error that role goes not exists while it exists

2017-10-03 Thread Craig Ringer
On 3 October 2017 at 20:47, Euler Taveira  wrote:
>
> 2017-10-03 5:49 GMT-03:00 Nick Dro :
> > Can someone assists with the issue posted on StackOverflow?
> >
> > https://stackoverflow.com/questions/46540537/postgresql-9-3-creation-of-group-role-causes-permission-problems
> >
> >
> > Creation of a new Group Role causes PostgreSQL to think that Login roles do
> > not exist. I think it's a bug? Or at least a wrong error message.
> >
> I'm not sure. I bet a dime that the role was created as "Iris" and you
> are trying to assign "iris" (they are different). If you list the
> roles, we can confirm that.
>

... and the reason we don't emit a HINT here is that the exact same
HINT would apply in any context involving identifiers, so we'd just
flood the logs. It'd be spurious in most cases.

We could only emit a useful HINT if we actually went and looked in the
relevant catalog for a different-cased version. Which is pretty
expensive.




Re: [HACKERS] [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM

2017-10-03 Thread Alvaro Herrera
Fixed the pstrdup problem by replacing with strlcpy() to stack-allocated
variables (rather than palloc + memcpy as proposed in Michael's patch).

About the other problems:

Tom Lane wrote:

> I took a quick look through walreceiver.c, and there are a couple of
> obvious problems of the same ilk in WalReceiverMain, which is doing this:
> 
>   walrcv->lastMsgSendTime = walrcv->lastMsgReceiptTime = 
> walrcv->latestWalEndTime = GetCurrentTimestamp();
> 
> (ie, a potential kernel call) inside a spinlock.  But there seems no
> real problem with just collecting the timestamp before we enter that
> critical section.

Done that way.

> I also don't especially like the fact that just above there it reaches
> elog(PANIC) with the lock still held, though at least that's a case that
> should never happen.

Fixed by releasing spinlock just before elog.

> Further down, it's doing a pfree() inside the spinlock, apparently
> for no other reason than to save one "if (tmp_conninfo)".

Fixed.

> I don't especially like the Asserts inside spinlocks, either.  Personally,
> I think if those conditions are worth testing then they're worth testing
> for real (in production).  Variables that are manipulated by multiple
> processes are way more likely to assume unexpected states than local
> variables.

I didn't change these.  It doesn't look to me that these asserts are
worth very much as production code.

> I'm also rather befuddled by the fact that this code sets and clears
> walrcv->latch outside the critical sections.  If that field is used
> by any other process, surely that's completely unsafe.  If it isn't,
> why is it being kept in shared memory?

I think the latch is only used locally.  Seems that it was only put in
shmem to avoid a separate variable ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [PATCH] Improve geometric types

2017-10-03 Thread Tom Lane
Emre Hasegeli  writes:
>> Uh, I thought pg_hypot() was still needed on our oldest supported
>> platforms.  Or is hypot() already supported there?

> What is our oldest supported platform?

Our normal reference for such questions is SUS v2 (POSIX 1997):
http://pubs.opengroup.org/onlinepubs/007908799/

I looked into that, and noted that it does specify hypot(), but
it has different rules for edge conditions:

If the result would cause overflow, HUGE_VAL is returned and errno
may be set to [ERANGE].

If x or y is NaN, NaN is returned, and errno may be set to [EDOM].

If the correct result would cause underflow, 0 is returned and
errno may be set to [ERANGE].

whereas POSIX 2008 saith:

If the correct value would cause overflow, a range error shall
occur and hypot(), hypotf(), and hypotl() shall return the value
of the macro HUGE_VAL, HUGE_VALF, and HUGE_VALL, respectively.

[MX] If x or y is ±Inf, +Inf shall be returned (even if one of x
or y is NaN).

If x or y is NaN, and the other is not ±Inf, a NaN shall be
returned.

[MXX] If both arguments are subnormal and the correct result is
subnormal, a range error may occur and the correct result shall
be returned.

In short, the reason that the existing comment makes a point of the
Inf-and-NaN behavior is that the standard changed somewhere along the
line.  While I believe we could get away with assuming that hypot()
exists everywhere, it's much less clear that we could rely on the result
being Inf and not NaN in this case.

Now, it's also not clear that anything in PG really cares.  But if we
do care, I think we should keep pg_hypot() ... and maybe clarify the
comment a bit more.

regards, tom lane




Re: [HACKERS] Possible SSL improvements for a newcomer to tackle

2017-10-03 Thread Magnus Hagander
On Tue, Oct 3, 2017 at 3:51 PM, Stephen Frost  wrote:

> Tom,
>
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
> > Magnus Hagander  writes:
> > > On Tue, Oct 3, 2017 at 6:33 AM, Tom Lane  wrote:
> > >> I'm not an SSL expert, so insert appropriate grain of salt, but AIUI
> the
> > >> question is what are you going to verify against?
> >
> > > One way to do it would be to default to the "system global certificate
> > > store", which is what most other SSL apps do. For example on a typical
> > > debian/ubuntu, that'd be the store in /etc/ssl/certs/ca-
> certificates.crt.
> > > Exactly where to find them would be distribution-specific though, and
> we
> > > would need to actually add support for a second certificate store. But
> that
> > > would probably be a useful feature in itself.
> >
> > Maybe.  The impression I have is that it's very common for installations
> > to use a locally-run CA to generate server and client certs.  I would not
> > expect them to put such certs into /etc/ssl/certs.  But I suppose there
> > might be cases where you would actually pay for a universally-valid cert
> > for a DB server ...
>
> In many larger enterprises, they actually deploy systems with their own
> CA installed into the system global certificate store (possibly removing
> certain other CAs from that set too, and distributing their own version
> of the relevant package that maintains the CA set).
>

I think this is also something that's seen more now than it used to be.
Back when we initially did the SSL support, few people actually did this.
That's not just for this scenario of course -- larger enterprises are much
more likely to *have* proper PKI management today, and if they do then it
will include the Linux boxes.

Bottom line: things in this area have changed greatly in the past 10-15 years.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] 64-bit queryId?

2017-10-03 Thread Alexander Korotkov
On Tue, Oct 3, 2017 at 5:55 PM, Robert Haas  wrote:

> On Mon, Oct 2, 2017 at 8:07 PM, Alexander Korotkov
>  wrote:
> > +1,
> > I see 3 options there:
> > 1) Drop high-order bit, as you proposed.
> > 2) Allow negative queryIds.
> > 3) Implement unsigned 64-type.
> >
> > #1 causes minor loss of precision which looks rather insignificant in
> > the given context.
> > #2 might be rather unexpected for users who previously had non-negative
> > queryIds.  Changing queryId from 32-bit to 64-bit itself might require
> > some adaptation from monitoring software. But queryIds are user-visible,
> > and negative queryIds would look rather illogical.
> > #3 would be attacking a hard, long-term problem for insufficient reason.
> > Thus, #1 looks like the most harmless solution.
>
> I think we should just allow negative queryIds.  I mean, the hash
> functions have behaved that way for a long time:
>
> rhaas=# select hashtext('');
>   hashtext
> -------------
>  -1477818771
> (1 row)
>
> It seems silly to me to throw away a perfectly good bit from the hash
> value just because of some risk of minor user confusion.  I do like
> Michael's suggestion of outputting hex-like text, but changing the
> types of the user-visible output columns seems like a job for another
> patch.  Similarly, if we were to adopt Andres's suggestions of a new
> type or using numeric or Alexander's suggestion of implementing a
> 64-bit unsigned type, I think it should be a separate patch from this
> one.
>
> I would note that such a patch will actually create real
> incompatibility -- extension upgrades might fail if there are
> dependent views, for example -- whereas merely having query IDs start
> to sometimes be negative only creates an incompatibility for people
> who assumed that the int64 type wouldn't actually use its full range
> of allowable values.  I don't deny that someone may have done that, of
> course, but I wouldn't guess that it's a common thing... maybe I'm
> wrong.
>

BTW, you didn't comment on Tom's suggestion of dropping the high-order
bit, which trades minor user confusion for a minor loss of precision.
TBH, for me it's not so important whether we allow negative queryIds or
drop the high-order bit.  It would anyway be very good to have 64-(or
63-)bit queryIds committed.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] SendRowDescriptionMessage() is slow for queries with a lot of columns

2017-10-03 Thread Andres Freund
On 2017-10-03 11:06:08 -0400, Robert Haas wrote:
> On Tue, Oct 3, 2017 at 3:55 AM, Andres Freund  wrote:
> > Attached is a revised version of this patchset.
> 
> I don't much like the functions with "_pre" affixed to their names.
> It's not at all clear that "pre" means "preallocated"; it sounds more
> like you're doing something ahead of time.  I wonder about maybe
> calling these e.g. pq_writeint16, with "write" meaning "assume
> preallocation" and "send" meaning "don't assume preallocation".  There
> could be other ideas, too.

I can live with write, although I don't think it jibes well with
the pq_send* naming.


> > 3) The use of restrict, with a configure based fallback, is something
> >we've not done before, but it's C99 and delivers significantly more
> >efficient code. Any arguments against?


> Also, unless I'm missing something, there's nothing to keep
> pq_sendintXX_pre from causing an alignment fault except unbridled
> optimism...

Fair argument, I'll replace it back with a fixed-length memcpy. At least
my gcc optimizes that away again - I ended up with the plain assignment
while debugging the above, due to the lack of restrict.


> It's pretty unobvious why it helps here.  I think you should add
> comments.

Will. I'd stared at this long enough that I thought it'd be obvious. But
it took me a couple hours to get there, so ... yes.   The reason it's
needed here is that given:

static inline void
pq_sendint8_pre(StringInfo restrict buf, int8 i)
{
	int32		ni = pg_hton32(i);

	Assert(buf->len + sizeof(i) <= buf->maxlen);
	memcpy((char * restrict) (buf->data + buf->len), &ni, sizeof(i));
	buf->len += sizeof(i);
}

without the restrict the compiler has no way to know that buf, buf->len,
and *(buf->data + x) do not overlap. Therefore buf->len cannot be kept in
a register across subsequent pq_sendint*_pre calls, but has to be stored
and loaded before each of the memcpy calls.  There are two reasons for
that:

- We compile -fno-strict-aliasing. That prevents the compiler from doing
  type based inference that buf and buf->len do not overlap with
  buf->data
- Even with type based strict aliasing, using char * type data and
  memcpy prevents that type of analysis - but restrict promises that
  there's no overlap - which we know there isn't.

Makes sense?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM

2017-10-03 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> Alvaro Herrera  writes:
>>> I think the latch is only used locally.  Seems that it was only put in
>>> shmem to avoid a separate variable ...

>> Hm, I'm strongly tempted to move it to a separate static variable then.

Oh, wait, look at

/*
 * Wake up the walreceiver main loop.
 *
 * This is called by the startup process whenever interesting xlog records
 * are applied, so that walreceiver can check if it needs to send an apply
 * notification back to the master which may be waiting in a COMMIT with
 * synchronous_commit = remote_apply.
 */
void
WalRcvForceReply(void)
{
WalRcv->force_reply = true;
if (WalRcv->latch)
SetLatch(WalRcv->latch);
}

So that's trouble waiting to happen, for sure.  At the very least we
need to do a single fetch of WalRcv->latch, not two.  I wonder whether
even that is sufficient, though: this coding requires an atomic fetch of
a pointer, which is something we generally don't assume to be safe.

I'm inclined to think that it'd be a good idea to move the set and
clear of the latch field into the nearby spinlock critical sections,
and then change WalRcvForceReply to look like

void
WalRcvForceReply(void)
{
	Latch	   *latch;

	WalRcv->force_reply = true;
	/* fetching the latch pointer might not be atomic, so use spinlock */
	SpinLockAcquire(&WalRcv->mutex);
	latch = WalRcv->latch;
	SpinLockRelease(&WalRcv->mutex);
	if (latch)
		SetLatch(latch);
}

This still leaves us with a hazard of applying SetLatch to the latch
of a PGPROC that is no longer the walreceiver's, but I think that's
okay (and we have the same problem elsewhere).  At worst it leads to
an extra wakeup of the next process using that PGPROC.

I'm also thinking that the force_reply flag needs to be sig_atomic_t,
not bool, because bool store/fetch isn't necessarily atomic on all
platforms.

regards, tom lane




Re: [HACKERS] list of credits for release notes

2017-10-03 Thread Dang Minh Huong
2017/09/28 3:47、Peter Eisentraut  のメッセージ:

> At the PGCon Developer Meeting it was agreed[0] to add a list of credits
> to the release notes, including everyone who was mentioned in a commit
> message.  I have now completed that list.
> 
> Attached is the proposed documentation commit as well as the raw list.

Thanks! 
Sorry but please change "Huong Dangminh"(my name) to "Dang Minh Huong".

---
Dang Minh Huong



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-03 Thread Alvaro Herrera
So here's my attempt at an explanation for what is going on.  At one
point, we have this:

select lp, lp_flags, t_xmin, t_xmax, t_ctid, to_hex(t_infomask) as infomask,
to_hex(t_infomask2) as infomask2
from heap_page_items(get_raw_page('t', 0));
 lp | lp_flags | t_xmin | t_xmax | t_ctid | infomask | infomask2 
----+----------+--------+--------+--------+----------+-----------
  1 |1 |  2 |  0 | (0,1)  | 902  | 3
  2 |0 ||||  | 
  3 |1 |  2 |  19928 | (0,4)  | 3142 | c003
  4 |1 |  14662 |  19929 | (0,5)  | 3142 | c003
  5 |1 |  14663 |  19931 | (0,6)  | 3142 | c003
  6 |1 |  14664 |  19933 | (0,7)  | 3142 | c003
  7 |1 |  14665 |  0 | (0,7)  | 2902 | 8003
(7 filas)

which shows a HOT-update chain, where the t_xmax are multixacts.  Then a
vacuum freeze comes, and because the multixacts are below the freeze
horizon for multixacts, we get this:

select lp, lp_flags, t_xmin, t_xmax, t_ctid, to_hex(t_infomask) as infomask,
to_hex(t_infomask2) as infomask2
from heap_page_items(get_raw_page('t', 0));
 lp | lp_flags | t_xmin | t_xmax | t_ctid | infomask | infomask2 
----+----------+--------+--------+--------+----------+-----------
  1 |1 |  2 |  0 | (0,1)  | 902  | 3
  2 |0 ||||  | 
  3 |1 |  2 |  14662 | (0,4)  | 2502 | c003
  4 |1 |  2 |  14663 | (0,5)  | 2502 | c003
  5 |1 |  2 |  14664 | (0,6)  | 2502 | c003
  6 |1 |  2 |  14665 | (0,7)  | 2502 | c003
  7 |1 |  2 |  0 | (0,7)  | 2902 | 8003
(7 filas)

where the xmin values have all been frozen, and the xmax values are now
regular Xids.  I think the HOT code that walks the chain fails to detect
these as chains, because the xmin values no longer match the xmax
values.  I modified the multixact freeze code, so that whenever the
update Xid is below the cutoff Xid, it's set to FrozenTransactionId,
since keeping the other value is invalid anyway (even though we have set
the HEAP_XMAX_COMMITTED flag).  But that still doesn't fix the problem;
as far as I can see, vacuum removes the root of the chain, not yet sure
why, and then things are just as corrupted as before.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] 64-bit queryId?

2017-10-03 Thread Robert Haas
On Tue, Oct 3, 2017 at 12:04 PM, Alexander Korotkov
 wrote:
> BTW, you didn't comment on Tom's suggestion about dropping the high-order
> bit, which trades minor user confusion for a minor loss of precision.

Oh, I thought I did comment on that.  I favor allowing negative IDs
rather than minor loss of precision.

> TBH, for me it's not so important whether we allow negative queryIds or
> drop the high-order bit.  It would anyway be very good to have 64-(or
> 63-)bit queryIds committed.

Great.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM

2017-10-03 Thread Tom Lane
I wrote:
> So that's trouble waiting to happen, for sure.  At the very least we
> need to do a single fetch of WalRcv->latch, not two.  I wonder whether
> even that is sufficient, though: this coding requires an atomic fetch of
> a pointer, which is something we generally don't assume to be safe.
>
> I'm inclined to think that it'd be a good idea to move the set and
> clear of the latch field into the nearby spinlock critical sections,
> and then change WalRcvForceReply to look like ...

Concretely, as per the attached.

I reordered the WalRcvData fields to show that the "latch" field is now
treated as protected by the spinlock.  In the back branches, we shouldn't
do that, just in case some external code is touching the mutex field.

regards, tom lane

diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 57c305d..1bf9be6 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -256,13 +256,14 @@ WalReceiverMain(void)
 	walrcv->lastMsgSendTime =
 		walrcv->lastMsgReceiptTime = walrcv->latestWalEndTime = now;
 
+	/* Report the latch to use to awaken this process */
+	walrcv->latch = &MyProc->procLatch;
+
 	SpinLockRelease(&walrcv->mutex);
 
 	/* Arrange to clean up at walreceiver exit */
 	on_shmem_exit(WalRcvDie, 0);
 
-	walrcv->latch = &MyProc->procLatch;
-
 	/* Properly accept or ignore signals the postmaster might send us */
 	pqsignal(SIGHUP, WalRcvSigHupHandler);	/* set flag to read config file */
 	pqsignal(SIGINT, SIG_IGN);
@@ -777,8 +778,7 @@ WalRcvDie(int code, Datum arg)
 	/* Ensure that all WAL records received are flushed to disk */
 	XLogWalRcvFlush(true);
 
-	walrcv->latch = NULL;
-
+	/* Mark ourselves inactive in shared memory */
 	SpinLockAcquire(&walrcv->mutex);
 	Assert(walrcv->walRcvState == WALRCV_STREAMING ||
 		   walrcv->walRcvState == WALRCV_RESTARTING ||
@@ -789,6 +789,7 @@ WalRcvDie(int code, Datum arg)
 	walrcv->walRcvState = WALRCV_STOPPED;
 	walrcv->pid = 0;
 	walrcv->ready_to_display = false;
+	walrcv->latch = NULL;
 	SpinLockRelease(&walrcv->mutex);
 
 	/* Terminate the connection gracefully. */
@@ -1344,9 +1345,15 @@ ProcessWalSndrMessage(XLogRecPtr walEnd, TimestampTz sendTime)
 void
 WalRcvForceReply(void)
 {
+	Latch	   *latch;
+
 	WalRcv->force_reply = true;
-	if (WalRcv->latch)
-		SetLatch(WalRcv->latch);
+	/* fetching the latch pointer might not be atomic, so use spinlock */
+	SpinLockAcquire(&WalRcv->mutex);
+	latch = WalRcv->latch;
+	SpinLockRelease(&WalRcv->mutex);
+	if (latch)
+		SetLatch(latch);
 }
 
 /*
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 78f8693..b1f28d0 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -226,6 +226,7 @@ RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr, const char *conninfo,
 	WalRcvData *walrcv = WalRcv;
 	bool		launch = false;
 	pg_time_t	now = (pg_time_t) time(NULL);
+	Latch	   *latch;
 
 	/*
 	 * We always start at the beginning of the segment. That prevents a broken
@@ -274,12 +275,14 @@ RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr, const char *conninfo,
 	walrcv->receiveStart = recptr;
 	walrcv->receiveStartTLI = tli;
 
+	latch = walrcv->latch;
+
 	SpinLockRelease(&walrcv->mutex);
 
 	if (launch)
 		SendPostmasterSignal(PMSIGNAL_START_WALRECEIVER);
-	else if (walrcv->latch)
-		SetLatch(walrcv->latch);
+	else if (latch)
+		SetLatch(latch);
 }
 
 /*
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index 9a8b2e2..e58fc49 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -117,14 +117,6 @@ typedef struct
 	/* set true once conninfo is ready to display (obfuscated pwds etc) */
 	bool		ready_to_display;
 
-	slock_t		mutex;			/* locks shared variables shown above */
-
-	/*
-	 * force walreceiver reply?  This doesn't need to be locked; memory
-	 * barriers for ordering are sufficient.
-	 */
-	bool		force_reply;
-
 	/*
 	 * Latch used by startup process to wake up walreceiver after telling it
 	 * where to start streaming (after setting receiveStart and
@@ -133,6 +125,15 @@ typedef struct
 	 * normally mapped to procLatch when walreceiver is running.
 	 */
 	Latch	   *latch;
+
+	slock_t		mutex;			/* locks shared variables shown above */
+
+	/*
+	 * force walreceiver reply?  This doesn't need to be locked; memory
+	 * barriers for ordering are sufficient.  But we do need atomic fetch and
+	 * store semantics, so use sig_atomic_t.
+	 */
+	sig_atomic_t force_reply;	/* used as a bool */
 } WalRcvData;
 
 extern WalRcvData *WalRcv;



Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-03 Thread Peter Geoghegan
On Tue, Oct 3, 2017 at 9:48 AM, Alvaro Herrera  wrote:
> which shows a HOT-update chain, where the t_xmax are multixacts.  Then a
> vacuum freeze comes, and because the multixacts are below the freeze
> horizon for multixacts, we get this:
>
> select lp, lp_flags, t_xmin, t_xmax, t_ctid, to_hex(t_infomask) as infomask,
> to_hex(t_infomask2) as infomask2
> from heap_page_items(get_raw_page('t', 0));
>  lp | lp_flags | t_xmin | t_xmax | t_ctid | infomask | infomask2
> ----+----------+--------+--------+--------+----------+-----------
>   1 |1 |  2 |  0 | (0,1)  | 902  | 3
>   2 |0 ||||  |
>   3 |1 |  2 |  14662 | (0,4)  | 2502 | c003
>   4 |1 |  2 |  14663 | (0,5)  | 2502 | c003
>   5 |1 |  2 |  14664 | (0,6)  | 2502 | c003
>   6 |1 |  2 |  14665 | (0,7)  | 2502 | c003
>   7 |1 |  2 |  0 | (0,7)  | 2902 | 8003
> (7 filas)
>
> where the xmin values have all been frozen, and the xmax values are now
> regular Xids.

I thought that we no longer store FrozenTransactionId (xid 2) as our
"raw" xmin while freezing, and yet that's what we see here. It looks
like pageinspect is looking at raw xmin (it's calling
HeapTupleHeaderGetRawXmin(), not HeapTupleHeaderGetXmin()). What's the
deal with that? Shouldn't the original xmin be preserved for forensic
analysis, following commit 37484ad2?

I guess what you mean is that this is what you see having modified the
code to actually store FrozenTransactionId as xmin once more, in an
effort to fix this?

-- 
Peter Geoghegan




Re: [HACKERS] Possible SSL improvements for a newcomer to tackle

2017-10-03 Thread Adrien Nayrat
On 10/03/2017 05:47 PM, Nico Williams wrote:
> +1, but it's trickier than you might think.  I can connect you with
> Viktor Dukhovni, who has implemented DANE for OpenSSL, and done yeoman's
> work getting DANE for SMTP working.

I really appreciate that, but I know it is very tricky :). I don't claim I
can work on this feature myself; I really don't have sufficient knowledge :/.

-- 
Adrien NAYRAT





Re: [HACKERS] [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM

2017-10-03 Thread Tom Lane
I wrote:
>> So that's trouble waiting to happen, for sure.  At the very least we
>> need to do a single fetch of WalRcv->latch, not two.  I wonder whether
>> even that is sufficient, though: this coding requires an atomic fetch of
>> a pointer, which is something we generally don't assume to be safe.

BTW, I had supposed that this bug was of long standing, but actually it's
new in v10, dating to 597a87ccc9a6fa8af7f3cf280b1e24e41807d555.  Before
that walreceiver start/stop just changed the owner of a long-lived shared
latch, and there was no question of stale pointers.

I considered reverting that decision, but the reason for it seems to have
been to let libpqwalreceiver.c manipulate MyProc->procLatch rather than
having to know about a custom latch.  That's probably a sufficient reason
to justify some squishiness in the wakeup logic.  Still, we might want to
revisit it if we find any other problems here.

regards, tom lane




Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

2017-10-03 Thread Andrew Dunstan


On 09/27/2017 02:52 PM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> At this stage on reflection I agree it should be pulled :-(
> That seems to be the consensus, so I'll go make it happen.
>
>> I'm not happy about the idea of marking an input function as not
>> parallel safe, certainly not without a good deal of thought and
>> discussion that we don't have time for this cycle.
> I think the way forward is to do what we had as of HEAD (984c92074),
> but add the ability to transmit the blacklist table to parallel
> workers.  Since we expect the blacklist table would be empty most of
> the time, this should be close to no overhead in practice.  I concur
> that the idea of marking the relevant functions parallel-restricted is
> probably not as safe a fix as I originally thought, and it's not a
> very desirable restriction even if it did fix the problem.
>
>   


Do you have any suggestion as to how we should transmit the blacklist to
parallel workers?

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





[HACKERS] Re: [PATCH] BUG #13416: Postgres >= 9.3 doesn't use optimized shared memory on Solaris anymore

2017-10-03 Thread Sean Chittenden
Hello.  We identified the same problem.  Sam Gwydir and Josh Clulow were able
to put together the appropriate fix afterward.

The breakage in src/backend/port/sysv_shmem.c and
src/include/storage/dsm_impl.h should be backported to all supported versions
(the regression was introduced between the 9.2 and 9.3 branches).

The only real question remaining is: do we want to change the default behavior, 
as detected by initdb(1), to use ISM on Illumos/Solaris derived systems?  Given 
the memory bloat experienced when using POSIX shared memory (potentially very 
significant for systems with larger shared_buffers), we think it's probably 
prudent to change the default from dynamic_shared_memory_type=posix to sysv.  
Unfortunately I think it's not worth changing the default behavior of initdb(1) 
in the current supported branches, but it should be changed for 10.
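For concreteness, the proposed default on those systems would correspond to
a postgresql.conf setting like the following (the GUC name is real; making
it the initdb default on Illumos/Solaris is the proposal here):

```
# postgresql.conf -- proposed default on Illumos/Solaris-derived systems
dynamic_shared_memory_type = sysv    # instead of 'posix'
```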

Please let us know if there are any questions.  -sc


--
Sean Chittenden
se...@joyent.com


ism.patch
Description: Binary data



Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-10-03 Thread Bossart, Nathan
On 10/3/17, 5:59 PM, "Tom Lane"  wrote:
> I thought it would be a good idea to get this done before walking away
> from the commitfest and letting all this info get swapped out of my
> head.  So I've reviewed and pushed this.

Thanks!

> I took out most of the infrastructure you'd put in for constructing
> RangeVars for tables not explicitly named on the command line.  It
> was buggy (eg you can't assume a relcache entry will stick around)
> and I don't believe it's necessary.  I don't think that warnings
> should be issued for any tables not explicitly named.
>
> In any case, though, the extent to which we should add more warning
> or log output seems like a fit topic for a new thread and a separate
> patch.  Let's call this one done.

I'll look into submitting that to the next commitfest.

Nathan




Re: [HACKERS] Issues with logical replication

2017-10-03 Thread Petr Jelinek
On 02/10/17 18:59, Petr Jelinek wrote:
>>
>> Now fix the trigger function:
>> CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER AS $$
>> BEGIN
>>   RETURN NEW;
>> END $$ LANGUAGE plpgsql;
>>
>> And manually perform at master two updates inside one transaction:
>>
>> postgres=# begin;
>> BEGIN
>> postgres=# update pgbench_accounts set abalance=abalance+1 where aid=26;
>> UPDATE 1
>> postgres=# update pgbench_accounts set abalance=abalance-1 where aid=26;
>> UPDATE 1
>> postgres=# commit;
>> 
>>
>> and in replica log we see:
>> 2017-10-02 18:40:26.094 MSK [2954] LOG:  logical replication apply
>> worker for subscription "sub" has started
>> 2017-10-02 18:40:26.101 MSK [2954] ERROR:  attempted to lock invisible
>> tuple
>> 2017-10-02 18:40:26.102 MSK [2882] LOG:  worker process: logical
>> replication worker for subscription 16403 (PID 2954) exited with exit
>> code 1
>>
>> Error happens in trigger.c:
>>
>> #3  0x0069bddb in GetTupleForTrigger (estate=0x2e36b50,
>> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tid=0x2dd08ac,
>>     lockmode=LockTupleNoKeyExclusive, newSlot=0x7ffc4420ec40) at
>> trigger.c:3103
>> #4  0x0069b259 in ExecBRUpdateTriggers (estate=0x2e36b50,
>> epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tupleid=0x2dd08ac,
>>     fdw_trigtuple=0x0, slot=0x2dd0240) at trigger.c:2748
>> #5  0x006d2395 in ExecSimpleRelationUpdate (estate=0x2e36b50,
>> epqstate=0x7ffc4420eda0, searchslot=0x2dd0358, slot=0x2dd0240)
>>     at execReplication.c:461
>> #6  0x00820894 in apply_handle_update (s=0x7ffc442163b0) at
>> worker.c:736
> 
> We have locked the same tuple in RelationFindReplTupleByIndex() just
> before this gets called and didn't get the same error. I guess we do
> something wrong with snapshots. Will need to investigate more.
> 

Okay, so it's not the snapshot. It's the fact that we don't set
es_output_cid in the replication worker, which GetTupleForTrigger uses
when locking the tuple. Attached one-liner fixes it.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index b58d2e1008..d6a5cf7088 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -204,6 +204,8 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel)
 	estate->es_num_result_relations = 1;
 	estate->es_result_relation_info = resultRelInfo;
 
+	estate->es_output_cid = GetCurrentCommandId(true);
+
 	/* Triggers might need a slot */
 	if (resultRelInfo->ri_TrigDesc)
 		estate->es_trig_tuple_slot = ExecInitExtraTupleSlot(estate);


