Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY

2017-02-02 Thread Pavan Deolasee
On Thu, Feb 2, 2017 at 6:14 PM, Amit Kapila  wrote:

>
>   /*
> + * If the index list was invalidated, we better also invalidate the index
> + * attribute list (which should automatically invalidate other attributes
> + * such as primary key and replica identity)
> + */
>
> + relation->rd_indexattr = NULL;
> +
>
> I think setting it directly to NULL will leak memory.
>

Good catch, thanks. Will fix.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)

2017-02-02 Thread Corey Huinker
On Fri, Feb 3, 2017 at 12:57 AM, Fabien COELHO  wrote:

>
> Hello Corey,
>
> Glad you like the barking. I'm happy to let the prompt issue rest for now,
>> we can always add it later.
>>
>> If we DID want it, however, I don't think it'll be hard to add a special
>> prompt (Thinking %T or %Y because they both look like branches, heh),
>>
>
> Ah! :-) T may stand for Tree, but Y looks a little bit more like branches.
> Maybe Y for Yew.
>

Well played. The %Y prompt can be a separate patch.

New patch. Highlights:
- rebased to master as of ten minutes ago
- Interactive barking on branching state changes and on commands typed while
in an inactive state
- Help text. New block in help text called "Conditionals"
- SendQuery calls in mainloop.c are all encapsulated in send_query() to
ensure the same if-active and if-interactive logic is used
- Exactly one perl TAP test, testing ON_ERROR_STOP. I predict more will be
needed, but I'm not sure what coverage is desired
- I also predict that my TAP test style is pathetic
- regression tests now have comments to explain purpose
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index ae58708..c0ba4c4 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2035,6 +2035,67 @@ hello 10
 
   
 
+  
+\if expr
+\elif expr
+\else
+\endif
+
+
+This group of commands implements nestable conditional blocks, like
+this:
+
+
+SELECT
+EXISTS(SELECT 1 FROM customer WHERE customer_id = 123) as is_customer,
+EXISTS(SELECT 1 FROM employee WHERE employee_id = 456) as is_employee
+\gset
+\if :is_customer
+SELECT * FROM customer WHERE customer_id = 123;
+\elif :is_employee
+\echo 'is not a customer but is an employee'
+SELECT * FROM employee WHERE employee_id = 456;
+\else
+\if yes
+\echo 'not a customer or employee'
+\else
+\echo 'this should never print'
+\endif
+\endif
+
+
+Conditional blocks must begin with an \if and end
+with an \endif, and the pairs must be found in
+the same source file. If EOF is reached on the main file or an
+\include-ed file before all local
+\if-\endif pairs are matched, then
+psql will raise an error.
+
+
+A conditional block can have any number of 
+\elif clauses, which may optionally be followed by a
+single \else clause.
+
+
+The \if and \elif commands
+read the rest of the line and evaluate it as a boolean expression.
+Currently, expressions are limited to a single unquoted string,
+which is evaluated like other on/off options, so the valid values
+are any unambiguous case-insensitive match for one of:
+true, false, 1,
+0, on, off,
+yes, no.  So t, T, and tR
+will all match true.
+
+
+Lines within false branches are not evaluated in any way: queries are
+not sent to the server, non-conditional commands are not evaluated but
+simply ignored, and nested if-expressions in such branches are likewise
+not evaluated, though they are tallied to check for proper nesting.
+
+
+  
 
   
\ir or \include_relative filename
diff --git a/src/bin/psql/Makefile b/src/bin/psql/Makefile
index c53733f..7a418c6 100644
--- a/src/bin/psql/Makefile
+++ b/src/bin/psql/Makefile
@@ -61,8 +61,16 @@ uninstall:
 
 clean distclean:
rm -f psql$(X) $(OBJS) lex.backup
+   rm -rf tmp_check
 
 # files removed here are supposed to be in the distribution tarball,
 # so do not clean them in the clean/distclean rules
 maintainer-clean: distclean
rm -f sql_help.h sql_help.c psqlscanslash.c
+
+
+check:
+   $(prove_check)
+
+installcheck:
+   $(prove_installcheck)
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index f17f610..dcf567e 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -49,6 +49,7 @@
 #include "psqlscanslash.h"
 #include "settings.h"
 #include "variables.h"
+#include "fe_utils/psqlscan_int.h"
 
 /*
  * Editable database object types.
@@ -132,7 +133,7 @@ HandleSlashCmds(PsqlScanState scan_state,
status = PSQL_CMD_ERROR;
}
 
-   if (status != PSQL_CMD_ERROR)
+   if (status != PSQL_CMD_ERROR && pset.active_branch)
{
/* eat any remaining arguments after a valid command */
/* note we suppress evaluation of backticks here */
@@ -194,6 +195,30 @@ read_connect_arg(PsqlScanState scan_state)
return result;
 }
 
+/*
+ * Read and interpret argument as a boolean expression.
+ * Return true if a boolean value was successfully parsed.
+ */
+static bool
+read_boolean_expression(PsqlScanState scan_state, char *action,
+   bool *result)
+{
+   char*value = 

Re: [HACKERS] Packages: Again

2017-02-02 Thread Pavel Stehule
2017-02-03 7:34 GMT+01:00 Craig Ringer :

> On 3 February 2017 at 14:27, Pavel Stehule 
> wrote:
> >
> >
> > 2017-01-20 17:01 GMT+01:00 Joshua D. Drake :
> >>
> >> On 01/17/2017 09:26 AM, Robert Haas wrote:
> >>>
> >>> On Fri, Jan 13, 2017 at 7:24 PM, Peter Geoghegan 
> wrote:
> 
>  MERGE isn't UPSERT, and isn't even in competition with UPSERT as a
>  feature. I've written reams of text explaining why this is so in
>  precise detail, ...
> >>
> >>
> >>
> >> Hello,
> >>
> >> This is the webinar that started this whole thread (well the original
> >> thread, not this weird MERGE/UPSERT stuff):
> >>
> >> https://www.commandprompt.com/blog/postgresql_for_oracle_people/
> >>
> >> Thank you to everyone that responded. You will see in this Webinar that
> >> at least from the Oracle people perspective, PostgreSQL is not an option
> >> unless it has packages.
> >>
> >> The other item that people bring up a few times is Oracle Forms but as
> >> that is actually external (although dependent) on Oracle, I don't see
> >> that as our responsibility.
> >
> >
> > DB2 proposes using schemas instead of packages
> >
> > https://www.ibm.com/developerworks/data/library/techarticle/dm-0711zubiri/
> >
> > I am now working with an Oracle application, and I try to understand
> > Oracle developers - a common pattern is to use an "Oracle schema" as a
> > database, and then packages make sense. But there is no mapping "Oracle
> > schema" = "PostgreSQL schema" - packages are a redundant concept in
> > Postgres (and in every db where schemas act as namespaces - MSSQL, DB2,
> > MySQL).
>
> It sounds like we could benefit from a documentation section
> "packages" that describes how to get package-like behaviour (minus the
> pre-compiled updates) from Pg using schemas and, once added, secure
> variables.
>

It should be documented and presented (though who reads documentation? :-))

This is not only a PostgreSQL issue; any other database has the same
issue. The Oracle architecture is very specific, and a frequent question
is how to map an Oracle database to PostgreSQL. Common questions: how and
where should schemas be used, where should databases be used, and what is
the practical limit on the size of the PostgreSQL catalog?


>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] proposal: session server side variables

2017-02-02 Thread Pavel Stehule
2017-02-03 7:25 GMT+01:00 Fabien COELHO :

>
> Hello Pavel,
>
> The @1 area is partially solved by psql session variables or by pgAdmin's
>> scripting functionality. @2 is partially solved by GUCs, but without the
>> possibility of setting access rights.
>>
>> I didn't find any implementation of XA variables [...]
>>
>
> I did: GUCs in PostgreSQL are an implementation of transactional session
> variables.
>

GUCs were not designed for use in stored procedures.


>
> As I wrote on the thread, given the "security check" use case, the safe
> alternative to transactional session variables is to have nested
> transactions. This seems like a far away prospect for pg, but is a reality
> for Oracle, DB2 and some others that have untransactional session
> variables, so at least it is safe in their case, if not elegant.
>

Everywhere you have the fence between transactional and non-transactional,
you can see some unwanted artifacts. The word "secure" just means the
possibility to set access rights - nothing more, nothing less.



>
> My "hard" opinion is that providing an unsafe by default feature (i.e.
> which works as in some particular cases, but may fail silently if the
> transaction fails), especially for a security related use case which
> motivates the whole feature addition, is a very bad idea for the product.
> If a committer likes it anyway, good for you.
>

I respect your opinion and I understand it - I have a similarly strong
opinion on packages in Postgres. In this case I prefer a common
implementation - and common expectations.

When a feature originates in PostgreSQL, we can design it however we
like. But when we implement a feature that already exists elsewhere, we
should respect the previous, older major implementations.

A good example is our implementation of OUT parameters in PL. The idea
is good - modern languages like Go and Rust use it. But in the stored
procedure world it is unique and different, and this small feature is one
of the significant obstacles to migrating to and adopting Postgres.

People who work with stored procedures don't expect transactional (XA)
behavior, or its overhead, when working with objects called "variables".
We can implement XA support for variables, but I don't think XA should be
the default. Only a few of the cases where variables are used are
XA-sensitive.

Regards

Pavel


>
> Other opinions I expressed on the thread are somehow "softer", i.e. even
> if I think that there are better (simpler, easier) alternatives, these are
> only alternatives.
>
> --
> Fabien.
>


Re: [HACKERS] Packages: Again

2017-02-02 Thread Craig Ringer
On 3 February 2017 at 14:27, Pavel Stehule  wrote:
>
>
> 2017-01-20 17:01 GMT+01:00 Joshua D. Drake :
>>
>> On 01/17/2017 09:26 AM, Robert Haas wrote:
>>>
>>> On Fri, Jan 13, 2017 at 7:24 PM, Peter Geoghegan  wrote:

 MERGE isn't UPSERT, and isn't even in competition with UPSERT as a
 feature. I've written reams of text explaining why this is so in
 precise detail, ...
>>
>>
>>
>> Hello,
>>
>> This is the webinar that started this whole thread (well the original
>> thread, not this weird MERGE/UPSERT stuff):
>>
>> https://www.commandprompt.com/blog/postgresql_for_oracle_people/
>>
>> Thank you to everyone that responded. You will see in this Webinar that at
>> least from the Oracle people perspective, PostgreSQL is not an option unless
>> it has packages.
>>
>> The other item that people bring up a few times is Oracle Forms but as
>> that is actually external (although dependent) on Oracle, I don't see that
>> as our responsibility.
>
>
> DB2 proposes using schemas instead of packages
>
> https://www.ibm.com/developerworks/data/library/techarticle/dm-0711zubiri/
>
> I am now working with an Oracle application, and I try to understand Oracle
> developers - a common pattern is to use an "Oracle schema" as a database,
> and then packages make sense. But there is no mapping "Oracle schema" =
> "PostgreSQL schema" - packages are a redundant concept in Postgres (and in
> every db where schemas act as namespaces - MSSQL, DB2, MySQL).

It sounds like we could benefit from a documentation section
"packages" that describes how to get package-like behaviour (minus the
pre-compiled updates) from Pg using schemas and, once added, secure
variables.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Packages: Again

2017-02-02 Thread Pavel Stehule
2017-01-20 17:01 GMT+01:00 Joshua D. Drake :

> On 01/17/2017 09:26 AM, Robert Haas wrote:
>
>> On Fri, Jan 13, 2017 at 7:24 PM, Peter Geoghegan  wrote:
>>
>>> MERGE isn't UPSERT, and isn't even in competition with UPSERT as a
>>> feature. I've written reams of text explaining why this is so in
>>> precise detail, ...
>>>
>>
>
> Hello,
>
> This is the webinar that started this whole thread (well the original
> thread, not this weird MERGE/UPSERT stuff):
>
> https://www.commandprompt.com/blog/postgresql_for_oracle_people/
>
> Thank you to everyone that responded. You will see in this Webinar that at
> least from the Oracle people perspective, PostgreSQL is not an option
> unless it has packages.
>
> The other item that people bring up a few times is Oracle Forms but as
> that is actually external (although dependent) on Oracle, I don't see that
> as our responsibility.


DB2 proposes using schemas instead of packages:

https://www.ibm.com/developerworks/data/library/techarticle/dm-0711zubiri/

I am now working with an Oracle application, and I try to understand
Oracle developers - a common pattern is to use an "Oracle schema" as a
database, and then packages make sense. But there is no mapping "Oracle
schema" = "PostgreSQL schema" - packages are a redundant concept in
Postgres (and in every db where schemas act as namespaces - MSSQL, DB2,
MySQL).

Regards

Pavel


>
> Sincerely,
>
> JD
>
>
> --
> Command Prompt, Inc.  http://the.postgres.company/
> +1-503-667-4564
> PostgreSQL Centered full stack support, consulting and development.
> Everyone appreciates your honesty, until you are honest with them.
> Unless otherwise stated, opinions are my own.
>


Re: [HACKERS] proposal: session server side variables

2017-02-02 Thread Fabien COELHO


Hello Pavel,


The @1 area is partially solved by psql session variables or by pgAdmin's
scripting functionality. @2 is partially solved by GUCs, but without the
possibility of setting access rights.

I didn't find any implementation of XA variables [...]


I did: GUCs in PostgreSQL are an implementation of transactional session 
variables.


As I wrote on the thread, given the "security check" use case, the safe 
alternative to transactional session variables is to have nested 
transactions. This seems like a far away prospect for pg, but is a reality 
for Oracle, DB2 and some others that have untransactional session 
variables, so at least it is safe in their case, if not elegant.


My "hard" opinion is that providing an unsafe by default feature (i.e. 
which works as in some particular cases, but may fail silently if the 
transaction fails), especially for a security related use case which 
motivates the whole feature addition, is a very bad idea for the product. 
If a committer likes it anyway, good for you.


Other opinions I expressed on the thread are somehow "softer", i.e. even 
if I think that there are better (simpler, easier) alternatives, these are 
only alternatives.


--
Fabien.




Re: [HACKERS] proposal: session server side variables

2017-02-02 Thread Michael Paquier
On Fri, Feb 3, 2017 at 2:56 PM, Pavel Stehule  wrote:
> My patch was marked as "returned with feedback".  Personally, I had no
> idea what the next step could be or what the preferred design is, if a
> preferred design exists. I don't know what I have to change in my proposal.

Perhaps that was not appropriate, sorry about that. Now the latest patch
is 2 months old and does not apply. If you think that the approach you
are taking is worth it, you can of course submit a new version and make
the discussion move on. Finding a consensus is the difficult part though.
-- 
Michael




Re: [HACKERS] proposal: session server side variables

2017-02-02 Thread Pavel Stehule
Here is a link comparing Oracle package variables and DB2 global
variables:

https://www.ibm.com/developerworks/data/library/techarticle/dm-0711zubiri/

Regards

Pavel


Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)

2017-02-02 Thread Fabien COELHO


Hello Corey,


Glad you like the barking. I'm happy to let the prompt issue rest for now,
we can always add it later.

If we DID want it, however, I don't think it'll be hard to add a special
prompt (Thinking %T or %Y because they both look like branches, heh),


Ah! :-) T may stand for Tree, but Y looks a little bit more like branches.
Maybe Y for Yew.



With a prompt1 of '%T> ' Would then resolve to "ttfi" [...]
This is just idle musing, I'm perfectly happy to leave it out entirely.


I like it. I would prefer to have it available, but my advice is to follow
the committers' opinions. At a minimum, there must be some trace, either
"barking" or "prompting".


--
Fabien.




Re: [HACKERS] proposal: session server side variables

2017-02-02 Thread Pavel Stehule
2017-02-01 6:42 GMT+01:00 Pavel Stehule :

>
>
> 2017-02-01 6:05 GMT+01:00 Michael Paquier :
>
>> On Wed, Jan 11, 2017 at 10:42 PM, Craig Ringer 
>> wrote:
>> > There is no code yet. Code review and testing is where things get
>> firmer.
>> >
>> > My personal stance right now is that I'd like to see catalog-declared
>> > typed variables. I would prefer them to be transactional and would at
>> > least oppose anything that didn't allow future room for that capability.
>> > I'd prefer that non-transactional vars be clearly declared as such.
>> >
>> > In the end though... I'm not the one implementing it. I can have some
>> > influence through the code review process. But it's whoever steps up
>> > with a proposed implementation that has the biggest say. The rest of us
>> > can say yes or no to some degree... but nobody can make someone else
>> > implement something they don't want.
>>
>> The last patch is from the 6th of December and does not apply anymore:
>> https://www.postgresql.org/message-id/CAFj8pRA9w_AujBAYdLR0UVfXwwoxhmn%2BFbNHnD3_NL%3DJ9x3y8w%40mail.gmail.com
>> I don't have a better idea than marking this patch as "returned with
>> feedback" for now, as the thread died 3 weeks ago as well.
>>
>
> There is no agreement on the form of session variables.
>

Today I found on the net documentation for the DB2 "CREATE VARIABLE"
command. I had no idea that this statement already exists; it is an old
feature - I found a doc from 2007.

The DB2 design is very similar to my proposal - secured access, persistent
metadata, unshared non-transactional data limited to the session.

They don't use access functions - access is via the notation
schemaname.variablename. I proposed this syntax as the next step in the
implementation.

The DB2 authors don't propose transactional variables - when the user
needs transactional behavior, global temporary tables should be used.

My patch was marked as "returned with feedback".  Personally, I had no
idea what the next step could be or what the preferred design is, if a
preferred design exists. I don't know what I have to change in my
proposal.

I understand that there are two different concepts - 1. using variables
for ad hoc scripting as in T-SQL or MySQL, or 2. using variables as global
session objects for stored procedures.

The @1 area is partially solved by psql session variables or by pgAdmin's
scripting functionality. @2 is partially solved by GUCs, but without the
possibility of setting access rights.

I didn't find any implementation of XA variables or persistent variables
anywhere.

Regards

Pavel



>
> Regards
>
> Pavel
>
>
>> --
>> Michael
>>
>
>


Re: [HACKERS] Logical Replication and Character encoding

2017-02-02 Thread Craig Ringer
On 3 Feb. 2017 15:47, "Kyotaro HORIGUCHI" 
wrote:

Hello,

At Fri, 3 Feb 2017 09:16:47 +0800, Craig Ringer 
wrote in 
> On 2 February 2017 at 11:45, Euler Taveira  wrote:
>
> > I don't think storage without conversion is an acceptable approach. We
> > should provide options to users such as ignore tuple or NULL for
> > column(s) with conversion problem. I wouldn't consider storage data
> > without conversion because it silently show incorrect data and we
> > historically aren't flexible with conversion routines.

It is technically possible. But changing the behavior of a
subscription and/or publication requires a change to the SQL syntax. It
seems a bit too late to propose such a new feature.

IMHO, unintentional silent data loss must not happen, so the default
behavior on conversion failure cannot be anything other than stopping
replication.


Agree. Which is why we should default to disallowing mismatched upstream
and downstream encodings. At least to start with.


> pglogical and BDR both require identical encoding; they test for this
> during startup and refuse to replicate if the encoding differs.
>
> For the first pass at core I suggest a variant on that policy: require
> source and destination encoding to be the same. This should probably
> be the first change, since it definitively prevents the issue from
> arising.

If the check is performed by BDR, pglogical itself seems to be
allowed to convert strings when the two ends have different
encodings in a non-BDR environment. Is this right?


Hm. Maybe it's changed since I last looked. We started off disallowing
mismatched encodings anyway.

Note that I'm referring to pglogical, the tool, not "in core logical
replication for postgres"



> If time permits we could also allow destination encoding to be UTF-8
> (including codepage 65001) with any source encoding. This requires
> encoding conversion to be performed, of course.

Does this mean that BDR might work in a heterogeneous encoding
environment?


No. Here the "we" was meant to be PG core for V10 or later.

But anyway, some encodings (like SJIS) have characters that map to
the same destination, so BDR doesn't seem to work with such
conversions. So each encoding might need a property indicating its
usability in a BDR environment.


PG doesn't allow SJIS as a db encoding. So it doesn't matter here.



> The downside is that this will impact users who use a common subset of
> two encodings. This is most common for Windows-1252 <-> ISO-8859-15
> (or -1 if you're old-school) but also arises anywhere the common 7 bit
> subset is used. Until we can define an encoding exception policy
> though, I think we should defer supporting those and make them a
> "later" problem.

If the conversion is rejected for now, we should check the
encoding identity instead.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


Re: [HACKERS] Radix tree for character conversion

2017-02-02 Thread Kyotaro HORIGUCHI
Thanks to Heikki having pushed the first two patches and part of
the third, only one patch remains now.

# Sorry for not separating the KOI8 stuff.

At Tue, 31 Jan 2017 19:06:09 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20170131.190609.254672218.horiguchi.kyot...@lab.ntt.co.jp>
> > Thanks for the new version, I'll look at it once I am done with the
> > cleanup of the current CF. For now I have moved it to the CF 2017-03.
> 
> Agreed. Thank you.

Attached is the latest version on the current master (555494d).

Note: since this patch was created with git diff --irreversible-delete,
the three files mb/Unicode/*.(txt|xml) to be deleted are left alone.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 68d75100b7e8aaab7706ea780a1e23557c676c87 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 10 Jan 2017 20:02:00 +0900
Subject: [PATCH] Use radix tree for character conversion

This patch adds a multibyte character converter using a radix tree,
based on Heikki's rework of my previous patch.
---
 src/backend/utils/mb/Makefile  | 2 +
 src/backend/utils/mb/Unicode/.gitignore|11 +
 src/backend/utils/mb/Unicode/Makefile  |72 +-
 src/backend/utils/mb/Unicode/UCS_to_BIG5.pl| 9 +-
 src/backend/utils/mb/Unicode/UCS_to_EUC_CN.pl  | 9 +-
 .../utils/mb/Unicode/UCS_to_EUC_JIS_2004.pl|19 +-
 src/backend/utils/mb/Unicode/UCS_to_EUC_JP.pl  | 6 +-
 src/backend/utils/mb/Unicode/UCS_to_EUC_KR.pl  |13 +-
 src/backend/utils/mb/Unicode/UCS_to_EUC_TW.pl  | 9 +-
 src/backend/utils/mb/Unicode/UCS_to_GB18030.pl | 9 +-
 src/backend/utils/mb/Unicode/UCS_to_JOHAB.pl   |11 +-
 .../utils/mb/Unicode/UCS_to_SHIFT_JIS_2004.pl  |14 +-
 src/backend/utils/mb/Unicode/UCS_to_SJIS.pl|29 +-
 src/backend/utils/mb/Unicode/UCS_to_UHC.pl |11 +-
 src/backend/utils/mb/Unicode/UCS_to_most.pl| 5 +-
 src/backend/utils/mb/Unicode/convutils.pm  |   679 +-
 src/backend/utils/mb/Unicode/euc-jis-2004-std.txt  | 11549 ---
 src/backend/utils/mb/Unicode/gb-18030-2000.xml | 30916 ---
 src/backend/utils/mb/Unicode/make_mapchecker.pl|78 +
 src/backend/utils/mb/Unicode/map_checker.c |94 +
 .../utils/mb/Unicode/sjis-0213-2004-std.txt| 11549 ---
 src/backend/utils/mb/char_converter.c  |   116 +
 src/backend/utils/mb/conv.c|   137 +-
 .../conversion_procs/utf8_and_big5/utf8_and_big5.c | 8 +-
 .../utf8_and_cyrillic/utf8_and_cyrillic.c  |16 +-
 .../utf8_and_euc2004/utf8_and_euc2004.c| 8 +-
 .../utf8_and_euc_cn/utf8_and_euc_cn.c  | 8 +-
 .../utf8_and_euc_jp/utf8_and_euc_jp.c  | 8 +-
 .../utf8_and_euc_kr/utf8_and_euc_kr.c  | 8 +-
 .../utf8_and_euc_tw/utf8_and_euc_tw.c  | 8 +-
 .../utf8_and_gb18030/utf8_and_gb18030.c| 8 +-
 .../conversion_procs/utf8_and_gbk/utf8_and_gbk.c   | 8 +-
 .../utf8_and_iso8859/utf8_and_iso8859.c|   127 +-
 .../utf8_and_johab/utf8_and_johab.c| 8 +-
 .../conversion_procs/utf8_and_sjis/utf8_and_sjis.c | 8 +-
 .../utf8_and_sjis2004/utf8_and_sjis2004.c  | 8 +-
 .../conversion_procs/utf8_and_uhc/utf8_and_uhc.c   | 8 +-
 .../conversion_procs/utf8_and_win/utf8_and_win.c   |98 +-
 src/include/mb/pg_wchar.h  |56 +-
 39 files changed, 1355 insertions(+), 54385 deletions(-)
 create mode 100644 src/backend/utils/mb/Unicode/.gitignore
 delete mode 100644 src/backend/utils/mb/Unicode/euc-jis-2004-std.txt
 delete mode 100644 src/backend/utils/mb/Unicode/gb-18030-2000.xml
 create mode 100755 src/backend/utils/mb/Unicode/make_mapchecker.pl
 create mode 100644 src/backend/utils/mb/Unicode/map_checker.c
 delete mode 100644 src/backend/utils/mb/Unicode/sjis-0213-2004-std.txt
 create mode 100644 src/backend/utils/mb/char_converter.c

diff --git a/src/backend/utils/mb/Makefile b/src/backend/utils/mb/Makefile
index 89bec21..d48e729 100644
--- a/src/backend/utils/mb/Makefile
+++ b/src/backend/utils/mb/Makefile
@@ -14,6 +14,8 @@ include $(top_builddir)/src/Makefile.global
 
 OBJS = encnames.o conv.o mbutils.o wchar.o wstrcmp.o wstrncmp.o
 
+conv.o: conv.c char_converter.c
+
 include $(top_srcdir)/src/backend/common.mk
 
 clean distclean maintainer-clean:
diff --git a/src/backend/utils/mb/Unicode/.gitignore b/src/backend/utils/mb/Unicode/.gitignore
new file mode 100644
index 000..3908cc3
--- /dev/null
+++ b/src/backend/utils/mb/Unicode/.gitignore
@@ -0,0 +1,11 @@
+# ignore backup files of editors
+/*[~#]
+
+# ignore authority files
+/*.TXT
+/*.txt
+/*.xml
+
+# ignore generated files
+/map_checker
+/map_checker.h
diff --git a/src/backend/utils/mb/Unicode/Makefile b/src/backend/utils/mb/Unicode/Makefile

Re: [HACKERS] pageinspect: Hash index support

2017-02-02 Thread Ashutosh Sharma
>> I think it's OK to check hash_bitmap_info() or any other functions
>> with different page types at least once.
>>
>> [1]- 
>> https://www.postgresql.org/message-id/CA%2BTgmoZUjrVy52TUU3b_Q5L6jcrt7w5R4qFwMXdeCuKQBmYWqQ%40mail.gmail.com
>
> Sure, I just don't know if we need to test them 4 or 5 times.  But I
> think this is good enough for now - it's not like this code is written
> in stone once committed.
>
> So, committed.  Wow, I wish every patch had this many reviewers.
>

Thanks, Robert, and thank you to all the experts for showing
interest in this project.




Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?

2017-02-02 Thread Kyotaro HORIGUCHI
At Thu, 2 Feb 2017 15:34:33 +0900, Michael Paquier  
wrote in 
> On Thu, Feb 2, 2017 at 1:26 AM, Fujii Masao  wrote:
> > I'm afraid that many WAL segments would start with a continuation record
> > when there is a workload of short transactions (e.g., pgbench), which
> > would make restart_lsn fall behind very much. No?
> 
> I don't quite understand this argument. Even if there are many small
> transactions, that would cause restart_lsn to just be late by one
> segment, all the time.
> 
> > The discussion on this thread just makes me think that restart_lsn should
> > indicate the replay location instead of flush location. This seems safer.
> 
> That would penalize WAL retention on the primary with standbys using
> recovery_min_apply_delay and a slot for example...
> 
> We can attempt to address this problem two ways. The patch proposed
> (ugly btw and there are two typos!) is doing it in the WAL sender by
> not making restart_lsn jump to the next segment if a continuation
> record is found.

Sorry for the ug.. :p Anyway, the previous version was not the
latest. The attached one is the revised version. (Sorry, I hadn't
found the typos myself..)

>  Or we could have the standby request for the next
> segment instead if the record it wants to replay from is at a boundary
> and that it locally has the beginning of the record, and it has it
> because it already confirmed to the primary that it flushed to the
> next segment. Not sure which fix is better though.

We could do it as I said, with some refactoring of ReadRecord
involving a reader plugin mechanism..

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From d835bf248e6869f7b843d339c9213a082e332297 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Wed, 1 Feb 2017 16:07:22 +0900
Subject: [PATCH] Fix a bug of physical replication slot.

A physical-replication standby can stop just at the boundary of WAL
segments. The restart_lsn of the slot on the master can be assumed to be
at the same location. In that case the last segment on the master will be
removed after some checkpoints. If the first record of the next segment
is a continuation record, it exists only on the master while its
beginning exists only on the standby, so the standby cannot restart
because the record it must read is split across the two sources.

This patch keeps restart_lsn in the last segment when the first page
of the next segment is a continuation record.
---
 src/backend/replication/walsender.c | 104 +---
 1 file changed, 97 insertions(+), 7 deletions(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 76f09fb..0ec7ba9 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -188,6 +188,13 @@ static volatile sig_atomic_t replication_active = false;
 static LogicalDecodingContext *logical_decoding_ctx = NULL;
 static XLogRecPtr logical_startptr = InvalidXLogRecPtr;
 
+/*
+ * This variable corresponds to restart_lsn in pg_replication_slots for a
+ * physical slot. This has a valid value only when it differs from the current
+ * flush pointer.
+ */
+static XLogRecPtr	   restartLSN = InvalidXLogRecPtr;
+
 /* Signal handlers */
 static void WalSndSigHupHandler(SIGNAL_ARGS);
 static void WalSndXLogSendHandler(SIGNAL_ARGS);
@@ -220,7 +227,7 @@ static void WalSndPrepareWrite(LogicalDecodingContext *ctx, XLogRecPtr lsn, Tran
 static void WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid, bool last_write);
 static XLogRecPtr WalSndWaitForWal(XLogRecPtr loc);
 
-static void XLogRead(char *buf, XLogRecPtr startptr, Size count);
+static bool XLogRead(char *buf, XLogRecPtr startptr, Size count, bool notfoundok);
 
 
 /* Initialize walsender process before entering the main command loop */
@@ -504,6 +511,9 @@ StartReplication(StartReplicationCmd *cmd)
 			ereport(ERROR,
 	(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
	 (errmsg("cannot use a logical replication slot for physical replication"))));
+
+		/* Restore restartLSN from replication slot */
+		restartLSN = MyReplicationSlot->data.restart_lsn;
 	}
 
 	/*
@@ -519,6 +529,10 @@ StartReplication(StartReplicationCmd *cmd)
 	else
 		FlushPtr = GetFlushRecPtr();
 
+	/* Set InvalidXLogRecPtr if catching up */
+	if (restartLSN == FlushPtr)
+		restartLSN = InvalidXLogRecPtr;
+
 	if (cmd->timeline != 0)
 	{
 		XLogRecPtr	switchpoint;
@@ -727,7 +741,7 @@ logical_read_xlog_page(XLogReaderState *state, XLogRecPtr targetPagePtr, int req
 		count = flushptr - targetPagePtr;
 
 	/* now actually read the data, we know it's there */
-	XLogRead(cur_page, targetPagePtr, XLOG_BLCKSZ);
+	XLogRead(cur_page, targetPagePtr, XLOG_BLCKSZ, false);
 
 	return count;
 }
@@ -1486,7 +1500,7 @@ static void
 ProcessStandbyReplyMessage(void)
 {
 	XLogRecPtr	writePtr,
-

Re: [HACKERS] Logical Replication and Character encoding

2017-02-02 Thread Kyotaro HORIGUCHI
Hello,

At Fri, 3 Feb 2017 09:16:47 +0800, Craig Ringer  wrote 
in 
> On 2 February 2017 at 11:45, Euler Taveira  wrote:
> 
> > I don't think storage without conversion is an acceptable approach. We
> > should provide options to users such as ignore tuple or NULL for
> > column(s) with conversion problem. I wouldn't consider storage data
> > without conversion because it silently show incorrect data and we
> > historically aren't flexible with conversion routines.

It is technically possible. But changing the behavior of a
subscription and/or publication requires a change to the SQL syntax. It
seems a bit too late to propose such a new feature...

IMHO unintentional silent data loss must not happen, so the
default behavior on conversion failure can be nothing other than
stopping replication.

> pglogical and BDR both require identical encoding; they test for this
> during startup and refuse to replicate if the encoding differs.
> 
> For the first pass at core I suggest a variant on that policy: require
> source and destination encoding to be the same. This should probably
> be the first change, since it definitively prevents the issue from
> arising.

If the check is performed by BDR, pglogical itself seems to be
allowed to convert strings when the two ends have different
encodings in a non-BDR environment. Is this right?

> If time permits we could also allow destination encoding to be UTF-8
> (including codepage 65001) with any source encoding. This requires
> encoding conversion to be performed, of course.

Does this mean that BDR might work in a heterogeneous encoding
environment? But anyway, some encodings (like SJIS) have multiple
characters that map to the same destination character, so BDR
doesn't seem to work with such conversions. So each encoding
might need a property indicating its usability in a BDR
environment, but...

On the other hand, no problem is seen with encoding conversions in
non-BDR environments (except for the behavior on failure).

> The downside is that this will impact users who use a common subset of
> two encodings. This is most common for Windows-1252 <-> ISO-8859-15
> (or -1 if you're old-school) but also arises anywhere the common 7 bit
> subset is used. Until we can define an encoding exception policy
> though, I think we should defer supporting those and make them a
> "later" problem.

If the conversion is rejected for now, we should check the
encoding identity instead.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning - another take

2017-02-02 Thread Amit Langote
On 2017/01/31 6:42, Peter Eisentraut wrote:
> On 1/25/17 12:54 AM, Ashutosh Bapat wrote:
>> The documentation available at
>> https://www.postgresql.org/docs/devel/static/sql-createtable.html,
>> does not make it clear that the lower bound of a range partition is
>> always inclusive and the higher one is exclusive. I think a note in
>> section " PARTITION OF parent_table FOR VALUES partition_bound_spec"
>> would be helpful.
> 
> Hmm.  I see the practical use of that, but I think this is going to be a
> source of endless confusion.  Can we make that a bit clearer in the
> syntax, for example by using additional keywords (INCLUSIVE/EXCLUSIVE)?

The decision not to make that configurable with INCLUSIVE/EXCLUSIVE syntax
was deliberate.  To summarize, we can start with a default configuration
catering to most practical cases (that is, inclusive lower and exclusive
upper bounds) and document it as such (not done yet; I will post a doc
patch for it today).  If it turns out that there is some demand for making
that configurable, we can later add the code to handle that internally
plus the syntax.  But *starting* with that syntax means we would have to
carry, potentially needlessly, the code to handle seldom-used cases that
could not be made as efficient as it is now with all lower bounds being
inclusive and upper bounds exclusive.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CA%2BTgmoZou4ApEvC_nfhOxsi5G4SoD_evwNaiYn60ZcJ4XB_-QQ%40mail.gmail.com






Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)

2017-02-02 Thread Corey Huinker
>
>
> On reflection, it seems fairly improbable to me that people would use
> \if and friends interactively.  They're certainly useful for scripting,
> but would you really type commands that you know are going to be ignored?
>

I'm thinking the one use-case is where a person is debugging a
non-interactive script, cuts and pastes it into an interactive script, and
then scrolls through command history to fix the bits that didn't work. So,
no, you wouldn't *type* them per se, but you'd want the session as if you
had. The if-then barking might really be useful in that context.


>
> Therefore, I don't think we should stress out about fitting branching
> activity into the prompts.  That's just not the use-case.  (Note: we
> might well have to reconsider that if we get looping, but for now it's
> not a problem.)  Moreover, if someone is confused because they don't
> realize they're inside a failed \if, it's unlikely that a subtle change in
> the prompt would help them.  So your more in-the-face approach of printing
> messages seems good to me.
>

Glad you like the barking. I'm happy to let the prompt issue rest for now,
we can always add it later.

If we DID want it, however, I don't think it'll be hard to add a special
prompt (Thinking %T or %Y because they both look like branches, heh), and
it could print the if-state stack, maybe something like:

\if true
  \if true
 \if false
\if true

With a prompt1 of '%T> ' Would then resolve to

ttfi>

for true, true, false, ignored.

This is just idle musing, I'm perfectly happy to leave it out entirely.


> This seems more or less reasonable, although:
>
> > # \endif
> > active \endif, executing commands
>
> looks a bit weird.  Maybe instead say "exited \if, executing commands"?
>

+1


>
> BTW, what is your policy about nesting these things in include files?
> My immediate inclination is that if we hit EOF with open \if state,
> we should drop it and revert to the state in the surrounding file.
> Otherwise things will be way too confusing.
>

That's how it works now if you have ON_ERROR_STOP off, plus an error
message telling you about the imbalance. If you have ON_ERROR_STOP on, it's
fatal.

All \if-\endif pairs must be balanced within a file (well, within a
MainLoop, but to the user it looks like a file). Every new file opened via
\i or \ir starts a new if-stack. Because commands in an inactive branch are
never executed, we don't have to worry about the state of the parent stack
when we do a \i, because we know it's true all the way down.

We chose this not so much because if-endif needed it (we could have thrown
it into the pset struct just as easily), but because of the issues that
might come up with a \while loop: needing to remember previous GOTO points
in a file (if the input even *is* a file...) is going to be hard enough,
remembering them across files would be harder, and further complicated by
the possibility that a file \included on one iteration might not be
included on the next (or vice versa)...and like you said, way too confusing.


Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION

2017-02-02 Thread Kyotaro HORIGUCHI
At Fri, 3 Feb 2017 01:02:47 +0900, Fujii Masao  wrote in 

Re: [HACKERS] Constraint exclusion failed to prune partition in case of partition expression involves function call

2017-02-02 Thread Amit Langote
On 2017/02/02 21:09, amul sul wrote:
> Hi,
> 
> In the following case, constraint exclusion is not able to prune partitions
> (even if the function is immutable); is this known behaviour?

Yes.  The where condition in your example query does not specify the
partition key column, so constraint exclusion won't work; it requires the
variable in the condition to be spelled out exactly the same as the
partition key column.  Here the partitioning code is going to return check
constraints of the form abs(a) = 0 for foo_list1, abs(a) = 1 for foo_list2
and so on, for the constraint exclusion logic to work upon.

> --Explain plan
> postgres=# explain select * from foo_list where a = 2;
>QUERY PLAN
> -
>  Append  (cost=0.00..103.50 rows=25 width=36)
>->  Seq Scan on foo_list  (cost=0.00..0.00 rows=1 width=36)
>  Filter: (a = 2)
>->  Seq Scan on foo_list1  (cost=0.00..25.88 rows=6 width=36)
>  Filter: (a = 2)
>->  Seq Scan on foo_list2  (cost=0.00..25.88 rows=6 width=36)
>  Filter: (a = 2)
>->  Seq Scan on foo_list3  (cost=0.00..25.88 rows=6 width=36)
>  Filter: (a = 2)
>->  Seq Scan on foo_list4  (cost=0.00..25.88 rows=6 width=36)
>  Filter: (a = 2)
> (11 rows)

If you try with where abs(a) = 2, it works:

explain select * from foo_list where abs(a) = 2;
   QUERY PLAN
-
 Append  (cost=0.00..29.05 rows=7 width=36)
   ->  Seq Scan on foo_list  (cost=0.00..0.00 rows=1 width=36)
 Filter: (abs(a) = 2)
   ->  Seq Scan on foo_list3  (cost=0.00..29.05 rows=6 width=36)
 Filter: (abs(a) = 2)
(5 rows)

See an old exchange at the link below for a kind of similar example and
some explanations about why the thing that one thinks would or should
happen doesn't happen.

https://www.postgresql.org/message-id/CA%2BTgmoaE9NZ_RiqZQLp2aJXPO4E78QxkQYL-FR2zCDop96Ahdg%40mail.gmail.com

Thanks,
Amit






Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)

2017-02-02 Thread Tom Lane
Corey Huinker  writes:
>> As it is, I've added interactive mode psql_error notifications about the
>> resulting branching state of any branching commands, and any attempt to
>> send non-branching commands or queries while in an inactive branch will
>> generate a psql_error saying that the command was ignored. Waiting til I
>> get what should or shouldn't be done about prompts before issuing the next
>> patch revision.

On reflection, it seems fairly improbable to me that people would use
\if and friends interactively.  They're certainly useful for scripting,
but would you really type commands that you know are going to be ignored?

Therefore, I don't think we should stress out about fitting branching
activity into the prompts.  That's just not the use-case.  (Note: we
might well have to reconsider that if we get looping, but for now it's
not a problem.)  Moreover, if someone is confused because they don't
realize they're inside a failed \if, it's unlikely that a subtle change in
the prompt would help them.  So your more in-the-face approach of printing
messages seems good to me.

> So far, interactive branching information will look like this (it prints on
> every branching command and on every ignored command):

This seems more or less reasonable, although:

> # \endif
> active \endif, executing commands

looks a bit weird.  Maybe instead say "exited \if, executing commands"?

BTW, what is your policy about nesting these things in include files?
My immediate inclination is that if we hit EOF with open \if state,
we should drop it and revert to the state in the surrounding file.
Otherwise things will be way too confusing.

regards, tom lane




Re: [HACKERS] Logical Replication and Character encoding

2017-02-02 Thread Craig Ringer
On 2 February 2017 at 11:45, Euler Taveira  wrote:

> I don't think storage without conversion is an acceptable approach. We
> should provide options to users such as ignore tuple or NULL for
> column(s) with conversion problem. I wouldn't consider storage data
> without conversion because it silently show incorrect data and we
> historically aren't flexible with conversion routines.

pglogical and BDR both require identical encoding; they test for this
during startup and refuse to replicate if the encoding differs.

For the first pass at core I suggest a variant on that policy: require
source and destination encoding to be the same. This should probably
be the first change, since it definitively prevents the issue from
arising.

If time permits we could also allow destination encoding to be UTF-8
(including codepage 65001) with any source encoding. This requires
encoding conversion to be performed, of course.

The downside is that this will impact users who use a common subset of
two encodings. This is most common for Windows-1252 <-> ISO-8859-15
(or -1 if you're old-school) but also arises anywhere the common 7 bit
subset is used. Until we can define an encoding exception policy
though, I think we should defer supporting those and make them a
"later" problem.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pageinspect: Hash index support

2017-02-02 Thread Michael Paquier
On Fri, Feb 3, 2017 at 4:28 AM, Jesper Pedersen
 wrote:
> On 02/02/2017 02:24 PM, Robert Haas wrote:
>>
>> So, committed.  Wow, I wish every patch had this many reviewers.
>>
>
> Thanks Robert !

9 people in total per the commit message. Yes, that's rare, and thanks
for making the effort to list everybody.
-- 
Michael




Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-02 Thread Noah Misch
On Wed, Feb 01, 2017 at 02:39:25PM +0200, Heikki Linnakangas wrote:
> On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote:
> >Attached please find my patch for XLC/AIX.
> >The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
> >The comment in this file says that:
> >
> >   * __fetch_and_add() emits a leading "sync" and trailing "isync",
> >thereby
> >   * providing sequential consistency.  This is undocumented.
> >
> >But it is not true any more (I checked generated assembler code in
> >debugger).
> >This is why I have added __sync() to this function. Now pgbench working
> >normally.

Konstantin, does "make -C src/bin/pgbench check" fail >10% of the time in the
bad build?

> Seems like it was not so much undocumented, but an implementation detail
> that was not guaranteed after all..

Seems so.

> There was a long thread on these things the last time this was changed: 
> https://www.postgresql.org/message-id/20160425185204.jrvlghn3jxulsb7i%40alap3.anarazel.de.
> I couldn't find an explanation there of why we thought that fetch_and_add
> implicitly performs sync and isync.

It was in the generated code, for AIX xlc 12.1.0.0.

> >Also there is mysterious disappearance of assembler section function
> >with sync instruction from pg_atomic_compare_exchange_u32_impl.
> >I have fixed it by using __sync() built-in function instead.
> 
> __sync() seems more appropriate there, anyway. We're using intrinsics for
> all the other things in generic-xlc.h. But it sure is scary that the "asm"
> sections just disappeared.

That is a problem, but it's a stretch to conclude that asm sections are
generally prone to removal, while intrinsics are generally durable.

> @@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile 
> pg_atomic_uint32 *ptr,
>  static inline uint32
>  pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
>  {
> + uint32  ret;
> +
>   /*
> -  * __fetch_and_add() emits a leading "sync" and trailing "isync", 
> thereby
> -  * providing sequential consistency.  This is undocumented.
> +  * Use __sync() before and __isync() after, like in compare-exchange
> +  * above.
>*/
> -	return __fetch_and_add((volatile int *) &ptr->value, add_);
> + __sync();
> +
> +	ret = __fetch_and_add((volatile int *) &ptr->value, add_);
> +
> + __isync();
> +
> + return ret;
>  }

Since this emits double syncs with older xlc, I recommend instead replacing
the whole thing with inline asm.  As I opined in the last message of the
thread you linked above, the intrinsics provide little value as abstractions
if one checks the generated code to deduce how to use them.  Now that the
generated code is xlc-version-dependent, the port is better off with
compiler-independent asm like we have for ppc in s_lock.h.




Re: [HACKERS] logical decoding of two-phase transactions

2017-02-02 Thread Craig Ringer
On 3 February 2017 at 03:34, Robert Haas  wrote:
> On Wed, Feb 1, 2017 at 4:35 PM, Craig Ringer  wrote:
>> Right. Per my comments upthread I don't see why we need to add anything more
>> to WAL here.
>>
>> Stas was concerned about what happens in logical decoding if we crash
>> between PREPARE TRANSACTION and COMMIT PREPARED. But we'll always go back
>> and decode the whole txn again anyway so it doesn't matter.
>>
>> We can just track it on ReorderBufferTxn when we see it at PREPARE
>> TRANSACTION time.
>
> Oh, hmm.  I guess if that's how it works then we don't need it in WAL
> after all.  I'm not sure that re-decoding the already-prepared
> transaction is a very good plan, but if that's what we're doing anyway
> this patch probably shouldn't change it.

We don't have much choice at the moment.

Logical decoding must restart from the xl_running_xacts most recently
prior to the xid allocation for the oldest xact the client hasn't
confirmed receipt of decoded data + commit for. That's because reorder
buffers are not persistent; if a decoding session crashes we throw
away accumulated reorder buffers, both those in memory and those
spilled to disk. We have to re-create them by restarting decoding from
the beginning of the oldest xact of interest.

We could make reorder buffers persistent and shared between decoding
sessions but it'd totally change the logical decoding model and create
some other problems. It's certainly not a topic for this patch. So we
can take it as given that we'll always restart decoding from BEGIN
again at a crash.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-02-02 Thread Michael Paquier
On Fri, Feb 3, 2017 at 8:50 AM, Daniel Verite  wrote:
> What if we look at the change from the pessimistic angle?
> An example of confusion that the change would create:
> a lot of users currently choose pg_wal for the destination
> directory of their archive command.

Really? I find it surprising to say "a lot". Perhaps there are some,
but I would not suspect that many people use this directory name, and
even fewer are crazy enough to store archives directly in PGDATA
itself. And of course that's not on a different partition :)

> Also googling for pg_wal, I'm finding food for thought like this
> IBM technote:
> http://www-01.ibm.com/support/docview.wss?uid=isg3T1015637
> which recommends to
> "Remove all files under /var/lib/pgsql/9.0/data/pg_wal/"
> and also calls that directory the "write-ahead log directory"
> which is quite confusing because apparently it's the destination of
> their archive command.

Well, this IBM product is one.

> There's also the disruption in existing backup scripts that directly
> reference pg_xlog. Obviously these scripts are critical, and there's
> something weird in breaking that intentionally. The motivation of the
> breakage is likely to be felt as frivolous and unfair to those people who
> are adversely impacted by the change, even if the part of not
> reading the release notes is their fault.

Those are not complicated to fix because they are hard failures.
Sufficient testing needs to be done so that backup scripts don't show
up in red on live systems after being deployed. The original reason to
do the rename is that there are folks who think that removing pg_xlog
is fine on a full disk because, you know, it contains just *logs*, so
it is not critical for the system.
-- 
Michael




Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal

2017-02-02 Thread Daniel Verite
Peter Eisentraut wrote:

> You really have (at least) three options here:
> 
> 1. Rename nothing
> 2. Rename directory only
> 3. Rename everything

I vote for 1) as I believe the renaming will create more confusion
than it's worth, not even considering the renaming of functions
and views.

What if we look at the change from the pessimistic angle?
An example of confusion that the change would create:
a lot of users currently choose pg_wal for the destination
directory of their archive command. Less-informed users
that set up archiving and/or log shipping in PG10 based on
advice online from previous versions will be fairly
confused about the missing pg_xlog, and the fact that the
pg_wal directory they're supposed to create already exists.

Also googling for pg_wal, I'm finding food for thought like this
IBM technote:
http://www-01.ibm.com/support/docview.wss?uid=isg3T1015637
which recommends to 
"Remove all files under /var/lib/pgsql/9.0/data/pg_wal/"
and also calls that directory the "write-ahead log directory"
which is quite confusing because apparently it's the destination of
their archive command.

This brings the question: what about the people who will delete
their pg_wal (ex-pg_xlog) when they face a disk-full condition and
they mix up in their mind the archive directory and the WAL directory?
It's hard to guess how many could make that mistake but I don't see
why it would be less probable than confusing pg_xlog and pg_log.
At least the contents of the latter directories look totally
different, contrary to the wal directory versus wal archive.

Also, what about the users who are helped currently by
the fact that "pg_xlog" is associated at the top of google results
to good articles that explain what it is, why it fills up and what
to do about it? By burying the name "pg_xlog" we're also losing that
connection, and in the worst case making worse the problem we
wanted to help with.

There's also the disruption in existing backup scripts that directly
reference pg_xlog. Obviously these scripts are critical, and there's
something weird in breaking that intentionally. The motivation of the
breakage is likely to be felt as frivolous and unfair to those people who
are adversely impacted by the change, even if the part of not
reading the release notes is their fault.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite




Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)

2017-02-02 Thread Corey Huinker
>
>
> All,
> As it is, I've added interactive mode psql_error notifications about the
> resulting branching state of any branching commands, and any attempt to
> send non-branching commands or queries while in an inactive branch will
> generate a psql_error saying that the command was ignored. Waiting til I
> get what should or shouldn't be done about prompts before issuing the next
> patch revision.
>

So far, interactive branching information will look like this (it prints on
every branching command and on every ignored command):

# \if true
active \if, executing commands
# select 1;
 ?column?
----------
        1
(1 row)

Time: 0.282 ms
# \else
inactive \else, ignoring commands
# select 1;
inside inactive branch, query ignored.
# select
... # 1;
inside inactive branch, query ignored.
# \endif
active \endif, executing commands
# \if false
inactive \if, ignoring commands
# \i file_name
inside inactive branch, command ignored.
# \elif false
inactive \elif, ignoring commands
# \else
active \else, executing commands
# \endif
active \endif, executing commands


Comments welcome.


Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)

2017-02-02 Thread Corey Huinker
On Wed, Feb 1, 2017 at 4:58 PM, Jim Nasby  wrote:

> I think the issue here is that the original case for this is a user
> accidentally getting into an \if and then having no clue what's going on.
> That's similar to what happens when you miss a quote or a semicolon. We
> handle those cases with %R, and I think %R needs to support if as well.
>
> Perhaps there's value to providing more info (active branch, etc), but
> ISTM trying to do that will just confuse the original (%R) case.
>

Jim,

After spending a few minutes to familiarize myself with %R, I'm in
agreement with your second statement (adding if-else to %R will just
confuse %R). However, your first statement seems to indicate the opposite.
Can you elaborate?

All,
As it is, I've added interactive mode psql_error notifications about the
resulting branching state of any branching commands, and any attempt to
send non-branching commands or queries while in an inactive branch will
generate a psql_error saying that the command was ignored. Waiting til I
get what should or shouldn't be done about prompts before issuing the next
patch revision.


Re: [HACKERS] TRAP: FailedAssertion("!(hassrf)", File: "nodeProjectSet.c", Line: 180)

2017-02-02 Thread Erik Rijkers

On 2017-02-02 22:44, Tom Lane wrote:

Erik Rijkers  writes:

Something is broken in HEAD:


Fixed, thanks for the report!



Indeed, the complicated version of the script runs again as before.

Thank you very much,

Erik Rijkers




Re: [HACKERS] TRAP: FailedAssertion("!(hassrf)", File: "nodeProjectSet.c", Line: 180)

2017-02-02 Thread Tom Lane
Erik Rijkers  writes:
> Something is broken in HEAD:

Fixed, thanks for the report!

regards, tom lane




[HACKERS] possible self-deadlock window after bad ProcessStartupPacket

2017-02-02 Thread Jimmy Yih
Hello,

There may be a very small window for a double exit() self-deadlock
when a forked backend process's ProcessStartupPacket returns and status
is not STATUS_OK.  The process will go into proc_exit, and then a very
well-timed SIGQUIT will call startup_die for another proc_exit.  If timed
correctly, the two exit() calls can deadlock, since exit() is not
re-entrant.  It seems extremely hard to hit the deadlock, but I believe
the opportunity is there.

Using gdb, I was able to create the window and get this stacktrace:

#0  startup_die (postgres_signal_arg=0) at postmaster.c:5090
#1  
#2  proc_exit_prepare (code=0) at ipc.c:158
#3  0x007c4135 in proc_exit (code=0) at ipc.c:102
#4  0x0076b736 in BackendInitialize (port=0x2c13740) at
postmaster.c:4207
#5  0x0076b190 in BackendStartup (port=0x2c13740) at postmaster.c:3979
#6  0x00767ad3 in ServerLoop () at postmaster.c:1722
#7  0x007671df in PostmasterMain (argc=3, argv=0x2bebad0) at
postmaster.c:1330
#8  0x006b5df6 in main (argc=3, argv=0x2bebad0) at main.c:228

I got the stacktrace by doing the following:

   1. gdb attach to postmaster, run set follow-fork-mode child, break
   postmaster.c:4206 (right after ProcessStartupPacket), and continue
   2. In another terminal, open a psql session which should trigger the gdb
   follow
   3. In the gdb session, set status=1 and step into proc_exit()
   4. In another terminal, run kill -s QUIT <PID> to send SIGQUIT to the
   child process, or run pg_ctl stop -M immediate.
   5. In the gdb session, step to process the signal into startup_die and
   run bt

This was discovered while hacking on Greenplum Database (currently based
off of Postgres 8.3) where we recently started encountering the
self-deadlock intermittently in our testing environment.

Here's the pull request discussion:
https://github.com/greenplum-db/gpdb/pull/1662

In that pull request, we fix the issue by checking for proc_exit_inprogress.
Is there a reason why startup_die should not check for proc_exit_inprogress?

In the above pull request, Heikki also mentions that a similar scenario can
happen during palloc() as well... which is similar to what we saw in
Greenplum a couple years back for a deadlock in a malloc() call where we
responded by changing exit() to _exit() in quickdie as a fix.  That could
possibly be applicable to the latest Postgres as well.

Regards,
Jimmy


Re: [HACKERS] Enabling replication connections by default in pg_hba.conf

2017-02-02 Thread Simon Riggs
On 2 February 2017 at 18:48, Peter Eisentraut
 wrote:
> On 2/2/17 8:32 AM, Simon Riggs wrote:
>> I think we should remove the "replication" false database concept in
>> pg_hba.conf altogether and allow any valid pg_hba rule to invoke a
>> replication connection, if one is requested. Roles would still need
>> the REPLICATION capability before this would be allowed. Having both
>> of those things doesn't materially improve security control.
>
> It's weirdly inconsistent now.  You need a "replication" line in
> pg_hba.conf to connect for logical decoding, but you can't restrict that
> to a specific database because the database column in pg_hba.conf is
> occupied by the "replication" key word.

Agreed. Change needed.

> However, you would still want a way to configure a user for logical
> decoding for any database but no physical replication, or vice versa.
> Just getting rid of the replication key word would prevent that, I think.

We currently have REPLICATION to mean "physical replication".

I think if we have another capability for logical replication then we
would be able to easily allow one or the other, or both, so I don't
see a problem here that forces us to keep pg_hba.conf the way it is.

>> It would also be useful to be able to prevent users with REPLICATION
>> capability from connecting as normal users, if they are marked as
>> NOLOGIN.
>
> That sounds useful.
>
> (Superusers not having the replication attribute by default is an
> additional possible annoyance.)

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Time to up bgwriter_lru_maxpages?

2017-02-02 Thread Andres Freund
On 2017-02-02 14:47:53 -0500, Robert Haas wrote:
> I expect that increasing the maximum value of shared_buffers beyond
> what can be stored by an integer could have a noticeable distributed
> performance cost for the entire system.  It might be a pretty small
> cost, but then again maybe not; for example, BufferDesc's buf_id
> member would have to get wider, and probably the freeNext member, too.
> Andres already did unspeakable things to make a BufferDesc fit into
> one cache line for performance reasons, so that wouldn't be great
> news.

Yea, we'd have to get rid of BufferDesc's buf_id - but that's not that
hard, just a bit verbose. You can get the buf_id already with a tiny bit
of pointer math.  I don't think there should be too many other changes,
but I might be wrong.

Andres




Re: [HACKERS] Time to up bgwriter_lru_maxpages?

2017-02-02 Thread Andres Freund
On 2017-02-02 11:41:44 -0800, Jim Nasby wrote:
> On 2/1/17 4:28 PM, Andres Freund wrote:
> > On 2016-11-28 11:40:53 -0800, Jim Nasby wrote:
> > > With current limits, the most bgwriter can do (with 8k pages) is 1000 
> > > pages
> > > * 100 times/sec = 780MB/s. It's not hard to exceed that with modern
> > > hardware. Should we increase the limit on bgwriter_lru_maxpages?
> >
> > FWIW, I think working on replacing bgwriter (e.g. by working on the
> > patch I send with a POC replacement) wholesale is a better approach than
> > spending time increasing limits.
>
> Do you have a link to that? I'm not seeing anything in the archives.

Not at hand, but I can just give you the patches.  These are very likely
to conflict, but it shouldn't be too hard to resolve...

What it basically does is move as much of the clock-sweep to bgwriter,
which keeps a certain number of clean pages available.  There's a
lock-free ringbuffer that backends can just pick pages off.

The approach with the ringbuffer has the advantage that with relatively
small changes we can scale to having multiple bgwriters (needs some
changes in the ringbuf implementation, but not that many).

Andres
From c008aad0445d2d8b1247597940ce09dbd60e Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Fri, 19 Feb 2016 12:07:51 -0800
Subject: [PATCH 1/2] Basic lockless single producer, multiple consumer
 ringbuffer

---
 src/backend/lib/Makefile  |   2 +-
 src/backend/lib/ringbuf.c | 162 ++
 src/include/lib/ringbuf.h |  60 +
 3 files changed, 223 insertions(+), 1 deletion(-)
 create mode 100644 src/backend/lib/ringbuf.c
 create mode 100644 src/include/lib/ringbuf.h

diff --git a/src/backend/lib/Makefile b/src/backend/lib/Makefile
index 2d2ba84fe9..03a4795706 100644
--- a/src/backend/lib/Makefile
+++ b/src/backend/lib/Makefile
@@ -13,6 +13,6 @@ top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
 OBJS = binaryheap.o bipartite_match.o hyperloglog.o ilist.o pairingheap.o \
-   rbtree.o stringinfo.o
+   rbtree.o ringbuf.o stringinfo.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/lib/ringbuf.c b/src/backend/lib/ringbuf.c
new file mode 100644
index 0000000000..62761cc447
--- /dev/null
+++ b/src/backend/lib/ringbuf.c
@@ -0,0 +1,162 @@
+/*-
+ *
+ * ringbuf.c
+ *	  Single writer, multiple reader, lock-free ringbuffer.
+ *
+ * ...
+ *
+ * Copyright (c) 2015, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/backend/lib/ringbuf.c
+ *
+ *-
+ */
+
+#include "postgres.h"
+
+#include "lib/ringbuf.h"
+#include "storage/proc.h"
+
+static inline uint32
+ringbuf_backendid(ringbuf *rb, uint32 pos)
+{
+	return pos & 0xffff0000;
+}
+
+/*
+ * Compute the new offset, slightly complicated by the fact that we only want
+ * to modify the lower 16 bits.
+ */
+static inline uint32
+ringbuf_advance_pos(uint32 pos)
+{
+	return ((pos + 1) & 0x0000ffff) | (pos & 0xffff0000);
+}
+
+uint32
+ringbuf_elements(ringbuf *rb)
+{
+	uint32 read_off = ringbuf_pos(rb, pg_atomic_read_u32(&rb->read_state));
+	uint32 write_off = ringbuf_pos(rb, rb->write_off);
+
+	/* not wrapped around */
+	if (read_off <= write_off)
+	{
+		return write_off - read_off;
+	}
+
+	/* wrapped around */
+	return (rb->size - read_off) + write_off;
+}
+
+size_t
+ringbuf_size(size_t nelems)
+{
+	Assert(nelems <= 0x0000ffff);
+	return sizeof(ringbuf) + sizeof(void *) * nelems;
+}
+
+/*
+ * Memory needs to be externally allocated and be at least
+ * ringbuf_size(nelems) large.
+ */
+ringbuf *
+ringbuf_create(void *target, size_t nelems)
+{
+	ringbuf *rb = (ringbuf *) target;
+
+	Assert(nelems <= 0x0000ffff);
+
+	memset(target, 0, ringbuf_size(nelems));
+
+	rb->size = nelems;
+	pg_atomic_init_u32(&rb->read_state, 0);
+	rb->write_off = 0;
+
+	return rb;
+}
+
+bool
+ringbuf_push(ringbuf *rb, void *data)
+{
+	uint32 read_off = pg_atomic_read_u32(&rb->read_state);
+
+	/*
+	 * Check if full - can be outdated, but that's ok. New readers are just
+	 * going to further consume elements, never cause the buffer to become
+	 * full.
+	 */
+	if (ringbuf_pos(rb, read_off) == ringbuf_pos(rb, rb->write_off + 1))
+	{
+		return false;
+	}
+
+	rb->elements[ringbuf_pos(rb, rb->write_off)] = data;
+
+	/*
+	 * The write adding the data needs to be visible before the corresponding
+	 * increase of write_off is visible.
+	 */
+	pg_write_barrier();
+
+	rb->write_off = ringbuf_advance_pos(rb->write_off);
+
+	return true;
+}
+
+
+bool
+ringbuf_pop(ringbuf *rb, void **data)
+{
+	void *ret;
+	uint32 mybackend = MyProc->backendId;
+
+	Assert((mybackend & 0x0000ffff) == mybackend);
+
+	while (true)
+	{
+		uint32 read_state = pg_atomic_read_u32(&rb->read_state);
+		uint32 read_off = ringbuf_pos(rb, read_state);
+		uint32 old_read_state = read_state;
+
+		/* check if empty - can be outdated, but that's ok */

Re: [HACKERS] Time to up bgwriter_lru_maxpages?

2017-02-02 Thread Robert Haas
On Wed, Feb 1, 2017 at 9:47 PM, Jim Nasby  wrote:
> Before doing that the first thing to look at would be why the limit is
> currently INT_MAX / 2 instead of INT_MAX.

Generally the rationale for GUCs with limits of that sort is that
there is or might be code someplace that multiplies the value by 2 and
expects the result not to overflow.

I expect that increasing the maximum value of shared_buffers beyond
what can be stored by an integer could have a noticeable distributed
performance cost for the entire system.  It might be a pretty small
cost, but then again maybe not; for example, BufferDesc's buf_id
member would have to get wider, and probably the freeNext member, too.
Andres already did unspeakable things to make a BufferDesc fit into
one cache line for performance reasons, so that wouldn't be great
news.

Anyway, I committed the patch posted here.  Or the important line out
of the two, anyway.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Time to up bgwriter_lru_maxpages?

2017-02-02 Thread Jim Nasby

On 2/1/17 4:28 PM, Andres Freund wrote:

On 2016-11-28 11:40:53 -0800, Jim Nasby wrote:

With current limits, the most bgwriter can do (with 8k pages) is 1000 pages
* 100 times/sec = 780MB/s. It's not hard to exceed that with modern
hardware. Should we increase the limit on bgwriter_lru_maxpages?


FWIW, I think working on replacing bgwriter (e.g. by working on the
patch I send with a POC replacement) wholesale is a better approach than
spending time increasing limits.


Do you have a link to that? I'm not seeing anything in the archives.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)




Re: [HACKERS] logical decoding of two-phase transactions

2017-02-02 Thread Robert Haas
On Wed, Feb 1, 2017 at 4:35 PM, Craig Ringer  wrote:
> Right. Per my comments upthread I don't see why we need to add anything more
> to WAL here.
>
> Stas was concerned about what happens in logical decoding if we crash
> between PREPARE TRANSACTION and COMMIT PREPARED. But we'll always go back
> and decode the whole txn again anyway so it doesn't matter.
>
> We can just track it on ReorderBufferTxn when we see it at PREPARE
> TRANSACTION time.

Oh, hmm.  I guess if that's how it works then we don't need it in WAL
after all.  I'm not sure that re-decoding the already-prepared
transaction is a very good plan, but if that's what we're doing anyway
this patch probably shouldn't change it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] logical decoding of two-phase transactions

2017-02-02 Thread Robert Haas
On Wed, Feb 1, 2017 at 2:32 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Also, including the GID in the WAL for each COMMIT/ABORT PREPARED
>> doesn't seem inordinately expensive to me.
>
> I'm confused ... isn't it there already?  If not, how do we handle
> reconstructing 2PC state from WAL at all?

By XID.  See xl_xact_twophase, which gets included in xl_xact_commit
or xl_xact_abort.  The GID has got to be there in the XL_XACT_PREPARE
record, but not when actually committing/rolling back.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pageinspect: Hash index support

2017-02-02 Thread Jesper Pedersen

On 02/02/2017 02:24 PM, Robert Haas wrote:

So, committed.  Wow, I wish every patch had this many reviewers.



Thanks Robert !

Best regards,
 Jesper





Re: [HACKERS] pageinspect: Hash index support

2017-02-02 Thread Robert Haas
On Thu, Feb 2, 2017 at 6:29 AM, Ashutosh Sharma  wrote:
>>
>> +   ptr = (char *) itup + IndexInfoFindDataOffset(itup->t_info);
>> +   Assert(ptr <= uargs->page + BLCKSZ);
>>
>> I think this should be promoted to an ereport(); these functions can
>> accept an arbitrary bytea.
>
> I think we had added the 'ptr' variable initially just to dump the hash
> code in hexadecimal format, but since we have now removed that logic from
> the current code, it is no longer required, and I have therefore removed
> it from the current patch. Below is the code that used it initially. It
> got changed as per your comment - [1]

OK.

> I think it's OK to check hash_bitmap_info() or any other functions
> with different page types at least once.
>
> [1]- 
> https://www.postgresql.org/message-id/CA%2BTgmoZUjrVy52TUU3b_Q5L6jcrt7w5R4qFwMXdeCuKQBmYWqQ%40mail.gmail.com

Sure, I just don't know if we need to test them 4 or 5 times.  But I
think this is good enough for now - it's not like this code is written
in stone once committed.

So, committed.  Wow, I wish every patch had this many reviewers.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Provide list of subscriptions and publications in psql's completion

2017-02-02 Thread Peter Eisentraut
On 2/2/17 12:48 AM, Michael Paquier wrote:
> +#define Query_for_list_of_subscriptions \
> +" SELECT pg_catalog.quote_ident(subname) "\
> +"   FROM pg_catalog.pg_subscription "\
> +"  WHERE substring(pg_catalog.quote_ident(subname),1,%d)='%s'"

This query should also be qualified by current database.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Provide list of subscriptions and publications in psql's completion

2017-02-02 Thread Peter Eisentraut
On 2/2/17 12:48 PM, Fujii Masao wrote:
> +#define Query_for_list_of_subscriptions \
> +" SELECT pg_catalog.quote_ident(subname) "\
> +"   FROM pg_catalog.pg_subscription "\
> +"  WHERE substring(pg_catalog.quote_ident(subname),1,%d)='%s'"
> 
> Since non-superuser is not allowed to access to pg_subscription,
> pg_stat_subscription should be accessed here, instead. Thought?

Arguably, you could leave it like that, assuming it fails cleanly for
nonsuperusers.  Nonsuperusers are not going to be able to run any
commands on subscriptions anyway, so there is little use for it.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Enabling replication connections by default in pg_hba.conf

2017-02-02 Thread Peter Eisentraut
On 2/2/17 8:32 AM, Simon Riggs wrote:
> I think we should remove the "replication" false database concept in
> pg_hba.conf altogether and allow any valid pg_hba rule to invoke a
> replication connection, if one is requested. Roles would still need
> the REPLICATION capability before this would be allowed. Having both
> of those things doesn't materially improve security control.

It's weirdly inconsistent now.  You need a "replication" line in
pg_hba.conf to connect for logical decoding, but you can't restrict that
to a specific database because the database column in pg_hba.conf is
occupied by the "replication" key word.

However, you would still want a way to configure a user for logical
decoding for any database but no physical replication, or vice versa.
Just getting rid of the replication key word would prevent that, I think.

> It would also be useful to be able to prevent users with REPLICATION
> capability from connecting as normal users, if they are marked as
> NOLOGIN.

That sounds useful.

(Superusers not having the replication attribute by default is an
additional possible annoyance.)

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2017-02-02 Thread Nico Williams
On Thu, Feb 02, 2017 at 12:14:10PM -0500, Tom Lane wrote:
> Also, somebody who wants a check like that isn't necessarily going to want
> "no WHERE clause" training wheels.  So you're going to need to think about
> facilities to enable or disable different checks.

WHERE-less-ness should be visible to a statement trigger, which could
then reject the operation if desired.

Nico
-- 




Re: [HACKERS] Provide list of subscriptions and publications in psql's completion

2017-02-02 Thread Fujii Masao
On Thu, Feb 2, 2017 at 2:48 PM, Michael Paquier
 wrote:
> Hi all,
>
> While testing a bit the logical replication facility, I have bumped on
> the fact that psql's completion does not show the list of things
> already created. Attached is a patch.

+#define Query_for_list_of_subscriptions \
+" SELECT pg_catalog.quote_ident(subname) "\
+"   FROM pg_catalog.pg_subscription "\
+"  WHERE substring(pg_catalog.quote_ident(subname),1,%d)='%s'"

Since non-superuser is not allowed to access to pg_subscription,
pg_stat_subscription should be accessed here, instead. Thought?

Regards,

-- 
Fujii Masao




Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2017-02-02 Thread David Fetter
On Thu, Feb 02, 2017 at 12:14:10PM -0500, Tom Lane wrote:
> Alvaro Herrera  writes:
> > Pavel Stehule wrote:
> >> Better to enhance this feature step by step.
> 
> > Agreed -- IMO this is a reasonable first step, except that I would
> > rename the proposed extension so that it doesn't focus solely on
> > the first step.
> 
> Quite.  The patch fails to make up its mind whether it's a trivial
> example meant as a coding demonstration, or something that's going
> to become actually useful.
> 
> In the category of "actually useful", I would put checks like "are
> there unqualified outer references in subqueries".  That's something
> we see complaints about at least once a month, I think, and it's the
> type of error that even seasoned SQL authors can make easily.  But
> the current patch is not extensible in that direction (checking for
> this in post_parse_analyze_hook seems quite impractical).
> 
> Also, somebody who wants a check like that isn't necessarily going
> to want "no WHERE clause" training wheels.  So you're going to need
> to think about facilities to enable or disable different checks.

This is just the discussion I'd hoped for.  I'll draft up a patch in
the next day or two, reflecting what's gone so far.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2017-02-02 Thread Tom Lane
Alvaro Herrera  writes:
> Pavel Stehule wrote:
>> Better to enhance this feature step by step.

> Agreed -- IMO this is a reasonable first step, except that I would
> rename the proposed extension so that it doesn't focus solely on the
> first step.

Quite.  The patch fails to make up its mind whether it's a trivial example
meant as a coding demonstration, or something that's going to become
actually useful.

In the category of "actually useful", I would put checks like "are there
unqualified outer references in subqueries".  That's something we see
complaints about at least once a month, I think, and it's the type of
error that even seasoned SQL authors can make easily.  But the current
patch is not extensible in that direction (checking for this in
post_parse_analyze_hook seems quite impractical).

Also, somebody who wants a check like that isn't necessarily going to want
"no WHERE clause" training wheels.  So you're going to need to think about
facilities to enable or disable different checks.

regards, tom lane




Re: [HACKERS] TRAP: FailedAssertion("!(hassrf)", File: "nodeProjectSet.c", Line: 180)

2017-02-02 Thread Tom Lane
Andres Freund  writes:
> On 2017-02-01 23:27:36 -0500, Tom Lane wrote:
>> I think the appropriate fix is that, once split_pathtarget_at_srfs() has
>> computed a tentative list of SRFs it needs to evaluate, it has to make a
>> second pass to see if any of them match expressions that were assigned to
>> the next level down.  This is pretty annoying, but we'd only have to do it
>> if target_contains_srfs and context.nextlevel_contains_srfs are both true,
>> which will be a negligibly small fraction of queries in practice.

> Hm.  Can't really come up with something better, but I'm kinda tired
> too...

I wrote a patch along that line, and was just about ready to commit it
when I realized that really this is all wrong.  Fixing it this way
handles the case of

regression=# select generate_series(1,3), generate_series(1,3) + 1;
 generate_series | ?column? 
-----------------+----------
               1 |        2
               2 |        3
               3 |        4
(3 rows)

which is what you got before v10, because the two SRFs ran in lockstep
despite being at different expression nesting levels.  However, consider

regression=# select generate_series(1,3), generate_series(2,4) + 1;
 generate_series | ?column? 
-----------------+----------
               1 |        3
               2 |        3
               3 |        3
               1 |        4
               2 |        4
               3 |        4
               1 |        5
               2 |        5
               3 |        5
(9 rows)

That's *not* what you got before:

regression=# select generate_series(1,3), generate_series(2,4) + 1;
 generate_series | ?column? 
-----------------+----------
               1 |        3
               2 |        4
               3 |        5
(3 rows)

Really the problem here is that split_pathtarget_at_srfs is completely
wrong about how to assign SRFs to different levels in a stack of
ProjectSet nodes.  It's doing that according to each SRF's top-down
nesting level, but it needs to do it bottom-up, so that a SRF is evaluated
in the k'th step if there are k-1 nested levels of SRFs in its arguments.

This is doable, but I think the routine will have to be completely
rewritten not just hacked around the edges.  Off to do that ...

regards, tom lane




Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY

2017-02-02 Thread Alvaro Herrera
Pavan Deolasee wrote:

> I can reproduce this entire scenario using gdb sessions. This also explains
> why the patch I sent earlier helps to solve the problem.

Ouch.  Great detective work there.

I think it's quite possible that this bug explains some index errors,
such as primary keys (or unique indexes) that when recreated fail with
duplicate values: the first value inserted would get in via an UPDATE
during the CIC race window, and sometime later the second value is
inserted correctly.  Because the first one is missing from the index,
the second one does not give rise to a dupe error upon insertion, but
they are of course both detected when the index is recreated.

I'm going to study the bug a bit more, and put in some patch before the
upcoming minor tag on Monday.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2017-02-02 Thread Alvaro Herrera
Pavel Stehule wrote:

> Identification of unjoined tables should be very useful - but it is far
> from the original proposal - so it can be solved separately.
> 
> This patch is simple - and usually we prefer more simple patches than one
> bigger.
> 
> Better to enhance this feature step by step.

Agreed -- IMO this is a reasonable first step, except that I would
rename the proposed extension so that it doesn't focus solely on the
first step.  I'd pick a name that suggests that various kinds of checks
are applied to queries, so "require_where" would be only one of various
options that can be enabled.  It will make no sense to enable the
require_where module in order to disallow unjoined tables.  At the same
time, I don't see us providing a dozen of "require_foo" modules.  So
we'd better start making this one extensible from the get go.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-02 Thread Konstantin Knizhnik

Hi Tony,

On 02.02.2017 17:10, REIX, Tony wrote:


Hi Konstantin

I've discussed the "zombie/exit" issue with our expert here.

- He does not think that AIX has anything special here

- If the process is marked  in ps, this is because the flag 
SEXIT is set, thus the process is blocked somewhere in the kexitx() 
syscall, waiting for something.


- In order to know what it is waiting for, the best would be to have a 
look with *kdb*.




kdb shows the following stack:

pvthread+073000 STACK:
[005E1958]slock+000578 (005E1958, 80001032 [??])
[9558].simple_lock+58 ()
[00651DBC]vm_relalias+00019C (??, ??, ??, ??, ??)
[006544AC]vm_map_entry_delete+00074C (??, ??, ??)
[00659C30]vm_map_delete+000150 (??, ??, ??, ??)
[00659D88]vm_map_deallocate+48 (??, ??)
[0011C588]kexitx+001408 (??)
[000BB08C]kexit+8C ()
___ Recovery (FFF9290) ___
WARNING: Eyecatcher/version mismatch in RWA


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2017-02-02 Thread Pavel Stehule
2017-02-02 16:34 GMT+01:00 Bruce Momjian :

> On Thu, Feb  2, 2017 at 07:18:45AM -0800, David Fetter wrote:
> > On Thu, Feb 02, 2017 at 03:16:29PM +, Bruce Momjian wrote:
> > > I just don't see this patch going in.  I think it needs a larger
> > > approach to the problems it is trying to solve.  I think it then
> > > will be very useful.
> >
> > What problems that it's trying to solve are not addressed?
>
> Unjoined tables.  Inconsistent alias references.  I think there are a
> bunch of things and if we can make a list and get a mode to warn about
> all of them, it would be very useful.
>

Identification of unjoined tables should be very useful - but it is far
from the original proposal - so it can be solved separately.

This patch is simple - and usually we prefer more simple patches than one
bigger.

Better to enhance this feature step by step.

Regards

Pavel



>
> --
>   Bruce Momjian  http://momjian.us
>   EnterpriseDB http://enterprisedb.com
>
> + As you are, so once was I.  As I am, so you will be. +
> +  Ancient Roman grave inscription +


Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2017-02-02 Thread David Fetter
On Thu, Feb 02, 2017 at 10:34:43AM -0500, Bruce Momjian wrote:
> On Thu, Feb  2, 2017 at 07:18:45AM -0800, David Fetter wrote:
> > On Thu, Feb 02, 2017 at 03:16:29PM +, Bruce Momjian wrote:
> > > I just don't see this patch going in.  I think it needs a
> > > larger approach to the problems it is trying to solve.  I think
> > > it then will be very useful.
> > 
> > What problems that it's trying to solve are not addressed?
> 
> Unjoined tables.  Inconsistent alias references.  I think there are
> a bunch of things and if we can make a list and get a mode to warn
> about all of them, it would be very useful.

There is always a delicate balance between putting in all that's
required for a minimum viable feature and actually getting something
out there.

As I see it, this feature's main benefit is that it prevents some very
common and at the same time very damaging kinds of harm.  It also, for
now, serves a pedagogical purpose.  It's relatively straight-forward
for someone with little or no PostgreSQL experience to look at it and
see what it does.

I originally sent the feature to cover unsubtle types of blunders, I'd
like to keep "unsubtle blunders" as the guiding principle here.

I can see where an unintentional CROSS JOIN would qualify under that
standard.  I'm not sure I understand what you mean by inconsistent
aliasing.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION

2017-02-02 Thread Fujii Masao
On Thu, Feb 2, 2017 at 2:36 PM, Michael Paquier
 wrote:
> On Thu, Feb 2, 2017 at 2:13 PM, Tom Lane  wrote:
>> Kyotaro HORIGUCHI  writes:
>>> Then, the reason for the TRY-CATCH clause is that I found that
>>> some functions called from there can throw exceptions.
>>
>> Yes, but all LWLocks should be released by normal error recovery.
>> It should not be necessary for this code to clean that up by hand.
>> If it were necessary, there would be TRY-CATCH around every single
>> LWLockAcquire in the backend, and we'd have an unreadable and
>> unmaintainable system.  Please don't add a TRY-CATCH unless it's
>> *necessary* -- and you haven't explained why this one is.

Yes.

> Putting hands into the code and at the problem, I can see that
> dropping a subscription on a node makes it unresponsive in case of a
> stop. And that's just because calls to LWLockRelease are missing as in
> the patch attached. A try/catch block should not be necessary.

Thanks for the patch!

With the patch, LogicalRepLauncherLock is released at the end of
DropSubscription(). But ISTM that the lock should be released just after
logicalrep_worker_stop() and there is no need to protect the removal of
replication slot with the lock.

/*
* If we found worker but it does not have proc set it is starting up,
* wait for it to finish and then kill it.
*/
while (worker && !worker->proc)
{

ISTM that the above loop in logicalrep_worker_stop() is not necessary
because LogicalRepLauncherLock ensures that the above condition is
always false. Thought? Am I missing something?

If the above condition is true, which means that there is the worker slot
having the "subid" of the worker to kill, but its "proc" has not been set yet.
Without LogicalRepLauncherLock, this situation can happen after "subid"
is set by the launcher and before "proc" is set by the worker. But
LogicalRepLauncherLock protects those operations, so logicalrep_worker_stop()
called while holding the lock should always think the above condition is false.

Regards,

-- 
Fujii Masao




Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-02 Thread Konstantin Knizhnik

On 02.02.2017 18:20, REIX, Tony wrote:


Hi Konstantin

I have an issue with pgbench. Any idea ?




pgbench's -s option specifies the scale factor.
Scale 1000 corresponds to 100 million pgbench_accounts rows and requires about 16GB on disk.


# mkdir /tmp/PGS
 # chown pgstbf.staff /tmp/PGS

 # su pgstbf

 $ /opt/freeware/bin/initdb -D /tmp/PGS
 The files belonging to this database system will be owned by user 
"pgstbf".

 This user must also own the server process.

 The database cluster will be initialized with locale "C".
 The default database encoding has accordingly been set to "SQL_ASCII".
 The default text search configuration will be set to "english".

 Data page checksums are disabled.

 fixing permissions on existing directory /tmp/PGS ... ok
 creating subdirectories ... ok
 selecting default max_connections ... 100
 selecting default shared_buffers ... 128MB
 selecting dynamic shared memory implementation ... posix
 creating configuration files ... ok
 running bootstrap script ... ok
 performing post-bootstrap initialization ... ok
 syncing data to disk ... ok

 WARNING: enabling "trust" authentication for local connections
 You can change this by editing pg_hba.conf or using the option -A, or 
--auth-local and --auth-host, the next time you run initdb.


 Success. You can now start the database server using:


 $ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile start
 server starting

 $ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile status
  pg_ctl: server is running (PID: 11599920)
 /opt/freeware/bin/postgres_64 "-D" "/tmp/PGS"


 $ /usr/bin/createdb pgstbf
 $


 $ pgbench -i -s 1000
 creating tables...
 10 of 1 tuples (0%) done (elapsed 0.29 s, remaining 288.09 s)
 ...
 1 of 1 tuples (100%) done (elapsed 42.60 s, remaining 
0.00 s)
 ERROR:  could not extend file "base/16384/24614": wrote only 7680 of
 8192 bytes at block 131071
 HINT:  Check free disk space.
 CONTEXT:  COPY pgbench_accounts, line 7995584
 PQendcopy failed


After cleaning all /tmp/PGS and symlinking it to /home, where I have 
6GB free, I've retried and I got nearly the same:



 1 of 1 tuples (100%) done (elapsed 204.65 s, 
remaining 0.00 s)
 ERROR:  could not extend file "base/16384/16397.6": No space left on
device

 HINT:  Check free disk space.
 CONTEXT:  COPY pgbench_accounts, line 51235802
PQendcopy failed


Do I need more than 6GB ???


Thanks

Tony


$ df -k .
Filesystem1024-blocks  Free %UsedIused %Iused Mounted on
/dev/hd1 45088768   6719484   86%   94601639% /home

bash-4.3$ pwd
/tmp/PGS

bash-4.3$ ll /tmp/PGS
lrwxrwxrwx1 root system   10 Feb  2 08:43 /tmp/PGS -> 
/home/PGS/



$ df -k
Filesystem1024-blocks  Free %UsedIused %Iused Mounted on
/dev/hd4   524288277284   48%1073314% /
/dev/hd2  6684672148896   98%4930348% /usr
/dev/hd9var   2097152314696   85%2493418% /var
/dev/hd3  3145728   2527532   20%  418 1% /tmp
/dev/hd1 45088768   6719484   86%   94601639% /home
/dev/hd11admin  1310721306921%7 1% /admin
/proc   - -- - - /proc
/dev/hd10opt 65273856829500   99%   93833941% /opt
/dev/livedump  2621442617761%4 1% 
/var/adm/ras/livedump

/aha- --18 1% /aha

$ cat logfile
LOG:  database system was shut down at 2017-02-02 09:08:31 CST
LOG:  MultiXact member wraparound protections are now enabled
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
ERROR:  could not extend file "base/16384/16397.6": No space left on 
device

HINT:  Check free disk space.
CONTEXT:  COPY pgbench_accounts, line 51235802
STATEMENT:  copy pgbench_accounts from stdin



$ ulimit -a
core file size  (blocks, -c) 1048575
data seg size   (kbytes, -d) 131072
file size   (blocks, -f) unlimited
max memory size (kbytes, -m) 32768
open files  (-n) 2000
pipe size(512 bytes, -p) 64
stack size  (kbytes, -s) 32768
cpu time   (seconds, -t) unlimited
max user processes  (-u) unlimited
virtual memory  (kbytes, -v) unlimited


bash-4.3$ ll /tmp/PGS
lrwxrwxrwx1 root system   10 Feb  2 08:43 /tmp/PGS -> 
/home/PGS/

bash-4.3$ ls -l
total 120
-rw---1 pgstbf   staff 4 Feb  2 09:08 PG_VERSION
drwx--6 pgstbf   staff   256 Feb  2 09:09 base
drwx--2 pgstbf   staff  4096 Feb  2 09:09 global
-rw---1 pgstbf   staff   410 Feb  2 09:13 logfile
drwx--2 pgstbf   staff   256 Feb  2 09:08 pg_clog
drwx--2 pgstbf   staff   256 Feb  2 09:08 pg_commit_ts
drwx--2 pgstbf   staff   256 Feb  2 09:08 pg_dynshmem
-rw---1 pgstbf   staff  4462 Feb  2 09:08 

[HACKERS] Some patch triage from Brussels

2017-02-02 Thread Magnus Hagander
The developer meeting in Brussels ran out of topics :), so we have spent
some time going through the patches on the upcoming CF. We did manage to
assign a couple of patches to people (and we also discussed but didn't
actually assign all the patches that Robert has his hands in :P).

And we found one patch that was already committed but not flagged as such :)


We did not really have quorum to make any hard decisions about patches, so
we didn't do that.

However, we did come up with a list of patches that this group thinks we
should push back to PostgreSQL 11 at this point, in order to make it easier
to focus on the remaining ones for PostgreSQL 10 and not end up with too
many things that are half-finished. This was based on the general feeling
of the people here who had seen it, as well as when the patches were
submitted.

Our suggestion is we push the following patches back. We haven't done it,
because we figured we don't have the mandate to do the push, but we suggest
this is done fairly quickly.

* Protect syscache from bloating with negative cache entries
* pg_stat_statements with plans
* "Causal reads" mode for load balancing reads without stale data
* Vertical Clustered Index

This is not a judgement on if we think we *want* the features -- we
actually want all of them. We just don't think they're going to be ready in
time for 10.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2017-02-02 Thread Bruce Momjian
On Thu, Feb  2, 2017 at 07:18:45AM -0800, David Fetter wrote:
> On Thu, Feb 02, 2017 at 03:16:29PM +, Bruce Momjian wrote:
> > I just don't see this patch going in.  I think it needs a larger
> > approach to the problems it is trying to solve.  I think it then
> > will be very useful.
> 
> What problems that it's trying to solve are not addressed?

Unjoined tables.  Inconsistent alias references.  I think there are a
bunch of things and if we can make a list and get a mode to warn about
all of them, it would be very useful.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2017-02-02 Thread Bruce Momjian
I just don't see this patch going in.  I think it needs a larger approach to
the problems it is trying to solve.  I think it then will be very useful.


Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-02 Thread REIX, Tony
Hi Konstantin

I have an issue with pgbench. Any idea ?


  # mkdir /tmp/PGS
 # chown pgstbf.staff /tmp/PGS

 # su pgstbf

 $ /opt/freeware/bin/initdb -D /tmp/PGS
 The files belonging to this database system will be owned by user "pgstbf".
 This user must also own the server process.

 The database cluster will be initialized with locale "C".
 The default database encoding has accordingly been set to "SQL_ASCII".
 The default text search configuration will be set to "english".

 Data page checksums are disabled.

 fixing permissions on existing directory /tmp/PGS ... ok
 creating subdirectories ... ok
 selecting default max_connections ... 100
 selecting default shared_buffers ... 128MB
 selecting dynamic shared memory implementation ... posix
 creating configuration files ... ok
 running bootstrap script ... ok
 performing post-bootstrap initialization ... ok
 syncing data to disk ... ok

 WARNING: enabling "trust" authentication for local connections
 You can change this by editing pg_hba.conf or using the option -A, or 
--auth-local and --auth-host, the next time you run initdb.

 Success. You can now start the database server using:


 $ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile start
 server starting


 $ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile status
  pg_ctl: server is running (PID: 11599920)
 /opt/freeware/bin/postgres_64 "-D" "/tmp/PGS"

 $ /usr/bin/createdb pgstbf
 $


 $ pgbench -i -s 1000
 creating tables...
 10 of 1 tuples (0%) done (elapsed 0.29 s, remaining 288.09 s)
 ...
 1 of 1 tuples (100%) done (elapsed 42.60 s, remaining 0.00 s)
 ERROR:  could not extend file "base/16384/24614": wrote only 7680 of 8192 
bytes at block 131071
 HINT:  Check free disk space.
 CONTEXT:  COPY pgbench_accounts, line 7995584
 PQendcopy failed


After cleaning all /tmp/PGS and symlinking it to /home, where I have 6GB free, 
I've retried and I got nearly the same:


 1 of 1 tuples (100%) done (elapsed 204.65 s, remaining 0.00 s)
 ERROR:  could not extend file "base/16384/16397.6": No space left on device
 HINT:  Check free disk space.
 CONTEXT:  COPY pgbench_accounts, line 51235802
PQendcopy failed


Do I need more than 6GB ???


Thanks

Tony


$ df -k .
Filesystem1024-blocks  Free %UsedIused %Iused Mounted on
/dev/hd1 45088768   6719484   86%   94601639% /home

bash-4.3$ pwd
/tmp/PGS

bash-4.3$ ll /tmp/PGS
lrwxrwxrwx1 root system   10 Feb  2 08:43 /tmp/PGS -> /home/PGS/



$ df -k
Filesystem1024-blocks  Free %UsedIused %Iused Mounted on
/dev/hd4   524288277284   48%1073314% /
/dev/hd2  6684672148896   98%4930348% /usr
/dev/hd9var   2097152314696   85%2493418% /var
/dev/hd3  3145728   2527532   20%  418 1% /tmp
/dev/hd1 45088768   6719484   86%   94601639% /home
/dev/hd11admin  1310721306921%7 1% /admin
/proc   - -- - -  /proc
/dev/hd10opt 65273856829500   99%   93833941% /opt
/dev/livedump  2621442617761%4 1% /var/adm/ras/livedump
/aha- --18 1% /aha


$ cat logfile
LOG:  database system was shut down at 2017-02-02 09:08:31 CST
LOG:  MultiXact member wraparound protections are now enabled
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
ERROR:  could not extend file "base/16384/16397.6": No space left on device
HINT:  Check free disk space.
CONTEXT:  COPY pgbench_accounts, line 51235802
STATEMENT:  copy pgbench_accounts from stdin



$ ulimit -a
core file size  (blocks, -c) 1048575
data seg size   (kbytes, -d) 131072
file size   (blocks, -f) unlimited
max memory size (kbytes, -m) 32768
open files  (-n) 2000
pipe size(512 bytes, -p) 64
stack size  (kbytes, -s) 32768
cpu time   (seconds, -t) unlimited
max user processes  (-u) unlimited
virtual memory  (kbytes, -v) unlimited


bash-4.3$ ll /tmp/PGS
lrwxrwxrwx1 root system   10 Feb  2 08:43 /tmp/PGS -> /home/PGS/
bash-4.3$ ls -l
total 120
-rw---1 pgstbf   staff 4 Feb  2 09:08 PG_VERSION
drwx--6 pgstbf   staff   256 Feb  2 09:09 base
drwx--2 pgstbf   staff  4096 Feb  2 09:09 global
-rw---1 pgstbf   staff   410 Feb  2 09:13 logfile
drwx--2 pgstbf   staff   256 Feb  2 09:08 pg_clog
drwx--2 pgstbf   staff   256 Feb  2 09:08 pg_commit_ts
drwx--2 pgstbf   staff   256 Feb  2 09:08 pg_dynshmem
-rw---1 pgstbf   staff  4462 Feb  2 09:08 pg_hba.conf
-rw---1 pgstbf   staff  1636 Feb  2 09:08 pg_ident.conf
drwx--4 pgstbf   staff   256 Feb  2 09:08 pg_logical
drwx--4 pgstbf   staff   256 Feb 

Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE

2017-02-02 Thread David Fetter
On Thu, Feb 02, 2017 at 03:16:29PM +, Bruce Momjian wrote:
> I just don't see this patch going in.  I think it needs a larger
> approach to the problems it is trying to solve.  I think it then
> will be very useful.

What problems that it's trying to solve are not addressed?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY

2017-02-02 Thread Pavan Deolasee
On Mon, Jan 30, 2017 at 7:20 PM, Pavan Deolasee 
wrote:

>
> Based on my investigation so far and the evidence that I've collected,
> what probably happens is that after a relcache invalidation arrives at the
> new transaction, it recomputes the rd_indexattr but based on the old,
> cached rd_indexlist. So the rd_indexattr does not include the new columns.
> Later rd_indexlist is updated, but the rd_indexattr remains what it was.
>
>
I was kinda puzzled this report did not get any attention, though it's
clearly a data corruption issue. Anyways, now I know why this is happening
and can reproduce this very easily via debugger and two sessions.

The offending code is RelationGetIndexAttrBitmap(). Here is the sequence of
events:

Taking the same table as in the report:

CREATE TABLE cic_tab_accounts (
aid bigint UNIQUE,
abalance bigint,
aid1 bigint
);

1. Session S1 calls DROP INDEX pgb_a_aid1 and completes
2. Session S2 is in process of rebuilding its rd_indexattr bitmap (because
of previous relcache invalidation caused by DROP INDEX). To be precise,
assume that the session has finished rebuilding its index list. Since DROP
INDEX has just happened,
4793 indexoidlist = RelationGetIndexList(relation);

3. S1 then starts CREATE INDEX CONCURRENTLY pgb_a_aid1 ON
cic_tab_accounts(aid1)
4. S1 creates catalog entry for the new index, commits phase 1 transaction
and builds MVCC snapshot for the second phase and also finishes the initial
index build
5. S2 now proceeds. Now notice that S2 had computed the index list based on
the old view i.e. just one index.

The following comments in relcache.c are now significant:

4799 /*
4800  * Copy the rd_pkindex and rd_replidindex value computed by
4801  * RelationGetIndexList before proceeding.  This is needed because a
4802  * relcache flush could occur inside index_open below, resetting the
4803  * fields managed by RelationGetIndexList.  (The values we're computing
4804  * will still be valid, assuming that caller has a sufficient lock on
4805  * the relation.)
4806  */

That's what precisely goes wrong.

6. When index_open is called, relcache invalidation generated by the first
transaction commit in CIC gets accepted and processed. As part of this
invalidation, rd_indexvalid is set to 0, which invalidates rd_indexlist too
7. But S2 is currently using the index list obtained at step 2 above (which has
only the old index). It goes ahead and computes rd_indexattr based on just the
old index.
8. While relcache invalidation processing at step 6 cleared rd_indexattr
(along with rd_indexvalid), it is again set at
4906 relation->rd_indexattr = bms_copy(indexattrs);

But what gets set at step 8 is based on the old view. There is no way
rd_indexattr will be invalidated until S2 receives another relcache
invalidation. This will happen when the CIC proceeds. But until then, every
update done by S2 will generate broken HOT chains, leading to data
corruption.

I can reproduce this entire scenario using gdb sessions. This also explains
why the patch I sent earlier helps to solve the problem.
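The race can be sketched with a toy cache model (illustrative Python only, with invented names; this is not PostgreSQL code): the attribute set is computed from an index list snapshotted before the invalidation is processed, so the result cached at step 8 is already stale:

```python
# Columns covered by each index in the example (hypothetical index names)
COLUMNS = {"cic_tab_accounts_aid_key": ["aid"], "pgb_a_aid1": ["aid1"]}

class RelCacheEntry:
    """Toy stand-in for one backend's relcache entry."""
    def __init__(self, index_list):
        self.index_list = list(index_list)  # indexes this backend knows about
        self.indexattr = None               # cached set of indexed columns

def accept_invalidation(rel, new_index_list):
    """Models processing a relcache invalidation message: caches are reset."""
    rel.index_list = list(new_index_list)
    rel.indexattr = None

def get_index_attr_bitmap(rel, invalidation=None):
    """Models the problematic sequence in RelationGetIndexAttrBitmap()."""
    indexes = list(rel.index_list)   # step 2: snapshot the (old) index list
    if invalidation is not None:     # step 6: invalidation gets processed
        accept_invalidation(rel, invalidation)   # inside index_open()
    # step 8: the bitmap is computed from the stale snapshot, then cached
    rel.indexattr = {col for idx in indexes for col in COLUMNS[idx]}
    return rel.indexattr

rel = RelCacheEntry(["cic_tab_accounts_aid_key"])   # S2, before CIC commits
attrs = get_index_attr_bitmap(
    rel, invalidation=["cic_tab_accounts_aid_key", "pgb_a_aid1"])
print(attrs)   # {'aid'} -- aid1 is missing, so HOT-safety checks go wrong
```

Until another invalidation arrives, every update that S2 decides using this stale attribute set can produce a broken HOT chain.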

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Non-deterministic behavior with floating point in parallel mode

2017-02-02 Thread Tom Lane
Ruben Buchatskiy  writes:
> We have found that in parallel mode result of queries is non-deterministic
> when the types of the attributes in table are double precision
> (floating-point).

Yeah ...

> That is because floating-point addition is not necessarily associative.

Right, exactly.

> Is this desirable behavior?

It's not especially the fault of parallelism.  Any change in the order in
which the SUM visits the rows could cause a similar change in the results.
IOW, you are being overoptimistic about how deterministic this result
is any of the time.

Use numeric, not float, if you can't tolerate this sort of behavior.

regards, tom lane




Re: [HACKERS] Enabling replication connections by default in pg_hba.conf

2017-02-02 Thread Petr Jelinek
On 02/02/17 14:32, Simon Riggs wrote:
> On 23 January 2017 at 04:29, Michael Paquier  
> wrote:
>> Hi all,
>>
>> As now wal_level = replica has become the default for Postgres 10,
>> could we consider as well making replication connections enabled by
>> default in pg_hba.conf?
> 
> Agreed
> 
>> This requires just uncommenting a couple of
>> lines in pg_hba.conf.sample.
> 
> I don't think that is the right way to do this. Changing the default
> doesn't reduce the complexity.
> 
> I think we should remove the "replication" false database concept in
> pg_hba.conf altogether and allow any valid pg_hba rule to invoke a
> replication connection, if one is requested. Roles would still need
> the REPLICATION capability before this would be allowed. Having both
> of those things doesn't materially improve security control.
> 

+1

> It would also be useful to be able to prevent users with REPLICATION
> capability from connecting as normal users, if they are marked as
> NOLOGIN.
> 

+1

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-02 Thread REIX, Tony
Hi Konstantin

I've discussed the "zombie/exit" issue with our expert here.

- He does not think that AIX has anything special here

- If the process is marked <exiting> in ps, this is because the flag SEXIT is 
set, thus the process is blocked somewhere in the kexitx() syscall, waiting for 
something.

- In order to know what it is waiting for, the best would be to have a look 
with kdb.

- either it is waiting for an asynchronous I/O to end, or a thread to end if 
the process is multi-thread

- Using the proctree command for analyzing the issue is not a good idea, since 
the process will block in kexitx() if there is an operation on /proc being done

- If the process is marked <defunct>, that means that the process has not 
called waitpid() yet for getting the child's status. Maybe the parent is blocked 
in non-interruptible code where the signal handler cannot be called.

- In short, that may be due to many causes... Use kdb is the best way.

- Instead of proctree (which makes use of /proc), use: "ps -faT ".


I'll try to reproduce here.

Regards

Tony

Le 01/02/2017 à 21:26, Konstantin Knizhnik a écrit :
On 02/01/2017 08:30 PM, REIX, Tony wrote:



About the zombie issue, I've discussed with my colleagues. Looks like the 
process stays a zombie until the parent looks at its status. However, though I did 
that several times, I do not remember well the details. And that should not be 
specific to AIX. I'll discuss with another colleague, tomorrow, who should 
understand this better than me.

1. Process is not in zombie state (according to ps). It is in <exiting> state... 
It is something AIX-specific; I have not seen processes in this state on Linux.
2. I have implemented a simple test - a forkbomb. It creates 1000 children and then 
waits for them. It is about ten times slower than on Intel/Linux, but still much 
faster than 100 seconds. So there is some difference between a postgres backend 
and a dummy process doing nothing - just immediately terminating after returning 
from fork()


Regards,

Tony

Le 01/02/2017 à 16:59, Konstantin Knizhnik a écrit :
Hi Tony,

On 01.02.2017 18:42, REIX, Tony wrote:

Hi Konstantin

XLC.

I'm on AIX 7.1 for now.

I'm using this version of XLC v13:

# xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0003

With this version, I have (at least, since I tested with "check" and not 
"check-world" at that time) 2 failing tests: create_aggregate , aggregates .


With the following XLC v12 version, I have NO test failure:

# /usr/vac/bin/xlc -qversion
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01..0016


So maybe you are not using XLC v13.1.3.3, rather another sub-version. Unless 
you are using more options for the configure ?


Configure.

What are the options that you give to the configure ?


export CC="/opt/IBM/xlc/13.1.3/bin/xlc"
export CFLAGS="-qarch=pwr8 -qtune=pwr8 -O2 -qalign=natural -q64 "
export LDFLAGS="-Wl,-bbigtoc,-b64"
export AR="/usr/bin/ar -X64"
export LD="/usr/bin/ld -b64 "
export NM="/usr/bin/nm -X64"
./configure --prefix="/opt/postgresql/xlc-debug/9.6"



Hard load & 64 cores ? OK. That clearly explains why I do not see this issue.


pgbench ? I wanted to run it. However, I'm still looking where to get it plus a 
guide for using it for testing.

pgbench is part of the Postgres distribution (src/bin/pgbench)



I would add such tests when building my PostgreSQL RPMs on AIX. So any help is 
welcome !


Performance.

- Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. Any 
idea how I should proceed ? Any PostgreSQL performance benchmark that I could 
find and use ? pgbench ?

pgbench is most widely used tool simulating OLTP workload. Certainly it is 
quite primitive and its results are rather artificial. TPC-C seems to be better 
choice.
But the best case is to implement your own benchmark simulating actual workload 
of your real application.


- I'm interested in any information for improving the performance & quality of 
my PostgreSQL RPMs on AIX. (As I already said, BullFreeware RPMs for AIX are 
free and can be used by anyone, like Perzl RPMs are. My company (ATOS/Bull) 
sells IBM Power machines under the Escala brand since ages (25 years this 
year)).


How to help ?

How could I help for improving the quality and performance of PostgreSQL on AIX 
?

We still have one open issue at AIX: see 
https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg303094.html
It will be great if you can somehow help to fix this problem.




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Enabling replication connections by default in pg_hba.conf

2017-02-02 Thread Simon Riggs
On 23 January 2017 at 04:29, Michael Paquier  wrote:
> Hi all,
>
> As now wal_level = replica has become the default for Postgres 10,
> could we consider as well making replication connections enabled by
> default in pg_hba.conf?

Agreed

> This requires just uncommenting a couple of
> lines in pg_hba.conf.sample.

I don't think that is the right way to do this. Changing the default
doesn't reduce the complexity.

I think we should remove the "replication" false database concept in
pg_hba.conf altogether and allow any valid pg_hba rule to invoke a
replication connection, if one is requested. Roles would still need
the REPLICATION capability before this would be allowed. Having both
of those things doesn't materially improve security control.

It would also be useful to be able prevent users with REPLICATION
capability from connecting as normal users, if the are marked as
NOLOGIN.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2017-02-02 Thread Alvaro Herrera
Pavan Deolasee wrote:

> Do you think we should apply the patch to remove ItemPointerCopy()? I will
> rework the HeapTupleHeaderGetNextTid() after that. Not that it depends on
> removing ItemPointerCopy(), but decided to postpone it until we make a call
> on that patch.

My inclination is not to.  We don't really know where we are going with
storage layer reworks in the near future, and we might end up changing
this in other ways.  We might find ourselves needing this kind of
abstraction again.  I don't think this means we need to follow it
completely in new code, since it's already broken in other places, but
let's not destroy it completely just yet.

> BTW I've run now long stress tests with the patch applied and see no new
> issues, even when indexes are dropped and recreated concurrently (includes
> my patch to fix CIC bug in the master though). In another 24 hour test,
> WARM could do 274M transactions where as master did 164M transactions. I
> did not drop and recreate indexes during this run.

Eh, that's a 67% performance improvement.  Nice.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] patch: optimize information_schema.constraint_column_usage

2017-02-02 Thread Alexey Bashtanov

Hello hackers,

The view information_schema.constraint_column_usage becomes slow when 
the number of columns and constraints rises to substantial values.
This is because of a join condition that the planner can only enforce 
as a join filter. The patch optimizes it.
See the attached many_constraints.sql file for a performance test: it creates 
3000 tables with 10 columns and a PK each, then selects * from the view.
The last statement runs for 22 seconds on the master branch versus 
34 milliseconds optimized, on my laptop.


Best Regards,
  Alexey Bashtanov


many-constraints.sql
diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index 00550eb..ffb1564 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -801,8 +801,8 @@ CREATE VIEW constraint_column_usage AS
   WHERE nr.oid = r.relnamespace
 AND r.oid = a.attrelid
 AND nc.oid = c.connamespace
-AND (CASE WHEN c.contype = 'f' THEN r.oid = c.confrelid AND a.attnum = ANY (c.confkey)
-  ELSE r.oid = c.conrelid AND a.attnum = ANY (c.conkey) END)
+AND r.oid = CASE c.contype WHEN 'f' THEN c.confrelid ELSE c.conrelid END
+AND a.attnum = ANY (CASE c.contype WHEN 'f' THEN c.confkey ELSE c.conkey END)
 AND NOT a.attisdropped
 AND c.contype IN ('p', 'u', 'f')
 AND r.relkind = 'r'



[HACKERS] Non-deterministic behavior with floating point in parallel mode

2017-02-02 Thread Ruben Buchatskiy
Hi hackers,

We have found that in parallel mode the result of queries is non-deterministic
when the types of the attributes in a table are double precision
(floating-point).

Our example is based on TPC-H, but some NUMERIC column types were changed to
DOUBLE PRECISION.

When running without parallelism

tpch=# set max_parallel_workers_per_gather to 0;
SET
tpch=# select sum(l_extendedprice) from lineitem where l_shipdate <= date
'1998-12-01' - interval '116 days';
   sum
--
 448157055361.319
(1 row)

output is always the same.

But in parallel mode

tpch=# set max_parallel_workers_per_gather to 1;
SET
tpch=# select sum(l_extendedprice) from lineitem where l_shipdate <= date
'1998-12-01' - interval '116 days';
   sum
--
 448157055361.341
(1 row)

tpch=# select sum(l_extendedprice) from lineitem where l_shipdate <= date
'1998-12-01' - interval '116 days';
   sum
-
 448157055361.348
(1 row)

result differs between runs.

That is because floating-point addition is not necessarily associative.
That is, (a + b) + c is not necessarily equal to a + (b + c).
In parallel mode, the order in which the attribute values are added
(summed) changes between runs, which leads to non-deterministic results.
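A minimal illustration of that non-associativity (Python floats are IEEE 754 doubles, the same representation as PostgreSQL's double precision):

```python
import math

a, b, c = 1e16, 1.0, 1.0

left = (a + b) + c    # 1e16 + 1.0 rounds back to 1e16, so both 1s are lost
right = a + (b + c)   # 1e16 + 2.0 is exactly representable

print(left == right)  # False
print(right - left)   # 2.0

# math.fsum tracks the lost low-order bits, so it is order-independent
print(math.fsum([a, b, c]) == math.fsum([c, b, a]))  # True
```

The same effect applies to SUM over double precision: each worker accumulates a partial sum, and the per-run assignment of rows to workers changes which low-order bits are lost.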

Is this desirable behavior?

-- 

Best Regards,
Ruben.
ISP RAS.


Re: [HACKERS] Enabling replication connections by default in pg_hba.conf

2017-02-02 Thread Magnus Hagander
On Thu, Feb 2, 2017 at 6:08 AM, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 1/22/17 10:29 PM, Michael Paquier wrote:
> > As now wal_level = replica has become the default for Postgres 10,
> > could we consider as well making replication connections enabled by
> > default in pg_hba.conf? This requires just uncommenting a couple of
> > lines in pg_hba.conf.sample.
>
> Yes, I think this makes sense, as one of the reason for these changes is
> to simplify the use of pg_basebackup.
>

+1.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)

2017-02-02 Thread Fabien COELHO


Hello,


What do I need to do to get TAP tests running?


I misunderstood. You need to configure with "--enable-tap-tests".


There is a spurious empty line added at the very end of "mainloop.h":

  +
   #endif   /* MAINLOOP_H */


Not in my diff, but that's been coming and going in your diff reviews.


Strange. Maybe this is linked to the warning displayed with "git apply" 
when I apply the diff.



I would suggest to add a short one line comment before each test to
explain what is being tested, like "-- test \elif execution", "-- test
\else execution"...


Where are you suggesting this?


In "regress/sql/psql.sql", in front of each group which starts a test.


Debatable suggestion about "psql_branch_empty":


The name isn't great. Maybe psql_branch_stack_empty()?


Yep, maybe, or "empty_stack" or "stack_is_empty" or IDK...


"psql_branch_end_state": it is a pop, it could be named "psql_branch_pop"

Yeah, we either need to go fully with telling the programmer that it's a
stack (push/pop/empty) or (begin_branch/end_branch/not_branching). I'm
inclined to go full-stack, as it were.


Anything consistent is ok. I'm fine with calling a stack a stack:-)

--
Fabien.




Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY

2017-02-02 Thread Amit Kapila
On Mon, Jan 30, 2017 at 7:20 PM, Pavan Deolasee
 wrote:
> Hello All,
>
> While stress testing WARM, I encountered a data consistency problem. It
> happens when index is built concurrently. What I thought to be a WARM
> induced bug and it took me significant amount of investigation to finally
> conclude that it's a bug in the master. In fact, we tested all the way up to
> 9.2 release and the bug exists in all releases.
>
> In my test set up, I keep a UNIQUE index on an "aid" column and then have
> many other "aid[1-4]" columns, on which indexes are built and dropped. The
> script then updates a randomly selected row using any of the aid[1-4]
> columns. Even these columns are updated during the tests to force WARM
> updates, but within a narrow range of non-overlapping values. So we can
> always find the unique row using some kind of range condition. The
> reproduction case is much simpler and I'll explain that below.
>
> In the reproduction scenario (attached), it happens fairly quickly, may be
> after 3-10 rounds of DROP/CREATE. You may need to adjust the -j value if it
> does not happen for you even after a few rounds. Just for the record, I
> haven't checked past reports to correlate if the bug explains any of the
> index corruption issues we might have received.
>
> To give a short summary of how CIC works with HOT, it consists of three
> phases:
> 1. Create index entry in the catalog, mark it not ready for inserts, commit
> the transaction, start new transaction and then wait for all potential lock
> holders on the table to end
> 2. Take a MVCC snapshot, build the index, mark index ready for inserts,
> commit and then once again wait for potential lock holders on the table to
> end
> 3. Do a validation phase where we look at MVCC-visible tuples and add
> missing entries to the index. We only look at the root line pointer of each
> tuple while deciding whether a particular tuple already exists in the index
> or not. IOW we look at HOT chains and not tuples themselves.
>
> The interesting case is phase 1 and 2. The phase 2 works on the premise that
> every transaction started after we finished waiting for all existing
> transactions will definitely see the new index. All these new transactions
> then look at the columns used by the new index and consider them while
> deciding HOT-safety of updates. Now for reasons which I don't fully
> understand, there exists a path where a transaction starts after CIC waited
> and took its snapshot at the start of the second phase, but still fails to
> see the new index. Or may be it sees the index, but fails to update its
> rd_indexattr list to take into account the columns of the new index. That
> leads to wrong decisions and we end up with a broken HOT chain, something
> the CIC algorithm is not prepared to handle. This in turn leads the new
> index missing entries for the update tuples i.e. the index may have aid1 =
> 10, but the value in the heap could be aid1 = 11.
>
> Based on my investigation so far and the evidence that I've collected, what
> probably happens is that after a relcache invalidation arrives at the new
> transaction, it recomputes the rd_indexattr but based on the old, cached
> rd_indexlist. So the rd_indexattr does not include the new columns. Later
> rd_indexlist is updated, but the rd_indexattr remains what it was.
>
> There is definitely an evidence of this sequence of events (collected after
> sprinkling code with elog messages), but it's not yet clear to me why it
> affects only a small percentage of transactions. One possibility is that it
> probably depends on at what stage an invalidation is received and processed
> by the backend. I find it a bit confusing that even though rd_indexattr is
> tightly coupled to the rd_indexlist, we don't invalidate or recompute them
> together. Shouldn't we do that?
>

During invalidation processing, we do destroy them together in
function RelationDestroyRelation().  So ideally, after invalidation
processing, both should be built again, but not sure why that is not
happening in this case.

> For example, in the attached patch we
> invalidate rd_indexattr whenever we recompute rd_indexlist (i.e. if we see
> that rd_indexvalid is 0).

  /*
+ * If the index list was invalidated, we better also invalidate the index
+ * attribute list (which should automatically invalidate other attributes
+ * such as primary key and replica identity)
+ */

+ relation->rd_indexattr = NULL;
+

I think setting directly to NULL will leak the memory.

> This patch fixes the problem for me (I've only
> tried master), but I am not sure if this is a correct fix.
>

I think it is difficult to say that fix is correct unless there is a
clear explanation of what led to this problem.  Another angle to
investigate is to find out when relation->rd_indexvalid is set to 0
for the particular session where you are seeing the problem.  I could
see a few places where we are setting it to 0 and clearing rd_indexlist,
but 

Re: [HACKERS] What is "index returned tuples in wrong order" for recheck supposed to guard against?

2017-02-02 Thread Kyotaro HORIGUCHI
I forgot to mention this..

At Thu, 02 Feb 2017 21:11:16 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20170202.26.49572782.horiguchi.kyot...@lab.ntt.co.jp>
> Hello, I dug up this topic, which was covered in spiderwebs..
> 
> # PostGIS requires so many libraries installed :(
> 
> FYI, for the given test case, the following query hits the bug.
> 
> | SELECT edge.geom AS geom
> | FROM (SELECT * FROM knn_recheck_geom) AS edge
> | ORDER BY '01010014AE47E17A141FC09AA170C0' <-> edge.geom LIMIT 2;
> 
> 
> At Tue, 3 Jan 2017 11:18:55 -0500, Robert Haas  wrote 
> in 
> > On Tue, Jan 3, 2017 at 12:36 AM, Regina Obe  wrote:
> > > It's still not quite clear to me even looking at that git commit, why 
> > > those need to error instead of going thru recheck aside from efficiency.
> > 
> > The code that reorders the returned tuples assumes that (1) the actual
> > distance is always greater than or equal to the estimated distance and
> > (2) the index returns the tuples in order of increasing estimated
> > distance.  Imagine that the estimated distances are 0, 1, 2, 3... and
> > the real distances are 2,3,4,5...  When it sees the
> > estimated-distance-0 tuple it computes that the actual distance is 2,
> > but it doesn't know whether there's going to be a tuple later with an
> > actual distance between 0 and 2, so it buffers the tuple. When it sees
> > the estimated-distance-1 tuple it computes that the actual distance is
> > 2, and now it knows there won't be any more estimated or actual
> > distances between 0 and 1, but there could still be a tuple with an
> > estimated distance of 1 and 2 whose actual distance is also between 1
> > and 2, so it buffers the second tuple as well.  When it sees the third
> > tuple, with estimated distance 2, it now knows that there won't be any
> > further tuples whose estimated or actual distance is less than 2.  So
> > now it can emit the first tuple that it saw, because that had an
> > actual distance of 2 and from this point forward the index will never
> > return anything with a smaller estimated or actual distance.  The
> > estimated-distance-1 tuple still has to stay in the buffer, though,
> > until we see a tuple whose estimated distance is greater than that
> > tuple's actual distance (3).
> 
> The estimation is calculated using box2df_distance, and the
> recheck evaluation is using distance (of libgeom).
> 
> For the problematic case, the distance's result is
> "6.992439" and box2df_distance's is
> "6.99330517578125" by "%20.18lf". The point should be just on
> the edge of its bounding box.

The binary representations of the two are the following.

distance:
 "40 1b f8 d4 fd f3 b6 45"
=(0:plus) (101 = 2^(1025-1023) = *4) 1."bf8d4fdf3b645"

estimation:
 "40 1b f8 d5 00 00 00 00"
=(0:plus) (101 = 2^(1025-1023) = *4) 1."bf8d5"

I mean, the estimation (box2df_distance) seems to be getting precision
reduction in its mantissa. (I cannot say more than this, though)
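
As a quick sanity check (illustrative Python; the hex strings are the two byte
sequences quoted above, taken as big-endian IEEE-754 doubles), the estimate
really is strictly larger than the recheck distance:

```python
import struct

# The two byte sequences quoted above, decoded as big-endian doubles.
dist = struct.unpack(">d", bytes.fromhex("401bf8d4fdf3b645"))[0]  # distance()
est = struct.unpack(">d", bytes.fromhex("401bf8d500000000"))[0]   # box2df estimate

# Same sign and exponent; the estimate's mantissa looks like the true
# distance's mantissa rounded to fewer bits, and the rounding went up.
# That breaks the "estimated <= actual" assumption of the reordering code.
print(est > dist)  # True
```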


> gserialized_gist_distance_2d() of PostGIS 2.3.0:
> >  /* In all cases, since we only have keys (boxes) we'll return */
> >  /* the minimum possible distance, which is the box2df_distance */
> >  /* and let the recheck sort things out in the case of leaves */
> >  distance = (double)box2df_distance(entry_box, _box);
> 
> This theoretically shouldn't give a larger value than the real
> distance, but it does.  I'm not sure how to fix this, but at
> least it is not a problem of GiST or PostgreSQL.
> 
> If we take libgeom's distance() as the baseline, box2df_distance
> should perhaps return a very slightly smaller value (I don't
> know by how much) so that it can conceal the error of
> its own computation.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center






Re: [HACKERS] What is "index returned tuples in wrong order" for recheck supposed to guard against?

2017-02-02 Thread Kyotaro HORIGUCHI
Hello, I dug up this topic, which was covered in spiderwebs..

# PostGIS requires so many libraries installed :(

FYI, for the given test case, the following query hits the bug.

| SELECT edge.geom AS geom
| FROM (SELECT * FROM knn_recheck_geom) AS edge
| ORDER BY '01010014AE47E17A141FC09AA170C0' <-> edge.geom LIMIT 2;


At Tue, 3 Jan 2017 11:18:55 -0500, Robert Haas  wrote in 

> On Tue, Jan 3, 2017 at 12:36 AM, Regina Obe  wrote:
> > It's still not quite clear to me even looking at that git commit, why those 
> > need to error instead of going thru recheck aside from efficiency.
> 
> The code that reorders the returned tuples assumes that (1) the actual
> distance is always greater than or equal to the estimated distance and
> (2) the index returns the tuples in order of increasing estimated
> distance.  Imagine that the estimated distances are 0, 1, 2, 3... and
> the real distances are 2,3,4,5...  When it sees the
> estimated-distance-0 tuple it computes that the actual distance is 2,
> but it doesn't know whether there's going to be a tuple later with an
> actual distance between 0 and 2, so it buffers the tuple. When it sees
> the estimated-distance-1 tuple it computes that the actual distance is
> 2, and now it knows there won't be any more estimated or actual
> distances between 0 and 1, but there could still be a tuple with an
> estimated distance of 1 and 2 whose actual distance is also between 1
> and 2, so it buffers the second tuple as well.  When it sees the third
> tuple, with estimated distance 2, it now knows that there won't be any
> further tuples whose estimated or actual distance is less than 2.  So
> now it can emit the first tuple that it saw, because that had an
> actual distance of 2 and from this point forward the index will never
> return anything with a smaller estimated or actual distance.  The
> estimated-distance-1 tuple still has to stay in the buffer, though,
> until we see a tuple whose estimated distance is greater than that
> tuple's actual distance (3).
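
The buffering rule described above can be sketched with a small min-heap.
This is an illustrative model only (not the executor's actual code), including
the invariant check that corresponds to the "index returned tuples in wrong
order" error:

```python
import heapq

def reorder(indexed_stream):
    """indexed_stream yields (estimated, actual) distance pairs in
    increasing estimated order, with actual >= estimated (the invariant
    the index is supposed to uphold).  Yields actual distances in true
    ascending order."""
    buffered = []  # min-heap of actual distances not yet safe to emit
    for est, actual in indexed_stream:
        if actual < est:
            # The situation the error message guards against.
            raise ValueError("index returned tuples in wrong order")
        # A buffered tuple whose actual distance is <= the current
        # estimate can never be beaten by anything still to come,
        # so it is safe to emit now.
        while buffered and buffered[0] <= est:
            yield heapq.heappop(buffered)
        heapq.heappush(buffered, actual)
    while buffered:
        yield heapq.heappop(buffered)

# Robert's example: estimates 0,1,2,3 with actual distances 2,3,4,5.
print(list(reorder([(0, 2), (1, 3), (2, 4), (3, 5)])))  # [2, 3, 4, 5]
```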

The estimation is calculated using box2df_distance, and the
recheck evaluation is using distance (of libgeom).

For the problematic case, the distance's result is
"6.992439" and box2df_distance's is
"6.99330517578125" by "%20.18lf". The point should be just on
the edge of its bounding box.

gserialized_gist_distance_2d() of PostGIS 2.3.0:
>  /* In all cases, since we only have keys (boxes) we'll return */
>  /* the minimum possible distance, which is the box2df_distance */
>  /* and let the recheck sort things out in the case of leaves */
>  distance = (double)box2df_distance(entry_box, _box);

This theoretically shouldn't give a larger value than the real
distance, but it does.  I'm not sure how to fix this, but at
least it is not a problem of GiST or PostgreSQL.

If we take libgeom's distance() as the baseline, box2df_distance
should perhaps return a very slightly smaller value (I don't
know by how much) so that it can conceal the error of
its own computation.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center






[HACKERS] Constraint exclusion failed to prune partition in case of partition expression involves function call

2017-02-02 Thread amul sul
Hi,

In the following case, constraint exclusion is not able to prune partitions
(even though the function is immutable); is this known behaviour?

--CASE 1 :  create table & insert data
create table foo_list (a integer, b text) partition by list (abs(a));
create table foo_list1 partition of foo_list for values in (0);
create table foo_list2 partition of foo_list for values in (1);
create table foo_list3 partition of foo_list for values in (2);
create table foo_list4 partition of foo_list for values in (3);
insert into foo_list values(0),(1),(-1),(2),(-2),(3),(-3);

--Explain plan
postgres=# explain select * from foo_list where a = 2;
   QUERY PLAN
-
 Append  (cost=0.00..103.50 rows=25 width=36)
   ->  Seq Scan on foo_list  (cost=0.00..0.00 rows=1 width=36)
 Filter: (a = 2)
   ->  Seq Scan on foo_list1  (cost=0.00..25.88 rows=6 width=36)
 Filter: (a = 2)
   ->  Seq Scan on foo_list2  (cost=0.00..25.88 rows=6 width=36)
 Filter: (a = 2)
   ->  Seq Scan on foo_list3  (cost=0.00..25.88 rows=6 width=36)
 Filter: (a = 2)
   ->  Seq Scan on foo_list4  (cost=0.00..25.88 rows=6 width=36)
 Filter: (a = 2)
(11 rows)

AFAIU, constraint exclusion should prune all of the above tables other than
foo_list3, as happens in the following case:

-- CASE 2: create table & insert data
create table bar_list (a integer, b text) partition by list (a);
create table bar_list1 partition of bar_list for values in (0);
create table bar_list2 partition of bar_list for values in (1);
create table bar_list3 partition of bar_list for values in (2);
create table bar_list4 partition of bar_list for values in (3);
insert into bar_list values(0),(1),(2),(3);

--- Explain plan
postgres=# explain select * from bar_list where a = 2;
   QUERY PLAN
-
 Append  (cost=0.00..25.88 rows=7 width=36)
   ->  Seq Scan on bar_list  (cost=0.00..0.00 rows=1 width=36)
 Filter: (a = 2)
   ->  Seq Scan on bar_list3  (cost=0.00..25.88 rows=6 width=36)
 Filter: (a = 2)
(5 rows)
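
For what it's worth (an untested sketch): constraint exclusion matches the
query quals against the partition expression, so restating the predicate in
terms of abs(a) should let the first case prune as well:

```sql
-- The partition key of foo_list is abs(a), not a, so give the planner a
-- qual on that expression; keeping "a = 2" preserves the original semantics.
explain select * from foo_list where abs(a) = 2 and a = 2;
-- expected: only the parent foo_list and foo_list3 are scanned
```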

Thanks & Regards,
Amul




Re: [HACKERS] pageinspect: Hash index support

2017-02-02 Thread Ashutosh Sharma
>
> +   ptr = (char *) itup + IndexInfoFindDataOffset(itup->t_info);
> +   Assert(ptr <= uargs->page + BLCKSZ);
>
> I think this should be promoted to an ereport(); these functions can
> accept an arbitrary bytea.

I think we had added the 'ptr' variable initially just to dump the hash code
in hexadecimal format, but since we have now removed that logic from
the current code, it is no longer required and I have therefore
removed it from the current patch. Below is the code that used it
initially. It got changed as per your comment - [1]

>
> +   if (opaque->hasho_flag & LH_BUCKET_PAGE)
> +   stat->hasho_prevblkno = InvalidBlockNumber;
> +   else
> +   stat->hasho_prevblkno = opaque->hasho_prevblkno;
>
> I think we should return the raw value here.  Mithun's patch to cache
> the metapage hasn't gone in yet, but even if it does, let's assume
> anyone using contrib/pageinspect wants to see the data that's
> physically present, not our gloss on it.

okay, agreed. I have corrected this in the attached v10 patch.

>
> Other than that, I don't think I have any other comments on this.  The
> tests that got added look a little verbose to me (do we really need to
> check pages 1-4 separately in each case when they're all hash pages?
> if hash_bitmap_info rejects one, surely it will reject the others) but
> I'm not going to fight too hard if Peter wants it that way.
>

I think it's OK to check hash_bitmap_info() or any other functions
with different page types at least once.

[1]- 
https://www.postgresql.org/message-id/CA%2BTgmoZUjrVy52TUU3b_Q5L6jcrt7w5R4qFwMXdeCuKQBmYWqQ%40mail.gmail.com

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com


0001-Add-support-for-hash-index-in-pageinspect-contrib-mo_v10.patch
Description: invalid/octet-stream



Re: [HACKERS] [PATCH] Add tab completion for DEALLOCATE

2017-02-02 Thread Dagfinn Ilmari Mannsåker
Tom Lane  writes:

> ilm...@ilmari.org (Dagfinn Ilmari =?utf-8?Q?Manns=C3=A5ker?=) writes:
>> While playing with prepared statements in psql, I noticed that EXECUTE
>> tab-completes the list of active prepared statements, but DEALLOCATE
>> does not.
>> Attached is a patch to fix this.
>
> Good idea, but I think it would be better to give DEALLOCATE its own
> entry in the list, so it could be placed in alphabetical order.

I was following the example from FETCH/MOVE further down to avoid
duplicating the body of the stanza, but I guess in this case it's simple
enough that it doesn't matter.

Here's an updated patch which adds it as a separate stanza.

-- 
- Twitter seems more influential [than blogs] in the 'gets reported in
  the mainstream press' sense at least.   - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
  to a mainstream media article.  - Calle Dybedahl


>From 8b5e4ed63e558475b193b1d668ed14baae0c497c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Thu, 2 Feb 2017 10:09:26 +
Subject: [PATCH] Add tab completion for DEALLOCATE

EXECUTE already tab-completes the list of prepared statements, but
DEALLOCATE was missing.
---
 src/bin/psql/tab-complete.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index d6fffcf42f..539ab1afe2 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2453,6 +2453,10 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches5("CREATE", "EVENT", "TRIGGER", MatchAny, "ON"))
 		COMPLETE_WITH_LIST3("ddl_command_start", "ddl_command_end", "sql_drop");
 
+/* DEALLOCATE */
+	else if (Matches1("DEALLOCATE"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_prepared_statements);
+
 /* DECLARE */
 	else if (Matches2("DECLARE", MatchAny))
 		COMPLETE_WITH_LIST5("BINARY", "INSENSITIVE", "SCROLL", "NO SCROLL",
-- 
2.11.0




Re: [HACKERS] Superowners

2017-02-02 Thread Simon Riggs
On 30 January 2017 at 16:34, Peter Eisentraut
 wrote:
> On 1/30/17 9:04 AM, Simon Riggs wrote:
>> all I want in this release is
>> super-ownership.
>
> What exactly is super-ownership, and what problems does it solve?

The problem is that there is no easy way for a DBA to have privs on
multiple sets of objects, so there is a request for superuser in many
cases. Superuser is too strong for most situations, so we are stuck.
We need some middle ground where a single user can manage many "normal
application objects" (tables, views, sequences, matviews, functions,
indexes, triggers) without problem, while not compromising other areas
that require higher security.

Probably more than 50% of PostgreSQL installs now use services that
block superuser accounts, so the majority of PostgreSQL users are
affected by these problems.

The permissions desirable for logical replication are a good example
of this, but not in any sense the only issue.

My hope is that we release v10 with a permissions model that allows
logical replication to be realistically usable when superuser is not
available. This is not a new requirement, but the privilege aspect of
the logical replication has been pushed back. While thinking about
other problems of access control I've rethought this so I now see the
wider problem and would like to solve that rather than just focus on
the needs of logical replication.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Superowners

2017-02-02 Thread Simon Riggs
On 30 January 2017 at 16:43, Tom Lane  wrote:
> Simon Riggs  writes:
>> Agreed. Let me reiterate: all I want in this release is
>> super-ownership.
>
> While I'm not entirely convinced whether super-ownership is a good idea
> or not, I am pretty sure that rushing to get it into v10 is a bad idea.
> This is a rather fundamental change in our permissions model and it
> might turn out to have undesirable consequences.

Agreed. My view is that the current mechanism almost forces people to
use superusers for many things and that is definitely undesirable.

> Or even more directly: any patch for this would necessarily be landing
> in the last v10 commitfest.  We have a project policy against major
> changes showing up for the first time in the last fest of a cycle,
> for good reasons.

I understand.

> Let's take our time and get it right.

So we are able to see what is proposed, I attach a patch.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


db_owner_has_obj_privs.v1.patch
Description: Binary data



Re: [HACKERS] multivariate statistics (v19)

2017-02-02 Thread Tomas Vondra



On 02/01/2017 11:52 PM, Alvaro Herrera wrote:

> Still looking at 0002.

> pg_ndistinct_in disallows input, claiming that pg_node_tree does the
> same thing.  But pg_node_tree does it for security reasons: you could
> crash the backend if you supplied a malicious value.  I don't think
> that applies to pg_ndistinct_in.  Perhaps it will be useful to inject
> fake stats at some point, so why not allow it?  It shouldn't be
> complicated (though it does require writing some additional code, so
> perhaps that's one reason we don't want to allow input of these
> values).


Yes, I haven't written the code, and I'm not sure it's a very practical 
way to inject custom statistics. But if we decide to allow that in the 
future, we can probably add the code.


There's a subtle difference between pg_node_tree and the data types for 
statistics - pg_node_tree stores the value as a string (matching the 
nodeToString output), so the _in function is fairly simple. Of course, 
stringToNode() assumes safe input, which is why the input is disabled.


OTOH the statistics are stored in an optimized binary format, allowing 
to use the value directly (without having to do expensive parsing etc).


I was thinking that the easiest way to add support for _in would be to 
add a bunch of Nodes for the statistics, along with in/out functions, 
but keeping the internal binary representation. But that'll be tricky to 
do in a safe way - even if those nodes are coded in a very defensive 
ways, I'd bet there'll be ways to inject unsafe nodes.


So I'm OK with not having the _in for now. If needed, it's possible to 
construct the statistics as a bytea using a bit of C code. That's at 
least obviously unsafe, as anything written in C, touching the memory.



> The comment on top of pg_ndistinct_out is missing the "_out"; also it
> talks about histograms, which is not what this is about.



OK, will fix.


> In the same function, a trivial point you don't need to pstrdup() the
> .data out of a stringinfo; it's already palloc'ed in the right context
> -- just PG_RETURN_CSTRING(str.data) and forget about "ret".  Saves you
> one line.



Will fix too.


> Nearby, some auxiliary functions such as n_choose_k and
> num_combinations are not documented. What is it that they do? I'd
> move these to the end of the file, keeping the important entry points
> at the top of the file.


I'd say n-choose-k is a pretty widely known term from combinatorics. The
comment would essentially say just 'this is n-choose-k', which seems
rather pointless. So as much as I dislike self-documenting code, this
actually seems like a good case for it.
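
(For reference, since it comes up here: n-choose-k is the binomial
coefficient, the number of ways to pick a k-column subset out of n columns.
This is the standard definition, illustrated in Python, not the patch's C
implementation:)

```python
from math import comb

# C(n, k): number of distinct k-element subsets of an n-element set.
# E.g. the number of distinct column pairs among 4 columns:
print(comb(4, 2))  # 6
```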



> I see this patch has an estimate_ndistinct() which claims to be a re-
> implementation of code already in analyze.c, but it is actually a lot
> simpler than what analyze.c does.  I've been wondering if it'd be a good
> idea to use some of this code so that some routines are moved out of
> analyze.c; good implementations of statistics-related functions would
> live in src/backend/statistics/ where they can be used both by analyze.c
> and your new mvstats stuff.  (More generally I am beginning to wonder if
> the new directory should be just src/backend/statistics.)



I'll look into that. I have to check if I ignored some assumptions or 
corner cases the analyze.c deals with.



> common.h does not belong in src/backend/utils/mvstats; IMO it should be
> called src/include/utils/mvstat.h.  Also, it must not include
> postgres.h, and it probably doesn't need most of the #includes it has;
> those are better put into whatever include it.  It definitely needs a
> guarding #ifdef MVSTATS_H around its whole content too.  An include file
> is not just a way to avoid #includes in other files; it is supposed to
> be a minimally invasive way of exporting the structs and functions
> implemented in some file into other files.  So it must be kept minimal.



Will do.


> psql/tab-complete.c compares the wrong version number (9.6 instead of
> 10).
>
> Is it important to have a cast from pg_ndistinct to bytea?  I think
> it's odd that outputting it as bytea yields something completely
> different than as text.  (The bytea is not human readable and cannot be
> used for future input, so what is the point?)



Because it internally is a bytea, and it seems useful to have the 
ability to inspect the bytea value directly (e.g. to see the length of 
the bytea and not the string output).




> In another subthread you seem to have surrendered to the opinion that
> the new catalog should be called pg_statistics_ext, just in case in the
> future we come up with additional things to put on it.  However, given
> its schema, with a "starelid / stakeys", is it sensible to think that
> we're going to get anything other than something that involves multiple
> variables?  Maybe it should just be "pg_statistics_multivar" and if
> something else comes along we create another catalog with an appropriate
> schema.  Heck, how does this catalog serve the purpose of cross-table
> statistics in the first place, given that it has room to record a single

Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)

2017-02-02 Thread Heikki Linnakangas

On 02/02/2017 05:50 AM, David Rowley wrote:

> On 2 February 2017 at 00:13, Heikki Linnakangas  wrote:
>> Ok, I'll drop the second patch for now. I committed the first patch after
>> fixing the things you and Michael pointed out. Thanks for the review!
>
> dbd69118 caused a small compiler warning for me.
>
> The attached fixed it.


Fixed, thanks!

- Heikki





Re: [HACKERS] Improvements in psql hooks for variables

2017-02-02 Thread Andrew Dunstan


On 02/01/2017 11:26 AM, Tom Lane wrote:
> "Daniel Verite"  writes:
>> That works for me. I tested and read it and did not find anything
>> unexpected or worth objecting.
>> "\unset var" acting as "\set var off" makes sense considering
>> that its opposite "\set var" is now an accepted
>> synonym of "\set var on" for the boolean built-ins.
> Thanks for reviewing!  I've pushed this now --- Andrew, you should
> be able to revert the RedisFDW test script to the way it was.
>
>   

Done.

cheers

andrew




Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-02 Thread Konstantin Knizhnik

Last update on the issue with deadlock in XLogInsert.

After almost one day of running, pgbench is once again not working
normally :(

There is no deadlock, there are no core files and no error messages in the log.
But TPS is almost zero:

progress: 57446.0 s, 1.1 tps, lat 3840.265 ms stddev NaNQ
progress: 57447.3 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57448.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57449.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57450.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57451.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57452.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57453.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57454.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57455.1 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57456.5 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57457.1 s, 164.6 tps, lat 11504.085 ms stddev 5902.148
progress: 57458.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57459.0 s, 234.0 tps, lat 1597.573 ms stddev 3665.814
progress: 57460.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57461.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57462.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57463.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57464.0 s, 602.8 tps, lat 906.765 ms stddev 1940.256
progress: 57465.0 s, 7.2 tps, lat 38.052 ms stddev 12.302
progress: 57466.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57467.1 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57468.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57469.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57470.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57471.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57472.1 s, 147.8 tps, lat 4379.790 ms stddev 3431.477
progress: 57473.0 s, 1314.1 tps, lat 156.884 ms stddev 535.761
progress: 57474.0 s, 1272.2 tps, lat 31.548 ms stddev 59.538
progress: 57475.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57476.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57477.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57478.0 s, 1688.6 tps, lat 268.379 ms stddev 956.537
progress: 57479.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57480.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57481.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57482.1 s, 29.0 tps, lat 3500.432 ms stddev 54.177
progress: 57483.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57484.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57485.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57486.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57487.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57488.0 s, 66.0 tps, lat 9813.646 ms stddev 19.807
progress: 57489.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57490.0 s, 31.0 tps, lat 8368.125 ms stddev 933.997
progress: 57491.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57492.0 s, 1601.0 tps, lat 226.865 ms stddev 844.952
progress: 57493.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57494.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ


ps auwx shows the following picture:
[10:44:12]postgres@postgres:~/postgresql $ ps auwx | fgrep postgres
postgres 61802470  0.4  0.0 4064 4180  pts/6 A18:54:58 976:56 
pgbench -c 100 -j 20 -P 1 -T 10 -p 5436
postgres 15271518  0.0  0.0 138276 15024  - A10:43:34  0:06 
postgres: autovacuum worker process   postgres
postgres 13305354  0.0  0.0 22944 21356  - A20:49:04 27:51 
postgres: autovacuum worker process   postgres
postgres  5245902  0.0  0.0 14072 14020  - A18:54:59 10:24 
postgres: postgres postgres [local] COMMIT
postgres 44303278  0.0  0.0 15176 14036  - A18:54:59 10:18 
postgres: postgres postgres [local] COMMIT
postgres 38601340  0.0  0.0 11564 14008  - A18:54:59 10:16 
postgres: postgres postgres [local] COMMIT
postgres 53674890  0.0  0.0 12712 14004  - A18:54:59  8:54 
postgres: postgres postgres [local] COMMIT
postgres 27591640  0.0  0.0 15040 14028  - A18:54:59  8:38 
postgres: postgres postgres [local] COMMIT
postgres 40960422  0.0  0.0 12128 13996  - A18:54:59  8:36 
postgres: postgres postgres [local] COMMIT
postgres 41288514  0.0  0.0 10544 14012  - A18:54:59  8:30 
postgres: postgres postgres [local] idle
postgres 55771564  0.0  0.0 12844 14008  - A18:54:59  8:24 
postgres: postgres postgres [local] COMMIT
postgres 21760842  0.0  0.0 13164 14008  - A18:54:59  8:17 
postgres: postgres postgres [local] COMMIT
postgres 18810974  0.0  0.0 10416 14012  - A18:54:59  8:13 
postgres: postgres postgres [local] idle in transaction
postgres 17566474  0.0  0.0 10224 14012  - A18:54:59  8:02 
postgres: postgres postgres [local] COMMIT
postgres 63963402  0.0  0.0 11300 14000  - A18:54:59  7:48 
postgres: postgres postgres [local] COMMIT
postgres  9963962  0.0  0.0 15548 14024  - A18:54:59  7:37 
postgres: postgres postgres [local] idle
postgres 10094942  0.0  0.0 12192 13996  - A18:54:59  7:33