[HACKERS] psql --help=variables lists variables that can't be set

2015-12-03 Thread Peter Eisentraut
psql --help=variables shows variables treated specially by psql.  And it
tells you

Usage:
  psql --set=NAME=VALUE
  or \set NAME VALUE

But some of the variables listed cannot usefully be set, only read, such as

  DBNAME the currently connected database name
  HOST   the currently connected database server
  LASTOIDthe value of last affected OID

These can be read in a script, but there is nothing useful you can do
with them on the command line.

Also, for variables such as HISTCONTROL, IGNOREEOF, PROMPT*, they are
more commonly set in psqlrc, not on the command line.

Should we trim this list down to variables that are actually useful to
set on the command line?

I also have some concerns about the listed environment variables.  The
list of libpq PG* variables is incomplete, and if we're going to curate
the list, we surely don't need to show the "not recommended" PGPASSWORD
variable.

That list probably also needs some ifdefs, since SHELL and TMPDIR don't
work on Windows, and PSQL_HISTORY only works when readline support is built.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-03 Thread Michael Paquier
On Thu, Dec 3, 2015 at 3:44 PM, Michael Paquier
 wrote:
> On Thu, Dec 3, 2015 at 6:59 AM, Alvaro Herrera wrote:
>> I didn't push the changed for config_default you requested a few
>> messages upthread; it's not clear to me how setting it to undef affects
>> the whole thing.  If setting it to undef makes the MSVC toolchain run
>> the tap tests in the default config, then I can do it; let's be clear
>> about what branch to backpatch this to.  Also the "1;" at the end of
>> RewindTest.
>
>
> Setting it to undef will prevent the tests to run, per vcregress.pl:
> die "Tap tests not enabled in configuration"
>   unless $config->{tap_tests};
> Also, setting it to undef will match the existing behavior on
> platforms where ./configure is used because the switch
> --enable-tap-tests needs to be used there. And I would believe that in
> most cases Windows environments are not going to have IPC::Run
> deployed.
>
> I have also rebased the recovery test suite as the attached, using the
> infrastructure that has been committed lately.

By the way, if there are no objections, I am going to mark this patch
as committed in the CF app. Putting in the infrastructure is already a
good step forward, and I will add an entry in next CF to track the
patch to add tests for recovery itself. Alvaro, what do you think?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] broken tests

2015-12-03 Thread Pavel Stehule
2015-12-03 22:50 GMT+01:00 Pavel Raiskup :

> On Thursday 03 of December 2015 20:49:09 Pavel Stehule wrote:
> > 2015-12-03 12:39 GMT+01:00 Pavel Raiskup :
> >
> > > On Wednesday 02 of December 2015 20:26:56 Pavel Stehule wrote:
> > > > 2015-12-02 20:08 GMT+01:00 Alvaro Herrera  >:
> > > >
> > > > > Pavel Stehule wrote:
> > > > > > Hi
> > > > > >
> > > > > > Today I have problem with regress tests on my laptop.
> > > > >
> > > > > Maybe this is because of the libxml version?
> > > >
> > > > 100%, same issue is with 9.4.5
> > > >
> > > > After downgrade to 2.9.2 (from 2.9.3) this issue was out
> > >
> > > Agreed.
> > >
> > > > So it is looking like Fedora fault
> > >
> > > Not really.  This has been changed in libxml2 upstream:
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1286692
> >
> > ok
> >
> > I would not be offensive
>
> Sorry if that looked like this :(.  I just wanted to point out that this
> is not "just" about Fedora --> but either libxml2 (upstream) needs to be
> fixed or PostgreSQL's testsuite.
>

juju

Pavel



>
> Pavel
>
>


Re: [HACKERS] Confusing results with lateral references

2015-12-03 Thread Ashutosh Bapat
On Fri, Dec 4, 2015 at 10:58 AM, Amit Langote  wrote:

> On 2015/12/03 21:26, Ashutosh Bapat wrote:
> > Session 1
> > postgres=# begin;
> > BEGIN
> > postgres=# update t1 set val = 2 where val2 = 1;
> > UPDATE 1
> >
> > Session 2
> > postgres=# select * from t1 left join t2 on (t1.val = t2.val) for update
> of
> > t1;
> >
> > query waits
> >
> > Session 1
> > postgres=# commit;
> > COMMIT
> >
> >
> > Session 2 query returns two rows
> > select * from t1 left join t2 on (t1.val = t2.val) for update of t1;
> >  val | val2 | val | val2
> > -+--+-+--
> >2 |1 | |
> >2 |1 | |
> > (2 rows)
> >
> > It's confusing to see two rows from left join result when the table
> really
> > has only a single row. Is this behaviour expected?
>
> Maybe it is. Because the other table still has two (1, 1) rows, LockRows's
> subplan would still produce two rows in result, no?
>
>
Documentation at
http://www.postgresql.org/docs/9.4/static/queries-table-expressions.html
says
(T1) LEFT OUTER JOIN (T2)

First, an inner join is performed. Then, *for each row in T1* that does not
satisfy the join condition with any row in T2, *a joined row is added* with
null values in columns of T2. Thus, the joined table always has at least
one row for each row in T1.
So there should be only one row for each row of outer table that didn't
join with the inner table. IOW a join with no joining rows should have same
number of rows as outer table.


> Thanks,
> Amit
>
>
>


-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-03 Thread Haribabu Kommi
On Fri, Dec 4, 2015 at 7:45 AM, Pavel Stehule  wrote:
>
>
> this tracing can be implemented to main pg_hba processing. When you are
> connect from some specific client - and you can see, why you cannot to
> connect to Postgres

The trace messages that are going to print doesn't come to client until the
connection gets successful. The traces may not useful for the clients
to find out
why the connection is failing. But it may be useful for administrators.
How about the attached patch?

[kommih@localhost bin]$ ./psql postgres -h ::1
psql (9.6devel)
Type "help" for help.

postgres=#

ServerLog:
NOTICE:  Skipped 84 pg_hba line, because of host connection type.
NOTICE:  Skipped 86 pg_hba line, because of non matching IP.

Regards,
Hari Babu
Fujitsu Australia


hba_trace.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql --help=variables lists variables that can't be set

2015-12-03 Thread Pavel Stehule
2015-12-04 4:08 GMT+01:00 Peter Eisentraut :

> psql --help=variables shows variables treated specially by psql.  And it
> tells you
>
> Usage:
>   psql --set=NAME=VALUE
>   or \set NAME VALUE
>
> But some of the variables listed cannot usefully be set, only read, such as
>
>   DBNAME the currently connected database name
>   HOST   the currently connected database server
>   LASTOIDthe value of last affected OID
>
> These can be read in a script, but there is nothing useful you can do
> with them on the command line.
>
> Also, for variables such as HISTCONTROL, IGNOREEOF, PROMPT*, they are
> more commonly set in psqlrc, not on the command line.
>
> Should we trim this list down to variables that are actually useful to
> set on the command line?
>

we can reduce this list if we use command line help. For interactive help
\h variables it should to show all.

Regards

Pavel


>
> I also have some concerns about the listed environment variables.  The
> list of libpq PG* variables is incomplete, and if we're going to curate
> the list, we surely don't need to show the "not recommended" PGPASSWORD
> variable.
>

+1


>
> That list probably also needs some ifdefs, since SHELL and TMPDIR don't
> work on Windows, and PSQL_HISTORY only works when readline support is
> built.
>

+1



>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-03 Thread Noah Misch
On Fri, Dec 04, 2015 at 02:47:36PM +0900, Michael Paquier wrote:
> On Fri, Dec 4, 2015 at 2:43 PM, Noah Misch  wrote:
> > On Wed, Dec 02, 2015 at 04:33:50PM -0300, Alvaro Herrera wrote:
> >> How did you figure
> >> that that was the version used, anyway?
> >
> > I asked Bruce at one point.
> 
> So we are trying to use the same version over the years to keep code
> consistent across back-branches?

No, I recall no policy discussion concerning this.

> Do you think we should try to use a
> newer version instead with each pgindent run? That would induce a
> rebasing cost when back-patching, but we cannot stay with the same
> version of perltidy forever either...

No.  I expect we'll implicitly change perltidy versions when Bruce upgrades
his OS.  Assuming future perltidy changes affect us not much more than past
changes (six years of perltidy development changed eighteen PostgreSQL source
lines), that's just fine.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-03 Thread Michael Paquier
On Fri, Dec 4, 2015 at 2:43 PM, Noah Misch  wrote:
> On Wed, Dec 02, 2015 at 04:33:50PM -0300, Alvaro Herrera wrote:
>> How did you figure
>> that that was the version used, anyway?
>
> I asked Bruce at one point.

So we are trying to use the same version over the years to keep code
consistent across back-branches? Do you think we should try to use a
newer version instead with each pgindent run? That would induce a
rebasing cost when back-patching, but we cannot stay with the same
version of perltidy forever either...
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] overflow checks optimized away

2015-12-03 Thread Michael Paquier
On Fri, Dec 4, 2015 at 1:27 AM, Andres Freund  wrote:
> On 2015-12-03 12:10:27 +, Greg Stark wrote:
>> I'm leaning towards using the builtin functions described here
>
> For performance reasons? Michael's version of the patch had the
> necessary 'raw' macros, and they don't look *that* bad. Using the
> __builtin variants when available, would be nice - and not hard. On
> e.g. x86 the overflow checks can be done much more efficiently than both
> the current and patched checks.

Using the _builtin functions when available would be indeed a nice
optimization that the previous patch missed.

> I wonder though if we can replace
>
> #define PG_INT16_ADD_OVERFLOWS(a, b) (  \
> ((a) > 0 && (b) > 0 && (a) > PG_INT16_MAX - (b)) || \
> ((a) < 0 && (b) < 0 && (a) < PG_INT16_MIN - (b)))
>
> #define PG_INT32_ADD_OVERFLOWS(a, b) (  \
> ((a) > 0 && (b) > 0 && (a) > PG_INT32_MAX - (b)) || \
> ((a) < 0 && (b) < 0 && (a) < PG_INT32_MIN - (b)))
>
> ...
>
> with something like
> #define PG_ADD_OVERFLOWS(a, b, MINVAL, MAXVAL) (\
> ((a) > 0 && (b) > 0 && (a) > MAXVAL - (b)) ||   \
> ((a) < 0 && (b) < 0 && (a) < MINVAL - (b)))
> #define PG_INT16_ADD_OVERFLOWS(a, b)\
>  PG_ADD_OVERFLOWS(a, b, PG_INT16_MIN, PG_INT16_MAX)
>
> especially for the MUL variants that'll save a bunch of long repetitions.

Yeah, we should. Those would save quite a couple of lines in c.h.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] broken tests

2015-12-03 Thread Michael Paquier
On Fri, Dec 4, 2015 at 2:59 PM, Pavel Stehule  wrote:
> 2015-12-03 22:50 GMT+01:00 Pavel Raiskup :
>> Sorry if that looked like this :(.  I just wanted to point out that this
>> is not "just" about Fedora --> but either libxml2 (upstream) needs to be
>> fixed or PostgreSQL's testsuite.
>
>
> juju

FWIW, I just faced the issue myself with a custom build of libxml2
using 2.9.3... Hopefully this is going to be addressed soon, libxml2
2.9.3 has addressed many CVEs, like CVE-2015-7941, but if it breaks
the xml datatype itself...
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Confusing results with lateral references

2015-12-03 Thread Amit Langote
On 2015/12/03 21:26, Ashutosh Bapat wrote:
> Session 1
> postgres=# begin;
> BEGIN
> postgres=# update t1 set val = 2 where val2 = 1;
> UPDATE 1
> 
> Session 2
> postgres=# select * from t1 left join t2 on (t1.val = t2.val) for update of
> t1;
> 
> query waits
> 
> Session 1
> postgres=# commit;
> COMMIT
> 
> 
> Session 2 query returns two rows
> select * from t1 left join t2 on (t1.val = t2.val) for update of t1;
>  val | val2 | val | val2
> -+--+-+--
>2 |1 | |
>2 |1 | |
> (2 rows)
> 
> It's confusing to see two rows from left join result when the table really
> has only a single row. Is this behaviour expected?

Maybe it is. Because the other table still has two (1, 1) rows, LockRows's
subplan would still produce two rows in result, no?

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-03 Thread Haribabu Kommi
On Fri, Dec 4, 2015 at 8:05 AM, Alvaro Herrera  wrote:
>> >> Here I attached the patch with the suggested changes.
>> >> Along with line number, I kept the options column also with authentication
>> >> options as a jsonb datatype.
>> >>
>> >> Example output:
>> >>
>> >> postgres=# select pg_hba_lookup('test','all','::1');
>> >> NOTICE:  Skipped 84 Hba line, because of non matching IP.
>> >> NOTICE:  Skipped 86 Hba line, because of non matching database.
>> >> NOTICE:  Skipped 87 Hba line, because of non matching role.
>> >>  pg_hba_lookup
>> >> ---
>> >>  (89,trust,{})
>> >> (1 row)
>> >>
>> >> comments?
>
> I don't like this interface.  It's nice for psql, but everybody else is
> going to lose.  I think these should be reported in the SRF result set
> as well; perhaps add a "mode" column that says "skipped" for such rows,
> and "matched" for the one that, uh, matches.  (Please try calling your
> function with "select * from" which should give nicer output.)
>

How about as follows?

postgres=# select * from pg_hba_lookup('all','all','::1');
 line_number | type  | database |  user   |  address  | hostname |
method | options |  mode
-+---+--+-+---+--++-+-
  84   | local  | ["all"]| ["all"]   |
| | trust  | {}  | skipped
  86   | host   | ["all"]| ["all"]   | 127.0.0.1 |
| trust  | {}  | skipped
  88   | host   | ["all"]| ["all"]   | ::1
   | | trust  | {}  | matched
(3 rows)


In the above case, all the columns are displayed. Based on the
feedback we can keep
the required columns. I didn't yet removed the NOTICE messages in the
attached version.
Are they still required?


Regards,
Hari Babu
Fujitsu Australia


pg_hba_lookup_poc_v4.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Passing initially_valid values instead of !skip_validation to StoreRelCheck() in AddRelationNewConstraints()

2015-12-03 Thread amul sul
let me put it in other words, is there any harm use of initially_valid instead 
of !skip_validation?

Earlier to commit I mentioned in my first post, there is only one flag, IMO 
i.e. skip_validation, which are serving both purpose, setting 
pg_constraint.convalidated & decide to skip or validate existing data.

Now, if we have two flag, which can separately use for there respective 
purpose, then why do you think that it is not readable? 

>As Marko points out that would be actually a new
>SQL-level feature that requires much more than changing that line.

May be yes.

Regards, Amul Sul



On Friday, 4 December 2015 6:21 AM, Amit Langote 
 wrote:
On 2015/12/03 20:44, amul sul wrote:
>> On Thursday, 3 December 2015 4:36 PM, Amit Langote 
>>  wrote:
>> Especially from a readability standpoint, I think using skip_validation may 
>> be more apt. 
>> Why - the corresponding parameter of StoreRelCheck() dictates what's stored 
>> in pg_constraint.convalidated.
> 
> Why not? won't initially_valid flag serve same purpose ?

Yes it could, but IMO, it wouldn't be a readability improvement. As I
said, you could think of the variable/argument as delivering whether the
clause is marked NOT VALID or not. Also, see below.

> 
>> So, if skip_validation is 'true' because user specified the constraint NOT 
>> VALID,
>> StoreRelCheck() will store the constraint with convalidated as 'false'
> 
> I guess thats was added before initially_valid flag. As I said, in normal 
> case gram.y take care of skip_validation & initially_valid values, if one is 
> 'true' other will be 'false'. 
> 
>> The user will have to separately validate the constraint by issuing a ALTER 
>> TABLE VALIDATE CONSTRAINT
>> command at a time of their choosing.
> 
> 
> This could be time consuming operation for big table, If I am pretty much 
> sure that my constraint will be valid, simply I could set both 
> flag(initially_valid & skip_validation) to true.

There is currently no support for adding a constraint after-the-fact (that
is, using ALTER TABLE) and marking it valid  without actually verifying it
by scanning the table. As Marko points out that would be actually a new
SQL-level feature that requires much more than changing that line.


Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-03 Thread Pavel Stehule
2015-12-04 5:48 GMT+01:00 Haribabu Kommi :

> On Fri, Dec 4, 2015 at 7:45 AM, Pavel Stehule 
> wrote:
> >
> >
> > this tracing can be implemented to main pg_hba processing. When you are
> > connect from some specific client - and you can see, why you cannot to
> > connect to Postgres
>
> The trace messages that are going to print doesn't come to client until the
> connection gets successful. The traces may not useful for the clients
> to find out
> why the connection is failing. But it may be useful for administrators.
> How about the attached patch?
>

yes, it is only for admin and should be stored to log


>
> [kommih@localhost bin]$ ./psql postgres -h ::1
> psql (9.6devel)
> Type "help" for help.
>
> postgres=#
>
> ServerLog:
> NOTICE:  Skipped 84 pg_hba line, because of host connection type.
> NOTICE:  Skipped 86 pg_hba line, because of non matching IP.
>
> Regards,
> Hari Babu
> Fujitsu Australia
>


Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-03 Thread Pavel Stehule
2015-12-04 5:33 GMT+01:00 Haribabu Kommi :

> On Fri, Dec 4, 2015 at 8:05 AM, Alvaro Herrera 
> wrote:
> >> >> Here I attached the patch with the suggested changes.
> >> >> Along with line number, I kept the options column also with
> authentication
> >> >> options as a jsonb datatype.
> >> >>
> >> >> Example output:
> >> >>
> >> >> postgres=# select pg_hba_lookup('test','all','::1');
> >> >> NOTICE:  Skipped 84 Hba line, because of non matching IP.
> >> >> NOTICE:  Skipped 86 Hba line, because of non matching database.
> >> >> NOTICE:  Skipped 87 Hba line, because of non matching role.
> >> >>  pg_hba_lookup
> >> >> ---
> >> >>  (89,trust,{})
> >> >> (1 row)
> >> >>
> >> >> comments?
> >
> > I don't like this interface.  It's nice for psql, but everybody else is
> > going to lose.  I think these should be reported in the SRF result set
> > as well; perhaps add a "mode" column that says "skipped" for such rows,
> > and "matched" for the one that, uh, matches.  (Please try calling your
> > function with "select * from" which should give nicer output.)
> >
>
> How about as follows?
>
> postgres=# select * from pg_hba_lookup('all','all','::1');
>  line_number | type  | database |  user   |  address  | hostname |
> method | options |  mode
>
> -+---+--+-+---+--++-+-
>   84   | local  | ["all"]| ["all"]   |
> | | trust  | {}  | skipped
>   86   | host   | ["all"]| ["all"]   | 127.0.0.1 |
> | trust  | {}  | skipped
>   88   | host   | ["all"]| ["all"]   | ::1
>| | trust  | {}  | matched
> (3 rows)
>

the tabular interface is better, and then NOTICEs are not necessary.  I
like to see some info why row was skipped in the table.

Regards

Pavel


>
>
> In the above case, all the columns are displayed. Based on the
> feedback we can keep
> the required columns. I didn't yet removed the NOTICE messages in the
> attached version.
> Are they still required?
>
>
> Regards,
> Hari Babu
> Fujitsu Australia
>


Re: [HACKERS] Remaining 9.5 open items

2015-12-03 Thread Etsuro Fujita

On 2015/12/04 11:51, Noah Misch wrote:

On Tue, Dec 01, 2015 at 11:05:47AM -0500, Robert Haas wrote:

On Mon, Nov 30, 2015 at 2:43 PM, Tom Lane  wrote:

* Foreign join pushdown vs EvalPlanQual

Is this fixed by 5fc4c26db?  If not, what remains to do?



Unfortunately, no.  That commit allows FDWs to do proper EPQ handling
for plain table scans, but it proves to be inadequate for EPQ handling
for joins. Solving that problem will require another patch, and,
modulo a bunch of cosmetic issues, I'm reasonably happy with KaiGai
Kohei's latest submission.  I'll respond in more detail on that
thread, but the question I want to raise here is: do we want to
back-patch those changes to 9.5 at this late date?



Yes.  If 9.5 added a bad interface, better to fix the interface even now than
to live with the bad one.


I'd vote for fixing this.

I think the latest version of the patch for this is in good shape, but 
that would need some changes as proposed on that thread.  So, if there 
are no objections, I'll update the patch.


Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql --help=variables lists variables that can't be set

2015-12-03 Thread Andres Freund
On 2015-12-03 22:08:31 -0500, Peter Eisentraut wrote:
> psql --help=variables shows variables treated specially by psql.  And it
> tells you
> 
> Usage:
>   psql --set=NAME=VALUE
>   or \set NAME VALUE
> 
> But some of the variables listed cannot usefully be set, only read, such as
> 
>   DBNAME the currently connected database name
>   HOST   the currently connected database server
>   LASTOIDthe value of last affected OID
> 
> These can be read in a script, but there is nothing useful you can do
> with them on the command line.

Well, you can display them, which actually isn't uninteresting (-c
'\echo :DBNAME'), and at least LASTOID is rather interesting for
scripting purposes.

> Also, for variables such as HISTCONTROL, IGNOREEOF, PROMPT*, they are
> more commonly set in psqlrc, not on the command line.
> 
> Should we trim this list down to variables that are actually useful to
> set on the command line?

I think the increased discoverability isn't a bad thing, so I'm inclined
to not do that.

> I also have some concerns about the listed environment variables.  The
> list of libpq PG* variables is incomplete, and if we're going to curate
> the list, we surely don't need to show the "not recommended" PGPASSWORD
> variable.
>
> That list probably also needs some ifdefs, since SHELL and TMPDIR don't
> work on Windows, and PSQL_HISTORY only works when readline support is built.

I don't have much an opinion about those. I think it's not too bad to
show them regardless, nor is it bad to hide them.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Passing initially_valid values instead of !skip_validation to StoreRelCheck() in AddRelationNewConstraints()

2015-12-03 Thread amul sul
Hi Amit,

Thanks for prompt response.

>On Thursday, 3 December 2015 4:36 PM, Amit Langote 
> wrote:
>Especially from a readability standpoint, I think using skip_validation may be 
>more apt. 
>Why - the corresponding parameter of StoreRelCheck() dictates what's stored in 
>pg_constraint.convalidated.

Why not? won't initially_valid flag serve same purpose ?

> So, if skip_validation is 'true' because user specified the constraint NOT 
> VALID,
> StoreRelCheck() will store the constraint with convalidated as 'false'

I guess thats was added before initially_valid flag. As I said, in normal case 
gram.y take care of skip_validation & initially_valid values, if one is 'true' 
other will be 'false'. 

>The user will have to separately validate the constraint by issuing a ALTER 
>TABLE VALIDATE CONSTRAINT
>command at a time of their choosing.


This could be time consuming operation for big table, If I am pretty much sure 
that my constraint will be valid, simply I could set both flag(initially_valid 
& skip_validation) to true.

 
Regards,
Amul Sul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication and multimaster

2015-12-03 Thread Simon Riggs
On 30 November 2015 at 17:20, Konstantin Knizhnik  wrote:


> But looks like there is not so much sense in having multiple network
> connection between one pair of nodes.
> It seems to be better to have one connection between nodes, but provide
> parallel execution of received transactions at destination side. But it
> seems to be also nontrivial. We have now in PostgreSQL some infrastructure
> for background works, but there is still no abstraction of workers pool and
> job queue which can provide simple way to organize parallel execution of
> some jobs. I wonder if somebody is working now on it or we should try to
> propose our solution?
>

There are definitely two clear places where additional help would be useful
and welcome right now.

1. Allowing logical decoding to have a "speculative pre-commit data"
option, to allow some data to be made available via the decoding api,
allowing data to be transferred prior to commit. This would allow us to
reduce the delay that occurs at commit, especially for larger transactions
or very low latency requirements for smaller transactions. Some heuristic
or user interface would be required to decide whether to and which
transactions might make their data available prior to commit. And we would
need to send abort messages should the transactions not commit as expected.
That would be a patch on logical decoding and is an essentially separate
feature to anything currently being developed.

2. Some mechanism/theory to decide when/if to allow parallel apply. That
could be used for both physical and logical replication. Since the apply
side of logical replication is still being worked on there is a code
dependency there, so a working solution isn't what is needed yet. But the
general principles and any changes to the data content (wal_level) or
protocol (pglogical_output) would be useful.

We already have working multi-master that has been contributed to PGDG, so
contributing that won't gain us anything. There is a lot of code and
pglogical is the most useful piece of code to be carved off and reworked
for submission. The bottleneck is review and commit, not initial
development - which applies both to this area and most others in PostgreSQL.

Having a single network connection between nodes would increase efficiency
but also increase replication latency, so its not useful in all cases.

I think having some kind of message queue between nodes would also help,
since there are many cases for which we want to transfer data, not just a
replication data flow. For example, consensus on DDL, or MPP query traffic.
But that is open to wider debate.

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Confusing results with lateral references

2015-12-03 Thread Ashutosh Bapat
There's another seemingly wrong result, not with lateral, but with FOR
UPDATE.

postgres=# select * from t1;
 val | val2
-+--
   1 |1
(1 row)

postgres=# select * from t2;
 val | val2
-+--
   1 |1
   2 |2
   1 |1
(3 rows)

Session 1
postgres=# begin;
BEGIN
postgres=# update t1 set val = 2 where val2 = 1;
UPDATE 1

Session 2
postgres=# select * from t1 left join t2 on (t1.val = t2.val) for update of
t1;

query waits

Session 1
postgres=# commit;
COMMIT


Session 2 query returns two rows
select * from t1 left join t2 on (t1.val = t2.val) for update of t1;
 val | val2 | val | val2
-+--+-+--
   2 |1 | |
   2 |1 | |
(2 rows)

It's confusing to see two rows from left join result when the table really
has only a single row. Is this behaviour expected?

On Thu, Dec 3, 2015 at 3:49 PM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> Hi,
>
> I am seeing different results with two queries which AFAIU have same
> semantics and hence are expected to give same results.
>
> postgres=# \d t1
>   Table "public.t1"
>  Column |  Type   | Modifiers
> +-+---
>  val| integer |
>  val2   | integer |
>
> postgres=# \d t2
>   Table "public.t2"
>  Column |  Type   | Modifiers
> +-+---
>  val| integer |
>  val2   | integer |
>
> There's no data in the table to start with.
>
> postgres=# insert into t1 values (1, 1);
> postgres=# insert into t2 values (1, 1), (2, 2);
>
> Session 1
> postgres=# begin;
> BEGIN
> postgres=# update t1 set val = 2 where val2 = 1;
> UPDATE 1
>
> Session 2
> postgres=# select * from t1, (select distinct val, val2 from t2) t2 where
> t1.val = t2.val for update of t1;
>
> query waits here because of FOR UPDATE clause
>
> Session 1
> postgres=# commit;
> COMMIT
>
> Session 2 gives no rows
> postgres=# select * from t1, (select distinct val, val2 from t2) t2 where
> t1.val = t2.val for update of t1;
> val | val2 | val | val2
> -+--+-+--
> (0 rows)
>
>
> Reset values of t1
> postgres=# update t1 set val = 1 where val2 = 1;
> UPDATE 1
>
> Session 1
> postgres=# begin;
> BEGIN
> postgres=# update t1 set val = 2 where val2 = 1;
> UPDATE 1
>
> Session 2
> postgres=# select * from t1, lateral (select distinct val, val2 from t2
> where t2.val = t1.val) t2 for update of t1;
>
> query waits here
>
> Session 1
> postgres=# commit;
> COMMIT
>
> Session 2 gives results of the query
> postgres=# select * from t1, lateral (select distinct val, val2 from t2
> where t2.val = t1.val) t2 for update of t1;
>  val | val2 | val | val2
> -+--+-+--
>2 |1 |   1 |1
> (1 row)
>
> AFAIU, both the queries
>
> select * from t1, (select distinct val, val2 from t2) t2 where t1.val =
> t2.val for update of t1;
>
> AND
>
> select * from t1, lateral (select distinct val, val2 from t2 where t2.val
> = t1.val) t2 for update of t1;
>
> have same semantic and should give same results.
>
> Is seeing different results expected behaviour?
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Logical replication and multimaster

2015-12-03 Thread konstantin knizhnik

On Dec 3, 2015, at 10:34 AM, Craig Ringer wrote:

> On 3 December 2015 at 14:54, konstantin knizhnik  
> wrote:
> 
>> I'd really like to collaborate using pglogical_output if at all possible. 
>> Petr's working really hard to get the pglogical downstrem out too, with me 
>> helping where I can.
> 
> And where I can get  pglogical_output plugin? Sorry, but I can't quickly find 
> reference with Google...
> 
> It's been submitted to this CF.
> 
> https://commitfest.postgresql.org/7/418/
> 
> https://github.com/2ndQuadrant/postgres/tree/dev/pglogical-output
> 
> Any tests and comments would be greatly appreciated.
> 

Thank you.
I wonder if there is opposite part of the pipe for pglogical_output - analog of 
receiver_raw?



> -- 
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Passing initially_valid values instead of !skip_validation to StoreRelCheck() in AddRelationNewConstraints()

2015-12-03 Thread Marko Tiikkaja

On 12/3/15 12:44 PM, amul sul wrote:

On Thursday, 3 December 2015 4:36 PM, Amit Langote 
 wrote:
The user will have to separately validate the constraint by issuing a ALTER 
TABLE VALIDATE CONSTRAINT
command at a time of their choosing.


This could be time consuming operation for big table, If I am pretty much sure that 
my constraint will be valid, simply I could set both flag(initially_valid & 
skip_validation) to true.


I'm confused here.  It sounds like you're suggesting an SQL level 
feature, but you're really focused on a single line of code for some 
reason.  Could you take a step back and explain the high level picture 
of what you're trying to achieve?



.m


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication and multimaster

2015-12-03 Thread Simon Riggs
On 3 December 2015 at 12:06, konstantin knizhnik 
wrote:

>
> On Dec 3, 2015, at 10:34 AM, Craig Ringer wrote:
>
> On 3 December 2015 at 14:54, konstantin knizhnik <
> k.knizh...@postgrespro.ru> wrote:
>
>>
>> I'd really like to collaborate using pglogical_output if at all possible.
>> Petr's working really hard to get the pglogical downstrem out too, with me
>> helping where I can.
>>
>> And where I can get  pglogical_output plugin? Sorry, but I can't quickly
>> find reference with Google...
>>
>
> It's been submitted to this CF.
>
> https://commitfest.postgresql.org/7/418/
>
> https://github.com/2ndQuadrant/postgres/tree/dev/pglogical-output
>
> Any tests and comments would be greatly appreciated.
>
>
> Thank you.
> I wonder if there is opposite part of the pipe for pglogical_output -
> analog of receiver_raw?
>

Yes, there is. pglogical is currently in test and will be available
sometime soon.

-- 
Simon Riggshttp://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] broken tests

2015-12-03 Thread Pavel Raiskup
On Wednesday 02 of December 2015 20:26:56 Pavel Stehule wrote:
> 2015-12-02 20:08 GMT+01:00 Alvaro Herrera :
> 
> > Pavel Stehule wrote:
> > > Hi
> > >
> > > Today I have problem with regress tests on my laptop.
> >
> > Maybe this is because of the libxml version?
> 
> 100%, same issue is with 9.4.5
> 
> After downgrade to 2.9.2 (from 2.9.3) this issue was out

Agreed.

> So it is looking like Fedora fault

Not really.  This has been changed in libxml2 upstream:
https://bugzilla.redhat.com/show_bug.cgi?id=1286692

Pavel



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] overflow checks optimized away

2015-12-03 Thread Greg Stark
On Tue, Dec 1, 2015 at 5:17 PM, Greg Stark  wrote:
> Sorry, I didn't look at it since. At the time I was using Xi Wang's software
> to find the overflow checks that need to be redone. He published a paper on
> it and it's actually pretty impressive. It constructs a constraint problem
> and then throws a kSAT solver at it to find out if there's any code that a
> compiler could optimize away regardless of whether any existant compiler is
> actually capable of detecting the case and optimizing it away.
> https://pdos.csail.mit.edu/papers/stack:sosp13.pdf

I did get this code running again (it was quite a hassle actually).
Attached is the report.

I'm leaning towards using the builtin functions described here

http://clang.llvm.org/docs/LanguageExtensions.html#checked-arithmetic-builtins
https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html

They aren't in older GCC and clang and probably won't be in icc and
the like but we could probably implement replacements. The downside is
that then we wouldn't be able to use the generic one and would have to
use the type-specific ones which would be annoying. The Linux kernel
folk wrote wrappers that don't have that problem but they depend on
type_min() and type_max() which I've never heard of and have no idea
what support they have?

What version of GCC and other compilers did we decide we're targeting now?



-- 
greg
---
bug: anti-simplify
model: |
  %cmp48 = icmp ne i8* %30, null, !dbg !1018
  -->  true
stack: 
  - /home/stark/src/pg/postgresql-master/src/backend/storage/lmgr/lmgr.c:712:0
ncore: 1
core: 
  - /home/stark/src/pg/postgresql-master/src/backend/storage/lmgr/lmgr.c:713:0
- null pointer dereference
---
bug: anti-algebra
model: |
  %cmp = icmp ult i8* %add.ptr1, %start_address, !dbg !500
  -->  %16 = icmp slt i64 %15, 0, !dbg !500
  
  (4 + (8 * (sext i32 %7 to i64)) + %start_address)   (4 + (8 * (sext i32 %7 to i64)))   false
stack: 
  - /home/stark/src/pg/postgresql-master/src/backend/utils/adt/int.c:756:0
ncore: 1
core: 
  - /home/stark/src/pg/postgresql-master/src/backend/utils/adt/int.c:754:0
- signed addition overflow
---
bug: anti-algebra
model: |
  %cmp2 = icmp slt i32 %add1, %s, !dbg !570
  -->  %6 = icmp slt i32 %l, 0, !dbg !570
  
  (%s + %l)   %l   false
stack: 
  - /home/stark/src/pg/postgresql-master/src/backend/utils/adt/varbit.c:1170:0
ncore: 1
core: 
  - /home/stark/src/pg/postgresql-master/src/backend/utils/adt/varbit.c:1169:0
- signed addition overflow
---
bug: anti-algebra
model: |
  %cmp3 = icmp slt i32 %add, %start, !dbg !878
  -->  %2 = icmp slt i32 %length, 0, !dbg !878
  
  (%start + %length)   %length   %29 = icmp slt i32 %length, 0, !dbg !907
  
  (%start + %length)   %length   false
stack: 
  - /home/stark/src/pg/postgresql-master/src/backend/utils/adt/varlena.c:1047:0
ncore: 1
core: 
  - /home/stark/src/pg/postgresql-master/src/backend/utils/adt/varlena.c:1046:0
- signed addition overflow
---
bug: anti-algebra
model: |
  %cmp1 = icmp slt i32 %add, %S, !dbg !875
  -->  %2 = icmp slt i32 %L, 0, !dbg !875
  
  (%S + %L)   %L   false
stack: 
  - /home/stark/src/pg/postgresql-master/src/backend/utils/adt/varlena.c:2666:0
ncore: 1
core: 
  - /home/stark/src/pg/postgresql-master/src/backend/utils/adt/varlena.c:2665:0
- signed addition overflow
---
bug: anti-simplify
model: |
  %cmp19 = icmp slt i64 %mul, 0, !dbg !583
  -->  false
stack: 
  - /home/stark/src/pg/postgresql-master/src/backend/utils/adt/int8.c:576:0
ncore: 1
core: 
  - /home/stark/src/pg/postgresql-master/src/backend/utils/adt/int8.c:561:0
- signed multiplication overflow
---
bug: anti-simplify
model: |
  %cmp2 = icmp sgt i64 %1, 0, !dbg !580
  -->  false
stack: 
  - /home/stark/src/pg/postgresql-master/src/backend/utils/adt/int8.c:711:0
ncore: 1
core: 
  - /home/stark/src/pg/postgresql-master/src/backend/utils/adt/int8.c:709:0
- signed addition overflow
---
bug: anti-simplify
model: |
  %cmp2 = icmp slt i64 %1, 0, !dbg !580
  -->  false
stack: 
  - /home/stark/src/pg/postgresql-master/src/backend/utils/adt/int8.c:755:0
ncore: 1
core: 
  - /home/stark/src/pg/postgresql-master/src/backend/utils/adt/int8.c:753:0
- signed subtraction overflow
---
bug: anti-dce
model: |
  %cmp36 = icmp eq i64 %val.1, 0, !dbg !964
  -->  true
  
  lor.lhs.false38:
  %cmp39 = icmp slt i64 %val.0, 0, !dbg !964
  br i1 %cmp39, label %if.then41, label %if.end42, !dbg !964
stack: 
  - /home/stark/src/pg/postgresql-master/src/backend/utils/adt/numeric.c:5233:0
ncore: 1
core: 
  - /home/stark/src/pg/postgresql-master/src/backend/utils/adt/numeric.c:5233:0
- signed subtraction overflow

Re: [HACKERS] Passing initially_valid values instead of !skip_validation to StoreRelCheck() in AddRelationNewConstraints()

2015-12-03 Thread Amit Langote

Hi Amul!

On 2015/12/03 17:52, amul sul wrote:
> Hi ALL,
> 
> Need your suggestions.
> initially_valid flag is added to make column constraint valid. (commit : 
> http://www.postgresql.org/message-id/e1q2ak9-0006hk...@gemulon.postgresql.org)
> 
> 
> IFAICU, initially_valid and skip_validation values are mutually exclusive at 
> constraint creation(ref: gram.y), unless it set explicitly.
> 
> Can we pass initially_valid instead of !skip_validation to StoreRelCheck() in 
> AddRelationNewConstraints(), as shown below?
> 

...

> 
> 
> This will make code more readable & in my case this could enable to skip 
> validation of existing data as well as mark check constraint valid, when we 
> have assurance that modified/added constraint are valid.
> 
> Comments? Thoughts? 

Especially from a readability standpoint, I think using skip_validation
may be more apt. Why - the corresponding parameter of StoreRelCheck()
dictates what's stored in pg_constraint.convalidated. So, if
skip_validation is 'true' because user specified the constraint NOT VALID,
StoreRelCheck() will store the constraint with convalidated as 'false',
because, well, user wishes to "skip" the validation for existing rows in
the table and until a constraint has been verified for all rows in the
table, it cannot be marked valid. The user will have to separately
validate the constraint by issuing a ALTER TABLE VALIDATE CONSTRAINT
command at a time of their choosing.

OTOH, if NOT VALID was not specified, validation will not be skipped -
skip_validation would be 'false', so the constraint would be stored as
valid and added to the list of constraints to be atomically verified in
the last phase of ALTER TABLE processing.

Does that make sense?

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Passing initially_valid values instead of !skip_validation to StoreRelCheck() in AddRelationNewConstraints()

2015-12-03 Thread amul sul
Hi ALL,

Need your suggestions.
initially_valid flag is added to make column constraint valid. (commit : 
http://www.postgresql.org/message-id/e1q2ak9-0006hk...@gemulon.postgresql.org)


IFAICU, initially_valid and skip_validation values are mutually exclusive at 
constraint creation(ref: gram.y), unless it set explicitly.

Can we pass initially_valid instead of !skip_validation to StoreRelCheck() in 
AddRelationNewConstraints(), as shown below?

==
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 7d7d062..04c4f8f 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -2349,7 +2349,7 @@ AddRelationNewConstraints(Relation rel,
* OK, store it.
*/
constrOid =
-StoreRelCheck(rel, ccname, expr, !cdef->skip_validation, is_local,
+StoreRelCheck(rel, ccname, expr, cdef->initially_valid, is_local,
is_local ? 0 : 1, cdef->is_no_inherit, is_internal);

numchecks++;

==


This will make code more readable & in my case this could enable to skip 
validation of existing data as well as mark check constraint valid, when we 
have assurance that modified/added constraint are valid.

Comments? Thoughts? 

Regards,
Amul Sul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rework the way multixact truncations work

2015-12-03 Thread Noah Misch
On Wed, Dec 02, 2015 at 04:46:26PM +0100, Andres Freund wrote:
> On 2015-12-02 09:57:19 -0500, Noah Misch wrote:
> > Hackers have been too reticent to revert and redo defect-laden
> > commits.  If doing that is weird today, let it be normal.
> 
> Why?

See my paragraph ending with the two sentences you quoted.

> Especially if reverting and redoing includes conflicts that mainly
> increases the chance of accidental bugs.

True.  (That doesn't apply to these patches.)

> > git remote add community git://git.postgresql.org/git/postgresql.git
> > git remote add nmisch_github https://github.com/nmisch/postgresql.git
> > git fetch --multiple community nmisch_github
> > git diff community/master...nmisch_github/mxt4-rm-legacy
> 
> That's a nearly useless diff, because it includes a lot of other changes
> (218 files changed, 2828 insertions(+), 8742 deletions(-)) made since
> you published the changes.

Perhaps you used "git diff a..b", not "git diff a...b".  If not, please send
the outputs of "git rev-parse community/master nmisch_github/mxt4-rm-legacy"
and "git --version".

> > @@ -2190,7 +2194,8 @@ MultiXactSetNextMXact(MultiXactId nextMulti,
> > -   /*
> > -* Computing the actual limits is only possible once the data directory 
> > is
> > -* in a consistent state. There's no need to compute the limits while
> > -* still replaying WAL - no decisions about new multis are made even
> > -* though multixact creations might be replayed. So we'll only do 
> > further
> > -* checks after TrimMultiXact() has been called.
> > -*/
> > +   /* Before the TrimMultiXact() call at end of recovery, skip the rest. */
> > if (!MultiXactState->finishedStartup)
> > return;
> > -
> > Assert(!InRecovery);
> > 
> > -   /* Set limits for offset vacuum. */
> > +   /*
> > +* Setting MultiXactState->oldestOffset entails a find_multixact_start()
> > +* call, which is only possible once the data directory is in a 
> > consistent
> > +* state. There's no need for an offset limit while still replaying WAL;
> > +* no decisions about new multis are made even though multixact 
> > creations
> > +* might be replayed.
> > +*/
> > needs_offset_vacuum = SetOffsetVacuumLimit();
> 
> I don't really see the benefit of this change. The previous location of
> the comment is where we return early, so it seems appropriate to
> document the reason there?

I made that low-importance change for two reasons.  First, returning at that
point skips more than just the setting a limit; it also skips autovacuum
signalling and wraparound warnings.  Second, the function has just computed
mxid "actual limits", so branch mxt2-cosmetic made the comment specify that we
defer an offset limit, not any and all limits.

> >  static bool
> >  SetOffsetVacuumLimit(void)
> >  {
> > -   MultiXactId oldestMultiXactId;
> > +   MultiXactId oldestMultiXactId;
> > MultiXactId nextMXact;
> > -   MultiXactOffset oldestOffset = 0;   /* placate compiler */
> > -   MultiXactOffset prevOldestOffset;
> > +   MultiXactOffset oldestOffset = 0;   /* placate compiler */
> > MultiXactOffset nextOffset;
> > boololdestOffsetKnown = false;
> > +   MultiXactOffset prevOldestOffset;
> > boolprevOldestOffsetKnown;
> > -   MultiXactOffset offsetStopLimit = 0;
> 
> I don't see the benefit of the order changes here.

I reacted the same way.  Commit 4f627f8 reordered some declarations, which I
reverted when I refinished that commit as branch mxt3-main.

> > -   if (oldestOffsetKnown)
> > -   ereport(DEBUG1,
> > -   (errmsg("oldest MultiXactId member is 
> > at offset %u",
> > -   oldestOffset)));
> 
> That's imo a rather useful debug message.

The branches emit that message at the same times 4f627f8^ and earlier emit it.

> > -   else
> > +   if (!oldestOffsetKnown)
> > +   {
> > +   /* XXX This message is incorrect if 
> > prevOldestOffsetKnown. */
> > ereport(LOG,
> > (errmsg("MultiXact member wraparound 
> > protections are disabled because oldest checkpointed MultiXact %u does not 
> > exist on disk",
> > oldestMultiXactId)));
> > +   }
> > }
> 
> Hm, the XXX is a "problem" in all 9.3+ - should we just fix it everywhere?

I welcome a project to fix it.

> > LWLockRelease(MultiXactTruncationLock);
> > 
> > /*
> > -* If we can, compute limits (and install them MultiXactState) to 
> > prevent
> > -* overrun of old data in the members SLRU area. We can only do so if 
> > the
> > -* oldest offset is known though.
> > +* There's no need to update anything if we don't know the oldest offset
> > +* or if it hasn't changed.
> >  */
> 
> Is that really a worthwhile optimization?

I would 

Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-12-03 Thread Kyotaro HORIGUCHI
Hello,

At Thu, 3 Dec 2015 16:24:32 +0900, Amit Langote  
wrote in <565fee30.8010...@lab.ntt.co.jp>
> > Agreed. The patch already separates integer values and texts.
> > And re-reviewing the patch, there's no fields necessary to be
> > passed as string.
> > 
> > total_heap_pages, scanned_heap_pages, total_index_pages,
> > scanned_index_pages, vacrelstats->num_index_scans are currently
> > in int32.
> > 
> > Phase can be in integer, and schema_name and relname can be
> > represented by one OID, uint32.
> 
> AIUI, st_progress_message (strings) are to be used to share certain
> messages as progress information. I think the latest vacuum-progress patch
> uses it to report which phase lazy_scan_heap() is in, for example,
> "Scanning heap" for phase 1 of its processing and "Vacuuming index and
> heap" for phase 2. Those values are shown to the user in a text column
> named "phase" of the pg_stat_vacuum_progress view. That said, reporting
> phase as an integer value may also be worth a consideration. Some other
> command might choose to do that.
> 
> > Finally, *NO* text field is needed at least this usage. So
> > progress_message is totally useless regardless of other usages
> > unknown to us.
> 
> I think it may be okay at this point to add just those st_progress_*
> fields which are required by lazy vacuum progress reporting. If someone
> comes up with instrumentation ideas for some other command, they could
> post patches to add more st_progress_* fields and to implement
> instrumentation and a progress view for that command. This is essentially
> what Robert said in [1] in relation to my suggestion of taking out
> st_progress_param_float from this patch.

Yes. After taking a detour, though.

> By the way, there are some non-st_progress_* fields that I was talking
> about in my previous message. For example, st_activity_flag (which I have
> suggested to rename to st_command instead). It needs to be set once at the
> beginning of the command processing and need not be touched again. I think
> it may be a better idea to do the same for table name or OID. It also
> won't change over the duration of the command execution. So, we could set
> it once at the beginning where that becomes known. Currently in the patch,
> it's reported in st_progress_message[0] by every pgstat_report_progress()
> invocation. So, the table name will be strcpy()'d to shared memory for
> every scanned block that's reported.

If we don't have dedicate reporting functions for each phase
(that is, static data phase and progress phase), the one function
pgstat_report_progress does that by having some instruction from
the caller and it would be a bitfield.

Reading the flags for making decision whether every field to copy
or not and branching by that seems too much for int32 (or maybe
64?) fields. Alhtough it would be in effective when we have
progress_message fields, I don't think it is a good idea without
having progress_message.

> pgstat_report_progress(uint32 *param1,  uint16 param1_bitmap,
>char param2[][..], uint16 param2_bitmap)
> {
> ...
>   for(i = 0; i < 16 ; i++)
>   {
>   if (param1_bitmap & (1 << i))
>beentry->st_progress_param[i] = param1[i];
>   if (param2_bitmap & (1 << i))
>strcpy(beentry->..., param2[i]);
>   }

Thoughts?


> Thanks,
> Amit
> 
> [1]
> http://www.postgresql.org/message-id/CA+TgmoYdZk9nPDtS+_kOt4S6fDRQLW+1jnJBmG0pkRT4ynxO=q...@mail.gmail.com

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Confusing results with lateral references

2015-12-03 Thread Ashutosh Bapat
Hi,

I am seeing different results with two queries which AFAIU have same
semantics and hence are expected to give same results.

postgres=# \d t1
  Table "public.t1"
 Column |  Type   | Modifiers
+-+---
 val| integer |
 val2   | integer |

postgres=# \d t2
  Table "public.t2"
 Column |  Type   | Modifiers
+-+---
 val| integer |
 val2   | integer |

There's no data in the table to start with.

postgres=# insert into t1 values (1, 1);
postgres=# insert into t2 values (1, 1), (2, 2);

Session 1
postgres=# begin;
BEGIN
postgres=# update t1 set val = 2 where val2 = 1;
UPDATE 1

Session 2
postgres=# select * from t1, (select distinct val, val2 from t2) t2 where
t1.val = t2.val for update of t1;

query waits here because of FOR UPDATE clause

Session 1
postgres=# commit;
COMMIT

Session 2 gives no rows
postgres=# select * from t1, (select distinct val, val2 from t2) t2 where
t1.val = t2.val for update of t1;
val | val2 | val | val2
-+--+-+--
(0 rows)


Reset values of t1
postgres=# update t1 set val = 1 where val2 = 1;
UPDATE 1

Session 1
postgres=# begin;
BEGIN
postgres=# update t1 set val = 2 where val2 = 1;
UPDATE 1

Session 2
postgres=# select * from t1, lateral (select distinct val, val2 from t2
where t2.val = t1.val) t2 for update of t1;

query waits here

Session 1
postgres=# commit;
COMMIT

Session 2 gives results of the query
postgres=# select * from t1, lateral (select distinct val, val2 from t2
where t2.val = t1.val) t2 for update of t1;
 val | val2 | val | val2
-+--+-+--
   2 |1 |   1 |1
(1 row)

AFAIU, both the queries

select * from t1, (select distinct val, val2 from t2) t2 where t1.val =
t2.val for update of t1;

AND

select * from t1, lateral (select distinct val, val2 from t2 where t2.val =
t1.val) t2 for update of t1;

have same semantic and should give same results.

Is seeing different results expected behaviour?
-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Logical replication and multimaster

2015-12-03 Thread Shulgin, Oleksandr
On Thu, Dec 3, 2015 at 8:34 AM, Craig Ringer  wrote:

> On 3 December 2015 at 14:54, konstantin knizhnik <
> k.knizh...@postgrespro.ru> wrote:
>
>>
>> Are there some principle problems with it? In BDR it was handled in
>> alternative way, using executor callback. It will be much easier if DDL can
>> be replicated in the same way as normal SQL statements.
>>
>
> It can't. I wish it could.
>

That reminds me of that DDL deparsing patch I was trying to revive a while
ago.  Strangely, I cannot find it in any of the commit fests.  Will add it.

--
Alex


Re: [HACKERS] psql: add \pset true/false

2015-12-03 Thread Daniel Verite
Jim Nasby wrote:

> I was more thinking it would be nice to be able to temporarily 
> over-ride/wrap what an output function is doing. AFAIK that would allow 
> this to work everywhere (row(), copy, etc). I don't know of any remotely 
> practical way to do that, though.

Yes. Something like:

SET [LOCAL] OUTPUT STYLE FOR  TYPE "typename" TO json_value;

where json_value would represent a set of type-dependant
parameters.
Then the type's output_function and type_modifier_output_function
refered to in CREATE TYPE could use these parameters
to customize the text representation.

For example:
SET output style FOR TYPE bool TO '{"true":"t", "false":"f"}';
or
SET output style FOR TYPE bool TO '{"true":"TRUE", "false":"FALSE"}';

This style declaration doesn't quite fit with GUCs because it should
be bound to a type, but otherwise the behavior is comparable to a
session-wide or transaction-wide SET.

Could be used for date/times too:

SET output style FOR TYPE timestamptz
  TO  '{"format": "DD/MM/ HH24:MI TZ"}';

where applying format would essentially mean
to_char(timestamp, format), which is more flexible
than DateStyle for the output part.


Going even further, maybe some types could support:
SET output style FOR TYPE typename
  TO  '{"function": "funcname"}';

where the function should exist as 
 funcname(typename) returns text
and the type's output_function would just act as a wrapper.

or even:
SET output style FOR TYPE typename
  TO  '{"filter": "funcname"}';
where the function would exist as 
 funcname(text) returns text
and the type's output_function would call funcname
as an output filter for values already in text format.


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PostgresNode::_update_pid using undefined variables in tap tests

2015-12-03 Thread Michael Paquier
Hi all,

While running the test suite this morning I have noticed the following error:
server stopped
readline() on closed filehandle $pidfile at
/Users/ioltas/git/postgres/src/bin/pg_rewind/../../../src/test/perl/PostgresNode.pm
line 308.
Use of uninitialized value in concatenation (.) or string at
/Users/ioltas/git/postgres/src/bin/pg_rewind/../../../src/test/perl/PostgresNode.pm
line 309.
# Postmaster PID is

This does not impact the run, but it creates unwelcome warnings in the
logs. This is actually caused by the following code in PostgresNode
that uses an incorrect check to see if the file has been correctly
opened or not:
open my $pidfile, $self->data_dir . "/postmaster.pid";
if (not defined $pidfile)

One way to fix this is to use if(open(...)), a second way I know of is
to check if the opened file handle matches tell($pidfile) == -1. The
patch attached uses the first method to fix the issue.
Regards,
-- 
Michael
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index aa7a00c..8f66a9c 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -297,17 +297,16 @@ sub _update_pid
 	# If we can open the PID file, read its first line and that's the PID we
 	# want.  If the file cannot be opened, presumably the server is not
 	# running; don't be noisy in that case.
-	open my $pidfile, $self->data_dir . "/postmaster.pid";
-	if (not defined $pidfile)
+	if (open my $pidfile, $self->data_dir . "/postmaster.pid")
 	{
-		$self->{_pid} = undef;
-		print "# No postmaster PID\n";
+		$self->{_pid} = <$pidfile>;
+		print "# Postmaster PID is $self->{_pid}\n";
+		close $pidfile;
 		return;
 	}
 
-	$self->{_pid} = <$pidfile>;
-	print "# Postmaster PID is $self->{_pid}\n";
-	close $pidfile;
+	$self->{_pid} = undef;
+	print "# No postmaster PID\n";
 }
 
 #

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-03 Thread Noah Misch
On Wed, Dec 02, 2015 at 04:33:50PM -0300, Alvaro Herrera wrote:
> Noah Misch wrote:
> > On Tue, Dec 01, 2015 at 08:11:21PM -0300, Alvaro Herrera wrote:
> > > Finally, I ran perltidy on all the files, which strangely changed stuff
> > > that I didn't expect it to change.  I wonder if this is related to the
> > > perltidy version.
> > 
> > The last pgindent run (commit 807b9e0) used perltidy v20090616, and perltidy
> > behavior has changed slightly over time.  Install that version to do your 
> > own
> > perltidy runs.
> 
> I tried that version, but it seems to emit the same.

  git checkout 807b9e0
  (find src -name \*.pl -o -name \*.pm ) | sort -u | xargs perltidy 
--profile=src/tools/pgindent/perltidyrc

perltidy v20090616 leaves the working directory clean, but perltidy v20150815
introduces diffs:

 src/backend/catalog/genbki.pl| 15 ---
 src/bin/pg_basebackup/t/010_pg_basebackup.pl |  2 +-
 src/tools/msvc/Install.pm|  6 +++---
 src/tools/msvc/Mkvcbuild.pm  |  2 +-
 src/tools/msvc/Project.pm|  2 +-
 src/tools/msvc/Solution.pm   |  5 ++---
 src/tools/msvc/gendef.pl |  4 ++--
 7 files changed, 18 insertions(+), 18 deletions(-)

You see a different result?

> How did you figure
> that that was the version used, anyway?

I asked Bruce at one point.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication and multimaster

2015-12-03 Thread Craig Ringer
On 3 December 2015 at 19:06, konstantin knizhnik 
wrote:

>
> On Dec 3, 2015, at 10:34 AM, Craig Ringer wrote:
>
> On 3 December 2015 at 14:54, konstantin knizhnik <
> k.knizh...@postgrespro.ru> wrote:
>
>>
>> I'd really like to collaborate using pglogical_output if at all possible.
>> Petr's working really hard to get the pglogical downstrem out too, with me
>> helping where I can.
>>
>> And where I can get  pglogical_output plugin? Sorry, but I can't quickly
>> find reference with Google...
>>
>
> It's been submitted to this CF.
>
> https://commitfest.postgresql.org/7/418/
>
> https://github.com/2ndQuadrant/postgres/tree/dev/pglogical-output
>
> Any tests and comments would be greatly appreciated.
>
>
> Thank you.
> I wonder if there is opposite part of the pipe for pglogical_output -
> analog of receiver_raw?
>

It's pglogical, and it's in progress, due to be released at the same time
as 9.5. We're holding it a little longer to nail down the user interface a
bit better, etc, and because sometimes the real world gets in the way.

The catalogs  and UI are very different to BDR, it's much more
extensible/modular, it supports much more flexible topologies, etc... but
lots of the core concepts are very similar. So if you go take a look at the
BDR code that'll give you a pretty solid idea of how a lot of it works,
though BDR has whole subsystems pglogical doesn't (global ddl lock, ddl
replication, etc).

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Logical replication and multimaster

2015-12-03 Thread Craig Ringer
On 3 December 2015 at 20:39, Simon Riggs  wrote:

> On 30 November 2015 at 17:20, Konstantin Knizhnik <
> k.knizh...@postgrespro.ru> wrote:
>
>
>> But looks like there is not so much sense in having multiple network
>> connection between one pair of nodes.
>> It seems to be better to have one connection between nodes, but provide
>> parallel execution of received transactions at destination side. But it
>> seems to be also nontrivial. We have now in PostgreSQL some infrastructure
>> for background works, but there is still no abstraction of workers pool and
>> job queue which can provide simple way to organize parallel execution of
>> some jobs. I wonder if somebody is working now on it or we should try to
>> propose our solution?
>>
>
> There are definitely two clear places where additional help would be
> useful and welcome right now.
>

Three IMO, in that a re-usable, generic bgworker pool driven by shmem
messaging would be quite handy. We'll want something like that when we have
transaction interleaving.

I think Konstantin's design is a bit restrictive at the moment; at the
least it needs to address sticky dispatch, and it almost certainly needs to
be using dynamic bgworkers (and maybe dynamic shmem too) to be flexible.
Some thought will be needed to make sure it doesn't rely on !EXEC_BACKEND
stuff like passing pointers to fork()ed data from postmaster memory too.
But the general idea sounds really useful, and we'll either need that or to
use async libpq for concurrent apply.


> 1. Allowing logical decoding to have a "speculative pre-commit data"
> option, to allow some data to be made available via the decoding api,
> allowing data to be transferred prior to commit.
>

Petr, Andres and I tended to refer to that as interleaved transaction
streaming. The idea being to send changes from multiple xacts mixed
together in the stream, identifed by an xid sent with each message, as we
decode them from WAL. Currently we add them to a local reorder buffer and
send them only in commit order after commit.

This moves responsibility for xact ordering (and buffering, if necessary)
to the downstream. It introduces the possibility that concurrently replayed
xacts could deadlock with each other and a few exciting things like that,
too, but with the payoff that we can continue to apply small transactions
in a timely manner even as we're streaming a big transaction like a COPY.

We could possibly enable interleaving right from the start of the xact, or
only once it crosses a certain size threshold. For your purposes Konstantin
you'd want to do it right from the start since latency is crucial for you.
For pglogical we'd probably want to buffer them a bit and only start
streaming if they got big.

This would allow us to reduce the delay that occurs at commit, especially
> for larger transactions or very low latency requirements for smaller
> transactions. Some heuristic or user interface would be required to decide
> whether to and which transactions might make their data available prior to
> commit.
>

I imagine we'd have a knob, either global or per-slot, that sets a
threshold based on size in bytes of the buffered xact. With 0 allowed as
"start immediately".


> And we would need to send abort messages should the transactions not
> commit as expected. That would be a patch on logical decoding and is an
> essentially separate feature to anything currently being developed.
>

I agree that this is strongly desirable. It'd benefit anyone using logical
decoding and would have wide applications.


> 2. Some mechanism/theory to decide when/if to allow parallel apply.
>

I'm not sure it's as much about allowing it as how to do it.


> We already have working multi-master that has been contributed to PGDG, so
> contributing that won't gain us anything.
>

Namely BDR.


> There is a lot of code and pglogical is the most useful piece of code to
> be carved off and reworked for submission.
>

Starting with the already-published output plugin, with the downstream to
come around the release of 9.5.


> Having a single network connection between nodes would increase efficiency
> but also increase replication latency, so its not useful in all cases.
>

If we interleave messages I'm not sure it's too big a problem. Latency
would only become an issue there if a big single row (big Datum contents)
causes lots of small work to get stuck behind it.

IMO this is a separate issue to be dealt with later.

I think having some kind of message queue between nodes would also help,
> since there are many cases for which we want to transfer data, not just a
> replication data flow. For example, consensus on DDL, or MPP query traffic.
> But that is open to wider debate.
>

Logical decoding doesn't really define any network protocol at all. It's
very flexible, and we can throw almost whatever we want down it. The
pglogical_output protocol is extensible enough that we can just add
additional messages when we need to, making 

Re: [HACKERS] Logical replication and multimaster

2015-12-03 Thread Craig Ringer
On 3 December 2015 at 15:27, konstantin knizhnik 
wrote:

>
> On Dec 3, 2015, at 4:18 AM, Craig Ringer wrote:
>
> > Excellent.
> >
> > It should be possible to make that a separate extension. You can use C
> functions from other extensions by exposing a single pg_proc function with
> 'internal' return type that populates a struct of function pointers for the
> API. A single DirectFunctionCall lets you get the API struct. That's how
> pglogical_output handles hooks. The main downside is that you can't do that
> without a connection to a database with the extension installed so the
> pg_proc entry is exposed.
>
>
> Actually, working under cluster and columnar storage extension I got
> several questions about PostgreSQL infrastructure.
> I always found some workarounds, but may it is better to ask community
> about it:)
>
> 1. Why there is no "conditional event" synchronization primitive in
> PostgreSQL. There is latch, but it is implemented using sockets and I
> afraid that it is not very fast.
> It will be nice to have some fast primitive like pthread condition
> variables.
>

The need for IPC makes things a bit more complex. Most places can get away
with using a latch, testing one or more conditions, and resuming waiting.

While what you describe sounds possibly nice is there any evidence that
it's a bottleneck or performance issue? Or is this premature optimisation
at work?


> 2. PostgreSQL semaphores seems to be not intended for external use outside
> PostgreSQL core  (for example in extensions).
> There is no way to request additional amount of semaphores. Right now
> semaphores are allocated based on maximal number of backends and spinlocks.
>

Same with spinlocks AFAIK.

You can add your own LWLocks though.


> 3. What is the right way of creation of background worker requiring access
> to shared memory, i.e. having control structure in main memory?
>

This is documented and well established.


> As far as I understand background workers have to be registered either
> PG_init, either outside Postmaster environment.
> If extension requires access to shared memory, then it should be
> registered in shared_preload_libraries list and should be initialized using
> shmem_startup hook.
>

Correct.

You can use dynamic shmem instead, but there are some issues there IIRC.
Petr may have more to say there.

Take a look at the BDR code for some examples, and there are some in
contrib too I think.

My_shmem_startup is needed because in _PG_init it is not possible to
> allocate shared memory.
>

Correct, since it's in early postmaster start.


> So if I need to allocate some control structure for background workers  in
> shared memory, then I should do it in My_shmem_startup.
>

Yes.


> But I can not register background workers in My_shmem_startup!


Correct. Register static bgworkers in _PG_init. Register dynamic bgworkers
later, in a normal backend function or a bgworker main loop.


> So I have to register background workers in PG_init while control
> structure for them is not yet ready.
>

Correct.

They aren't *started* until after shmem init, though.


> When I have implemented pool of background workers, I solved this problem
> by proving function which return address of control structure later - when
> it will be actually allocated.
>

Beware of EXEC_BACKEND. You can't assume you have shared postmaster memory
from fork().

I suggest that you allocate a static shmem array. Pass indexes into it as
the arguments to the bgworkers.  Have them look up their index in the array
to get their struct pointer.

Read the BDR code to see how this can work; see bdr_perdb.c, bdr_apply.c,
etc's bgworker main loops, bdr_supervisor.c and bdr_perdb.c's code for
registering dynamic bgworkers, and the _PG_init function's setup of the
static supervisor bgworker.

In your case I think you should probably be using dynamic bgworkers for
your pool anyway, so you can grow and shrink them as-needed.

But it seems to be some design flaw in BGW, isn' it?
>

I don't think so. You're registering the worker, saying "when you're ready
please start this". You're not starting it.

You can use dynamic bgworkers too. Same deal, you register them and the
postmaster starts them in a little while, but you can register them after
_PG_init.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] [RFC] overflow checks optimized away

2015-12-03 Thread Tom Lane
Greg Stark  writes:
> What version of GCC and other compilers did we decide we're targeting now?

I can't see us moving the compiler goalposts one inch for this.
"I'm going to break building on your compiler in order to work around
bugs in somebody else's compiler" isn't gonna fly.

The original post pointed out that we haven't introduced the appropriate
equivalents of -fwrapv for non-gcc compilers, which is a good point that
we should fix.  Beyond that, I'm not convinced.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] overflow checks optimized away

2015-12-03 Thread Greg Stark
On Thu, Dec 3, 2015 at 2:51 PM, Tom Lane  wrote:
> I can't see us moving the compiler goalposts one inch for this.
> "I'm going to break building on your compiler in order to work around
> bugs in somebody else's compiler" isn't gonna fly.

Fwiw the builtins offer a carrot as well. They promise to use
architecture features like arithmetic status flags which can be faster
than explicit comparisons and also avoid extra branches that can mess
up cache and branch prediction.

I was proposing to implement wrappers around them that do the checks
manually if they're not present.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] overflow checks optimized away

2015-12-03 Thread Tom Lane
Greg Stark  writes:
> On Thu, Dec 3, 2015 at 2:51 PM, Tom Lane  wrote:
>> I can't see us moving the compiler goalposts one inch for this.
>> "I'm going to break building on your compiler in order to work around
>> bugs in somebody else's compiler" isn't gonna fly.

> I was proposing to implement wrappers around them that do the checks
> manually if they're not present.

As long as there's a fallback for compilers without the builtins, fine.
I read your question as suggesting that you were thinking about not
providing a fallback ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] overflow checks optimized away

2015-12-03 Thread Andres Freund
On 2015-12-03 12:10:27 +, Greg Stark wrote:
> I'm leaning towards using the builtin functions described here

For performance reasons? Michael's version of the patch had the
necessary 'raw' macros, and they don't look *that* bad. Using the
__builtin variants when available, would be nice - and not hard. On
e.g. x86 the overflow checks can be done much more efficiently than both
the current and patched checks.

I wonder though if we can replace

#define PG_INT16_ADD_OVERFLOWS(a, b) (  \
((a) > 0 && (b) > 0 && (a) > PG_INT16_MAX - (b)) || \
((a) < 0 && (b) < 0 && (a) < PG_INT16_MIN - (b)))

#define PG_INT32_ADD_OVERFLOWS(a, b) (  \
((a) > 0 && (b) > 0 && (a) > PG_INT32_MAX - (b)) || \
((a) < 0 && (b) < 0 && (a) < PG_INT32_MIN - (b)))

...

with something like
#define PG_ADD_OVERFLOWS(a, b, MINVAL, MAXVAL) (\
((a) > 0 && (b) > 0 && (a) > MAXVAL - (b)) ||   \
((a) < 0 && (b) < 0 && (a) < MINVAL - (b)))
#define PG_INT16_ADD_OVERFLOWS(a, b)\
 PG_ADD_OVERFLOWS(a, b, PG_INT16_MIN, PG_INT16_MAX)

especially for the MUL variants that'll save a bunch of long repetitions.

> The downside is that then we wouldn't be able to use the generic one
> and would have to use the type-specific ones which would be annoying.

Doesn't seem to be all that bad to me. If we do the fallback like in the
above it should be fairly ok repetition wise.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [DOCS] max_worker_processes on the standby

2015-12-03 Thread Petr Jelinek

On 2015-11-16 22:43, Alvaro Herrera wrote:

I paraphrase Fujii Masao, who wrote:


1. Start the master and standby servers with track_commit_timestamp enabled.
2. Disable track_commit_timestamp in the master and restart the master server.
3. Run checkpoint in the master.
4. Run restartpoint in the standby after the checkpoint WAL record generated
5. Restart the standby server.
6. Enable track_commit_timestamp in the master and restart the master server.
7. Disable track_commit_timestamp in the master and restart the master server.



What I think strange is that pg_last_committed_xact() behaves differently
in #2, #5, and #7 though the settings of track_commit_timestamp are same
in both servers, i.e., it's disabled in the master but enabled in the standby.


Interesting, thanks.  You're right that this behaves oddly.

I think in order to fix these two points (#5 and #7), we need to make
the standby not honour the GUC at all, i.e. only heed what the master
setting is.


8. Promote the standby server to new master.
 Since committs is still inactive even after the promotion,
 pg_last_committed_xact() keeps causing an ERROR though
 track_commit_timestamp is on.

I was thinking that whether committs is active or not should depend on
the setting of track_commit_timestamp *after* the promotion.
The behavior in #8 looked strange.


To fix this problem, we can re-run StartupCommitTs() after recovery is
done, this time making sure to honour the GUC setting.

I haven't tried the scenarios we fixed with the previous round of
patching, but this patch seems to close the problems just reported.
(My next step will be looking over the recovery test framework by
Michael et al, so that I can apply a few tests for this stuff.)
In the meantime, if you can look this over I would appreciate it.



While this seems good, I'd code it slightly differently. I didn't like 
the addition of new bool when it's not really needed. This brings the 
question if we actually need the BootStrapCommitTs and StartupCommitTs 
functions which really don't do much though.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


committs-activation-fix.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: add 'waiting for replication' to pg_stat_activity.state

2015-12-03 Thread Amit Kapila
On Thu, Dec 3, 2015 at 9:02 AM, Craig Ringer  wrote:
>
> On 3 December 2015 at 09:32, Peter Eisentraut  wrote:
>>
>> On 12/2/15 7:00 PM, Craig Ringer wrote:
>> > I notice that you don't set the 'waiting' flag.  'waiting' is presently
>> > documented as:
>> >
>> >True if this backend is currently waiting on a
lock
>> >
>> > ... but I'm inclined to just widen its definition and set it here,
since
>> > we most certainly are waiting, and the column isn't named
>> > 'waiting_on_a_lock'. It shouldn't upset various canned lock monitoring
>> > queries people have since they generally do an inner join on pg_locks
>> > anyway.
>>
>> I'm not so sure about that assumption.
>
>
> Even if it's an outer join, the worst that'll happen is that they'll get
entries with nulls in pg_locks. I don't think it's worth worrying about too
much.
>

That can be one way of dealing with it and another is that we
keep the current column as it is and add another view related
wait stats, anyway we need something like that for other purposes
like lwlock waits etc.  Basically something on lines what we
is being discussed in thread [1]

[1] -
http://www.postgresql.org/message-id/CAA4eK1+=5Ex8-5NRr3u94=_t2p65v0kcjZ5rXddVmkx=lwa...@mail.gmail.com


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Logical replication and multimaster

2015-12-03 Thread Petr Jelinek

On 2015-12-03 14:32, Craig Ringer wrote:

On 3 December 2015 at 15:27, konstantin knizhnik
> wrote:

3. What is the right way of creation of background worker requiring
access to shared memory, i.e. having control structure in main memory?


This is documented and well established.

As far as I understand background workers have to be registered
either PG_init, either outside Postmaster environment.
If extension requires access to shared memory, then it should be
registered in shared_preload_libraries list and should be
initialized using shmem_startup hook.


Correct.

You can use dynamic shmem instead, but there are some issues there IIRC.
Petr may have more to say there.
Take a look at the BDR code for some examples, and there are some in
contrib too I think.



If you have your own flock of dynamic workers that you manage yourself, 
it's probably easier to use dynamic shared memory. You can see some 
examples in the tests and also in the parallel query code for how to do 
it. The only real issue we faced with using dynamic shared memory was 
that we needed to do IPC from normal backends and that gets complicated 
when you don't have the worker info in the normal shmem.



The registration timing and working with normal shmem is actually not a 
problem. Just register shmem start hook in _PG_init and if you are 
registering any bgworkers there as well make sure you set bgw_start_time 
correctly (usually what you want is BgWorkerStart_RecoveryFinished). 
Then you'll have the shmem hook called before the bgworker is actually 
started.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Fwd: [HACKERS] Another little thing about psql wrapped expanded output

2015-12-03 Thread Jeff Janes
On Wed, Dec 2, 2015 at 7:59 PM, David Fetter  wrote:
> On Wed, Dec 02, 2015 at 10:36:56PM -0500, Josh Berkus wrote:
>> On 12/02/2015 05:24 PM, Tom Lane wrote:
>> > Don't think I agree.  Suppose that you have a wider-than-screen table
>> > and you use a pager to scroll left and right in that.  If we shorten the
>> > dashed lines, then once you scroll to the right of wherever they stop,
>> > you lose that visual cue separating the rows.  This matters a lot if
>> > only a few of the column values are very wide: everywhere else, there's
>> > gonna be lots of whitespace.
>>
>> What pager lets me scroll right infinitely?  Because I wanna install that.
>
> I don't know about infinitely, but at least with the -S (no wrap)
> option, less lets you use left- and right-arrow, prefixed by
> multipliers, if you like, to scroll horizontally
>
> And now I have learned something new about a pager I've used every day
> for decades.


And even if you don't specify -S, 'less' will jump into that mode
anyway as soon as you hit the right array.  It starts out wrapping,
but hit the right arrow and it undoes the wrap.  So I often do hit '1'
right arrow to get out of the wrap situation described above.  I do
lose the leftmost character on the screen.  Then you can left-arrow
all the way back to the left, and it jumps back into wrap mode.

Pretty darn useful at times.  Now, if I could just find a way to tell
'less' after-the-fact "pretend I started you up with the -X flag".
Once I find the part I want, I want to quit the pager without clearing
that stuff off of the screen.  Sometimes.

Cheers,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-12-03 Thread Catalin Iacob
On Thu, Dec 3, 2015 at 7:58 AM, Pavel Stehule  wrote:
> I am sorry, I don't understand. Now due inheritance plpy.Fatal and
> plpy.SPIError has availability to use keyword parameters.

Indeed, I didn't realize this, but I don't think it changes the main argument.

What I think should happen:

1. Error should take keyword arguments
2. same for Fatal
3. Fatal should not be changed to inherit from Error - it should stay
like it is now, just another exception class
You can argue a Fatal error is an Error but these classes already
exist and are independent, changing their relationship is backward
incompatible.
4. SPIError should not be changed at all
 It's for errors raised by query execution not user PL/Python code
so doing raise SPIError in PL/Python doesn't make sense.
 And again, changing the inheritance relationship of these
existing classes changes meaning of existing code that catches the
exceptions.
5. all the reporting functions: plpy.debug...plpy.fatal functions
should also be changed to allow more arguments than the message and
allow them as keyword arguments
 They are Python wrappers for ereport and this would make them
similar in capabilities to the PL/pgSQL RAISE
 This will make plpy.error(...) stay equivalent in capabilities
with raise plpy.Error(...), same for fatal and Fatal
6. the docs should move to the "Utility Functions" section
 There, it's already described how to raise errors either via the
exceptions or the utility functions.

I think the above doesn't break anything, doesn't confuse user
exceptions with backend SPIError exceptions, enhances error reporting
features for the PL/Python user to bring them up to par with ereport
and PL/pgSQL RAISE and solve your initial use case at the top of the
thread (but with slightly different spelling to match what already
exists in PL/Python):

"We cannot to raise PostgreSQL exception with setting all possible
fields. I propose new function

plpy.ereport(level, [ message [, detail [, hint [, sqlstate, ... )"

Is what I mean more clear now? Do you (and others) agree?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] snapshot too old, configured by time

2015-12-03 Thread Andres Freund
On 2015-12-02 14:48:24 -0600, Kevin Grittner wrote:
> On Wed, Dec 2, 2015 at 12:39 AM, Michael Paquier
>  wrote:
> > On Mon, Nov 9, 2015 at 8:07 AM, Steve Singer  wrote:
> 
> >> In snapmgr.c
> >>
> >>
> >> + * XXX: If we can trust a read of an int64 value to be atomic, we can 
> >> skip the
> >> + * spinlock here.
> >> + */
> >> +int64
> >> +GetOldSnapshotThresholdTimestamp(void)
> >>
> >>
> >> Was your intent with the XXX for it to be a TODO to only aquire the lock on
> >> platforms without the atomic 64bit operations?
> 
> I'm not sure whether we can safely assume a read of an int64 to be
> atomic on any platform; if we actually can on some platforms, and
> we have a #define to tell us whether we are in such an environment,
> we could condition the spinlock calls on that.  Are we there yet?

We currently don't assume it's atomic. And there are platforms, e.g 32
bit arm, where that's not the case
(c.f. https://wiki.postgresql.org/wiki/Atomics). It'd be rather useful
to abstract that knowledge into a macro...

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] overflow checks optimized away

2015-12-03 Thread Greg Stark
On Thu, Dec 3, 2015 at 2:51 PM, Tom Lane  wrote:
> Greg Stark  writes:
>> What version of GCC and other compilers did we decide we're targeting now?
>
> I can't see us moving the compiler goalposts one inch for this.

That's not the question I asked :/

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Refactor Perl test code

2015-12-03 Thread Alvaro Herrera
Tom Lane wrote:
> Michael Paquier  writes:
> > On Thu, Dec 3, 2015 at 12:19 PM, Tom Lane  wrote:
> >> BTW, not the fault of this patch in particular, but this example points
> >> up the complaint I've had right along about how opaque TAP test failures
> >> are.  How did you dig down to see that error message?
> 
> > Well, it showed up on my terminal...
> 
> Not on mine, as per the extract I showed.  Probably a Perl version
> difference, but I don't think we can exactly write off RHEL6 as an
> uninteresting obsolete distribution.  (The Perl here is 5.10.1.)

Hmm, maybe a selinux policy or something like that is preventing some
system call from working, and this causes a Perl call to fail which we
may be failing to report; for instance we have
open(ORIG_STDOUT, ">");
in TestLib.pm's INIT block, without an "or die" which would let us know
that it failed.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] broken tests

2015-12-03 Thread Pavel Stehule
2015-12-03 12:39 GMT+01:00 Pavel Raiskup :

> On Wednesday 02 of December 2015 20:26:56 Pavel Stehule wrote:
> > 2015-12-02 20:08 GMT+01:00 Alvaro Herrera :
> >
> > > Pavel Stehule wrote:
> > > > Hi
> > > >
> > > > Today I have problem with regress tests on my laptop.
> > >
> > > Maybe this is because of the libxml version?
> >
> > 100%, same issue is with 9.4.5
> >
> > After downgrade to 2.9.2 (from 2.9.3) this issue was out
>
> Agreed.
>
> > So it is looking like Fedora fault
>
> Not really.  This has been changed in libxml2 upstream:
> https://bugzilla.redhat.com/show_bug.cgi?id=1286692


ok

I would not be offensive

Regards

Pavel


>
>
> Pavel
>
>


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-03 Thread Michael Paquier
On Thu, Dec 3, 2015 at 6:59 AM, Alvaro Herrera  wrote:
> I didn't push the changes for config_default you requested a few
> messages upthread; it's not clear to me how setting it to undef affects
> the whole thing.  If setting it to undef makes the MSVC toolchain run
> the tap tests in the default config, then I can do it; let's be clear
> about what branch to backpatch this to.  Also the "1;" at the end of
> RewindTest.

Setting it to undef will prevent the tests to run, per vcregress.pl:
sub tap_check
{
die "Tap tests not enabled in configuration"
  unless $config->{tap_tests};
Also, setting it to undef will match the existing behavior on
platforms where ./configure is used because the switch
--enable-tap-tests needs to be used there. And I would believe that in
most cases Windows environments are not going to have IPC::Run
deployed.

I have also rebased the recovery test suite as the attached, using the
infrastructure that has been committed lately.
Regards,
-- 
Michael
From de2121eeb50c5ae49b29a2ac21a16579eae2de98 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 3 Dec 2015 13:48:48 +0900
Subject: [PATCH 1/2] Fix tap_test configuration in MSVC builds

---
 src/tools/msvc/config_default.pl | 27 ++-
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/src/tools/msvc/config_default.pl b/src/tools/msvc/config_default.pl
index b9f2ff4..e50be7e 100644
--- a/src/tools/msvc/config_default.pl
+++ b/src/tools/msvc/config_default.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 
 our $config = {
-	asserts => 0,# --enable-cassert
+	asserts  => 0,# --enable-cassert
 	  # integer_datetimes=>1,   # --enable-integer-datetimes - on is now default
 	  # float4byval=>1, # --disable-float4-byval, on by default
 
@@ -13,18 +13,19 @@ our $config = {
 	# blocksize => 8, # --with-blocksize, 8kB by default
 	# wal_blocksize => 8, # --with-wal-blocksize, 8kB by default
 	# wal_segsize => 16,  # --with-wal-segsize, 16MB by default
-	ldap => 1,# --with-ldap
-	extraver => undef,# --with-extra-version=
-	nls  => undef,# --enable-nls=
-	tcl  => undef,# --with-tls=
-	perl => undef,# --with-perl
-	python   => undef,# --with-python=
-	openssl  => undef,# --with-openssl=
-	uuid => undef,# --with-ossp-uuid
-	xml  => undef,# --with-libxml=
-	xslt => undef,# --with-libxslt=
-	iconv=> undef,# (not in configure, path to iconv)
-	zlib => undef # --with-zlib=
+	ldap  => 1,# --with-ldap
+	extraver  => undef,# --with-extra-version=
+	nls   => undef,# --enable-nls=
+	tap_tests => undef,# --enable-tap-tests
+	tcl   => undef,# --with-tls=
+	perl  => undef,# --with-perl
+	python=> undef,# --with-python=
+	openssl   => undef,# --with-openssl=
+	uuid  => undef,# --with-ossp-uuid
+	xml   => undef,# --with-libxml=
+	xslt  => undef,# --with-libxslt=
+	iconv => undef,# (not in configure, path to iconv)
+	zlib  => undef # --with-zlib=
 };
 
 1;
-- 
2.6.3

From 4ee2a99db2a414f85e961110279f4b97309cd927 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 3 Dec 2015 15:41:12 +0900
Subject: [PATCH 2/2] Add recovery test suite

This includes basic tests maipulating standbys, be they archiving or
streaming nodes, and some basic sanity checks around them.
---
 src/test/Makefile   |   2 +-
 src/test/perl/RecoveryTest.pm   | 167 
 src/test/recovery/.gitignore|   3 +
 src/test/recovery/Makefile  |  17 +++
 src/test/recovery/README|  19 
 src/test/recovery/t/001_stream_rep.pl   |  58 ++
 src/test/recovery/t/002_archiving.pl|  43 +++
 src/test/recovery/t/003_recovery_targets.pl | 123 
 src/test/recovery/t/004_timeline_switch.pl  |  66 +++
 src/test/recovery/t/005_replay_delay.pl |  41 +++
 10 files changed, 538 insertions(+), 1 deletion(-)
 create mode 100644 src/test/perl/RecoveryTest.pm
 create mode 100644 src/test/recovery/.gitignore
 create mode 100644 src/test/recovery/Makefile
 create mode 100644 src/test/recovery/README
 create mode 100644 src/test/recovery/t/001_stream_rep.pl
 create mode 100644 src/test/recovery/t/002_archiving.pl
 create mode 100644 src/test/recovery/t/003_recovery_targets.pl
 create mode 100644 src/test/recovery/t/004_timeline_switch.pl
 create mode 100644 src/test/recovery/t/005_replay_delay.pl

diff --git a/src/test/Makefile b/src/test/Makefile
index b713c2c..7f7754f 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,7 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = regress isolation modules
+SUBDIRS = regress isolation modules recovery
 
 # We don't build or 

Re: [HACKERS] Error with index on unlogged table

2015-12-03 Thread Andres Freund
On 2015-11-27 16:59:20 +0900, Michael Paquier wrote:
> Attached is a patch that fixes the issue for me in master and 9.5.
> Actually in the last patch I forgot a call to smgrwrite to ensure that
> the INIT_FORKNUM is correctly synced to disk when those pages are
> replayed at recovery, letting the reset routines for unlogged
> relations do their job correctly. I have noticed as well that we need
> to do the same for gin and brin relations. In this case I think that
> we could limit the flush to unlogged relations, my patch does it
> unconditionally though to generalize the logic. Thoughts?

I think it's a good idea to limit this to unlogged relations. For a
dataset that fits into shared_buffers and creates many relations, the
additional disk writes could be noticeable.

> diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
> index 99337b0..d7964ac 100644
> --- a/src/backend/access/brin/brin.c
> +++ b/src/backend/access/brin/brin.c
> @@ -682,7 +682,12 @@ brinbuildempty(PG_FUNCTION_ARGS)
>   brin_metapage_init(BufferGetPage(metabuf), BrinGetPagesPerRange(index),
>  BRIN_CURRENT_VERSION);
>   MarkBufferDirty(metabuf);
> - log_newpage_buffer(metabuf, false);
> + /*
> +  * When this full page image is replayed, there is no guarantee that
> +  * this page will be present to disk when replayed particularly for
> +  * unlogged relations, hence enforce it to be flushed to disk.
> +  */

The grammar seems a bit off here.

> + /*
> +  * Initialize and xlog metabuffer and root buffer. When those full
> +  * pages are replayed, it is not guaranteed that those relation
> +  * init forks will be flushed to disk after replaying them, hence
> +  * enforce those pages to be flushed to disk at replay, only the
> +  * last record will enforce a flush for performance reasons and
> +  * because it is actually unnecessary to do it multiple times.
> +  */

That comment needs some love too. I think it really only makes sense if
you know the current state. There's also some language polishing needed.

> @@ -450,14 +450,21 @@ ginbuildempty(PG_FUNCTION_ARGS)
>   START_CRIT_SECTION();
>   GinInitMetabuffer(MetaBuffer);
>   MarkBufferDirty(MetaBuffer);
> - log_newpage_buffer(MetaBuffer, false);
> + log_newpage_buffer(MetaBuffer, false, false);
>   GinInitBuffer(RootBuffer, GIN_LEAF);
>   MarkBufferDirty(RootBuffer);
> - log_newpage_buffer(RootBuffer, false);
> + log_newpage_buffer(RootBuffer, false, true);
>   END_CRIT_SECTION();
>
Why isn't the metapage flushed to disk?

> --- a/src/backend/access/spgist/spginsert.c
> +++ b/src/backend/access/spgist/spginsert.c
> @@ -173,7 +173,7 @@ spgbuildempty(PG_FUNCTION_ARGS)
> (char *) page, true);
>   if (XLogIsNeeded())
>   log_newpage(>rd_smgr->smgr_rnode.node, INIT_FORKNUM,
> - SPGIST_METAPAGE_BLKNO, page, false);
> + SPGIST_METAPAGE_BLKNO, page, false, 
> false);
>  
>   /* Likewise for the root page. */
>   SpGistInitPage(page, SPGIST_LEAF);
> @@ -183,7 +183,7 @@ spgbuildempty(PG_FUNCTION_ARGS)
> (char *) page, true);
>   if (XLogIsNeeded())
>   log_newpage(>rd_smgr->smgr_rnode.node, INIT_FORKNUM,
> - SPGIST_ROOT_BLKNO, page, true);
> + SPGIST_ROOT_BLKNO, page, true, false);
>

I don't think it's correct to not flush in these cases. Yes, the primary
does an smgrimmedsync() - but that's not done on the standby.

> @@ -9392,9 +9396,28 @@ xlog_redo(XLogReaderState *record)
>* when checksums are enabled. There is no difference in 
> handling
>* XLOG_FPI and XLOG_FPI_FOR_HINT records, they use a different 
> info
>* code just to distinguish them for statistics purposes.
> +  *
> +  * XLOG_FPI_FOR_SYNC records are generated when a relation 
> needs to
> +  * be sync'ed just after replaying a full page. This is 
> important
> +  * to match the master behavior in the case where a page is 
> written
> +  * directly to disk without going through shared buffers, 
> particularly
> +  * when writing init forks for index relations.
>*/

How about

FPI_FOR_SYNC records indicate that the page immediately needs to be
written to disk, not just to shared buffers. This is important if the
on-disk state is to be the authoritative, not the state in shared
buffers. E.g. because on-disk files may later be copied directly.

> diff --git a/src/backend/commands/tablecmds.c 
> b/src/backend/commands/tablecmds.c
> index 0ddde72..eb22a51 100644
> --- a/src/backend/commands/tablecmds.c
> +++ b/src/backend/commands/tablecmds.c
> @@ -9920,7 +9920,8 @@ copy_relation_data(SMgrRelation src, 

Re: [HACKERS] [DOCS] max_worker_processes on the standby

2015-12-03 Thread Alvaro Herrera
Petr Jelinek wrote:

> While this seems good, I'd code it slightly differently. I didn't like the
> addition of new bool when it's not really needed. This brings the question
> if we actually need the BootStrapCommitTs and StartupCommitTs functions
> which really don't do much though.

Thanks, it's certainly nice that this got simpler.  (I'm not in love
with the idea of having xlog.c know what flag needs to pass in each
case, but I don't see any option that's more convenient.)

We weren't quite there however -- namely this patch didn't close problem
#8 in Fujii-san rundown.  The problem is that when promoting,
standbyState is not STANDBY_DISABLED but STANDBY_SNAPSHOT_READY (which
is a bit surprising but not something this patch should fix).  To fix
this I took the StartupCommitTs() call out of that block, so that it
runs inconditionally.

I also changed the hint message:

postgres=# select * from pg_last_committed_xact();
ERROR:  could not get commit timestamp data
HINT:  Make sure the configuration parameter "track_commit_timestamp" is set in 
the master server.

Otherwise this would be very confusing:

postgres=# select * from pg_last_committed_xact();
ERROR:  could not get commit timestamp data
HINT:  Make sure the configuration parameter "track_commit_timestamp" is set.
postgres=# show track_commit_timestamp ;
 track_commit_timestamp 

 on
(1 fila)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] eXtensible Transaction Manager API

2015-12-03 Thread Robert Haas
On Tue, Dec 1, 2015 at 9:19 AM, Bruce Momjian  wrote:
> On Tue, Nov 17, 2015 at 12:48:38PM -0500, Robert Haas wrote:
>> > At this time, the number of round trips needed particularly for READ
>> > COMMITTED transactions that need a new snapshot for each query was
>> > really a performance killer. We used DBT-1 (TPC-W) which is less
>> > OLTP-like than DBT-2 (TPC-C), still with DBT-1 the scalability limit
>> > was quickly reached with 10-20 nodes..
>>
>> Yeah.  I think this merits a good bit of thought.  Superficially, at
>> least, it seems that every time you need a snapshot - which in the
>> case of READ COMMITTED is for every SQL statement - you need a network
>> roundtrip to the snapshot server.  If multiple backends request a
>> snapshot in very quick succession, you might be able to do a sort of
>> "group commit" thing where you send a single request to the server and
>> they all use the resulting snapshot, but it seems hard to get very far
>> with such optimization.  For example, if backend 1 sends a snapshot
>> request and backend 2 then realizes that it also needs a snapshot, it
>> can't just wait for the reply from backend 1 and use that one.  The
>> user might have committed a transaction someplace else and then kicked
>> off a transaction on backend 2 afterward, expecting it to see the work
>> committed earlier.  But the snapshot returned to backend 1 might have
>> been taken before that.  So, all in all, this seems rather crippling.
>>
>> Things are better if the system has a single coordinator node that is
>> also the arbiter of commits and snapshots.  Then, it can always take a
>> snapshot locally with no network roundtrip, and when it reaches out to
>> a shard, it can pass along the snapshot information with the SQL query
>> (or query plan) it has to send anyway.  But then the single
>> coordinator seems like it becomes a bottleneck.  As soon as you have
>> multiple coordinators, one of them has got to be the arbiter of global
>> ordering, and now all of the other coordinators have to talk to it
>> constantly.
>
> I think the performance benefits of having a single coordinator are
> going to require us to implement different snapshot/transaction code
> paths for single coordinator and multi-coordinator cases.  :-(  That is,
> people who can get by with only a single coordinator are going to want
> to do that to avoid the overhead of multiple coordinators, while those
> using multiple coordinators are going to have to live with that
> overhead.
>
> I think an open question is what workloads can use a single coordinator?
> Read-only?  Long queries?  Are those cases also ones where the
> snapshot/transaction overhead is negligible, meaning we don't need the
> single coordinator optimizations?

Well, the cost of the coordinator is basically a per-snapshot cost,
which means in effect a per-transaction cost.  So if your typical
query runtimes are measured in seconds or minutes, it's probably going
to be really hard to swamp the coordinator.  If they're measured
milliseconds or microseconds, it's likely to be a huge problem.  I
think this is why a lot of NoSQL systems have abandoned transactions
as a way of doing business.  The overhead of isolating transactions is
significant even on a single-server system when there are many CPUs
and lots of short-running queries (cf.
0e141c0fbb211bdd23783afa731e3eef95c9ad7a and previous efforts in the
same area) but in a multi-server environment it just gets crushingly
expensive.

I think there are ways to reduce the cost of this.  Some distributed
systems have solved it by retreating from snapshot isolation and going
back to using locks.  This can improve scalability if you've got lots
of transactions but they're very short and rarely touch the same data.
Locking individual data elements (or partitions) doesn't require a
central system that knows about all commits - each individual node can
just release locks for each transaction as it completes.  More
generally, if you could avoid having to make a decision about whether
transaction A precedes or follows transaction B unless transaction A
and B actually touch the same data - an idea we already use for SSI -
then you could get much better scalability.  But I doubt that can be
made to work without a deeper rethink of our transaction system.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Refactor Perl test code

2015-12-03 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> Michael Paquier  writes:
>>> Well, it showed up on my terminal...

>> Not on mine, as per the extract I showed.  Probably a Perl version
>> difference, but I don't think we can exactly write off RHEL6 as an
>> uninteresting obsolete distribution.  (The Perl here is 5.10.1.)

> Hmm, maybe a selinux policy or something like that is preventing some
> system call from working,

Good thought, but overcomplicated.  After some quality time with strace,
I find that the
Undefined subroutine ::run called at /home/postgres/pgsql/src/bin/initdb
/../../../src/test/perl/TestLib.pm line 146.
error message does indeed appear, but it's printed only into
src/bin/initdb/tmp_check/log/regress_log_001_initdb

It appears that the reason this happens is that the child perl process
(of the two that are active in this area of the trace) decides at one
point along the line to redirect its own stdout and stderr into that
file, disconnecting them from the pipes leading to the parent process.
So when that process prints "Undefined subroutine" to its stderr, that
is where it goes.

I suppose that the redirection is a result of this bit in TestLib.pm:

# Hijack STDOUT and STDERR to the log file
open(ORIG_STDOUT, ">");
open(ORIG_STDERR, ">");
open(STDOUT,  ">");
open(STDERR,  ">");

so the question from my end is not so much why it doesn't work for me
as how it could possibly work for you.  Could the newer Perl version
be detecting the undefined-subroutine error before it executes the
file's INIT stanza, rather than after?  Or maybe the "tie" stuff
happening just below this acts differently in more modern Perls?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] atomic reads & writes (with no barriers)

2015-12-03 Thread Kevin Grittner
The "snapshot too old" patch has an XXX comment about being able to
drop a spinlock usage from a frequently called "get" method if the
64-bit read could be trusted to be atomic.  There is no need for a
barrier in this case, because a stale value just means we won't be
quite as aggressive about pruning tuples still within the global
xmin boundary (potentially leading to the "snapshot too old"
error), normally by a small fraction of a second.  Removing
contention on spinlocks is a good thing.  Andres commented here:

http://www.postgresql.org/message-id/20151203154230.ge20...@awork2.anarazel.de

| We currently don't assume it's atomic. And there are platforms,
| e.g 32 bit arm, where that's not the case (c.f.
| https://wiki.postgresql.org/wiki/Atomics ). It'd be rather useful
| to abstract that knowledge into a macro...

I did some web searches and then opened a question on Stack
Overflow.  Not surprisingly I got a mix of useful answers, and
those not quite so much.  Based on the most useful comment, I got
something that isn't too awfully pretty, but it seems to work on my
Ubuntu machine.  Maybe it can be the start of something useful.

Before C11 the C standard specifically disclaimed any atomic
access.  C11 adds what seems to me to be a fairly nice mechanism
for supporting it with the _Atomic modifier on a declaration.
Based on a suggestion from Ian Abbott I added this:

diff --git a/src/include/c.h b/src/include/c.h
index 8163b00..105c542 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -288,6 +288,11 @@ typedef unsigned long long int uint64;
 #error must have a working 64-bit integer datatype
 #endif

+#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 201112L) &&
!defined(__STDC_NO_ATOMICS__)
+#define HAVE_ATOMICINT64
+typedef _Atomic int64 atomicint64;
+#endif
+
 /* Decide if we need to decorate 64-bit constants */
 #ifdef HAVE_LL_CONSTANTS
 #define INT64CONST(x)  ((int64) x##LL)

To use it in my patch, I made these changes to the patched code:

diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 13c58d2..1b891b5 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -81,7 +81,11 @@ typedef struct OldSnapshotControlData
slock_t mutex_current;  /* protect current timestamp */
int64   current_timestamp;  /* latest snapshot timestamp */
slock_t mutex_threshold;/* protect threshold fields */
+#ifdef HAVE_ATOMICINT64
+   atomicint64 threshold_timestamp;/* earlier snapshot is old */
+#else
int64   threshold_timestamp;/* earlier snapshot is old */
+#endif
TransactionId threshold_xid;/* earlier xid may be gone */

/*
@@ -1553,6 +1557,9 @@ GetSnapshotCurrentTimestamp(void)
 int64
 GetOldSnapshotThresholdTimestamp(void)
 {
+#ifdef HAVE_ATOMICINT64
+   return oldSnapshotControl->threshold_timestamp;
+#else
int64   threshold_timestamp;

SpinLockAcquire(>mutex_threshold);
@@ -1560,6 +1567,7 @@ GetOldSnapshotThresholdTimestamp(void)
SpinLockRelease(>mutex_threshold);

return threshold_timestamp;
+#endif
 }

 /*

I'm sure this could be improved upon, and we would want to expand
it to uint64; but it seems to work, and I think it should do no
harm on non-C11 compilers.  (I'm less sure about 32-bit builds, but
if there's a problem there, it should be fixable.)  We may be able
to make use of non-standard features of other (non-C11) compilers,
but I'm not aware of what options there are.

Is the c.h change above on anything resembling the right track for
a patch for this?  If not, what would such a patch look like?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-03 Thread Alvaro Herrera
> >> Here I attached the patch with the suggested changes.
> >> Along with line number, I kept the options column also with authentication
> >> options as a jsonb datatype.
> >>
> >> Example output:
> >>
> >> postgres=# select pg_hba_lookup('test','all','::1');
> >> NOTICE:  Skipped 84 Hba line, because of non matching IP.
> >> NOTICE:  Skipped 86 Hba line, because of non matching database.
> >> NOTICE:  Skipped 87 Hba line, because of non matching role.
> >>  pg_hba_lookup
> >> ---
> >>  (89,trust,{})
> >> (1 row)
> >>
> >> comments?

I don't like this interface.  It's nice for psql, but everybody else is
going to lose.  I think these should be reported in the SRF result set
as well; perhaps add a "mode" column that says "skipped" for such rows,
and "matched" for the one that, uh, matches.  (Please try calling your
function with "select * from" which should give nicer output.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] broken tests

2015-12-03 Thread Pavel Raiskup
On Thursday 03 of December 2015 20:49:09 Pavel Stehule wrote:
> 2015-12-03 12:39 GMT+01:00 Pavel Raiskup :
>
> > On Wednesday 02 of December 2015 20:26:56 Pavel Stehule wrote:
> > > 2015-12-02 20:08 GMT+01:00 Alvaro Herrera :
> > >
> > > > Pavel Stehule wrote:
> > > > > Hi
> > > > >
> > > > > Today I have problem with regress tests on my laptop.
> > > >
> > > > Maybe this is because of the libxml version?
> > >
> > > 100%, same issue is with 9.4.5
> > >
> > > After downgrade to 2.9.2 (from 2.9.3) this issue was out
> >
> > Agreed.
> >
> > > So it is looking like Fedora fault
> >
> > Not really.  This has been changed in libxml2 upstream:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1286692
>
> ok
>
> I would not be offensive

Sorry if that looked like this :(.  I just wanted to point out that this
is not "just" about Fedora --> but either libxml2 (upstream) needs to be
fixed or PostgreSQL's testsuite.

Pavel



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-03 Thread Pavel Stehule
2015-12-03 5:53 GMT+01:00 Pavel Stehule :

>
>
> 2015-12-03 5:00 GMT+01:00 Haribabu Kommi :
>
>> On Wed, Nov 25, 2015 at 7:18 PM, Pavel Stehule 
>> wrote:
>> >
>> >
>> > 2015-11-25 8:05 GMT+01:00 Haribabu Kommi :
>> >>
>> >>
>> >> Thanks. Here I attached the poc patch that returns authentication
>> method
>> >> of the
>> >> first matched hba entry in pg_hba.conf with the given input values.
>> >> Currently these
>> >> functions returns text type. Based on the details required to be
>> >> printed, it can
>> >> be changed.
>> >>
>> >> postgres=# select pg_hba_lookup('all', 'all');
>> >>  pg_hba_lookup
>> >> ---
>> >>  trust
>> >> (1 row)
>> >>
>> >> comments for the approach?
>> >
>> >
>> > From my perspective, it shows too less informations.
>> >
>> > What I am expecting:
>> >
>> > 1. line num of choosed rule
>> > 2. some tracing - via NOTICE, what and why some rules was skipped.
>>
>> Here I attached the patch with the suggested changes.
>> Along with line number, I kept the options column also with authentication
>> options as a jsonb datatype.
>>
>> Example output:
>>
>> postgres=# select pg_hba_lookup('test','all','::1');
>> NOTICE:  Skipped 84 Hba line, because of non matching IP.
>> NOTICE:  Skipped 86 Hba line, because of non matching database.
>> NOTICE:  Skipped 87 Hba line, because of non matching role.
>>  pg_hba_lookup
>> ---
>>  (89,trust,{})
>> (1 row)
>>
>> comments?
>>
>
> I liked it
>
> The text of notice can be reduced "Skipped xx line, ..." - it have to be
> pg_hba
>

this tracing can be implemented to main pg_hba processing. When you are
connect from some specific client - and you can see, why you cannot to
connect to Postgres

Pavel


>
> Pavel
>
>
>>
>> Regards,
>> Hari Babu
>> Fujitsu Australia
>>
>
>


Re: [HACKERS] Error with index on unlogged table

2015-12-03 Thread Andres Freund
Hi,

On 2015-11-20 16:11:15 +0900, Michael Paquier wrote:
> diff --git a/src/backend/access/transam/xlog.c 
> b/src/backend/access/transam/xlog.c
> index cc845d2..4883697 100644
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
> @@ -9503,6 +9503,14 @@ xlog_redo(XLogRecPtr lsn, XLogRecord *record)
>   data += sizeof(BkpBlock);
>  
>   RestoreBackupBlockContents(lsn, bkpb, data, false, false);
> +
> + if (bkpb.fork == INIT_FORKNUM)
> + {
> + SMgrRelation srel;
> + srel = smgropen(bkpb.node, InvalidBackendId);
> + smgrimmedsync(srel, INIT_FORKNUM);
> + smgrclose(srel);
> + }
>   }
>   else if (info == XLOG_BACKUP_END)
>   {

A smgrwrite() instead of a smgrimmedsync() should be sufficient here.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Passing initially_valid values instead of !skip_validation to StoreRelCheck() in AddRelationNewConstraints()

2015-12-03 Thread Amit Langote
On 2015/12/03 20:44, amul sul wrote:
>> On Thursday, 3 December 2015 4:36 PM, Amit Langote 
>>  wrote:
>> Especially from a readability standpoint, I think using skip_validation may 
>> be more apt. 
>> Why - the corresponding parameter of StoreRelCheck() dictates what's stored 
>> in pg_constraint.convalidated.
> 
> Why not? won't initially_valid flag serve same purpose ?

Yes it could, but IMO, it wouldn't be a readability improvement. As I
said, you could think of the variable/argument as delivering whether the
clause is marked NOT VALID or not. Also, see below.

> 
>> So, if skip_validation is 'true' because user specified the constraint NOT 
>> VALID,
>> StoreRelCheck() will store the constraint with convalidated as 'false'
> 
> I guess thats was added before initially_valid flag. As I said, in normal 
> case gram.y take care of skip_validation & initially_valid values, if one is 
> 'true' other will be 'false'. 
> 
>> The user will have to separately validate the constraint by issuing a ALTER 
>> TABLE VALIDATE CONSTRAINT
>> command at a time of their choosing.
> 
> 
> This could be time consuming operation for big table, If I am pretty much 
> sure that my constraint will be valid, simply I could set both 
> flag(initially_valid & skip_validation) to true.

There is currently no support for adding a constraint after-the-fact (that
is, using ALTER TABLE) and marking it valid  without actually verifying it
by scanning the table. As Marko points out that would be actually a new
SQL-level feature that requires much more than changing that line.

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-12-03 Thread Pavel Stehule
2015-12-03 16:57 GMT+01:00 Catalin Iacob :

> On Thu, Dec 3, 2015 at 7:58 AM, Pavel Stehule 
> wrote:
> > I am sorry, I don't understand. Now due inheritance plpy.Fatal and
> > plpy.SPIError has availability to use keyword parameters.
>
> Indeed, I didn't realize this, but I don't think it changes the main
> argument.
>
> What I think should happen:
>
> 1. Error should take keyword arguments
> 2. same for Fatal
>

understand


> 3. Fatal should not be changed to inherit from Error - it should stay
> like it is now, just another exception class
> You can argue a Fatal error is an Error but these classes already
> exist and are independent, changing their relationship is backward
> incompatible.
>

Don't understand - if Fatal has same behave as Error, then why it cannot be
inherited from Error?

What can be broken?



> 4. SPIError should not be changed at all
>  It's for errors raised by query execution not user PL/Python code
> so doing raise SPIError in PL/Python doesn't make sense.
>  And again, changing the inheritance relationship of these
> existing classes changes meaning of existing code that catches the
> exceptions.
>

can be


> 5. all the reporting functions: plpy.debug...plpy.fatal functions
> should also be changed to allow more arguments than the message and
> allow them as keyword arguments
>

this is maybe bigger source of broken compatibility

lot of people uses plpy.debug(var1, var2, var3)

rich constructor use $1 for message, $2 for detail, $3 for hint. So it was
a reason, why didn't touch these functions


>  They are Python wrappers for ereport and this would make them
> similar in capabilities to the PL/pgSQL RAISE
>  This will make plpy.error(...) stay equivalent in capabilities
> with raise plpy.Error(...), same for fatal and Fatal
> 6. the docs should move to the "Utility Functions" section
>  There, it's already described how to raise errors either via the
> exceptions or the utility functions.
>
> I think the above doesn't break anything, doesn't confuse user
> exceptions with backend SPIError exceptions, enhances error reporting
> features for the PL/Python user to bring them up to par with ereport
> and PL/pgSQL RAISE and solve your initial use case at the top of the
> thread (but with slightly different spelling to match what already
> exists in PL/Python):
>
> "We cannot to raise PostgreSQL exception with setting all possible
> fields. I propose new function
>
> plpy.ereport(level, [ message [, detail [, hint [, sqlstate, ... )"
>

I am not against to plpy.ereport function - it can live together with rich
exceptions class, and users can use what they prefer.

It is few lines more


>
> Is what I mean more clear now? Do you (and others) agree?
>

not too much  :)

Regards

Pavel


Re: [HACKERS] Rework the way multixact truncations work

2015-12-03 Thread Andres Freund
On 2015-12-03 04:38:45 -0500, Noah Misch wrote:
> On Wed, Dec 02, 2015 at 04:46:26PM +0100, Andres Freund wrote:
> > Especially if reverting and redoing includes conflicts that mainly
> > increases the chance of accidental bugs.
> 
> True.  (That doesn't apply to these patches.)

Uh, it does. You had conflicts in your process, and it's hard to verify
that the re-applied patch is actually functionally identical given the
volume of changes. It's much easier to see what actually changes by
looking at iterative commits forward from the current state.


Sorry, but I really just want to see these changes as iterative patches
ontop of 9.5/HEAD instead of this process. I won't revert the reversion
if you push it anyway, but I think it's a rather bad idea.


> > > git remote add community git://git.postgresql.org/git/postgresql.git
> > > git remote add nmisch_github https://github.com/nmisch/postgresql.git
> > > git fetch --multiple community nmisch_github
> > > git diff community/master...nmisch_github/mxt4-rm-legacy
> > 
> > That's a nearly useless diff, because it includes a lot of other changes
> > (218 files changed, 2828 insertions(+), 8742 deletions(-)) made since
> > you published the changes.
> 
> Perhaps you used "git diff a..b", not "git diff a...b".

Ah yes. Neat, didn't know that one.

> > > @@ -2190,7 +2194,8 @@ MultiXactSetNextMXact(MultiXactId nextMulti,
> > > - /*
> > > -  * Computing the actual limits is only possible once the data directory 
> > > is
> > > -  * in a consistent state. There's no need to compute the limits while
> > > -  * still replaying WAL - no decisions about new multis are made even
> > > -  * though multixact creations might be replayed. So we'll only do 
> > > further
> > > -  * checks after TrimMultiXact() has been called.
> > > -  */
> > > + /* Before the TrimMultiXact() call at end of recovery, skip the rest. */
> > >   if (!MultiXactState->finishedStartup)
> > >   return;
> > > -
> > >   Assert(!InRecovery);
> > > 
> > > - /* Set limits for offset vacuum. */
> > > + /*
> > > +  * Setting MultiXactState->oldestOffset entails a find_multixact_start()
> > > +  * call, which is only possible once the data directory is in a 
> > > consistent
> > > +  * state. There's no need for an offset limit while still replaying WAL;
> > > +  * no decisions about new multis are made even though multixact 
> > > creations
> > > +  * might be replayed.
> > > +  */
> > >   needs_offset_vacuum = SetOffsetVacuumLimit();
> > 
> > I don't really see the benefit of this change. The previous location of
> > the comment is where we return early, so it seems appropriate to
> > document the reason there?
> 
> I made that low-importance change for two reasons.  First, returning at that
> point skips more than just the setting a limit; it also skips autovacuum
> signalling and wraparound warnings.  Second, the function has just computed
> mxid "actual limits", so branch mxt2-cosmetic made the comment specify that we
> defer an offset limit, not any and all limits.

My question was more about the comment being after the "early return"
than about the content change, should have made that clearer. Can we
just move your comment up?

> > >  static bool
> > >  SetOffsetVacuumLimit(void)
> > >  {
> > > - MultiXactId oldestMultiXactId;
> > > + MultiXactId oldestMultiXactId;
> > >   MultiXactId nextMXact;
> > > - MultiXactOffset oldestOffset = 0;   /* placate compiler */
> > > - MultiXactOffset prevOldestOffset;
> > > + MultiXactOffset oldestOffset = 0;   /* placate compiler */
> > >   MultiXactOffset nextOffset;
> > >   boololdestOffsetKnown = false;
> > > + MultiXactOffset prevOldestOffset;
> > >   boolprevOldestOffsetKnown;
> > > - MultiXactOffset offsetStopLimit = 0;
> > 
> > I don't see the benefit of the order changes here.
> 
> I reacted the same way.  Commit 4f627f8 reordered some declarations, which I
> reverted when I refinished that commit as branch mxt3-main.

But the other changes are there, and in the history anyway. As the new
order isn't more meaningful than the current one...

> > > - if (oldestOffsetKnown)
> > > - ereport(DEBUG1,
> > > - (errmsg("oldest MultiXactId member is 
> > > at offset %u",
> > > - oldestOffset)));
> > 
> > That's imo a rather useful debug message.
> 
> The branches emit that message at the same times 4f627f8^ and earlier emit it.

During testing I found it rather helpful if it was emitted regularly.

> > >   LWLockRelease(MultiXactTruncationLock);
> > > 
> > >   /*
> > > -  * If we can, compute limits (and install them MultiXactState) to 
> > > prevent
> > > -  * overrun of old data in the members SLRU area. We can only do so if 
> > > the
> > > -  * oldest offset is known though.
> > > +  * There's no need to update anything if we don't know the oldest offset
> > > +  * or if it hasn't changed.
> > >*/
> > 
> > 

Re: [HACKERS] eXtensible Transaction Manager API

2015-12-03 Thread Craig Ringer
On 4 December 2015 at 06:41, Robert Haas  wrote:

>
> I think there are ways to reduce the cost of this.  Some distributed
> systems have solved it by retreating from snapshot isolation and going
> back to using locks.  This can improve scalability if you've got lots
> of transactions but they're very short and rarely touch the same data.
> Locking individual data elements (or partitions) doesn't require a
> central system that knows about all commits - each individual node can
> just release locks for each transaction as it completes.  More
> generally, if you could avoid having to make a decision about whether
> transaction A precedes or follows transaction B unless transaction A
> and B actually touch the same data - an idea we already use for SSI -
> then you could get much better scalability.  But I doubt that can be
> made to work without a deeper rethink of our transaction system.
>

Something I've seen thrown about is the idea of lazy snapshots. Using
SSI-like facilities you keep track of transaction dependencies and what's
changed since the last snapshot. A transaction can use an old snapshot if
it doesn't touch anything changed since the snapshot was taken. If it does
you acquire a new snapshot (or switch to a newer existing one) and can swap
it in safely, since you know they're the same for all data the xact has
touched so far.

I suspect that just replaces one expensive problem with one or more
different expensive problems and is something that'd help a few workloads
but hurt others. It's come up when people have asked why we take a new
snapshot for every transaction but I haven't seen it explored much.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Remaining 9.5 open items

2015-12-03 Thread Noah Misch
On Tue, Dec 01, 2015 at 11:05:47AM -0500, Robert Haas wrote:
> On Mon, Nov 30, 2015 at 2:43 PM, Tom Lane  wrote:
> > * Foreign join pushdown vs EvalPlanQual
> >
> > Is this fixed by 5fc4c26db?  If not, what remains to do?
> 
> Unfortunately, no.  That commit allows FDWs to do proper EPQ handling
> for plain table scans, but it proves to be inadequate for EPQ handling
> for joins. Solving that problem will require another patch, and,
> modulo a bunch of cosmetic issues, I'm reasonably happy with KaiGai
> Kohei's latest submission.  I'll respond in more detail on that
> thread, but the question I want to raise here is: do we want to
> back-patch those changes to 9.5 at this late date?

Yes.  If 9.5 added a bad interface, better to fix the interface even now than
to live with the bad one.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Remaining 9.5 open items

2015-12-03 Thread Peter Geoghegan
On Wed, Dec 2, 2015 at 5:27 AM, Robert Haas  wrote:
>> These are mainly just documentation improvements which I'm working on,
>> though the docs were recently updated and I need to incorporate Peter's
>> changes which I wasn't exactly anticipating.
>
> So, when do you foresee this documentation stuff getting taken care
> of?  We kinda need to do a release here.  Is this really a blocker?

If it's a blocker, I can write the documentation myself. Let's just
fix it, rather than discussing whether or not it needs to be a blocker
-- it will take less time.

Stephen?

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers