Re: [COMMITTERS] pgsql: Add a test for transition table usage in FOR EACH ROW trigger.

2017-05-16 Thread Kevin Grittner
On Tue, May 16, 2017 at 5:18 PM, Kevin Grittner <kgri...@gmail.com> wrote:

> I swear I did a `make check-world` before committing!

Then I spotted a typo in a comment, fixed the comment, and didn't
re-run.  Sorry.

-- 
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add a test for transition table usage in FOR EACH ROW trigger.

2017-05-16 Thread Kevin Grittner
On Tue, May 16, 2017 at 4:32 PM, Andres Freund <and...@anarazel.de> wrote:
> On 2017-05-16 21:12:08 +0000, Kevin Grittner wrote:
>> Add a test for transition table usage in FOR EACH ROW trigger.
>
> The buildfarm does not like this one:
> https://buildfarm.postgresql.org/cgi-bin/show_failures.pl?max_days=3==Submit

Yikes!  Reverted.  Will see what went wrong.

I swear I did a `make check-world` before committing!

-- 
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Revert "Add a test for transition table usage in FOR EACH ROW tr

2017-05-16 Thread Kevin Grittner
Revert "Add a test for transition table usage in FOR EACH ROW trigger."

This reverts commit 4a03f935b3438de27ee00d9e562ffe4e225978a9.

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/a19ea9c6601bfb06dfd9f4c1060550dbc3f7bde1

Modified Files
--
src/test/regress/expected/sanity_check.out |  1 -
src/test/regress/expected/triggers.out | 27 ---
src/test/regress/sql/triggers.sql  | 25 -
3 files changed, 53 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Add a test for transition table usage in FOR EACH ROW trigger.

2017-05-16 Thread Kevin Grittner
Add a test for transition table usage in FOR EACH ROW trigger.

Branch
--
master

Details
---
https://git.postgresql.org/pg/commitdiff/4a03f935b3438de27ee00d9e562ffe4e225978a9

Modified Files
--
src/test/regress/expected/sanity_check.out |  1 +
src/test/regress/expected/triggers.out | 27 +++
src/test/regress/sql/triggers.sql  | 25 +
3 files changed, 53 insertions(+)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add infrastructure to support EphemeralNamedRelation references.

2017-04-06 Thread Kevin Grittner
On Thu, Apr 6, 2017 at 5:20 PM, Kevin Grittner <kgri...@gmail.com> wrote:

> I'll commit this fix first so I don't hold up Andres or break any
> picky buildfarm critters

Done.

> and then see whether I can't manage to get
> the tests to cover this code.

The function in question is only called from rewrite, and here's the
relevant comment:

 * About JOINs and dropped columns: although the parser never includes an
 * already-dropped column in a JOIN RTE's alias var list, it is possible for
 * such a list in a stored rule to include references to dropped columns.
 * (If the column is not explicitly referenced anywhere else in the query,
 * the dependency mechanism won't consider it used by the rule and so won't
 * prevent the column drop.)  To support get_rte_attribute_is_dropped(), we
 * replace join alias vars that reference dropped columns with null pointers.

So, to test this I guess I need to create a view that does SELECT *
on a table, write a plpgsql trigger function and use it as an AFTER
EACH STATEMENT trigger on that table -- referencing the view and
explicitly using a specific column from the transition table in a
join qual, modify the table so the trigger gets fired and the
function gets cached, ALTER the table to drop the column so
referenced without doing anything that might cause the function plan
to be discarded from cache, and then modify the table again to fire
the cached trigger.  Does that seem like the right test to add?  Or
would even that fail to reach this code because the transition table
is not on the view?

Oh well, I guess I'll write the code and find out -- seems easier
than reverse-engineering that code path.

--
Kevin Grittner


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Add isolation test for SERIALIZABLE READ ONLY DEFERRABLE.

2017-04-05 Thread Kevin Grittner
Add isolation test for SERIALIZABLE READ ONLY DEFERRABLE.

This improves code coverage and lays a foundation for testing
similar issues in a distributed environment.

Author: Thomas Munro 
Reviewed-by: Michael Paquier 

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/4deb413813f619b3e859abf435b61efc09cafe09

Modified Files
--
.../isolation/expected/read-only-anomaly-2.out | 44 ++
.../isolation/expected/read-only-anomaly-3.out | 26 +
src/test/isolation/expected/read-only-anomaly.out  | 25 
src/test/isolation/isolation_schedule  |  3 ++
src/test/isolation/isolationtester.c   |  8 
src/test/isolation/specs/read-only-anomaly-2.spec  | 42 +
src/test/isolation/specs/read-only-anomaly-3.spec  | 39 +++
src/test/isolation/specs/read-only-anomaly.spec| 38 +++
8 files changed, 225 insertions(+)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Follow-on cleanup for the transition table patch.

2017-04-04 Thread Kevin Grittner
Follow-on cleanup for the transition table patch.

Commit 59702716 added transition table support to PL/pgsql so that
SQL queries in trigger functions could access those transient
tables.  In order to provide the same level of support for PL/perl,
PL/python and PL/tcl, refactor the relevant code into a new
function SPI_register_trigger_data.  Call the new function in the
trigger handler of all four PLs, and document it as a public SPI
function so that authors of out-of-tree PLs can do the same.

Also get rid of a second QueryEnvironment object that was
maintained by PL/pgsql.  That was previously used to deal with
cursors, but the same approach wasn't appropriate for PLs that are
less tangled up with core code.  Instead, have SPI_cursor_open
install the connection's current QueryEnvironment, as already
happens for SPI_execute_plan.

While in the docs, remove the note that transition tables were only
supported in C and PL/pgSQL triggers, and correct some ommissions.

Thomas Munro with some work by Kevin Grittner (mostly docs)

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/5ebeb579b9b281dba5f8415b2fbda86fdae7b366

Modified Files
--
doc/src/sgml/ref/create_trigger.sgml  |  10 +--
doc/src/sgml/spi.sgml | 125 ++
doc/src/sgml/trigger.sgml |  54 +--
src/backend/executor/spi.c|  52 +++
src/include/executor/spi.h|   3 +
src/pl/plperl/expected/plperl_trigger.out |  29 ++
src/pl/plperl/plperl.c|   7 ++
src/pl/plperl/sql/plperl_trigger.sql  |  32 +++
src/pl/plpgsql/src/pl_exec.c  |  49 +-
src/pl/plpgsql/src/plpgsql.h  |   4 -
src/pl/plpython/expected/plpython_trigger.out |  21 +
src/pl/plpython/plpy_exec.c   |   5 ++
src/pl/plpython/sql/plpython_trigger.sql  |  24 +
src/pl/tcl/expected/pltcl_queries.out |  21 +
src/pl/tcl/pltcl.c|   5 ++
src/pl/tcl/sql/pltcl_queries.sql  |  20 +
16 files changed, 398 insertions(+), 63 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Fix two undocumented parameters to functions from ENR patch.

2017-04-01 Thread Kevin Grittner
Fix two undocumented parameters to functions from ENR patch.

On ProcessUtility document the parameter, to match others.

On CreateCachedPlan drop the queryEnv parameter.  It was not
referenced within the function, and had been added on the
assumption that with some unknown future usage of QueryEnvironment
it might be useful to do something there.  We have avoided other
"just in case" implementation of unused paramters, so drop it here.

Per gripe from Tom Lane

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/41bd155dd656e7f17c02855be7aff234843347cd

Modified Files
--
src/backend/commands/prepare.c  | 2 +-
src/backend/executor/spi.c  | 3 +--
src/backend/tcop/postgres.c | 6 ++
src/backend/tcop/utility.c  | 2 ++
src/backend/utils/cache/plancache.c | 3 +--
src/include/utils/plancache.h   | 3 +--
6 files changed, 8 insertions(+), 11 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add infrastructure to support EphemeralNamedRelation references.

2017-04-01 Thread Kevin Grittner
On Sat, Apr 1, 2017 at 12:21 AM, Kevin Grittner <kgri...@gmail.com> wrote:
> On Sat, Apr 1, 2017 at 12:01 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>
>> rhinoceros says you missed contrib/sepgsql.
>
> Yeah, I saw that and have pushed an attempt to fix.

That blind fix seemed to work.

>> (And while I'm bitching, you definitely failed to update ProcessUtility's
>> header comment, which like most significant functions takes some pains
>> to describe all the arguments.)
>
> Will fix.

I pushed a fix for that and then went looking to see what else I
missed.  I found CreateCachedPlan, but then saw that the parameter
was never referenced within the function.  That seems odd.  I want
to go over that carefully, but I'm too tired now.  Will investigate
tomorrow.

--
Kevin Grittner


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add infrastructure to support EphemeralNamedRelation references.

2017-03-31 Thread Kevin Grittner
On Sat, Apr 1, 2017 at 12:01 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> rhinoceros says you missed contrib/sepgsql.

Yeah, I saw that and have pushed an attempt to fix.

> More generally, if you hack the API of some globally-referenced function,
> you ought to grep for references to it rather than just assume your
> compiler will find them all for you.  For one thing, that approach is a
> great way to fail to update relevant comments.

That's what I normally do, and thought that I had here, but clearly
I messed up.  The log shows that even in 2014, when this was
written, those calls were there.

> (And while I'm bitching, you definitely failed to update ProcessUtility's
> header comment, which like most significant functions takes some pains
> to describe all the arguments.)

Will fix.

--
Kevin Grittner


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Try to fix breakage of sepgsql hooks by ENR patch.

2017-03-31 Thread Kevin Grittner
Try to fix breakage of sepgsql hooks by ENR patch.

Turned up by buildfarm animal rhinoceros.  Fixing blind.  Will have
to wait for next run by rhinoceros to know whether it worked.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/01fd6f8f2d15a9369768921d6fc95ac481779430

Modified Files
--
contrib/sepgsql/hooks.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Add transition table support to plpgsql.

2017-03-31 Thread Kevin Grittner
Add transition table support to plpgsql.

Kevin Grittner and Thomas Munro
Reviewed by Heikki Linnakangas, David Fetter, and Thomas Munro
with valuable comments and suggestions from many others

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/59702716324ab9c07b02fb005dcf14c7f48c4632

Modified Files
--
doc/src/sgml/ref/create_trigger.sgml   |   5 +
src/pl/plpgsql/src/pl_comp.c   |  13 +-
src/pl/plpgsql/src/pl_exec.c   |  47 ++
src/pl/plpgsql/src/plpgsql.h   |  14 +-
src/test/regress/expected/plpgsql.out  | 287 +
src/test/regress/expected/triggers.out |  24 +++
src/test/regress/sql/plpgsql.sql   | 283 
src/test/regress/sql/triggers.sql  |  23 +++
8 files changed, 685 insertions(+), 11 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Add infrastructure to support EphemeralNamedRelation references.

2017-03-31 Thread Kevin Grittner
Add infrastructure to support EphemeralNamedRelation references.

A QueryEnvironment concept is added, which allows new types of
objects to be passed into queries from parsing on through
execution.  At this point, the only thing implemented is a
collection of EphemeralNamedRelation objects -- relations which
can be referenced by name in queries, but do not exist in the
catalogs.  The only type of ENR implemented is NamedTuplestore, but
provision is made to add more types fairly easily.

An ENR can carry its own TupleDesc or reference a relation in the
catalogs by relid.

Although these features can be used without SPI, convenience
functions are added to SPI so that ENRs can easily be used by code
run through SPI.

The initial use of all this is going to be transition tables in
AFTER triggers, but that will be added to each PL as a separate
commit.

An incidental effect of this patch is to produce a more informative
error message if an attempt is made to modify the contents of a CTE
from a referencing DML statement.  No tests previously covered that
possibility, so one is added.

Kevin Grittner and Thomas Munro
Reviewed by Heikki Linnakangas, David Fetter, and Thomas Munro
with valuable comments and suggestions from many others

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/18ce3a4ab22d2984f8540ab480979c851dae5338

Modified Files
--
contrib/pg_stat_statements/pg_stat_statements.c |  15 +-
doc/src/sgml/spi.sgml   | 204 
src/backend/catalog/pg_proc.c   |   3 +-
src/backend/commands/copy.c |   5 +-
src/backend/commands/createas.c |   5 +-
src/backend/commands/explain.c  |  40 +++--
src/backend/commands/extension.c|   6 +-
src/backend/commands/foreigncmds.c  |   2 +-
src/backend/commands/matview.c  |   2 +-
src/backend/commands/prepare.c  |  17 +-
src/backend/commands/schemacmds.c   |   1 +
src/backend/commands/trigger.c  |   9 +-
src/backend/commands/view.c |   2 +-
src/backend/executor/Makefile   |   3 +-
src/backend/executor/execAmi.c  |   6 +
src/backend/executor/execMain.c |   5 +
src/backend/executor/execParallel.c |   2 +-
src/backend/executor/execProcnode.c |  14 ++
src/backend/executor/execUtils.c|   2 +
src/backend/executor/functions.c|   8 +-
src/backend/executor/nodeNamedtuplestorescan.c  | 198 +++
src/backend/executor/spi.c  | 116 --
src/backend/nodes/copyfuncs.c   |  25 +++
src/backend/nodes/nodeFuncs.c   |   2 +
src/backend/nodes/outfuncs.c|  20 +++
src/backend/nodes/print.c   |   4 +
src/backend/nodes/readfuncs.c   |   7 +
src/backend/optimizer/path/allpaths.c   |  45 ++
src/backend/optimizer/path/costsize.c   |  70 
src/backend/optimizer/plan/createplan.c |  71 +
src/backend/optimizer/plan/setrefs.c|  16 ++
src/backend/optimizer/plan/subselect.c  |   4 +
src/backend/optimizer/prep/prepjointree.c   |   2 +
src/backend/optimizer/util/clauses.c|   2 +-
src/backend/optimizer/util/pathnode.c   |  26 +++
src/backend/optimizer/util/plancat.c|   7 +-
src/backend/optimizer/util/relnode.c|   1 +
src/backend/parser/Makefile |   5 +-
src/backend/parser/analyze.c|  14 +-
src/backend/parser/parse_clause.c   |  59 +--
src/backend/parser/parse_enr.c  |  29 
src/backend/parser/parse_node.c |   2 +
src/backend/parser/parse_relation.c | 143 -
src/backend/parser/parse_target.c   |   2 +
src/backend/tcop/postgres.c |  22 ++-
src/backend/tcop/pquery.c   |  10 +-
src/backend/tcop/utility.c  |  34 ++--
src/backend/utils/adt/ruleutils.c   |   1 +
src/backend/utils/cache/plancache.c |  34 ++--
src/backend/utils/misc/Makefile |   4 +-
src/backend/utils/misc/queryenvironment.c   | 144 +
src/backend/utils/sort/tuplestore.c |  17 ++
src/include/catalog/catversion.h|   2 +-
src/include/commands/createas.h |   3 +-
src/include/commands/explain.h  |   9 +-
src/include/commands/prepare.h  |   4 +-
src/include/executor/execdesc.h |   2 +
src/include/executor/nodeNamedtuplestorescan.h  |  24 +++
src/include/executor/spi.h  |   7 +
src/include/executor/spi_priv.h |   2 +
src/include/nodes/execnodes.h   |  21 +++
src/include/nodes

[COMMITTERS] pgsql: Back-patch fcff8a575198478023ada8a48e13b50f70054766 as a bug fix

2016-12-13 Thread Kevin Grittner
Back-patch fcff8a575198478023ada8a48e13b50f70054766 as a bug fix.

When there is both a serialization failure and a unique violation,
throw the former rather than the latter.  When initially pushed,
this was viewed as a feature to assist application framework
developers, so that they could more accurately determine when to
retry a failed transaction, but a test case presented by Ian
Jackson has shown that this patch can prevent serialization
anomalies in some cases where a unique violation is caught within a
subtransaction, the work of that subtransaction is discarded, and
no error is thrown.  That makes this a bug fix, so it is being
back-patched to all supported branches where it is not already
present (i.e., 9.2 to 9.5).

Discussion: 
https://postgr.es/m/1481307991-16971-1-git-send-email-ian.jack...@eu.citrix.com
Discussion: 
https://postgr.es/m/22607.56276.807567.924...@mariner.uk.xensource.com

Branch
--
REL9_5_STABLE

Details
---
http://git.postgresql.org/pg/commitdiff/bed2a0b06ba54266a4e66affbc8f08e5eea6e8bc

Modified Files
--
doc/src/sgml/mvcc.sgml | 35 
src/backend/access/nbtree/nbtinsert.c  |  8 
.../isolation/expected/read-write-unique-2.out | 29 +
.../isolation/expected/read-write-unique-3.out | 12 ++
.../isolation/expected/read-write-unique-4.out | 41 ++
src/test/isolation/expected/read-write-unique.out  | 29 +
src/test/isolation/isolation_schedule  |  4 ++
src/test/isolation/specs/read-write-unique-2.spec  | 36 
src/test/isolation/specs/read-write-unique-3.spec  | 33 +++
src/test/isolation/specs/read-write-unique-4.spec  | 48 ++
src/test/isolation/specs/read-write-unique.spec| 39 ++
11 files changed, 307 insertions(+), 7 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Back-patch fcff8a575198478023ada8a48e13b50f70054766 as a bug fix

2016-12-13 Thread Kevin Grittner
Back-patch fcff8a575198478023ada8a48e13b50f70054766 as a bug fix.

When there is both a serialization failure and a unique violation,
throw the former rather than the latter.  When initially pushed,
this was viewed as a feature to assist application framework
developers, so that they could more accurately determine when to
retry a failed transaction, but a test case presented by Ian
Jackson has shown that this patch can prevent serialization
anomalies in some cases where a unique violation is caught within a
subtransaction, the work of that subtransaction is discarded, and
no error is thrown.  That makes this a bug fix, so it is being
back-patched to all supported branches where it is not already
present (i.e., 9.2 to 9.5).

Discussion: 
https://postgr.es/m/1481307991-16971-1-git-send-email-ian.jack...@eu.citrix.com
Discussion: 
https://postgr.es/m/22607.56276.807567.924...@mariner.uk.xensource.com

Branch
--
REL9_2_STABLE

Details
---
http://git.postgresql.org/pg/commitdiff/60314e28eb02c82eb658aaf3c7fcf253004acbb4

Modified Files
--
doc/src/sgml/mvcc.sgml | 35 
src/backend/access/nbtree/nbtinsert.c  |  8 
.../isolation/expected/read-write-unique-2.out | 29 +
.../isolation/expected/read-write-unique-3.out | 12 ++
.../isolation/expected/read-write-unique-4.out | 41 ++
src/test/isolation/expected/read-write-unique.out  | 29 +
src/test/isolation/isolation_schedule  |  4 ++
src/test/isolation/specs/read-write-unique-2.spec  | 36 
src/test/isolation/specs/read-write-unique-3.spec  | 33 +++
src/test/isolation/specs/read-write-unique-4.spec  | 48 ++
src/test/isolation/specs/read-write-unique.spec| 39 ++
11 files changed, 307 insertions(+), 7 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Back-patch fcff8a575198478023ada8a48e13b50f70054766 as a bug fix

2016-12-13 Thread Kevin Grittner
Back-patch fcff8a575198478023ada8a48e13b50f70054766 as a bug fix.

When there is both a serialization failure and a unique violation,
throw the former rather than the latter.  When initially pushed,
this was viewed as a feature to assist application framework
developers, so that they could more accurately determine when to
retry a failed transaction, but a test case presented by Ian
Jackson has shown that this patch can prevent serialization
anomalies in some cases where a unique violation is caught within a
subtransaction, the work of that subtransaction is discarded, and
no error is thrown.  That makes this a bug fix, so it is being
back-patched to all supported branches where it is not already
present (i.e., 9.2 to 9.5).

Discussion: 
https://postgr.es/m/1481307991-16971-1-git-send-email-ian.jack...@eu.citrix.com
Discussion: 
https://postgr.es/m/22607.56276.807567.924...@mariner.uk.xensource.com

Branch
--
REL9_4_STABLE

Details
---
http://git.postgresql.org/pg/commitdiff/4b9d466c14083003bd80e1ce02e617b2b92df7fe

Modified Files
--
doc/src/sgml/mvcc.sgml | 35 
src/backend/access/nbtree/nbtinsert.c  |  8 
.../isolation/expected/read-write-unique-2.out | 29 +
.../isolation/expected/read-write-unique-3.out | 12 ++
.../isolation/expected/read-write-unique-4.out | 41 ++
src/test/isolation/expected/read-write-unique.out  | 29 +
src/test/isolation/isolation_schedule  |  4 ++
src/test/isolation/specs/read-write-unique-2.spec  | 36 
src/test/isolation/specs/read-write-unique-3.spec  | 33 +++
src/test/isolation/specs/read-write-unique-4.spec  | 48 ++
src/test/isolation/specs/read-write-unique.spec| 39 ++
11 files changed, 307 insertions(+), 7 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Back-patch fcff8a575198478023ada8a48e13b50f70054766 as a bug fix

2016-12-13 Thread Kevin Grittner
Back-patch fcff8a575198478023ada8a48e13b50f70054766 as a bug fix.

When there is both a serialization failure and a unique violation,
throw the former rather than the latter.  When initially pushed,
this was viewed as a feature to assist application framework
developers, so that they could more accurately determine when to
retry a failed transaction, but a test case presented by Ian
Jackson has shown that this patch can prevent serialization
anomalies in some cases where a unique violation is caught within a
subtransaction, the work of that subtransaction is discarded, and
no error is thrown.  That makes this a bug fix, so it is being
back-patched to all supported branches where it is not already
present (i.e., 9.2 to 9.5).

Discussion: 
https://postgr.es/m/1481307991-16971-1-git-send-email-ian.jack...@eu.citrix.com
Discussion: 
https://postgr.es/m/22607.56276.807567.924...@mariner.uk.xensource.com

Branch
--
REL9_3_STABLE

Details
---
http://git.postgresql.org/pg/commitdiff/5d80171adace192c6d4308ae4b22074d6b0809f3

Modified Files
--
doc/src/sgml/mvcc.sgml | 35 
src/backend/access/nbtree/nbtinsert.c  |  8 
.../isolation/expected/read-write-unique-2.out | 29 +
.../isolation/expected/read-write-unique-3.out | 12 ++
.../isolation/expected/read-write-unique-4.out | 41 ++
src/test/isolation/expected/read-write-unique.out  | 29 +
src/test/isolation/isolation_schedule  |  4 ++
src/test/isolation/specs/read-write-unique-2.spec  | 36 
src/test/isolation/specs/read-write-unique-3.spec  | 33 +++
src/test/isolation/specs/read-write-unique-4.spec  | 48 ++
src/test/isolation/specs/read-write-unique.spec| 39 ++
11 files changed, 307 insertions(+), 7 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Improve tab completion for CREATE TRIGGER.

2016-11-04 Thread Kevin Grittner
Improve tab completion for CREATE TRIGGER.

This includes support for the new REFERENCING clause.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/927d7bb6b120a2ca09a164898f887eb850b7a329

Modified Files
--
src/bin/psql/tab-complete.c | 34 ++
1 file changed, 34 insertions(+)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Implement syntax for transition tables in AFTER triggers.

2016-11-04 Thread Kevin Grittner
Implement syntax for transition tables in AFTER triggers.

This is infrastructure for the complete SQL standard feature.  No
support is included at this point for execution nodes or PLs.  The
intent is to add that soon.

As this patch leaves things, standard syntax can create tuplestores
to contain old and/or new versions of rows affected by a statement.
References to these tuplestores are in the TriggerData structure.
C triggers can access the tuplestores directly, so they are usable,
but they cannot yet be referenced within a SQL statement.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/8c48375e5f43ebd832f93c9166d1fe0e639ff806

Modified Files
--
doc/src/sgml/catalogs.sgml   |  16 ++
doc/src/sgml/ref/create_trigger.sgml |  94 --
src/backend/commands/tablecmds.c |   5 +-
src/backend/commands/trigger.c   | 327 +--
src/backend/nodes/copyfuncs.c|  16 ++
src/backend/nodes/equalfuncs.c   |  14 ++
src/backend/nodes/outfuncs.c |  13 ++
src/backend/parser/gram.y|  70 +++-
src/backend/utils/adt/ruleutils.c|  23 +++
src/include/catalog/catversion.h |   2 +-
src/include/catalog/pg_trigger.h |  13 +-
src/include/commands/trigger.h   |   2 +
src/include/nodes/nodes.h|   1 +
src/include/nodes/parsenodes.h   |  17 ++
src/include/parser/kwlist.h  |   3 +
src/include/utils/reltrigger.h   |   7 +
16 files changed, 580 insertions(+), 43 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Fix recent commit for tab-completion of database template.

2016-09-12 Thread Kevin Grittner
Fix recent commit for tab-completion of database template.

The details of commit 52803098ab26051c0c9802f3c19edffc90de8843 were
based on a misunderstanding of the role inheritance allowing use
of a database for a template.  While the CREATEDB privilege is not
inherited, the database ownership is privileges are.

Pointed out by Vitaly Burovoy and Tom Lane.
Fix provided by Tom Lane, reviewed by Vitaly Burovoy.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/63c1a871940c7c4798788e98fdb1a24408a49d05

Modified Files
--
src/bin/psql/tab-complete.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: psql tab completion for CREATE DATABASE ... TEMPLATE ...

2016-09-11 Thread Kevin Grittner
psql tab completion for CREATE DATABASE ... TEMPLATE ...

Sehrope Sarkuni, reviewed by Merlin Moncure & Vitaly Burovoy
with some editing by me

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/52803098ab26051c0c9802f3c19edffc90de8843

Modified Files
--
src/bin/psql/tab-complete.c | 7 +--
1 file changed, 5 insertions(+), 2 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Improve tab completion for BEGIN & START|SET TRANSACTION.

2016-09-01 Thread Kevin Grittner
Improve tab completion for BEGIN & START|SET TRANSACTION.

Andreas Karlsson with minor change by me for SET TRANSACTION
SNAPSHOT.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/76f9dd4fa82270899f7b56b002b5d34226dc99d8

Modified Files
--
src/bin/psql/tab-complete.c | 35 +++
1 file changed, 27 insertions(+), 8 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Remove unnecessary #include.

2016-08-24 Thread Kevin Grittner
Remove unnecessary #include.

Accidentally added in 8b65cf4c5edabdcae45ceaef7b9ac236879aae50.

Pointed out by Álvaro Herrera

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/5cd3864075622b203d530f1a710818777859304e

Modified Files
--
src/include/storage/bufmgr.h | 1 -
1 file changed, 1 deletion(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Remove unnecessary #include.

2016-08-24 Thread Kevin Grittner
Remove unnecessary #include.

Accidentally added in 8b65cf4c5edabdcae45ceaef7b9ac236879aae50.

Pointed out by Álvaro Herrera

Branch
--
REL9_6_STABLE

Details
---
http://git.postgresql.org/pg/commitdiff/eaae83c123f5e8e103abbbe822fe73b791d9d5c9

Modified Files
--
src/include/storage/bufmgr.h | 1 -
1 file changed, 1 deletion(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Add OldSnapshotTimeMapLock to wait_event table in docs.

2016-08-03 Thread Kevin Grittner
Add OldSnapshotTimeMapLock to wait_event table in docs.

Ashutosh Sharma with minor fixes by me.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/c93d8737be47667091b36f5852fd706146514008

Modified Files
--
doc/src/sgml/monitoring.sgml | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Add comment & docs about no vacuum truncation with sto.

2016-07-19 Thread Kevin Grittner
Add comment & docs about no vacuum truncation with sto.

Omission noted by Andres Freund.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/1c15aac53f3421fd38ae145118d3204f055ba955

Modified Files
--
doc/src/sgml/config.sgml  | 9 +
src/backend/commands/vacuumlazy.c | 9 +
2 files changed, 18 insertions(+)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-16 Thread Kevin Grittner
On Thu, Jun 16, 2016 at 1:32 PM, Andres Freund <and...@anarazel.de> wrote:

> With old_snapshot_threshold=1 I indeed can reproduce the issue. I
> disabled autovacuum, to make the scheduling more predictable. But it
> should "work" just as well with autovacuum.
>
> S1: CREATE TABLE toasted(largecol text);
> INSERT INTO toasted SELECT string_agg(random()::text, '-') FROM 
> generate_series(1, 1000);
> BEGIN;
> DELETE FROM toasted;
> S2: BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
> S2: SELECT hashtext(largecol), length(largecol) FROM toasted;
>> ...
> S1: COMMIT;
> S2: SELECT hashtext(largecol), length(largecol) FROM toasted;
>> ...
> S1: /* wait for snapshot threshold to be passed */
> S1: VACUUM pg_toast.pg_toast_16437;
>> INFO:  0: "pg_toast_16437": found 61942 removable, 0 nonremovable row 
>> versions in 15486 out of 15486 pages
>> DETAIL:  0 dead row versions cannot be removed yet.
> S2: SELECT hashtext(largecol), length(largecol) FROM toasted;
> ERROR:  XX000: missing chunk number 0 for toast value 16455 in pg_toast_16437
> LOCATION:  toast_fetch_datum, tuptoaster.c:1945

Thanks!  That's something I should be able to work with.
Unfortunately, I am going to be on vacation, so I won't have any
results until sometime after 28 June.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-16 Thread Kevin Grittner
On Thu, Jun 16, 2016 at 1:19 PM, Kevin Grittner <kgri...@gmail.com> wrote:
> On Thu, Jun 16, 2016 at 11:54 AM, Andres Freund <and...@anarazel.de> wrote:
>> On 2016-06-16 12:43:34 -0400, Robert Haas wrote:

>>> Maybe it would help if you lay out the whole sequence of events, like:
>>>
>>> S1: Does this.
>>> S2: Does that.
>>> S1: Now does something else.
>>
>> I presume it'd be something like:
>>
>> Assuming a 'toasted' table, which contains one row, with a 1GB field.
>>
>> S1: BEGIN REPEATABLE READ;
>> S1: SELECT SUM(length(one_gb_record)) FROM toasted;
>> S2: DELETE FROM toasted;
>> AUTOVAC: vacuum toasted's toast table, it's large. skip toasted, it's small
>> S1: SELECT SUM(length(one_gb_record)) FROM toasted;
>> 
>
> I'll put together a test like that and post in a bit.

old_snapshot_threshold = '1min'
autovacuum_vacuum_threshold = 0\
autovacuum_vacuum_scale_factor = 0.01

test=# CREATE TABLE gb (rec bytea not null);
CREATE TABLE
test=# ALTER TABLE gb ALTER COLUMN rec SET STORAGE external;
ALTER TABLE
test=# INSERT INTO gb SELECT t FROM (SELECT repeat('x',
10)::bytea) x(t);
INSERT 0 1
test=# BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN
test=# SELECT SUM(length(rec)) FROM gb;
sum
----
 10
(1 row)

[wait for autovacuum to run]

test=# SELECT SUM(length(rec)) FROM gb;
ERROR:  snapshot too old

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-16 Thread Kevin Grittner
On Thu, Jun 16, 2016 at 11:54 AM, Andres Freund <and...@anarazel.de> wrote:
> On 2016-06-16 12:43:34 -0400, Robert Haas wrote:

>> The root of my confusion is: if we prune a tuple, we'll bump the page
>> LSN, so any session that is still referencing that tuple will error
>> out as soon as it touches the page on which that tuple used to exist.
>
> Right. On the main table. But we don't peform that check on the toast
> table/pages. So if we prune toast tuples, which are still referenced by
> (unvacuumed) main relation, we can get into trouble.

I thought that we should never be using visibility information from
the toast table; that the visibility information in the heap should
control.  If that's the case, how would we prune toast rows without
pruning the heap?  You pointed out that the *reverse* case has an
option bit -- if that is ever set there could be toasted values
which would not have a row.  Do they still have a line pointer in
the heap, like "dead" index entries?  How are they cleaned up in
current production versions?  (Note the question mark -- I'm not
big on using that with assertions and rarely fall back on
rhetorical questions.)

>> It won't even survive long enough to care that the tuple isn't there
>> any more.
>>
>> Maybe it would help if you lay out the whole sequence of events, like:
>>
>> S1: Does this.
>> S2: Does that.
>> S1: Now does something else.
>
> I presume it'd be something like:
>
> Assuming a 'toasted' table, which contains one row, with a 1GB field.
>
> S1: BEGIN REPEATABLE READ;
> S1: SELECT SUM(length(one_gb_record)) FROM toasted;
> S2: DELETE FROM toasted;
> AUTOVAC: vacuum toasted's toast table, it's large. skip toasted, it's small
> S1: SELECT SUM(length(one_gb_record)) FROM toasted;
> 

I'll put together a test like that and post in a bit.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-16 Thread Kevin Grittner
On Thu, Jun 16, 2016 at 1:01 PM, Andres Freund <and...@anarazel.de> wrote:

> The relevant part is the HeapTupleSatisfiesMVCC check, we're using
> SatisfiesToast for toast lookups.
>
> FWIW, I just tried to reproduce this with old_snapshot_threshold = 0 -
> but ran into the problem that I couldn't get it to vacuum anything away
> (neither main nor toast rel).   That appears to be
> if (old_snapshot_threshold == 0)
> {
> if (TransactionIdPrecedes(latest_xmin, MyPgXact->xmin)
> && TransactionIdFollows(latest_xmin, xlimit))
> xlimit = latest_xmin;
> because latest_xmin always is equal to MyPgXact->xmin, which is actually
> kinda unsurprising?

Sure -- the STO feature *never* advances the point for early
pruning past the earliest still-active transaction ID.  If it did
we would have all sorts of weird problems.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-16 Thread Kevin Grittner
On Wed, Jun 15, 2016 at 8:57 PM, Andres Freund <and...@anarazel.de> wrote:

> The simplest version of the scenario I'm concerned about is that a tuple
> in a tuple is *not* vacuumed, even though it's elegible to be removed
> due to STO. If that tuple has toasted columns, it could be the that the
> toast table was independently vacuumed (autovac considers main/toast
> tables separately,

If that were really true, why would we not have the problem in
current production versions that the toast table could be vacuumed
before the heap, leading to exactly the issue you are talking
about?  It seems to me that a simple test shows that it is not the
case that the heap is vacuumed without considering toast:

test=# create table tt (c1 text not null);
CREATE TABLE
test=# insert into tt select repeat(md5(n::text),10) from (select
generate_series(1,100)) x(n);
INSERT 0 100
test=# delete from tt;
DELETE 100
test=# vacuum verbose tt;
INFO:  vacuuming "public.tt"
INFO:  "tt": removed 100 row versions in 1 pages
INFO:  "tt": found 100 removable, 0 nonremovable row versions in 1 out
of 1 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  vacuuming "pg_toast.pg_toast_16552"
INFO:  scanned index "pg_toast_16552_index" to remove 1900 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_16552": removed 1900 row versions in 467 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "pg_toast_16552_index" now contains 0 row versions in 8 pages
DETAIL:  1900 index row versions were removed.
5 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_16552": found 1900 removable, 0 nonremovable row
versions in 467 out of 467 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

> or the horizon could change between the two heap scans,

Not a problem in current production why?

> or pins could prevent vacuuming of one page, or ...).

Not a problem in current production why?

So far I am not seeing any way for TOAST tuples to be pruned in
advance of the referencing heap tuples, nor any way for the problem
you describe to happen in the absence of that.  If you see such,
could you provide a more detailed description or a reproducible
test case?

> Toast accesses via tuptoaster.c currently don't perform
> TestForOldSnapshot_impl(), because they use SnapshotToastData, not
> SnapshotMVCC.
>
> That seems to means two things:
>
> 1) You might get 'missing chunk number ...' errors on access to toasted
>columns. Misleading error, but ok.
>
> 2) Because the tuple has been pruned from the toast table, it's possible
>that the toast oid/va_valueid is reused, because now there's no
>conflict with GetNewOidWithIndex() anymore. In that case the
>toast_fetch_datum() might return a tuple from another column & type
>(all columns in a table share the same toast table), which could lead
>to almost arbitrary problems.  That's not super likely to happen, but
>could have quite severe consequences once it starts.
>
> It seems the easiest way to fix this would be to make
> TestForOldSnapshot() "accept" SnapshotToast as well.

I don't think that would be appropriate without testing the
characteristics of the underlying table to see whether it should be
excluded.  But is the TOAST data ever updated without an update to
the referencing heap tuple?  If not, I don't see any benefit.  And
we certainly don't want to add some new way to prune TOAST tuples
which might still have referencing heap tuples; that could provide
a route to *create* the problem you describe.

I am on vacation tomorrow (Friday the 17th) through Monday the
27th, so I will be unable to respond to further issues during that
time.  Hopefully I can get this particular issue sorted out before
I go.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-15 Thread Kevin Grittner
On Wed, Jun 15, 2016 at 5:34 PM, Andres Freund <and...@anarazel.de> wrote:
> On 2016-06-15 16:58:25 -0500, Kevin Grittner wrote:
>> On Wed, Jun 15, 2016 at 3:25 PM, Andres Freund <and...@anarazel.de> wrote:
>>> On 2016-06-15 14:24:58 -0500, Kevin Grittner wrote:
>>>> On Wed, Jun 15, 2016 at 2:20 PM, Andres Freund <and...@anarazel.de> wrote:
>>>>
>>>>> We might fetch a toast tuple which
>>>>> since have been re-purposed for a datum of a different type.
>>>>
>>>> How would that happen?
>>>
>>> Autovac vacuums toast and heap tables independently. Once a toast datum
>>> isn't used anymore, the oid used can be reused (because it doesn't
>>> conflict via GetNewOidWithIndex() anymore. If analyze then detoasts a
>>> datum, which hasn't been removed, the contents of that toast id, might
>>> actually be for something different.
>>
>> What prevents that from happening now, without STO?
>
> Afaics we shouldn't ever look (i.e. detoast) at a "dead for everyone"
> tuple in autovacuum (or anywhere else). There's one minor exception to
> that, and that's enum datums in indexes, which is why we currently have
> weird transactional requirements for them.  I'm not entirely sure this
> can be hit, but it's worth checking.

I'm not clear where you see this as being in any way different with
STO.  Above it seemed that you saw this as an issue related to
ANALYZE.  If there is not early pruning for the table being
analyzed, nothing is at all different.  If there is early pruning
the rows are not seen and there could be no detoasting.  If there
is a function that lies about IMMUTABLE and reads from a table, it
either functions as before or throws a STO error on page access
(long before any detoasting).  Am I missing something?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-15 Thread Kevin Grittner
On Wed, Jun 15, 2016 at 2:40 PM, Alvaro Herrera
<alvhe...@2ndquadrant.com> wrote:
> Kevin Grittner wrote:
>> On Wed, Jun 15, 2016 at 1:59 PM, Alvaro Herrera <alvhe...@2ndquadrant.com> 
>> wrote:
>
>> > We actually go quite some lengths to support this case, even when it's
>> > the opinion of many that we shouldn't.  For example VACUUM doesn't try
>> > to find index entries using the values in each deleted tuple; instead we
>> > remember the TIDs and then scan the indexes (possibly many times) to
>> > find entries that match those TIDs -- which is much slower.  Yet we do
>> > it this way to protect the case that somebody is doing the
>> > not-really-IMMUTABLE function.
>> >
>> > In other words, I don't think we consider the position you argued as
>> > acceptable.
>>
>> What are you saying is unacceptable, and what behavior would be
>> acceptable instead?
>
> The answer "we don't support the situation where you have an index using
> an IMMUTABLE function that isn't actually immutable" is not acceptable.
> The acceptable solution would be a design that doesn't have that
> property as a requisite.
>
> I think having various executor(/heapam) checks that raise errors when
> queries are executed from within ANALYZE is acceptable.

Here is an example of a test case showing that:

-- connection 1
drop table if exists t1;
create table t1 (c1 int not null);
drop table if exists t2;
create table t2 (c1 int not null);
insert into t1 select generate_series(1, 1);
drop function mysq(i int);
create function mysq(i int)
  returns int
  language plpgsql
  immutable
as $mysq$
begin
  return (i * (select c1 from t2));
end
$mysq$;
insert into t2 values (1);
create index t1_c1sq on t1 ((mysq(c1)));
begin transaction isolation level repeatable read;
select 1;

-- connection 2
vacuum analyze verbose t1;
delete from t1 where c1 between 1000 and 1999;
delete from t1 where c1 = 8000;
update t2 set c1 = 1;

-- connection 1
analyze verbose t1;  -- when run after threshold, STO error occurs

The tail end of that, running the analyze once immediately and once
after the threshold is:

test=# -- connection 1
test=# analyze verbose t1;  -- when run after threshold, STO error occurs
INFO:  analyzing "public.t1"
INFO:  "t1": scanned 45 of 45 pages, containing 8999 live rows and
1001 dead rows; 8999 rows in sample, 8999 estimated total rows
ANALYZE
test=# -- connection 1
analyze verbose t1;  -- when run after threshold, STO error occurs
INFO:  analyzing "public.t1"
INFO:  "t1": scanned 45 of 45 pages, containing 8999 live rows and
1001 dead rows; 8999 rows in sample, 8999 estimated total rows
ERROR:  snapshot too old
CONTEXT:  SQL statement "SELECT (i * (select c1 from t2))"
PL/pgSQL function mysq(integer) line 3 at RETURN

Is there some other behavior which would be preferred?

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-15 Thread Kevin Grittner
On Wed, Jun 15, 2016 at 3:25 PM, Andres Freund <and...@anarazel.de> wrote:
> On 2016-06-15 14:24:58 -0500, Kevin Grittner wrote:
>> On Wed, Jun 15, 2016 at 2:20 PM, Andres Freund <and...@anarazel.de> wrote:
>>
>> > We might fetch a toast tuple which
>> > since have been re-purposed for a datum of a different type.
>>
>> How would that happen?
>
> Autovac vacuums toast and heap tables independently. Once a toast datum
> isn't used anymore, the oid used can be reused (because it doesn't
> conflict via GetNewOidWithIndex() anymore. If analyze then detoasts a
> datum, which hasn't been removed, the contents of that toast id, might
> actually be for something different.

What prevents that from happening now, without STO?

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-15 Thread Kevin Grittner
On Wed, Jun 15, 2016 at 2:20 PM, Andres Freund <and...@anarazel.de> wrote:

> We might fetch a toast tuple which
> since have been re-purposed for a datum of a different type.

How would that happen?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-15 Thread Kevin Grittner
On Wed, Jun 15, 2016 at 1:59 PM, Alvaro Herrera
<alvhe...@2ndquadrant.com> wrote:
> Kevin Grittner wrote:
>> On Wed, Jun 15, 2016 at 1:50 PM, Robert Haas <robertmh...@gmail.com> wrote:
>>
>>> The expression index case is the one to worry about; if there is a
>>> problem, that's where it is.  What bothers me is that a function used
>>> in an expression index could do anything at all - it can read any
>>> table in the database.
>>
>> It *can*, but then you are lying to the database when you call it
>> IMMUTABLE.  Such an index can easily become corrupted through
>> normal DML.  Without DML the ANALYZE has no problem.  So you seem
>> to be concerned that if someone is lying to the database engine to
>> force it accept a function as IMMUTABLE when it actually isn't, and
>> then updating the referenced rows (which is very likely to render
>> the index corrupted), that statistics might also become stale.
>
> We actually go quite some lengths to support this case, even when it's
> the opinion of many that we shouldn't.  For example VACUUM doesn't try
> to find index entries using the values in each deleted tuple; instead we
> remember the TIDs and then scan the indexes (possibly many times) to
> find entries that match those TIDs -- which is much slower.  Yet we do
> it this way to protect the case that somebody is doing the
> not-really-IMMUTABLE function.
>
> In other words, I don't think we consider the position you argued as
> acceptable.

What are you saying is unacceptable, and what behavior would be
acceptable instead?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-15 Thread Kevin Grittner
On Wed, Jun 15, 2016 at 1:54 PM, Alvaro Herrera
<alvhe...@2ndquadrant.com> wrote:
> Robert Haas wrote:

>> The expression index case is the one to worry about; if there is a
>> problem, that's where it is.  What bothers me is that a function used
>> in an expression index could do anything at all - it can read any
>> table in the database.
>
> Hmm, but if it does that, then the code that actually implements that
> query would run the STO checks, right?  The analyze code doesn't need
> to.

Right.  In the described case, you would get a "snapshot too old"
error inside the expression which is trying to generate the index
value.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-15 Thread Kevin Grittner
On Wed, Jun 15, 2016 at 1:50 PM, Robert Haas <robertmh...@gmail.com> wrote:

> The expression index case is the one to worry about; if there is a
> problem, that's where it is.  What bothers me is that a function used
> in an expression index could do anything at all - it can read any
> table in the database.

It *can*, but then you are lying to the database when you call it
IMMUTABLE.  Such an index can easily become corrupted through
normal DML.  Without DML the ANALYZE has no problem.  So you seem
to be concerned that if someone is lying to the database engine to
force it accept a function as IMMUTABLE when it actually isn't, and
then updating the referenced rows (which is very likely to render
the index corrupted), that statistics might also become stale.

They might.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-15 Thread Kevin Grittner
On Wed, Jun 15, 2016 at 1:45 PM, Alvaro Herrera
<alvhe...@2ndquadrant.com> wrote:

> Maybe it is possible to get into trouble if you're taking a sample for
> an expression index.

Maybe -- if you are using a function for an index expression which
does an explicit SELECT against some database table rather than
only using values from the row itself.  In such a case you would
have had to mark a function as IMMUTABLE which depends on table
contents.  I say you get to keep both pieces.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-15 Thread Kevin Grittner
On Wed, Jun 15, 2016 at 1:29 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Jun 15, 2016 at 1:44 PM, Kevin Grittner <kgri...@gmail.com>
> wrote:>> So what happens in this scenario:
>>> 1. ANALYZE runs really slowly - maybe the user-defined function it's
>>> running for the expression index is extremely long-running.
>>> 2. Eventually, the snapshot for ANALYZE is older than the configured
>>> value of snapshot_too_old.
>>> 3. Then, ANALYZE selects a page with an LSN new enough that it might
>>> have been pruned.
>>>
>>> Presumably, the ANALYZE ought to error out in this scenario, just as
>>> it would in any other situation where an old snapshot sees a new page.
>>> No?
>>
>> The test I showed creates a situation which (to ANALYZE) is
>> identical to what you describe -- ANALYZE sees a page with an LSN
>> recent enough that it could have been (and actually has been)
>> pruned.  Why would it be better for the ANALYZE to fail than to
>> complete?
>
> As I understand it, the reason we need to sometimes give "ERROR:
> snapshot too old" after early pruning is because we might otherwise
> give the wrong answer.
>
> Maybe I'm confused.

In the scenario that you describe, ANALYZE is coming up with
statistics to use in query planning, and the numbers are not
expected to always be 100% accurate.  I can think of conditions
which might prevail when seeing the recent LSN.

(1)  The recent LSN is from a cause having nothing to do with the
STO feature, like DML.  As things stand, the behavior is the same
as without the patch -- the rows are counted just the same as
always.  If we did as you suggest, we instead would abort ANALYZE
and have stale statistics.

(2)  The recent LSN is related to STO pruning.  The dead rows are
gone forever, and will not be counted.  This seems more correct
than counting them, even if it were possible, and also superior to
aborting the ANALYZE and leaving stale statistics.

Of course, a subset of (1) is the case that the rows can be
early-pruned at the next opportunity.  In such a case ANALYZE is
still counting them according to the rules that we had before the
snapshot too old feature.  If someone wants to argue that those
rules are wrong, that seems like material for a separate patch.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-15 Thread Kevin Grittner
On Wed, Jun 15, 2016 at 10:46 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Sat, Jun 11, 2016 at 11:29 AM, Kevin Grittner <kgri...@gmail.com> wrote:
>> I have reviewed the code and run tests to try to find something
>> here which could be considered a bug, without finding any problem.
>> When reading pages for the random sample for ANALYZE (or
>> auto-analyze) there is not an age check; so ANALYZE completes
>> without error, keeping statistics up-to-date.
>>
>> There really is no difference in behavior except in the case that:
>>
>> (1)  old_snapshot_threshold >= 0 to enable the "snapshot too old"
>>feature, and
>> (2)  there were tuples that were dead as the result of completed
>>transactions, and
>> (3)  those tuples became older than the threshold, and
>> (4)  those tuples were pruned or vacuumed away, and
>> (5)  an ANALYZE process would have read those dead tuples had they
>>not been removed.
>>
>> In such a case the irrevocably dead, permanently removed tuples are
>> not counted in the statistics.  I have trouble seeing a better
>> outcome than that.  Among my tests, I specifically checked for an
>> ANALYZE of a table having an index on an expression, using an old
>> snapshot:
>>
>> -- connection 1
>> drop table if exists t1;
>> create table t1 (c1 int not null);
>> drop table if exists t2;
>> create table t2 (c1 int not null);
>> insert into t1 select generate_series(1, 1);
>> drop function mysq(i int);
>> create function mysq(i int)
>>   returns int
>>   language plpgsql
>>   immutable
>> as $mysq$
>> begin
>>   return (i * i);
>> end
>> $mysq$;
>> create index t1_c1sq on t1 ((mysq(c1)));
>> begin transaction isolation level repeatable read;
>> select 1;
>>
>> -- connection 2
>> vacuum analyze verbose t1;
>> delete from t1 where c1 between 1000 and 1999;
>> delete from t1 where c1 = 8000;
>> insert into t2 values (1);
>> select pg_sleep_for('2min');
>> vacuum verbose t1;  -- repeat if necessary to see the dead rows
>> disappear
>>
>> -- connection 1
>> analyze verbose t1;
>>
>> This runs to completion, as I would want and expect.
>>
>> I am closing this item on the "PostgreSQL 9.6 Open Items" page.  If
>> anyone feels that I've missed something, please provide a test to
>> show the problem, or a clear description of the problem and how you
>> feel behavior should be different.
>
> So what happens in this scenario:
>
> 1. ANALYZE runs really slowly - maybe the user-defined function it's
> running for the expression index is extremely long-running.
> 2. Eventually, the snapshot for ANALYZE is older than the configured
> value of snapshot_too_old.
> 3. Then, ANALYZE selects a page with an LSN new enough that it might
> have been pruned.
>
> Presumably, the ANALYZE ought to error out in this scenario, just as
> it would in any other situation where an old snapshot sees a new page.
> No?

The test I showed creates a situation which (to ANALYZE) is
identical to what you describe -- ANALYZE sees a page with an LSN
recent enough that it could have been (and actually has been)
pruned.  Why would it be better for the ANALYZE to fail than to
complete?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-11 Thread Kevin Grittner
On Tue, May 24, 2016 at 4:10 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Tue, May 24, 2016 at 3:48 PM, Kevin Grittner <kgri...@gmail.com> wrote:
>> On Tue, May 24, 2016 at 12:00 PM, Andres Freund <and...@anarazel.de> wrote:
>>> On 2016-05-24 11:24:44 -0500, Kevin Grittner wrote:
>>>> On Fri, May 6, 2016 at 8:28 PM, Kevin Grittner <kgri...@gmail.com> wrote:
>>>>> On Fri, May 6, 2016 at 7:48 PM, Andres Freund <and...@anarazel.de> wrote:
>>>>
>>>>>> That comment reminds me of a question I had: Did you consider the effect
>>>>>> of this patch on analyze? It uses a snapshot, and by memory you've not
>>>>>> built in a defense against analyze being cancelled.
>>
>> The primary defense is not considering a cancellation except in 30
>> of the 500 places a page is used.  None of those 30 are, as far as
>> I can see (upon review in response to your question), used in the
>> analyze process.
>
> It's not obvious to me how this is supposed to work.  If ANALYZE's
> snapshot is subject to being ignored for xmin purposes because of
> snapshot_too_old, then I would think it would need to consider
> cancelling itself if it reads a page with possibly-removed data, just
> like in any other case.  It seems that we might not actually need a
> snapshot set for ANALYZE in all cases, because the comments say:
>
> /* functions in indexes may want a snapshot set */
> PushActiveSnapshot(GetTransactionSnapshot());
>
> If we can get away with it, it would be a rather large win to only set
> a snapshot when the table has an expression index.  For purposes of
> "snapshot too old", though, it will be important that a function in an
> index which tries to read data from some other table which has been
> pruned cancels itself when necessary.

I have reviewed the code and run tests to try to find something
here which could be considered a bug, without finding any problem.
When reading pages for the random sample for ANALYZE (or
auto-analyze) there is not an age check; so ANALYZE completes
without error, keeping statistics up-to-date.

There really is no difference in behavior except in the case that:

(1)  old_snapshot_threshold >= 0 to enable the "snapshot too old"
   feature, and
(2)  there were tuples that were dead as the result of completed
   transactions, and
(3)  those tuples became older than the threshold, and
(4)  those tuples were pruned or vacuumed away, and
(5)  an ANALYZE process would have read those dead tuples had they
   not been removed.

In such a case the irrevocably dead, permanently removed tuples are
not counted in the statistics.  I have trouble seeing a better
outcome than that.  Among my tests, I specifically checked for an
ANALYZE of a table having an index on an expression, using an old
snapshot:

-- connection 1
drop table if exists t1;
create table t1 (c1 int not null);
drop table if exists t2;
create table t2 (c1 int not null);
insert into t1 select generate_series(1, 1);
drop function mysq(i int);
create function mysq(i int)
  returns int
  language plpgsql
  immutable
as $mysq$
begin
  return (i * i);
end
$mysq$;
create index t1_c1sq on t1 ((mysq(c1)));
begin transaction isolation level repeatable read;
select 1;

-- connection 2
vacuum analyze verbose t1;
delete from t1 where c1 between 1000 and 1999;
delete from t1 where c1 = 8000;
insert into t2 values (1);
select pg_sleep_for('2min');
vacuum verbose t1;  -- repeat if necessary to see the dead rows
disappear

-- connection 1
analyze verbose t1;

This runs to completion, as I would want and expect.

I am closing this item on the "PostgreSQL 9.6 Open Items" page.  If
anyone feels that I've missed something, please provide a test to
show the problem, or a clear description of the problem and how you
feel behavior should be different.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-10 Thread Kevin Grittner
On Fri, Jun 10, 2016 at 10:26 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Fri, Jun 10, 2016 at 10:45 AM, Kevin Grittner <kgri...@gmail.com> wrote:

>> Since vacuum calls the pruning function, and not the other way
>> around, the name you suggest would be technically more correct.
>> Committed using "Pruning" instead of "Vacuum" in both new macros.

> You've still got an early_vacuum_enabled variable name floating around there.

Gah!  Renamed for consistency.

Thanks!

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Rename local variable for consistency.

2016-06-10 Thread Kevin Grittner
Rename local variable for consistency.

Pointed out by Robert Haas.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/13761bccb177022c8c0dabc08f3e9acb491b1c96

Modified Files
--
src/backend/catalog/index.c | 8 
1 file changed, 4 insertions(+), 4 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-10 Thread Kevin Grittner
On Thu, Jun 9, 2016 at 6:16 PM, Robert Haas <robertmh...@gmail.com> wrote:

> So I like the idea of centralizing checks in
> RelationAllowsEarlyVacuum, but shouldn't it really be called
> RelationAllowsEarlyPruning?

Since vacuum calls the pruning function, and not the other way
around, the name you suggest would be technically more correct.
Committed using "Pruning" instead of "Vacuum" in both new macros.

I have closed the CREATE INDEX versus "snapshot too old" issue in
the "PostgreSQL 9.6 Open Items" Wiki page.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Fix interaction between CREATE INDEX and "snapshot too old".

2016-06-10 Thread Kevin Grittner
Fix interaction between CREATE INDEX and "snapshot too old".

Since indexes are created without valid LSNs, an index created
while a snapshot older than old_snapshot_threshold existed could
cause queries to return incorrect results when those old snapshots
were used, if any relevant rows had been subject to early pruning
before the index was built.  Prevent usage of a newly created index
until all such snapshots are released, for relations where this can
happen.

Questions about the interaction of "snapshot too old" with index
creation were initially raised by Andres Freund.

Reviewed by Robert Haas.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/bf9a60ee3349a2f2dc5fe6d571a8d39cfc634371

Modified Files
--
src/backend/catalog/index.c | 28 +---
src/backend/storage/buffer/bufmgr.c |  3 +--
src/backend/utils/time/snapmgr.c|  5 +
src/include/utils/snapmgr.h | 13 +
4 files changed, 36 insertions(+), 13 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-09 Thread Kevin Grittner
[Thanks to Robert to stepping up to keep this moving while I was
down yesterday with a minor injury.  I'm back on it today.]

On Wed, Jun 8, 2016 at 3:11 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Jun 8, 2016 at 4:04 PM, Kevin Grittner <kgri...@gmail.com> wrote:

>> -- connection 1
>> drop table if exists t1;
>> create table t1 (c1 int not null);
>> insert into t1 select generate_series(1, 100);
>> begin transaction isolation level repeatable read;
>> select 1;
>>
>> -- connection 2
>> insert into t2 values (1);
>> delete from t1 where c1 between 20 and 29;
>> delete from t1 where c1 = 100;
>> vacuum analyze verbose t1;
>> select pg_sleep_for('2min');
>> vacuum analyze verbose t1;  -- repeat if needed until dead rows vacuumed
>>
>> -- connection 1
>> select c1 from t1 where c1 = 100;
>> select c1 from t1 where c1 = 25;
>>
>> The problem occurs when an index is built while an old snapshot
>> exists which can't see the effects of early pruning/vacuuming.  The
>> fix prevents use of such an index until all snapshots early enough
>> to have a problem have been released.
>
> This example doesn't seem to involve any CREATE INDEX or REINDEX
> operations, so I'm a little confused.

Sorry; pasto -- the index build is supposed to be on connection 2
right after the dead rows are confirmed vacuumed away:

create index t1_c1 on t1 (c1);

> Generally, I think I see the hazard you're concerned about: I had
> failed to realize, as you mentioned upthread, that new index pages
> would have an LSN of 0. So if a tuple is pruned early and then the
> index is reindexed, old snapshots won't realize that data is missing.
> What I'm less certain about is whether you can actually get by with
> reusing ii_BrokenHotChain to handle this case.

v2 and later does not do that.  v1 did, but that was a more blunt
instrument.

>  For example, note this comment:
>
>  * However, when reindexing an existing index, we should do nothing here.
>  * Any HOT chains that are broken with respect to the index must predate
>  * the index's original creation, so there is no need to change the
>  * index's usability horizon.  Moreover, we *must not* try to change the
>  * index's pg_index entry while reindexing pg_index itself, and this
>  * optimization nicely prevents that.
>
> This logic doesn't apply to the old snapshot case; there, you'd need
> to distrust the index whether it was an initial build or a REINDEX,
> but it doesn't look like that's what the patch does.  I'm worried
> there could be other places where we rely on ii_BrokenHotChain to
> detect only one specific condition that isn't quite the same as what
> you're trying to use it for here.

Well spotted.  I had used a lot of discreet calls to get that
reindex logic right, but it was verbose and ugly, so I had just
added the new macros in this patch and started to implement them
before I knocked off for the day.  At handover I was too distracted
to remember where I was with it.  :-(  See if it looks right to you
now.

Attached is v3.  I will commit this patch to resolve this issue
tomorrow, barring any objections before then.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 31a1438..449a665 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2040,18 +2040,24 @@ index_build(Relation heapRelation,
 	/*
 	 * If we found any potentially broken HOT chains, mark the index as not
 	 * being usable until the current transaction is below the event horizon.
-	 * See src/backend/access/heap/README.HOT for discussion.
+	 * See src/backend/access/heap/README.HOT for discussion.  Also set this
+	 * if early pruning/vacuuming is enabled for the heap relation.  While it
+	 * might become safe to use the index earlier based on actual cleanup
+	 * activity and other active transactions, the test for that would be much
+	 * more complex and would require some form of blocking, so keep it simple
+	 * and fast by just using the current transaction.
 	 *
 	 * However, when reindexing an existing index, we should do nothing here.
 	 * Any HOT chains that are broken with respect to the index must predate
 	 * the index's original creation, so there is no need to change the
 	 * index's usability horizon.  Moreover, we *must not* try to change the
 	 * index's pg_index entry while reindexing pg_index itself, and this
-	 * optimization nicely prevents that.
+	 * optimization nicely prevents that.  The more complex rules needed for a
+	 * reindex are handled separately after this function returns.
 	 *
 	 * We also need not set indcheckxmin during a concurrent index build,
 	 * because we won't set i

Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-08 Thread Kevin Grittner
On Wed, Jun 8, 2016 at 2:49 PM, Robert Haas <robertmh...@gmail.com> wrote:

> Do you have a test case that demonstrates a problem, or an explanation
> of why you think there is one?

With old_snapshot_threshold = '1min'

-- connection 1
drop table if exists t1;
create table t1 (c1 int not null);
insert into t1 select generate_series(1, 100);
begin transaction isolation level repeatable read;
select 1;

-- connection 2
insert into t2 values (1);
delete from t1 where c1 between 20 and 29;
delete from t1 where c1 = 100;
vacuum analyze verbose t1;
select pg_sleep_for('2min');
vacuum analyze verbose t1;  -- repeat if needed until dead rows vacuumed

-- connection 1
select c1 from t1 where c1 = 100;
select c1 from t1 where c1 = 25;

The problem occurs when an index is built while an old snapshot
exists which can't see the effects of early pruning/vacuuming.  The
fix prevents use of such an index until all snapshots early enough
to have a problem have been released.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-08 Thread Kevin Grittner
On Tue, Jun 7, 2016 at 10:40 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Sat, Jun 4, 2016 at 4:21 PM, Kevin Grittner <kgri...@gmail.com> wrote:

>> the minimal patch to fix behavior in this area would be:
>>
>> diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
>> index 31a1438..6c379da 100644
>> --- a/src/backend/catalog/index.c
>> +++ b/src/backend/catalog/index.c
>> @@ -2268,6 +2268,9 @@ IndexBuildHeapRangeScan(Relation heapRelation,
>> Assert(numblocks == InvalidBlockNumber);
>> }
>>
>> +   if (old_snapshot_threshold >= 0)
>> +   indexInfo->ii_BrokenHotChain = true;
>> +
>> reltuples = 0;
>>
>> /*
>>
>> Of course, ii_BrokenHotChain should be renamed to something like
>> ii_UnsafeForOldSnapshots, and some comments need to be updated; but
>> the above is the substance of it.
>
> I don't know why we'd want to rename it like that...

If we made the above change, the old name would be misleading, but
I've thought better of that and attach a slightly different
approach (tested but not yet with comment adjustments); attached.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 31a1438..945f55c 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2060,7 +2060,8 @@ index_build(Relation heapRelation,
 	 * about any concurrent readers of the tuple; no other transaction can see
 	 * it yet.
 	 */
-	if (indexInfo->ii_BrokenHotChain && !isreindex &&
+	if ((indexInfo->ii_BrokenHotChain || EarlyVacuumEnabled(indexRelation)) &&
+		!isreindex &&
 		!indexInfo->ii_Concurrent)
 	{
 		Oid			indexId = RelationGetRelid(indexRelation);
@@ -3409,9 +3410,10 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
 	 !indexForm->indisready ||
 	 !indexForm->indislive);
 		if (index_bad ||
-			(indexForm->indcheckxmin && !indexInfo->ii_BrokenHotChain))
+			(indexForm->indcheckxmin &&
+			 !(indexInfo->ii_BrokenHotChain || EarlyVacuumEnabled(iRel
 		{
-			if (!indexInfo->ii_BrokenHotChain)
+			if (!(indexInfo->ii_BrokenHotChain || EarlyVacuumEnabled(iRel)))
 indexForm->indcheckxmin = false;
 			else if (index_bad)
 indexForm->indcheckxmin = true;
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 8a830d4..4e50b19 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -4290,8 +4290,7 @@ IssuePendingWritebacks(WritebackContext *context)
 void
 TestForOldSnapshot_impl(Snapshot snapshot, Relation relation)
 {
-	if (!IsCatalogRelation(relation)
-	 && !RelationIsAccessibleInLogicalDecoding(relation)
+	if (RelationAllowsEarlyVacuum(relation)
 	 && (snapshot)->whenTaken < GetOldSnapshotThresholdTimestamp())
 		ereport(ERROR,
 (errcode(ERRCODE_SNAPSHOT_TOO_OLD),
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index f8a2a83..2eabd1c 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -1597,10 +1597,7 @@ TransactionIdLimitedForOldSnapshots(TransactionId recentXmin,
 {
 	if (TransactionIdIsNormal(recentXmin)
 		&& old_snapshot_threshold >= 0
-		&& RelationNeedsWAL(relation)
-		&& !IsCatalogRelation(relation)
-		&& !RelationIsAccessibleInLogicalDecoding(relation)
-		&& !RelationHasUnloggedIndex(relation))
+		&& RelationAllowsEarlyVacuum(relation))
 	{
 		int64		ts = GetSnapshotCurrentTimestamp();
 		TransactionId xlimit = recentXmin;
diff --git a/src/include/utils/snapmgr.h b/src/include/utils/snapmgr.h
index 4270696..effdb0c 100644
--- a/src/include/utils/snapmgr.h
+++ b/src/include/utils/snapmgr.h
@@ -31,6 +31,19 @@
 #define OLD_SNAPSHOT_PADDING_ENTRIES 10
 #define OLD_SNAPSHOT_TIME_MAP_ENTRIES (old_snapshot_threshold + OLD_SNAPSHOT_PADDING_ENTRIES)
 
+/*
+ * Common definition of relation properties that allow early pruning/vacuuming
+ * when old_snapshot_threshold >= 0.
+ */
+#define RelationAllowsEarlyVacuum(rel) \
+( \
+	 RelationNeedsWAL(rel) \
+  && !IsCatalogRelation(rel) \
+  && !RelationIsAccessibleInLogicalDecoding(rel) \
+  && !RelationHasUnloggedIndex(rel) \
+)
+
+#define EarlyVacuumEnabled(rel) (old_snapshot_threshold >= 0 && RelationAllowsEarlyVacuum(rel))
 
 /* GUC variables */
 extern PGDLLIMPORT int old_snapshot_threshold;

-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-04 Thread Kevin Grittner
On Fri, Jun 3, 2016 at 4:24 PM, Kevin Grittner <kgri...@gmail.com> wrote:

> Consequently, causing the index to be ignored in planning when
> using the old index

That last line should have read "using an old snapshot"

> is not a nice optimization, but necessary for
> correctness.  We already have logic to do this for other cases
> (like HOT updates), so it is a matter of tying in to that existing
> code correctly.  This won't be all that novel.

Just to demonstrate that, the minimal patch to fix behavior in this
area would be:

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 31a1438..6c379da 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2268,6 +2268,9 @@ IndexBuildHeapRangeScan(Relation heapRelation,
Assert(numblocks == InvalidBlockNumber);
}

+   if (old_snapshot_threshold >= 0)
+   indexInfo->ii_BrokenHotChain = true;
+
reltuples = 0;

/*

Of course, ii_BrokenHotChain should be renamed to something like
ii_UnsafeForOldSnapshots, and some comments need to be updated; but
the above is the substance of it.

Attached is what I have so far.  I'm still looking at what other
comments might need to be adjusted, but thought I should put this
much out for comment now.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 31a1438..b072985 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1672,7 +1672,7 @@ BuildIndexInfo(Relation index)
 
 	/* initialize index-build state to default */
 	ii->ii_Concurrent = false;
-	ii->ii_BrokenHotChain = false;
+	ii->ii_UnsafeForOldSnapshots = false;
 
 	return ii;
 }
@@ -2060,7 +2060,7 @@ index_build(Relation heapRelation,
 	 * about any concurrent readers of the tuple; no other transaction can see
 	 * it yet.
 	 */
-	if (indexInfo->ii_BrokenHotChain && !isreindex &&
+	if (indexInfo->ii_UnsafeForOldSnapshots && !isreindex &&
 		!indexInfo->ii_Concurrent)
 	{
 		Oid			indexId = RelationGetRelid(indexRelation);
@@ -2137,11 +2137,11 @@ index_build(Relation heapRelation,
  * so here because the AM might reject some of the tuples for its own reasons,
  * such as being unable to store NULLs.
  *
- * A side effect is to set indexInfo->ii_BrokenHotChain to true if we detect
- * any potentially broken HOT chains.  Currently, we set this if there are
- * any RECENTLY_DEAD or DELETE_IN_PROGRESS entries in a HOT chain, without
- * trying very hard to detect whether they're really incompatible with the
- * chain tip.
+ * A side effect is to set indexInfo->ii_UnsafeForOldSnapshots to true if we
+ * detect any potentially broken HOT chains or if old_snapshot_threshold is
+ * set to allow early pruning/vacuuuming.  Currently, we set this based on
+ * fairly simple tests; it is not guaranteed that using the index would cause
+ * a problem when set, but if the flag is clear it is guaranteed to be safe.
  */
 double
 IndexBuildHeapScan(Relation heapRelation,
@@ -2268,6 +2268,14 @@ IndexBuildHeapRangeScan(Relation heapRelation,
 		Assert(numblocks == InvalidBlockNumber);
 	}
 
+	/*
+	 * If allowing early pruning/vacuuming, the index cannot be considered
+	 * safe for old snapshots, since entries could be missing and lsn is set
+	 * to InvalidXLogRecPtr in all pages of the new index.
+	 */
+	if (old_snapshot_threshold >= 0)
+		indexInfo->ii_UnsafeForOldSnapshots = true;
+
 	reltuples = 0;
 
 	/*
@@ -2361,7 +2369,7 @@ IndexBuildHeapRangeScan(Relation heapRelation,
 	{
 		indexIt = false;
 		/* mark the index as unsafe for old snapshots */
-		indexInfo->ii_BrokenHotChain = true;
+		indexInfo->ii_UnsafeForOldSnapshots = true;
 	}
 	else
 		indexIt = true;
@@ -2488,7 +2496,7 @@ IndexBuildHeapRangeScan(Relation heapRelation,
 		 */
 		indexIt = false;
 		/* mark the index as unsafe for old snapshots */
-		indexInfo->ii_BrokenHotChain = true;
+		indexInfo->ii_UnsafeForOldSnapshots = true;
 	}
 	else
 	{
@@ -3409,9 +3417,9 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
 	 !indexForm->indisready ||
 	 !indexForm->indislive);
 		if (index_bad ||
-			(indexForm->indcheckxmin && !indexInfo->ii_BrokenHotChain))
+			(indexForm->indcheckxmin && !indexInfo->ii_UnsafeForOldSnapshots))
 		{
-			if (!indexInfo->ii_BrokenHotChain)
+			if (!indexInfo->ii_UnsafeForOldSnapshots)
 indexForm->indcheckxmin = false;
 			else if (index_bad)
 indexForm->indcheckxmin = true;
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index 564e10e..8dcac7e 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -314,7 +314,7 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid

Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-03 Thread Kevin Grittner
On Fri, May 27, 2016 at 10:35 AM, Kevin Grittner <kgri...@gmail.com> wrote:
> On Tue, May 24, 2016 at 4:10 PM, Robert Haas <robertmh...@gmail.com> wrote:

>> [ANALYZE of index with expression may fail to update statistics
>> if ANALYZE runs longer than old_snapshot_threshold]

>> If we can get away with it, it would be a rather large win to only set
>> a snapshot when the table has an expression index.  For purposes of
>> "snapshot too old", though, it will be important that a function in an
>> index which tries to read data from some other table which has been
>> pruned cancels itself when necessary.
>
> I will make this my top priority after resolving the question of whether
> there is an issue with CREATE INDEX.  I expect to have a resolution,
> probably involving some patch, by 3 June.

Due to 9.5 bug-fixing and the index issue taking a bit longer than
I expected, this is now looking like a 7 June resolution.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-06-03 Thread Kevin Grittner
On Fri, May 27, 2016 at 10:18 AM, Kevin Grittner <kgri...@gmail.com> wrote:
> On Fri, May 27, 2016 at 9:58 AM, Kevin Grittner <kgri...@gmail.com> wrote:
>> On Tue, May 24, 2016 at 4:56 PM, Andres Freund <and...@anarazel.de> wrote:
>
>>> If an old session with >= repeatable read accesses a clustered
>>> table (after the cluster committed), they'll now not see any
>>> errors, because all the LSNs look new.
>>
>> Again, it is new LSNs that trigger errors; if the page has not been
>> written recently the LSN is old and there is no error.  I think you
>> may be seeing problems based on getting the basics of this
>> backwards.
>
> I am reviewing the suggestion of a possible bug now, and will make
> it my top priority until resolved.  By the end of 1 June I will
> either have committed a fix or posted an explanation of why the
> concern is mistaken, with test results to demonstrate correct
> behavior.

This got set back by needing to fix a bug in the 9.5 release.  I am
back on this and have figured out that everyone who commented on
this specific issue was wrong about a very important fact -- the
LSNs in index pages after CREATE INDEX (with or without
CONCURRENTLY) and for REINDEX are always == InvalidXLogRecPtr (0).

That means that a snapshot from before an index build does not
always generate errors when it should on the use of the new index.
(Any early pruning/vacuuuming from before the index build is
missed; activity subsequent to the index build is recognized.)
Consequently, causing the index to be ignored in planning when
using the old index is not a nice optimization, but necessary for
correctness.  We already have logic to do this for other cases
(like HOT updates), so it is a matter of tying in to that existing
code correctly.  This won't be all that novel.

I now expect to push a fix along those lines by Tuesday, 6 June.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Add new snapshot fields to serialize/deserialize functions.

2016-06-03 Thread Kevin Grittner
Add new snapshot fields to serialize/deserialize functions.

The "snapshot too old" condition was not being recognized when
using a copied snapshot, since the original timestamp and lsn were
not being passed along.  Noticed when testing the combination of
"snapshot too old" with parallel query execution.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/370a46fc015115bfeccde4eb208d82049f792f9f

Modified Files
--
src/backend/utils/time/snapmgr.c | 6 ++
1 file changed, 6 insertions(+)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: C comment improvement & typo fix.

2016-06-02 Thread Kevin Grittner
C comment improvement & typo fix.

Thomas Munro

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/4edb7bd2fd6a48f6104c73551423cb208e13e529

Modified Files
--
src/include/access/nbtree.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Fix btree mark/restore bug.

2016-06-02 Thread Kevin Grittner
Fix btree mark/restore bug.

Commit 2ed5b87f96d473962ec5230fd820abfeaccb2069 introduced a bug in
mark/restore, in an attempt to optimize repeated restores to the
same page.  This caused an assertion failure during a merge join
which fed directly from an index scan, although the impact would
not be limited to that case.  Revert the bad chunk of code from
that commit.

While investigating this bug it was discovered that a particular
"paranoia" set of the mark position field would not prevent bad
behavior; it would just make it harder to diagnose.  Change that
into an assertion, which will draw attention to any future problem
in that area more directly.

Backpatch to 9.5, where the bug was introduced.

Bug #14169 reported by Shinta Koyanagi.
Preliminary analysis by Tom Lane identified which commit caused
the bug.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/7392eed7c2a327eb1b496f30a64be33404bcf82e

Modified Files
--
src/backend/access/nbtree/nbtree.c| 19 ---
src/backend/access/nbtree/nbtsearch.c |  2 +-
2 files changed, 1 insertion(+), 20 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Fix btree mark/restore bug.

2016-06-02 Thread Kevin Grittner
Fix btree mark/restore bug.

Commit 2ed5b87f96d473962ec5230fd820abfeaccb2069 introduced a bug in
mark/restore, in an attempt to optimize repeated restores to the
same page.  This caused an assertion failure during a merge join
which fed directly from an index scan, although the impact would
not be limited to that case.  Revert the bad chunk of code from
that commit.

While investigating this bug it was discovered that a particular
"paranoia" set of the mark position field would not prevent bad
behavior; it would just make it harder to diagnose.  Change that
into an assertion, which will draw attention to any future problem
in that area more directly.

Backpatch to 9.5, where the bug was introduced.

Bug #14169 reported by Shinta Koyanagi.
Preliminary analysis by Tom Lane identified which commit caused
the bug.

Branch
--
REL9_5_STABLE

Details
---
http://git.postgresql.org/pg/commitdiff/236d569f92b298c697e0f54891418acfc8310003

Modified Files
--
src/backend/access/nbtree/nbtree.c| 19 ---
src/backend/access/nbtree/nbtsearch.c |  2 +-
2 files changed, 1 insertion(+), 20 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-05-31 Thread Kevin Grittner
On Tue, May 31, 2016 at 10:03 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Fri, May 27, 2016 at 10:58 AM, Kevin Grittner <kgri...@gmail.com> wrote:
>>> As far as I can see normal index builds will allow concurrent hot
>>> prunes and everything; since those only require page-level
>>> exclusive locks.
>>>
>>> So for !concurrent builds we might end up with a corrupt index.
>>
>> ... by which you mean an index that omits certainly-dead heap
>> tuples which have been the subject of early pruning or vacuum, even
>> if there are still registered snapshots that include those tuples?
>> Or do you see something else?
>
> I think that is the danger.

Well, the *perceived* danger -- since every page in the new index
would be new enough to be recognized as too recently modified to be
safe for a snapshot that could still see the omitted rows, the only
question I'm sorting out on this is how safe it might be to cause
the index to be ignored in planning when using such a snapshot.
That and demonstrating the safe behavior to those not following
closely enough to see what will happen without a demonstration.

>> Again, since both the heap pages involved and all the new index
>> pages would have LSNs recent enough to trigger the old snapshot
>> check, I'm not sure where the problem is,

> This is a good point, though, I think.

The whole perception of risk in this area seems to be based on
getting that wrong; although the review of this area may yield some
benefit in terms of minimizing false positives.

>>> I think many of the places relying on heap scans with !mvcc
>>> snapshots are problematic in that way. Outdated reads will not be
>>> detected by TestForOldSnapshot() (given the (snapshot)->satisfies
>>> == HeapTupleSatisfiesMVCC condition therein), and rely on the
>>> snapshot to be actually working.  E.g. CLUSTER/ VACUUM FULL rely
>>> on accurate HEAPTUPLE_RECENTLY_DEAD
>>
>> Don't the "RECENTLY" values imply that the transaction is still
>> running which cause the tuple to be dead?  Since tuples are not
>> subject to early pruning or vacuum when that is the case, where do
>> you see a problem?  The snapshot itself has the usual xmin et al.,
>> so I'm not sure what you might mean by "the snapshot to be actually
>> working" if not the override at the time of pruning/vacuuming.
>
> Anybody who calls HeapTupleSatisfiesVacuum() with an xmin value that
> is newer that the oldest registered snapshot in the system (based on
> some snapshots being ignored) might get a return value of
> HEAPTUPLE_DEAD rather than HEAPTUPLE_RECENTLY_DEAD.

Since the override xmin cannot advance past the earliest
transaction which is still active, HEAPTUPLE_DEAD indicates that
the transaction causing the tuple to be dead has completed and the
tuple is irrevocably dead -- even if there are still snapshots
registered which can see it.  Seeing HEAPTUPLE_DEAD rather than
HEAPTUPLE_RECENTLY_DEAD is strictly limited to tuples which are
certainly and permanently dead and for which the only possible
references are non-MVCC snapshots or existing snapshots subject to
"snapshot too old" monitoring.

> IndexBuildHeapRangeScan(): We might end up with indexIt = false
> instead of indexIt = true.  That should be OK because anyone using the
> old snapshot will see a new page LSN and error out.  We might also
> fail to set indexInfo->ii_BrokenHotChain = true.  I suspect that's a
> problem, but I'm not certain of it.

The latter flag is what I'm currently digging at; but my hope is
that whenever old_snapshot_threshold >= 0 we can set
indexInfo->ii_BrokenHotChain = true to cause the planner to skip
consideration of the index if the snapshot is an "old" one.  That
will avoid some false positives (seeing the error when not strictly
necessary).  If that works out the way I hope, the only down side
is that a scan using a snapshot from an old transaction or cursor
would use some other index or a heap scan; but we already have that
possibility in some cases -- that seems to be the point of the
flag.

> CheckForSerializableConflictOut: Maybe a problem?  If the tuple is
> dead, there's no issue, but if it's recently-dead, there might be.

If the tuple is not visible to the scan, the behavior is unchanged
(a simple return from the function on either HEAPTUPLE_DEAD or
HEAPTUPLE_RECENTLY_DEAD with !visible) and (thus) clearly correct.
If the tuple is visible to us it is currently subject to early
pruning or vacuum (since those operations would get the same
modified xmin) but has not yet had any such treatment since we made
it to this function in the first place.  The processing for SSI
purposes would be unaffected by the possibility that there could
later be earl

Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-05-27 Thread Kevin Grittner
On Tue, May 24, 2016 at 4:10 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Tue, May 24, 2016 at 3:48 PM, Kevin Grittner <kgri...@gmail.com> wrote:
>> On Tue, May 24, 2016 at 12:00 PM, Andres Freund <and...@anarazel.de> wrote:
>>> On 2016-05-24 11:24:44 -0500, Kevin Grittner wrote:
>>>> On Fri, May 6, 2016 at 8:28 PM, Kevin Grittner <kgri...@gmail.com> wrote:
>>>>> On Fri, May 6, 2016 at 7:48 PM, Andres Freund <and...@anarazel.de> wrote:
>>>>
>>>>>> That comment reminds me of a question I had: Did you consider the effect
>>>>>> of this patch on analyze? It uses a snapshot, and by memory you've not
>>>>>> built in a defense against analyze being cancelled.
>>
>> The primary defense is not considering a cancellation except in 30
>> of the 500 places a page is used.  None of those 30 are, as far as
>> I can see (upon review in response to your question), used in the
>> analyze process.
>
> It's not obvious to me how this is supposed to work.  If ANALYZE's
> snapshot is subject to being ignored for xmin purposes because of
> snapshot_too_old, then I would think it would need to consider
> cancelling itself if it reads a page with possibly-removed data, just
> like in any other case.  It seems that we might not actually need a
> snapshot set for ANALYZE in all cases, because the comments say:
>
> /* functions in indexes may want a snapshot set */
> PushActiveSnapshot(GetTransactionSnapshot());
>
> If we can get away with it, it would be a rather large win to only set
> a snapshot when the table has an expression index.  For purposes of
> "snapshot too old", though, it will be important that a function in an
> index which tries to read data from some other table which has been
> pruned cancels itself when necessary.

I will make this my top priority after resolving the question of whether
there is an issue with CREATE INDEX.  I expect to have a resolution,
probably involving some patch, by 3 June.

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-05-27 Thread Kevin Grittner
On Fri, May 27, 2016 at 9:58 AM, Kevin Grittner <kgri...@gmail.com> wrote:
> On Tue, May 24, 2016 at 4:56 PM, Andres Freund <and...@anarazel.de> wrote:

>> If an old session with >= repeatable read accesses a clustered
>> table (after the cluster committed), they'll now not see any
>> errors, because all the LSNs look new.
>
> Again, it is new LSNs that trigger errors; if the page has not been
> written recently the LSN is old and there is no error.  I think you
> may be seeing problems based on getting the basics of this
> backwards.

I am reviewing the suggestion of a possible bug now, and will make
it my top priority until resolved.  By the end of 1 June I will
either have committed a fix or posted an explanation of why the
concern is mistaken, with test results to demonstrate correct
behavior.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-05-27 Thread Kevin Grittner
On Tue, May 24, 2016 at 4:56 PM, Andres Freund <and...@anarazel.de> wrote:

> The LSNs of the created index pages will be new, and we'll thus
> not necessarily error out when requried.

It is *old* LSNs that are *safe* -- *new* LSNs are what trigger the
"snapshot too old" error.  So your observation may be a reason that
there is not, in fact, any bug here.  That said, even if there is
no chance that using the index could generate incorrect results, it
may be worth looking at whether avoiding index use (as mentioned
below) might avoid false positives, and have enough value as an
optimization to add.  Of course, if that is the case, there is the
question of whether that is appropriate for 9.6 or material for the
first version 10 CF.

> At the very least we'd have to set indexInfo->ii_BrokenHotChain
> in that case.

If there is a bug, this is what I will look at first -- having
queries avoid the index if they are using an old snapshot, rather
than throwing an error.

> As far as I can see normal index builds will allow concurrent hot
> prunes and everything; since those only require page-level
> exclusive locks.
>
> So for !concurrent builds we might end up with a corrupt index.

... by which you mean an index that omits certainly-dead heap
tuples which have been the subject of early pruning or vacuum, even
if there are still registered snapshots that include those tuples?
Or do you see something else?

Again, since both the heap pages involved and all the new index
pages would have LSNs recent enough to trigger the old snapshot
check, I'm not sure where the problem is, but will review closely
to see what I might have missed.

> I think many of the places relying on heap scans with !mvcc
> snapshots are problematic in that way. Outdated reads will not be
> detected by TestForOldSnapshot() (given the (snapshot)->satisfies
> == HeapTupleSatisfiesMVCC condition therein), and rely on the
> snapshot to be actually working.  E.g. CLUSTER/ VACUUM FULL rely
> on accurate HEAPTUPLE_RECENTLY_DEAD

Don't the "RECENTLY" values imply that the transaction is still
running which cause the tuple to be dead?  Since tuples are not
subject to early pruning or vacuum when that is the case, where do
you see a problem?  The snapshot itself has the usual xmin et al.,
so I'm not sure what you might mean by "the snapshot to be actually
working" if not the override at the time of pruning/vacuuming.

> If an old session with >= repeatable read accesses a clustered
> table (after the cluster committed), they'll now not see any
> errors, because all the LSNs look new.

Again, it is new LSNs that trigger errors; if the page has not been
written recently the LSN is old and there is no error.  I think you
may be seeing problems based on getting the basics of this
backwards.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Update doc text to reflect new column in MVCC phenomena table.

2016-05-25 Thread Kevin Grittner
Update doc text to reflect new column in MVCC phenomena table.

Scott Wehrenberg

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/627e360358e3beb67cd2f54393835f979c5e30b7

Modified Files
--
doc/src/sgml/mvcc.sgml | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-05-24 Thread Kevin Grittner
On Tue, May 24, 2016 at 4:09 PM, Kevin Grittner <kgri...@gmail.com> wrote:
> On Tue, May 24, 2016 at 3:54 PM, Andres Freund <and...@anarazel.de> wrote:

>> It appears that concurrent index builds are currently broken
>> from a quick skim?
>
> Either you don't understand this feature very well, or I don't
> understand concurrent index build very well.

And it may be the latter.  On closer review I think some adjustment
may be needed in IndexBuildHeapRangeScan().  I'm not sure that
throwing the error is necessary, since there is a flag to say that
the index is unsafe for existing snapshots -- it may be enough to
set that flag.

Sorry for my earlier email.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-05-24 Thread Kevin Grittner
On Tue, May 24, 2016 at 4:10 PM, Robert Haas <robertmh...@gmail.com> wrote:

> For purposes of
> "snapshot too old", though, it will be important that a function in an
> index which tries to read data from some other table which has been
> pruned cancels itself when necessary.

Hm.  I'll try to work up a test case for this.  If you have one,
please send it along to me.

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-05-24 Thread Kevin Grittner
On Tue, May 24, 2016 at 3:54 PM, Andres Freund <and...@anarazel.de> wrote:

> what about e.g. concurrent index builds? E.g. IndexBuildHeapRangeScan() 
> doesn't
> seem to contain any checks against outdated blocks

Why would it?  We're talking about blocks where there were dead
tuples, with the transaction which updated or deleted the tuples
being complete, where those dead tuples were vacuumed away.  These
should not appear in the index, should they?

> and that's certainly not ok?

Why not?

> It appears that concurrent index builds are currently broken
> from a quick skim?

Either you don't understand this feature very well, or I don't
understand concurrent index build very well.  I thought we burned a
lot of time going through the index an extra time just to get rid
of dead tuples -- why would it be a problem not to add them in the
first place?

>>> Is there anything preventing this from becoming a problem?
>>
>> The fundamental approach that the error can only appear on
>> user-facing scans, not internal reads and positioning.
>
>> Unless there is some code path that uses a scan where the snapshot
>> age is checked, this cannot be a problem.  I don't see any such
>> path, but if you do, please let me know, and I'll fix things.
>
> That scares me. Not throwing an error, and not being broken are entirely
> different things.

Absolutely, but let's not start pointing at random chunks of code
and asking why an error isn't thrown there without showing that you
get bad results otherwise, or at least some plausible argument why
you might.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-05-24 Thread Kevin Grittner
On Tue, May 24, 2016 at 12:00 PM, Andres Freund <and...@anarazel.de> wrote:
> On 2016-05-24 11:24:44 -0500, Kevin Grittner wrote:
>> On Fri, May 6, 2016 at 8:28 PM, Kevin Grittner <kgri...@gmail.com> wrote:
>>> On Fri, May 6, 2016 at 7:48 PM, Andres Freund <and...@anarazel.de> wrote:
>>
>>>> That comment reminds me of a question I had: Did you consider the effect
>>>> of this patch on analyze? It uses a snapshot, and by memory you've not
>>>> built in a defense against analyze being cancelled.

The primary defense is not considering a cancellation except in 30
of the 500 places a page is used.  None of those 30 are, as far as
I can see (upon review in response to your question), used in the
analyze process.

>> With a 1min threshold, after loading a table "v" with a million
>> rows, beginning a repeatable read transaction on a different
>> connection and opening a cursor against that table, deleting almost
>> all rows on the original connection, and waiting a few minutes I
>> see this in the open transaction:
>>
>> test=# analyze verbose v;
>> INFO:  analyzing "public.v"
>> INFO:  "v": scanned 4425 of 4425 pages, containing 1999 live rows and
>> 0 dead rows; 1999 rows in sample, 1999 estimated total rows
>> ANALYZE
>> test=# select count(*) from v;
>> ERROR:  snapshot too old
>>
>> Meanwhile, no errors appeared in the log from autovacuum.
>
> I'd guess that that problem could only be reproduced if autoanalyze
> takes longer than the timeout, and there's rows pruned after it has
> started?  Analyze IIRC acquires a new snapshot when getting sample rows,
> so it'll not necessarily trigger in the above scenario, right?

Per Tom's recollection and my review of the code, the transaction
snapshot would be used in the test I show above, and the
combination of the verbose output the subsequent query show clearly
that if one of the page references capable of throwing the error
were encountered with this snapshot, the error would be thrown.  So
at least in this ANALYZE run, there is empirical support for what I
found in reviewing the code -- none of the places where we check
for an old snapshot are exercised during an ANALYZE.

> Is there anything preventing this from becoming a problem?

The fundamental approach that the error can only appear on
user-facing scans, not internal reads and positioning.

Unless there is some code path that uses a scan where the snapshot
age is checked, this cannot be a problem.  I don't see any such
path, but if you do, please let me know, and I'll fix things.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-05-24 Thread Kevin Grittner
On Tue, May 24, 2016 at 12:00 PM, Andres Freund <and...@anarazel.de> wrote:

> Analyze IIRC acquires a new snapshot when getting sample rows,

I could not find anything like that, and a case-insensitive search
of analyze.c finds no occurrences of "snap".  Can you remember
where you think you saw something that would cause the ANALYZE
command in my test to use a snapshot other than the one from the
REPEATABLE READ transaction in which it was run?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-05-24 Thread Kevin Grittner
On Fri, May 6, 2016 at 8:28 PM, Kevin Grittner <kgri...@gmail.com> wrote:
> On Fri, May 6, 2016 at 7:48 PM, Andres Freund <and...@anarazel.de> wrote:

>> That comment reminds me of a question I had: Did you consider the effect
>> of this patch on analyze? It uses a snapshot, and by memory you've not
>> built in a defense against analyze being cancelled.
>
> Will need to check on that.

With a 1min threshold, after loading a table "v" with a million
rows, beginning a repeatable read transaction on a different
connection and opening a cursor against that table, deleting almost
all rows on the original connection, and waiting a few minutes I
see this in the open transaction:

test=# analyze verbose v;
INFO:  analyzing "public.v"
INFO:  "v": scanned 4425 of 4425 pages, containing 1999 live rows and
0 dead rows; 1999 rows in sample, 1999 estimated total rows
ANALYZE
test=# select count(*) from v;
ERROR:  snapshot too old

Meanwhile, no errors appeared in the log from autovacuum.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-05-06 Thread Kevin Grittner
On Fri, May 6, 2016 at 7:48 PM, Andres Freund <and...@anarazel.de> wrote:
> On 2016-05-06 19:43:24 -0500, Kevin Grittner wrote:
>> It's disappointing that I am not getting more consistent numbers,
>> but NUMA can be hard to manage that way.
>
> FWIW, in my experience, unless you disable autovacuum (or rather
> auto-analyze), the effects from non-predicable analyze runs with
> long-running snapshots are worse.  I mean the numa effects suck, but in
> r/w workload effects of analyze are often much worse.

Hm.  But the benefits of the patch are not there if autovacuum is
off.  I'm gonna need to ponder the best way to test given all that.

> That comment reminds me of a question I had: Did you consider the effect
> of this patch on analyze? It uses a snapshot, and by memory you've not
> built in a defense against analyze being cancelled.

Will need to check on that.

>> Will push shortly with the nit-pick fixes you requested.
>
> Cool.

Done.

I will be checking in on the buildfarm, but if it starts causing
problems while I'm, say, sleeping -- I won't be offended if someone
else reverts 7e3da1c4737fd6582e12c80983987e4d2cbc1d17.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


[COMMITTERS] pgsql: Mitigate "snapshot too old" performance regression on NUMA

2016-05-06 Thread Kevin Grittner
Mitigate "snapshot too old" performance regression on NUMA

Limit maintenance of time to xid mapping to once per minute.  At
least in the tested case this brings performance within 5% of when
the feature is off, compared to several times slower without this
patch.

While there, fix comments and whitespace.

Ants Aasma, with cosmetic adjustments suggested by Andres Freund
Reviewed by Kevin Grittner and Andres Freund

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/7e3da1c4737fd6582e12c80983987e4d2cbc1d17

Modified Files
--
src/backend/utils/time/snapmgr.c | 85 +---
1 file changed, 62 insertions(+), 23 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-05-06 Thread Kevin Grittner
On Fri, May 6, 2016 at 5:07 PM, Andres Freund <and...@anarazel.de> wrote:

> On 2016-05-06 14:18:22 -0500, Kevin Grittner wrote:
>> I rebased the patch Ants posted (attached), and am running
>> benchmarks on a cthulhu (a big NUMA machine with 8 memory nodes).
>> Normally I wouldn't post results without a lot more data points
>> with multiple samples at each, but the initial results have me
>> wondering whether people would like to see this pushed later today
>> so that it has some time in the buildfarm and then into beta1.
>
> I think that generally would make sense. We quite possibly need some
> further changes, but it seems more likely that we can find them if the
> patch runs close to the disabled performance.

ok

>> Running the r/w TPC-B (sort of) load with scale, jobs, and threads
>> at 1000, and the database configured as I would for a production
>> server of that size, preliminary TPS results are:
>>
>> master, -1:  8158
>> master, 10min: 2019
>> Ants' patch, 10min: 7804
>
> That's rather nice.  Did you test read-only as well?

Not yet.  I don't trust short runs, so I've been going with -T2400;
with setup times and so on, that limits me to one run per hour of
time I book the machine, and I'm competing with others for that.  I
do plan to run read-only, too.

>From the 40 minute tests so far with Ants' patch (alternating settings):

old_snapshot_threshold = 10
7804
9524
9512

old_snapshot_threshold = -1
10421
8691
8977

It's disappointing that I am not getting more consistent numbers,
but NUMA can be hard to manage that way.

> If you'd feel more comfortable committing after I've run some
> performance tests, I could kick off some soon.

I think I should get it onto the buildfarm if we're going for
beta2, so there's time to recognize any problem (unlikely as that
*seems*) and back this out before beta if needed.  That said, all
additional data points welcome!

>> I can see arguments for tuning this far in time for the beta, as
>> well as the argument to wait until after the beta, so I'm just
>> throwing it out there to see what other people think.  I wouldn't
>> do it unless I have three runs at -1 and 10min with the patch, all
>> showing similar numbers.  If the BF chokes on it I would revert
>> this optimization attempt.
>
> +1 for going forward.  I'm still doubtful that it's a good idea to the
> map maintenance from GetSnapshotData(), but the issue becomes much less
> severe when addressed like this.
>
> The primary reasons why I'd like to move it is because of the
> significant amount of added gettimeofday() calls which are a real hog in
> some virtualized environments, and because I'm doubtful of tying the
> current time to the xmin horizon.

When I initially presented the proof of concept patch during the
9.5 development cycle it was based on transaction counts, and that
was the biggest criticism, and it came from many quarters.  Using
time was the big demand from just about everyone, and I'm not sure
how you do that without a mapping of time to xmin horizon.  If you
have some other idea, I'm all ears.

> Looks roughly sensible.

Will push shortly with the nit-pick fixes you requested.

Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-05-06 Thread Kevin Grittner
On Wed, Apr 20, 2016 at 8:08 PM, Ants Aasma <ants.aa...@eesti.ee> wrote:

> I had an idea I wanted to test out. The gist of it is to effectively
> have the last slot of timestamp to xid map stored in the latest_xmin
> field and only update the mapping when slot boundaries are crossed.
> See attached WIP patch for details. This way the exclusive lock only
> needs to be acquired once per minute. The common case is a spinlock
> that could be replaced with atomics later.

I rebased the patch Ants posted (attached), and am running
benchmarks on a cthulhu (a big NUMA machine with 8 memory nodes).
Normally I wouldn't post results without a lot more data points
with multiple samples at each, but the initial results have me
wondering whether people would like to see this pushed later today
so that it has some time in the buildfarm and then into beta1.

Running the r/w TPC-B (sort of) load with scale, jobs, and threads
at 1000, and the database configured as I would for a production
server of that size, preliminary TPS results are:

master, -1:  8158
master, 10min: 2019
Ants' patch, 10min: 7804

Basically it just skips the maintenance of the time/xid mapping
unless current time has advanced to a new minute.

I can see arguments for tuning this far in time for the beta, as
well as the argument to wait until after the beta, so I'm just
throwing it out there to see what other people think.  I wouldn't
do it unless I have three runs at -1 and 10min with the patch, all
showing similar numbers.  If the BF chokes on it I would revert
this optimization attempt.

Thoughts?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


update-old-snapshot-map-once-per-tick-v2.patch
Description: binary/octet-stream

-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Fix hash index vs "snapshot too old" problemms

2016-05-06 Thread Kevin Grittner
Fix hash index vs "snapshot too old" problemms

Hash indexes are not WAL-logged, and so do not maintain the LSN of
index pages.  Since the "snapshot too old" feature counts on
detecting error conditions using the LSN of a table and all indexes
on it, this makes it impossible to safely do early vacuuming on any
table with a hash index, so add this to the tests for whether the
xid used to vacuum a table can be adjusted based on
old_snapshot_threshold.

While at it, add a paragraph to the docs for old_snapshot_threshold
which specifically mentions this and other aspects of the feature
which may otherwise surprise users.

Problem reported and patch reviewed by Amit Kapila

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/2cc41acd8fa3ebb8f0501c6102a253fb7053cf46

Modified Files
--
doc/src/sgml/config.sgml | 13 ++
src/backend/access/hash/hash.c   |  1 -
src/backend/access/hash/hashsearch.c |  4 
src/backend/utils/cache/relcache.c   | 46 
src/backend/utils/time/snapmgr.c |  3 ++-
src/include/utils/rel.h  |  1 +
6 files changed, 62 insertions(+), 6 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Add a few entries to the tail of time mapping, to see old values

2016-04-29 Thread Kevin Grittner
Add a few entries to the tail of time mapping, to see old values.

Without a few entries beyond old_snapshot_threshold, the lookup
would often fail, resulting in the more aggressive pruning or
vacuum being skipped often enough to matter.  This was very clearly
shown by a python test script posted by Ants Aasma, and was likely
a factor in an earlier but somewhat less clear-cut test case posted
by Jeff Janes.

This patch makes no change to the logic, per se -- it just makes
the array of mapping entries big enough to make lookup misses based
on timing much less likely.  An occasional miss is still possible
if a thread stalls for more than 10 minutes, but that does not
create any problem with correctness of behavior.  Besides, if
things are so busy that a thread is stalling for more than 10
minutes, it is probably OK to skip the more aggressive cleanup at
that particular point in time.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/7c3e8039f450eb99b3a73272d0a1661195747d1b

Modified Files
--
src/backend/utils/time/snapmgr.c | 26 +-
src/include/utils/snapmgr.h  | 13 +
2 files changed, 26 insertions(+), 13 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Fix C comment typo and redundant test

2016-04-25 Thread Kevin Grittner
Fix C comment typo and redundant test

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/e65953be4f540dce31f17db2934ee58365077272

Modified Files
--
src/backend/utils/time/snapmgr.c | 7 +++
1 file changed, 3 insertions(+), 4 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-04-21 Thread Kevin Grittner
On Thu, Apr 21, 2016 at 2:10 PM, Ants Aasma <ants.aa...@eesti.ee> wrote:
> On Thu, Apr 21, 2016 at 5:16 PM, Kevin Grittner <kgri...@gmail.com> wrote:

>> Could you provide enough to make that a self-contained
>> reproducible test case [?]

> [provided]

Thanks!  I have your test case running, and it is not immediately
clear why old rows are not being vacuumed away.  Will investigate.

> I'm too tired right now to chase this down myself. The mental
> toll that two small kids can take is pretty staggering.

Been there, done that; so I know just what you mean.  :-)  It is
rewarding though, eh?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Include snapmgr.h in blscan.c

2016-04-21 Thread Kevin Grittner
Include snapmgr.h in blscan.c

Windows builds on buildfarm are failing because
old_snapshot_threshold is not found in the bloom filter contrib
module.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/7cb1db1d9599f0a09d6920d2149d956ef6d88b0e

Modified Files
--
contrib/bloom/blscan.c | 1 +
1 file changed, 1 insertion(+)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-04-21 Thread Kevin Grittner
On Wed, Apr 20, 2016 at 8:08 PM, Ants Aasma <ants.aa...@eesti.ee> wrote:

> However, while checking out if my proof of concept patch actually
> works I hit another issue. I couldn't get my test for the feature to
> actually work. The test script I used is attached.

Could you provide enough to make that a self-contained reproducible
test case (i.e., that I don't need to infer or re-write any steps
or guess how to call it)?  In previous cases people have given me
where they felt that the feature wasn't working there have have
been valid reasons for it to behave as it was (e.g., a transaction
with a transaction ID and an xmin which prevented cleanup from
advancing).  I'll be happy to look at your case and see whether
it's another such case or some bug, but it seems a waste to reverse
engineer or rewrite parts of the test case to do so.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Inline initial comparisons in TestForOldSnapshot()

2016-04-21 Thread Kevin Grittner
Inline initial comparisons in TestForOldSnapshot()

Even with old_snapshot_threshold = -1 (which disables the "snapshot
too old" feature), performance regressions were seen at moderate to
high concurrency.  For example, a one-socket, four-core system
running 200 connections at saturation could see up to a 2.3%
regression, with larger regressions possible on NUMA machines.
By inlining the early (smaller, faster) tests in the
TestForOldSnapshot() function, the i7 case dropped to a 0.2%
regression, which could easily just be noise, and is clearly an
improvement.  Further testing will show whether more is needed.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/11e178d0dc4bc2328ae4759090b3c48b07023fab

Modified Files
--
src/backend/storage/buffer/bufmgr.c | 28 +---
src/include/storage/bufmgr.h| 32 +++-
2 files changed, 36 insertions(+), 24 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Revert no-op changes to BufferGetPage()

2016-04-20 Thread Kevin Grittner
Revert no-op changes to BufferGetPage()

The reverted changes were intended to force a choice of whether any
newly-added BufferGetPage() calls needed to be accompanied by a
test of the snapshot age, to support the "snapshot too old"
feature.  Such an accompanying test is needed in about 7% of the
cases, where the page is being used as part of a scan rather than
positioning for other purposes (such as DML or vacuuming).  The
additional effort required for back-patching, and the doubt whether
the intended benefit would really be there, have indicated it is
best just to rely on developers to do the right thing based on
comments and existing usage, as we do with many other conventions.

This change should have little or no effect on generated executable
code.

Motivated by the back-patching pain of Tom Lane and Robert Haas

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/a343e223a5c33a7283a6d8b255c9dbc48dbc5061

Modified Files
--
contrib/bloom/blinsert.c  |   3 +-
contrib/bloom/blscan.c|   4 +-
contrib/bloom/blutils.c   |   8 +-
contrib/bloom/blvacuum.c  |   2 +-
contrib/pageinspect/btreefuncs.c  |   8 +-
contrib/pageinspect/rawpage.c |   4 +-
contrib/pg_visibility/pg_visibility.c |   4 +-
contrib/pgstattuple/pgstatapprox.c|   2 +-
contrib/pgstattuple/pgstatindex.c |   6 +-
contrib/pgstattuple/pgstattuple.c |  12 ++-
src/backend/access/brin/brin.c|  15 ++--
src/backend/access/brin/brin_pageops.c|  52 +---
src/backend/access/brin/brin_revmap.c |  19 +++--
src/backend/access/brin/brin_xlog.c   |  16 ++--
src/backend/access/gin/ginbtree.c |  71 ++---
src/backend/access/gin/gindatapage.c  |  18 ++---
src/backend/access/gin/ginentrypage.c |  18 ++---
src/backend/access/gin/ginfast.c  |  22 ++---
src/backend/access/gin/ginget.c   |  41 +-
src/backend/access/gin/gininsert.c|   6 +-
src/backend/access/gin/ginutil.c  |  12 ++-
src/backend/access/gin/ginvacuum.c|  29 +++
src/backend/access/gin/ginxlog.c  |  34 
src/backend/access/gist/gist.c|  48 ---
src/backend/access/gist/gistbuild.c   |  14 ++--
src/backend/access/gist/gistget.c |   5 +-
src/backend/access/gist/gistutil.c|   6 +-
src/backend/access/gist/gistvacuum.c  |   6 +-
src/backend/access/gist/gistxlog.c|   8 +-
src/backend/access/hash/hash.c|  18 ++---
src/backend/access/hash/hashinsert.c  |  11 ++-
src/backend/access/hash/hashovfl.c|  30 ---
src/backend/access/hash/hashpage.c|  21 +++--
src/backend/access/hash/hashsearch.c  |  18 ++---
src/backend/access/hash/hashutil.c|   2 +-
src/backend/access/heap/heapam.c  | 128 +-
src/backend/access/heap/hio.c |  27 +++
src/backend/access/heap/pruneheap.c   |  12 ++-
src/backend/access/heap/visibilitymap.c   |  21 ++---
src/backend/access/nbtree/nbtinsert.c |  47 +--
src/backend/access/nbtree/nbtpage.c   |  78 +-
src/backend/access/nbtree/nbtree.c|   2 +-
src/backend/access/nbtree/nbtsearch.c |  46 ++-
src/backend/access/nbtree/nbtutils.c  |   4 +-
src/backend/access/nbtree/nbtxlog.c   |  37 -
src/backend/access/spgist/spgdoinsert.c   |  39 -
src/backend/access/spgist/spginsert.c |  12 +--
src/backend/access/spgist/spgscan.c   |   3 +-
src/backend/access/spgist/spgutils.c  |  19 ++---
src/backend/access/spgist/spgvacuum.c |  13 ++-
src/backend/access/spgist/spgxlog.c   |  50 ++--
src/backend/access/transam/generic_xlog.c |  16 ++--
src/backend/access/transam/xloginsert.c   |   8 +-
src/backend/access/transam/xlogutils.c|   8 +-
src/backend/catalog/index.c   |   4 +-
src/backend/commands/analyze.c|   2 +-
src/backend/commands/sequence.c   |  12 +--
src/backend/commands/trigger.c|   2 +-
src/backend/commands/vacuumlazy.c |  12 +--
src/backend/executor/nodeBitmapHeapscan.c |   4 +-
src/backend/executor/nodeSamplescan.c |   4 +-
src/backend/storage/buffer/bufmgr.c   |  10 +--
src/backend/storage/freespace/freespace.c |  29 +++
src/backend/storage/freespace/fsmpage.c   |   2 +-
src/include/storage/bufmgr.h  |  42 +++---
65 files changed, 550 insertions(+), 736 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-04-19 Thread Kevin Grittner
On Tue, Apr 19, 2016 at 10:14 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Tue, Apr 19, 2016 at 11:11 AM, Kevin Grittner <kgri...@gmail.com> wrote:
>> On Tue, Apr 19, 2016 at 9:57 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>>> On Sun, Apr 17, 2016 at 2:26 AM, Andres Freund <and...@anarazel.de> wrote:
>>>>
>>>> On 2016-04-16 16:44:52 -0400, Noah Misch wrote:
>>>> > That is more controversial than the potential ~2% regression for
>>>> > old_snapshot_threshold=-1.  Alvaro[2] and Robert[3] are okay releasing
>>>> > that way, and Andres[4] is not.
>>>>
>>>> FWIW, I could be kinda convinced that it's temporarily ok, if there'd be
>>>> a clear proposal on the table how to solve the scalability issue around
>>>> MaintainOldSnapshotTimeMapping().
>>>
>>> It seems that for read-only workloads, MaintainOldSnapshotTimeMapping()
>>> takes EXCLUSIVE LWLock which seems to be a probable reason for a performance
>>> regression.  Now, here the question is do we need to acquire that lock if
>>> xmin is not changed since the last time value of
>>> oldSnapshotControl->latest_xmin is updated or xmin is lesser than equal to
>>> oldSnapshotControl->latest_xmin?
>>> If we don't need it for above cases, I think it can address the performance
>>> regression to a good degree for read-only workloads when the feature is
>>> enabled.
>>
>> Thanks, Amit -- I think something along those lines is the right
>> solution to the scaling issues when the feature is enabled.  For
>> now I'm focusing on the back-patching issues and the performance
>> regression when the feature is disabled, but I'll shift focus to
>> this once the "killer" issues are in hand.
>
> Maybe Amit could try his idea in parallel.

That would be great!

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-04-19 Thread Kevin Grittner
On Tue, Apr 19, 2016 at 9:57 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Sun, Apr 17, 2016 at 2:26 AM, Andres Freund <and...@anarazel.de> wrote:
>>
>> On 2016-04-16 16:44:52 -0400, Noah Misch wrote:
>> > That is more controversial than the potential ~2% regression for
>> > old_snapshot_threshold=-1.  Alvaro[2] and Robert[3] are okay releasing
>> > that way, and Andres[4] is not.
>>
>> FWIW, I could be kinda convinced that it's temporarily ok, if there'd be
>> a clear proposal on the table how to solve the scalability issue around
>> MaintainOldSnapshotTimeMapping().
>
> It seems that for read-only workloads, MaintainOldSnapshotTimeMapping()
> takes EXCLUSIVE LWLock which seems to be a probable reason for a performance
> regression.  Now, here the question is do we need to acquire that lock if
> xmin is not changed since the last time value of
> oldSnapshotControl->latest_xmin is updated or xmin is lesser than equal to
> oldSnapshotControl->latest_xmin?
> If we don't need it for above cases, I think it can address the performance
> regression to a good degree for read-only workloads when the feature is
> enabled.

Thanks, Amit -- I think something along those lines is the right
solution to the scaling issues when the feature is enabled.  For
now I'm focusing on the back-patching issues and the performance
regression when the feature is disabled, but I'll shift focus to
this once the "killer" issues are in hand.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-04-13 Thread Kevin Grittner
On Wed, Apr 13, 2016 at 3:47 PM, Andres Freund <and...@anarazel.de> wrote:
> On 2016-04-13 15:21:31 -0500, Kevin Grittner wrote:

>> What is the kernel on which these tests were run?
>
> 3.16. I can upgrade to 4.4 if necessary.

No, I'm not aware of any problems from 3.8 on.

> But I still believe very strongly that this is side-tracking the issue.

As long as I know it isn't a broken NUMA scheduler, or that there
were fewer than four NUMA memory nodes, I consider it a non-issue.
I just need to know whether it fits that problem profile to feel
comfortable that I can interpret the results correctly.

>> Which pg commit were these tests run against?
>
> 85e00470. + some reverts (the whitespace commits make this harder...) in
> the reverted case.
>
>
>> If 2201d801 was not included in your -1 tests, have you identified
>> where the 2% extra run time is going on -1 versus reverted?
>
> No. It's hard to do good profiles on most virtualized hardware, since
> hardware performance counters are disabled. So you only can do OS
> sampling; which has a pretty big performance influence.
>
> I'm not entirely sure what you mean with "2201d801 was not included in
> your -1 tests". The optimization was present.

Sorry, the "not" was accidental -- I hate reverse logic errors like that.

Based on the commit you used, I have my answer.  Thanks.

>> Since several other threads lately have reported bigger variation than
>> that based on random memory alignment issues, can we confirm that this
>> is a real difference in what is at master's HEAD?
>
> It's unfortunately hard to measure this conclusively here (and in
> general). I guess we'll have to look, on native hardware, where the
> difference comes from.  The difference is smaller on my laptop, and my
> workstation is somewhere on a container ship, other physical hardware I
> do not have.

OK, thanks.  I can't think of anything else to ask for at this
point.  If you feel that you have enough to press for some
particular course of action, go for it.  Personally, I want to do
some more investigation on those big machines.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-04-13 Thread Kevin Grittner
On Wed, Apr 13, 2016 at 3:01 PM, Andres Freund <and...@anarazel.de> wrote:

> If you want me to rn some other tests I can, but ISTM we have the
> data we need?

Thanks for the additional detail on how this was run.  I think I
still need a little more context, though:

What is the kernel on which these tests were run?

Which pg commit were these tests run against?

If 2201d801 was not included in your -1 tests, have you identified
where the 2% extra run time is going on -1 versus reverted?  Since
several other threads lately have reported bigger variation than
that based on random memory alignment issues, can we confirm that
this is a real difference in what is at master's HEAD?  Of course,
I'm still scheduled to test on bare metal machines in a couple
days, on two different architectures, so we'll have a few more data
points after that.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-04-13 Thread Kevin Grittner
On Wed, Apr 13, 2016 at 1:56 PM, Andres Freund <and...@anarazel.de> wrote:

> I'll run with -1 once the current (longer) run has finished.

Just for the record, were any of the other results purporting to be
with the feature "off" also actually running with the feature set
for its fastest possible timeout?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-04-13 Thread Kevin Grittner
On Wed, Apr 13, 2016 at 12:25 PM, Robert Haas <robertmh...@gmail.com> wrote:

> [test results with old_snapshot_threshold = 0 and 10]

>From the docs:

| A value of -1 disables this feature, and is the default.

> Yuck.  Aside from the fact that performance tanks when the feature is
> turned on, it seems that there is a significant effect even with it
> turned off.

No evidence of that has been provided.  -1 is off; 0 is for testing
very fast expiration.

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-04-12 Thread Kevin Grittner
On Tue, Apr 12, 2016 at 2:53 PM, Kevin Grittner <kgri...@gmail.com> wrote:
> Readonly with client and job counts matching scale.

Single-socket i7, BTW.

>> A lot of this will be different between
>> single-socket and multi-socket servers; as soon as you have the latter
>> the likelihood of contention being bad goes up dramatically.
>
> Yeah, I know, and 4 socket has been at least an order of magnitude
> more problematic in my experience than 2 socket.  And the problems
> are far, far, far worse on kernels prior to 3.8, especially on 3.x
> before 3.8, so it's hard to know how to take any report of problems
> on a 4 node NUMA machine without knowing the kernel version.

Also, with 4 node NUMA I have seen far better scaling with
hyper-threading turned off.  I know there are environments where it
helps, but high-concurrency on multi-node NUMA is not one of them.

So, anyway, mentioning the HT setting is important, too.

Kevin Grittner


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-04-12 Thread Kevin Grittner
On Tue, Apr 12, 2016 at 2:28 PM, Andres Freund <and...@anarazel.de> wrote:
> On 2016-04-12 14:17:12 -0500, Kevin Grittner wrote:
>> Well, something is different between your environment and mine,
>> since I saw no difference at scale 100 and 2.2% at scale 200.
>
> In a readonly test or r/w?

Readonly with client and job counts matching scale.

> A lot of this will be different between
> single-socket and multi-socket servers; as soon as you have the latter
> the likelihood of contention being bad goes up dramatically.

Yeah, I know, and 4 socket has been at least an order of magnitude
more problematic in my experience than 2 socket.  And the problems
are far, far, far worse on kernels prior to 3.8, especially on 3.x
before 3.8, so it's hard to know how to take any report of problems
on a 4 node NUMA machine without knowing the kernel version.

>> knowing more about your hardware, OS, configuration, etc., might
>> allow me to duplicate a problem so I can fix
>
>> For example, I used a "real" pg config, like I would for a production
>> machine (because that seems to me to be the environment that is most
>> important): the kernel is 3.13 (not one with pessimal scheduling) and
>> has tuning for THP, the deadline scheduler, the vm.*dirty* settings,
>> etc.  Without knowing even the kernel and what tuning the OS and pg
>> have had on your box, I could take a lot of shots in the dark without
>> hitting anything.
>
> That shouldn't really matter much for a read-only, shared_buffer
> resident, test? There's no IO and THP pretty much plays no role because
> there's very few memory allocations (removing the pressure causing the
> well known degradations).

I hate to assume which differences matter without trying, but some
of them seem less probable than others.

>> Oh, and the output of `numactl --hardware` would be good to have.
>> Thanks for all information you can provide.
>
> That was on Alexander's/PgPro's machine. Numactl wasn't installed, and I
> didn't have root. But it has four numa domains (gathered via /sys/).

On the machines I've used, it will give you the hardware report
without being root.  But of course, it can't do that if it's not
installed.  I hadn't yet seen a machine with multiple NUMA memory
segments that didn't have the numactl executable installed; I'll
keep in mind that can happen.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-04-12 Thread Kevin Grittner
On Tue, Apr 12, 2016 at 1:56 PM, Andres Freund <and...@anarazel.de> wrote:
> On 2016-04-12 13:44:00 -0500, Kevin Grittner wrote:
>> On Tue, Apr 12, 2016 at 12:38 PM, Andres Freund <and...@anarazel.de> wrote:
>>> On 2016-04-12 16:49:25 +, Kevin Grittner wrote:
>>>> On a big NUMA machine with 1000 connections in saturation load
>>>> there was a performance regression due to spinlock contention, for
>>>> acquiring values which were never used.  Just fill with dummy
>>>> values if we're not going to use them.
>>>
>>> FWIW, I could see massive regressions with just 64 connections.
>>
>> With what settings?
>
> You mean pgbench or postgres? The former -M prepared -c 64 -j 64 -S. The
> latter just a large enough shared buffers to contains the scale 300
> database, and adapted maintenance_work_mem. Nothing special.

Well, something is different between your environment and mine,
since I saw no difference at scale 100 and 2.2% at scale 200.  So,
knowing more about your hardware, OS, configuration, etc., might
allow me to duplicate a problem so I can fix it.  For example, I
used a "real" pg config, like I would for a production machine
(because that seems to me to be the environment that is most
important): the kernel is 3.13 (not one with pessimal scheduling)
and has tuning for THP, the deadline scheduler, the vm.*dirty*
settings, etc.  Without knowing even the kernel and what tuning the
OS and pg have had on your box, I could take a lot of shots in the
dark without hitting anything.  Oh, and the output of `numactl
--hardware` would be good to have.  Thanks for all information you
can provide.

>>  With or without the patch to avoid the locks when off?
>
> Without. Your commit message made it sound like you need unrealistic or
> at least unusual numbers of connections, and that's afaics not the case.

It was the only reported case to that point, so the additional data
point is valuable, if I can tell where that point is.  And you
don't have any evidence that even with your configuration that any
performance regression remains for those who have the default value
for old_snapshot_threshold?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add the "snapshot too old" feature

2016-04-12 Thread Kevin Grittner
On Mon, Apr 11, 2016 at 12:31 PM, Kevin Grittner <kgri...@gmail.com> wrote:
> On Mon, Apr 11, 2016 at 8:20 AM, Alexander Korotkov

>> So, for read-only benchmark this patch introduces more than 5 times
>> regression on big machine.
>
> I did not schedule a saturation test on a big machine.  I guess I
> should have done.

I have booked two large NUMA machines for Friday to look for any
remaining performance regressions on high-concurrency loads on such
machines.  Anyone with ideas on particular tests that they feel
should be included, please let me know before then.

Thanks!

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-04-12 Thread Kevin Grittner
On Tue, Apr 12, 2016 at 12:38 PM, Andres Freund <and...@anarazel.de> wrote:
> On 2016-04-12 16:49:25 +0000, Kevin Grittner wrote:
>> On a big NUMA machine with 1000 connections in saturation load
>> there was a performance regression due to spinlock contention, for
>> acquiring values which were never used.  Just fill with dummy
>> values if we're not going to use them.
>
> FWIW, I could see massive regressions with just 64 connections.

With what settings?  With or without the patch to avoid the locks when off?

> I'm a bit scared of having an innoccuous sounding option regress things
> by a factor of 10. I think, in addition to this fix, we need to actually
> solve the scalability issue here to a good degree.  One way to do so is
> to apply the parts of 0001 in
> http://archives.postgresql.org/message-id/20160330230914.GH13305%40awork2.anarazel.de
> defining PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY and rely on that. Another
> to apply the whole patch and simply put the lsn in an 8 byte atomic.

I think that we are well due for atomic access to aligned 8-byte
values.  That would eliminate one potential hot spot in the
"snapshot too old" code, for sure.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold

2016-04-12 Thread Kevin Grittner
Avoid extra locks in GetSnapshotData if old_snapshot_threshold < 0

On a big NUMA machine with 1000 connections in saturation load
there was a performance regression due to spinlock contention, for
acquiring values which were never used.  Just fill with dummy
values if we're not going to use them.

This patch has not been benchmarked yet on a big NUMA machine, but
it seems like a good idea on general principle, and it seemed to
prevent an apparent 2.2% regression on a single-socket i7 box
running 200 connections at saturation load.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/2201d801b03c2d1b0bce4d6580b718dc34d38b3e

Modified Files
--
src/backend/storage/ipc/procarray.c | 28 
1 file changed, 20 insertions(+), 8 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Use static inline function for BufferGetPage()

2016-04-11 Thread Kevin Grittner
Use static inline function for BufferGetPage()

I was initially concerned that the some of the hundreds of
references to BufferGetPage() where the literal
BGP_NO_SNAPSHOT_TEST were passed might not optimize as well as a
macro, leading to some hard-to-find performance regressions in
corner cases.  Inspection of disassembled code has shown identical
code at all inspected locations, and the size difference doesn't
amount to even one byte per such call.  So make it readable.

Per gripes from Álvaro Herrera and Tom Lane

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/a6f6b78196a701702ec4ff6df56c346bdcf9abd2

Modified Files
--
src/backend/storage/buffer/bufmgr.c |  4 +--
src/include/storage/bufmgr.h| 50 ++---
2 files changed, 25 insertions(+), 29 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Make oldSnapshotControl a pointer to a volatile structure

2016-04-11 Thread Kevin Grittner
Make oldSnapshotControl a pointer to a volatile structure

It was incorrectly declared as a volatile pointer to a non-volatile
structure.  Eliminate the OldSnapshotControl struct definition; it
is really not needed.  Pointed out by Tom Lane.

While at it, add OldSnapshotControlData to pgindent's list of
structures.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/80647bf65a03e232c995c0826ef394dad8d685fe

Modified Files
--
src/backend/utils/time/snapmgr.c | 8 +++-
src/tools/pgindent/typedefs.list | 1 +
2 files changed, 4 insertions(+), 5 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add the "snapshot too old" feature

2016-04-11 Thread Kevin Grittner
On Mon, Apr 11, 2016 at 8:20 AM, Alexander Korotkov
<a.korot...@postgrespro.ru> wrote:

> 1) We uglify buffer manager interface a lot.  This patch adds 3 more
> arguments to BufferGetPage().  It looks weird.  Should we try to find less
> invasive way for doing this?

As already pointed out, I originally touched about 450 fewer places
in the code, and did not change the signature of BufferGetPage().
I was torn on the argument that we needed a "forced choice" on
checking the snapshot age built into BufferGetPage() -- it is a bit
annoying, but it might prevent a later bug which could silently
return incorrect data.  In the end, I caved to the argument that
the annoyance was worth the improved chances of avoiding such a
bug.

> 2) Did you ever try to examine performance regression?

Yes.  Our customer used big machines for extensive performance
testing -- although they used "paced" input to simulate real users,
and saw nothing but gains.  On my own i7 box, your test scaled to
100 (so it would fit in memory) yielded this:

unpatched:

number of transactions actually processed: 40534737
latency average = 0.739 ms
latency stddev = 0.359 ms
tps = 134973.430374 (including connections establishing)
tps = 135039.395041 (excluding connections establishing)

with the "snapshot too old" patch:

number of transactions actually processed: 40679654
latency average = 0.735 ms
latency stddev = 0.486 ms
tps = 135463.614244 (including connections establishing)
tps = 135866.160933 (excluding connections establishing)

> I tried simple read
> only test on 4x18 Intel server.
> pgbench -s 1000 -c 140 -j 100 -M prepared -S -T 300 -P 1 postgres (data fits
> to shared_buffers)
>
> master-   193 740.3 TPS
> snapshot too old reverted - 1 065 732.6 TPS
>
> So, for read-only benchmark this patch introduces more than 5 times
> regression on big machine.

I did not schedule a saturation test on a big machine.  I guess I
should have done.

I'm confident this can be fixed; your suggestion for a wrapping
"if" is probably sufficient.  I am looking at this and the misuse
of "volatile" now.

Are you able to easily test that or should I book time on one (or
more) of our big machines on my end?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Fix typo in C comment.

2016-04-09 Thread Kevin Grittner
Fix typo in C comment.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/381200be4b565292eba6f62200248cb775f06940

Modified Files
--
src/include/storage/bufpage.h | 8 
1 file changed, 4 insertions(+), 4 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Turn special page pointer validation to static inline function

2016-04-09 Thread Kevin Grittner
Turn special page pointer validation to static inline function

Inclusion of multiple macros inside another macro was pushing MSVC
past its size liimit.  Reported by buildfarm.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/56dffb5a73ab157fc8d35a76c1170d656a051f14

Modified Files
--
src/include/storage/bufpage.h | 23 ---
1 file changed, 20 insertions(+), 3 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add the "snapshot too old" feature

2016-04-08 Thread Kevin Grittner
On Fri, Apr 8, 2016 at 5:04 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Kevin Grittner <kgri...@gmail.com> writes:
>> I see that there are failures on buildfarm Windows machines.
>> Makefiles are not my strong suit, and I don't have access to a
>> Windows machine for builds.  I'll start digging, but if someone can
>> help me with that, it would be much appreciated.
>
> I think you need to add snapshot_too_old to @contrib_excludes in
> Mkvcbuild.pm to stop that file from thinking that this module
> contains anything to build.  Not sure if anything else needs to be
> done to get the MSVC script to run the test the module embodies,
> but that would at least let it get further than it's getting now.

Thanks!  Pushed.

We'll see what happens now

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Add snapshot_too_old to NSVC @contrib_excludes

2016-04-08 Thread Kevin Grittner
Add snapshot_too_old to NSVC @contrib_excludes

The buildfarm showed failure for Windows MSVC builds due to this
omission.  This might not be the only problem with the Makefile for
this feature, but hopefully this will get it past the immediate
problem.

Fix suggested by Tom Lane

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/279d86afdbed550425bc9d1327ade2dc0028ad33

Modified Files
--
src/tools/msvc/Mkvcbuild.pm | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


Re: [COMMITTERS] pgsql: Add the "snapshot too old" feature

2016-04-08 Thread Kevin Grittner
On Fri, Apr 8, 2016 at 2:45 PM, Kevin Grittner <kgri...@postgresql.org> wrote:
> Add the "snapshot too old" feature

> src/test/modules/Makefile  |   1 +
> src/test/modules/snapshot_too_old/Makefile |  47 +++
> .../snapshot_too_old/expected/sto_using_cursor.out |  73 
> .../snapshot_too_old/expected/sto_using_select.out |  55 +++
> .../snapshot_too_old/specs/sto_using_cursor.spec   |  37 ++
> .../snapshot_too_old/specs/sto_using_select.spec   |  36 ++
> src/test/modules/snapshot_too_old/sto.conf |   3 +

I see that there are failures on buildfarm Windows machines.
Makefiles are not my strong suit, and I don't have access to a
Windows machine for builds.  I'll start digging, but if someone can
help me with that, it would be much appreciated.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


[COMMITTERS] pgsql: Add the "snapshot too old" feature

2016-04-08 Thread Kevin Grittner
Add the "snapshot too old" feature

This feature is controlled by a new old_snapshot_threshold GUC.  A
value of -1 disables the feature, and that is the default.  The
value of 0 is just intended for testing.  Above that it is the
number of minutes a snapshot can reach before pruning and vacuum
are allowed to remove dead tuples which the snapshot would
otherwise protect.  The xmin associated with a transaction ID does
still protect dead tuples.  A connection which is using an "old"
snapshot does not get an error unless it accesses a page modified
recently enough that it might not be able to produce accurate
results.

This is similar to the Oracle feature, and we use the same SQLSTATE
and error message for compatibility.

Branch
--
master

Details
---
http://git.postgresql.org/pg/commitdiff/848ef42bb8c7909c9d7baa38178d4a209906e7c1

Modified Files
--
contrib/bloom/blscan.c |   3 +-
doc/src/sgml/config.sgml   |  50 +++
src/backend/access/brin/brin.c |  19 +-
src/backend/access/brin/brin_revmap.c  |  11 +-
src/backend/access/gin/ginbtree.c  |   9 +-
src/backend/access/gin/gindatapage.c   |   7 +-
src/backend/access/gin/ginget.c|  22 +-
src/backend/access/gin/gininsert.c |   2 +-
src/backend/access/gist/gistget.c  |   2 +-
src/backend/access/hash/hash.c |   3 +-
src/backend/access/hash/hashsearch.c   |  10 +-
src/backend/access/heap/heapam.c   |  31 +-
src/backend/access/heap/pruneheap.c|  11 +-
src/backend/access/nbtree/nbtinsert.c  |   7 +-
src/backend/access/nbtree/nbtpage.c|   2 +-
src/backend/access/nbtree/nbtsearch.c  |  51 ++-
src/backend/access/spgist/spgscan.c|   2 +-
src/backend/commands/vacuum.c  |   3 +-
src/backend/commands/vacuumlazy.c  |   3 +-
src/backend/storage/buffer/bufmgr.c|  40 ++
src/backend/storage/ipc/ipci.c |   3 +
src/backend/storage/ipc/procarray.c|   9 +
src/backend/storage/lmgr/lwlocknames.txt   |   1 +
src/backend/utils/errcodes.txt |   4 +
src/backend/utils/misc/guc.c   |  11 +
src/backend/utils/misc/postgresql.conf.sample  |   2 +
src/backend/utils/time/snapmgr.c   | 404 +
src/include/access/brin_revmap.h   |   5 +-
src/include/access/gin_private.h   |   4 +-
src/include/access/nbtree.h|   7 +-
src/include/storage/bufmgr.h   |  19 +-
src/include/utils/rel.h|   1 +
src/include/utils/snapmgr.h|  13 +
src/include/utils/snapshot.h   |   4 +
src/test/modules/Makefile  |   1 +
src/test/modules/snapshot_too_old/Makefile |  47 +++
.../snapshot_too_old/expected/sto_using_cursor.out |  73 
.../snapshot_too_old/expected/sto_using_select.out |  55 +++
.../snapshot_too_old/specs/sto_using_cursor.spec   |  37 ++
.../snapshot_too_old/specs/sto_using_select.spec   |  36 ++
src/test/modules/snapshot_too_old/sto.conf |   3 +
41 files changed, 942 insertions(+), 85 deletions(-)


-- 
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers


  1   2   3   >