Re: [HACKERS] Is there some possibilities to take info about login mapping inside session?

2015-05-31 Thread Stephen Frost
Pavel,

* Pavel Stehule (pavel.steh...@gmail.com) wrote:
 Is there any way to get info about the external user when a login
 via mapping is used?

Certainly sounds like a very useful thing to me.

I'll note that, for client-side certificates, we actually do include
that info, but it's done in a very-specific-to-SSL way (see sslinfo).
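
For reference, that SSL-specific route is just a couple of calls once the
extension is installed; a minimal sketch, assuming the session actually
presented a client certificate:

CREATE EXTENSION IF NOT EXISTS sslinfo;
SELECT ssl_is_used(), ssl_client_cn();  -- the certificate's CN, or NULL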

I've not looked, but it would seem that keeping info about what the
'system' user is and making it available via a function would be pretty
simple to do.  Too late for 9.5 though, of course.

 The customer wants to use the map to switch from an external user to a
 database user, but needs info about the external user for auditing.

This is a more interesting question- where would this information be
going for audit purposes?  Are you thinking we'd need to add another
escape to log_line_prefix for it?  We still haven't gotten info about the
currently active role added, an effort I spent a great deal of time on
about 2 years ago, as I recall.  I might be able to revisit that for
9.6.

If not through log_line_prefix, then through a trigger?  That would work
with just the function.  If not that, then I'd be quite curious what
this customer is doing (and if it's in line with what our customers are
interested in when it comes to real auditing...).
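
For concreteness, the trigger route could look like this sketch (hedged:
pg_system_user() is hypothetical here, standing in for whatever accessor
we end up adding; table and trigger names are illustrative):

CREATE TABLE audit_log (
    logged_at  timestamptz NOT NULL DEFAULT now(),
    db_user    text,  -- the role the external user was mapped to
    ext_user   text,  -- the external identity we want to record
    tbl        text,
    op         text
);

CREATE FUNCTION audit_row_change() RETURNS trigger AS $$
BEGIN
    INSERT INTO audit_log (db_user, ext_user, tbl, op)
    VALUES (current_user, pg_system_user(), TG_TABLE_NAME, TG_OP);
    RETURN NULL;  -- the result of an AFTER trigger is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER audit_trg AFTER INSERT OR UPDATE OR DELETE ON some_table
    FOR EACH ROW EXECUTE PROCEDURE audit_row_change();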

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [CORE] postpone next week's release

2015-05-31 Thread David Steele
On 5/31/15 11:49 AM, Noah Misch wrote:
 On Sat, May 30, 2015 at 09:51:04PM -0400, David Steele wrote:
 On 5/30/15 8:38 PM, Joshua D. Drake wrote:
 On 05/30/2015 03:48 PM, David Steele wrote:
 I would argue Heikki's WAL stuff is a perfect case for releasing a
 public alpha/beta soon.  I'd love to test PgBackRest with an official
 9.5dev build.  The PgBackRest test suite has lots of tests that run on
 versions 8.3+ and might well shake out any bugs that are lying around.

 You are right. Clone git, run it nightly automated and please, please
 report anything you find. There is no reason for a tagged release for
 that. Consider it a custom, purpose built, build-test farm.

 Sure - I can write code to do that.  But then why release a beta at all?
 
 It's largely for the benefit of folks planning manual, or otherwise high-cost,
 testing.  If you budget for just one big test per year, make it a test of
 beta1.  For inexpensive testing, you may as well ignore beta and test git
 master daily or weekly.

I've gotten to the point of (relatively) high-cost coding/testing.  The
removal of checkpoint_segments and pause_on_recovery is leading to
refactoring of not only the regression tests but the actual backup
code.  9.5 and 8.3 are the only versions that require exceptions in the
code base.
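
(For context: in 9.5 checkpoint_segments is gone, replaced by the
min_wal_size/max_wal_size pair, so version-aware tooling needs a check
along these lines:)

SELECT name, setting
FROM pg_settings
WHERE name IN ('min_wal_size', 'max_wal_size');
-- on 9.4 and earlier the equivalent knob was checkpoint_segments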

I've already done basic testing against 9.5 by disabling certain tests.
Now I'm at the point where I need to start modifying code to take new
9.5 features/changes into account and make sure the regression tests
work for 8.3-9.5 with as few exceptions as possible.

From the perspective of backup/restore testing, 9.5 has the most changes
since 9.0.  I'd like to know that the API at least is stable before
investing the time in new development.

Perhaps I'm just misunderstanding the nature of the discussion.

-- 
- David Steele
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Join Filter vs. Index Cond (performance regression 9.1->9.2+/HEAD)

2015-05-31 Thread Tom Lane
Andrew Gierth and...@tao11.riddles.org.uk writes:
 Obviously it makes little sense to use an (a,b,c) index to look up just
 (a,b) and then filter on c; the question is, what is the planner doing
 that leads it to get this so wrong?

It's not so astonishing as all that; compare

regression=# explain select * from t1 where a=3 and b=4;
                               QUERY PLAN
------------------------------------------------------------------------
 Index Only Scan using t1_pkey on t1  (cost=0.28..8.29 rows=1 width=12)
   Index Cond: ((a = 3) AND (b = 4))
(2 rows)

regression=# explain select * from t1 where a=3 and b=4 and c=5;
                               QUERY PLAN
------------------------------------------------------------------------
 Index Only Scan using t1_pkey on t1  (cost=0.28..8.30 rows=1 width=12)
   Index Cond: ((a = 3) AND (b = 4) AND (c = 5))
(2 rows)

Once you're down to an estimate of one row retrieved, adding additional
index conditions simply increases the cost (not by much, but it increases)
without delivering any visible benefit.

I believe what probably happened in this case is that the planner
considered both forms of the indexscan path and concluded that they were
fuzzily the same cost and rowcount, yet the path using only t2.a and t3.b
clearly dominated by requiring strictly fewer outer relations for
parameters.  So it threw away the path that also had the c = t4.c
comparison before it ever got to the join stage.  Even had it kept that
path, the join cost estimate wouldn't have looked any better than the one
for the join it did pick, so there would have been no certainty of picking
the correct plan.

The real problem in your example is thus the incorrect rowcount estimate;
with better rowcount estimates the two cases wouldn't have appeared to
have the same output rowcount.

For the toy data in your example, this can probably be blamed on the fact
that eqjoinsel_inner doesn't have any smarts for the case of having an MCV
list for only one side (though as noted in the comments, it's not obvious
what it should do instead).  However, it's not very clear what was
happening in the real-world case.
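
A quick way to check whether the rowcount estimate is at fault is to
compare estimated and actual rows directly (a sketch; the table and
column names follow the discussion above and are illustrative):

EXPLAIN ANALYZE SELECT * FROM t2 JOIN t4 ON t2.c = t4.c;
-- an estimated rows=1 against many actual rows points at the join
-- selectivity machinery (eqjoinsel_inner), not at path pruning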

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [CORE] [HACKERS] postpone next week's release

2015-05-31 Thread Andres Freund
On 2015-05-31 11:55:44 -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  FYI, I realize that one additional thing that has discouraged code
  reorganization is the additional backpatch overhead.  I think we now
  need to accept that our reorganization-adverse approach might have cost
  us some reliability, and that reorganization is going to add work to
  backpatching.
 
  Actually, code reorganization in HEAD might cause backpatching to be
  more buggy, reducing reliability --- obviously we need to have a
  discussion about that.
 
 Commit 6b700301c36e380eb4972ab72c0e914cae60f9fd is a recent real example.
 Not that that should dissuade us from ever doing any reorganizations,
 but it's foolish to discount back-patching costs.

On the other hand, that code is a complete maintenance nightmare. If
there weren't literally dozens of places that needed to be touched to
add a single parameter, it'd be far less likely for such a mistake to be
made. Right now significant portions of the file differ between the
branches, despite primarily minor feature additions...


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [CORE] postpone next week's release

2015-05-31 Thread Noah Misch
On Sat, May 30, 2015 at 09:51:04PM -0400, David Steele wrote:
 On 5/30/15 8:38 PM, Joshua D. Drake wrote:
  On 05/30/2015 03:48 PM, David Steele wrote:
  I would argue Heikki's WAL stuff is a perfect case for releasing a
  public alpha/beta soon.  I'd love to test PgBackRest with an official
  9.5dev build.  The PgBackRest test suite has lots of tests that run on
  versions 8.3+ and might well shake out any bugs that are lying around.
  
  You are right. Clone git, run it nightly automated and please, please
  report anything you find. There is no reason for a tagged release for
  that. Consider it a custom, purpose built, build-test farm.
 
 Sure - I can write code to do that.  But then why release a beta at all?

It's largely for the benefit of folks planning manual, or otherwise high-cost,
testing.  If you budget for just one big test per year, make it a test of
beta1.  For inexpensive testing, you may as well ignore beta and test git
master daily or weekly.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_xlog -> pg_xjournal?

2015-05-31 Thread Tom Lane
Joel Jacobson j...@trustly.com writes:
 If we could turn back time, would we have picked pg_xlog as the most
 optimal name for this important directory, or would we have come up with a
 more user-friendly name?

Yeah...

 My suggestion is to use pg_xjournal instead of pg_xlog when new users
 create a new data directory using initdb, and allow for both directories to
 exist (exclusive or, i.e. either one or the other, but not both). That way
 we don't complicate life for any existing users: all their tools that rely
 on pg_xlog being named pg_xlog will continue to work. We only force new
 users to do a bit of googling when some tool of theirs can't find pg_xlog.
 Once they learn it's an important directory, they can simply create a
 symlink and their old, not-yet-updated tool will work again.

Hm.  I think the impact on third-party backup tools would be rather bad,
but there's a simple modification of the idea that might fix that:
just always create pg_xlog as a symlink to pg_xjournal during initdb.
Anybody who blindly removes pg_xlog won't have done anything
irreversible.  We could deprecate pg_xlog and stop creating the symlink
after a few releases, once third-party tools have had a reasonable
amount of time to adjust.

Note that we'd really also have to rename pg_clog etc if you want to
protect against people who rm -rf *log without reading documentation.
But I don't see why the same trick wouldn't work for all of them.

A more difficult question is whether we'd also rename pg_resetxlog,
pg_receivexlog, etc.  It would be hard to make those changes similarly
transparent.

In the end though, this is a lot of thrashing for a problem that
only comes up rarely ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [CORE] [HACKERS] postpone next week's release

2015-05-31 Thread Bruce Momjian
On Sun, May 31, 2015 at 11:55:44AM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  FYI, I realize that one additional thing that has discouraged code
  reorganization is the additional backpatch overhead.  I think we now
  need to accept that our reorganization-adverse approach might have cost
  us some reliability, and that reorganization is going to add work to
  backpatching.
 
  Actually, code reorganization in HEAD might cause backpatching to be
  more buggy, reducing reliability --- obviously we need to have a
  discussion about that.
 
 Commit 6b700301c36e380eb4972ab72c0e914cae60f9fd is a recent real example.
 Not that that should dissuade us from ever doing any reorganizations,
 but it's foolish to discount back-patching costs.

Yep.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_xlog -> pg_xjournal?

2015-05-31 Thread Joel Jacobson
On Sun, May 31, 2015 at 7:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Hm.  I think the impact on third-party backup tools would be rather bad,
 but there's a simple modification of the idea that might fix that:
 just always create pg_xlog as a symlink to pg_xjournal during initdb.
 Anybody who blindly removes pg_xlog won't have done anything
 irreversible.  We could deprecate pg_xlog and stop creating the symlink
 after a few releases, once third-party tools have had a reasonable
 amount of time to adjust.

I like the solution. Simple and effective.
+1

 In the end though, this is a lot of thrashing for a problem that
 only comes up rarely ...

It happens often enough for the problem to be the first mentioned
use-case of pg_resetxlog at Stack Overflow:

pg_resetxlog is a tool of last resort for getting your database
running again after:
1. You deleted files you shouldn't have from pg_xlog;

(http://stackoverflow.com/questions/12897429/what-does-pg-resetxlog-do-and-how-does-it-work)

Preventing failure in the case of faults is of course one of the
primary objectives of any database.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix autoconf deprecation warnings

2015-05-31 Thread Andreas Karlsson

I have attached new versions which apply on the current master.

Andreas
From a4a4970d37b710449ccd57a8b4d896a3c004b62a Mon Sep 17 00:00:00 2001
From: Andreas Karlsson andr...@proxel.se
Date: Thu, 12 Feb 2015 23:55:01 +0100
Subject: [PATCH 1/4] Replace obsolete macros AC_TRY_* with AC_*_IFELSE.

The AC_TRY_* macros have been obsolete for a long time and cause warnings
to be written to the log. The new macros have existed since at least
autoconf 2.49a, which was released in late 2000.
---
 config/ac_func_accept_argtypes.m4 |  6 ++--
 config/acx_pthread.m4 |  4 +--
 config/c-compiler.m4  | 69 +++--
 config/c-library.m4   | 36 ++--
 config/programs.m4|  4 +--
 configure.in  | 72 ---
 6 files changed, 97 insertions(+), 94 deletions(-)

diff --git a/config/ac_func_accept_argtypes.m4 b/config/ac_func_accept_argtypes.m4
index a82788d..0e50c8d 100644
--- a/config/ac_func_accept_argtypes.m4
+++ b/config/ac_func_accept_argtypes.m4
@@ -50,15 +50,15 @@ AC_DEFUN([AC_FUNC_ACCEPT_ARGTYPES],
   for ac_cv_func_accept_arg1 in 'int' 'unsigned int' 'SOCKET'; do
for ac_cv_func_accept_arg2 in 'struct sockaddr *' 'const struct sockaddr *' 'void *'; do
 for ac_cv_func_accept_arg3 in 'int' 'size_t' 'socklen_t' 'unsigned int' 'void'; do
- AC_TRY_COMPILE(
+ AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
 [#ifdef HAVE_SYS_TYPES_H
#include <sys/types.h>
 #endif
 #ifdef HAVE_SYS_SOCKET_H
#include <sys/socket.h>
 #endif
-extern $ac_cv_func_accept_return accept ($ac_cv_func_accept_arg1, $ac_cv_func_accept_arg2, $ac_cv_func_accept_arg3 *);],
- [], [ac_not_found=no; break 4], [ac_not_found=yes])
+extern $ac_cv_func_accept_return accept ($ac_cv_func_accept_arg1, $ac_cv_func_accept_arg2, $ac_cv_func_accept_arg3 *);], [])],
+ [ac_not_found=no; break 4], [ac_not_found=yes])
done
   done
  done
diff --git a/config/acx_pthread.m4 b/config/acx_pthread.m4
index 581164b..ebb4f06 100644
--- a/config/acx_pthread.m4
+++ b/config/acx_pthread.m4
@@ -122,10 +122,10 @@ for flag in $acx_pthread_flags; do
 # pthread_cleanup_push because it is one of the few pthread
 # functions on Solaris that doesn't have a non-functional libc stub.
 # We try pthread_create on general principles.
-AC_TRY_LINK([#include <pthread.h>],
+AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <pthread.h>],
 [pthread_t th; pthread_join(th, 0);
  pthread_attr_init(0); pthread_cleanup_push(0, 0);
- pthread_create(0,0,0,0); pthread_cleanup_pop(0); ],
+ pthread_create(0,0,0,0); pthread_cleanup_pop(0); ])],
 [acx_pthread_ok=yes], [acx_pthread_ok=no])
 
 if test x$acx_pthread_ok = xyes; then
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 4ef0de6..bf9b758 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -7,8 +7,8 @@
 # Check if the C compiler understands signed types.
 AC_DEFUN([PGAC_C_SIGNED],
 [AC_CACHE_CHECK(for signed types, pgac_cv_c_signed,
-[AC_TRY_COMPILE([],
-[signed char c; signed short s; signed int i;],
+[AC_COMPILE_IFELSE([AC_LANG_PROGRAM([],
+[signed char c; signed short s; signed int i;])],
 [pgac_cv_c_signed=yes],
 [pgac_cv_c_signed=no])])
 if test x$pgac_cv_c_signed = xno ; then
@@ -81,7 +81,7 @@ AC_DEFUN([PGAC_TYPE_64BIT_INT],
 [define([Ac_define], [translit([have_$1_64], [a-z *], [A-Z_P])])dnl
 define([Ac_cachevar], [translit([pgac_cv_type_$1_64], [ *], [_p])])dnl
 AC_CACHE_CHECK([whether $1 is 64 bits], [Ac_cachevar],
-[AC_TRY_RUN(
+[AC_RUN_IFELSE([AC_LANG_SOURCE(
 [typedef $1 ac_int64;
 
 /*
@@ -107,7 +107,7 @@ int does_int64_work()
 }
 main() {
   exit(! does_int64_work());
-}],
+}])],
 [Ac_cachevar=yes],
 [Ac_cachevar=no],
 [# If cross-compiling, check the size reported by the compiler and
@@ -169,8 +169,8 @@ fi])# PGAC_TYPE_128BIT_INT
 # Define HAVE_FUNCNAME__FUNC or HAVE_FUNCNAME__FUNCTION accordingly.
 AC_DEFUN([PGAC_C_FUNCNAME_SUPPORT],
 [AC_CACHE_CHECK(for __func__, pgac_cv_funcname_func_support,
-[AC_TRY_COMPILE([#include <stdio.h>],
-[printf("%s\n", __func__);],
+[AC_COMPILE_IFELSE([AC_LANG_PROGRAM([#include <stdio.h>],
+[printf("%s\n", __func__);])],
 [pgac_cv_funcname_func_support=yes],
 [pgac_cv_funcname_func_support=no])])
 if test x$pgac_cv_funcname_func_support = xyes ; then
@@ -178,8 +178,8 @@ AC_DEFINE(HAVE_FUNCNAME__FUNC, 1,
   [Define to 1 if your compiler understands __func__.])
 else
 AC_CACHE_CHECK(for __FUNCTION__, pgac_cv_funcname_function_support,
-[AC_TRY_COMPILE([#include <stdio.h>],
-[printf("%s\n", __FUNCTION__);],
+[AC_COMPILE_IFELSE([AC_LANG_PROGRAM([#include <stdio.h>],
+[printf("%s\n", __FUNCTION__);])],
 [pgac_cv_funcname_function_support=yes],
 [pgac_cv_funcname_function_support=no])])
 if test x$pgac_cv_funcname_function_support = xyes ; then
@@ -199,8 +199,8 @@ fi])# 

Re: [HACKERS] nested loop semijoin estimates

2015-05-31 Thread Tom Lane
Tomas Vondra tomas.von...@2ndquadrant.com writes:
 On 05/30/15 23:16, Tom Lane wrote:
 Attached is a draft patch for that.  It fixes the problem for me:

 Seems to be working OK, but I still do get a Bitmap Heap Scan there (but 
 more about that later).

Attached is an incremental patch (on top of the previous one) to allow
startup cost of parameterized paths to be considered when the relation
is the RHS of a semi or anti join.  It seems reasonably clean except
for one thing: logically, we perhaps should remove the checks on
path->param_info from the last half of compare_path_costs_fuzzily(),
so as to increase the symmetry between parameterized paths and
unparameterized ones.  However, when I did that, I got changes in some
of the regression test plans, and they didn't seem to be for the better.
So I left that alone.  As-is, this patch doesn't seem to affect the
results for any existing regression tests.

 Do you plan to push that into 9.5, or 9.6? I assume it's a behavior 
 change so that no back-patching, right?

Mumble.  It's definitely a planner bug fix, and I believe that the effects
are narrowly constrained to cases where significantly better plans are
possible.  So personally I'd be willing to back-patch it.  But others
might see that differently, especially since it's been like this for a
long time and we only have one field complaint.

regards, tom lane

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 4775acf..87304ba 100644
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*** _outRelOptInfo(StringInfo str, const Rel
*** 1844,1849 
--- 1844,1850 
  	WRITE_FLOAT_FIELD(rows, "%.0f");
  	WRITE_INT_FIELD(width);
  	WRITE_BOOL_FIELD(consider_startup);
+ 	WRITE_BOOL_FIELD(consider_param_startup);
  	WRITE_NODE_FIELD(reltargetlist);
  	WRITE_NODE_FIELD(pathlist);
  	WRITE_NODE_FIELD(ppilist);
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index d51fd57..d1349aa 100644
*** a/src/backend/optimizer/README
--- b/src/backend/optimizer/README
*** a nestloop that provides parameters to t
*** 795,801 
  do not ignore merge joins entirely, joinpath.c does not fully explore the
  space of potential merge joins with parameterized inputs.  Also, add_path
  treats parameterized paths as having no pathkeys, so that they compete
! only on total cost and rowcount; they don't get preference for producing a
  special sort order.  This creates additional bias against merge joins,
  since we might discard a path that could have been useful for performing
  a merge without an explicit sort step.  Since a parameterized path must
--- 795,801 
  do not ignore merge joins entirely, joinpath.c does not fully explore the
  space of potential merge joins with parameterized inputs.  Also, add_path
  treats parameterized paths as having no pathkeys, so that they compete
! only on cost and rowcount; they don't get preference for producing a
  special sort order.  This creates additional bias against merge joins,
  since we might discard a path that could have been useful for performing
  a merge without an explicit sort step.  Since a parameterized path must
*** uninteresting, these choices do not affe
*** 804,809 
--- 804,817 
  output order of a query --- they only make it harder to use a merge join
  at a lower level.  The savings in planning work justifies that.
  
+ Similarly, parameterized paths do not normally get preference in add_path
+ for having cheap startup cost; that's seldom of much value when on the
+ inside of a nestloop, so it seems not worth keeping extra paths solely for
+ that.  An exception occurs for parameterized paths for the RHS relation of
+ a SEMI or ANTI join: in those cases, we can stop the inner scan after the
+ first match, so that it's primarily startup not total cost that we care
+ about.
+ 
  
  LATERAL subqueries
  --
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 4e6d90d..0b83189 100644
*** a/src/backend/optimizer/path/allpaths.c
--- b/src/backend/optimizer/path/allpaths.c
*** set_rel_pathlist_hook_type set_rel_pathl
*** 61,66 
--- 61,67 
  join_search_hook_type join_search_hook = NULL;
  
  
+ static void set_base_rel_consider_startup(PlannerInfo *root);
  static void set_base_rel_sizes(PlannerInfo *root);
  static void set_base_rel_pathlists(PlannerInfo *root);
  static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
*** make_one_rel(PlannerInfo *root, List *jo
*** 152,157 
--- 153,161 
  		root->all_baserels = bms_add_member(root->all_baserels, brel->relid);
  	}
  
+ 	/* Mark base rels as to whether we care about fast-start plans */
+ 	set_base_rel_consider_startup(root);
+ 
  	/*
  	 * Generate access paths for the base rels.
  	 */
*** make_one_rel(PlannerInfo *root, List *jo
*** 172,177 
--- 176,224 
  }
  
 

Re: [CORE] [HACKERS] postpone next week's release

2015-05-31 Thread Michael Paquier
On Sun, May 31, 2015 at 11:03 PM, Bruce Momjian br...@momjian.us wrote:
 On Sun, May 31, 2015 at 09:50:25AM -0400, Bruce Momjian wrote:
  +1. Complexity has increased, and we are actually never at 100% sure
  that a given bug fix does not have side effects on other things, hence
  I think that a portion of this technical debt is the lack of
  regression test coverage, for both existing features and platforms
  (like Windows). The thing is that complexity has increased, but for
  example for many features we lack test coverage, thinking mainly
  replication-related stuff here. Of course we will never get to a level
  of 100% of confidence with just the test coverage and the buildfarm,
  but we should at least try to get closer to such a goal.

 FYI, I realize that one additional thing that has discouraged code
 reorganization is the additional backpatch overhead.  I think we now
 need to accept that our reorganization-adverse approach might have cost
 us some reliability, and that reorganization is going to add work to
 backpatching.

 Actually, code reorganization in HEAD might cause backpatching to be
 more buggy, reducing reliability --- obviously we need to have a
 discussion about that.

As a result, IMO all the folks gathering at PGCon (I won't be there,
sorry, but I read the MLs) should have a talk about that and define a
clear list of items to tackle in terms of reorganization for 9.5, and
then update this page:
https://wiki.postgresql.org/wiki/PostgreSQL_9.5_Open_Items
This does not prevent us from moving on with all the current items and
continuing to review the features that have already been pushed, of course.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] nested loop semijoin estimates

2015-05-31 Thread Tomas Vondra

On 06/01/15 00:08, Tom Lane wrote:

Tomas Vondra tomas.von...@2ndquadrant.com writes:

On 05/30/15 23:16, Tom Lane wrote:

Attached is a draft patch for that.  It fixes the problem for me:



Seems to be working OK, but I still do get a Bitmap Heap Scan there (but
more about that later).


Attached is an incremental patch (on top of the previous one) to
allow startup cost of parameterized paths to be considered when the
relation is the RHS of a semi or anti join. It seems reasonably clean
except for one thing: logically, we perhaps should remove the checks
on path->param_info from the last half of
compare_path_costs_fuzzily(), so as to increase the symmetry between
parameterized paths and unparameterized ones. However, when I did
that, I got changes in some of the regression test plans, and they
didn't seem to be for the better. So I left that alone. As-is, this
patch doesn't seem to affect the results for any existing regression
tests.


Seems to be working fine. I've tried a bunch of queries modifying the 
test case in various ways, and all seem to be planned fine. I've been 
unable to come up with a query that'd get planned badly.


Regarding the remaining checks in compare_path_costs_fuzzily(), isn't 
that simply caused by very small data sets? For example the first 
failing plan in join.sql looks like this:


 Nested Loop Left Join  (cost=0.29..22.80 rows=2 width=12)
   Output: *VALUES*.column1, i1.f1, (666)
   Join Filter: (*VALUES*.column1 = i1.f1)
   ->  Values Scan on *VALUES*  (cost=0.00..0.03 rows=2 width=4)
         Output: *VALUES*.column1
   ->  Materialize  (cost=0.29..22.64 rows=5 width=8)
         Output: i1.f1, (666)
         ->  Nested Loop Left Join  (cost=0.29..22.61 rows=5 width=8)
               Output: i1.f1, 666
               ->  Seq Scan on public.int4_tbl i1  (cost=0.00..1.05 ...
                     Output: i1.f1
               ->  Index Only Scan using tenk1_unique2 on public.tenk1 ...
                     Output: i2.unique2
                     Index Cond: (i2.unique2 = i1.f1)

while with the changes it'd look like this:

 Hash Right Join  (cost=0.34..22.70 rows=2 width=12)
   Output: *VALUES*.column1, i1.f1, (666)
   Hash Cond: (i1.f1 = *VALUES*.column1)
   ->  Nested Loop Left Join  (cost=0.29..22.61 rows=5 width=8)
         Output: i1.f1, 666
         ->  Seq Scan on public.int4_tbl i1  (cost=0.00..1.05 ...
               Output: i1.f1
         ->  Index Only Scan using tenk1_unique2 on public.tenk1 ...
               Output: i2.unique2
               Index Cond: (i2.unique2 = i1.f1)
   ->  Hash  (cost=0.03..0.03 rows=2 width=4)
         Output: *VALUES*.column1
         ->  Values Scan on *VALUES*  (cost=0.00..0.03 rows=2 ...
               Output: *VALUES*.column1
(14 rows)

So the planner actually believes the plan to be cheaper, although only 
by a tiny margin. And I see pretty much no difference in planning/exec 
time (but I'm on a machine with power-management and VMs, so a lot of 
random noise).


But once int4_tbl gets bigger (say, 160k rows instead of the 5), the
hash join clearly wins even with the current code. Actually, it switches
away from the current plan much sooner than that (at 500 rows it's
already using the hash join, and I see about the same timings).
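
The switchover is easy to reproduce by just growing the table; in the
regression database int4_tbl has a single int4 column, f1 (sketch):

INSERT INTO int4_tbl SELECT g FROM generate_series(1, 500) g;
ANALYZE int4_tbl;
-- re-running EXPLAIN on the same query now favors the hash join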


I don't really see why you think those plan changes are bad. And even
if they are, isn't that simply a matter of tuning the cost parameters?




Do you plan to push that into 9.5, or 9.6? I assume it's a
behavior change so that no back-patching, right?


Mumble. It's definitely a planner bug fix, and I believe that the
effects are narrowly constrained to cases where significantly better
plans are possible. So personally I'd be willing to back-patch it.
But others might see that differently, especially since it's been
like this for a long time and we only have one field complaint.


+1 to back-patching from me. It's true we only have one field complaint, 
but I believe there are more users impacted by this. They just didn't 
notice - we only really got this complaint because of the plan 
discrepancy between queries that are almost exactly the same.


--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_xlog -> pg_xjournal?

2015-05-31 Thread Abhijit Menon-Sen
At 2015-05-31 13:46:33 -0400, t...@sss.pgh.pa.us wrote:

 just always create pg_xlog as a symlink to pg_xjournal during initdb.

At first glance, the Subject: of this thread made me think that *was*
Joel's proposal. :-) But I think it's a great idea, and worth doing.

I think pg_journal (no x) is sufficient. The journal is an idea that
people are familiar with from filesystems anyway.

 Note that we'd really also have to rename pg_clog etc

pg_clog could become pg_commits or pg_xactstatus or pg_commit_status or
something. What else is there? I'd hope pg_logical can be left alone.

 A more difficult question is whether we'd also rename pg_resetxlog,
 pg_receivexlog, etc.

I don't think it's necessary. (Of course, people have wanted to rename
pg_resetxlog to make it sound more scary anyway, but that's a different
matter.)

 In the end though, this is a lot of thrashing for a problem that
 only comes up rarely ...

I'll agree with Joel that it comes up far too often for comfort anyway.
I've known a number of people who were on the verge of deleting stuff
from pg_xlog, but just happened to check with me first.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [CORE] [HACKERS] postpone next week's release

2015-05-31 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 FYI, I realize that one additional thing that has discouraged code
 reorganization is the additional backpatch overhead.  I think we now
 need to accept that our reorganization-adverse approach might have cost
 us some reliability, and that reorganization is going to add work to
 backpatching.

 Actually, code reorganization in HEAD might cause backpatching to be
 more buggy, reducing reliability --- obviously we need to have a
 discussion about that.

Commit 6b700301c36e380eb4972ab72c0e914cae60f9fd is a recent real example.
Not that that should dissuade us from ever doing any reorganizations,
but it's foolish to discount back-patching costs.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_xlog -> pg_xjournal?

2015-05-31 Thread Joel Jacobson
While anyone who is familiar with postgres would never do something as
stupid as to delete pg_xlog, according to Google there appears to be a
fair number of end-users out there who have made the irrevocable mistake
of deleting this precious directory, on the assumption that since it has
*log* in the name it must be unimportant
(http://stackoverflow.com/questions/12897429/what-does-pg-resetxlog-do-and-how-does-it-work).

If we could turn back time, would we have picked pg_xlog as the most
optimal name for this important directory, or would we have come up with a
more user-friendly name?

Personally, I have never had any problems with pg_xlog, but I realize there
are quite a few unlucky new users who end up in trouble.

My suggestion is to use pg_xjournal instead of pg_xlog when new users
create a new data directory using initdb, and allow for both directories to
exist (exclusive or, i.e. either one or the other, but not both). That way
we don't complicate life for any existing users: all their tools that rely
on pg_xlog being named pg_xlog will continue to work. We only force new
users to do a bit of googling when some tool of theirs can't find pg_xlog.
Once they learn it's an important directory, they can simply create a
symlink and their old, not-yet-updated tool will work again.

Thoughts?


[HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-31 Thread Noah Misch
On Fri, May 29, 2015 at 10:37:57AM +1200, Thomas Munro wrote:
 On Fri, May 29, 2015 at 7:56 AM, Robert Haas robertmh...@gmail.com wrote:
  - There's a third possible problem related to boundary cases in
  SlruScanDirCbRemoveMembers, but I don't understand that one well
  enough to explain it.  Maybe Thomas can jump in here and explain the
  concern.
 
 I noticed something in passing which is probably not harmful, and not
 relevant to this bug report, it was just a bit confusing while
 testing:  SlruScanDirCbRemoveMembers never deletes any files if
 rangeStart == rangeEnd.  In practice, if you have an idle cluster with
 a lot of multixact data and you VACUUM FREEZE all databases and then
 CHECKPOINT, you might be surprised to see no member files going away
 quite yet, but they'll eventually be truncated by a future checkpoint,
 once rangeEnd has had a chance to advance to the next page due to more
 multixacts being created.
 
 If we want to fix this one day, maybe the right thing to do is to
 treat the rangeStart == rangeEnd case the same way we treat rangeStart
 < rangeEnd, that is, to assume that the range of pages isn't
 wrapped/inverted in this case.

I agree.  Because we round rangeStart down to a segment boundary, oldest and
next member offsets falling on the same page typically implies
rangeStart > rangeEnd.  Only when the page they share happens to be the first
page of a segment does one observe rangeStart == rangeEnd.

While testing this (with inconsistent-multixact-fix-master.patch applied,
FWIW), I noticed a nearby bug with a similar symptom.  TruncateMultiXact()
omits the nextMXact==oldestMXact special case found in each other
find_multixact_start() caller, so it reads the offset of a not-yet-created
MultiXactId.  The usual outcome is to get rangeStart==0, so we truncate less
than we could.  This can't make us truncate excessively, because
nextMXact==oldestMXact implies no table contains any mxid.  If nextMXact
happens to be the first of a segment, an error is possible.  Procedure:

1. Make a fresh cluster.
2. UPDATE pg_database SET datallowconn = true
3. Consume precisely 131071 mxids.  Number of offsets per mxid is unimportant.
4. vacuumdb --freeze --all
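
(Step 3 is the only fiddly part: a MultiXactId is created when more than
one transaction holds a lock on the same row, so the basic unit of
consumption looks like the sketch below; the table name is illustrative,
and you'd loop this until NextMultiXactId reaches the target.)

-- session 1
BEGIN;
SELECT * FROM some_table WHERE id = 1 FOR SHARE;
-- session 2, before session 1 commits:
--   BEGIN; SELECT * FROM some_table WHERE id = 1 FOR SHARE; COMMIT;
COMMIT;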

Expected state after those steps:
$ pg_controldata | grep NextMultiXactId
Latest checkpoint's NextMultiXactId:  131072

Checkpoint will fail like this:
26699 2015-05-31 17:22:33.134 GMT LOG:  statement: checkpoint
26661 2015-05-31 17:22:33.134 GMT DEBUG:  performing replication slot checkpoint
26661 2015-05-31 17:22:33.136 GMT ERROR:  could not access status of 
transaction 131072
26661 2015-05-31 17:22:33.136 GMT DETAIL:  Could not open file 
pg_multixact/offsets/0002: No such file or directory.
26699 2015-05-31 17:22:33.234 GMT ERROR:  checkpoint request failed
26699 2015-05-31 17:22:33.234 GMT HINT:  Consult recent messages in the server 
log for details.
26699 2015-05-31 17:22:33.234 GMT STATEMENT:  checkpoint

This does not block startup, and creating one mxid hides the problem again.
Thus, it is not a top-priority bug like some other parts of this thread.  I
mention it today mostly so it doesn't surprise hackers testing other fixes.

Thanks,
nm


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-31 Thread Noah Misch
Incomplete review, done in a relative rush:

On Fri, May 29, 2015 at 03:08:11PM -0400, Robert Haas wrote:
 OK, here's a patch.  Actually two patches, differing only in
 whitespace, for 9.3 and for master (ha!).  I now think that the root
 of the problem here is that DetermineSafeOldestOffset() and
 SetMultiXactIdLimit() were largely ignorant of the possibility that
 they might be called at points in time when the cluster was
 inconsistent.

A cause perhaps closer to the root is commit f741300 moving truncation from
VACUUM to checkpoints.  CLOG has given us deep experience with VACUUM-time
truncation.  Commit f6a6c46d and this patch are about bringing CHECKPOINT-time
truncation up to the same level.

Speaking of commit f6a6c46d, it seems logical that updating allocation stop
limits should happen proximate to truncation.  That's currently the case for
CLOG (vac_truncate_clog() does both) and pg_multixact/members (checkpoint's
TruncateMultiXact() call does both).  However, pg_multixact/offsets is
truncated from TruncateMultiXact(), but vac_truncate_clog() updates its limit.
I did not distill an errant test case, but this is fishy.

 SetMultiXactIdLimit() bracketed certain parts of its
 logic with if (!InRecovery), but those guards were ineffective because
 it gets called before InRecovery is set in the first place.

SetTransactionIdLimit() checks InRecovery for the same things, and it is
called at nearly the same moments as SetMultiXactIdLimit().  Do you have a
sense of whether it is subject to similar problems as a result?

 1. Moves the call to DetermineSafeOldestOffset() that appears in
 StartupMultiXact() to TrimMultiXact(), so that we don't try to do this
 until we're consistent.  Also, instead of passing
 MultiXactState-oldestMultiXactId, pass the newer of that value and
 the earliest offset that exists on disk.  That way, it won't try to
 read data that's not there.

Perhaps emit a LOG message when we do that, since it's our last opportunity to
point to the potential data loss?

 +  * PostgreSQL 9.3.0 through 9.3.6 and PostgreSQL 9.4.0 through 9.4.1
 +  * had bugs that could allow users who reached those release through

s/release/releases/

 @@ -2859,6 +2947,14 @@ TruncateMultiXact(void)
   SimpleLruTruncate(MultiXactOffsetCtl,
 MultiXactIdToOffsetPage(oldestMXact));
  
 + /* Update oldest-on-disk value in shared memory. */
 + earliest = range.rangeStart * MULTIXACT_OFFSETS_PER_PAGE;
 + if (earliest < FirstMultiXactId)
 + earliest = FirstMultiXactId;
 + LWLockAcquire(MultiXactGenLock, LW_EXCLUSIVE);
 + Assert(MultiXactState->oldestMultiXactOnDiskValid);
 + MultiXactState->oldestMultiXactOnDiskValid = earliest;

That last line needs s/Valid//, I presume.  Is it okay that
oldestMultiXactOnDisk becomes too-old during TruncateMultiXact(), despite its
Valid indicator remaining true?

 +static MultiXactOffset
 +GetOldestMultiXactOnDisk(void)
 +{

 + SlruScanDirectory(MultiXactOffsetCtl, SlruScanDirCbFindEarliest, 
 + &trunc);
 + earliest = trunc.earliestExistingPage * MULTIXACT_OFFSETS_PER_PAGE;
 + if (earliest < FirstMultiXactId)
 + earliest = FirstMultiXactId;

SlruScanDirCbFindEarliest() is only meaningful if the files on disk do not
represent a wrapped state.  When the files do represent a wrapped state,
MultiXactIdPrecedes() is not transitive, and the SlruScanDirCbFindEarliest()
result is sensitive to readdir() order.  This code exists specifically to help
us make do in the face of wrong catalog and pg_control entries.  We may have
wrapped as a result of those same catalog/pg_control entries, so I think
this function ought to account for wrap.  I haven't given enough thought to
what exactly it should do.

Thanks,
nm


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [CORE] [HACKERS] postpone next week's release

2015-05-31 Thread Robert Haas
On Sat, May 30, 2015 at 3:46 PM, Peter Geoghegan p...@heroku.com wrote:
 What, in this release, could break things badly?  RLS? Grouping sets?
 Heikki's WAL format changes?  That last one sounds really scary to me;
 it's painful if not impossible to fix the WAL format in a minor
 release.

 I think we actually have learned some lessons here. MultiXacts were a
 somewhat unusual case for a couple of reasons that I need not rehash.

 In contrast, Heikki's WAL format changes (just for example) are
 fundamentally just a restructuring to the existing format. Sure, there
 could be bugs, but I think that it's fundamentally different to the
 9.3 MultiXact stuff, in that the MultiXact stuff appears to be
 stubbornly difficult to stabilize over months and years. That feels
 like something that is unlikely to be true for anything that made it
 into 9.5.

I hope you're right.  But I don't think any of us foresaw just how bad
the MultiXact thing was likely to be either.

In fact, I think to some extent we may STILL be in denial about how bad it is.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] session_replication_role origin vs local

2015-05-31 Thread Robert Haas
On Fri, May 29, 2015 at 10:49 PM, Peter Eisentraut pete...@gmx.net wrote:
 Does anyone know what the difference between the
 session_replication_role settings of 'origin' vs 'local' is supposed to
 be?  AFAICT, the code treats them the same and has done since this
 feature was initially introduced.

I assume that 'local' is meant to be used on a replica node when
applying non-replicated changes that are only being done locally.  But
that's just a guess.
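
The one observable effect of the setting is on trigger (and rule) firing,
and there 'origin' and 'local' do behave identically today; a short
illustration (table and trigger names are placeholders):

SET session_replication_role = replica;
-- ordinary (ENABLE) triggers stop firing; only those marked
-- ENABLE REPLICA or ENABLE ALWAYS fire in this mode:
ALTER TABLE some_table ENABLE REPLICA TRIGGER some_trigger;
SET session_replication_role = origin;  -- or 'local': same firing rules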

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] problems on Solaris

2015-05-31 Thread Robert Haas
On Sat, May 30, 2015 at 7:09 PM, Andres Freund and...@anarazel.de wrote:
 On 2015-05-27 21:23:34 -0400, Robert Haas wrote:
  Oh wow, that's bad, and could explain a couple of the problems we're
  seeing. One possible way to fix is to replace the sequence with if
  (!TAS(spin)) S_UNLOCK();. But that'd mean TAS() has to be a barrier,
  even if the lock isn't free - which e.g. isn't the case for PowerPC's
  implementation :(

 Another possibility is to make the fallback barrier implementation a
 system call, like maybe kill(PostmasterPid, 0).

 It's not necessarily true that all system calls are effective
 barriers. I'm e.g. doubtful that kill(..., 0) is one as it only performs
 local error checking. It might be that the process existence check
 includes a lock that's sufficient, but I would not like to rely on
 it. Sending an actual signal probably would be, but has the potential of
 disrupting postmaster progress.

So pick a better system call?

 I think we should just bite the bullet and require a barrier
 implementation for all architectures that have spinlock support. That
 should be a fairly straightforward, even though distinctly unpleasurable,
 exercise. And then use semaphores (PGSemaphoreUnlock();PGSemaphoreLock()
 doesn't have the issue that spinlocks have) for --disable-spinlock
 platforms.

Like maybe this.

 If people agree with that way forward, I'll go through the
 platforms. The biggest one missing is probably solaris with sun's
 compiler.

Certainly, having real barriers everywhere would be great.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Join Filter vs. Index Cond (performance regression 9.1->9.2+/HEAD)

2015-05-31 Thread Robert Haas
On Fri, May 29, 2015 at 6:45 PM, Andrew Gierth
and...@tao11.riddles.org.uk wrote:
 Obviously it makes little sense to use an (a,b,c) index to look up just
 (a,b) and then filter on c; the question is, what is the planner doing
 that leads it to get this so wrong? Finding a workaround for it was not
 easy, either - the only thing that I found that worked was replacing the
 t1 join with a lateral join with an OFFSET 0 clause to nobble the
 planner entirely.

I agree.  That looks like a bug.

The fact that it chooses index-only scans is also a little strange,
considering that they seem to be entirely ineffective.  The tables
have never been vacuumed, so presumably, the all-visible bits are all
0, and the planner knows it.
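
That's easy to confirm from the catalogs, since relallvisible is what the
planner consults when costing index-only scans (table names as in the
thread's test case):

SELECT relname, relpages, relallvisible
FROM pg_class
WHERE relname IN ('t1', 't2', 't3', 't4');
-- relallvisible = 0 means every index-only scan still visits the heap;
-- VACUUM sets the all-visible bits and updates this counter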

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [CORE] [HACKERS] postpone next week's release

2015-05-31 Thread Bruce Momjian
On Sun, May 31, 2015 at 08:15:38PM +0900, Michael Paquier wrote:
 On Sun, May 31, 2015 at 11:48 AM, Bruce Momjian wrote:
  On Sat, May 30, 2015 at 10:47:27PM +0200, Andres Freund wrote:
  So, I think we have built up a lot of technical debt. And very little
  effort has been made to fix that; and in the cases where people have the
  reception has often been cool, because refactoring things obviously will
  destabilize in the short term, even if it fixes problems in the long
  term.  I don't think that's sustainable.
 
  Agreed.
 
 +1. Complexity has increased, and we are actually never at 100% sure
 that a given bug fix does not have side effects on other things, hence
 I think that a portion of this technical debt is the lack of
 regression test coverage, for both existing features and platforms
 (like Windows). The thing is that complexity has increased, but for
 example for many features we lack test coverage, thinking mainly
 replication-related stuff here. Of course we will never get to a level
 of 100% of confidence with just the test coverage and the buildfarm,
 but we should at least try to get closer to such a goal.

FYI, I realize that one additional thing that has discouraged code
reorganization is the additional backpatch overhead.  I think we now
need to accept that our reorganization-adverse approach might have cost
us some reliability, and that reorganization is going to add work to
backpatching.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-31 Thread Robert Haas
On Sat, May 30, 2015 at 8:55 PM, Andres Freund and...@anarazel.de wrote:
 Is oldestMulti, nextMulti - 1 really suitable for this? Are both
 actually guaranteed to exist in the offsets slru and be valid?  Hm. I
 guess you intend to simply truncate everything else, but just in
 offsets?

oldestMulti in theory is the right thing, I think, but in actuality we
know that some people have 1 here instead of the correct value.

 One argument against this idea is that we may not want to keep a full
 set of member files on standbys (due to disk space usage), but that's
 what will happen unless we truncate during replay.

 I think that argument is pretty much the death-knell.

Yes.  Truncating on the standby is really not optional.

  I think at least for 9.5+ we should a) invent proper truncation records
  for pg_multixact b) start storing oldestValidMultiOffset in pg_control.
  The current hack of scanning the directories to get knowledge we should
  have is a pretty bad hack, and we should not continue using it forever.
  I think we might end up needing to do a) even in the backbranches.

 Definitely agree with WAL-logging truncations; also +1 on backpatching
 that to 9.3.  We already have experience with adding extra WAL records
 on minor releases, and it didn't seem to have bitten too hard.

 I'm inclined to agree. My only problem is that I'm not sure whether we
 can find a way of doing all this without adding a pg_control field. Let
 me try to sketch this out:

 1) We continue using SlruScanDirectory(SlruScanDirCbFindEarliest)
on the master to find the oldest offsets segment to
truncate. Alternatively, if we determine it to be safe, we could use
oldestMulti to find that.
 2) SlruScanDirCbRemoveMembers is changed to return the range of members
to remove, instead of doing the removal itself
 3) We wal log [oldest offset segment guaranteed to not be alive,
nextmulti) for offsets, and [oldest members segment guaranteed to not be 
 alive,
nextmultioff), and redo truncations for the entire range during
recovery.

 I'm pretty tired right now, but this sounds doable.

I'm probably biased here, but I think we should finish reviewing,
testing, and committing my patch before we embark on designing this.
So far we have no reports of trouble attributable to the lack of the
WAL-logging support discussed here, as opposed to several reports of
trouble from the status quo within days of release.

I'm having trouble reconstructing the series of events where what
you're worried about here really becomes a problem, and I think we
ought to have a very clear idea about that before back-patching
changes of this type.  It's true that if the state of the SLRU
directory is in the future, because recovery was restarted from an
earlier checkpoint, we might replay a checkpoint and remove some of
those files from the future.  But so what?  By the time we've reached
the minimum recovery point, they will have been recreated by the same
WAL records that created them in the first place.  If, in the previous
replay, we had wrapped all the way around, some of the stuff we keep
may actually already have been overwritten by future WAL records, but
they'll be overwritten again.  Now, that could mess up our
determination of which members to remove, I guess, but I'm not clear
that actually matters either: if the offsets space has wrapped around,
the members space will certainly have wrapped around as well, so we
can remove anything we like at this stage and we're still OK.  I agree
this is ugly the way it is, but where is the actual bug?

As far as your actual outline goes, I think if we do this, we need to
be very careful about step #2.  Right now, we decide what we need to
keep and then remove everything else, but that's kind of wonky because
new stuff may be getting created at the same time, so we keep
adjusting our idea of exactly what needs to be removed.  It would be
far better to invert that logic: decide what needs to be removed -
presumably, everything from the oldest member that now exists up until
some later point - and then remove precisely that stuff and nothing
else.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [CORE] [HACKERS] postpone next week's release

2015-05-31 Thread Bruce Momjian
On Sun, May 31, 2015 at 09:50:25AM -0400, Bruce Momjian wrote:
  +1. Complexity has increased, and we are actually never at 100% sure
  that a given bug fix does not have side effects on other things, hence
  I think that a portion of this technical debt is the lack of
  regression test coverage, for both existing features and platforms
  (like Windows). The thing is that complexity has increased, but for
  example for many features we lack test coverage, thinking mainly
  replication-related stuff here. Of course we will never get to a level
  of 100% of confidence with just the test coverage and the buildfarm,
  but we should at least try to get closer to such a goal.
 
 FYI, I realize that one additional thing that has discouraged code
 reorganization is the additional backpatch overhead.  I think we now
 need to accept that our reorganization-adverse approach might have cost
 us some reliability, and that reorganization is going to add work to
 backpatching.

Actually, code reorganization in HEAD might cause backpatching to be
more buggy, reducing reliability --- obviously we need to have a
discussion about that.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers