Re: [HACKERS] inherit support for foreign tables

2015-04-16 Thread Etsuro Fujita
On 2015/03/23 2:57, Tom Lane wrote:
 Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes:
 [ fdw-inh-8.patch ]
 
 I've committed this with some substantial rearrangements, notably:

 * As I mentioned earlier, I got rid of a few unnecessary restrictions on
 foreign tables so as to avoid introducing warts into inheritance behavior.
 In particular, we now allow NOT VALID CHECK constraints (and hence ALTER
 ... VALIDATE CONSTRAINT), ALTER SET STORAGE, and ALTER SET WITH/WITHOUT
 OIDS.  These are probably no-ops anyway for foreign tables, though
 conceivably an FDW might choose to implement some behavior for STORAGE
 or OIDs.

I agree with you on this point.  However, ISTM there is a bug in the
handling of OIDs on foreign tables; while we now allow ALTER SET
WITH/WITHOUT OIDS, we still don't honor the default_with_oids parameter
for foreign tables.  I think that since CREATE FOREIGN TABLE should be
consistent with ALTER FOREIGN TABLE, we should also honor the parameter
for foreign tables.  Attached is a patch for that.
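To illustrate the intended rule, here is a minimal Python sketch of the decision (a hypothetical, simplified rendering; the real interpretOidsOption() is C code in tablecmds.c, and the constant names below are only stand-ins):

```python
# Simplified sketch of the interpretOidsOption() decision after the fix:
# foreign tables are treated like plain relations when applying the
# default_with_oids GUC, not only for explicit WITH/WITHOUT OIDS.
RELKIND_RELATION = "r"
RELKIND_FOREIGN_TABLE = "f"
RELKIND_VIEW = "v"

def interpret_oids_option(options, relkind, default_with_oids):
    """Return True if the new relation should get an OID column."""
    allowed = relkind in (RELKIND_RELATION, RELKIND_FOREIGN_TABLE)
    if "oids" in options:                   # explicit WITH/WITHOUT OIDS
        return bool(options["oids"]) and allowed
    return default_with_oids and allowed    # fall back to the GUC
```

With this rule, CREATE FOREIGN TABLE under default_with_oids = on would create the table with OIDs, consistent with ALTER FOREIGN TABLE ... SET WITH OIDS.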

Best regards,
Etsuro Fujita
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***
*** 580,586  DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
  	descriptor = BuildDescForRelation(schema);
  
 	localHasOids = interpretOidsOption(stmt->options,
! 	   (relkind == RELKIND_RELATION));
 	descriptor->tdhasoid = (localHasOids || parentOidCount > 0);
  
  	/*
--- 580,587 
  	descriptor = BuildDescForRelation(schema);
  
 	localHasOids = interpretOidsOption(stmt->options,
! 	   (relkind == RELKIND_RELATION ||
! 		relkind == RELKIND_FOREIGN_TABLE));
 	descriptor->tdhasoid = (localHasOids || parentOidCount > 0);
  
  	/*

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Supporting src/test/modules in MSVC builds

2015-04-16 Thread Michael Paquier
Hi all,

As mentioned previously
(cab7npqscphafxs2rzeb-fbccjqiknqxjhloztkggim1mf5x...@mail.gmail.com),
attached are patches to add support for src/test/modules in MSVC
builds; the tests of those modules have not been supported since they
were moved out of contrib/:
- 0001 adds support for the build portion.
- 0002 adds support for the installation. I arrived at the conclusion
that those modules should be installed by default, because
contribcheck relies on the tests being run against an already-running
server, and modulescheck should do the same IMO.
- 0003 adds a new target, modulescheck, in vcregress.

Patches 0001 and 0003 are based on some previous work from Alvaro,
which I modified slightly after testing. Note that the split is done
so that each step can be tested on the buildfarm for safety.
Regards,
-- 
Michael
From 1b58eba30d0347a7718a0db18ac6ae2c02888aaf Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Wed, 15 Apr 2015 22:14:33 -0700
Subject: [PATCH 1/3] Include modules of src/test/modules in build

commit_ts, being a module used only for test purposes, is simply ignored
in the process.
---
 src/tools/msvc/Mkvcbuild.pm | 33 +++--
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index e4dbebf..986f3b3 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -28,7 +28,7 @@ my $libpgcommon;
 my $postgres;
 my $libpq;
 
-# Set of variables for contrib modules
+# Set of variables for modules in contrib/ and src/test/modules/
 my $contrib_defines = { 'refint' => 'REFINT_VERBOSE' };
 my @contrib_uselibpq =
   ('dblink', 'oid2name', 'postgres_fdw', 'vacuumlo');
@@ -50,7 +50,7 @@ my $contrib_extraincludes =
 my $contrib_extrasource = {
 	'cube' => [ 'contrib\cube\cubescan.l', 'contrib\cube\cubeparse.y' ],
 	'seg' => [ 'contrib\seg\segscan.l', 'contrib\seg\segparse.y' ], };
-my @contrib_excludes = ('pgcrypto', 'intagg', 'sepgsql');
+my @contrib_excludes = ('pgcrypto', 'commit_ts', 'intagg', 'sepgsql');
 
 # Set of variables for frontend modules
 my $frontend_defines = { 'initdb' => 'FRONTEND' };
@@ -564,15 +564,18 @@ sub mkvcbuild
 	my $mf = Project::read_file('contrib/pgcrypto/Makefile');
 	GenerateContribSqlFiles('pgcrypto', $mf);
 
-	opendir($D, 'contrib') || croak "Could not opendir on contrib!\n";
-	while (my $d = readdir($D))
+	foreach my $subdir ('contrib', 'src/test/modules')
 	{
-		next if ($d =~ /^\./);
-		next unless (-f "contrib/$d/Makefile");
-		next if (grep { /^$d$/ } @contrib_excludes);
-		AddContrib($d);
+		opendir($D, $subdir) || croak "Could not opendir on $subdir!\n";
+		while (my $d = readdir($D))
+		{
+			next if ($d =~ /^\./);
+			next unless (-f "$subdir/$d/Makefile");
+			next if (grep { /^$d$/ } @contrib_excludes);
+			AddContrib($subdir, $d);
+		}
+		closedir($D);
 	}
-	closedir($D);
 
 	$mf =
 	  Project::read_file('src\backend\utils\mb\conversion_procs\Makefile');
@@ -689,14 +692,15 @@ sub AddSimpleFrontend
 # Add a simple contrib project
 sub AddContrib
 {
+	my $subdir = shift;
 	my $n  = shift;
-	my $mf = Project::read_file('contrib\\' . $n . '\Makefile');
+	my $mf = Project::read_file("$subdir/$n/Makefile");
 
 	if ($mf =~ /^MODULE_big\s*=\s*(.*)$/mg)
 	{
 		my $dn = $1;
 		my $proj =
-		  $solution->AddProject($dn, 'dll', 'contrib', 'contrib\\' . $n);
+		  $solution->AddProject($dn, 'dll', 'contrib', "$subdir/$n");
 		$proj->AddReference($postgres);
 		AdjustContribProj($proj);
 	}
@@ -705,8 +709,9 @@ sub AddContrib
 		foreach my $mod (split /\s+/, $1)
 		{
 			my $proj =
-			  $solution->AddProject($mod, 'dll', 'contrib', 'contrib\\' . $n);
-			$proj->AddFile('contrib\\' . $n . '\\' . $mod . '.c');
+			  $solution->AddProject($mod, 'dll', 'contrib', "$subdir/$n");
+			my $filename = $mod . '.c';
+			$proj->AddFile($subdir . '\\' . $n . '\\' . $mod . '.c');
 			$proj->AddReference($postgres);
 			AdjustContribProj($proj);
 		}
@@ -714,7 +719,7 @@ sub AddContrib
 	elsif ($mf =~ /^PROGRAM\s*=\s*(.*)$/mg)
 	{
 		my $proj =
-		  $solution->AddProject($1, 'exe', 'contrib', 'contrib\\' . $n);
+		  $solution->AddProject($1, 'exe', 'contrib', "$subdir/$n");
 		AdjustContribProj($proj);
 	}
 	else
-- 
2.3.5

From e901543c2fa728646dca13a66979a6f0619dd87f Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Wed, 15 Apr 2015 23:13:14 -0700
Subject: [PATCH 2/3] Support installation of test modules in MSVC

Note that those modules are needed in the installation because, like
contribcheck, modulescheck does not create a cluster from scratch on
which to run the tests. It also seems good to include them to be able
to perform those tests using a central installation path.
---
 src/tools/msvc/Install.pm | 151 +-
 1 file changed, 81 insertions(+), 70 deletions(-)

diff --git a/src/tools/msvc/Install.pm b/src/tools/msvc/Install.pm
index 93a6724..c82743d 100644
--- a/src/tools/msvc/Install.pm
+++ b/src/tools/msvc/Install.pm
@@ 

Re: [HACKERS] inherit support for foreign tables

2015-04-16 Thread Kyotaro HORIGUCHI
Hello,

At Thu, 16 Apr 2015 12:20:47 +0900, Etsuro Fujita fujita.ets...@lab.ntt.co.jp 
wrote in 552f2a8f.2090...@lab.ntt.co.jp
 On 2015/04/15 3:52, Alvaro Herrera wrote:
  On 4/14/15 5:49 AM, Etsuro Fujita wrote:
  postgres=# create foreign table ft1 (c1 int) server myserver options
  (table_name 't1');
  CREATE FOREIGN TABLE
  postgres=# create foreign table ft2 (c1 int) server myserver options
  (table_name 't2');
  CREATE FOREIGN TABLE
  postgres=# alter foreign table ft2 inherit ft1;
  ALTER FOREIGN TABLE
  postgres=# select * from ft1 for update;
  ERROR:  could not find junk tableoid1 column
 
  I think this is a bug.  Attached is a patch fixing this issue.
 
  I think the real question is whether we're now (I mean after this
  patch)
  emitting useless tableoid columns that we didn't previously have.  I
  think the answer is yes, and if so I think we need a smaller comb to
  fix
  the problem.  This one seems rather large.
 
 My answer for that would be *no* because I think tableoid would be
 needed for EvalPlanQual checking in more complex SELECT FOR UPDATE on
 the inheritance or UPDATE/DELETE involving the inheritance as a source
 table.  Also, if we allow the FDW to change the behavior of SELECT FOR
 UPDATE so as to match the local semantics exactly, which I'm working
 on in another thread, I think tableoid would also be needed for the
 actual row locking.

Given parent foreign tables, surely they need tableoids for
such usage. The patch preserves the condition rc->isParent, so it
newly affects exactly only parent foreign tables for now.

Before parent foreign tables were introduced, ROW_MARK_COPY and
RTE_RELATION were mutually exclusive, so a tableoid was not needed
(indeed could not exist). But now we can have an RTE with both
ROW_MARK_COPY and RTE_RELATION, and there seems to be no reason for
parent tables of any kind not to have a tableoid. After such
consideration, I came to think that the patch is a reasonable fix,
not merely a workaround.

Thoughts?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: [HACKERS] show xl_prev in xlog.c errcontext

2015-04-16 Thread Heikki Linnakangas

On 04/15/2015 11:35 PM, Alvaro Herrera wrote:

I found this patch in my local repo that I wrote some weeks or months
ago while debugging some XLog corruption problem: it was difficult to
pinpoint what XLog record in a long sequence of WAL files was causing a
problem, and displaying the prev pointer in errcontext made finding
it much easier -- I could correlate it with pg_xlogdump output, I think.


Seems like a good idea, but why print the prev pointer, and not the 
location of the record itself?


- Heikki




Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns

2015-04-16 Thread Etsuro Fujita

On 2015/04/15 2:27, Jim Nasby wrote:

On 4/14/15 1:05 AM, Kyotaro HORIGUCHI wrote:

As an example, the following operations cause an unexpected
result.



Those results are indeed surprising, but since we allow it in a direct
connection I don't see why we wouldn't allow it in the Postgres FDW...

As for the FDW not knowing about keys, why would it need to? If you try
to do something illegal it's the remote side that should throw the
error, not the FDW.

Of course, if you try to do a locking operation on an FDW that doesn't
support it, the FDW should throw an error... but that's different.


Ah, you are right.  FOR NO KEY UPDATE and FOR KEY SHARE would be useful 
in the Postgres FDW if we assume the user performs those properly based 
on information about keys for a remote table.


Sorry, my explanation was not correct, but I want to make it clear that 
the proposed patch also allows the FDW to change the behavior of FOR NO 
KEY UPDATE and/or FOR KEY SHARE row locking so as to match the local 
semantics exactly.


BTW, I revised docs a bit.  Attached is an updated version of the patch.

Best regards,
Etsuro Fujita
*** a/doc/src/sgml/fdwhandler.sgml
--- b/doc/src/sgml/fdwhandler.sgml
***
*** 211,216  BeginForeignScan (ForeignScanState *node,
--- 211,232 
  </para>
  
  <para>
+  If the FDW changes the handling of <command>SELECT FOR UPDATE/SHARE</command>
+  row locking, this routine should perform any initialization needed prior to
+  the actual row locking if the foreign table is referenced in a
+  <command>SELECT FOR UPDATE/SHARE</command>.  To decide whether the foreign
+  table is referenced in the command or not, it's recommended to use
+  <function>ExecFindRowMark</function> with the <literal>missing_ok</literal>
+  argument set to true, which returns an <structname>ExecRowMark</structname>
+  struct if the foreign table is referenced in the command and otherwise
+  returns NULL.  Information about the row locking is accessible through the
+  <structname>ExecRowMark</structname> struct.
+  (The <structfield>fdw_state</structfield> field of
+  <structname>ExecRowMark</structname> is available for the FDW to store any
+  private state it needs for the row locking.)
+ </para>
+ 
+ <para>
   Note that when <literal>(eflags &amp; EXEC_FLAG_EXPLAIN_ONLY)</literal> is
   true, this function should not perform any externally-visible actions;
   it should only do the minimum required to make the node state valid
***
*** 598,603  IsForeignRelUpdatable (Relation rel);
--- 614,701 
  
 </sect2>
  
+<sect2 id="fdw-callbacks-row-locking">
+ <title>FDW Routines For <command>SELECT FOR UPDATE/SHARE</command> row locking
+  </title>
+ 
+ <para>
+  If an FDW supports <command>SELECT FOR UPDATE/SHARE</command> row locking,
+  it should provide the following callback functions:
+ </para>
+ 
+ <para>
+ <programlisting>
+ RowMarkType
+ GetForeignRowMarkType (LockClauseStrength strength);
+ </programlisting>
+ 
+  Report which row-marking option to support for a lock strength
+  associated with a <command>SELECT FOR UPDATE/SHARE</command> request.
+  This is called at the beginning of query planning.
+  <literal>strength</literal> is a member of the
+  <literal>LockClauseStrength</literal> enum type.
+  The return value should be a member of the <literal>RowMarkType</literal>
+  enum type.  See
+  <filename>src/include/nodes/lockoptions.h</filename> and
+  <filename>src/include/nodes/plannodes.h</filename> for information about
+  these enum types.
+ </para>
+ 
+ <para>
+  If the <function>GetForeignRowMarkType</function> pointer is set to
+  <literal>NULL</literal>, the default option is selected for any lock
+  strength, and neither <function>LockForeignRow</function> nor
+  <function>FetchForeignRow</function> described below will be called at
+  query execution time.
+ </para>
+ 
+ <para>
+ <programlisting>
+ bool
+ LockForeignRow (EState *estate,
+                 ExecRowMark *erm,
+                 ItemPointer tupleid);
+ </programlisting>
+ 
+  Lock one tuple in the foreign table.
+  <literal>estate</literal> is the global execution state for the query.
+  <literal>erm</literal> is the <structname>ExecRowMark</structname> struct
+  describing the target foreign table.
+  <literal>tupleid</literal> identifies the tuple to be locked.
+  This function should return <literal>true</literal> if the FDW locks the
+  tuple successfully; otherwise, it should return <literal>false</literal>.
+ </para>
+ 
+ <para>
+  If the <function>LockForeignRow</function> pointer is set to
+  <literal>NULL</literal>, attempts to lock rows will fail
+  with an error message.
+ </para>
+ 
+ <para>
+ <programlisting>
+ HeapTuple
+ FetchForeignRow (EState *estate,
+                  ExecRowMark *erm,
+                  ItemPointer tupleid);
+ </programlisting>
+ 
+  Fetch one tuple from the foreign table.
+  <literal>estate</literal> is the global execution state for the query.
+  <literal>erm</literal> is the <structname>ExecRowMark</structname> struct
+  describing the target foreign table.
+  <literal>tupleid</literal> identifies the tuple to be fetched.
+  This function should return the fetched 

Re: [HACKERS] show xl_prev in xlog.c errcontext

2015-04-16 Thread Michael Paquier
On Thu, Apr 16, 2015 at 3:25 PM, Heikki Linnakangas hlinn...@iki.fi wrote:
 On 04/15/2015 11:35 PM, Alvaro Herrera wrote:

 I found this patch in my local repo that I wrote some weeks or months
 ago while debugging some XLog corruption problem: it was difficult to
 pinpoint what XLog record in a long sequence of WAL files was causing a
 problem, and displaying the prev pointer in errcontext made finding
 it much easier -- I could correlate it with pg_xlogdump output, I think.


 Seems like a good idea, but why print the prev pointer, and not the location
 of the record itself?

And both?
-- 
Michael




Re: [HACKERS] Fix broken Install.bat when target directory contains a space

2015-04-16 Thread Asif Naeem
Hi Michael,

I spent some time looking into the patch. Good catch; I am also surprised
to see that the current Windows install script doesn't support spaces in
the path. Please see my findings below:

*Without the patch*

1.

 C:\PG\postgresql\src\tools\msvcinstall C:\PG\postgresql\inst with space
 without patch
 with was unexpected at this time.


2.

 C:\PG\postgresql\src\tools\msvcinstall
 Invalid command line options.
 Usage: install.bat path


3.

 C:\PG\postgresql\src\tools\msvcinstall /?
 Installing version 9.5 for release in /?
 Copying build output files...Could not copy release\postgres\postgres.exe
 to /?\bin\postgres.exe
 at Install.pm line 40
 Install::lcopy('release\postgres\postgres.exe', '/?\bin\postgres.exe')
 called at Install.pm line 324
 Install::CopySolutionOutput('release', '/?') called at Install.pm line 93
 Install::Install('/?', undef) called at install.pl line 13


*With the patch*

1.

 C:\PG\postgresql\src\tools\msvcinstall C:\PG\postgresql\inst with space
 without patch
 Works fine.


2.

 C:\PG\postgresql\src\tools\msvcinstall
 Usage: install.pl targetdir [installtype]
 installtype: client


3.

 C:\PG\postgresql\src\tools\msvcinstall /?
 Invalid command line options.
 Usage: install.bat path


The following change looks confusing to me:

 -if NOT "%1"=="" GOTO RUN_INSTALL
 +if NOT [%1]==[/?] GOTO RUN_INSTALL


Along with fixing spaces in the installation path, it also changes the
behavior of the install script. Why not just if NOT [%1]==[] GOTO
RUN_INSTALL? Checking for /? seems good for a help message, but it does
not seem well handled in the script (it still says "Invalid command line
options."), and the /? option does not seem to be handled by any other
.bat build script. Other than this, with the patch applied, is it
acceptable behavior that (2) shows the usage message as 'Usage:
install.pl targetdir [installtype]' while (3) shows it as 'Usage:
install.bat path'? Thanks.

Regards,
Muhammad Asif Naeem

On Mon, Mar 2, 2015 at 9:27 AM, Michael Paquier michael.paqu...@gmail.com
wrote:

 Hi all,

 When using install.bat with a path containing spaces, I got surprised
 by a couple of errors.
 1) First with this path:
 $ install c:\Program Files\pgsql
 I am getting the following error:
 Files\pgsql"=="" was unexpected at this time.
 This is caused by an incorrect evaluation of the first parameter in
 install.bat.
 2) After correcting the first error, the path is truncated to
 c:\Program because first argument value does not seem to be correctly
 parsed when used with install.pl.

 Attached is a patch fixing both problems. I imagine that it would be
 good to get that backpatched.
 Regards,
 --
 Michael






[HACKERS] Minor improvement to config.sgml

2015-04-16 Thread Etsuro Fujita
Hi,

Attached is a small patch to mark up "on" with <literal> in
doc/src/sgml/config.sgml.

Best regards,
Etsuro Fujita
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b30c68d..0d8624a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6752,7 +6752,7 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
     determines whether OIDs will be included in tables created by
     <command>SELECT INTO</command>. The parameter is <literal>off</literal>
     by default; in <productname>PostgreSQL</productname> 8.0 and earlier, it
-    was on by default.
+    was <literal>on</literal> by default.
    </para>
 
    <para>



Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-16 Thread Andres Freund
On 2015-04-15 17:58:54 +0300, Heikki Linnakangas wrote:
 When the speculative insertion is finished, write a new kind of a WAL record
 for that. The record only needs to contain the ctid of the tuple. Replaying
 that record will clear the flag on the heap tuple that said that it was a
 speculative insertion.

 In logical decoding, decode speculative insertions like any other insertion.
 To decode a super-deletion record, scan the reorder buffer for the
 transaction to find the corresponding speculative insertion record for the
 tuple, and remove it.

 BTW, that'd work just as well without the new WAL record to finish a
 speculative insertion. Am I missing something?

Completely independent of logical decoding, I'm of the *VERY* strong
opinion that 'speculative insertions' should never be visible when
looking with normal snapshots. For one, it allows simplifying
considerations around wraparound (which has proven to be a very good
idea, c.f. multixacts + vacuum causing data corruption years after it
was thought to be harmless). For another, it allows reclaiming or
redefining the bit after a database restart/upgrade. Given how complex
this is and how scarce flags are, that seems like a really good property.

And keeping those flags from being visible to the outside requires a WAL
record; otherwise it won't be correct on the standby.
Andres




Re: [HACKERS] PATCH: default_index_tablespace

2015-04-16 Thread Greg Stark
On 15 Apr 2015 19:12, Tom Lane t...@sss.pgh.pa.us wrote:

 I'm afraid this idea is a nonstarter, because it will break existing
 applications, and in particular existing pg_dump output files, which
 expect to be able to determine an index's tablespace by setting
 default_tablespace.  (It is *not* adequate that the code falls back
 to default_tablespace if the new GUC is unset; if it is set, you've
 still broken pg_dump.)  The incremental value, if indeed there is any,
 of being able to control index positioning this way seems unlikely to
 justify a backwards-compatibility break of such magnitude.

Just brainstorming here, but that just means default_tablespace needs to
take precedence. We could have a default_table_tablespace and a
default_index_tablespace which default_tablespace overrides. Or we could
allow a mini config language in default_tablespace, like
table=space1,index=space2.
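For illustration only, parsing such a mini language could be as simple as the following sketch (hypothetical syntax; no such GUC exists today):

```python
def parse_default_tablespace(value):
    """Parse either a plain tablespace name or a hypothetical
    'table=space1,index=space2' style value into a per-kind mapping."""
    if "=" not in value:
        # Plain name: applies to both tables and indexes, as today.
        return {"table": value, "index": value}
    out = {}
    for part in value.split(","):
        kind, _, space = part.partition("=")
        out[kind.strip()] = space.strip()
    return out
```

A plain value keeps today's semantics, while the key=value form would let tables and indexes land in different tablespaces.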


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-16 Thread Greg Stark
On 15 Apr 2015 15:43, Simon Riggs si...@2ndquadrant.com wrote:

 It all depends upon who is being selfish. Why is a user selfish for
 not wanting to clean every single block they scan, when the people
 that made the mess do nothing and go faster 10 minutes from now?
 Randomly and massively penalising large SELECTs makes no sense. Some
 cleanup is OK, with reasonable limits, which is why that is proposed.

I don't think it's productive to think of a query as a different actor with
only an interest in its own performance and no interest in overall system
performance.

From a holistic point of view the question is how many times a given hot
chain is going to need to be followed before it's pruned. Or to put it another
way, how expensive is creating a hot chain. Does it cause a single prune? a
fixed number of chain readers followed by a prune? Does the amount of work
depend on the workload or is it consistent?

My intuition is that a fixed cutoff like five pages is dangerous because
if you update many pages there's no limit to the number of times they'll be
read before they're all pruned. The steady state could easily be that every
query is having to read hot chains forever.

My intuition, again, is that what we need is a percentage, such as "do 10
prunes, then ignore the next 1000 clean pages with hot chains". That
guarantees that after 100 selects the hot chains will all be pruned, but
each select will only prune 1% of the clean pages it sees.
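As a rough sketch of that rate-limiting idea (hypothetical numbers and names, not actual PostgreSQL code), the policy could be a simple per-scan counter:

```python
def make_prune_policy(prune_burst=10, skip_after=1000):
    """Sketch of the 'do N prunes, then ignore the next M prunable pages'
    heuristic: each scan prunes roughly prune_burst out of every
    prune_burst + skip_after prunable pages it visits."""
    state = {"pruned": 0, "skipped": 0}

    def should_prune_page():
        if state["pruned"] < prune_burst:
            state["pruned"] += 1
            return True
        state["skipped"] += 1
        if state["skipped"] >= skip_after:
            # Start a new cycle: the next pages may be pruned again.
            state["pruned"] = 0
            state["skipped"] = 0
        return False

    return should_prune_page
```

Over 1010 prunable pages this prunes the first 10 and skips the next 1000, i.e. each scan cleans about 1% of the prunable pages it sees while still guaranteeing that repeated scans eventually prune everything.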


Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-16 Thread Andres Freund
On 2015-04-15 18:53:15 +0300, Heikki Linnakangas wrote:
 Hmm, ok, I've read the INSERT ... ON CONFLICT UPDATE and logical decoding
 thread now, and I have to say that IMHO it's a lot more sane to handle this
 in ReorderBufferCommit() like Peter first did, than to make the main
 insertion path more complex like this.

I don't like Peter's way much. For one, it's just broken. For another,
it's quite annoying to trigger disk reads just to figure out the actual
type of a record.

If we go that way, here is how I think it should be done: whenever we
encounter a speculative record, we 'unlink' it from the changes that will
be reused for spooling from disk and do nothing further. Then we just
continue reading through the records. If the next thing we encounter is
a super-deletion, we throw away that record. If it's another type of
change (or, even better, a 'speculative insertion succeeded' record),
we insert it. That'll still require some ugliness, but it should not be
too bad.

I earlier thought that'd not be OK because there could be a new catalog
snapshot in between, but I was mistaken: the locking on the source
transaction prevents problems.

 Another idea is to never spill the latest record to disk, at least if it was
 a speculative insertion. Then you would be sure that when you see the
 super-deletion record, the speculative insertion it refers to is still in
 memory. That seems simple.

It's not guaranteed to be the last record; there can be records in
between (snapshots from other backends, at the very least).


Greetings,

Andres Freund




Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-16 Thread Pavan Deolasee
On Thu, Apr 16, 2015 at 2:47 PM, Greg Stark st...@mit.edu wrote:


 On 15 Apr 2015 15:43, Simon Riggs si...@2ndquadrant.com wrote:
 
  It all depends upon who is being selfish. Why is a user selfish for
  not wanting to clean every single block they scan, when the people
  that made the mess do nothing and go faster 10 minutes from now?
  Randomly and massively penalising large SELECTs makes no sense. Some
  cleanup is OK, with reasonable limits, which is why that is proposed.

 I don't think it's productive to think of a query as a different actor
 with only an interest in its own performance and no interest in overall
 system performance.

  From a holistic point of view the question is how many times a given
  hot chain is going to need to be followed before it's pruned. Or to put it
 another way, how expensive is creating a hot chain. Does it cause a single
 prune? a fixed number of chain readers followed by a prune? Does the amount
 of work depend on the workload or is it consistent?


IMO the size or traversal of the HOT chain is not that expensive compared
to the cost of pruning too frequently, which generates WAL as well as
dirties buffers. OTOH the cost of pruning less frequently could also be
very high. It can cause severe table bloat which may just stay around for
a very long time. Even if dead space is recovered within a page,
truncating a bloated heap is not always possible. In such cases, even
SELECTs would be slowed down just because they need to read/scan far more
pages than they otherwise would have. IOW it's probably wrong to assume
that not pruning quickly enough will have an impact only on non-SELECT
queries.

I also concur with arguments upthread that this change needs to be
carefully calibrated because it can lead to significant degradation for
certain workloads.

 My intuition, again, is that what we need is a percentage such as do 10
 prunes then ignore the next 1000 clean pages with hot chains. That
 guarantees that after 100 selects the hot chains will all be pruned but
 each select will only prune 1% of the clean pages it sees.

I think some such proposal was made in the past. There could be a knob to
control how much HOT cleanup a read-only query (or maybe a read-only
transaction) should do, say as a percentage of the pages it looks at. The
default can be left at 100% in the first release so that the current
behaviour is not suddenly disrupted, but it will allow others to play with
the percentages, and then, based on field reports, we can change the
defaults in the next releases.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] FPW compression leaks information

2015-04-16 Thread Michael Paquier
On Wed, Apr 15, 2015 at 9:42 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Wed, Apr 15, 2015 at 9:20 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 On Wed, Apr 15, 2015 at 2:22 PM, Fujii Masao wrote:
 On Wed, Apr 15, 2015 at 11:55 AM, Michael Paquier wrote:
 1) Doc patch to mention that it is possible that compression can give
 hints to attackers when working on sensitive fields that have a
 non-fixed size.

 I think that this patch is enough as the first step.

 I'll get something done for that at least, a big warning below the
 description of wal_compression would do it.

So here is a patch for this purpose, with the following text being used:
+   <warning>
+    <para>
+     When enabling <varname>wal_compression</varname>, there is a risk
+     of leaking data, similar to the BREACH and CRIME attacks on SSL, where
+     the compression ratio of a full page image gives a hint of the
+     existing data of this page.  Tables that contain sensitive
+     information like <structname>pg_authid</structname> with password
+     data could be potential targets of such attacks. Note that as a
+     prerequisite a user needs to be able to insert data on the same page
+     as the targeted data, and needs to be able to detect checkpoint
+     presence to find out if a compressed full page write is included in
+     WAL, so as to calculate the compression ratio of the page using WAL
+     positions before and after inserting data on the page.
+    </para>
+   </warning>

Comments and reformulations are welcome.
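For intuition, the side channel can be demonstrated with any LZ-style compressor; this is a toy Python sketch using zlib, not PostgreSQL's pglz, and the data and names are purely illustrative:

```python
import zlib

def fpw_compressed_size(page: bytes) -> int:
    # Stand-in for compressing a full-page image under wal_compression.
    return len(zlib.compress(page))

secret = b"md5s3cr3tpasswordhash"      # existing sensitive row on the page
page = b"page-header" + secret

# The attacker inserts guesses on the same page and compares compressed
# sizes: a guess that repeats the secret compresses noticeably better,
# because the compressor emits a back-reference instead of literals.
wrong_guess = bytes(range(33, 33 + len(secret)))   # unrelated bytes
size_wrong = fpw_compressed_size(page + wrong_guess)
size_right = fpw_compressed_size(page + secret)
assert size_right < size_wrong
```

The size difference is exactly the hint the proposed warning describes: observing compressed full-page-image sizes before and after an insertion can confirm or refute a guess about co-located data.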
Regards,
-- 
Michael
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b30c68d..2f61e29 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2303,6 +2303,22 @@ include_dir 'conf.d'
 but at the cost of some extra CPU spent on the compression during
 WAL logging and on the decompression during WAL replay.
    </para>
+
+   <warning>
+    <para>
+     When enabling <varname>wal_compression</varname>, there is a risk
+     of leaking data, similar to the BREACH and CRIME attacks on SSL, where
+     the compression ratio of a full page image gives a hint of the
+     existing data of this page.  Tables that contain sensitive
+     information like <structname>pg_authid</structname> with password
+     data could be potential targets of such attacks. Note that as a
+     prerequisite a user needs to be able to insert data on the same page
+     as the targeted data, and needs to be able to detect checkpoint
+     presence to find out if a compressed full page write is included in
+     WAL, so as to calculate the compression ratio of the page using WAL
+     positions before and after inserting data on the page.
+    </para>
+   </warning>
   </listitem>
  </varlistentry>
 



Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2015-04-16 Thread Etsuro Fujita

On 2015/03/05 21:08, Etsuro Fujita wrote:

Here is an updated version.



The EXPLAIN output has also been improved as discussed in [1].


I noticed that the EXPLAIN output for a pushed-down update (delete) on
inheritance children doubly displays Foreign Update (Foreign Delete),
once for the ForeignScan and once for the ModifyTable.  Here is an example:


postgres=# explain verbose update parent set c1 = c1;
  QUERY PLAN
--
 Update on public.parent  (cost=0.00..364.54 rows=4819 width=10)
   Update on public.parent
   Foreign Update on public.ft1
   Foreign Update on public.ft2
   -  Seq Scan on public.parent  (cost=0.00..0.00 rows=1 width=10)
 Output: parent.c1, parent.ctid
   -  Foreign Update on public.ft1  (cost=100.00..182.27 rows=2409 
width=10)

 Remote SQL: UPDATE public.t1 SET c1 = c1
   -  Foreign Update on public.ft2  (cost=100.00..182.27 rows=2409 
width=10)

 Remote SQL: UPDATE public.t2 SET c1 = c1
(10 rows)

Should we do something?  Suggestions are welcome.

Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/31942.1410534...@sss.pgh.pa.us




Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2015-04-16 Thread Etsuro Fujita
On 2015/04/16 19:57, Amit Langote wrote:
 On 16-04-2015 PM 07:50, Etsuro Fujita wrote:
 The EXPLAIN output has also been improved as discussed in [1].

 I noticed that the EXPLAIN for a pushed-down update (delete) on inheritance
 children displays Foreign Update (Foreign Delete) twice, once for
 ForeignScan and once for ModifyTable.  Here is an example:

 postgres=# explain verbose update parent set c1 = c1;
QUERY PLAN
 --
   Update on public.parent  (cost=0.00..364.54 rows=4819 width=10)
 Update on public.parent
 Foreign Update on public.ft1
 Foreign Update on public.ft2
 ->  Seq Scan on public.parent  (cost=0.00..0.00 rows=1 width=10)
       Output: parent.c1, parent.ctid
 ->  Foreign Update on public.ft1  (cost=100.00..182.27 rows=2409 width=10)
       Remote SQL: UPDATE public.t1 SET c1 = c1
 ->  Foreign Update on public.ft2  (cost=100.00..182.27 rows=2409 width=10)
       Remote SQL: UPDATE public.t2 SET c1 = c1
 (10 rows)

 Should we do something?  Suggestions are welcome.

From what I see in Tom's commit message[0] for FTI patch, this shouldn't be,
 right?
 
 To be specific, there should be Foreign Scan there as per the commit. Am I
 missing something?

As shown in the example below, this patch doesn't change the EXPLAIN
output for non-pushed-down update (delete) cases, but since we changed
the EXPLAIN output as discussed in [1], the patch displays Foreign
Update (Foreign Delete) twice for pushed-down update (delete) cases
like the example above.

postgres=# explain verbose update parent set c1 = trunc(random() * 9 +
1)::int;
 QUERY PLAN
-
 Update on public.parent  (cost=0.00..452.06 rows=5461 width=6)
   Update on public.parent
   Foreign Update on public.ft1
 Remote SQL: UPDATE public.t1 SET c1 = $2 WHERE ctid = $1
   Foreign Update on public.ft2
 Remote SQL: UPDATE public.t2 SET c1 = $2 WHERE ctid = $1
   ->  Seq Scan on public.parent  (cost=0.00..0.01 rows=1 width=6)
         Output: (trunc(((random() * '9'::double precision) + '1'::double precision)))::integer, parent.ctid
   ->  Foreign Scan on public.ft1  (cost=100.00..226.03 rows=2730 width=6)
         Output: (trunc(((random() * '9'::double precision) + '1'::double precision)))::integer, ft1.ctid
         Remote SQL: SELECT ctid FROM public.t1 FOR UPDATE
   ->  Foreign Scan on public.ft2  (cost=100.00..226.03 rows=2730 width=6)
         Output: (trunc(((random() * '9'::double precision) + '1'::double precision)))::integer, ft2.ctid
         Remote SQL: SELECT ctid FROM public.t2 FOR UPDATE
(14 rows)

Best regards,
Etsuro Fujita




Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2015-04-16 Thread Amit Langote
On 16-04-2015 PM 07:50, Etsuro Fujita wrote:
 The EXPLAIN output has also been improved as discussed in [1].
 
 I noticed that the EXPLAIN for a pushed-down update (delete) on inheritance
 children displays Foreign Update (Foreign Delete) twice, once for
 ForeignScan and once for ModifyTable.  Here is an example:
 
 postgres=# explain verbose update parent set c1 = c1;
   QUERY PLAN
 --
  Update on public.parent  (cost=0.00..364.54 rows=4819 width=10)
Update on public.parent
Foreign Update on public.ft1
Foreign Update on public.ft2
 ->  Seq Scan on public.parent  (cost=0.00..0.00 rows=1 width=10)
       Output: parent.c1, parent.ctid
 ->  Foreign Update on public.ft1  (cost=100.00..182.27 rows=2409 width=10)
       Remote SQL: UPDATE public.t1 SET c1 = c1
 ->  Foreign Update on public.ft2  (cost=100.00..182.27 rows=2409 width=10)
       Remote SQL: UPDATE public.t2 SET c1 = c1
 (10 rows)
 
 Should we do something?  Suggestions are welcome.
 

From what I see in Tom's commit message[1] for the FTI patch, this shouldn't be,
right?

To be specific, there should be Foreign Scan there as per the commit. Am I
missing something?

Thanks,
Amit

[1]
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=cb1ca4d800621dcae67ca6c799006de99fa4f0a5





Re: [HACKERS] Streaming replication and WAL archive interactions

2015-04-16 Thread Heikki Linnakangas

On 03/01/2015 12:36 AM, Venkata Balaji N wrote:

Patch did not get applied successfully to the latest master. Can you please
rebase.


Here you go.

On 01/31/2015 03:07 PM, Andres Freund wrote:

On 2014-12-19 22:56:40 +0200, Heikki Linnakangas wrote:

This adds two new archive_modes, 'shared' and 'always', to indicate whether
the WAL archive is shared between the primary and standby, or not. In
shared mode, the standby tracks which files have been archived by the
primary. The standby refrains from recycling files that the primary has
not yet archived, and at failover, the standby archives all those files too
from the old timeline. In 'always' mode, the standby's WAL archive is
taken to be separate from the primary's, and the standby independently
archives all files it receives from the primary.


I don't really like this approach. Sharing a archive is rather dangerous
in my experience - if your old master comes up again (and writes in the
last wal file) or similar, you can get into really bad situations.


It doesn't have to actually be shared. The master and standby could 
archive to different locations, but the responsibility of archiving is 
shared, so that on promotion, the standby ensures that every WAL file 
gets archived. If the master didn't do it, then the standby will.


Yes, if the master comes up again, it might try to archive a file that 
the standby already archived. But that's not so bad. Both copies of the 
file will be identical. You could put logic in archive_command to check, 
if the file already exists in the archive, whether the contents are 
identical, and return success without doing anything if they are.
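A minimal sketch of such an idempotent archiving helper, which an archive_command wrapper could call (names and paths are illustrative; this is not an official tool):

```python
import filecmp
import os
import shutil

def archive_once(src: str, archive_dir: str, fname: str) -> bool:
    """Archive one WAL segment idempotently.  If an identical copy is
    already present, report success; if a *different* file occupies the
    name, fail so the conflict is noticed.  Returns True on success."""
    dst = os.path.join(archive_dir, fname)
    if os.path.exists(dst):
        # Re-archiving the same segment (e.g. by a promoted standby) is
        # harmless only when the contents are byte-identical.
        return filecmp.cmp(src, dst, shallow=False)
    tmp = dst + ".tmp"
    shutil.copyfile(src, tmp)
    os.rename(tmp, dst)   # atomic on POSIX: never expose a partial file
    return True
```

A hypothetical archive_command wrapper script could invoke this with %p and %f and map True/False to exit status 0/1, so both the primary and a promoted standby can safely target the same archive.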


Oh, hang on, that's not necessarily true. On promotion, the standby 
archives the last, partial WAL segment from the old timeline. That's 
just wrong 
(http://www.postgresql.org/message-id/52fcd37c.3070...@vmware.com), and 
in fact I somehow thought I changed that already, but apparently not. So 
let's stop doing that.



What I was thinking about was instead trying to detect the point up to
which files were safely archived by running restore command to check for
the presence of archived files. Then archive anything that has valid
content and isn't yet archived. That doesn't sound particularly
complicated to me.


Hmm. That assumes that the standby has a valid restore_command, and can 
access the WAL archive. Not a too unreasonable requirement I guess, but 
with the scheme I proposed, it's not necessary. Seems a bit silly to 
copy a whole segment from the archive just to check if it exists, though.


- Heikki

From db5c4311baf4e3a2ae3308c4d0d9975ee3692a18 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas heikki.linnakangas@iki.fi
Date: Thu, 16 Apr 2015 14:40:24 +0300
Subject: [PATCH v2 1/1] Make WAL archival behave more sensibly in standby
 mode.

This adds two new archive_modes, 'shared' and 'always', to indicate whether
the WAL archive is shared between the primary and standby, or not. In
shared mode, the standby tracks which files have been archived by the
primary. The standby refrains from recycling files that the primary has
not yet archived, and at failover, the standby archives all those files too
from the old timeline. In 'always' mode, the standby's WAL archive is
taken to be separate from the primary's, and the standby independently
archives all files it receives from the primary.

Fujii Masao and me.
---
 doc/src/sgml/config.sgml  |  12 +-
 doc/src/sgml/high-availability.sgml   |  48 +++
 doc/src/sgml/protocol.sgml|  31 +
 src/backend/access/transam/xlog.c |  29 -
 src/backend/postmaster/postmaster.c   |  37 --
 src/backend/replication/walreceiver.c | 172 --
 src/backend/replication/walsender.c   |  47 +++
 src/backend/utils/misc/guc.c  |  21 ++--
 src/backend/utils/misc/postgresql.conf.sample |   2 +-
 src/include/access/xlog.h |  14 ++-
 10 files changed, 351 insertions(+), 62 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b30c68d..e352b8e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2521,7 +2521,7 @@ include_dir 'conf.d'
 
     <variablelist>
      <varlistentry id="guc-archive-mode" xreflabel="archive_mode">
-      <term><varname>archive_mode</varname> (<type>boolean</type>)
+      <term><varname>archive_mode</varname> (<type>enum</type>)
       <indexterm>
        <primary><varname>archive_mode</> configuration parameter</primary>
       </indexterm>
@@ -2530,7 +2530,15 @@ include_dir 'conf.d'
       <para>
        When <varname>archive_mode</> is enabled, completed WAL segments
        are sent to archive storage by setting
-       <xref linkend="guc-archive-command">.
+       <xref linkend="guc-archive-command">. In addition to <literal>off</>,
+       to disable, there are three modes: <literal>on</>, <literal>shared</>,
+       and <literal>always</>. During normal operation, there is no
+

Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-16 Thread Alvaro Herrera
Pavan Deolasee wrote:
 On Thu, Apr 16, 2015 at 2:47 PM, Greg Stark st...@mit.edu wrote:

  From a holistic point of view the question is how many times is a given
  hit chain going to need to be followed before it's pruned. Or to put it
  another way, how expensive is creating a hot chain. Does it cause a single
  prune? a fixed number of chain readers followed by a prune? Does the amount
  of work depend on the workload or is it consistent?
 
 IMO the size or traversal of the HOT chain is not that expensive compared
 to the cost of either pruning too frequently, which generates WAL as well
 as makes buffers dirty. OTOH cost of less frequent pruning could also be
 very high. It can cause severe table bloat which may just stay for a very
 long time. Even if dead space is recovered within a page, truncating a
 bloated heap is not always possible.

I think you're failing to consider that in the patch there is a
distinction between read-only page accesses and page updates.  During a
page update, HOT cleanup is always done even with the patch, so there
won't be any additional bloat that would not be there without the patch.
It's only the read-only accesses to the page that skip the HOT pruning.

Of course, as Greg says there will be some additional scans of the HOT
chain by read-only processes.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PATCH: default_index_tablespace

2015-04-16 Thread David Steele
On 4/15/15 11:33 PM, Amit Kapila wrote:
 On Thu, Apr 16, 2015 at 8:01 AM, Bruce Momjian br...@momjian.us wrote:

 On Wed, Apr 15, 2015 at 07:12:11PM -0400, Tom Lane wrote:
  jltal...@adv-solutions.net writes:
   This small patch implements a new GUC (default_index_tablespace) plus
   supporting code.
   Originated from a customer request, the feature intends to make
   creation of indexes on SSD-backed tablespaces easy and convenient
   (almost transparent) for users: the DBA can just set it and
 indexes will
   be placed in the specified tablespace --as opposed to the same
   tablespace where the referenced table is-- without having to
 specify it
   every time.
 
 
 Another way to provide different default tablespace for index could be
 to provide it at Database level.  Have a new option INDEX_TABLESPACE
 in Create Database command which can be used to create indexes
 when not specified during Create Index command.  This would also need
 changes in pg_dump (like while dumping info about database) but on
 initial look, it seems it can be done without much changes. 

That's same idea that Stephen and I have discussed in the past.
Something like:

CREATE DATABASE name
SET TABLESPACE table_volume
SET INDEX TABLESPACE index_volume;

This has some real usability advantages.  In the past I've written code
to move tables to where they need to be once the db update is complete.
 The tables tend to be small or empty so this is not usually a big deal
- but sometimes it is.  Trying to get a tablespace clause on every index
in the build scripts is a real PITA.

-- 
- David Steele
da...@pgmasters.net





Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-16 Thread Andres Freund
On 2015-04-16 10:20:20 -0300, Alvaro Herrera wrote:
 I think you're failing to consider that in the patch there is a
 distinction between read-only page accesses and page updates.  During a
 page update, HOT cleanup is always done even with the patch, so there
 won't be any additional bloat that would not be there without the
 patch.

That's not really true (and my benchmark upthread proves it). The fact
that hot pruning only happens when we can get a cleanup lock means that
we can end up with more pages that are full, if we prune on select less
often. Especially if SELECTs are more frequent than write accesses -
pretty darn common - the likelihood of SELECTs getting the lock is
correspondingly higher.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [COMMITTERS] pgsql: Move pg_upgrade from contrib/ to src/bin/

2015-04-16 Thread Bernd Helmle


--On 15. April 2015 15:02:05 -0400 Andrew Dunstan and...@dunslane.net
wrote:

 We've handled the buildfarm being red for a few days before. People are
 usually good about applying fixes fairly quickly.

Took me some time to get that due to my mail backlog, but i've done the
hotfix for dotterel and forced a run for all branches on this box.

-- 
Thanks

Bernd




Re: [HACKERS] Supporting src/test/modules in MSVC builds

2015-04-16 Thread Andrew Dunstan


On 04/16/2015 02:46 AM, Michael Paquier wrote:

Hi all,

As mentioned previously
(cab7npqscphafxs2rzeb-fbccjqiknqxjhloztkggim1mf5x...@mail.gmail.com),
attached are patches to add support for src/test/modules in MSVC
builds, modules whose tests are not supported since they have been
moved from contrib/:
- 0001 adds support for the build portion.
- 0002 adds support for the installation. I arrived at the conclusion
that those modules should be installed by default, because contribcheck
relies on tests being run against an already-running server, and
modulescheck should do the same IMO.
- 0003 adds a new target modulescheck in vcregress.

Patches 0001 and 0003 are based on some previous work from Alvaro,
that I modified slightly after testing them. Note that this split is
done to test each step on the build farm for safety.
Regards,




Thanks for doing this. It looks good, and if you've tested it I'm 
satisfied. I suggest that we apply patches 1 and 2 immediately. AIUI 
they don't require any changes to the buildfarm, as the MSVC build 
process will automatically build and install it with these changes. Then 
if all goes well we can apply the third patch and I'll fix the buildfarm 
client for the forthcoming release to run the tests on MSVC builds. 
Nothing will break in the meantime - the tests just won't get run until 
the new client version is deployed.


cheers

andrew




Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-16 Thread Peter Geoghegan
On Thu, Apr 16, 2015 at 2:23 AM, Andres Freund and...@anarazel.de wrote:
 I'm, completely independent of logical decoding, of the *VERY* strong
 opinion that 'speculative insertions' should never be visible when
 looking with normal snapshots. For one it allows to simplify
 considerations around wraparound (which has proven to be a very good
 idea, c.f. multixacts + vacuum causing data corruption years after it
 was thought to be harmless). For another it allows to reclaim/redefine
 the bit after a database restart/upgrade. Given how complex this is and
 how scarce flags are that seems like a really good property.

 And avoiding those flags to be visible to the outside requires a WAL
 record, otherwise it won't be correct on the standby.

I'm a bit distracted here, and not sure exactly what you mean. What's
a normal snapshot?

Do you just mean that you think that speculative insertions should be
explicitly affirmed by a second record (making it not a speculative
tuple, but rather, a fully fledged tuple)? IOW, an MVCC snapshot has
no chance of seeing a tuple until it was affirmed by a second in-place
modification, regardless of tuple xmin xact commit status?

-- 
Peter Geoghegan




Re: [HACKERS] PATCH: default_index_tablespace

2015-04-16 Thread jltallon


 I'm afraid this idea is a nonstarter, because it will break existing
 applications, and in particular existing pg_dump output files, which
 expect to be able to determine an index's tablespace by setting
 default_tablespace.  (It is *not* adequate that the code falls back
 to default_tablespace if the new GUC is unset; if it is set, you've
 still broken pg_dump.)


Got it. Thank you for the feedback.


The incremental value, if indeed there is any, of being able to
control index positioning this way seems unlikely to justify a
backwards-compatibility break of such magnitude.

I can see why someone would want this because random I/O, which is
frequent for indexes, is much faster on SSD than magnetic disks.
(Sequential I/O is more similar for the two.)


The general idea is something I've brought up previously also (I believe
it was even discussed at the Dev meeting in, uh, 2013?) so I'm not
anxious to simply dismiss it, but it'd certainly have to be done
correctly..



Any suggestions on how to do it properly?
Does Greg Stark's suggestion (at
CAM-w4HPOASwsQMdGZqjyFHNubbUnWrUAo8ibci-97UKU=po...@mail.gmail.com)
make sense to you?

This approach might suffer from the same problem as mine, though.

It seems to me, IMVHO, a limitation in pg_dump ---present since 8.0,
when tablespace support for CREATE INDEX was implemented--- that we
should fix.  Keeping backwards compatibility is indeed required; I just
did not think about pg_dump at all :(



I don't mind reworking the patch or redoing it completely once there is 
a viable solution.



Thanks,

  / J.L.





Re: [HACKERS] initdb -S and tablespaces

2015-04-16 Thread Abhijit Menon-Sen
Hi.

Here's a variation of the earlier patch that follows all links in
PGDATA. Does this look more like what you had in mind?

-- Abhijit
From d86888b0d2f5a3a57027d26ce050a3bbb58670d3 Mon Sep 17 00:00:00 2001
From: Abhijit Menon-Sen a...@2ndquadrant.com
Date: Thu, 6 Nov 2014 00:45:56 +0530
Subject: Recursively fsync PGDATA at startup if needed

It's needed if we need to perform crash recovery or if fsync was
disabled at some point while the server was running earlier (and
we must store that information in the control file).

This is so that we don't lose older unflushed writes in the event of
a power failure after crash recovery, where more recent writes are
preserved.

See 20140918083148.ga17...@alap3.anarazel.de for details.
---
 src/backend/access/transam/xlog.c   | 168 
 src/bin/pg_controldata/pg_controldata.c |   2 +
 src/include/catalog/pg_control.h|   8 +-
 3 files changed, 177 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 2580996..263d2d0 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -835,6 +835,8 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void fsync_pgdata(char *datadir);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks.  This is a low-level routine; to construct the WAL record header
@@ -4811,6 +4813,7 @@ BootStrapXLOG(void)
 	ControlFile->checkPoint = checkPoint.redo;
 	ControlFile->checkPointCopy = checkPoint;
 	ControlFile->unloggedLSN = 1;
+	ControlFile->fsync_disabled = false;
 
 	/* Set important parameter values for use when replaying WAL */
 	ControlFile->MaxConnections = MaxConnections;
@@ -5868,6 +5871,27 @@ StartupXLOG(void)
 		ereport(FATAL,
 				(errmsg("control file contains invalid data")));
 
+	/*
+	 * If we need to perform crash recovery, we issue an fsync on the
+	 * data directory and its contents to try to ensure that any data
+	 * written before the crash are flushed to disk. Otherwise a power
+	 * failure in the near future might cause earlier unflushed writes
+	 * to be lost, even though more recent data written to disk from
+	 * here on would be persisted.
+	 *
+	 * We also do this if the control file indicates that fsync was
+	 * disabled at some point while the server was running earlier.
+	 */
+
+	if (enableFsync &&
+		(ControlFile->fsync_disabled ||
+		 (ControlFile->state != DB_SHUTDOWNED &&
+		  ControlFile->state != DB_SHUTDOWNED_IN_RECOVERY)))
+	{
+		fsync_pgdata(data_directory);
+		ControlFile->fsync_disabled = false;
+	}
+
 	if (ControlFile->state == DB_SHUTDOWNED)
 	{
 		/* This is the expected case, so don't be chatty in standalone mode */
@@ -8236,6 +8260,8 @@ CreateCheckPoint(int flags)
 	/* crash recovery should always recover to the end of WAL */
 	ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
 	ControlFile->minRecoveryPointTLI = 0;
+	if (!enableFsync)
+		ControlFile->fsync_disabled = true;
 
 	/*
 	 * Persist unloggedLSN value. It's reset on crash recovery, so this goes
@@ -11107,3 +11133,145 @@ SetWalWriterSleeping(bool sleeping)
 	XLogCtl->WalWriterSleeping = sleeping;
 	SpinLockRelease(&XLogCtl->info_lck);
 }
+
+/*
+ * Hint to the OS that it should get ready to fsync() this file.
+ *
+ * Adapted from pre_sync_fname in initdb.c
+ */
+static void
+pre_sync_fname(char *fname, bool isdir)
+{
+	int			fd;
+
+	fd = open(fname, O_RDONLY | PG_BINARY);
+
+	/*
+	 * Some OSs don't allow us to open directories at all (Windows returns
+	 * EACCES)
+	 */
+	if (fd < 0 && isdir && (errno == EISDIR || errno == EACCES))
+		return;
+
+	if (fd < 0)
+		ereport(FATAL,
+				(errmsg("could not open file \"%s\" before fsync",
+						fname)));
+
+	pg_flush_data(fd, 0, 0);
+
+	close(fd);
+}
+
+/*
+ * walkdir: recursively walk a directory, applying the action to each
+ * regular file and directory (including the named directory itself)
+ * and following symbolic links.
+ */
+static void
+walkdir(char *path, void (*action) (char *fname, bool isdir))
+{
+	DIR		   *dir;
+	struct dirent *de;
+
+	dir = AllocateDir(path);
+	while ((de = ReadDir(dir, path)) != NULL)
+	{
+		char		subpath[MAXPGPATH];
+		struct stat fst;
+
+		CHECK_FOR_INTERRUPTS();
+
+		if (strcmp(de->d_name, ".") == 0 ||
+			strcmp(de->d_name, "..") == 0)
+			continue;
+
+		snprintf(subpath, MAXPGPATH, "%s/%s", path, de->d_name);
+
+		if (lstat(subpath, &fst) < 0)
+			ereport(ERROR,
+					(errcode_for_file_access(),
+					 errmsg("could not stat file \"%s\": %m", subpath)));
+
+		if (S_ISREG(fst.st_mode))
+			(*action) (subpath, false);
+		else if (S_ISDIR(fst.st_mode))
+			walkdir(subpath, action);
+#ifndef WIN32
+		else if (S_ISLNK(fst.st_mode))
+#else
+		else if (pg_win32_is_junction(subpath))
+#endif
+		{
+#if defined(HAVE_READLINK) || defined(WIN32)
+			char		linkpath[MAXPGPATH];
+			int			len;
+			struct stat lst;
+
+			len = 
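For illustration, the recursive fsync-everything-in-PGDATA behavior the patch implements, including following symlinks as tablespace links in pg_tblspc require, can be sketched in Python (all names here are illustrative, not the backend's):

```python
import os

def fsync_fname(path: str, isdir: bool = False) -> None:
    """fsync one file or directory; some OSes refuse to open directories,
    which mirrors the EACCES/EISDIR escape hatch in the patch above."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except (PermissionError, IsADirectoryError):
        if isdir:
            return
        raise
    try:
        os.fsync(fd)
    finally:
        os.close(fd)

def fsync_pgdata(datadir: str) -> None:
    """Recursively fsync every regular file and directory under datadir,
    following symbolic links so linked tablespaces are covered too."""
    fsync_fname(datadir, isdir=True)
    for root, dirs, files in os.walk(datadir, followlinks=True):
        for name in dirs:
            fsync_fname(os.path.join(root, name), isdir=True)
        for name in files:
            fsync_fname(os.path.join(root, name))
```

Note that following links naively can loop on cyclic symlinks; the real patch walks junctions/links explicitly rather than trusting the OS to terminate.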

[HACKERS] Disabling trust/ident authentication configure option

2015-04-16 Thread Bernd Helmle
We have a customer using a patch to harden their PostgreSQL installation
(see attached) that they would like to contribute. This patch adds the ability
to disable trust and ident authentication at compile time via configure
options, thus making it impossible to use these authentication methods for
sloppy configuration purposes. This is critical to their software
deployment, as stated in their use case description:

snip
PostgreSQL is deployed as part of a larger technical solution (e.g. a
Telecommunication system) and a field engineer has to install/upgrade this
solution. The engineer is a specialist in the Telco domain and has only
little knowledge of databases and especially PostgreSQL setup.

We now want to provide these kinds of users with pre-hardened packages that
make it very hard to accidentally expose their database. For this purpose
the patch allows to optionally disable the trust and ident
authentication methods. Especially the trust mechanism is very critical
as it might actually provide useful functionality for our user. Think of an
engineer who has to do a night shift upgrade with a customer breathing down
his neck to get the system back online. Circumventing all these
authentication configuration issues by just enabling trust is very easy
and looks well supported and documented. (the documentation states the
dangers but there are no *big red flags* or something). After finishing the
upgrade the engineer forgets to restore the secure configuration, and a
malicious attacker later uses this access method to gain access to the
database.
/snip

Currently the patch adds new configure options --without-trust-auth and
--without-ident-auth and makes peer authentication the default when these
options are set. My testing shows that regression tests (which normally
use trust) still work. This covers Linux and friends; Windows is not
addressed yet. Maybe it's worth considering making sspi the default on
that platform instead.

There was a discussion some time ago ([1]), but I think the reasoning
behind this patch is a little bit different from what was discussed there.

[1]
http://www.postgresql.org/message-id/CAN2Y=umt7cpkxzhaufw7szeckdwcwsuulmh4xphuxkqbtdu...@mail.gmail.com

-- 
Thanks

Bernd

disable_trust_ident_configure.patch
Description: Binary data



Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-16 Thread Pavan Deolasee
On Thu, Apr 16, 2015 at 6:50 PM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:

 Pavan Deolasee wrote:
  On Thu, Apr 16, 2015 at 2:47 PM, Greg Stark st...@mit.edu wrote:

   From a holistic point of view the question is how many times is a given
   hit chain going to need to be followed before it's pruned. Or to put it
   another way, how expensive is creating a hot chain. Does it cause a
 single
   prune? a fixed number of chain readers followed by a prune? Does the
 amount
   of work depend on the workload or is it consistent?
 
  IMO the size or traversal of the HOT chain is not that expensive compared
  to the cost of either pruning too frequently, which generates WAL as well
  as makes buffers dirty. OTOH cost of less frequent pruning could also be
  very high. It can cause severe table bloat which may just stay for a very
  long time. Even if dead space is recovered within a page, truncating a
  bloated heap is not always possible.

 I think you're failing to consider that in the patch there is a
 distinction between read-only page accesses and page updates.  During a
 page update, HOT cleanup is always done even with the patch, so there
 won't be any additional bloat that would not be there without the patch.
  It's only the read-only accesses to the page that skip the HOT pruning.


Ah, Ok. I'd not read the patch. But now that I do, I feel much more
comfortable with the change. In fact, I wonder if it's enough to simply
do a full HOT prune for target relations and none at all for the other
relations involved in the query. My apologies if this is done based on
discussions upthread. I haven't read the entire thread yet.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-16 Thread Heikki Linnakangas

On 04/16/2015 12:18 PM, Andres Freund wrote:

On 2015-04-15 18:53:15 +0300, Heikki Linnakangas wrote:

Hmm, ok, I've read the INSERT ... ON CONFLICT UPDATE and logical decoding
thread now, and I have to say that IMHO it's a lot more sane to handle this
in ReorderBufferCommit() like Peter first did, than to make the main
insertion path more complex like this.


I don't like Peter's way much. For one it's just broken. For another
it's quite annoying to trigger disk reads to figure out what actual type
of record something is.

If we go that way that's the way I think it should be done: Whenever we
encounter a speculative record we 'unlink' it from the changes that will
be reused for spooling from disk and do nothing further. Then we just
continue reading through the records. If the next thing we encounter is
a super-deletion we throw away that record. If it's another type of
change (or even better a 'speculative insertion succeeded' record)
insert it. That'll still require some uglyness, but it should not be too
bad.


Sounds good to me.

- Heikki
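The unlink-and-wait rule in the quoted paragraph can be modeled as a small filter over the decoded change stream (a toy model for illustration, not the actual ReorderBuffer API):

```python
def resolve_speculative(changes):
    """Filter a list of (kind, data) changes: hold back a speculative
    insertion; a following super-delete discards it; any other change
    (or end of stream, i.e. commit) confirms it as a plain insert."""
    out = []
    pending = None
    for kind, data in changes:
        if kind == "spec_insert":
            pending = ("insert", data)   # hold back, don't emit yet
        elif kind == "super_delete":
            pending = None               # speculation failed: drop it
        else:
            if pending is not None:
                out.append(pending)      # speculation succeeded
                pending = None
            out.append((kind, data))
    if pending is not None:
        out.append(pending)              # commit implies it succeeded
    return out
```

The key property is that a super-deletion only ever cancels the immediately preceding speculative insertion, so a simple one-slot buffer suffices.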





Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-16 Thread Peter Geoghegan
On Thu, Apr 16, 2015 at 2:18 AM, Andres Freund and...@anarazel.de wrote:
 On 2015-04-15 18:53:15 +0300, Heikki Linnakangas wrote:
 Hmm, ok, I've read the INSERT ... ON CONFLICT UPDATE and logical decoding
 thread now, and I have to say that IMHO it's a lot more sane to handle this
 in ReorderBufferCommit() like Peter first did, than to make the main
 insertion path more complex like this.

 I don't like Peter's way much. For one it's just broken. For another
 it's quite annoying to trigger disk reads to figure out what actual type
 of record something is.

 If we go that way that's the way I think it should be done: Whenever we
 encounter a speculative record we 'unlink' it from the changes that will
 be reused for spooling from disk and do nothing further. Then we just
 continue reading through the records.

You mean we continue iterating through *changes* from ReorderBufferCommit()?

 If the next thing we encounter is
 a super-deletion we throw away that record. If it's another type of
  change (or even better a 'speculative insertion succeeded' record)
 insert it.

By insert it, I gather you mean report it to the logical decoding
plugin as an insert change.

 That'll still require some uglyness, but it should not be too
 bad.

So, to be clear, what you have in mind is sort of a hybrid between my
first and second approaches (mostly my first approach).

We'd have coordination between records originally decoded into
changes, maybe intercepting them during xact reassembly, like my
first approach. We'd mostly do the same thing as the first approach,
actually. The major difference would be that there'd be explicit
speculative affirmation WAL records. But we wouldn't rely on those
affirmation records within ReorderBufferCommit() - we'd rely on the
*absence* of a super deletion WAL record (to report an insert change
to the decoding plugin). To emphasize, like my first approach, it
would be based on an *absence* of a super deletion WAL record.

However, like my second approach, there would be a speculative
affirmation WAL record. The new speculative affirmation WAL record
would however be quite different to what my second approach to logical
decoding (the in-place update record thing that was in V3.3) actually
did. In particular, it would be far more simple, and the tuple would
be built from the original insertion record within logical decoding.

Right now, I'm tired, so bear with me. What I think I'm not quite
getting here is how the new type of affirmation record affects
visibility (or anything else, actually). Clearly dirty snapshots need
to see the record to set a speculative insertion token for their
caller to wait on (and HeapTupleSatisfiesVacuum() cannot see the
speculatively inserted tuple as reclaimable, of course). They need
this *before* the speculative insertion is affirmed, of course.

Maybe you mean: If the speculative insertion xact is in progress, then
the tuple is visible to several types of snapshots (dirty, vacuum,
self, any). If it is not, then tuples are not visible because they are
only speculative (and not *confirmed*). If they were confirmed, and
the xact was committed, then those tuples are logically and physically
indistinguishable from tuples that were inserted in the ordinary
manner.

Is that it? Basically, the affirmation records have nothing much to do
with logical decoding in particular. But you still need to super
delete, so that several types of snapshots (dirty, vacuum, self, any)
*stop* seeing the tuple as visible independent of the inserting xact
being in progress.

 I earlier thought that'd not be ok because there could be a new catalog
 snapshot in between, but I was mistaken: the locking on the source
 transaction prevents problems.

I thought this was the case.

-- 
Peter Geoghegan




Re: [HACKERS] FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

2015-04-16 Thread Josh Berkus
On 04/16/2015 01:01 PM, David G. Johnston wrote:
 If this is not covered adequately enough in the documentation then that
 should be remedied.  Did you evaluate the documentation in that light
 while preparing your blog post?

Your response seems very defensive.  I'm unclear on why discussing
improving an error message should get such a hostile response; it
certainly doesn't invite a serious reply.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




[HACKERS] FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

2015-04-16 Thread Josh Berkus
Folks:

SELECT
device_id,
count(*)::INT as present,
count(*)::INT FILTER (WHERE valid) as valid_count,
mode()::INT WITHIN GROUP (order by val) as mode,
percentile_disc(0.5)::INT WITHIN GROUP (order by val)
  as median
FROM dataflow_0913
GROUP BY device_id
ORDER BY device_id;

ERROR:  syntax error at or near "FILTER"
LINE 4: count(*)::INT FILTER (WHERE valid)
as valid_count,


The error is right, that's invalid syntax.  I can't insert a ::INT
between the aggregate() and FILTER.  However, the error message is also
rather confusing to the user; they're likely to look for their mistake
in the wrong place.  The same goes for WITHIN GROUP (and OVER, too, I
think).

Is there some kind of possible HINT we could add to make this easier to
debug?
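For reference, a sketch of the working form against the same table: the
cast has to wrap the whole aggregate expression, because FILTER and
WITHIN GROUP bind to the aggregate itself:

```sql
-- Corrected query: parenthesize aggregate + FILTER / WITHIN GROUP,
-- then apply the ::int cast to the parenthesized unit.
SELECT
    device_id,
    count(*)::int AS present,
    (count(*) FILTER (WHERE valid))::int AS valid_count,
    (mode() WITHIN GROUP (ORDER BY val))::int AS mode,
    (percentile_disc(0.5) WITHIN GROUP (ORDER BY val))::int AS median
FROM dataflow_0913
GROUP BY device_id
ORDER BY device_id;
```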

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-16 Thread Alvaro Herrera
Andres Freund wrote:
 On 2015-04-16 10:20:20 -0300, Alvaro Herrera wrote:
  I think you're failing to consider that in the patch there is a
  distinction between read-only page accesses and page updates.  During a
  page update, HOT cleanup is always done even with the patch, so there
  won't be any additional bloat that would not be there without the
  patch.
 
 That's not really true (and my benchmark upthread proves it). The fact
 that hot pruning only happens when we can get a cleanup lock means that
 we can end up with more pages that are full, if we prune on select less
 often. Especially if SELECTs are more frequent than write accesses -
 pretty darn common - the likelihood of SELECTs getting the lock is
 correspondingly higher.

Interesting point.  Of course, this code should count HOT cleanups
against the total limit when they are effectively carried out, and
ignore those that are skipped because of inability to acquire the
cleanup lock.  Not sure whether the submitted code does that.

Can we keep stats on how many pages we don't clean in the updating
process due to failure to acquire cleanup lock?  My intuition says that
it should be similar to the number of backends running concurrently,
but that might be wrong.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

2015-04-16 Thread David G. Johnston
On Thu, Apr 16, 2015 at 1:06 PM, Josh Berkus j...@agliodbs.com wrote:

 On 04/16/2015 01:01 PM, David G. Johnston wrote:
  If this is not covered adequately enough in the documentation then that
  should be remedied.  Did you evaluate the documentation in that light
  while preparing your blog post?

 Your response seems very defensive.  I'm unclear on why discussing
 improving an error message should get such a hostile response; it
 certainly doesn't invite a serious reply.


Mostly because I have no clue how difficult it would be...

I'm all for improvement but obviously if it was that simple you would have
just proposed the desired wording and asked someone to commit it.  I was
trying to understand (and communicate my understanding) why it wasn't done
better in the first place.

Beyond that there is still the need to teach the user the correct syntax of
these functions - and not only point out when they get it wrong so they
blindly fix their typo - in order to better avoid the situation where the
syntax is valid but the outcome is unexpected.  Two separate problems, of
which the latter seems more important than the former - and probably
somewhat easier to implement.

David J.


Re: [HACKERS] inherit support for foreign tables

2015-04-16 Thread David Fetter
On Wed, Apr 15, 2015 at 09:35:05AM +0900, Kyotaro HORIGUCHI wrote:
 Hi,
 
 Before suppressing the symptom, I doubt the necessity and/or
 validity of giving foreign tables an ability to be a parent. Is
 there any reasonable usage for the ability?
 
 I think we should choose to inhibit foreign tables from becoming
 a parent rather than leaving it allowed then taking measures for
 the consequent symptom.

I have a use case for having foreign tables as non-leaf nodes in a
partitioning hierarchy, namely geographic.  One might have a table at
HQ called foo_world, then partitions under it called foo_jp, foo_us,
etc., in one level, foo_us_ca, foo_us_pa, etc. in the next level, and
on down, each in general in a separate data center.
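A minimal sketch of that layout (server names are made up, and it assumes
foreign tables may act as inheritance parents, which is exactly the
ability under discussion):

```sql
CREATE TABLE foo_world (region text, payload text);

-- Non-leaf node held in a remote data center:
CREATE FOREIGN TABLE foo_us ()
    INHERITS (foo_world) SERVER us_dc;

-- Leaf partition one level further down:
CREATE FOREIGN TABLE foo_us_ca ()
    INHERITS (foo_us) SERVER us_ca_dc;
```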

Is there something essential about having non-leaf nodes as foreign
tables that's a problem here?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] Supporting src/test/modules in MSVC builds

2015-04-16 Thread Alvaro Herrera
Andrew Dunstan wrote:

 Thanks for doing this.

Yes, much appreciated.

 It looks good, and if you've tested it I'm satisfied.  I suggest that
 we apply patches 1 and 2 immediately. AIUI they don't require any
 changes to the buildfarm, as the MSVC build process will automatically
 build and install it with these changes.

Okay, I just pushed patches 1 and 2.

 Then if all goes well we can apply the third patch and I'll fix the
 buildfarm client for the forthcoming release to run the tests on MSVC
 builds. Nothing will break in the meantime - the tests just won't get
 run until the new client version is deployed.

Will wait for a day or so to see what the Win buildfarm members say.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Assertion failure when streaming logical changes

2015-04-16 Thread Heikki Linnakangas

On 04/07/2015 03:54 PM, Andres Freund wrote:

On 2015-04-07 17:22:12 +0800, Craig Ringer wrote:

It might be a good idea to apply this if nothing better is forthcoming.
Logical decoding in WALsenders is broken at the moment.


Yes.


Committed.
- Heikki





Re: [HACKERS] FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

2015-04-16 Thread Josh Berkus
On 04/16/2015 01:18 PM, David G. Johnston wrote:
 On Thu, Apr 16, 2015 at 1:06 PM, Josh Berkus j...@agliodbs.com wrote:
 
 On 04/16/2015 01:01 PM, David G. Johnston wrote:
  If this is not covered adequately enough in the documentation then that
  should be remedied.  Did you evaluate the documentation in that light
  while preparing your blog post?
 
 Your response seems very defensive.  I'm unclear on why discussing
 improving an error message should get such a hostile response; it
 certainly doesn't invite a serious reply.
 
 
 Mostly because I have no clue how difficult it would be...
 
 I'm all for improvement but obviously if it was that simple you would
 have just proposed the desired wording and asked someone to commit it. 
 I was trying to understand (and communicate my understanding) why it
 wasn't done better in the first place.

Right.  First, I don't know that it's possible for the parser to figure
out that error as opposed to actual mistakes in the filter clause, or
just straight gibberish.  Second, I'm not sure how to write a succinct
HINT which gets the point across.

However, the error we're giving for windowing suggests that we can
improve things.  In the example you quote:

SQL Error: ERROR:  OVER specified, but ceil is not a window function nor
an aggregate function
LINE 1: SELECT ceil(count(*)) OVER ()

... that's actually a much clearer error.   However, we still get the
same error with casting:

josh=# select count(*)::INT OVER (order by device_id, val) from
dataflow_0913 ;
ERROR:  syntax error at or near "OVER"
LINE 1: select count(*)::INT OVER (order by device_id, val) from dat...

... so maybe we can't.

 Beyond that there is still the need to teach the user the correct syntax
 of these functions - and not only point out when they get it wrong so
 they blindly fix their typo - in order to better avoid the situation
 where the syntax is valid but the outcome is unexpected.  Two separate
 problems, of which the latter seems more important than the former - and
 probably somewhat easier to implement.

Absolutely.  Hence my blog post; I'm hoping that people will at least
get a google hit on the error text.

The problem is, in the main docs, how do we reasonably document
*combining* features?  In Syntax?  My head hurts just thinking about it.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

2015-04-16 Thread David G. Johnston
On Thu, Apr 16, 2015 at 12:22 PM, Josh Berkus j...@agliodbs.com wrote:

 Folks:

 SELECT
 device_id,
 count(*)::INT as present,
 count(*)::INT FILTER (WHERE valid) as valid_count,
 mode()::INT WITHIN GROUP (order by val) as mode,
 percentile_disc(0.5)::INT WITHIN GROUP (order by val)
   as median
 FROM dataflow_0913
 GROUP BY device_id
 ORDER BY device_id;

 ERROR:  syntax error at or near FILTER
 LINE 4: count(*)::INT FILTER (WHERE valid)
 as valid_count,


 The error is right, that's invalid syntax.  I can't insert a ::INT
 between the aggregate() and FILTER.  However, the error message is also
 rather confusing to the user; they're likely to look for their mistake
 in the wrong place.  The same goes for WITHIN GROUP (and OVER, too, I
 think).


SELECT count(*)::int OVER ()
FROM ( VALUES (1),(2),(3) ) src;


 Is there some kind of possible HINT we could add to make this easier to
 debug?


Do you have a suggested hint so that the effort to make it work can be
compared to its usefulness?


For kicks I ran the following - since val::type is simply another syntax
for type(val)...

SELECT ceil(count(*)) OVER ()
FROM ( VALUES (1),(2),(3) ) src

SQL Error: ERROR:  OVER specified, but ceil is not a window function nor an
aggregate function
LINE 1: SELECT ceil(count(*)) OVER ()

The non-hint has been around as long as window functions and hasn't really
come up as an issue - not enough so to motivate a change at least.  Is the
idiomatic usage of FILTER and WITHIN GROUP making this error more likely?

The foot-gun in your blog post is more problematic but also seemingly
impossible to avoid except through education of the user.  It would not be
unreasonable to accept that the current error is acting like a canary and
forcing the user to go read the documentation on OVER/FILTER/WITHIN GROUP
and learn to write the expression as a single unit.

If this is not covered adequately enough in the documentation then that
should be remedied.  Did you evaluate the documentation in that light while
preparing your blog post?

David J.


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-16 Thread Simon Riggs
On 16 April 2015 at 15:21, Andres Freund and...@anarazel.de wrote:

 On 2015-04-16 10:20:20 -0300, Alvaro Herrera wrote:
  I think you're failing to consider that in the patch there is a
  distinction between read-only page accesses and page updates.  During a
  page update, HOT cleanup is always done even with the patch, so there
  won't be any additional bloat that would not be there without the
  patch.

 That's not really true (and my benchmark upthread proves it). The fact
 that hot pruning only happens when we can get a cleanup lock means that
 we can end up with more pages that are full, if we prune on select less
 often. Especially if SELECTs are more frequent than write accesses -
 pretty darn common - the likelihood of SELECTs getting the lock is
 correspondingly higher.


Your point that we *must* do *some* HOT cleanup on SELECTs is proven beyond
question. Alvaro has not disputed that; ISTM you misread him. Pavan has
questioned that point, but the results upthread are there; he explains he
hasn't read them yet.

The only question is how much cleanup on SELECT? Having one SELECT hit
10,000 cleanups while another hits 0 creates an unfairness and
unpredictability in the way we work. Maybe some people running a backup
actually like the fact it cleans the database; others think that is a bad
thing. Few people issuing large queries think it is good behaviour. Anybody
running replication also knows that this causes a huge slam of WAL which
can increase replication delay, which is a concern for HA.

That is how we arrive at the idea of a cleanup limit, further enhanced by a
limit that applies only to dirtying clean blocks, which we have 4? recent
votes in favour of.

I would personally be in favour of a parameter to control the limit, since
whatever we chose is right/wrong depending upon circumstances. I am however
comfortable with not having a parameter if people think it is hard to tune
that, which I agree it would be, hence no parameter in the patch.


[HACKERS] Performance tuning assisted by a GUI application

2015-04-16 Thread Jacek Wielemborek
Hello,

(Please pardon me if this is offtopic and I should send it to another
mailing list instead)

I had a brief discussion on #postgresql and thought that perhaps there
might be a need for a tool that would enable a fine-tuning of PostgreSQL
performance settings by conveniently testing them with a sample SQL
query with the aid of a simple GUI application. To illustrate this, I
created this little proof of concept:

https://gist.github.com/d33tah/d01f3599e55e53d00f68

Screenshot can be found here: https://imgur.com/TguH6Xq

This is by far not even alpha quality here and would need some basic
Python knowledge to set up the connection string. This only supports
modifying cpu_tuple_cost right now, but I guess that you can extrapolate
where this would go.
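For context, each trial in such a tool essentially boils down to the
following (the table name is hypothetical; SET is session-local, so other
connections are unaffected):

```sql
SET cpu_tuple_cost = 0.05;      -- candidate value under test
EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM sample_table;
RESET cpu_tuple_cost;           -- back to the configured default
```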

What do you think? Would anybody be interested in an application like this?

Cheers,
Jacek Wielemborek





Re: [HACKERS] Supporting src/test/modules in MSVC builds

2015-04-16 Thread Alvaro Herrera
Andrew Dunstan wrote:
 
 On 04/16/2015 07:42 PM, Michael Paquier wrote:
 
 Then if all goes well we can apply the third patch and I'll fix the
 buildfarm client for the forthcoming release to run the tests on MSVC
 builds. Nothing will break in the meantime - the tests just won't get
 run until the new client version is deployed.
 Will wait for a day or so to see what the Win buildfarm members say.
 currawong and mastodon seem satisfied.
 
 Yeah. I think we can push the third one - I am just about ready to release
 the new buildfarm client.

Pushed.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] pg_upgrade in 9.5 broken for adminpack

2015-04-16 Thread Jeff Janes
pg_upgrade was recently broken for upgrades from a system with
adminpack installed.

Breaking commit is:

commit 30982be4e5019684e1772dd9170aaa53f5a8e894
Author: Peter Eisentraut pete...@gmx.net

Integrate pg_upgrade_support module into backend


from pg_upgrade_dump_12870.log

pg_restore: creating EXTENSION adminpack
pg_restore: creating COMMENT EXTENSION adminpack
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 2806; 0 0 COMMENT
EXTENSION adminpack
pg_restore: [archiver (db)] could not execute query: ERROR:  extension
adminpack does not exist
Command was: COMMENT ON EXTENSION adminpack IS 'administrative
functions for PostgreSQL';


I get the same error whether the source database is 9.2.10 or 9.5.HEAD.

Cheers,

Jeff


Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-04-16 Thread Kouhei Kaigai
Hanada-san,

 I merged explain patch into foreign_join patch.
 
 Now v12 is the latest patch.

It contains many garbage lines... Please ensure the
patch is correctly based on the latest master +
custom_join patch.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


 -Original Message-
 From: Shigeru HANADA [mailto:shigeru.han...@gmail.com]
 Sent: Thursday, April 16, 2015 5:06 PM
 To: Kaigai Kouhei (海外 浩平)
 Cc: Ashutosh Bapat; Robert Haas; Tom Lane; Thom Brown;
 pgsql-hackers@postgreSQL.org
 Subject: ##freemail## Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] 
 Custom
 Plan API)
 
 Kaigai-san,
 
 On 2015/04/15 22:33, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
  Oops, that’s right.  Attached is the revised version.  I chose fully 
  qualified
  name, schema.relname [alias] for the output.  It would waste some cycles 
  during
  planning if that is not for EXPLAIN, but it seems difficult to get a list 
  of
 name
  of relations in ExplainForeignScan() phase, because planning information 
  has
 gone
  away at that time.
 
  I understand. Private data structure of the postgres_fdw is not designed
  to keep tree structure data (like relations join tree), so it seems to me
  a straightforward way to implement the feature.
 
  I have a small suggestion. This patch makes deparseSelectSql initialize
  the StringInfoData if supplied, however, it usually shall be a task of
  function caller, not callee.
  In this case, I like to initStringInfo(relations) next to the line of
  initStringInfo(sql) in postgresGetForeignPlan. In my sense, it is
  a bit strange to pass uninitialized StringInfoData, to get a text form.
 
  @@ -803,7 +806,7 @@ postgresGetForeignPlan(PlannerInfo *root,
  */
 initStringInfo(sql);
 deparseSelectSql(sql, root, baserel, fpinfo-attrs_used, remote_conds,
  -params_list, fdw_ps_tlist, retrieved_attrs);
  +params_list, fdw_ps_tlist, retrieved_attrs,
 relations);
 
 /*
  * Build the fdw_private list that will be available in the executor.
 
 
 Agreed.  If caller passes a buffer, it should be initialized by caller.  In
 addition to your idea, I added a check that the RelOptInfo is a JOINREL, coz 
 BASEREL
 doesn’t need relations for its EXPLAIN output.
 
  Also, could you merge the EXPLAIN output feature on the main patch?
  I think here is no reason why to split this feature.
 
 I merged explain patch into foreign_join patch.
 
 Now v12 is the latest patch.
 
 --
 Shigeru HANADA
 shigeru.han...@gmail.com
 
 
 
 




Re: [HACKERS] Supporting src/test/modules in MSVC builds

2015-04-16 Thread Michael Paquier
On Fri, Apr 17, 2015 at 4:47 AM, Alvaro Herrera wrote:
 Andrew Dunstan wrote:
 It looks good, and if you've tested it I'm satisfied.  I suggest that
 we apply patches 1 and 2 immediately. AIUI they don't require any
 changes to the buildfarm, as the MSVC build process will automatically
 build and install it with these changes.

 Okay, I just pushed patches 1 and 2.

Cool, thanks.

 Then if all goes well we can apply the third patch and I'll fix the
 buildfarm client for the forthcoming release to run the tests on MSVC
 builds. Nothing will break in the meantime - the tests just won't get
 run until the new client version is deployed.

 Will wait for a day or so to see what the Win buildfarm members say.

currawong and mastodon seem satisfied.
-- 
Michael




Re: [HACKERS] Supporting src/test/modules in MSVC builds

2015-04-16 Thread Andrew Dunstan


On 04/16/2015 07:42 PM, Michael Paquier wrote:



Then if all goes well we can apply the third patch and I'll fix the
buildfarm client for the forthcoming release to run the tests on MSVC
builds. Nothing will break in the meantime - the tests just won't get
run until the new client version is deployed.

Will wait for a day or so to see what the Win buildfarm members say.

currawong and mastodon seem satisfied.


Yeah. I think we can push the third one - I am just about ready to 
release the new buildfarm client.





Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2015-04-16 Thread Amit Langote

Fujita-san,

On 16-04-2015 PM 08:40, Etsuro Fujita wrote:
 From what I see in Tom's commit message[0] for FTI patch, this shouldn't be,
 right?

 To be specific, there should be Foreign Scan there as per the commit. Am I
 missing something?
 
 As shown in the below example, this patch doesn't change the EXPLAIN
 output for non-pushed-down update (delete) cases, but since we changed
 the EXPLAIN output as discussed in [1], the patch doubly displays
 Foreign Update (Foreign Delete) for pushed-down update (delete) cases
 like the above example.
 
 postgres=# explain verbose update parent set c1 = trunc(random() * 9 +
 1)::int;
                               QUERY PLAN
 -------------------------------------------------------------------------
  Update on public.parent  (cost=0.00..452.06 rows=5461 width=6)
    Update on public.parent
    Foreign Update on public.ft1
      Remote SQL: UPDATE public.t1 SET c1 = $2 WHERE ctid = $1
    Foreign Update on public.ft2
      Remote SQL: UPDATE public.t2 SET c1 = $2 WHERE ctid = $1
    ->  Seq Scan on public.parent  (cost=0.00..0.01 rows=1 width=6)
          Output: (trunc(((random() * '9'::double precision) +
          '1'::double precision)))::integer, parent.ctid
    ->  Foreign Scan on public.ft1  (cost=100.00..226.03 rows=2730 width=6)
          Output: (trunc(((random() * '9'::double precision) +
          '1'::double precision)))::integer, ft1.ctid
          Remote SQL: SELECT ctid FROM public.t1 FOR UPDATE
    ->  Foreign Scan on public.ft2  (cost=100.00..226.03 rows=2730 width=6)
          Output: (trunc(((random() * '9'::double precision) +
          '1'::double precision)))::integer, ft2.ctid
          Remote SQL: SELECT ctid FROM public.t2 FOR UPDATE
 (14 rows)
 

I think I missed the point that you are talking about the result after the
patch for foreign update pushdown (which is the topic of this thread) has been
applied. Sorry about the noise.

By the way, one suggestion may be to attach a "(pushed down)" marker to the
ModifyTable's Foreign Update lines. In that case, there would be no mention of
a corresponding scan node in the plan tree below, precisely because there is none.

postgres=# explain verbose update parent set c1 = c1;
                          QUERY PLAN
--------------------------------------------------------------------
 Update on public.parent  (cost=0.00..364.54 rows=4819 width=10)
   Update on public.parent
   Foreign Update (pushed down) on public.ft1
   Foreign Update (pushed down) on public.ft2
   ->  Seq Scan on public.parent  (cost=0.00..0.00 rows=1 width=10)
         Output: parent.c1, parent.ctid

Thoughts?

Thanks,
Amit





Re: [HACKERS] pg_upgrade in 9.5 broken for adminpack

2015-04-16 Thread Bruce Momjian
On Thu, Apr 16, 2015 at 07:33:50PM -0700, Jeff Janes wrote:
 pg_upgrade was recently broken for use upgrading from a system with adminpack
 installed.
 
 Breaking commit is:
 
 commit 30982be4e5019684e1772dd9170aaa53f5a8e894
 Author: Peter Eisentraut pete...@gmx.net
 
     Integrate pg_upgrade_support module into backend
 
 
 from pg_upgrade_dump_12870.log
 
 pg_restore: creating EXTENSION adminpack
 pg_restore: creating COMMENT EXTENSION adminpack
 pg_restore: [archiver (db)] Error while PROCESSING TOC:
 pg_restore: [archiver (db)] Error from TOC entry 2806; 0 0 COMMENT EXTENSION
 adminpack
 pg_restore: [archiver (db)] could not execute query: ERROR:  extension
 adminpack does not exist
     Command was: COMMENT ON EXTENSION adminpack IS 'administrative functions
 for PostgreSQL';
 
 
 I get the same error whether the source database is 9.2.10 or 9.5.HEAD.

Uh, I am confused how moving pg_upgrade or pg_upgrade_support would
break the loading of the adminpack extension.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] reparsing query

2015-04-16 Thread Tatsuo Ishii
 It is not difficult to output parsed query in some tool readable
 format but it comes with a maintain overhead: once tools rely on it,
 we have to conform to some schema continuously, like the xml/xmlns. Do
 we want to take this? Depends on how far the tools can go with this
 exposed information.

I think part of the problems could be resolved by using an adequate API
for the parse tree. For example, pgpool-II imports PostgreSQL's raw
parser to do query rewriting. The parse tree format may change from
release to release, but the essential API for it (for example,
tree_walker) is very stable. As a result, the rewriting code rarely
needs to be modified.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

2015-04-16 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 ERROR:  syntax error at or near FILTER
 LINE 4: count(*)::INT FILTER (WHERE valid) as valid_count,

 The error is right, that's invalid syntax.  I can't insert a ::INT
 between the aggregate() and FILTER.  However, the error message is also
 rather confusing to the user; they're likely to look for their mistake
 in the wrong place.  The same goes for WITHIN GROUP (and OVER, too, I
 think).

 Is there some kind of possible HINT we could add to make this easier to
 debug?

Probably not; or at least, IMO it would be a fundamental misjudgment to
go in with the idea of improving this one single case.  It'd be cool if
we could improve reporting of syntax errors in general, though.

Unfortunately Bison doesn't provide a whole heckuva lot of support for
that.  The only simple thing it offers is %error-verbose, which I've
experimented with in the past and come away with the impression that
it'd be completely useless for most people :-(.  It seems to basically
add information about which tokens would be valid next at the point of
the error report.  That isn't any help for the sort of user conceptual
error you're talking about here.

You could imagine teaching yyerror() to have some SQL-specific knowledge
that it could apply by comparing the current lookahead token to the
current parse state stack ... but neither of those things are exposed
to it by Bison.  We could probably get the lookahead token indirectly
by inspecting the lexer's state, but it's not clear that's enough to
do much.  In this particular example, that would mean showing the exact
same hint whenever a syntax error is reported at an OVER token; and
I'm afraid that it'd be mighty hard to write a hint for that that
wouldn't frequently do more harm than good.  (Note in particular that
since OVER, FILTER, and WITHIN are unreserved keywords, they might be
used as simple column/table/function names.)

regards, tom lane




Re: [HACKERS] FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

2015-04-16 Thread Josh Berkus
On 04/16/2015 04:36 PM, Tom Lane wrote:
 You could imagine teaching yyerror() to have some SQL-specific knowledge
 that it could apply by comparing the current lookahead token to the
 current parse state stack ... but neither of those things are exposed
 to it by Bison.  We could probably get the lookahead token indirectly
 by inspecting the lexer's state, but it's not clear that's enough to
 do much.  In this particular example, that would mean showing the exact
 same hint whenever a syntax error is reported at an OVER token; and

Yeah, and that's what we *don't* want to do.  That's why I was wondering
if it was even possible to figure out the extra syntax case.

 I'm afraid that it'd be mighty hard to write a hint for that that
 wouldn't frequently do more harm than good.  (Note in particular that
 since OVER, FILTER, and WITHIN are unreserved keywords, they might be
 used as simple column/table/function names.)

No kidding.  Heck, I just spent half an hour tracking down a bug that
turned out to be caused by a user having created a view called "mode".

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] [PERFORM] pushing order by + limit to union subqueries

2015-04-16 Thread Qingqing Zhou
On Sat, Feb 28, 2015 at 8:24 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 There would be cases where that would be a win, and there would be cases
 where it wouldn't be, so I'd not be in favor of making the transformation
 blindly.  Unfortunately, given the current state of the planner that's
 all we could do really, because the subqueries are planned at arm's
 length and then we just mechanically combine them.  Doing it right would
 entail fully planning each subquery twice, which would be very expensive.

Yes, after pull-up, the subqueries are planned independently and we
glue the results together at the end.
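The transformation in the subject line can be sketched in plain Python (an illustration, not planner code): pushing an outer ORDER BY + LIMIT k down into each arm of a UNION ALL cannot lose any of the global top k, which is why the rewrite is safe; whether it is cheaper depends on the paths available to each branch, which is the reason not to apply it blindly.

```python
def top_k(rows, k):
    # Stand-in for ORDER BY x LIMIT k over a set of rows.
    return sorted(rows)[:k]

branch_a = [5, 1, 9, 3, 6]
branch_b = [7, 2, 8, 4, 0]
k = 3

# SELECT * FROM (a UNION ALL b) u ORDER BY x LIMIT k
naive = top_k(branch_a + branch_b, k)

# Rewritten: push ORDER BY + LIMIT into each branch, then re-limit on top.
pushed = top_k(top_k(branch_a, k) + top_k(branch_b, k), k)

assert naive == pushed  # both yield [0, 1, 2]
```

Each branch now sorts and limits a smaller input, which wins when the branches are large, but costs an extra sort per branch when they are small.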

 I have a longstanding desire to rewrite the upper levels of the planner to
 use path generation and comparison, which should make it more practical
 for the planner to compare alternative implementations of UNION and other
 top-level constructs.  But I've been saying I would do that for several
 years now, so don't hold your breath :-(

Greenplum uses the Cascades optimizer framework (also used in SQL
Server and some other systems) to make the optimizer more modular and
extensible.  In our context here, it allows thorough optimization
without pre-defined boundaries - no planning each subquery separately
and then gluing the results together.  Is that something along the
lines of what you have in mind?
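The difference between arm's-length subquery planning and a search without pre-defined boundaries can be illustrated with a toy cost model (plain Python; the costs and path names are invented for illustration, not Greenplum or PostgreSQL code). Suppose the parent query needs sorted output and a branch has two access paths:

```python
SORT_COST = 50  # assumed cost of sorting one branch's output

# Alternative paths for one UNION ALL branch: (name, cost, produces_sorted)
branch_paths = [
    ("seqscan",   100, False),  # cheapest in isolation
    ("indexscan", 120, True),   # pricier, but already ordered
]

def plan_locally_then_glue():
    # Plan the branch at arm's length: pick its cheapest path while
    # ignoring the parent's sort requirement; the parent then adds a Sort.
    name, cost, is_sorted = min(branch_paths, key=lambda p: p[1])
    return name, cost + (0 if is_sorted else SORT_COST)

def plan_globally():
    # A memo-style search considers every (path, finishing step) pair
    # against the parent's requirement and keeps the cheapest overall.
    total, name = min(
        (cost + (0 if is_sorted else SORT_COST), name)
        for name, cost, is_sorted in branch_paths
    )
    return name, total

assert plan_locally_then_glue() == ("seqscan", 150)
assert plan_globally() == ("indexscan", 120)
```

The locally optimal seqscan plus a glued-on Sort costs 150, while the global search finds the pre-ordered indexscan at 120; crossing the subquery boundary is what makes the cheaper plan visible.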

Regards,
Qingqing

