Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-01 Thread Amit Kapila
On Thu, Feb 27, 2014 at 4:14 PM, Christian Kruse
christ...@2ndquadrant.com wrote:
 On 25/02/14 16:11, Robert Haas wrote:
 Reading this over, I'm not sure I understand why this is a CONTEXT at
 all and not just a DETAIL for the particular error message that it's
 supposed to be decorating.  Generally CONTEXT should be used for
 information that will be relevant to all errors in a given code path,
 and DETAIL for extra information specific to a particular error.

 Because there is more than one scenario where this context is useful,
 not just log_lock_wait messages.

 If we're going to stick with CONTEXT, we could rephrase it like this:

 CONTEXT: while attempting to lock tuple (1,2) in relation with OID 3456

 or when the relation name is known:

 CONTEXT: while attempting to lock tuple (1,2) in relation public.foo

 Accepted. Patch attached.

With new patch, the message while updating locked rows will be displayed
as below:

LOG:  process 364 still waiting for ShareLock on transaction 678 after 1014.000 ms
CONTEXT:  while attempting to lock tuple (0,2) with values (2) in relation public.t1 of database postgres

LOG:  process 364 acquired ShareLock on transaction 678 after 60036.000 ms
CONTEXT:  while attempting to lock tuple (0,2) with values (2) in relation public.t1 of database postgres

Now I am not sure the second message is an improvement, as it reads as
'while attempting to lock tuple, it acquired a share lock on the
transaction'. If you, Robert, or others feel it is okay, then we can
retain it as it is in the patch; otherwise I think we either need to
rephrase it or try some other way (a global variable) such that it
appears only in the cases that need it. I feel the way Robert has
suggested, i.e. making it a DETAIL of the particular message (we might
need to use a global variable to pass certain info), is the better way
and will have minimal impact on the cases where this additional
information needs to be displayed.
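To illustrate the two mechanisms under discussion, here is a minimal
sketch in the style of backend code (not the actual patch; the
LockWaitInfo struct and the function names are made up for
illustration):

#include "postgres.h"
#include "storage/block.h"
#include "storage/off.h"

/* hypothetical holder for the information we want reported */
typedef struct LockWaitInfo
{
	BlockNumber blkno;
	OffsetNumber offnum;
	const char *relname;
} LockWaitInfo;

/*
 * CONTEXT approach: while this callback sits on error_context_stack,
 * every message raised gets the CONTEXT line appended.
 */
static void
lock_wait_context_cb(void *arg)
{
	LockWaitInfo *info = (LockWaitInfo *) arg;

	errcontext("while attempting to lock tuple (%u,%u) in relation \"%s\"",
			   info->blkno, info->offnum, info->relname);
}

static void
wait_with_context(LockWaitInfo *info)
{
	ErrorContextCallback cb;

	cb.callback = lock_wait_context_cb;
	cb.arg = (void *) info;
	cb.previous = error_context_stack;
	error_context_stack = &cb;

	/* ... wait for the lock; any ereport() here picks up the context ... */

	error_context_stack = cb.previous;
}

The DETAIL approach would instead attach errdetail(...) to the one
log_lock_waits message only, which is what passing the info through a
global variable would enable.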

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: new long psql parameter --on-error-stop

2014-03-01 Thread Pavel Stehule
Hello

here is a prototype:

bash-4.1$ /usr/local/pgsql/bin/psql --help-variables
List of some variables (options) for use from the command line.
The complete list is in the psql section of the PostgreSQL documentation.

psql variables:
Usage:
  psql --set=NAME=VALUE
  or \set NAME VALUE in interactive mode

  AUTOCOMMIT         when on, successful SQL commands are automatically
                     committed
  COMP_KEYWORD_CASE  determines which letter case to use when completing
                     an SQL key word
  ECHO               all lines from input are written to standard output
  ECHO_HIDDEN        display queries for internal commands (same as -E
                     option)
  FETCH_COUNT        how many rows to fetch per page (default 0, unlimited)
  HISTFILE           file name used to store the history list
  HISTSIZE           the number of commands to store in the command history
  ON_ERROR_ROLLBACK  when on, roll back on error automatically
  ON_ERROR_STOP      when set, batch execution stops immediately after an
                     error
  VERBOSITY          controls verbosity of error reports [default, verbose,
                     terse]

Printing options:
Usage:
  psql --pset=NAME[=VALUE]
  or \pset NAME [VALUE] in interactive mode

  border             number selecting the border style
  fieldsep           specify field separator for unaligned output
  fieldsep_zero      set the field separator in unaligned mode to a zero
                     byte
  format             set output format [unaligned, aligned, wrapped, html,
                     latex, ...]
  linestyle          sets the border line drawing style [ascii, old-ascii,
                     unicode]
  null               sets the string to be printed in place of a null value
  pager              when the pager option is off, the pager program is not
                     used
  recordsep          specifies the record (line) separator to use in
                     unaligned output format
  recordsep_zero     set the record separator in unaligned output format to
                     a zero byte
  title              sets the table title for any subsequently printed
                     tables
  tuples_only        in tuples-only mode, only actual table data is shown

Environment options:
Usage:
  NAME=VALUE, [NAME=VALUE] psql ...
  or \setenv NAME [VALUE] in interactive mode

  COLUMNS            number of columns for wrapped format
  PAGER              the pager program to use
  PGHOST             same as the host connection parameter
  PGDATABASE         same as the dbname connection parameter
  PGUSER             same as the user connection parameter
  PGPASSWORD         sets the connection password
  PSQL_EDITOR, EDITOR, VISUAL  editor used by the \e and \ef commands
  PSQL_EDITOR_LINE_NUMBER_ARG  how to specify a line number when invoking
                     the editor
  PSQL_HISTORY       alternative location for the command history file
  PSQL_RC            alternative location of the user's .psqlrc file
  SHELL              command executed by the \! command
  TMPDIR             directory for storing temporary files

For more information consult the psql section in the PostgreSQL
documentation.

Regards

Pavel



2014-02-28 23:01 GMT+01:00 Andrew Dunstan and...@dunslane.net:


 On 02/28/2014 04:38 PM, Tom Lane wrote:

 Andrew Dunstan and...@dunslane.net writes:

 Well, then we just have to add more info to --help

 +1 for at least doing that. I found it annoying just the other day not
 to find it in psql's --help output, in a moment of brain fade when I
 forgot how to spell it. So it's not just beginners who can benefit, it's
 people like me whose memory occasionally goes awry.

 No objection in principle, but what are we talking about exactly?
 Adding some new backslash command that lists all the variables that have
 special meanings?



 That's a pretty good idea, especially if we give that command a command
 line option too, so something like

psql --special-variables

 would run that command and exit.

 Maybe I'm over-egging the pudding a bit ;-)

 cheers

 andrew

commit 82a6a61a11bdb76c00481abeccff42b4b532762e
Author: Pavel Stehule pavel.steh...@gmail.com
Date:   Sat Mar 1 09:34:47 2014 +0100

initial

diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index baa9417..fb77132 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -78,12 +78,13 @@ usage(void)
 	printf(_("  -f, --file=FILENAME  execute commands from file, then exit\n"));
 	printf(_("  -l, --list   list available databases, then exit\n"));
 	printf(_("  -v, --set=, --variable=NAME=VALUE\n"
-			"set psql variable NAME to VALUE\n"));
+			"set psql variable NAME to VALUE e.g.: -v ON_ERROR_STOP=1\n"));
 	printf(_("  -V, --version   output version information, then exit\n"));
 	printf(_("  -X, --no-psqlrc  do not read startup file (~/.psqlrc)\n"));
 	printf(_("  -1 (\"one\"), --single-transaction\n"
 			"execute as a single transaction (if non-interactive)\n"));
 	printf(_("  -?, --help   show this help, then exit\n"));
+	printf(_("  --help-variables list of available configuration variables (options), then exit\n"));
 
 	printf(_("\nInput 

[HACKERS] psql: show only failed queries

2014-03-01 Thread Pavel Stehule
Hello

I was asked how to show only failed queries in psql.

I don't think it is possible now, but the implementation is very simple.

What do you think about it?

bash-4.1$ psql postgres -v ECHO=error -f data.sql
INSERT 0 1
Time: 27.735 ms
INSERT 0 1
Time: 8.303 ms
psql:data.sql:3: ERROR:  value too long for type character varying(2)
insert into foo values('bbb');
Time: 0.178 ms
INSERT 0 1
Time: 8.285 ms
psql:data.sql:5: ERROR:  value too long for type character varying(2)
insert into foo values('ss');
Time: 0.422 ms

Regards

Pavel
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 3a820fa..8354f60 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -958,6 +958,12 @@ SendQuery(const char *query)
 		results = NULL;			/* PQclear(NULL) does nothing */
 	}
 
+	if (!OK && pset.echo == PSQL_ECHO_ERROR_QUERIES)
+	{
+		puts(query);
+		fflush(stdout);
+	}
+
 	/* If we made a temporary savepoint, possibly release/rollback */
 	if (on_error_rollback_savepoint)
 	{
diff --git a/src/bin/psql/settings.h b/src/bin/psql/settings.h
index 3e8328d..ed80179 100644
--- a/src/bin/psql/settings.h
+++ b/src/bin/psql/settings.h
@@ -30,6 +30,7 @@
 typedef enum
 {
 	PSQL_ECHO_NONE,
+	PSQL_ECHO_ERROR_QUERIES,
 	PSQL_ECHO_QUERIES,
 	PSQL_ECHO_ALL
 } PSQL_ECHO;
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 1061992..666261f 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -712,6 +712,8 @@ echo_hook(const char *newval)
 {
 	if (newval == NULL)
 		pset.echo = PSQL_ECHO_NONE;
+	else if (strcmp(newval, "error") == 0)
+		pset.echo = PSQL_ECHO_ERROR_QUERIES;
 	else if (strcmp(newval, "queries") == 0)
 		pset.echo = PSQL_ECHO_QUERIES;
 	else if (strcmp(newval, "all") == 0)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 1d69b95..641cff3 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3397,6 +3397,23 @@ psql_completion(char *text, int start, int end)
 	{
 		matches = complete_from_variables(text, "", "");
 	}
+	else if (strcmp(prev2_wd, "\\set") == 0)
+	{
+		if (strcmp(prev_wd, "ECHO") == 0)
+		{
+			static const char *const my_list[] =
+			{"none", "error", "queries", "all", NULL};
+
+			COMPLETE_WITH_LIST_CS(my_list);
+		}
+		else if (strcmp(prev_wd, "ECHO_HIDDEN") == 0)
+		{
+			static const char *const my_list[] =
+			{"noexec", "off", "on", NULL};
+
+			COMPLETE_WITH_LIST_CS(my_list);
+		}
+	}
 	else if (strcmp(prev_wd, "\\sf") == 0 || strcmp(prev_wd, "\\sf+") == 0)
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_functions, NULL);
 	else if (strcmp(prev_wd, "\\cd") == 0 ||

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Stephen Frost
KaiGai,

* Kohei KaiGai (kai...@kaigai.gr.jp) wrote:
 BTW, this kind of discussion looks like a talk with a ghost, because
 we cannot see the new interface for parallel execution right now, so
 we cannot concretely investigate whether it would really be a serious
 backward incompatibility or not.

Yeah, it would certainly be nice if we had all of the answers up-front.
What I keep hoping for is that someone who has been working on this area
(eg: Robert) would speak up...

 My bet is it will be a minor one. I cannot imagine a plan-node
 interface that does not support the existing non-parallel SeqScan or
 NestLoop and so on.

Sure you can- because once we change the interface, we're probably going
to go through and make everything use the new one rather than have to
special-case things.  That's more-or-less exactly my point here because
having an external hook like CustomScan would make that kind of
wholesale change more difficult.

That does *not* mean I'm against using GPUs and GPU optimizations.  What
it means is that I'd rather see that done in core, which would allow us
to simply change that interface along with the rest when doing wholesale
changes and not have to worry about backwards compatibility and breaking
other people's code.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] gaussian distribution pgbench

2014-03-01 Thread Alvaro Herrera
It seems that in the review so far, Fabien has focused mainly on the
mathematical properties of the new random number generation.  That seems
perfectly fine, but no comment has been made about the chosen UI for the
feature.  Per the few initial messages in the thread, in the patch as
submitted you ask for a gaussian random number by using \setgaussian,
and an exponential one via \setexp.  Is this the right UI?  Currently you
get an evenly distributed number with \setrandom.  There is nothing
about \setgaussian by itself that makes it obvious it produces random
numbers.  Perhaps we should simply add a new argument to \setrandom,
instead of creating new commands for each distribution?  I would guess
that, in the future, we're going to want other distributions as well.

Not sure what it would look like; perhaps
\setrandom foo 1 10 gaussian
or 
\setrandom foo 1 10 dist=gaussian
or
\setrandom(gaussian) foo 1 10
or
\setrandom(dist=gaussian) foo 1 10

I think we could easily support

\set distrib gaussian
\setrandom(dist=:distrib) foo 1 10

so that it can be changed for a bunch of commands easily.

Or maybe I'm going overboard, everybody else is happy with \setgaussian,
and should just use that?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-03-01 Thread Alvaro Herrera
I didn't check the patch in detail, but it seems to me that both the
encode stuff as well as pgrand belong in src/common rather than
src/port.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-03-01 Thread Noah Misch
On Sat, Mar 01, 2014 at 12:48:08PM -0300, Alvaro Herrera wrote:
 I didn't check the patch in detail, but it seems to me that both the
 encode stuff as well as pgrand belong in src/common rather than
 src/port.

Since src/common exists only in 9.3 and up, that would mean putting them in
different libraries depending on the branch.  I did not think that worthwhile.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] trgm regex index peculiarity

2014-03-01 Thread Alexander Korotkov
On Mon, Feb 10, 2014 at 1:01 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Alexander Korotkov aekorot...@gmail.com writes:
  On Thu, Jan 16, 2014 at 3:34 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  I looked at this patch a bit.  It seems like this:
  +  *	BLANK_COLOR_SIZE - How much more frequent a blank character is
  +  *	than other characters on average
  + #define BLANK_COLOR_SIZE	32
  is a drastic overestimate.

  OK, let me give more reasoning for the penalty.
  Let's consider word of length n.
  It contains n+1 trigrams including:
  1 __x trigram
  1 _xx trigram
  1 xx_ trigram
  n - 2 xxx trigrams

  Assuming an alphabet of size m, those trigrams have the following
 average frequencies:
  __x 1/((n+1)*m)
  _xx 1/((n+1)*m^2)
  xx_ 1/((n+1)*m^2)
  xxx (n-2)/((n+1)*m^3)

 Hmm, but you're assuming that the m letters are all equally frequent,
 which is surely not true in normal text.

 I didn't have a machine-readable copy of Hemingway or Tolstoy at hand,
 but I do have a text file with the collected works of H.P. Lovecraft,
 so I tried analyzing the trigrams in that.  I find that
 * The average word length is 4.78 characters.
 * There are 5652 distinct trigrams (discounting some containing digits),
 the most common one ('  t') occurring 81222 times; the average
 occurrence count is 500.0.
 * Considering only trigrams not containing any blanks, there are
 4937 of them, the most common one ('the') occurring 45832 times,
 with an average count of 266.9.
 * There are (unsurprisingly) exactly 26 trigrams of the form '  x',
 with an average count of 19598.5.
 * There are 689 trigrams of the form ' xx' or 'xx ', the most common one
 (' th') occurring 58200 times, with an average count of 1450.1.

 So, we've rediscovered the fact that 'the' is the most common word in
 English text ;-).  But I think the significant conclusion for our purposes
 here is that single-space trigrams are on average about 1450.1/266.9 = 5.4
 times more common than the average space-free trigram, and two-space
 trigrams about 19598.5/266.9 = 73.4 times more common.

 So this leads me to the conclusion that we should be using a
 BLANK_COLOR_SIZE value around 5 or 6 (with maybe something other than
 a square-law rule for two-space trigrams).  Even using your numbers,
 it shouldn't be 32.


Next revision of the patch is attached. The changes are:
1) The notion of penalty is used instead of size.
2) We try to reduce the total penalty to WISH_TRGM_PENALTY, under the
restriction that the total trigram count stay below MAX_TRGM_COUNT.
3) Penalties are assigned to particular color trigram classes, i.e.
separate penalties for __a, _aa, _a_, aa_. This is based on an analysis
of trigram frequencies in Oscar Wilde's writings. We could end up with
different numbers, but I don't think they will be dramatically different.
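For reference, the ratios in Tom's counts above, which motivate penalty
values far below 32, are trivial to recompute:

#include <stdio.h>

int
main(void)
{
	double		avg_nospace = 266.9;	/* avg count, space-free trigrams */
	double		avg_onespace = 1450.1;	/* avg count, one-space trigrams */
	double		avg_twospace = 19598.5; /* avg count, two-space trigrams */

	printf("one-space / no-space: %.1f\n", avg_onespace / avg_nospace); /* 5.4 */
	printf("two-space / no-space: %.1f\n", avg_twospace / avg_nospace); /* 73.4 */
	return 0;
}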

--
With best regards,
Alexander Korotkov.


trgm-regex-optimize.3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] gaussian distribution pgbench

2014-03-01 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 It seems that in the review so far, Fabien has focused mainly on the
 mathematical properties of the new random number generation.  That seems
 perfectly fine, but no comment has been made about the chosen UI for the
 feature.  Per the few initial messages in the thread, in the patch as
 submitted you ask for a gaussian random number by using \setgaussian,
 and an exponential one via \setexp.  Is this the right UI?  Currently you
 get an evenly distributed number with \setrandom.  There is nothing
 about \setgaussian by itself that makes it obvious it produces random
 numbers.  Perhaps we should simply add a new argument to \setrandom,
 instead of creating new commands for each distribution?  I would guess
 that, in the future, we're going to want other distributions as well.

+1 for an argument to \setrandom instead of separate commands.

 Not sure what it would look like; perhaps
 \setrandom foo 1 10 gaussian

FWIW, I think this style is sufficient; the others seem overcomplicated
for not much gain.  I'm not strongly attached to that position though.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] commit fest status and release timeline

2014-03-01 Thread Peter Eisentraut
Status Summary. Needs Review: 36, Waiting on Author: 7, Ready for
Committer: 16, Committed: 43, Returned with Feedback: 8, Rejected: 4.
Total: 114.

We're still on track to achieve about 50% committed patches, which would
be similar to the previous few commit fests.  So decent job so far.

Which brings us to important news.  The core team has agreed on a
release timeline:

- Mar 15 end commit fest
- Apr 15 feature freeze
- May 15 beta

This is similar to the last few years, so it shouldn't come as a shock
to anyone.

Let's use the remaining two weeks to give all patches in the commit fest
fair consideration and a decent review.  The time to reject or postpone
patches will inevitably come in the time between the end of the commit
fest and feature freeze.  Note that it is everyone's individual
responsibility to move their favorite patch forward.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to add support of IF NOT EXISTS to others CREATE statements

2014-03-01 Thread Tom Lane
Fabrízio de Royes Mello fabriziome...@gmail.com writes:
 On Sat, Jan 18, 2014 at 11:12 PM, Stephen Frost sfr...@snowman.net wrote:
 Fabrízio, can you clarify the use-case for things like CREATE AGGREGATE
 to have IF NOT EXISTS rather than OR REPLACE, or if there is a reason
 why both should exist?  Complicating our CREATE options is not something
 we really wish to do without good reason and we certainly don't want to
 add something now that we'll wish to remove in another version or two.

 Well, I have a scenario with many servers to deploy DDL scripts to, and
 most of them we must run without transaction control, because of tasks
 like CREATE INDEX CONCURRENTLY, DROP/CREATE DATABASE, CLUSTER, etc.

 When an error occurs the script stops, but the previous commands were
 already committed, so we must review the script to comment out the parts
 that were already executed and then run it again. Until now this has not
 been a real problem, but in some cases we must deploy another DDL script
 containing a new version of some object before we have finished fixing
 the previous version that is in production, and if we had CINE for all
 CREATE objects this task would be easier, because we could just run the
 script again without caring whether it replaces the content, and it
 would not produce an error.

Why wouldn't COR semantics answer that requirement just as well, if not
better?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-03-01 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 As announced with last week's releases, use of trust authentication in the
 make check temporary database cluster makes it straightforward to hijack the
 OS user account involved.  The prerequisite is another user account on the
 same system.  The solution we discussed on secur...@postgresql.org was to
 switch to md5 authentication with a generated, strong password.

Noah is leaving the impression that there was consensus on that approach;
there was not, which is one reason that this patch didn't simply get
committed last week.

There are two big problems with the lets-generate-a-random-password
approach.  Noah acknowledged the portability issue of possibly not having
a strong entropy source available.  The other issue though is whether
doing this doesn't introduce enough crypto dependency into the core code
to create an export-restrictions hazard.  We've kept contrib/pgcrypto
out of core all these years in order to give people a relatively
straightforward solution if they are worried about export laws: just omit
contrib/pgcrypto.  But "just omit regression testing" isn't palatable.

In the case of Unix systems, there is a *far* simpler and more portable
solution technique, which is to tell the test postmaster to put its socket
in some non-world-accessible directory created by the test scaffolding.

Of course that doesn't work for Windows, which is why we looked at the
random-password solution.  But I wonder whether we shouldn't use the
nonstandard-socket-location approach everywhere else, and only use random
passwords on Windows.  That would greatly reduce the number of cases to
worry about for portability of the password-generation code; and perhaps
we could also push the crypto issue into reliance on some Windows-supplied
functionality (though I'm just speculating about that part).

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-03-01 Thread Andrew Dunstan


On 03/01/2014 12:29 PM, Tom Lane wrote:



In the case of Unix systems, there is a *far* simpler and more portable
solution technique, which is to tell the test postmaster to put its socket
in some non-world-accessible directory created by the test scaffolding.



+1 - I'm all for KISS.



Of course that doesn't work for Windows, which is why we looked at the
random-password solution.  But I wonder whether we shouldn't use the
nonstandard-socket-location approach everywhere else, and only use random
passwords on Windows.  That would greatly reduce the number of cases to
worry about for portability of the password-generation code; and perhaps
we could also push the crypto issue into reliance on some Windows-supplied
functionality (though I'm just speculating about that part).



See for example 
http://msdn.microsoft.com/en-us/library/windows/desktop/aa379942%28v=vs.85%29.aspx


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-03-01 Thread Magnus Hagander
On Sat, Mar 1, 2014 at 7:09 PM, Andrew Dunstan and...@dunslane.net wrote:


 On 03/01/2014 12:29 PM, Tom Lane wrote:


 In the case of Unix systems, there is a *far* simpler and more portable
 solution technique, which is to tell the test postmaster to put its socket
 in some non-world-accessible directory created by the test scaffolding.



 +1 - I'm all for KISS.



 Of course that doesn't work for Windows, which is why we looked at the
 random-password solution.  But I wonder whether we shouldn't use the
 nonstandard-socket-location approach everywhere else, and only use random
 passwords on Windows.  That would greatly reduce the number of cases to
 worry about for portability of the password-generation code; and perhaps
 we could also push the crypto issue into reliance on some Windows-supplied
 functionality (though I'm just speculating about that part).



 See for example http://msdn.microsoft.com/en-us/library/windows/desktop/
 aa379942%28v=vs.85%29.aspx


For a one-off password used locally only, we could also consider just using
a guid, and generate it using
http://msdn.microsoft.com/en-us/library/windows/desktop/aa379205(v=vs.85).aspx.
Obviously windows only though - do we have *any* Unix platforms that can't
do unix sockets?
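A minimal sketch of that idea (Windows-only; link with rpcrt4 - and note
this assumes, as discussed, that UuidCreate's output is random enough):

#include <stdio.h>
#include <rpc.h>

int
main(void)
{
	UUID		u;
	RPC_CSTR	str = NULL;

	/* generate a GUID and render it as a printable string */
	if (UuidCreate(&u) != RPC_S_OK ||
		UuidToStringA(&u, &str) != RPC_S_OK)
		return 1;
	printf("candidate one-off password: %s\n", str);
	RpcStringFreeA(&str);
	return 0;
}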

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-03-01 Thread Noah Misch
On Sat, Mar 01, 2014 at 12:29:38PM -0500, Tom Lane wrote:
 There are two big problems with the lets-generate-a-random-password
 approach.  Noah acknowledged the portability issue of possibly not having
 a strong entropy source available.  The other issue though is whether
 doing this doesn't introduce enough crypto dependency into the core code
 to create an export-restrictions hazard.  We've kept contrib/pgcrypto
 out of core all these years in order to give people a relatively
 straightforward solution if they are worried about export laws: just omit
 contrib/pgcrypto.  But "just omit regression testing" isn't palatable.

If pgrand.c poses an export control problem, then be-secure.c poses a greater
one.  pgrand.c is just glue to an OS-provided CSPRNG.

 In the case of Unix systems, there is a *far* simpler and more portable
 solution technique, which is to tell the test postmaster to put its socket
 in some non-world-accessible directory created by the test scaffolding.
 
 Of course that doesn't work for Windows, which is why we looked at the
 random-password solution.  But I wonder whether we shouldn't use the
 nonstandard-socket-location approach everywhere else, and only use random
 passwords on Windows.  That would greatly reduce the number of cases to
 worry about for portability of the password-generation code;

Further worrying about systems lacking /dev/urandom is a waste of time.  They
will only get less common, and they are already rare enough that we have zero
of them on the buildfarm.  I provided them with a straightforward workaround:
point the PGTESTPWFILE environment variable to a file containing a password.

Having that said, I can appreciate the value of tightening the socket mode for
a bit of *extra* safety:

--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -2299,4 +2299,5 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc
 	fputs("\n# Configuration added by pg_regress\n\n", pg_conf);
 	fputs("max_prepared_transactions = 2\n", pg_conf);
+	fputs("unix_socket_permissions = 0700\n", pg_conf);

Taking the further step of retaining trust authentication on Unix would make
it easier to commit tests guaranteed to fail on Windows.  I see no benefit
cancelling that out.

 and perhaps
 we could also push the crypto issue into reliance on some Windows-supplied
 functionality (though I'm just speculating about that part).

Every version of the patch has done it that way.  It has used OS-supplied
cryptography on every platform.
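(On the Unix side, such glue can be as simple as reading the kernel
CSPRNG; a generic illustration, not the patch's actual code:)

#include <stdio.h>

/* fill buf with len bytes from the OS CSPRNG; returns 1 on success */
static int
read_urandom(void *buf, size_t len)
{
	FILE	   *f = fopen("/dev/urandom", "rb");
	int			ok;

	if (f == NULL)
		return 0;
	ok = (fread(buf, 1, len, f) == len);
	fclose(f);
	return ok;
}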

nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] commit fest status and release timeline

2014-03-01 Thread Josh Berkus
On 03/01/2014 09:01 AM, Peter Eisentraut wrote:
 Status Summary. Needs Review: 36, Waiting on Author: 7, Ready for
 Committer: 16, Committed: 43, Returned with Feedback: 8, Rejected: 4.
 Total: 114.
 
 We're still on track to achieve about 50% committed patches, which would
 be similar to the previous few commit fests.  So decent job so far.

So, other than Hstore2/JSONB and Logical Changesets, what are the
big/difficult patches left?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] `pg_dump -Fd` doesn't check write return status...

2014-03-01 Thread Sean Chittenden
The attached patch fixes the case when `pg_dump -Fd …` is called on a partition 
where write(2) fails for some reason or another. In this case, backup jobs were 
returning with a successful exit code even though most of the files in the dump 
directory were all zero length.

I haven’t tested this patch’s failure conditions but the fix seems simple 
enough: cfwrite() needs to have its return status checked everywhere and 
exit_horribly() upon any failure. In this case, callers of _WriteData() were 
not checking the return status and were discarding the negative return status 
(e.g. ENOSPC).

I made a cursory pass over the code and found one other instance where write 
status wasn’t being checked and also included that.

-sc




pg_dump_check_write.patch
Description: Binary data


--
Sean Chittenden
s...@chittenden.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe

2014-03-01 Thread Vik Fearing
On 03/01/2014 12:06 PM, Simon Riggs wrote:
 On 27 February 2014 08:48, Simon Riggs si...@2ndquadrant.com wrote:
 On 26 February 2014 15:25, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-02-26 15:15:00 +, Simon Riggs wrote:
 On 26 February 2014 13:38, Andres Freund and...@2ndquadrant.com wrote:
 Hi,

 On 2014-02-26 07:32:45 +, Simon Riggs wrote:
 * This definitely should include isolationtester tests actually
   performing concurrent ALTER TABLEs. All that's currently there is
   tests that the locklevel isn't too high, but not that it actually 
 works.
 There is no concurrent behaviour here, hence no code that would be
 exercised by concurrent tests.
 Huh? There's most definitely new concurrent behaviour. Previously no
 other backends could have a relation open (and locked) while it got
 altered (which then sends out relcache invalidations). That's something
 that should be tested.
 It has been. High volume concurrent testing has been performed, per
 Tom's original discussion upthread, but that's not part of the test
 suite.
 Yea, that's not what I am looking for.

 For other tests I have no guide as to how to write a set of automated
 regression tests. Anything could cause a failure, so I'd need to write
 an infinite set of tests to prove there is no bug *somewhere*. How
 many tests are required? 0, 1, 3, 30?
 I think some isolationtester tests for the most important changes in
 lock levels are appropriate. Say, create a PRIMARY KEY, DROP INHERIT,
 ...  while a query is in progress in another session.
 OK, I'll work on some tests.

 v18 attached, with v19 coming soon
 v19 complete apart from requested comment additions

I've started to look at this patch and re-read the thread.  The first
thing I noticed is what seems like an automated replace error.  The docs
say "This form requires only an SHARE x EXCLUSIVE lock." where the "an"
was not changed to "a".

Attached is a patch-on-patch to fix this.  A more complete review will
come later.

-- 
Vik

*** a/doc/src/sgml/ref/alter_table.sgml
--- b/doc/src/sgml/ref/alter_table.sgml
*************** ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
*** 157,163 ****
      table to change.
     </para>
     <para>
!     This form requires only an <literal>SHARE ROW EXCLUSIVE</literal> lock.
     </para>
    </listitem>
  </varlistentry>
--- 157,163 ----
      table to change.
     </para>
     <para>
!     This form requires only a <literal>SHARE ROW EXCLUSIVE</literal> lock.
     </para>
    </listitem>
  </varlistentry>
*************** ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
*** 188,194 ****
      <xref linkend="planner-stats">.
     </para>
     <para>
!     This form requires only an <literal>SHARE UPDATE EXCLUSIVE</literal> lock.
     </para>
    </listitem>
  </varlistentry>
--- 188,194 ----
      <xref linkend="planner-stats">.
     </para>
     <para>
!     This form requires only a <literal>SHARE UPDATE EXCLUSIVE</literal> lock.
     </para>
    </listitem>
  </varlistentry>
*************** ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
*** 223,229 ****
      planner, refer to <xref linkend="planner-stats">.
     </para>
     <para>
!     This form requires only an <literal>SHARE UPDATE EXCLUSIVE</literal> lock.
     </para>
    </listitem>
  </varlistentry>
--- 223,229 ----
      planner, refer to <xref linkend="planner-stats">.
     </para>
     <para>
!     This form requires only a <literal>SHARE UPDATE EXCLUSIVE</literal> lock.
     </para>
    </listitem>
  </varlistentry>
*************** ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
*** 344,350 ****
      created. Currently only foreign key constraints may be altered.
     </para>
     <para>
!     This form requires only an <literal>SHARE ROW EXCLUSIVE</literal> lock.
     </para>
    </listitem>
  </varlistentry>
--- 344,350 ----
      created. Currently only foreign key constraints may be altered.
     </para>
     <para>
!     This form requires only a <literal>SHARE ROW EXCLUSIVE</literal> lock.
     </para>
    </listitem>
  </varlistentry>
*************** ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
*** 366,372 ****
      does not prevent normal write commands against the table while it runs.
     </para>
     <para>
!     This form requires only an <literal>SHARE UPDATE EXCLUSIVE</literal> lock
      on the table being altered. If the constraint is a foreign key then
      a <literal>ROW SHARE</literal> lock is also required on
      the table referenced by the constraint.
--- 366,372 ----
      does not prevent normal write commands against the table while it runs.
     </para>
     <para>
!     This form requires only a <literal>SHARE UPDATE EXCLUSIVE</literal> lock
      on the table being altered. If the constraint is a foreign key then
      a <literal>ROW SHARE</literal> lock is also required on
      the table referenced by the constraint.
***************
*** 

Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-03-01 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 In the case of Unix systems, there is a *far* simpler and more portable
 solution technique, which is to tell the test postmaster to put its socket
 in some non-world-accessible directory created by the test scaffolding.

Yes, yes, yes.

 Of course that doesn't work for Windows, which is why we looked at the
 random-password solution.  But I wonder whether we shouldn't use the
 nonstandard-socket-location approach everywhere else, and only use random
 passwords on Windows.  That would greatly reduce the number of cases to
 worry about for portability of the password-generation code; and perhaps
 we could also push the crypto issue into reliance on some Windows-supplied
 functionality (though I'm just speculating about that part).

Multi-user Windows build systems are *far* more rare than unix
equivalents (though even those are semi-rare these days w/ all the
VMs running around; but still, you may have university common unix
systems with students building PG- the same just doesn't exist in my
experience on the Windows side).

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-03-01 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 For a one-off password used locally only, we could also consider just using
 a guid, and generate it using
 http://msdn.microsoft.com/en-us/library/windows/desktop/aa379205(v=vs.85).aspx.

Not sure if that API is intended to create an unpredictable UUID, rather
than just a unique one.

 Obviously windows only though - do we have *any* Unix platforms that can't
 do unix sockets?

I'm not aware of any.  A look into the git history of pg_config_manual.h
shows that QNX and BEOS used to be marked as not having Unix sockets,
but of course we dropped support for those in 2006.  I'd rather bet on
"all non-Windows platforms have Unix sockets" than "all non-Windows
platforms have /dev/urandom"; the former standard is far older than
the latter.

One other thought here: is it actually reasonable to expend a lot of effort
on the Windows case?  I'm not aware that people normally expect a Windows
box to have multiple users at all, let alone non-mutually-trusting users.

BTW, a different problem with the proposed patch is that it changes
some test cases in ecpg and contrib/dblink, apparently to avoid session
reconnections.  That seems likely to me to be losing test coverage.
Perhaps there is no alternative, but I'd like to have some discussion
around that point as well.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to add support of IF NOT EXISTS to others CREATE statements

2014-03-01 Thread Tom Lane
Fabrízio de Royes Mello fabriziome...@gmail.com writes:
 On Sat, Mar 1, 2014 at 2:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 [ re schema upgrade scenarios ]
 Why wouldn't COR semantics answer that requirement just as well, if not
 better?

 Just because it will replace the object content... and in some cases this
 cannot happen because it will regress the schema to an old version.

That argument seems awfully darn flimsy.  On what grounds would you argue
that the script you're sourcing contains versions you want of objects that
aren't there, but not versions you want of objects that are there?  If
the script is out of date, it seems more likely that you'd end up with
back-rev versions of the newly created objects, which very possibly won't
interact well with the newer objects that were already in the database.

In any case, given the existence of DO it's simple to code up
create-if-not-exists behavior with a couple lines of plpgsql; that seems
to me to be a sufficient answer for corner cases.  create-or-replace is
not equivalently fakable if the system doesn't supply the functionality.
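(For example, something along these lines fakes CINE for an aggregate;
the catalog test is just one plausible way to write it:)

DO $$
BEGIN
    IF NOT EXISTS (SELECT 1
                   FROM pg_proc p
                   JOIN pg_namespace n ON n.oid = p.pronamespace
                   WHERE p.proname = 'myagg'
                     AND n.nspname = 'public'
                     AND p.proisagg)
    THEN
        CREATE AGGREGATE public.myagg(int4) (sfunc = int4pl, stype = int4);
    END IF;
END
$$;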

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to add support of IF NOT EXISTS to others CREATE statements

2014-03-01 Thread Fabrízio de Royes Mello
On Sat, Mar 1, 2014 at 2:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:

  Fabrízio de Royes Mello fabriziome...@gmail.com writes:
  On Sat, Jan 18, 2014 at 11:12 PM, Stephen Frost sfr...@snowman.net
wrote:
  Fabrízio, can you clarify the use-case for things like CREATE AGGREGATE
  to have IF NOT EXISTS rather than OR REPLACE, or if there is a reason
  why both should exist?  Complicating our CREATE options is not
something
  we really wish to do without good reason and we certainly don't want to
  add something now that we'll wish to remove in another version or two.

  Well, I have a scenario with many servers to deploy DDL scripts to,
 and most of them we must run without transaction control, because of
 tasks like CREATE INDEX CONCURRENTLY, DROP/CREATE DATABASE, CLUSTER,
 etc.

  When an error occurs the script stops, but the previous commands were
 already committed, so we must review the script to comment out the
 parts that were already executed and then run it again. Until now this
 has not been a real problem, but in some cases we must deploy another
 DDL script containing a new version of some object before we have
 finished fixing the previous version that is in production, and if we
 had CINE for all CREATE objects this task would be easier, because we
 could just run the script again without caring whether it replaces the
 content, and it would not produce an error.

 Why wouldn't COR semantics answer that requirement just as well, if not
 better?


Just because it will replace the object content... and in some cases
that cannot happen, because it would regress the schema to an old
version.

I know it's a very specific use case, but in a scenario with many
servers and many automated tasks in different pipelines, CINE will be
very useful. I have this kind of trouble mostly with functions (we use
COR), and sometimes we discover that the production version of a
function is wrong only after we receive a user notification, and in
that situation we often spend a lot of effort fixing the damage.

Grettings,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] commit fest status and release timeline

2014-03-01 Thread Vik Fearing
On 03/01/2014 07:50 PM, Josh Berkus wrote:
 On 03/01/2014 09:01 AM, Peter Eisentraut wrote:
 Status Summary. Needs Review: 36, Waiting on Author: 7, Ready for
 Committer: 16, Committed: 43, Returned with Feedback: 8, Rejected: 4.
 Total: 114.

 We're still on track to achieve about 50% committed patches, which would
 be similar to the previous few commit fests.  So decent job so far.
 So, other than Hstore2/JSONB and Logical Changesets, what are the
 big/difficult patches left?

For me, I'd really like to see the reduced locks on ALTER TABLE. 

-- 
Vik



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-03-01 Thread Andrew Dunstan


On 03/01/2014 05:10 PM, Tom Lane wrote:


One other thought here: is it actually reasonable to expend a lot of effort
on the Windows case?  I'm not aware that people normally expect a Windows
box to have multiple users at all, let alone non-mutually-trusting users.



As Stephen said, it's fairly unusual. There are usually quite a few 
roles, but it's rare to have more than one human-type role connected 
to the machine at a given time.


I'd be happy doing nothing in this case, or not very much, e.g. provide 
a password but not one of great cryptographic strength.




BTW, a different problem with the proposed patch is that it changes
some test cases in ecpg and contrib/dblink, apparently to avoid session
reconnections.  That seems likely to me to be losing test coverage.
Perhaps there is no alternative, but I'd like to have some discussion
around that point as well.





Yeah. Assuming we make the changes you're suggesting, that should no 
longer be necessary, right?


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: new long psql parameter --on-error-stop

2014-03-01 Thread Fabrízio de Royes Mello
On Sat, Mar 1, 2014 at 5:37 AM, Pavel Stehule pavel.steh...@gmail.com
wrote:

 Hello

 here is a prototype:

 bash-4.1$ /usr/local/pgsql/bin/psql --help-variables
 List of some variables (options) for use from the command line.
 The complete list is in the psql section of the PostgreSQL documentation.

 psql variables:
 Usage:
   psql --set=NAME=VALUE
   or \set NAME VALUE in interactive mode

   AUTOCOMMIT         when on, successful SQL commands are automatically
                      committed
   COMP_KEYWORD_CASE  determines which letter case to use when completing
                      an SQL key word
   ECHO               all lines from input are written to standard output
   ECHO_HIDDEN        display queries for internal commands (same as -E
                      option)
   FETCH_COUNT        how many rows to fetch per page (default 0,
                      unlimited)
   HISTFILE           file name used to store the history list
   HISTSIZE           the number of commands to store in the command
                      history
   ON_ERROR_ROLLBACK  when on, roll back on error automatically
   ON_ERROR_STOP      when set, batch execution stops immediately after an
                      error
   VERBOSITY          controls verbosity of error reports [default,
                      verbose, terse]

 Printing options:
 Usage:
   psql --pset=NAME[=VALUE]
   or \pset NAME [VALUE] in interactive mode

   border             number selecting the border style
   fieldsep           specify field separator for unaligned output
   fieldsep_zero      set the field separator in unaligned mode to a zero
                      byte
   format             set output format [unaligned, aligned, wrapped,
                      html, latex, ...]
   linestyle          sets the border line drawing style [ascii,
                      old-ascii, unicode]
   null               sets the string to be printed in place of a null
                      value
   pager              when the pager option is off, the pager program is
                      not used
   recordsep          specifies the record (line) separator to use in
                      unaligned output format
   recordsep_zero     set the record separator in unaligned output format
                      to a zero byte
   title              sets the table title for any subsequently printed
                      tables
   tuples_only        in tuples-only mode, only actual table data is shown

 Environment options:
 Usage:
   NAME=VALUE, [NAME=VALUE] psql ...
   or \setenv NAME [VALUE] in interactive mode

   COLUMNS            number of columns for wrapped format
   PAGER              the pager program to use
   PGHOST             same as the host connection parameter
   PGDATABASE         same as the dbname connection parameter
   PGUSER             same as the user connection parameter
   PGPASSWORD         sets the connection password
   PSQL_EDITOR, EDITOR, VISUAL  editor used by the \e and \ef commands
   PSQL_EDITOR_LINE_NUMBER_ARG  how to specify a line number when invoking
                      the editor
   PSQL_HISTORY       alternative location for the command history file
   PSQL_RC            alternative location of the user's .psqlrc file
   SHELL              command executed by the \! command
   TMPDIR             directory for storing temporary files

 For more information consult the psql section in the PostgreSQL
 documentation.


The patch is ok (it applies to master without errors).

Maybe we should also show the possible values for each variable/option.

Thinking more about it, it would be nice if we had the possibility to
show help for commands too. Something like this:

$ psql -H vacuum
Command: VACUUM
Description: garbage-collect and optionally analyze a database
Syntax:
VACUUM [ ( { FULL | FREEZE | VERBOSE | ANALYZE } [, ...] ) ] [ table_name [
(column_name [, ...] ) ] ]
VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ table_name ]
VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] ANALYZE [ table_name [ (column_name
[, ...] ) ] ]

$ psql --help-command=vacuum
Command: VACUUM
Description: garbage-collect and optionally analyze a database
Syntax:
VACUUM [ ( { FULL | FREEZE | VERBOSE | ANALYZE } [, ...] ) ] [ table_name [
(column_name [, ...] ) ] ]
VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ table_name ]
VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] ANALYZE [ table_name [ (column_name
[, ...] ) ] ]

It's only an idea that occurred to me reading this thread!

Grettings,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Review: Patch FORCE_NULL option for copy COPY in CSV mode

2014-03-01 Thread Andrew Dunstan


On 01/29/2014 10:59 AM, Ian Lawrence Barwick wrote:

2014/1/29 Ian Lawrence Barwick barw...@gmail.com:

2014-01-29 Andrew Dunstan and...@dunslane.net:

On 01/28/2014 05:55 AM, Ian Lawrence Barwick wrote:


Hi Payal

Many thanks for the review, and my apologies for not getting back to
you earlier.

Updated version of the patch attached with suggested corrections.

On a very quick glance, I see that you have still not made adjustments to
contrib/file_fdw to accommodate this new option. I don't see why this COPY
option should be different in that respect.

Hmm, that idea seems to have escaped me completely. I'll get onto it forthwith.

Striking while the keyboard is hot... version with contrib/file_fdw
modifications
attached.






I have reviewed this. Generally it's good, but the author has made a 
significant error - the idea is not to force a quoted empty string to 
null, but to force a quoted null string to null, whatever the null 
string might be. The default case has these the same, but if you specify 
a non-empty null string they aren't.
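For instance, with the option as proposed and a null string of 'NULL'
(hypothetical table and file), a quoted "NULL" field in a FORCE_NULL
column is read back as null, while a quoted empty string stays an empty
string:

COPY t (a, b) FROM '/tmp/data.csv'
    WITH (FORMAT csv, NULL 'NULL', FORCE_NULL (b));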


That difference actually made the file_fdw regression results plain 
wrong, in my view, in that they expected a quoted empty string to be 
turned to null even when the null string was something else.


I've adjusted this and the docs and propose to apply the attached patch 
in the next day or two unless there are any objections.


cheers

andrew



diff --git a/contrib/file_fdw/data/text.csv b/contrib/file_fdw/data/text.csv
index ed348a9..f55d9cf 100644
--- a/contrib/file_fdw/data/text.csv
+++ b/contrib/file_fdw/data/text.csv
@@ -1,4 +1,5 @@
-AAA,aaa
-XYZ,xyz
-NULL,NULL
-ABC,abc
+AAA,aaa,123,
+XYZ,xyz,,321
+NULL,NULL,NULL,NULL
+NULL,NULL,NULL,NULL
+ABC,abc,,
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
index 5639f4d..7fb1dbc 100644
--- a/contrib/file_fdw/file_fdw.c
+++ b/contrib/file_fdw/file_fdw.c
@@ -48,9 +48,9 @@ struct FileFdwOption
 
 /*
  * Valid options for file_fdw.
- * These options are based on the options for COPY FROM command.
- * But note that force_not_null is handled as a boolean option attached to
- * each column, not as a table option.
+ * These options are based on the options for the COPY FROM command.
+ * But note that force_not_null and force_null are handled as boolean options
+ * attached to a column, not as table options.
  *
  * Note: If you are adding new option for user mapping, you need to modify
  * fileGetOptions(), which currently doesn't bother to look at user mappings.
@@ -69,7 +69,7 @@ static const struct FileFdwOption valid_options[] = {
 	{"null", ForeignTableRelationId},
 	{"encoding", ForeignTableRelationId},
 	{"force_not_null", AttributeRelationId},
-
+	{"force_null", AttributeRelationId},
 	/*
 	 * force_quote is not supported by file_fdw because it's for COPY TO.
 	 */
@@ -187,6 +187,7 @@ file_fdw_validator(PG_FUNCTION_ARGS)
 	Oid			catalog = PG_GETARG_OID(1);
 	char	   *filename = NULL;
 	DefElem    *force_not_null = NULL;
+	DefElem    *force_null = NULL;
 	List	   *other_options = NIL;
 	ListCell   *cell;
 
@@ -243,10 +244,10 @@ file_fdw_validator(PG_FUNCTION_ARGS)
 		}
 
 		/*
-		 * Separate out filename and force_not_null, since ProcessCopyOptions
-		 * won't accept them.  (force_not_null only comes in a boolean
-		 * per-column flavor here.)
+		 * Separate out filename and column-specific options, since
+		 * ProcessCopyOptions won't accept them.
 		 */
+
 		if (strcmp(def->defname, "filename") == 0)
 		{
 			if (filename)
@@ -255,16 +256,42 @@ file_fdw_validator(PG_FUNCTION_ARGS)
 						 errmsg("conflicting or redundant options")));
 			filename = defGetString(def);
 		}
+		/*
+		 * force_not_null is a boolean option; after validation we can discard
+		 * it - it will be retrieved later in get_file_fdw_attribute_options()
+		 */
 		else if (strcmp(def->defname, "force_not_null") == 0)
 		{
 			if (force_not_null)
 				ereport(ERROR,
 						(errcode(ERRCODE_SYNTAX_ERROR),
-						 errmsg("conflicting or redundant options")));
+						 errmsg("conflicting or redundant options"),
+						 errhint("option \"force_not_null\" supplied more than once for a column")));
+			if (force_null)
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("conflicting or redundant options"),
+						 errhint("option \"force_not_null\" cannot be used together with \"force_null\"")));
 			force_not_null = def;
 			/* Don't care what the value is, as long as it's a legal boolean */
 			(void) defGetBoolean(def);
 		}
+		/* See comments for force_not_null above */
+		else if (strcmp(def->defname, "force_null") == 0)
+		{
+			if (force_null)
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("conflicting or redundant options"),
+						 errhint("option \"force_null\" supplied more than once for a column")));
+			if (force_not_null)
+				ereport(ERROR,
+						(errcode(ERRCODE_SYNTAX_ERROR),
+						 errmsg("conflicting or redundant options"),
+						 errhint("option \"force_null\" cannot be used together with 

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Kohei KaiGai
2014-03-01 22:38 GMT+09:00 Stephen Frost sfr...@snowman.net:
 KaiGai,

 * Kohei KaiGai (kai...@kaigai.gr.jp) wrote:
  BTW, this kind of discussion looks like a talk with a ghost, because
  we cannot see the new interface for parallel execution right now, so
  we cannot concretely investigate whether it would really be a serious
  backward incompatibility or not.

 Yeah, it would certainly be nice if we had all of the answers up-front.
 What I keep hoping for is that someone who has been working on this area
 (eg: Robert) would speak up...

I'd also like to see his opinion.

  My bet is it will be a minor one. I cannot imagine a plan-node
  interface that does not support the existing non-parallel SeqScan or
  NestLoop and so on.

 Sure you can- because once we change the interface, we're probably going
 to go through and make everything use the new one rather than have to
 special-case things.  That's more-or-less exactly my point here because
 having an external hook like CustomScan would make that kind of
 wholesale change more difficult.

I think we should follow the general rule in the case of custom scans
also. In other words, it is the extension author's responsibility to
keep up with the latest specification of the interfaces.
For example, I have an extension module that is unable to work on the
latest PG code because of interface changes at ProcessUtility_hook.
Is it a matter of backward incompatibility? Definitely not. It is my
job to keep up.

 That does *not* mean I'm against using GPUs and GPU optimizations.  What
 it means is that I'd rather see that done in core, which would allow us
 to simply change that interface along with the rest when doing wholesale
 changes and not have to worry about backwards compatibility and breaking
 other people's code.

Let me also introduce some background from a previous discussion.
Now we have two options for GPU programming: CUDA or OpenCL.
Both libraries and drivers are provided under proprietary licenses,
so they do not fit the core implementation of PostgreSQL, only
extensions installed at the user's own responsibility.
Because of this, I brought up a discussion about a pluggable
planner/executor node (as a basis for GPU acceleration) at the last
developer meeting, and have since worked on the custom-scan node
interface.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Stephen Frost
KaiGai,

* Kohei KaiGai (kai...@kaigai.gr.jp) wrote:
 Now we have two options for GPU programming: CUDA or OpenCL.
 Both libraries and drivers are provided under proprietary licenses,
 so they do not fit the core implementation of PostgreSQL, only
 extensions installed at the user's own responsibility.

Being able to work with libraries which are not BSD licensed doesn't
change the license under which PostgreSQL code is released.  Nor does it
require PostgreSQL to be licensed in any different way from how it is
today.  Where it would get a bit ugly, I agree, is for the packagers who
have to decide if they want to build against those libraries or not.  We
might be able to make things a bit easier by having a startup-time
determination of if these nodes are able to be used or not.  This isn't
unlike OpenSSL which certainly isn't BSD nor is it even GPL-compatible,
a situation which causes a great deal of difficulty already due to the
whole readline nonsense- but it's difficulty for the packagers, not for
the PostgreSQL project, per se.

 Because of this, I brought up a discussion about a pluggable
 planner/executor node (as a basis for GPU acceleration) at the last
 developer meeting, and have since worked on the custom-scan node
 interface.

And I'm still unconvinced of this approach and worry that it's going to
break more often than it works.  That's my 2c on it, but I won't get in
the way if someone else wants to step up and support it.  As I mentioned
up-thread, I'd really like to see FDW join push-down, FDW aggregate
push-down, parallel query execution, and parallel remote-FDW execution
and I don't see this CustomScan approach as the right answer to any of
those.

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Kohei KaiGai
2014-03-02 9:51 GMT+09:00 Stephen Frost sfr...@snowman.net:
 KaiGai,

 * Kohei KaiGai (kai...@kaigai.gr.jp) wrote:
 We currently have two options for GPU programming: CUDA or OpenCL.
 Both sets of libraries and drivers are provided under proprietary
 licenses, so they do not fit the core implementation of PostgreSQL;
 they belong in extensions installed on the user's own responsibility.

 Being able to work with libraries which are not BSD licensed doesn't
 change the license under which PostgreSQL code is released.  Nor does it
 require PostgreSQL to be licensed in any different way from how it is
 today.  Where it would get a bit ugly, I agree, is for the packagers who
 have to decide if they want to build against those libraries or not.  We
 might be able to make things a bit easier by having a startup-time
 determination of if these nodes are able to be used or not.  This isn't
 unlike OpenSSL which certainly isn't BSD nor is it even GPL-compatible,
 a situation which causes a great deal of difficulty already due to the
 whole readline nonsense- but it's difficulty for the packagers, not for
 the PostgreSQL project, per se.

As you mentioned, it is a headache for packagers, and it does us no
good if packagers disable a feature that requires proprietary drivers.
In fact, Fedora / RHEL does not permit distributing software under
non-open-source licenses. NVIDIA's CUDA is obviously a library
distributed under a proprietary license, and thus out of scope for the
Linux distributors. That leads them to turn off any feature that must
be linked with proprietary drivers.
All we can do is implement these features as extensions, then offer
users the option to use them or not.

 That is why I brought up pluggable planner/executor nodes (as a basis
 for GPU acceleration) at the last developer meeting, and have been
 working on the custom-scan node interface since.

 And I'm still unconvinced of this approach and worry that it's going to
 break more often than it works.  That's my 2c on it, but I won't get in
 the way if someone else wants to step up and support it.  As I mentioned
 up-thread, I'd really like to see FDW join push-down, FDW aggregate
 push-down, parallel query execution, and parallel remote-FDW execution
 and I don't see this CustomScan approach as the right answer to any of
 those.

That is the right approach for enhancing FDW functionality; I have never opposed it.

What I'd like to implement is GPU acceleration that can run against
regular tables, not only foreign tables. Also, according to the backlog
from the developer meeting, a pluggable executor node is a feature the
PG-XC folks also need in order to bring their project in line with
upstream. I think the custom-scan feature can host these requirements,
and it does not prevent enhancements to the FDW features.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Stephen Frost
* Kohei KaiGai (kai...@kaigai.gr.jp) wrote:
 As you mentioned, it is a headache for packagers, and it does us no
 good if packagers disable a feature that requires proprietary
 drivers.

No, I disagree with that.  I don't expect this use-case to be very
common to begin with and telling individuals that they have to compile
it themselves is certainly not out of the question.

 In fact, Fedora / RHEL does not permit distributing software under
 non-open-source licenses.

I'm pretty confident you can get RPMs for those distributions.

 NVIDIA's CUDA is obviously a library
 distributed under a proprietary license, and thus out of scope for the
 Linux distributors.

This also doesn't make any sense to me- certainly the CUDA libraries are
available under Debian derivatives, along with open-source wrappers for
them like pycuda.

 That leads them to turn off any
 feature that must be linked with proprietary drivers.
 All we can do is implement these features as extensions, then offer
 users the option to use them or not.

No, we can tell individuals who want it that they're going to need to
build with support for it.  We don't offer OpenSSL as an extension (I
certainly wish we did- and had a way to replace it w/ GNUTLS or one of
the better licensed options).

 What I'd like to implement is GPU acceleration that can run against
 regular tables, not only foreign tables. Also, according to the backlog
 from the developer meeting, a pluggable executor node is a feature the
 PG-XC folks also need in order to bring their project in line with
 upstream. I think the custom-scan feature can host these requirements,
 and it does not prevent enhancements to the FDW features.

I think you're conflating things again- while it might be possible to
use CustomScan to implement FDW join-pushdown or FDW aggregate-pushdown,
*I* don't think it's the right approach.  Regarding the PG-XC
requirement, I expect they're looking for FDW join/aggregate-pushdown
and also see that it *could* be done w/ CustomScan.

We could punt on the whole thing and drop in hooks which could be used
to replace everything done from the planner through to the executor and
then anyone *could* implement any of the above, and parallel query too.
That doesn't make it the right approach.

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Robert Haas
On Wed, Feb 26, 2014 at 3:02 AM, Stephen Frost sfr...@snowman.net wrote:
 The custom-scan node is intended to run on regular relations, not
 only foreign tables. It means a special feature (like GPU acceleration)
 can work transparently for most existing applications. Usually, such
 applications define regular tables for their work at installation time,
 not foreign tables. That is the biggest concern for me.

 The line between a foreign table and a local one is becoming blurred
 already, but still, if this is the goal then I really think the
 background worker is where you should be focused, not on this Custom
 Scan API.  Consider that, once we've got proper background workers,
 we're going to need new nodes which operate in parallel (or some other
 rejiggering of the nodes- I don't pretend to know exactly what Robert is
 thinking here, and I've apparently forgotten it if he's posted it
 somewhere) and those interfaces may drive changes which would impact the
 Custom Scan API- or worse, make us deprecate or regret having added it
 because now we'll need to break backwards compatibility to add in the
 parallel node capability to satisfy the more general non-GPU case.

This critique seems pretty odd to me.  I haven't had the time to look
at this patch set, but I don't see why anyone would want to use the
background worker facility for GPU acceleration, which is what
KaiGai's trying to accomplish here.  Surely you want, if possible, to
copy the data directly from the user backend into the GPU's working
memory.  What would the use of a background worker be?  We definitely
need background workers to make use of additional *CPUs*, but I don't
see what good they are in leveraging *GPUs*.
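
To make the contrast concrete, here is a hand-wavy sketch using the
CUDA runtime API: the user backend packs a batch of tuples into a flat
host buffer (an invented detail) and copies it straight into device
memory, with no background worker in the middle.  Error reporting is
elided.

#include <cuda_runtime.h>
#include <stddef.h>

/*
 * Copy a batch of pre-packed tuples from backend-local memory into the
 * GPU's working memory.  Returns the device pointer, or NULL on failure.
 */
static void *
copy_batch_to_gpu(const char *host_buf, size_t nbytes)
{
    void   *dev_buf = NULL;

    if (cudaMalloc(&dev_buf, nbytes) != cudaSuccess)
        return NULL;

    if (cudaMemcpy(dev_buf, host_buf, nbytes,
                   cudaMemcpyHostToDevice) != cudaSuccess)
    {
        cudaFree(dev_buf);
        return NULL;
    }

    return dev_buf;             /* ready for a kernel launch */
}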

I seriously doubt there's any real conflict with parallelism here.
Parallelism may indeed add more ways to scan a relation
(ParallelSeqScan, ParallelIndexScan?) but that doesn't mean that we
shouldn't have a Custom Scan node too.  Indeed, my principal concern
about this patch set isn't that it's too specialized, as you seem to
be worrying about, but that it's aiming to satisfy *too many* use
cases.  I think FDW join pushdown is a fairly different problem from
adding a custom scan type, and I doubt one patch should try to solve
both problems.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Robert Haas
On Wed, Feb 26, 2014 at 10:23 AM, Stephen Frost sfr...@snowman.net wrote:
 * Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
 IIUC, his approach was to integrate join-pushdown within the FDW APIs;
 however, that does not mean the idea of remote joins is rejected.

 For my part, trying to consider doing remote joins *without* going
 through FDWs is just nonsensical.

That is, of course, true by definition, but I think it's putting the
focus in the wrong place.  It's possible that there are other cases
when a scan might be a plausible path for a joinrel even if there are no
foreign tables in play.  For example, you could cache the joinrel
output and then inject a cache scan as a path for the joinrel.
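
Purely to illustrate the shape of that idea, and assuming some hook
existed at joinrel planning time (which is hypothetical, as are the
helper functions here), the extension side could be as small as this;
add_path() is the real planner entry point that keeps the new path only
if it wins on cost:

#include "postgres.h"
#include "optimizer/pathnode.h"         /* for add_path() */

/* Both helpers below are invented names, for illustration only. */
static bool joinrel_result_is_cached(RelOptInfo *joinrel);
static Path *create_cache_scan_path(PlannerInfo *root, RelOptInfo *joinrel);

/*
 * Hypothetical sketch: when the joinrel's output is already cached,
 * offer a cache-scan path and let it compete with the regular join
 * paths on cost.
 */
static void
consider_cache_scan(PlannerInfo *root, RelOptInfo *joinrel)
{
    if (joinrel_result_is_cached(joinrel))
    {
        Path   *cpath = create_cache_scan_path(root, joinrel);

        add_path(joinrel, cpath);
    }
}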

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Robert Haas
On Fri, Feb 28, 2014 at 10:36 AM, Stephen Frost sfr...@snowman.net wrote:
 * Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
 * Stephen Frost (sfr...@snowman.net) wrote:
  I don't see how you can be when there hasn't been any discussion that I've
  seen about how parallel query execution is going to change things for us.
 
 If parallel query execution changes the whole structure of plan nodes,
 it will also affect the custom-scan interface, because it is a thin
 abstraction over a plan node. However, if the parallel execution feature
 is implemented as new plan nodes in addition to the existing ones, I
 cannot imagine a scenario that affects the structure of other plan nodes.

 Let's just say that I have doubts that we'll be able to implement
 parallel execution *without* changing the plan node interface in some
 way which will require, hopefully minor, changes to all of the nodes.
 The issue is that even a minor change would break the custom-scan API
 and we'd immediately be in the boat of dealing with complaints regarding
 backwards compatibility.  Perhaps we can hand-wave that, and we've had
 some success changing hook APIs between major releases, but such changes
 may also be in ways which wouldn't break in obvious ways or even
 possibly be changes which have to be introduced into back-branches.
 Parallel query is going to be brand-new real soon and it's reasonable to
 think we'll need to make bug-fix changes to it after it's out which
 might even involve changes to the API which is developed for it.

For what it's worth, and I can't claim to have all the answers here,
this doesn't match my expectation.  I think we'll do two kinds of
parallelism.  One will be parallelism within nodes, like parallel sort
or parallel seqscan.  Any node we parallelize this way is likely to be
heavily rewritten, or else to get a sister that looks quite different
from the original.  The other kind of parallelism will involve pushing
a whole subtree of the plan into a different node.  In this case we'll
need to pass data between nodes in some different way (this was one of
the major reasons I designed the shm_mq stuff) but the nodes
themselves should change little if at all.
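
For anyone who hasn't looked at it, shm_mq makes that kind of funnel
quite simple: the subtree's output is streamed as messages through a
queue in dynamic shared memory. A bare-bones sketch of the sending
side, glossing over DSM setup, sender/receiver registration, and error
handling:

#include "postgres.h"
#include "storage/shm_mq.h"

/*
 * Push one tuple, already flattened to a byte string, into the queue.
 * shm_mq_send() blocks if the queue is full (nowait = false) and tells
 * us if the receiver has gone away.
 */
static void
stream_tuple(shm_mq_handle *mqh, const void *tuple, Size len)
{
    shm_mq_result res = shm_mq_send(mqh, len, tuple, false);

    if (res == SHM_MQ_DETACHED)
        elog(ERROR, "receiver has detached from the tuple queue");
}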

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Wed, Feb 26, 2014 at 3:02 AM, Stephen Frost sfr...@snowman.net wrote:
  The line between a foreign table and a local one is becoming blurred
  already, but still, if this is the goal then I really think the
  background worker is where you should be focused, not on this Custom
  Scan API.  Consider that, once we've got proper background workers,
  we're going to need new nodes which operate in parallel (or some other
  rejiggering of the nodes- I don't pretend to know exactly what Robert is
  thinking here, and I've apparently forgotten it if he's posted it
  somewhere) and those interfaces may drive changes which would impact the
  Custom Scan API- or worse, make us deprecate or regret having added it
  because now we'll need to break backwards compatibility to add in the
  parallel node capability to satisfy the more general non-GPU case.
 
 This critique seems pretty odd to me.  I haven't had the time to look
 at this patch set, but I don't see why anyone would want to use the
 background worker facility for GPU acceleration, which is what
 KaiGai's trying to accomplish here.

Eh, that didn't come out quite right.  I had intended it to be more
along the lines of "look at what Robert's doing".

I was trying to point out that parallel query execution is coming soon
thanks to the work on background workers, and that it might drive
changes to the way nodes interact in the executor, creating a need to
change the API.  In other words, CustomScan could easily end up being
broken by that change, and I'd rather we not have to worry about such
breakage.

 I seriously doubt there's any real conflict with parallelism here.
 Parallelism may indeed add more ways to scan a relation
 (ParallelSeqScan, ParallelIndexScan?) but that doesn't mean that we
 shouldn't have a Custom Scan node too.

What about parallel execution through the tree itself, rather than just
at specific end nodes like SeqScan and IndexScan?  Or parallel
aggregates?  I agree that simple parallel SeqScan/IndexScan isn't going
to change any of the interfaces, but surely we're going for more than
that.  Indeed, I'm wishing that I had found more time to spend on just
a simple select-based Append node which could parallelize I/O across
tablespaces and FDWs underneath the Append.

 Indeed, my principal concern
 about this patch set isn't that it's too specialized, as you seem to
 be worrying about, but that it's aiming to satisfy *too many* use
 cases.  I think FDW join pushdown is a fairly different problem from
 adding a custom scan type, and I doubt one patch should try to solve
 both problems.

Yeah, I've voiced those same concerns later in the thread also,
specifically that this punts on nearly everything and expects the
implementor to figure it all out.  We should be able to do better wrt
FDW join-pushdown, etc.

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Robert Haas
On Sat, Mar 1, 2014 at 8:49 PM, Stephen Frost sfr...@snowman.net wrote:
 This critique seems pretty odd to me.  I haven't had the time to look
 at this patch set, but I don't see why anyone would want to use the
 background worker facility for GPU acceleration, which is what
 KaiGai's trying to accomplish here.

 Eh, that didn't come out quite right.  I had intended it to be more
 along the lines of "look at what Robert's doing".

 I was trying to point out that parallel query execution is coming soon
 thanks to the work on background workers, and that it might drive
 changes to the way nodes interact in the executor, creating a need to
 change the API.  In other words, CustomScan could easily end up being
 broken by that change, and I'd rather we not have to worry about such
 breakage.

I think the relation is pretty tangential.  I'm worried about the
possibility that the Custom Scan API is broken *ab initio*, but I'm
not worried about a conflict with parallel query.

 I seriously doubt there's any real conflict with parallelism here.
 Parallelism may indeed add more ways to scan a relation
 (ParallelSeqScan, ParallelIndexScan?) but that doesn't mean that we
 shouldn't have a Custom Scan node too.

 What about parallel execution through the tree itself, rather than just
 at specific end nodes like SeqScan and IndexScan?  Or parallel
 aggregates?  I agree that simple parallel SeqScan/IndexScan isn't going
 to change any of the interfaces, but surely we're going for more than
 that.  Indeed, I'm wishing that I had found more time to spend on just
 a simple select-based Append node which could parallelize I/O across
 tablespaces and FDWs underneath the Append.

Well, as I said in another recent reply that you probably got after
sending this, when you split between nodes, that mostly just has to do
with how you funnel the tuples from one node to another.  The nodes
themselves probably don't even need to know.  Or at least that's what
I'd hope.

I don't see that parallelizing Append is any easier than any other
problem in this space.  There's no parallel I/O facility, so you need
a background worker per append branch to wait on I/O.  And you have
all the problems of making sure that the workers have the same
snapshot, making sure they don't self-deadlock, etc. that you have for
any other case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 I don't see that parallelizing Append is any easier than any other
 problem in this space.  There's no parallel I/O facility, so you need
 a background worker per append branch to wait on I/O.  And you have
 all the problems of making sure that the workers have the same
 snapshot, making sure they don't self-deadlock, etc. that you have for
 any other case.

Erm, my thought was to use a select() loop which sends out I/O requests
and then loops around waiting to see who finishes it.  It doesn't
parallelize the CPU cost of getting the rows back to the caller, but
it'd at least parallelize the I/O, and if what's underneath is actually
a remote FDW running a complex query (because the other side is actually
a view), it would be a massive win to have all the remote FDWs executing
concurrently instead of serially as we have today.
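
Roughly like this, using libpq's asynchronous API (just a sketch of
the idea; error handling and the actual row processing are elided):

#include <sys/select.h>
#include "libpq-fe.h"

/*
 * Dispatch a query on every child connection up front, then select()
 * across the sockets so all remote servers execute concurrently,
 * draining rows from whichever connection becomes ready first.
 */
static void
run_children_concurrently(PGconn **conns, const char **sqls, int n)
{
    int     i,
            remaining = n;

    for (i = 0; i < n; i++)
        PQsendQuery(conns[i], sqls[i]);     /* returns without waiting */

    while (remaining > 0)
    {
        fd_set  rfds;
        int     maxfd = -1;

        FD_ZERO(&rfds);
        for (i = 0; i < n; i++)
        {
            if (conns[i] == NULL)
                continue;
            FD_SET(PQsocket(conns[i]), &rfds);
            if (PQsocket(conns[i]) > maxfd)
                maxfd = PQsocket(conns[i]);
        }
        select(maxfd + 1, &rfds, NULL, NULL, NULL);

        for (i = 0; i < n; i++)
        {
            if (conns[i] == NULL || !FD_ISSET(PQsocket(conns[i]), &rfds))
                continue;
            PQconsumeInput(conns[i]);
            while (!PQisBusy(conns[i]))
            {
                PGresult *res = PQgetResult(conns[i]);

                if (res == NULL)        /* this child is done */
                {
                    conns[i] = NULL;
                    remaining--;
                    break;
                }
                PQclear(res);           /* hand rows to the caller here */
            }
        }
    }
}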

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Kohei KaiGai
2014-03-02 10:29 GMT+09:00 Stephen Frost sfr...@snowman.net:
 * Kohei KaiGai (kai...@kaigai.gr.jp) wrote:
 As you mentioned, it is a headache for packagers, and it does us no
 good if packagers disable a feature that requires proprietary
 drivers.

 No, I disagree with that.  I don't expect this use-case to be very
 common to begin with and telling individuals that they have to compile
 it themselves is certainly not out of the question.

 In fact, Fedora / RHEL does not permit distributing software under
 non-open-source licenses.

 I'm pretty confident you can get RPMs for those distributions.

 NVIDIA's CUDA is obviously a library
 distributed under a proprietary license, and thus out of scope for the
 Linux distributors.

 This also doesn't make any sense to me- certainly the CUDA libraries are
 available under Debian derivatives, along with open-source wrappers for
 them like pycuda.

 That leads them to turn off any
 feature that must be linked with proprietary drivers.
 All we can do is implement these features as extensions, then offer
 users the option to use them or not.

 No, we can tell individuals who want it that they're going to need to
 build with support for it.  We don't offer OpenSSL as an extension (I
 certainly wish we did- and had a way to replace it w/ GNUTLS or one of
 the better licensed options).

I know there are some alternative ways. However, they require users to
acquire additional knowledge and spend effort on setup, and they lose the
benefit of the distributor's Linux if alternative RPMs are required.
I don't want to recommend such a complicated setup procedure to users.

 What I'd like to implement is GPU acceleration that can run against
 regular tables, not only foreign tables. Also, according to the backlog
 from the developer meeting, a pluggable executor node is a feature the
 PG-XC folks also need in order to bring their project in line with
 upstream. I think the custom-scan feature can host these requirements,
 and it does not prevent enhancements to the FDW features.

 I think you're conflating things again- while it might be possible to
 use CustomScan to implement FDW join-pushdown or FDW aggregate-pushdown,
 *I* don't think it's the right approach.  Regarding the PG-XC
 requirement, I expect they're looking for FDW join/aggregate-pushdown
 and also see that it *could* be done w/ CustomScan.

The reason I submitted the part-3 patch (which enhances postgres_fdw
to perform remote joins using custom-scan) is that it is an easy way to
demonstrate replacing a join with a special scan while keeping the patch
to be reviewed small. If we have another idea for demonstrating that, I
don't insist on the remote-join feature for foreign tables.
Regarding PG-XC, I don't know their exact needs because I didn't
attend the cluster meeting, but someone mentioned pluggable
plan/exec nodes in this context.

 We could punt on the whole thing and drop in hooks which could be used
 to replace everything done from the planner through to the executor and
 then anyone *could* implement any of the above, and parallel query too.
 That doesn't make it the right approach.

That is a problem I pointed out at the last developer meeting. Because
there is no way for an extension to enhance just a part of the planner /
executor logic, an extension has to replace the whole planner or executor
using hooks. That is painful for extension authors.
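
Concretely, the only lever available today is an all-or-nothing hook
like the sketch below: even an extension that just wants to offer one
alternative scan node has to take over the entire planner invocation
and post-process the whole plan tree itself.

#include "postgres.h"
#include "fmgr.h"
#include "optimizer/planner.h"

PG_MODULE_MAGIC;

static planner_hook_type prev_planner_hook = NULL;

/*
 * Sketch of the status quo: the hook replaces the *entire* planner
 * invocation, so the extension must produce (or rewrite) the whole
 * PlannedStmt even to change one node's behavior.
 */
static PlannedStmt *
my_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
{
    PlannedStmt *result;

    if (prev_planner_hook)
        result = prev_planner_hook(parse, cursorOptions, boundParams);
    else
        result = standard_planner(parse, cursorOptions, boundParams);

    /* ... walk and rewrite the whole plan tree here ... */

    return result;
}

void
_PG_init(void)
{
    prev_planner_hook = planner_hook;
    planner_hook = my_planner;
}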

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 For what it's worth, and I can't claim to have all the answers here,
 this doesn't match my expectation.  I think we'll do two kinds of
 parallelism.  One will be parallelism within nodes, like parallel sort
 or parallel seqscan.  Any node we parallelize this way is likely to be
 heavily rewritten, or else to get a sister that looks quite different
 from the original.  

Sure.

 The other kind of parallelism will involve pushing
 a whole subtree of the plan into a different node.  In this case we'll
 need to pass data between nodes in some different way (this was one of
 the major reasons I designed the shm_mq stuff) but the nodes
 themselves should change little if at all.

It's that "some different way" of passing data between the nodes that
makes me worry, but I hope you're right and we won't actually need to
change the interfaces or the nodes very much.

Thanks,

Stephen




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-03-01 Thread Kohei KaiGai
2014-03-02 10:38 GMT+09:00 Robert Haas robertmh...@gmail.com:
 On Wed, Feb 26, 2014 at 10:23 AM, Stephen Frost sfr...@snowman.net wrote:
 * Kouhei Kaigai (kai...@ak.jp.nec.com) wrote:
 IIUC, his approach was to integrate join-pushdown within the FDW APIs;
 however, that does not mean the idea of remote joins is rejected.

 For my part, trying to consider doing remote joins *without* going
 through FDWs is just nonsensical.

 That is, of course, true by definition, but I think it's putting the
 focus in the wrong place.  It's possible that there are other cases
 when a scan might be a plausible path for a joinrel even if there are no
 foreign tables in play.  For example, you could cache the joinrel
 output and then inject a cache scan as a path for the joinrel.

That might be a better way to demonstrate usage of the custom-scan node
than the (ad-hoc) enhancement of postgres_fdw.
As I discussed in another thread, it is possible to switch a heap
reference to a cache reference on the fly, so that would be a plausible
use case for a custom-scan node.

So, I'm inclined to drop the postgres_fdw portion of my submission and
focus on the custom-scan capability.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp




Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-03-01 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 03/01/2014 05:10 PM, Tom Lane wrote:
 BTW, a different problem with the proposed patch is that it changes
 some test cases in ecpg and contrib/dblink, apparently to avoid session
 reconnections.  That seems likely to me to be losing test coverage.
 Perhaps there is no alternative, but I'd like to have some discussion
 around that point as well.

 Yeah. Assuming we make the changes you're suggesting that should no 
 longer be necessary, right?

Not sure.  I believe the point of those changes is that the scaffolding
only sets up a password for the original superuser, so that connecting
as anybody else will fail if the test postmaster is configured for
password auth.  If so, the only way to avoid doing any work would be
if we don't implement any fix at all for Windows.

Whether or not you're worried about the security of make check on
Windows, there are definitely some things to be unhappy about here.
One being that the core regression tests are evidently not testing
connecting as anybody but superuser; and the proposed changes would remove
such testing from contrib and ecpg as well, which is surely not better
from a coverage standpoint.  (It's not terribly hard to imagine
permissions bugs that would cause postinit.c to fail for non-superusers.)

Another issue is that (I presume, haven't checked) make installcheck
on contrib or ecpg will currently fail against a server that's not
configured for trust auth.  Ideally you should be able to do make
installcheck against a reasonably-configured server.

I'm not real sure what to do about either of those points, but I am sure
that the proposed patch isn't moving the ball downfield.

regards, tom lane




Re: [HACKERS] commit fest status and release timeline

2014-03-01 Thread Michael Paquier
On Sun, Mar 2, 2014 at 7:43 AM, Vik Fearing vik.fear...@dalibo.com wrote:
 On 03/01/2014 07:50 PM, Josh Berkus wrote:
 On 03/01/2014 09:01 AM, Peter Eisentraut wrote:
 Status Summary. Needs Review: 36, Waiting on Author: 7, Ready for
 Committer: 16, Committed: 43, Returned with Feedback: 8, Rejected: 4.
 Total: 114.

 We're still on track to achieve about 50% committed patches, which would
 be similar to the previous few commit fests.  So decent job so far.
 So, other than Hstore2/JSONB and Logical Changesets, what are the
 big/difficult patches left?

 For me, I'd really like to see the reduced locks on ALTER TABLE.
The patch by Peter to improve test coverage for client programs. It
is helpful for QE/QA teams evaluating Postgres and, as far as I
understood, could be extended for other things like a replication test
suite as well.
Regards,
-- 
Michael




Re: [HACKERS] Patch to add support of IF NOT EXISTS to others CREATE statements

2014-03-01 Thread Fabrízio de Royes Mello
On Sat, Mar 1, 2014 at 7:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Fabrízio de Royes Mello fabriziome...@gmail.com writes:
  On Sat, Mar 1, 2014 at 2:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  [ re schema upgrade scenarios ]
  Why wouldn't COR semantics answer that requirement just as well, if not
  better?

  Just because it will replace the object content... and in some cases this
  cannot happen because it will regress the schema to an old version.

 That argument seems awfully darn flimsy.

Sorry, I know my use case is very specific...

Is "we don't have this feature" a strong argument just because we can
implement COR instead? Or is it just that we don't want to add more
complexity to the source code?

The complexity this feature adds to the source code is minimal, but the
result is very useful and can be used by many tools (e.g. Rails
migrations, Python's Alembic, Doctrine, and others).


 In any case, given the existence of DO it's simple to code up
 create-if-not-exists behavior with a couple lines of plpgsql; that seems
 to me to be a sufficient answer for corner cases.  create-or-replace is
 not equivalently fakable if the system doesn't supply the functionality.


You are completely right.

But we already have DROP ... IF EXISTS, so I think having
CREATE ... IF NOT EXISTS (the inverse behavior) would be very natural...
and I agree with implementing CREATE OR REPLACE too.

Greetings,

--
Fabrízio de Royes Mello
PostgreSQL Consulting/Coaching
 Timbira: http://www.timbira.com.br
 Blog about IT: http://fabriziomello.blogspot.com
 LinkedIn profile: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] commit fest status and release timeline

2014-03-01 Thread Fabrízio de Royes Mello
On Sun, Mar 2, 2014 at 12:56 AM, Michael Paquier michael.paqu...@gmail.com
wrote:

 On Sun, Mar 2, 2014 at 7:43 AM, Vik Fearing vik.fear...@dalibo.com
wrote:
  On 03/01/2014 07:50 PM, Josh Berkus wrote:
  On 03/01/2014 09:01 AM, Peter Eisentraut wrote:
  Status Summary. Needs Review: 36, Waiting on Author: 7, Ready for
  Committer: 16, Committed: 43, Returned with Feedback: 8, Rejected: 4.
  Total: 114.
 
  We're still on track to achieve about 50% committed patches, which would
  be similar to the previous few commit fests.  So decent job so far.
  So, other than Hstore2/JSONB and Logical Changesets, what are the
  big/difficult patches left?
 
  For me, I'd really like to see the reduced locks on ALTER TABLE.
 The patch by Peter to improve test coverage for client programs. It
 is helpful for QE/QA teams evaluating Postgres and, as far as I
 understood, could be extended for other things like a replication test
 suite as well.


+1

--
Fabrízio de Royes Mello
PostgreSQL Consulting/Coaching
 Timbira: http://www.timbira.com.br
 Blog about IT: http://fabriziomello.blogspot.com
 LinkedIn profile: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-03-01 Thread Noah Misch
On Sat, Mar 01, 2014 at 05:51:46PM -0500, Andrew Dunstan wrote:
 On 03/01/2014 05:10 PM, Tom Lane wrote:
 One other thought here: is it actually reasonable to expend a lot of effort
 on the Windows case?  I'm not aware that people normally expect a Windows
 box to have multiple users at all, let alone non-mutually-trusting users.
 
 As Stephen said, it's fairly unusual. There are usually quite a few
 roles, but it's rare to have more than one human type role
 connected to the machine at a given time.

I, too, agree it's rare.  Rare enough to justify leaving the vulnerability
open on Windows, indefinitely?  I'd say not.  Windows itself has been pushing
steadily toward better multi-user support over the past 15 years or so.
Releasing software for Windows as though it were a single-user platform is
backwards-looking.  We should be a model in this area, not a straggler.

 I'd be happy doing nothing in this case, or not very much. e.g.
 provide a password but not with great cryptographic strength.

One option that would simplify things is to fix only non-Windows in the back
branches, via socket protection, and fix Windows in HEAD only.  We could even
do so by extending HAVE_UNIX_SOCKETS support to Windows through named pipes.
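
For the non-Windows side, the sketch is simple: put the socket in a
freshly created mode-0700 directory, so no other local user can reach
it at all.  Something like this in the test driver, assuming POSIX:

#include <stdio.h>
#include <stdlib.h>

/*
 * Create a private socket directory for the test postmaster; mkdtemp()
 * both generates a unique name and creates the directory with
 * owner-only (0700) permissions.
 */
int
main(void)
{
    char    template[] = "/tmp/pg_regress-XXXXXX";
    char   *sockdir = mkdtemp(template);

    if (sockdir == NULL)
    {
        perror("mkdtemp");
        return 1;
    }

    printf("unix_socket_directories = '%s'\n", sockdir);
    return 0;
}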

Using weak passwords on Windows alone would not simplify the effort.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-03-01 Thread Noah Misch
On Sat, Mar 01, 2014 at 09:43:19PM -0500, Tom Lane wrote:
 Andrew Dunstan and...@dunslane.net writes:
  On 03/01/2014 05:10 PM, Tom Lane wrote:
  BTW, a different problem with the proposed patch is that it changes
  some test cases in ecpg and contrib/dblink, apparently to avoid session
  reconnections.  That seems likely to me to be losing test coverage.
  Perhaps there is no alternative, but I'd like to have some discussion
  around that point as well.

connect/test5.pgc has the same number of session reconnections before and
after the patch.  The change is to assign passwords to the connection test
accounts and use those passwords during the tests.  Test coverage actually
increased there; before, it did not matter whether ecpg conveyed each password
in good order.  The dblink change does replace a non-superuser reconnection
with a SET SESSION AUTHORIZATION.

 I believe the point of those changes is that the scaffolding
 only sets up a password for the original superuser, so that connecting
 as anybody else will fail if the test postmaster is configured for
 password auth.

Essentially, yes.  You can connect as another user if you assign a password;
the ecpg suite does so.  (That user had better be unprivileged.)  The dblink
change has a second point.  Since the dblink_regression_test role has use of
the dblink_connect_u() function, it needs to be treated as a privileged
account.  (I'll add a comment about that.)

 Another issue is that (I presume, haven't checked) make installcheck
 on contrib or ecpg will currently fail against a server that's not
 configured for trust auth.  Ideally you should be able to do make
 installcheck against a reasonably-configured server.

No, I had verified make installcheck-world under md5 auth.  The setup I
recommend, which I mentioned in the initial message of this thread, is to put
the same PGPASSFILE in the postmaster environment as you put in the make
installcheck environment.  (It's contrib/dblink and contrib/postgres_fdw that
would otherwise fail.)

nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Windows exit code 128 ... it's baaack

2014-03-01 Thread Amit Kapila
On Fri, Feb 28, 2014 at 5:44 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I looked at the postmaster log for the ongoing issue on narwhal
 (to wit, that the contrib/dblink test dies the moment it tries
 to do anything dblink-y), and looky here what the postmaster
 has logged:

 [530fc965.bac:2] LOG:  server process (PID 2144) exited with exit code 128
 [530fc965.bac:3] DETAIL:  Failed process was running: SELECT *
 FROM dblink('dbname=contrib_regression','SELECT * FROM foo') AS t(a 
 int, b text, c text[])
 WHERE t.a > 7;
 [530fc965.bac:4] LOG:  server process (PID 2144) exited with exit code 0
 [530fc965.bac:5] LOG:  terminating any other active server processes


 Now, back in the 2010 thread where we agreed to put in the ignore-128
 kluge, it was asserted that all known cases of this exit code were
 irreproducible Windows flakiness occurring at process start or exit.
 This is evidently neither start nor exit, but likely is being provoked
 by trying to load libpq into the backend.

Most of the information on the net regarding this error code indicates
that it is related to particular Windows versions, and there are even
a few hotfixes for it, for example:
http://support.microsoft.com/kb/974509

I am not sure how relevant such hotfixes are to the current case, as
most of that information indicates the error happens during
CreateProcess(), whereas the current failure doesn't appear to be
related to CreateProcess().

I tried to reproduce it on my local Windows machine (Windows 7), but
couldn't. I think looking into the Event Viewer logs from around that
time might give some clue.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Securing make check (CVE-2014-0067)

2014-03-01 Thread Dave Page


 On 2 Mar 2014, at 05:20, Noah Misch n...@leadboat.com wrote:
 
 On Sat, Mar 01, 2014 at 05:51:46PM -0500, Andrew Dunstan wrote:
 On 03/01/2014 05:10 PM, Tom Lane wrote:
 One other thought here: is it actually reasonable to expend a lot of effort
 on the Windows case?  I'm not aware that people normally expect a Windows
 box to have multiple users at all, let alone non-mutually-trusting users.
 
 As Stephen said, it's fairly unusual. There are usually quite a few
 roles, but it's rare to have more than one human type role
 connected to the machine at a given time.
 
 I, too, agree it's rare.  Rare enough to justify leaving the vulnerability
 open on Windows, indefinitely?

It's not that rare in my experience - certainly there are far more single-user
installations, but Terminal Server configurations are common for deploying apps
Citrix-style or via VDI. The one and only Windows server maintained by the EDB
infrastructure team is a terminal server, for example.

  I'd say not.  Windows itself has been pushing
 steadily toward better multi-user support over the past 15 years or so.
 Releasing software for Windows as though it were a single-user platform is
 backwards-looking.  We should be a model in this area, not a straggler.

Definitely.

 
 I'd be happy doing nothing in this case, or not very much. e.g.
 provide a password but not with great cryptographic strength.
 
 One option that would simplify things is to fix only non-Windows in the back
 branches, via socket protection, and fix Windows in HEAD only.  We could even
 do so by extending HAVE_UNIX_SOCKETS support to Windows through named pipes.
 
 Using weak passwords on Windows alone would not simplify the effort.
 
 -- 
 Noah Misch
 EnterpriseDB http://www.enterprisedb.com
 
 

