Re: [HACKERS] better atomics - v0.5

2014-06-29 Thread Andres Freund
On 2014-06-29 11:53:28 +0530, Amit Kapila wrote:
> On Sat, Jun 28, 2014 at 1:48 PM, Andres Freund 
> > Two things:
> > a) compare_exchange is a read/write operator and so far I've defined it
> >to have full barrier semantics (documented in atomics.h). I'd be
> >happy to discuss which semantics we want for which operation.
> 
> I think it is better to have some discussion about it. Apart from this,
> today I noticed atomic read/write doesn't use any barrier semantics
> and just rely on volatile.

Yes, intentionally so. It's often important to get/set the current value
without the cost of emitting a memory barrier. It just has to be a
recent value  and it actually has to come from memory. And that's actually
enforced by the current definition - I think?

> > b) It's only supported from vista onwards. Afaik we still support XP.
> #ifndef pg_memory_barrier_impl
> #define pg_memory_barrier_impl() MemoryBarrier()
> #endif
> 
> The MemoryBarrier() macro used also supports only on vista onwards,
> so I think for reasons similar to using MemoryBarrier() can apply for
> this as well.

I think that's odd because barrier.h has been using MemoryBarrier()
without a version test for a long time now. I guess everyone uses a new
enough visual studio. Those intrinsics aren't actually OS but just
compiler dependent.

Otherwise we could just define it as:

FORCEINLINE
VOID
MemoryBarrier (
VOID
)
{
LONG Barrier;
__asm {
xchg Barrier, eax
}
}


> > c) It doesn't make any difference on x86 ;)
> 
> What about processors like Itanium which care about acquire/release
> memory barrier semantics?

Well, it still will be correct? I don't think it makes much sense to
focus overly much on itanium here with the price of making anything more
complicated for others.

> > > I think this value is required for lwlock patch, but I am wondering why
> > > can't the same be achieved if we just return the *current* value and
> > > then let lwlock patch do the handling based on it.  This will have
> another
> > > advantage that our pg_* API will also have similar signature as native
> > > API's.
> >
> > Many usages of compare/exchange require that you'll get the old value
> > back in an atomic fashion. Unfortunately the Interlocked* API doesn't
> > provide a nice way to do that.
> 
> Yes, but I think the same cannot be accomplished even by using
> expected.

More complicatedly so, yes? I don't think we want those comparisons on
practically every callsite.

> >Note that this definition of
> > compare/exchange both maps nicely to C11's atomics and the actual x86
> > cmpxchg instruction.
> >
> > I've generally tried to mimick C11's API as far as it's
> > possible. There's some limitations because we can't do generic types and
> > such, but other than that.
> 
> If I am reading C11's spec for this API correctly, then it says as below:
> "Atomically compares the value pointed to by obj with the value pointed
> to by expected, and if those are equal, replaces the former with desired
> (performs read-modify-write operation). Otherwise, loads the actual value
> pointed to by obj into *expected (performs load operation)."
> 
> So it essentialy means that it loads actual value in expected only if
> operation is not successful.

Yes. But in the case it's successfull it will already contain the right
value.

> > > 4.
> ..
> > > There is a Caution notice in microsoft site indicating
> > > _ReadWriteBarrier/MemoryBarrier are deprected.
> >
> > It seemed to be the most widely available API, and it's what barrier.h
> > already used.
> > Do you have a different suggestion?
> 
> I am trying to think based on suggestion given by Microsoft, but
> not completely clear how to accomplish the same at this moment.

Well, they refer to C11 stuff. But I think it'll be a while before it
makes sense to use a fallback based on that.

> > > 6.
> > > pg_atomic_compare_exchange_u32()
> > >
> > > It is better to have comments above this and all other related
> functions.
> >
> > Check atomics.h, there's comments above it:
> > /*
> >  * pg_atomic_compare_exchange_u32 - CAS operation
> >  *
> >  * Atomically compare the current value of ptr with *expected and store
> newval
> >  * iff ptr and *expected have the same value. The current value of *ptr
> will
> >  * always be stored in *expected.
> >  *
> >  * Return whether the values have been exchanged.
> >  *
> >  * Full barrier semantics.
> >  */
> 
> Okay, generally code has such comments on top of function
> definition rather than with declaration.

I don't want to have it on the _impl functions because they're
duplicated for the individual platforms and will just become out of
date...

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] better atomics - v0.5

2014-06-29 Thread Andres Freund
On 2014-06-28 19:36:39 +0300, Heikki Linnakangas wrote:
> On 06/27/2014 08:15 PM, Robert Haas wrote:
> >On Thu, Jun 26, 2014 at 3:04 PM, Andres Freund  
> >wrote:
> >>I don't really see usecases where it's not related data that's being
> >>touched, but: The point is that the fastpath (not using a spinlock) might
> >>touch the atomic variable, even while the slowpath has acquired the
> >>spinlock. So the slowpath can't just use non-atomic operations on the
> >>atomic variable.
> >>Imagine something like releasing a buffer pin while somebody else is
> >>doing something that requires holding the buffer header spinlock.
> >
> >If the atomic variable can be manipulated without the spinlock under
> >*any* circumstances, then how is it a good idea to ever manipulate it
> >with the spinlock?
> 
> With the WALInsertLock scaling stuff in 9.4, there are now two variables
> protected by a spinlock: the current WAL insert location, and the prev
> pointer (CurrBytePos and PrevBytePos). To insert a new WAL record, you need
> to grab the spinlock to update both of them atomically. But to just read the
> WAL insert pointer, you could do an atomic read of CurrBytePos if the
> architecture supports it - now you have to grab the spinlock.
> Admittedly that's just an atomic read, not an atomic compare and exchange or
> fetch-and-add.

There's several architectures where you can do atomic 8byte reads, but
only busing cmpxchg or similar, *not* using a plain read... So I think
it's actually a fair example.

> Or if the architecture has an atomic 128-bit compare &
> exchange op you could replace the spinlock with that. But it's not hard to
> imagine similar situations where you sometimes need to lock a larger data
> structure to modify it atomically, but sometimes you just need to modify
> part of it and an atomic op would suffice.

Yes.

> I thought Andres' LWLock patch also did something like that. If the lock is
> not contended, you can acquire it with an atomic compare & exchange to
> increment the exclusive/shared counter. But to manipulate the wait queue,
> you need the spinlock.

Right. It just happens that the algorithm requires none of the atomic
ops have to be under the spinlock... But I generally think that we'll
see more slow/fastpath like situations cropping up where we can't always
avoid the atomic op while holding the lock.

How about adding a paragraph that explains that it's better to avoid
atomics usage of spinlocks because of the prolonged runtime, especially
due to the emulation and cache misses? But also saying it's ok if the
algorithm needs it and is a sufficient benefit?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cluster name in ps output

2014-06-29 Thread Thomas Munro
On 29 June 2014 10:55, Andres Freund  wrote:
> So, I'd looked at it with an eye towards committing it and found some
> more things. I've now
> * added the restriction that the cluster name can only be ASCII. It's
>   shown from several clusters with differing encodings, and it's shown
>   in process titles, so that seems wise.
> * moved everything to the LOGGING_WHAT category
> * added minimal documention to monitoring.sgml
> * expanded the config.sgml entry to mention the restriction on the name.
> * Changed the GUCs short description

Thank you.

> I also think, but haven't done so, we should add a double colon after
> the cluster name, so it's not:
>
> postgres: server1 stats collector process
> but
> postgres: server1: stats collector process

+1

Best regards,
Thomas Munro


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allowing NOT IN to use ANTI joins

2014-06-29 Thread David Rowley
On Fri, Jun 27, 2014 at 6:14 AM, Tom Lane  wrote:

> David Rowley  writes:
> > If there's no way to tell that the target entry comes from a left join,
> > then would it be a bit too quirky to just do the NOT NULL checking when
> > list_length(query->rtable) == 1 ? or maybe even loop over query->rtable
> and
> > abort if we find an outer join type? it seems a bit hackish, but perhaps
> it
> > would please more people than it would surprise.
>
> Why do you think you can't tell if the column is coming from the wrong
> side of a left join?
>
> It was more of that I couldn't figure out how to do it, rather than
thinking it was not possible.


> I don't recall right now if there is already machinery in the planner for
> this, but we could certainly deconstruct the from-clause enough to tell
> that.
>
> It's not hard to imagine a little function that recursively descends the
> join tree, not bothering to descend into the nullable sides of outer
> joins, and returns "true" if it finds a RangeTableRef for the desired RTE.
> If it doesn't find the RTE in the not-nullable parts of the FROM clause,
> then presumably it's nullable.
>
>
Ok, I've implemented that in the attached. Thanks for the tip.


> The only thing wrong with that approach is if you have to call the
> function many times, in which case discovering the information from
> scratch each time could get expensive.
>
>
I ended up just having the function gather up all RelIds that are on the on
the inner side. So this will just need to be called once per NOT IN clause.


> I believe deconstruct_jointree already keeps track of whether it's
> underneath an outer join, so if we were doing this later than that,
> I'd advocate making sure it saves the needed information.  But I suppose
> you're trying to do this before that.  It might be that you could
> easily save aside similar information during the earlier steps in
> prepjointree.c.  (Sorry for not having examined the patch yet, else
> I'd probably have a more concrete suggestion.)
>
>
This is being done from within pull_up_sublinks, so it's before
deconstruct_jointree.

I've attached an updated version of the patch which seems to now work
properly with outer joins.

I think I'm finally ready for a review again, so I'll update the commitfest
app.

Regards

David Rowley


not_in_anti_join_v0.6.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]

2014-06-29 Thread MauMau
Thanks, I marked it as ready for committer.  I hope Fujii san or another 
committer will commit this, refining English expression if necessary.


Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Allow empty targets in unaccent dictionary

2014-06-29 Thread Abhijit Menon-Sen
Hi.

I've attached a patch to contrib/unaccent as outlined in my review the
other day. I'm familiar with multiple languages in which modifiers are
separate characters (but not Arabic), so I decided to try a quick test
because I was curious.

I added a line containing only U+0940 (DEVANAGARI VOWEL SIGN II) to
unaccent.rules, and tried the following (the argument to unaccent is
U+0915 U+0940, and the result is U+0915):

ams=# select unaccent('unaccent','की ');
 unaccent 
--
 क 
(1 row)

So the patch works fine: it correctly removes the modifier.

To add a test, however, it would be necessary to add this modifier to
unaccent.rules. But if we're adding one modifier to unaccent.rules, we
really should add them all. I have nowhere near the motivation needed to
add all the Devanagari modifiers, let alone any of the other languages I
know, and even if I did, it still wouldn't address Mohammad's use case.

(As a separate matter, it's not clear to me if stripping these modifiers
using unaccent is something everyone will want to do.)

So, though I'm not fond of saying it, perhaps the right thing to do is
to forget my earlier objection (that the patch didn't have tests), and
just commit as-is. It's a pretty straightforward patch, and it works.

I'm setting this as ready for committer.

-- अभजत "unaccented in three languages" മനന-সন
diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index a337df6..c485a41 100644
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -105,15 +105,16 @@ initTrie(char *filename)
 			while ((line = tsearch_readline(&trst)) != NULL)
 			{
 /*
- * The format of each line must be "src trg" where src and trg
- * are sequences of one or more non-whitespace characters,
- * separated by whitespace.  Whitespace at start or end of
- * line is ignored.
+ * The format of each line must be "src" or "src trg",
+ * where src and trg are sequences of one or more
+ * non-whitespace characters, separated by whitespace.
+ * Whitespace at start or end of line is ignored. If trg
+ * is omitted, an empty string is used as a replacement.
  */
 int			state;
 char	   *ptr;
 char	   *src = NULL;
-char	   *trg = NULL;
+char	   *trg = "";
 int			ptrlen;
 int			srclen = 0;
 int			trglen = 0;
@@ -160,6 +161,10 @@ initTrie(char *filename)
 	}
 }
 
+/* It's OK to have a valid src and empty trg. */
+if (state > 0 && trglen == 0)
+	state = 5;
+
 if (state >= 3)
 	rootTrie = placeChar(rootTrie,
 		 (unsigned char *) src, srclen,

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Allow empty targets in unaccent dictionary

2014-06-29 Thread Mohammad Alhashash

Hi,

Thanks a lot for the review and comments. Here is an updated patch.

On 6/25/2014 8:20 AM, Abhijit Menon-Sen wrote:
Your patch should definitely add a test case or two to 
sql/unaccent.sql and expected/unaccent.out showing the behaviour that 
didn't work before the change. 
That would require adding new entries to the "unaccent.rules" template. 
I'm afraid that the templates I'm using for Arabic now are not complete 
enough to be including in the default dictionary.


I can create a new template just for the test cases but I've to update 
the make file to include that file in installation. Should I do this?


Thanks,

Mohammad Alhashash

diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
old mode 100644
new mode 100755
index a337df6..eb7e2a3
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -105,15 +105,15 @@ initTrie(char *filename)
while ((line = tsearch_readline(&trst)) != NULL)
{
/*
-* The format of each line must be "src trg" 
where src and trg
+* The format of each line must be "src [trg]" 
where src and trg
 * are sequences of one or more non-whitespace 
characters,
 * separated by whitespace.  Whitespace at 
start or end of
-* line is ignored.
+* line is ignored. If trg is empty, a 
zero-length string is used.
 */
int state;
char   *ptr;
char   *src = NULL;
-   char   *trg = NULL;
+   char   *trg = "";
int ptrlen;
int srclen = 0;
int trglen = 0;
@@ -160,6 +160,10 @@ initTrie(char *filename)
}
}
 
+   /* It's OK to have a valid src and empty trg. */
+   if (state > 0 && trglen == 0)
+   state = 5;
+   
if (state >= 3)
rootTrie = placeChar(rootTrie,

 (unsigned char *) src, srclen,

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cluster name in ps output

2014-06-29 Thread Andres Freund
On 2014-06-29 11:11:14 +0100, Thomas Munro wrote:
> On 29 June 2014 10:55, Andres Freund  wrote:
> > So, I'd looked at it with an eye towards committing it and found some
> > more things. I've now
> > * added the restriction that the cluster name can only be ASCII. It's
> >   shown from several clusters with differing encodings, and it's shown
> >   in process titles, so that seems wise.
> > * moved everything to the LOGGING_WHAT category
> > * added minimal documention to monitoring.sgml
> > * expanded the config.sgml entry to mention the restriction on the name.
> > * Changed the GUCs short description
> 
> Thank you.
> 
> > I also think, but haven't done so, we should add a double colon after
> > the cluster name, so it's not:
> >
> > postgres: server1 stats collector process
> > but
> > postgres: server1: stats collector process
> 
> +1

Committed with the above changes. Thanks for the contribution!

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Array of composite types returned from python

2014-06-29 Thread Abhijit Menon-Sen
Hi.

When this patch was first added to a CF, I had a quick look at it, but
left it for a proper review by someone more familiar with PL/Python
internals for precisely this reason:

> 2) You removed the comment:
> - /*
> -  * We don't support arrays of row types yet, so the 
> first argument
> -  * can be NULL.
> -  */
> 
> But didn't change the code there. 
> I haven't delved deep enough into the code yet to understand the full
> meaning, but the comment would indicate that if arrays of row types
> are supported, the first argument cannot be null.

I had another look now, and I think removing the comment is fine. It
actually made no sense to me in context, so I went digging a little.

After following a plpython.c → plpy_*.c refactoring (#147c2482) and a
pgindent run (#65e806cb), I found that the comment was added along with
the code by this commit:

commit db7386187f78dfc45b86b6f4f382f6b12cdbc693
Author: Peter Eisentraut 
Date:   Thu Dec 10 20:43:40 2009 +

PL/Python array support

Support arrays as parameters and return values of PL/Python functions.

At the time, the code looked like this:

+   else
+   {
+   nulls[i] = false;
+   /* We don't support arrays of row types yet, so the 
first
+* argument can be NULL. */
+   elems[i] = arg->elm->func(NULL, arg->elm, obj);
+   }

Note that the first argument was actually NULL, so the comment made
sense when it was written. But the code was subsequently changed to
pass in arg->elm by the following commit:

commit 09130e5867d49c72ef0f11bef30c5385d83bf194
Author: Tom Lane 
Date:   Mon Oct 11 22:16:40 2010 -0400

Fix plpython so that it again honors typmod while assigning to tuple 
fields.

This was broken in 9.0 while improving plpython's conversion behavior 
for
bytea and boolean.  Per bug report from maizi.

The comment should have been removed at the same time. So I don't think
there's a problem here.

> 3) This is such a simple change with no new infrastructure code
> (PLyObject_ToComposite already exists). Can you think of a reason
> why this wasn't done until now? Was it a simple miss or purposefully
> excluded?

This is not an authoritative answer: I think the infrastructure was
originally missing, but was later added in #bc411f25 for OUT parameters.
Perhaps it was overlooked at the time that the same code would suffice
for this earlier-missing case. (I've Cc:ed Peter E. in case he has any
comments.)

I think the patch is ready for committer.

-- Abhijit

P.S. I'm a wee bit confused by this mail I'm replying to, because it's
signed "Ed" and looks like a response, but it's "From: Sim Zacks". I've
added the original author's address to the Cc: in case I misunderstood
something.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Array of composite types returned from python

2014-06-29 Thread Abhijit Menon-Sen
At 2014-06-29 18:08:53 +0530, a...@2ndquadrant.com wrote:
>
> I think the patch is ready for committer.

That's based on my earlier quick look and the current archaeology. But
I'm not a PL/Python user, and Ronan signed up to review the patch, so I
haven't changed the status.

Ronan, did you get a chance to look at it?

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]

2014-06-29 Thread Pavel Stehule
2014-06-29 13:35 GMT+02:00 MauMau :

> Thanks, I marked it as ready for committer.  I hope Fujii san or another
> committer will commit this, refining English expression if necessary.
>

sure

Thank you very much

Regards

Pavel


>
> Regards
> MauMau
>
>


Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]

2014-06-29 Thread Abhijit Menon-Sen
At 2014-06-29 20:35:04 +0900, maumau...@gmail.com wrote:
>
> Thanks, I marked it as ready for committer.  I hope Fujii san or
> another committer will commit this, refining English expression if
> necessary.

Since it was just a matter of editing, I went through the patch and
corrected various minor errors (typos, awkwardness, etc.). I agree
that this is now ready for committer.

I've attached the updated diff.

-- Abhijit
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index ee6ec3a..6a172dc 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -556,6 +556,15 @@ EOF
   
 
 
+
+  --help-variables
+  
+  
+  Show help about psql variables,
+  and exit.
+  
+  
+
   
  
 
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index 3aa3c16..9ff2dd9 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -78,12 +78,13 @@ usage(void)
 	printf(_("  -f, --file=FILENAME  execute commands from file, then exit\n"));
 	printf(_("  -l, --list   list available databases, then exit\n"));
 	printf(_("  -v, --set=, --variable=NAME=VALUE\n"
-			 "   set psql variable NAME to VALUE\n"));
+			 "   set psql variable NAME to VALUE e.g.: -v ON_ERROR_STOP=1\n"));
 	printf(_("  -V, --versionoutput version information, then exit\n"));
 	printf(_("  -X, --no-psqlrc  do not read startup file (~/.psqlrc)\n"));
 	printf(_("  -1 (\"one\"), --single-transaction\n"
 			 "   execute as a single transaction (if non-interactive)\n"));
 	printf(_("  -?, --help   show this help, then exit\n"));
+	printf(_("  --help-variables show a list of all specially treated variables, then exit\n"));
 
 	printf(_("\nInput and output options:\n"));
 	printf(_("  -a, --echo-all   echo all input from script\n"));
@@ -279,6 +280,99 @@ slashUsage(unsigned short int pager)
 }
 
 
+/*
+ * show list of available variables (options) from command line
+ */
+void
+help_variables(void)
+{
+	printf(_("List of specially treated variables.\n"));
+
+	printf(_("psql variables:\n"));
+	printf(_("Usage:\n"));
+	printf(_("  psql --set=NAME=VALUE\n  or \\set NAME VALUE in interactive mode\n\n"));
+
+	printf(_("  AUTOCOMMIT if set, successful SQL commands are automatically committed\n"));
+	printf(_("  COMP_KEYWORD_CASE  determine the case used to complete SQL keywords\n"
+	 " [lower, upper, preserve-lower, preserve-upper]\n"));
+	printf(_("  DBNAME the currently connected database name\n"));
+	printf(_("  ECHO   control what input is written to standard output [all, queries]\n"));
+	printf(_("  ECHO_HIDDENdisplay internal queries executed by backslash commands\n"));
+	printf(_("  ENCODING   current client character set encoding\n"));
+	printf(_("  FETCH_COUNTthe number of result rows to fetch and display at a time\n"
+	 " (default: 0=unlimited)\n"));
+	printf(_("  HISTCONTROLcontrol history list [ignorespace, ignoredups, ignoreboth]\n"));
+	printf(_("  HISTFILE   file name used to store the history list\n"));
+	printf(_("  HISTSIZE   the number of commands to store in the command history\n"));
+	printf(_("  HOST   the currently connected database server\n"));
+	printf(_("  IGNOREEOF  if unset, sending an EOF to interactive session terminates application\n"));
+	printf(_("  LASTOIDthe value of last affected OID\n"));
+	printf(_("  ON_ERROR_ROLLBACK  if set, an error doesn't stop a transaction (uses implicit SAVEPOINTs)\n"));
+	printf(_("  ON_ERROR_STOP  stop batch execution after error\n"));
+	printf(_("  PORT   server port of the current connection\n"));
+	printf(_("  PROMPT1, PROMPT2, PROMPT3\n"
+	 " specify the psql prompt\n"));
+	printf(_("  QUIET  run quietly (same as -q option)\n"));
+	printf(_("  SINGLELINE end of line terminates SQL command mode (same as -S option)\n"));
+	printf(_("  SINGLESTEP single-step mode (same as -s option)\n"));
+	printf(_("  USER   the currently connected database user\n"));
+	printf(_("  VERBOSITY  control verbosity of error reports [default, verbose, terse]\n"));
+
+	printf(_("\nPrinting options:\n"));
+	printf(_("Usage:\n"));
+	printf(_("  psql --pset=NAME[=VALUE]\n  or \\pset NAME [VALUE] in interactive mode\n\n"));
+
+	printf(_("  border border style (number)\n"));
+	printf(_("  columnsset the target width for the wrapped format\n"));
+	printf(_("  expanded (or x)toggle expanded output\n"));
+	printf(_("  fieldsep   field separator for unaligned output (default '|')\n"));
+	printf(_("  fieldsep_zero  set field separator in unaligned mode to zero\n"));
+	printf(_("  format set output format [unaligned, 

Re: [HACKERS] Array of composite types returned from python

2014-06-29 Thread Tom Lane
Abhijit Menon-Sen  writes:
> I had another look now, and I think removing the comment is fine. It
> actually made no sense to me in context, so I went digging a little.
> ...
> Note that the first argument was actually NULL, so the comment made
> sense when it was written. But the code was subsequently changed to
> pass in arg->elm by the following commit:

> commit 09130e5867d49c72ef0f11bef30c5385d83bf194
> Author: Tom Lane 
> Date:   Mon Oct 11 22:16:40 2010 -0400

> Fix plpython so that it again honors typmod while assigning to tuple 
> fields.

> This was broken in 9.0 while improving plpython's conversion behavior 
> for
> bytea and boolean.  Per bug report from maizi.

> The comment should have been removed at the same time. So I don't think
> there's a problem here.

Yeah, you're right: the comment is referring to the struct PLyTypeInfo *
argument, which isn't there at all anymore.  Mea culpa --- that's the same
sort of failure-to-update-nearby-comments thinko that I regularly mutter
about other people making :-(

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set new system identifier using pg_resetxlog

2014-06-29 Thread Abhijit Menon-Sen
At 2014-06-27 00:51:02 +0200, p...@2ndquadrant.com wrote:
>
> Also based on Alvaro's comment, I replaced the scanf parsing code with
> strtoul(l) function.

As far as I can tell (from the thread and a quick look at the patch),
your latest version of the patch addresses all the review comments.

Should I mark this ready for committer now?

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] Set new system identifier using pg_resetxlog

2014-06-29 Thread Andres Freund
On 2014-06-29 19:44:21 +0530, Abhijit Menon-Sen wrote:
> At 2014-06-27 00:51:02 +0200, p...@2ndquadrant.com wrote:
> >
> > Also based on Alvaro's comment, I replaced the scanf parsing code with
> > strtoul(l) function.
> 
> As far as I can tell (from the thread and a quick look at the patch),
> your latest version of the patch addresses all the review comments.
> 
> Should I mark this ready for committer now?

Well, we haven't really agreed on a way forward yet. Robert disagrees
that the feature is independently useful and thinks it might be
dangerous for some users. I don't agree with either of those points, but
that doesn't absolve the patch from the objection. I think tentatively
have been more people in favor, but it's not clear cut imo.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-29 Thread Abhijit Menon-Sen
So, what's the status of this patch?

There's been quite a lot of discussion (though only about the approach;
no formal code/usage review has yet been posted), but as far as I can
tell, it just tapered off without any particular consensus.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD

2014-06-29 Thread Tom Lane
Dave McGuire  writes:
> On 06/25/2014 01:57 PM, Tom Lane wrote:
>> Is there anyone in the NetBSD/VAX community who would be willing to
>> host a PG buildfarm member?

>   I could put together a simh-based machine (i.e., fast) on a vm, if
> nobody else has stepped up for this.

No other volunteers have emerged, so if you'd do that it'd be great.

Probably first we ought to fix whatever needs to be fixed to get a
standard build to go through.  The one existing NetBSD machine in the
buildfarm, coypu, doesn't seem to be using any special configuration
hacks:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=coypu&dt=2014-06-29%2012%3A33%3A12
so I'm a bit confused as to what we need to change for VAX.

> Dave McGuire, AK4HZ/3
> New Kensington, PA

Hey, right up the river from here!

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Decoding of (nearly) empty transactions and regression tests

2014-06-29 Thread Andres Freund
Hi,

As shown by the CLOBBER_CACHE_ALWAYS animal test_decoding's tests fail
if they take very long. The failures aren't bugs, but diffs like:
   BEGIN
+  COMMIT
+  BEGIN
   table public.tr_sub: INSERT: id[integer]:1 path[text]:'1-top-#1'
The added transaction is an analyze started by autovacuum.

So, I've been wondering for a while to get rid of this, but I haven't
come up with something I like.

To recap, currently the rules for visibly decoding a transaction
(i.e. calling the output plugin callbacks) are:
1) it has been WAL logged (duh)
2) it happened in the database we're decoding
3) it contains something. 'Something' currently means that a logged
   table has changed, or the transaction contains invalidations.

Note that just because a transaction contains 'something', these changes
aren't necessarily shown as we don't decode system table changes. I
think that behaviour makes sense because otherwise something like CREATE
TABLE without further changes wouldn't show up in the change stream.

Solution I)
Change ReorderBufferCommit() to not call the begin()/commit() callbacks
if no change() callback has been called. Technically that's trivial, but
I don't think that'd end up being a better behaviour.

Solution II)
Disable autovacuum/analyze for the tables in the regression tests. We
test vacuum explicitly, so we wouldn't loose too much test coverage.

Solution III)
Mark transactions that happen purely internally as such, using a
xl_xact_commit->xinfo flag. Not particularly pretty and the most
invasive of the solutions.

I'm favoring II) so far... Other opinions?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-29 Thread Tom Lane
Abhijit Menon-Sen  writes:
> So, what's the status of this patch?
> There's been quite a lot of discussion (though only about the approach;
> no formal code/usage review has yet been posted), but as far as I can
> tell, it just tapered off without any particular consensus.

AFAICT, people generally agree that this would probably be useful,
but there's not consensus on how far the code should be willing to
"reach" for a match, nor on what to do when there are multiple
roughly-equally-plausible candidates.

Although printing all candidates seems to be what's preferred by
existing systems with similar facilities, I can see the point that
constructing the message in a translatable fashion might be difficult.
So personally I'd be willing to abandon insistence on that.  I still
think though that printing candidates with very large distances
would be unhelpful.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Decoding of (nearly) empty transactions and regression tests

2014-06-29 Thread Tom Lane
Andres Freund  writes:
> Solution I)
> Change ReorderBufferCommit() to not call the begin()/commit() callbacks
> if no change() callback has been called. Technically that's trivial, but
> I don't think that'd end up being a better behaviour.

> Solution II)
> Disable autovacuum/analyze for the tables in the regression tests. We
> test vacuum explicitly, so we wouldn't loose too much test coverage.

> Solution III)
> Mark transactions that happen purely internally as such, using a
> xl_xact_commit->xinfo flag. Not particularly pretty and the most
> invasive of the solutions.

> I'm favoring II) so far... Other opinions?

You mean disable autovac only in the contrib/test_decoding check run,
right?  That's probably OK since it's tested in the main regression
runs.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Decoding of (nearly) empty transactions and regression tests

2014-06-29 Thread Andres Freund
On 2014-06-29 10:36:22 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > Solution I)
> > Change ReorderBufferCommit() to not call the begin()/commit() callbacks
> > if no change() callback has been called. Technically that's trivial, but
> > I don't think that'd end up being a better behaviour.
> 
> > Solution II)
> > Disable autovacuum/analyze for the tables in the regression tests. We
> > test vacuum explicitly, so we wouldn't loose too much test coverage.
> 
> > Solution III)
> > Mark transactions that happen purely internally as such, using a
> > xl_xact_commit->xinfo flag. Not particularly pretty and the most
> > invasive of the solutions.
> 
> > I'm favoring II) so far... Other opinions?
> 
> You mean disable autovac only in the contrib/test_decoding check run,
> right?  That's probably OK since it's tested in the main regression
> runs.

Actually I'd not even though of that, but just of disabling it on
relations with relevant amounts of changes in the test_decoding
tests. That way it'd work even when run against an existing server (via
installcheck-force) which I found useful during development.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD

2014-06-29 Thread Andres Freund
On 2014-06-29 10:24:22 -0400, Tom Lane wrote:
> Dave McGuire  writes:
> > On 06/25/2014 01:57 PM, Tom Lane wrote:
> >> Is there anyone in the NetBSD/VAX community who would be willing to
> >> host a PG buildfarm member?
> 
> >   I could put together a simh-based machine (i.e., fast) on a vm, if
> > nobody else has stepped up for this.
> 
> No other volunteers have emerged, so if you'd do that it'd be great.

Maybe I'm just not playful enough, but keeping a platform alive so we
can run postgres in simulator seems a bit, well, pointless. On the other
hand VAX on *BSD isn't causing many problems that I'm aware of though,
so, whatever.

I've had a quick look and it seems netbsd emulates atomics on vax for
its own purposes (_do_cas in
https://www.almos.fr/trac/netbsdtsar/browser/vendor/netbsd/5/src/sys/arch/vax/vax/lock_stubs.S)
using a hashed lock.

Interestingly ither my nonexistant VAX knowledge is betraying me
(wouldn't be a surprise) or the algorithm doesn't test whether the lock
(bbssi) actually suceeded though...

So I don't think we'd be much worse off with the userland spinlock
protecting atomic ops.

> Probably first we ought to fix whatever needs to be fixed to get a
> standard build to go through.  The one existing NetBSD machine in the
> buildfarm, coypu, doesn't seem to be using any special configuration
> hacks:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=coypu&dt=2014-06-29%2012%3A33%3A12
> so I'm a bit confused as to what we need to change for VAX.

That's probably something we should fix independently though. One of the
failures was:

> gmake[2]: Leaving directory
> '/usr/pkgsrc/databases/postgresql93-server/work/postgresql-9.3.4/src/backend'
> gcc -O1 -fgcse -fstrength-reduce -fgcse-after-reload -I/usr/include
> -I/usr/local/include -Wall -Wmissing-prototypes -Wpointer-arith
> -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute
> -Wformat-security -fno-strict-aliasing -fwrapv -pthread  -mt -D_REENTRANT
> -D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS -I../../src/port -DFRONTEND
> -I../../src/include -I/usr/include -I/usr/local/include   -c -o thread.o
> thread.c
> cc1: error: unrecognized command line option "-mt" : recipe for
> target 'thread.o' failed
> gmake[1]: *** [thread.o] Error 1
> gmake[1]: Leaving directory
> '/usr/pkgsrc/databases/postgresql93-server/work/postgresql-9.3.4/src/port'
> ../../../src/Makefile.global:423: recipe for target 'submake-libpgport'

which I don't really understand - we actually test all that in
acx_pthread.m4?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD

2014-06-29 Thread Tom Lane
Andres Freund  writes:
> Maybe I'm just not playful enough, but keeping a platform alive so we
> can run postgres in simulator seems a bit, well, pointless.

True --- an actual machine would be more useful, even if slower.  Some
of the existing buildfarm critters are pretty slow already, so that's
not a huge hindrance AFAIK.

> That's probably something we should fix independently though. One of the
> failures was:
>> cc1: error: unrecognized command line option "-mt" : recipe for
>> target 'thread.o' failed
> which I don't really understand - we actually test all that in
> acx_pthread.m4?

Yeah.  We'd need to look at the relevant part of config.log to be sure,
but my guess is that configure tried that switch, failed to recognize
that it wasn't actually working, and seized on it as what to use.
Maybe the test-for-workingness isn't quite right for this platform.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Decoding of (nearly) empty transactions and regression tests

2014-06-29 Thread Tom Lane
Andres Freund  writes:
> On 2014-06-29 10:36:22 -0400, Tom Lane wrote:
>> You mean disable autovac only in the contrib/test_decoding check run,
>> right?  That's probably OK since it's tested in the main regression
>> runs.

> Actually I'd not even though of that, but just of disabling it on
> relations with relevant amounts of changes in the test_decoding
> tests. That way it'd work even when run against an existing server (via
> installcheck-force) which I found useful during development.

I think that a change that affects the behavior in any other test runs is
not acceptable.  Our testing of autovac is weaker than I'd like already
(since the main regression runs are designed to not show visible changes
when it runs).  I don't want it being turned off elsewhere for the benefit
of this test.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Decoding of (nearly) empty transactions and regression tests

2014-06-29 Thread Andres Freund
On 2014-06-29 11:08:39 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2014-06-29 10:36:22 -0400, Tom Lane wrote:
> >> You mean disable autovac only in the contrib/test_decoding check run,
> >> right?  That's probably OK since it's tested in the main regression
> >> runs.
> 
> > Actually I'd not even though of that, but just of disabling it on
> > relations with relevant amounts of changes in the test_decoding
> > tests. That way it'd work even when run against an existing server (via
> > installcheck-force) which I found useful during development.
> 
> I think that a change that affects the behavior in any other test runs is
> not acceptable.  Our testing of autovac is weaker than I'd like already
> (since the main regression runs are designed to not show visible changes
> when it runs).  I don't want it being turned off elsewhere for the benefit
> of this test.

Hm? I think we're misunderstanding each other - I was thinking of tacking
ALTER TABLE .. SET (AUTOVACUUM_ENABLED = false) to the tables created in
test_decoding/sql/, not to something outside.

Since we ignore transactions in other databases and the tests run in a
database freshly created by pg_regress that should get rid of spurious
transactions. Unless somebody fiddled with the template database or is
close to a wraparound - but I'd be pretty surprised if that wouldn't
cause failures in other tests as wel.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Decoding of (nearly) empty transactions and regression tests

2014-06-29 Thread Tom Lane
Andres Freund  writes:
> Hm? I think we're misunderstanding each other - I was thinking of tacking
> ALTER TABLE .. SET (AUTOVACUUM_ENABLED = false) to the tables created in
> test_decoding/sql/, not to something outside.

Ah, got it.  Yeah, seems like an OK workaround.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD

2014-06-29 Thread Tom Lane
[ trimming the cc list since this isn't VAX-specific ]

I wrote:
> Yeah.  We'd need to look at the relevant part of config.log to be sure,
> but my guess is that configure tried that switch, failed to recognize
> that it wasn't actually working, and seized on it as what to use.
> Maybe the test-for-workingness isn't quite right for this platform.

BTW, it sure looks like the part of ACX_PTHREAD beginning with
 # Various other checks:
 if test "x$acx_pthread_ok" = xyes; then
(lines 163..210 in HEAD's acx_pthread.m4) is dead code.  One might
think that this runs if the previous loop found any working thread/
library combinations, but actually it runs only if the *last* switch
tried worked, which seems a bit improbable, and even if that was the
intention it's sure fragile as can be.

It looks like that section is mostly AIX-specific hacks, and given that
it's been awhile since there was any AIX in the buildfarm, I wonder if
that code is correct or needed at all.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to send transaction commit/rollback stats to the stats collector unconditionally.

2014-06-29 Thread Kevin Grittner
I have reviewed this patch, and think we should do what the patch
is trying to do, but I don't think the submitted patch would
actually work.  I have attached a suggested patch which I think
would work.  Gurjeet, could you take a look at it?

My comments on prior discussion embedded below.

Gurjeet Singh  wrote:
> Tom Lane  wrote:
>> Alvaro Herrera  writes:
>>> I'm not sure I understand the point of this whole thing.
>>> Realistically, how many transactions are there that do not
>>> access any database tables?
>>
>> I think that something like "select * from pg_stat_activity"
>> might not bump any table-access counters, once the relevant
>> syscache entries had gotten loaded.  You could imagine that a
>> monitoring app would do a long series of those and nothing else
>> (whether any actually do or not is a different question).
>
> +1. This is exactly what I was doing; querying pg_stat_database
> from a psql session, to track transaction counters.

+1

A monitoring application might very well do exactly that.  Having
the history graphs show large spikes might waste someone's time
tracking down the cause.  (In fact, that seems to be exactly what
happened to Gurjeet, prompting this patch~.)  I've been in similar
situations -- enough to appreciate how user-unfriendly such
behavior is.

>> But still, it's a bit hard to credit that this patch is solving
>> any real problem.  Where's the user complaints about the
>> existing behavior?
>
> Consider my original email a user complaint, albeit with a patch
> attached. I was trying to ascertain TPS on a customer's instance
> when I noticed this behaviour; it violated POLA. Had I not
> decided to dig through the code to find the source of this
> behaviour, as a regular user I would've reported this anomaly as
> a bug, or maybe turned a blind eye to it labeling it as a quirky
> behaviour.

... or assumed that it was real transaction load during that
interval.  There have probably been many bewildered users who
thought they missed some activity storm from their own software,
and run around trying to identify the source -- never realizing it
was a measurement anomaly caused by surprising behavior of the
statistics gathering.

>> That is, even granting that anybody has a workload that acts
>> like this, why would they care ...
>
> To avoid surprises!
>
> This behaviour, at least in my case, causes Postgres to misreport
> the very thing I am trying to calculate.

Yup.  What's the point of reporting these numbers if we're going to
allow that kind of distortion?

>> and are they prepared to take a performance hit
>> to avoid the counter jump after the monitoring app exits?
>
> It doesn't look like we know how much of a  performance hit this
> will cause. I don't see any reasons cited in the code, either.
> Could this be a case of premature optimization. The commit log
> shows that, during the overhaul (commit 77947c5), this check was
> put in place to emulate what the previous code did; code before
> that commit reported stats, including transaction stats, only if
> there were any regular or shared table stats to report.

More than that, this only causes new activity for a connection
which, within a single PGSTAT_STAT_INTERVAL, ran queries -- but
none of the queries run accessed any tables.  That should be a
pretty small number of connections, since there only
special-purpose connections (e.g., monitoring) are likely to do
that.  And when it does happen, the new activity is just the same
as what any connection which does access a table would generate.
There's nothing special or more intense about this.  And an idle
connection won't generate any new activity.  This concern seem like
much ado about nothing (or about as close to nothing as you can
get).

That said, if you *did* actually have a workload where you were
using the database engine as a calculator, running such queries at
volume on a lot of connections, wouldn't you want the statistics to
represent that accurately?

> Besides, there's already a throttle built in using the
> PGSTAT_STAT_INTERVAL limit.

This is a key point; neither the original patch nor my proposed
alternative bypasses that throttling.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Companydiff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 3ab1428..6e6f7fe 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -753,6 +753,7 @@ pgstat_report_stat(bool force)
 
 	/* Don't expend a clock check if nothing to do */
 	if ((pgStatTabList == NULL || pgStatTabList->tsa_used == 0) &&
+		pgStatXactCommit == 0 && pgStatXactRollback == 0 &&
 		!have_function_stats && !force)
 		return;
 
@@ -817,11 +818,11 @@ pgstat_report_stat(bool force)
 	}
 
 	/*
-	 * Send partial messages.  If force is true, make sure that any pending
-	 * xact commit/abort gets counted, even if no table stats to send.
+	 * Send partial messages.  Make sure that any pending xact commit/abort
+	 * ge

Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD

2014-06-29 Thread Tom Lane
I wrote:
> BTW, it sure looks like the part of ACX_PTHREAD beginning with
>  # Various other checks:
>  if test "x$acx_pthread_ok" = xyes; then
> (lines 163..210 in HEAD's acx_pthread.m4) is dead code.

On closer inspection, this has been broken since commit
e48322a6d6cfce1ec52ab303441df329ddbc04d1, which is just barely short of
its tenth birthday.  The reason we've not noticed is that Postgres makes
no use of PTHREAD_CREATE_JOINABLE, nor of PTHREAD_CC, nor of HAVE_PTHREAD,
nor of the success/failure options for ACX_PTHREAD.

I'm tempted to just rip out all the useless code rather than fix the
logic bug as such.  OTOH, that might complicate updating to more recent
versions of the original Autoconf macro.  On the third hand, we've not
bothered to do that in ten years either.

Thoughts?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD

2014-06-29 Thread Andres Freund
On 2014-06-29 12:20:02 -0400, Tom Lane wrote:
> I wrote:
> > BTW, it sure looks like the part of ACX_PTHREAD beginning with
> >  # Various other checks:
> >  if test "x$acx_pthread_ok" = xyes; then
> > (lines 163..210 in HEAD's acx_pthread.m4) is dead code.
> 
> On closer inspection, this has been broken since commit
> e48322a6d6cfce1ec52ab303441df329ddbc04d1, which is just barely short of
> its tenth birthday.  The reason we've not noticed is that Postgres makes
> no use of PTHREAD_CREATE_JOINABLE, nor of PTHREAD_CC, nor of HAVE_PTHREAD,
> nor of the success/failure options for ACX_PTHREAD.
> 
> I'm tempted to just rip out all the useless code rather than fix the
> logic bug as such.  OTOH, that might complicate updating to more recent
> versions of the original Autoconf macro.  On the third hand, we've not
> bothered to do that in ten years either.

Rip it out, maye leaving a comment inplace like
/* upstream tests for stuff we don't need here */

in its place. Since there have been a number of changes to the file,
one large missing hunk shouldn't make the task of syncing measurably
more difficult.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to send transaction commit/rollback stats to the stats collector unconditionally.

2014-06-29 Thread Tom Lane
Kevin Grittner  writes:
> Gurjeet Singh  wrote:
>> Besides, there's already a throttle built in using the
>> PGSTAT_STAT_INTERVAL limit.

> This is a key point; neither the original patch nor my proposed
> alternative bypasses that throttling.

That's a fair point that probably obviates my objection.  I think the
interval throttling is more recent than the code in question.

If we're going to do it like this, then I think the force flag should
be considered to do nothing except override the clock check, which
probably means it shouldn't be tested in the initial if() at all.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-29 Thread Kevin Grittner
Robert Haas  wrote:
> On Wed, Jun 25, 2014 at 11:40 AM, Tom Lane  wrote:

>>  Fujii Masao  writes:
>>>  Why is IIT timeout turned on only when send_ready_for_query is true?
>>>  I was thinking it should be turned on every time a message is received.
>>>  Imagine the case where the session is in idle-in-transaction state and
>>>  a client gets stuck after sending Parse message and before sending Bind
>>>  message. This case would also cause long transaction problem and should
>>>  be resolved by IIT timeout. But IIT timeout that this patch adds cannot
>>>  handle this case because it's enabled only when send_ready_for_query is
>>>  true. Thought?
>> 
>>  I think you just moved the goalposts to the next county.
>> 
>>  The point of this feature, AFAICS, is to detect clients that are failing
>>  to issue another query or close their transaction as a result of bad
>>  client logic.  It's not to protect against network glitches.
> 
> Hmm, so there's no reason a client, after sending one protocol
> message, might not pause before sending the next protocol message?
> That seems like a surprising argument.  Someone couldn't Parse and
> then wait before sending Bind and Execute, or Parse and Bind and then
> wait before sending Execute?
> 
> 
>>  Moreover, there would be no way to implement a timeout like that without
>>  adding a gettimeofday() call after every packet receipt, which is overhead
>>  we do not need and cannot afford.  I don't think this feature should add
>>  *any* gettimeofday's beyond the ones that are already there.
> 
> That's an especially good point if we think that this feature will be
> enabled commonly or by default - but as Fujii Masao says, it might be
> tricky to avoid.  :-(

I think that this patch, as it stands, is a clear win if the
postgres_fdw part is excluded.  Remaining points of disagreement
seem to be the postgres_fdw, whether a default value measured in
days might be better than a default of off, and whether it's worth
the extra work of covering more.  The preponderance of opinion
seems to be in favor of excluding the postgres_fdw changes, with
Tom being violently opposed to including it.  I consider the idea
of the FDW ignoring the server setting dead.  Expanding the
protected area of code seems like it would be sane to ask whoever
wants to extend the protection in that direction to propose a
patch.  My sense is that there is more support for a default of a
small number of days than a default of never, but that seems far
from settled.

I propose to push this as it stands except for the postgres_fdw
part.  The default is easy enough to change if we reach consensus,
and expanding the scope can be a new patch in a new CF. 
Objections?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-29 Thread Tom Lane
Kevin Grittner  writes:
> I propose to push this as it stands except for the postgres_fdw
> part.  The default is easy enough to change if we reach consensus,
> and expanding the scope can be a new patch in a new CF. 
> Objections?

There's been enough noise about the external definition that I think
no one has bothered to worry about whether the patch is actually
implemented sanely.  I do not think it is: specifically, the notion
that we will call ereport(FATAL) directly from a signal handler
does not fill me with warm fuzzies.  While the scope of code in
which the signal is enabled is relatively narrow, that doesn't mean
there's no problem.  Consider in particular the case where we are using
SSL: this will mean that we take control away from OpenSSL, which might be
in the midst of doing something if we're unlucky timing-wise, and then
we'll re-entrantly invoke it to try to send an error message to the
client.  That seems pretty unsafe.  Another risky flow of control is
if ReadCommand throws an ordinary error and then the timeout interrupt
happens while we're somewhere in the transaction abort logic (the
sigsetjmp stanza in postgres.c).

I'd be happier if this were implemented in the more traditional
style where the signal handler just sets a volatile flag variable,
which would be consulted at determinate places in the mainline logic.
Or possibly it could be made safe if we only let it throw the error
directly when ImmediateInterruptOK is true (compare the handling
of notify/catchup interrupts).

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission

2014-06-29 Thread Abhijit Menon-Sen
Hi Vaishnavi.

In addition to Jaime's comments about the functionality, here are are
some comments about the code.

Well, they were supposed to be comments about the code, but it turns out
I have comments about the feature as well. We need to figure out what to
do about the database and user columns. Returning an array containing
e.g. "{all}" won't fly. We must distinguish between "all" and "{all}".

I don't have a good solution, other than returning two columns each: one
a string, and one an array, only one of them set for any given record.

> + int
> + GetNumHbaLines(void)
> + {

Please add a blank line before this.

> + /*
> +  * Fetches the HbaLine corresponding to linenum variable.
> +  * Fill the values required to build a tuple, of view pg_hba_settings, in 
> values pointer.
> +  */
> + void
> + GetHbaLineByNum(int linenum, const char **values)
> + {

I suggest naming this function GetHbaValuesByLinenum() and changing the
comment to "Fill in suitable values to build a tuple representing the
HbaLine corresponding to the given linenum".

Actually, maybe it should be hba_getvaluesbyline() for consistency with
the other functions in the file? In that case, the preceding function
should also be renamed to hba_getnumlines().

> + /* Retrieving connection type */
> + switch (hba->conntype)
> + {

The comment should be just "Connection type". Generally speaking, all of
the "Retrieving" comments should be changed similarly. Or just removed.

> + case ctLocal:
> + values[0] = pstrdup("Local");
> + break;

I agree with Jaime: this should be lowercase. And do you really need to
pstrdup() these strings?

> + appendStringInfoString(&str, "{");
> + foreach(dbcell, hba->databases)
> + {
> + tok = lfirst(dbcell);
> + appendStringInfoString(&str, tok->string);
> + appendStringInfoString(&str, ",");
> + }
> + 
> + /*
> +  * For any number of values inserted, separator at the end needs to be
> +  * replaced by string terminator
> +  */
> + if (str.len > 1)
> + {
> + str.data[str.len - 1] = '\0';
> + str.len -= 1;
> + appendStringInfoString(&str, "}");
> + values[1] = str.data;
> + }
> + else
> + values[1] = NULL;

I don't like this at all. I would write it something like this:

n = 0;

/* hba->conntype stuff */

n++;
if (list_length(hba->databases) != 0)
{
initStringInfo(&str);
appendStringInfoString(&str, "{");

foreach(cell, hba->databases)
{
tok = lfirst(cell);
appendStringInfoString(&str, tok->string);
appendStringInfoString(&str, ",");
}

str.data[str.len - 1] = '}';
values[n] = str.data;
}
else
values[n] = NULL;

Note the variable n instead of using 0/1/… indexes into values, and that
I moved the call to initStringInfo from the beginning of the function.

The same applies to the other cases below.

(But this is, of course, all subject to whatever solution is found to
the all/{all} problem.)

> /* Retrieving Socket address */
> switch (hba->addr.ss_family)
> {
> case AF_INET:
> len = sizeof(struct sockaddr_in);
> break;
> #ifdef HAVE_IPV6
> case AF_INET6:
> len = sizeof(struct sockaddr_in6);
> break;
> #endif
> default:
> len = sizeof(struct sockaddr_storage);
> break;
> }
> if (getnameinfo((const struct sockaddr *) & hba->addr, len, buffer, 
> sizeof(buffer), NULL, 0, NI_NUMERICHOST) == 0)
> values[3] = pstrdup(buffer);
> else
> values[3] = NULL;

This should use pg_getnameinfo_all. You don't need the switch, just pass
in .salen. Use a buffer of NI_MAXHOST, not 256. Look for other calls to
pg_getnameinfo_all for examples. (Also split long lines.)

(Note: this pstrdup is OK.)

> /* Retrieving socket mask */
> switch (hba->mask.ss_family)
> {
> case AF_INET:

Same.

> + case ipCmpMask:
> + values[5] = pstrdup("Mask");
> + break;

Lowercase, no pstrdup.

> + case uaReject:
> + values[7] = pstrdup("Reject");
> + break;

Same.

> + if ((hba->usermap) && (hba->auth_method == uaIdent || hba->auth_method 
> == uaPeer || hba->auth_method == uaCert || hba->auth_method == uaGSS || 
> hba->auth_method == uaSSPI))

Split across lines.

> + appendStringInfoString(&str, pstrdup(hba->ldapserver));

No need for the many, many pstrdup()s.

> + if (str.len > 1)
> + {
> + str.data[str.len - 1] = '\0';
> + str.len -= 1;
> + appendStringInfoString(&str, "}");
> + values[8] = str.data;
> + }
> + else
> + values[8] = NULL;

This should become:

if (s

Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD

2014-06-29 Thread Tom Lane
Dave McGuire  writes:
> On 06/29/2014 10:54 AM, Andres Freund wrote:
>> Maybe I'm just not playful enough, but keeping a platform alive so we
>> can run postgres in simulator seems a bit, well, pointless.

>   On the "in a simulator" matter: It's important to keep in mind that
> there are more VAXen out there than just simulated ones.  I'm offering
> up a simulated one here because I can spin it up in a dedicated VM on a
> VMware host that's already running and I already have power budget for.
>  I could just as easily run it on real hardware...there are, at last
> count, close to forty real-iron VAXen here, but only a few of those are
> running 24/7.  I'd happily bring up another one to do Postgres builds
> and testing, if someone will send me the bucks to pay for the additional
> power and cooling.  (that is a real offer)

Well, the issue from our point of view is that a lot of what we care about
testing is extremely low-level hardware behavior, like whether spinlocks
work as expected across processors.  It's not clear that a simulator would
provide a sufficiently accurate emulation.

OTOH, the really nasty issues like cache coherency rules don't arise in
single-processor systems.  So unless you have a multiprocessor VAX
available to spin up, a simulator may tell us as much as we'd learn
anyway.

(If you have got one, maybe some cash could be found --- we do have
project funds available, and I think they'd be well spent on testing
purposes.  I don't make those decisions though.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-29 Thread Robert Haas
On Sun, Jun 29, 2014 at 12:32 PM, Kevin Grittner  wrote:
>>>  Moreover, there would be no way to implement a timeout like that without
>>>  adding a gettimeofday() call after every packet receipt, which is overhead
>>>  we do not need and cannot afford.  I don't think this feature should add
>>>  *any* gettimeofday's beyond the ones that are already there.
>>
>> That's an especially good point if we think that this feature will be
>> enabled commonly or by default - but as Fujii Masao says, it might be
>> tricky to avoid.  :-(
>
> I think that this patch, as it stands, is a clear win if the
> postgres_fdw part is excluded.  Remaining points of disagreement
> seem to be the postgres_fdw, whether a default value measured in
> days might be better than a default of off, and whether it's worth
> the extra work of covering more.  The preponderance of opinion
> seems to be in favor of excluding the postgres_fdw changes, with
> Tom being violently opposed to including it.  I consider the idea
> of the FDW ignoring the server setting dead.  Expanding the
> protected area of code seems like it would be sane to ask whoever
> wants to extend the protection in that direction to propose a
> patch.  My sense is that there is more support for a default of a
> small number of days than a default of never, but that seems far
> from settled.
>
> I propose to push this as it stands except for the postgres_fdw
> part.  The default is easy enough to change if we reach consensus,
> and expanding the scope can be a new patch in a new CF.
> Objections?

Yeah, I think someone should do some analysis of whether this is
adding gettimeofday() calls, and how many, and what the performance
implications are.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] Better partial index-only scans

2014-06-29 Thread Emre Hasegeli
Joshua Yanovski :
> Proof of concept initial patch for enabling index only scans for
> partial indices even when an attribute is not in the target list, as
> long as it is only used in restriction clauses that can be proved by
> the index predicate.  This also works for index quals, though they
> still can't be used in the target list.  However, this patch may be
> inefficient since it duplicates effort that is currently delayed until
> after the best plan is chosen.

I find the improvement very useful.  I use functional and partial
indexes, a lot.  I think we even need a dummy index to make more use
of these features.

> The patch works by basically repeating the logic from
> create_indexscan_plan in createplan.c that determines which clauses
> can't be discarded, instead of the current approach, which just
> assumes that any attributes referenced anywhere in a restriction
> clause has to be a column in the relevant index.  It should build
> against master and passes make check for me.  It also includes a minor
> fix in the same code in createplan.c to make sure we're explicitly
> comparing an empty list to NIL, but I can take that out if that's not
> considered in scope.  If this were the final patch I'd probably
> coalesce the code used in both places into a single function, but
> since I'm not certain that the implementation in check_index_only
> won't change substantially I held off on that.
> 
> Since the original comment suggested that this was not done due to
> planner performance concerns, I assume the performance of this
> approach is unacceptable (though I did a few benchmarks and wasn't
> able to detect a consistent difference--what would be a good test for
> this?).  As such, this is intended as more of a first pass that I can
> build on, but I wanted to get feedback at this stage on where we can
> improve (particularly if there were already ideas on how this might be
> done, as the comment hints).  Index only scans cost less than regular
> index scans so I don't think we can get away with waiting until we've
> chosen the best plan before we do the work described above.  That
> said, as I see it performance could improve in any combination of five
> ways:
> * Improve the performance of determining which clauses can't be
> discarded (e.g. precompute some information about equivalence classes
> for index predicates, mess around with the order in which we check the
> clauses to make it fail faster, switch to real union-find data
> structures for equivalence classes).
> * Find a cleverer way of figuring out whether a partial index can be
> used than just checking which clauses can't be discarded.
> * Use a simpler heuristic (that doesn't match what use to determine
> which clauses can be discarded, but still matches more than we do
> now).
> * Take advantage of work we do here to speed things up elsewhere (e.g.
> if this does get chosen as the best plan we don't need to recompute
> the same information in create_indexscan_plan).
> * Delay determining whether to use an index scan or index only scan
> until after cost analysis somehow.  I'm not sure exactly what this
> would entail.

I do not know much about the internals of the planner.  So, I am not
in a position to decide the performance is acceptable or not.  If it
is not, I think your first solution would minimize the penalty in
almost all scenarios.  Your other options seems harder to implement.

I think, it can also be useful to store predicates implied by the
index clause or the index predicate under the path tree.  We may
make use of them in future improvements.  Especially it would be
nice to avoid calling expensive functions if they are included in
an index.  This approach can also simplify the design.  The predicates
can be stored under IndexPath even index only scan is disabled.
They can be used used unconditionally on create_indexscan_plan().

I will update the patch as returned with feedback, but it would
be nice if someone more experienced give an opinion about these
ideas.  I would be happy to review further developments when they
are ready.  Please let me know if I can help any other way.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD

2014-06-29 Thread Andres Freund
On June 29, 2014 9:01:27 PM CEST, Dave McGuire  wrote:
>On 06/29/2014 02:58 PM, Patrick Finnegan wrote:
>> Last I checked, NetBSD doesn't support any sort of multiprocessor
>VAX.
>>  Multiprocessor VAXes exist, but you're stuck with either Ultrix or
>VMS
>> on them.
>
>  Hi Pat, it's good to see your name in my inbox.
>
>  NetBSD ran on multiprocessor BI-bus VAXen many, many years ago.  Is
>that support broken?

The new atomics code refers to a VAX SMP define... So somebody seems to have 
thought about it not too long ago.

Andres

Andres

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL access to database attributes

2014-06-29 Thread Tom Lane
Vik Fearing  writes:
> On 06/21/2014 10:11 PM, Pavel Stehule wrote:
>> Is any reason or is acceptable incompatible change CONNECTION_LIMIT
>> instead CONNECTION LIMIT? Is decreasing parser size about 1% good enough
>> for breaking compatibility?

> How is compatibility broken?  The grammar still accepts the old way, I
> just changed the documentation to promote the new way.

While I agree that this patch wouldn't break backwards compatibility,
I don't really see what the argument is for changing the recommended
spelling of the command.

The difficulty with doing what you've done here is that it creates
unnecessary cross-version incompatibilities; for example a 9.5 psql
being used against a 9.4 server would tab-complete the wrong spelling
of the option.  Back-patching would change the set of versions for
which the problem exists, but it wouldn't remove the problem altogether.
And in fact it'd add new problems, e.g. pg_dumpall output from a 9.3.5
pg_dumpall failing to load into a 9.3.4 server.  This is not the kind of
change we customarily back-patch anyway.

So personally I'd have just made connection_limit be an undocumented
internal equivalent for CONNECTION LIMIT, and kept the latter as the
preferred spelling, with no client-side changes.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD

2014-06-29 Thread Tom Lane
Dave McGuire  writes:
> On 06/29/2014 10:24 AM, Tom Lane wrote:
>> Hey, right up the river from here!

>   Come on up and hack!  There's always something neat going on around
> here.  Ever run a PDP-11?  B-)

There were so many PDP-11s around CMU when I was an undergrad that
I remember seeing spare ones being used as doorstops ;-).  I even
got paid to help hack on this:
http://en.wikipedia.org/wiki/C.mmp

So nah, don't want to do it any more.  Been there done that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RLS Design

2014-06-29 Thread Stephen Frost
Robert, Dean,

* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
> On 26 June 2014 18:04, Robert Haas  wrote:
> >> ALTER TABLE t1 SET POLICY p1 ON SELECT TO t1_p1_sel_quals;
> >> GRANT SELECT ON TABLE t1 TO role1 USING p1;
> >
> > As I see it, the downside of this is that it gets a lot more complex.
> > We have to revise the ACL representation, which is already pretty darn
> > complicated, to keep track not only of the grantee, grantor, and
> > permissions, but also the policies qualifying those permissions.  The
> > changes to GRANT will need to propagate into GRANT ON ALL TABLES IN
> > SCHEMA and AFTER DEFAULT PRIVILEGES.
> 
> No, it can be done without any changes to the permissions code by
> storing the ACLs on the catalog entries where the RLS quals are held,
> rather than modifying the ACL items on the table. I.e., instead of
> thinking of "USING polname" as a modifier to the grant, think of it as
> as an additional qualifier on the thing being granted.

Yeah, I agree that we could do it without changing the existing ACL
structure by using another table and having a flag in pg_class which
indicates if there are RLS policies on the table or not.

Regarding the concerns about users not using the RLS capabilities
correctly- I find that concern to be much more appropriate for the
current permissions system rather than RLS.  If a user is going to the
level of even looking at RLS then, I'd hope at least, they'll be able to
understand and make good use of RLS to implement what they need and they
would appreciate the flexibility.

To try and clarify what this distinction is-

Dean's approach with GRANT allows specifying the policy to be
used when a given role queries a given table.  Through this mechanism,
one role might have access to many different tables, possibly with a
different policy granting that access for each table.

Robert's approach defines a policy for a user and that policy is used
for all tables that user accesses.  This ties the policy very closely to
the role.

With either approach, I wonder how we are going to address the role
membership question.  Do you inherit policies through role membership?
What happens on 'set role'?  Robert points out that we should be using
"OR" for these situations of overlapping policies and I tend to agree.
Therefore, we would look at the RLS policies for a table and extract out
all of them for all of the roles which the current user is a member of,
OR them together and that would be the set of quals used.

I'm leaning towards Dean's approach.  With Robert's approach, one could
emulate Dean's approach but I suspect it would devolve quickly into one
policy per user with that policy simply being a proxy for the role
instead of being useful on its own.  With Dean's approach though, I
don't think there's a need for a policy to be a stand-alone object.  The
policy is simply a proxy for the set of quals to be added and therefore
the policy could really live as a per-table object.

> That means the syntax I proposed earlier is wrong/misleading. Instead of
> 
> GRANT SELECT ON TABLE tbl TO role USING polname;
> 
> it should really be
> 
> GRANT SELECT USING polname ON TABLE tbl TO role;

This would work, though the 'polname' could be a per-table object, no?

This could even be:

GRANT SELECT USING (sec_level=manager) ON TABLE tbl TO role;

> > There is administrative
> > complexity as well, because if you want to policy-protect an
> > additional table, you've got to add the table to the policy and then
> > update all the grants as well.  I think what will happen in practice
> > is that people will grant to PUBLIC all rights on the policy, and then
> > do all the access control through the GRANT statements.

I agree that if you want to policy protect a table that you'll need to
set the policies on the table (that's required either way) and that,
with Dean's approach, you'd have to modify the GRANTs done to that table
as well.  I don't follow what you're suggesting with granting to PUBLIC
all rights on the policy though..?

With your approach though, if you have a policy which covers all
managers and one which covers all VPs and then you have one VP whose
access should be different, you'd have to create a new policy just for
that VP and then modify all of the tables which have manager/VP access
to also have that new VP's policy too, or something along those lines,
no?

> If you assume that most users will only have one policy through which
> they can access any given table, then there is no more administrative
> overhead than we have right now. Right now you have to grant each user
> permissions on each table you define. The only difference is that now
> you throw in a "USING polname". We could also simplify administration
> by supporting
> 
> GRANT SELECT USING polname ON ALL TABLES IN SCHEMA sch TO role;
> 
> The important distinction is that this is only granting permissions on
> tables that exist now, not on tables that might be created later.

Sure, that's the same a

Re: [HACKERS] idle_in_transaction_timeout

2014-06-29 Thread Tom Lane
Robert Haas  writes:
> On Sun, Jun 29, 2014 at 12:32 PM, Kevin Grittner  wrote:
>> I propose to push this as it stands except for the postgres_fdw
>> part.  The default is easy enough to change if we reach consensus,
>> and expanding the scope can be a new patch in a new CF.
>> Objections?

> Yeah, I think someone should do some analysis of whether this is
> adding gettimeofday() calls, and how many, and what the performance
> implications are.

I believe that as the patch stands, we'd incur one new gettimeofday()
per query-inside-a-transaction, inside the enable_timeout_after() call.
(I think the disable_timeout() call would not result in a gettimeofday
call, since there would be no remaining live timeout events.)

We could possibly refactor enough to share the clock reading with the call
done in pgstat_report_activity.  Not sure how ugly that would be or
whether it's worth the trouble.  Note that in the not-a-transaction-block
case, we already have got two gettimeofday calls in this sequence, one in
pgstat_report_stat and one in pgstat_report_activity :-(

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]

2014-06-29 Thread Pavel Stehule
2014-06-29 15:24 GMT+02:00 Abhijit Menon-Sen :

> At 2014-06-29 20:35:04 +0900, maumau...@gmail.com wrote:
> >
> > Thanks, I marked it as ready for committer.  I hope Fujii san or
> > another committer will commit this, refining English expression if
> > necessary.
>
> Since it was just a matter of editing, I went through the patch and
> corrected various minor errors (typos, awkwardness, etc.). I agree
> that this is now ready for committer.
>
> I've attached the updated diff.
>

I checked it, and it looks well

Thank you

Regards

Pavel


>
> -- Abhijit
>


Re: [HACKERS] SKIP LOCKED DATA (work in progress)

2014-06-29 Thread Simon Riggs
On 29 June 2014 10:01, Thomas Munro  wrote:

> Would anyone who is interested in a SKIP LOCKED feature and
> attending the CHAR(14)/PGDay UK conference next week be
> interested in a birds-of-a-feather discussion?

Sounds like a plan. I'll check my schedule.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Array of composite types returned from python

2014-06-29 Thread Tom Lane
Abhijit Menon-Sen  writes:
>> 3) This is such a simple change with no new infrastructure code
>> (PLyObject_ToComposite already exists). Can you think of a reason
>> why this wasn't done until now? Was it a simple miss or purposefully
>> excluded?

> This is not an authoritative answer: I think the infrastructure was
> originally missing, but was later added in #bc411f25 for OUT parameters.
> Perhaps it was overlooked at the time that the same code would suffice
> for this earlier-missing case. (I've Cc:ed Peter E. in case he has any
> comments.)

> I think the patch is ready for committer.

I took a quick look at this; not really a review either, but I have
a couple comments.

1. While I think the patch does what it intends to, it's a bit distressing
that it will invoke the information lookups in PLyObject_ToComposite over
again for *each element* of the array.  We probably ought to quantify that
overhead to see if it's bad enough that we need to do something about
improving caching, as speculated in the comment in PLyObject_ToComposite.

2. I wonder whether the no-composites restriction in plpy.prepare
(see lines 133ff in plpy_spi.c) could be removed as easily.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-29 Thread Andres Freund
On 2014-06-29 15:48:15 -0400, Tom Lane wrote:
> Robert Haas  writes:
> > On Sun, Jun 29, 2014 at 12:32 PM, Kevin Grittner  wrote:
> >> I propose to push this as it stands except for the postgres_fdw
> >> part.  The default is easy enough to change if we reach consensus,
> >> and expanding the scope can be a new patch in a new CF.
> >> Objections?
> 
> > Yeah, I think someone should do some analysis of whether this is
> > adding gettimeofday() calls, and how many, and what the performance
> > implications are.
> 
> I believe that as the patch stands, we'd incur one new gettimeofday()
> per query-inside-a-transaction, inside the enable_timeout_after() call.
> (I think the disable_timeout() call would not result in a gettimeofday
> call, since there would be no remaining live timeout events.)
> 
> We could possibly refactor enough to share the clock reading with the call
> done in pgstat_report_activity.  Not sure how ugly that would be or
> whether it's worth the trouble.  Note that in the not-a-transaction-block
> case, we already have got two gettimeofday calls in this sequence, one in
> pgstat_report_stat and one in pgstat_report_activity :-(

I've seen several high throughput production servers where code around
gettimeofday is in the top three profile entries - so I'd be hesitant to
add more there. Especially as the majority of people here seems to think
we should enable this by default.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-29 Thread Tom Lane
Andres Freund  writes:
> On 2014-06-29 15:48:15 -0400, Tom Lane wrote:
>> Robert Haas  writes:
>>> Yeah, I think someone should do some analysis of whether this is
>>> adding gettimeofday() calls, and how many, and what the performance
>>> implications are.

>> I believe that as the patch stands, we'd incur one new gettimeofday()
>> per query-inside-a-transaction, inside the enable_timeout_after() call.
>> (I think the disable_timeout() call would not result in a gettimeofday
>> call, since there would be no remaining live timeout events.)
>> 
>> We could possibly refactor enough to share the clock reading with the call
>> done in pgstat_report_activity.  Not sure how ugly that would be or
>> whether it's worth the trouble.  Note that in the not-a-transaction-block
>> case, we already have got two gettimeofday calls in this sequence, one in
>> pgstat_report_stat and one in pgstat_report_activity :-(

> I've seen several high throughput production servers where code around
> gettimeofday is in the top three profile entries - so I'd be hesitant to
> add more there. Especially as the majority of people here seems to think
> we should enable this by default.

Note that we'd presumably also be adding two kernel calls associated
with setting/killing the SIGALRM timer.  I'm not sure how much those
cost, but it likely wouldn't be negligible compared to the gettimeofday
cost.

OTOH, one should not forget that there's also going to be a client
round trip involved here, so it's possible this is all down in the
noise compared to that.  But we ought to try to quantify it rather
than just hope for the best.

A thought that comes to mind in connection with that is whether we
shouldn't be doing the ReadyForQuery call (which I believe includes
fflush'ing the previous query response out to the client) before
rather than after all this statistics housekeeping.  Then at least
we'd have a shot at spending these cycles in parallel with the
network I/O and client think-time, rather than serializing it all.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-29 Thread Andres Freund
On 2014-06-29 12:53:56 -0400, Tom Lane wrote:
> I do not think it is: specifically, the notion
> that we will call ereport(FATAL) directly from a signal handler
> does not fill me with warm fuzzies.

Aren't we already pretty much doing that for
SIGTERM/pg_terminate_backend() and recovery conflict interrupts?

If we get a SIGTERM while reading a command die() will set
ProcDiePending() and call ProcessInterrupts() after disabling some other
interrupts. Then the latter will FATAL out.

Imo the idle timeout handler pretty much needs a copy of die(), just
setting a different variable than (or in addition to?) ProcDiePending.

BUT: why is what ProcessInterrupts() is doing safe? Shouldn't it always
at least set whereToSendOutput = DestNone when FATALing while reading
(potentially via openssl)? The current behaviour imo both a protocol
violation and dangerous because of what you explained?

> I'd be happier if this were implemented in the more traditional
> style where the signal handler just sets a volatile flag variable,
> which would be consulted at determinate places in the mainline logic.
> Or possibly it could be made safe if we only let it throw the error
> directly when ImmediateInterruptOK is true (compare the handling
> of notify/catchup interrupts).

Hm. That sounds approximately like what I've written above.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] delta relations in AFTER triggers

2014-06-29 Thread David Fetter
On Sat, Jun 28, 2014 at 07:35:10AM -0700, Kevin Grittner wrote:
> David Fetter  wrote:
> > On Sat, Jun 21, 2014 at 11:06:26AM -0700, Kevin Grittner wrote:
> 
> >> Here is v2.
> 
> > I've taken the liberty of making an extension that uses this.
> > Preliminary tests indicate a 10x performance improvement over the
> > user-space hack I did that's similar in functionality.
> 
> Wow, this goes well beyond what I expected for a review!  Thanks!

It was the minimum I could come up with to test whether the patch
worked.

> As I said in an earlier post, I think that this is best committed
> as a series of patches, one for the core portion and one for each
> PL which implements the ability to use the transition (delta)
> relations in AFTER triggers.

Right.  I'm still holding out hope of having the transition relations
available in some more general way, but that seems more like a
refactoring job than anything fundamental.

> Your extension covers the C trigger angle, and it seems to me to be
> worth committing to contrib as a sample of how to use this feature
> in C.

It's missing a few pieces like surfacing transition table names.  I'll
work on those.  Also, it's not clear to me how to access the pre- and
post- relations at the same time, this being necessary for many use
cases.  I guess I need to think more about how that would be done.

> It is very encouraging that you were able to use this without
> touching what I did in core, and that it runs 10x faster than the
> alternatives before the patch.

The alternative included was pretty inefficient, so there's that.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-29 Thread Andres Freund
On 2014-06-29 17:28:06 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2014-06-29 15:48:15 -0400, Tom Lane wrote:
> >> Robert Haas  writes:
> >>> Yeah, I think someone should do some analysis of whether this is
> >>> adding gettimeofday() calls, and how many, and what the performance
> >>> implications are.
> 
> >> I believe that as the patch stands, we'd incur one new gettimeofday()
> >> per query-inside-a-transaction, inside the enable_timeout_after() call.
> >> (I think the disable_timeout() call would not result in a gettimeofday
> >> call, since there would be no remaining live timeout events.)
> >> 
> >> We could possibly refactor enough to share the clock reading with the call
> >> done in pgstat_report_activity.  Not sure how ugly that would be or
> >> whether it's worth the trouble.  Note that in the not-a-transaction-block
> >> case, we already have got two gettimeofday calls in this sequence, one in
> >> pgstat_report_stat and one in pgstat_report_activity :-(
> 
> > I've seen several high throughput production servers where code around
> > gettimeofday is in the top three profile entries - so I'd be hesitant to
> > add more there. Especially as the majority of people here seems to think
> > we should enable this by default.
> 
> Note that we'd presumably also be adding two kernel calls associated
> with setting/killing the SIGALRM timer.  I'm not sure how much those
> cost, but it likely wouldn't be negligible compared to the gettimeofday
> cost.

It's probably higher, at least if we get around to replacing
gettimeofday() with clock_gettime() :(

So, i've traced a SELECT 1. We're currently doing:
1) gettimeofday() in SetCurrentStatementStartTimestamp
2) gettimeofday() pgstat_report_activity()
3) gettimeofday() for enable_timeout_after (id=STATEMENT_TIMEOUT)
4) setitimer() for schedule_alarm for STATEMENT_TIMEOUT
5) gettimeofday() for pgstat_report_activity()

Interestingly recvfrom(), setitimer(), sendto() are the only calls to
actually fully hit the kernel via syscalls (i.e. visible via strace).

The performance difference of setting up statement_timeout=10s for a
pgbench run that does:
\setrandom aid 1 100
SELECT * FROM pgbench_accounts WHERE aid = :aid;
is 164850.368336 (no statment_timeout) vs 157866.924725
(statement_timeout=10s). That's the best of 10 10s runs.
for SELECT 1 it's 242846.348628 vs 236764.177593.

Not too bad. Absolutely bleeding edge kernel/libc though; I seem to
recall different results with earlier libc/kernel combos.

I think statement_timeout's overhead should be fairly similar to what's
proposed for iit_t?

> A thought that comes to mind in connection with that is whether we
> shouldn't be doing the ReadyForQuery call (which I believe includes
> fflush'ing the previous query response out to the client) before
> rather than after all this statistics housekeeping.  Then at least
> we'd have a shot at spending these cycles in parallel with the
> network I/O and client think-time, rather than serializing it all.

Worth a try. It'd be also rather neat to to consolidate the first three
gettimeofday()'s above. Afaics they should all be consolidated via
GetCurrentTransactionStartTimestamp()...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-29 Thread Tom Lane
Andres Freund  writes:
> On 2014-06-29 12:53:56 -0400, Tom Lane wrote:
>> I do not think it is: specifically, the notion
>> that we will call ereport(FATAL) directly from a signal handler
>> does not fill me with warm fuzzies.

> Aren't we already pretty much doing that for
> SIGTERM/pg_terminate_backend() and recovery conflict interrupts?

We do that *only at safe spots*, ie CHECK_FOR_INTERRUPTS. Or at least
I sure hope so. Otherwise interrupting, eg, malloc will lead to much
unhappiness.

> BUT: why is what ProcessInterrupts() is doing safe? Shouldn't it always
> at least set whereToSendOutput = DestNone when FATALing while reading
> (potentially via openssl)?

Uh, no, that would pretty much destroy the point of trying to report
an error message at all.

We do restrict the immediate interrupt to occur only while we're actually
doing a recv(), see prepare_for_client_read and client_read_ended.
I grant that in the case of an SSL connection, that could be in the middle
of some sort of SSL handshake, so it might not be completely safe
protocol-wise --- but it's not likely to mean instant data structure
corruption.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Deferring some AtStart* allocations?

2014-06-29 Thread Andres Freund
Hi,

In workloads with mostly simple statements, memory allocations are the
primary bottleneck. Some of the allocations are unlikely to be avoidable
without major work, but others seem superflous in many scenarios.

Why aren't we delaying allocations in e.g. AtStart_Inval(),
AfterTriggerBeginXact() to when the data structures are acutally used?
In most transactions neither will be?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bad estimation together with large work_mem generates terrible slow hash joins

2014-06-29 Thread Tomas Vondra
On 26.6.2014 23:48, Tomas Vondra wrote:
> On 26.6.2014 20:43, Tomas Vondra wrote:
>> Attached is v2 of the patch, with some cleanups / minor improvements:
>>
>> * there's a single FIXME, related to counting tuples in the
> 
> Meh, I couldn't resist resolving this FIXME, so attached is v3 of the
> patch. This just adds a proper 'batch tuples' counter to the hash table.
> 
> All comments, measurements on different queries etc. welcome. We'll
> certainly do a lot of testing, because this was a big issue for us.

Attached is v4 of the patch, with a few minor improvements. The only
thing worth mentioning is overflow protection, similar to what's done in
the ExecChooseHashTableSize() function. Otherwise it's mostly about
improving comments.

Also attached is a v4 with GUC, making it easier to compare effect of
the patch, by simply setting "enable_hashjoin_bucket" to "off" (original
behaviour) or "on" (new behaviour).

And finally there's an SQL script demonstrating the effect of the patch
with various work_mem settings. For example what I see on my desktop is
this (averages from 3 runs):

= SMALL WORK MEM (2MB) =
  no dynamic buckets dynamic buckets
query A   5945 ms5969 ms
query B   6080 ms5943 ms
query C   6531 ms6822 ms
query D   6962 ms6618 ms

= MEDIUM WORK MEM (16MB) =
  no dynamic buckets dynamic buckets
query A   7955 ms7944 ms
query B   9970 ms7569 ms
query C   8643 ms8560 ms
query D  33908 ms7700 ms

= LARGE WORK MEM (64MB) =
  no dynamic buckets dynamic buckets
query A   10235 ms   10233 ms
query B   32229 ms9318 ms
query C   14500 ms   10554 ms
query D  213344 ms9145 ms

Where "A" is "exactly estimated" and the other queries suffer by various
underestimates. My observations from this are:

(1) For small work_mem values it does not really matter, thanks to the
caching effects (the whole hash table fits into L2 CPU cache).

(2) For medium work_mem values (not really huge, but exceeding CPU
caches), the differences are negligible, except for the last query
with most severe underestimate. In that case the new behaviour is
much faster.

(3) For large work_mem values, the speedup is pretty obvious and
dependent on the underestimate.

The question is why to choose large work_mem values when the smaller
values actually perform better. Well, the example tables are not
perfectly representative. In case the outer table is much larger and
does not fit into RAM that easily (which is the case of large fact
tables or joins), the rescans (because of more batches) are more
expensive and outweight the caching benefits.

Also, the work_mem is shared with other nodes, e.g. aggregates, and
decreasing it because of hash joins would hurt them.

regards
Tomas
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 0d9663c..db3a953 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1900,18 +1900,20 @@ show_hash_info(HashState *hashstate, ExplainState *es)
 		if (es->format != EXPLAIN_FORMAT_TEXT)
 		{
 			ExplainPropertyLong("Hash Buckets", hashtable->nbuckets, es);
+			ExplainPropertyLong("Original Hash Buckets",
+hashtable->nbuckets_original, es);
 			ExplainPropertyLong("Hash Batches", hashtable->nbatch, es);
 			ExplainPropertyLong("Original Hash Batches",
 hashtable->nbatch_original, es);
 			ExplainPropertyLong("Peak Memory Usage", spacePeakKb, es);
 		}
-		else if (hashtable->nbatch_original != hashtable->nbatch)
+		else if ((hashtable->nbatch_original != hashtable->nbatch) || (hashtable->nbuckets_original != hashtable->nbuckets))
 		{
 			appendStringInfoSpaces(es->str, es->indent * 2);
 			appendStringInfo(es->str,
-			"Buckets: %d  Batches: %d (originally %d)  Memory Usage: %ldkB\n",
-			 hashtable->nbuckets, hashtable->nbatch,
-			 hashtable->nbatch_original, spacePeakKb);
+			"Buckets: %d (originally %d)  Batches: %d (originally %d)  Memory Usage: %ldkB\n",
+			 hashtable->nbuckets, hashtable->nbuckets_original,
+			 hashtable->nbatch, hashtable->nbatch_original, spacePeakKb);
 		}
 		else
 		{
diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index 589b2f1..96fdd68 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -39,6 +39,7 @@
 
 
 static void ExecHashIncreaseNumBatches(HashJoinTable hashtable);
+static void ExecHashIncreaseNumBuckets(HashJoinTable hashtable);
 static void ExecHashBuildSkewHash(HashJoinTable hashtable, Hash *node,
 	  int mcvsToUse);
 static void ExecHashSkewTableInsert(HashJoinTable hashtable,
@@ -271,6 +272,7 @@ ExecHashTableCreate(Hash *node, List *hashOperators, bool keepNulls)
 	 */
 	hashtable = (HashJoinTable) palloc(sizeof(HashJoinTableData));
 	hashtable->nbuckets = 

Re: [HACKERS] idle_in_transaction_timeout

2014-06-29 Thread Andres Freund
On 2014-06-29 19:13:55 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2014-06-29 12:53:56 -0400, Tom Lane wrote:
> >> I do not think it is: specifically, the notion
> >> that we will call ereport(FATAL) directly from a signal handler
> >> does not fill me with warm fuzzies.
> 
> > Aren't we already pretty much doing that for
> > SIGTERM/pg_terminate_backend() and recovery conflict interrupts?
> 
> We do that *only at safe spots*, ie CHECK_FOR_INTERRUPTS. Or at least
> I sure hope so. Otherwise interrupting, eg, malloc will lead to much
> unhappiness.

I was specifically thinking about us immediately reacting to those while
we're reading from the client. We're indeed doing that directly:

#1  0x0076648a in proc_exit (code=1) at 
/home/andres/src/postgresql/src/backend/storage/ipc/ipc.c:143
#2  0x008bcbf7 in errfinish (dummy=0) at 
/home/andres/src/postgresql/src/backend/utils/error/elog.c:555
#3  0x007909f7 in ProcessInterrupts () at 
/home/andres/src/postgresql/src/backend/tcop/postgres.c:2869
#4  0x00790469 in die (postgres_signal_arg=15) at 
/home/andres/src/postgresql/src/backend/tcop/postgres.c:2604
#5  
#6  0x7fb7c8ca0ebb in __libc_recv (fd=10, buf=0xd5f240 , 
n=8192, flags=-1) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:29
#7  0x0066a07c in secure_read (port=0x1a29d30, ptr=0xd5f240 
, len=8192)
at /home/andres/src/postgresql/src/backend/libpq/be-secure.c:317
#8  0x006770b5 in pq_recvbuf () at 
/home/andres/src/postgresql/src/backend/libpq/pqcomm.c:854
#9  0x0067714f in pq_getbyte () at 
/home/andres/src/postgresql/src/backend/libpq/pqcomm.c:895
#10 0x0078d26b in SocketBackend (inBuf=0x7fffbc02bc30) at 
/home/andres/src/postgresql/src/backend/tcop/postgres.c:335
#11 0x0078d659 in ReadCommand (inBuf=0x7fffbc02bc30) at 
/home/andres/src/postgresql/src/backend/tcop/postgres.c:483

Note that we're *inside* recv() here. Both paths to recv, without ssl
and with, have called prepare_for_client_read() which sets
ImmediateInterruptOK = true. Right now I fail to see why it's safe to do
so, at least when inside openssl.

> > BUT: why is what ProcessInterrupts() is doing safe? Shouldn't it always
> > at least set whereToSendOutput = DestNone when FATALing while reading
> > (potentially via openssl)?
> 
> Uh, no, that would pretty much destroy the point of trying to report
> an error message at all.

I only mean that we should do so in scenarios where we're currently
reading from the client. For die(), while we're reading from the client,
we're sending a message the client doesn't expect - and thus
unsurprisingly doesn't report. The client will just report 'server
closed the connection unexpectedly'.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-29 Thread Tom Lane
Andres Freund  writes:
> Note that we're *inside* recv() here.

Well, actually, the recv() has probably failed with an EINTR at this
point, or else it will when/if control returns.

>> Uh, no, that would pretty much destroy the point of trying to report
>> an error message at all.

> I only mean that we should do so in scenarios where we're currently
> reading from the client. For die(), while we're reading from the client,
> we're sending a message the client doesn't expect - and thus
> unsurprisingly doesn't report.

This is nonsense.  The client will see the error as a response to its
query.  Of course, if it gets an EPIPE it might complain about that first
-- but that would only be likely to happen with a multi-packet query
string, at least over a TCP connection.

Experimenting with this on a RHEL6 box, I do see the "FATAL:  terminating
connection due to administrator command" message if I SIGTERM a backend
while idle and it's using a TCP connection; psql sees that as a response
next time it issues a command.  I do get the EPIPE behavior over a
Unix-socket connection, but I wonder if we shouldn't try to fix that.
It would make sense to see if there's data available before complaining
about the EPIPE.

Don't currently have an SSL configuration to experiment with.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deferring some AtStart* allocations?

2014-06-29 Thread Tom Lane
Andres Freund  writes:
> Why aren't we delaying allocations in e.g. AtStart_Inval(),
> AfterTriggerBeginXact() to when the data structures are acutally used?

Aren't we?  Neither of those would be doing much work certainly.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deferring some AtStart* allocations?

2014-06-29 Thread Andres Freund
On 2014-06-29 19:52:23 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > Why aren't we delaying allocations in e.g. AtStart_Inval(),
> > AfterTriggerBeginXact() to when the data structures are acutally used?
> 
> Aren't we?  Neither of those would be doing much work certainly.

They are perhaps not doing much in absolute terms, but it's a fair share
of the processing overhead for simple statements. AfterTriggerBeginXact()
is called unconditionally from StartTransaction() and does three
MemoryContextAlloc()s. AtStart_Inval() one.
I think they should just be initialized whenever the memory is used?
Doesn't look too complicated to me.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deferring some AtStart* allocations?

2014-06-29 Thread Tom Lane
Andres Freund  writes:
> On 2014-06-29 19:52:23 -0400, Tom Lane wrote:
>> Andres Freund  writes:
>>> Why aren't we delaying allocations in e.g. AtStart_Inval(),
>>> AfterTriggerBeginXact() to when the data structures are acutally used?
>> 
>> Aren't we?  Neither of those would be doing much work certainly.

> They are perhaps not doing much in absolute terms, but it's a fair share
> of the processing overhead for simple statements. AfterTriggerBeginXact()
> is called unconditionally from StartTransaction() and does three
> MemoryContextAlloc()s. AtStart_Inval() one.
> I think they should just be initialized whenever the memory is used?
> Doesn't look too complicated to me.

Meh.  Even "SELECT 1" is going to be doing *far* more pallocs than that to
get through raw parsing, parse analysis, planning, and execution startup.
If you can find a few hundred pallocs we can avoid in trivial queries,
it would get interesting; but I'll be astonished if saving 4 is measurable.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-29 Thread Andres Freund
On 2014-06-29 19:51:05 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > Note that we're *inside* recv() here.
> 
> Well, actually, the recv() has probably failed with an EINTR at this
> point, or else it will when/if control returns.

Well, the point is that we'll be reentering ssl when sending the error
message, without having left it properly. I.e. we're already hitting the
problem you've described.

Sure, if we're not FATALing, it'll return EINTR after that. But that's
not really the point here.

I wonder if we should instead *not* set ImmediateInterruptOK = true and
do a CHECK_FOR_INTERRUPT in secure_read, after returning from
openssl. When the recv in my_sock_read() sets BIO_set_retry_read()
because the signal handler caused an EINTR to be returned openssl will
return control to the caller. Which then can do the
CHECK_FOR_INTERRUPT(), jumping out without having to deal with openssl.

Probably has some problems with annoying platforms like windows though
:(. Not sure how the signal emulation plays with recv() being
interrupted there.

> >> Uh, no, that would pretty much destroy the point of trying to report
> >> an error message at all.
> 
> > I only mean that we should do so in scenarios where we're currently
> > reading from the client. For die(), while we're reading from the client,
> > we're sending a message the client doesn't expect - and thus
> > unsurprisingly doesn't report.
> 
> This is nonsense.  The client will see the error as a response to its
> query.

Man. Don't be so quick to judge stuff you can't immediately follow or
find wrongly stated as 'nonsense'.

We're sending messages back to the client while the client isn't
expecting any from the server. E.g. evidenced by the fact that libpq's
pqParseInput3() doesn't treat error messages during that phase as
errors... It instead emits them via the notice hooks, expecting them to
be NOTICEs...
This e.g. means that there's no error message stored in
conn->errorMessage.

That happens to be somewhat ok in the case of FATALs because the
connection is killed afterwards so any confusion won't be long lived,
but you can't tell me that you'd e.g. find it acceptable to send an
ERROR there.

>  Of course, if it gets an EPIPE it might complain about that first
> -- but that would only be likely to happen with a multi-packet query
> string, at least over a TCP connection.
> 
> Experimenting with this on a RHEL6 box, I do see the "FATAL:  terminating
> connection due to administrator command" message if I SIGTERM a backend
> while idle and it's using a TCP connection; psql sees that as a response
> next time it issues a command.  I do get the EPIPE behavior over a
> Unix-socket connection, but I wonder if we shouldn't try to fix that.
> It would make sense to see if there's data available before complaining
> about the EPIPE.

Even over unix sockets the data seems to be read - courtesy of
pqHandleSendFailure():
sendto(3, "Q\0\0\0\16SELECT 1;\0", 15, MSG_NOSIGNAL, NULL, 0) = -1 EPIPE 
(Broken pipe)
recvfrom(3, "E\0\0\0mSFATAL\0C57P01\0Mterminating "..., 16384, 0, NULL, NULL) = 
110

The reason we don't print anything is that pqDropConnection(), which is
called by the second pqReadData() invocation in the loop, resets the
data positions:
  conn->inStart = conn->inCursor = conn->inEnd = 0;

Moving the parseInput(conn) into the loop seems to "fix" it.


Haven't analyzed why, but if FATALs arrive during a query they're
printed twice. I've seen that before...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deferring some AtStart* allocations?

2014-06-29 Thread Andres Freund
On 2014-06-29 21:12:49 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2014-06-29 19:52:23 -0400, Tom Lane wrote:
> >> Andres Freund  writes:
> >>> Why aren't we delaying allocations in e.g. AtStart_Inval(),
> >>> AfterTriggerBeginXact() to when the data structures are acutally used?
> >> 
> >> Aren't we?  Neither of those would be doing much work certainly.
> 
> > They are perhaps not doing much in absolute terms, but it's a fair share
> > of the processing overhead for simple statements. AfterTriggerBeginXact()
> > is called unconditionally from StartTransaction() and does three
> > MemoryContextAlloc()s. AtStart_Inval() one.
> > I think they should just be initialized whenever the memory is used?
> > Doesn't look too complicated to me.
> 
> Meh.  Even "SELECT 1" is going to be doing *far* more pallocs than that to
> get through raw parsing, parse analysis, planning, and execution
> startup.

The quick test I ran used prepared statements - there the number of
memory allocations is *much* lower...

> If you can find a few hundred pallocs we can avoid in trivial queries,
> it would get interesting; but I'll be astonished if saving 4 is measurable.

I only noticed it because it shows up in profiles. I doubt it'll even
remotely be noticeable without using prepared statements, but a lot of
people do use those.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cluster name in ps output

2014-06-29 Thread Fujii Masao
On Sun, Jun 29, 2014 at 9:25 PM, Andres Freund  wrote:
> On 2014-06-29 11:11:14 +0100, Thomas Munro wrote:
>> On 29 June 2014 10:55, Andres Freund  wrote:
>> > So, I'd looked at it with an eye towards committing it and found some
>> > more things. I've now
>> > * added the restriction that the cluster name can only be ASCII. It's
>> >   shown from several clusters with differing encodings, and it's shown
>> >   in process titles, so that seems wise.
>> > * moved everything to the LOGGING_WHAT category
>> > * added minimal documention to monitoring.sgml
>> > * expanded the config.sgml entry to mention the restriction on the name.
>> > * Changed the GUCs short description
>>
>> Thank you.
>>
>> > I also think, but haven't done so, we should add a double colon after
>> > the cluster name, so it's not:
>> >
>> > postgres: server1 stats collector process
>> > but
>> > postgres: server1: stats collector process
>>
>> +1
>
> Committed with the above changes. Thanks for the contribution!

+build). Only printable ASCII characters may be used in the
+application_name value. Other characters will be

application_name should be cluster_name?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_resetxlog to clear backup start/end locations.

2014-06-29 Thread Kyotaro HORIGUCHI
Mmm. This patch is found that useless, from the first.

> Thanks for the patch! But when I read the source code of pg_resetxlog,
> I found that it has already reset the backup locations. Please see
> RewriteControlFile() which does that. So I wonder if we need nothing
> at least in HEAD for the purpose which you'd like to achieve. Thought?

Thank you for noticing of that, I checked by myself and found
that what this patch intended is already done in all
origin/REL9_x_STABLE. What is more, that code has not changed for
over hundreds of commits on each branch, that is, maybe from the
first. I don't know how I caught in such a stupid
misunderstanding, but all these patches are totally useless.

Sorry for taking your time for such a useless thing and thank you
for your kindness.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_resetxlog to clear backup start/end locations.

2014-06-29 Thread Fujii Masao
On Mon, Jun 30, 2014 at 12:49 PM, Kyotaro HORIGUCHI
 wrote:
> Mmm. This patch is found that useless, from the first.
>
>> Thanks for the patch! But when I read the source code of pg_resetxlog,
>> I found that it has already reset the backup locations. Please see
>> RewriteControlFile() which does that. So I wonder if we need nothing
>> at least in HEAD for the purpose which you'd like to achieve. Thought?
>
> Thank you for noticing of that, I checked by myself and found
> that what this patch intended is already done in all
> origin/REL9_x_STABLE. What is more, that code has not changed for
> over hundreds of commits on each branch, that is, maybe from the
> first. I don't know how I caught in such a stupid
> misunderstanding, but all these patches are totally useless.

So we should mark this patch as "Rejected Patches"?

> Sorry for taking your time for such a useless thing and thank you
> for your kindness.

No problem!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5 CF1

2014-06-29 Thread Abhijit Menon-Sen
Hi.

We're two weeks into the CommitFest now, and things are moving along
quite nicely.

Fourteen patches have been committed, and twelve more are marked ready
for committer. But most importantly, many patches have been reviewed,
and only nine patches still lack a reviewer (and most of those have
seen some discussion already).

I have sent a number of reminders and status inquiries by private mail,
and will continue to do so for "waiting on author" and "needs review"
patches. (Living as I do at the edge of a forest, I have also curated
a fine selection of sharp sticks to poke people with if they don't
respond during the week.)

Here's a list of the patches that have no listed reviewers:

Buffer capture facility: check WAL replay consistency

http://archives.postgresql.org/message-id/CAB7nPqTFK4=fcrto=lji4vlbx9ah+fv1z1ke6r98ppxuuxw...@mail.gmail.com

Generic atomics implementation

http://archives.postgresql.org/message-id/20140625171434.gg24...@awork2.anarazel.de

KNN-GiST with recheck

http://archives.postgresql.org/message-id/CAPpHfdu_qBLNRnv-r=_tofZYYa6-r=z5_mgf4_phaokwcyx...@mail.gmail.com

Sort support for text with strxfrm() poor man's keys

http://archives.postgresql.org/message-id/CAM3SWZQKwELa58h1R9sxwAOCJpgs=xjbiu7apdyq9puusfq...@mail.gmail.com

Use unique index for longer pathkeys

http://archives.postgresql.org/message-id/20140613.164133.160845727.horiguchi.kyot...@lab.ntt.co.jp

CSN snapshots
http://archives.postgresql.org/message-id/539ad153.9000...@vmware.com

Postgres Hibernator contrib module

http://archives.postgresql.org/message-id/cabwtf4xgbpgwmkkt6ezsfvnpf11kmag5m969twkgodhnpj-...@mail.gmail.com

possibility to set "double" style for unicode lines

http://archives.postgresql.org/message-id/cafj8prczno6kp3vyuzajkakpcfb_abrhsralcgjujmpxno9...@mail.gmail.com

Refactor SSL code to support other SSL implementations
http://archives.postgresql.org/message-id/53986cf4.1030...@vmware.com

Does anyone want to pick up one of these?

Thanks to everyone for their participation. It's especially nice to see
several reviews posted, and the authors responding quickly with updated
versions of their patch to address the review comments.

As always, please feel free to contact me with questions.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [WIP] Better partial index-only scans

2014-06-29 Thread Abhijit Menon-Sen
At 2014-06-29 21:55:24 +0300, e...@hasegeli.com wrote:
>
> I will update the patch as returned with feedback

I don't think it's fair to mark this as returned with feedback without a
more detailed review (I think of returned with feedback as providing a
concrete direction to follow). I've set it back to "needs review".

Does anyone else want to look at this patch?

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_resetxlog to clear backup start/end locations.

2014-06-29 Thread Kyotaro HORIGUCHI
Hello,

> > misunderstanding, but all these patches are totally useless.
> 
> So we should mark this patch as "Rejected Patches"?

I think so.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_resetxlog to clear backup start/end locations.

2014-06-29 Thread Abhijit Menon-Sen
At 2014-06-30 13:55:59 +0900, horiguchi.kyot...@lab.ntt.co.jp wrote:
>
> > So we should mark this patch as "Rejected Patches"?
> 
> I think so.

Done.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_xlogdump --stats

2014-06-29 Thread Dilip kumar
On 13 June 2014 13:01, Abhijit Menon-Sen Wrote

> 
> I've changed this to use %zu at Álvaro's suggestion. I'll post an
> updated patch after I've finished some (unrelated) refactoring.

I have started reviewing the patch..

1. Patch applied to git head cleanly.
2. Compiled in Linux  -- Some warnings same as mentioned by furuyao 

3. Some compilation error in windows
.\contrib\pg_xlogdump\pg_xlogdump.c(1002) : error C2065: 'optional_argument' : 
undeclared identifier
.\contrib\pg_xlogdump\pg_xlogdump.c(1002) : error C2099: initializer is not a 
constant

optional_argument should be added to getopt_long.h file for windows.

Please fix these issues and send the updated patch..

I will continue reviewing the patch..


Regards,
Dilip




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] better atomics - v0.5

2014-06-29 Thread Amit Kapila
On Sun, Jun 29, 2014 at 2:54 PM, Andres Freund 
wrote:
> On 2014-06-29 11:53:28 +0530, Amit Kapila wrote:
> > On Sat, Jun 28, 2014 at 1:48 PM, Andres Freund 
> > I think it is better to have some discussion about it. Apart from this,
> > today I noticed atomic read/write doesn't use any barrier semantics
> > and just rely on volatile.
>
> Yes, intentionally so. It's often important to get/set the current value
> without the cost of emitting a memory barrier. It just has to be a
> recent value  and it actually has to come from memory.

I agree with you that we don't want to incur the cost of memory barrier
for get/set, however it should not be at the cost of correctness.

> And that's actually
> enforced by the current definition - I think?

Yeah, this is the only point which I am trying to ensure, thats why I sent
you few links in previous mail where I had got some suspicion that
just doing get/set with volatile might lead to correctness issue in some
cases.

Some another Note, I found today while reading more on it which suggests
that my previous observation is right:

"Within a thread of execution, accesses (reads and writes) to all volatile
objects are guaranteed to not be reordered relative to each other, but this
order is not guaranteed to be observed by another thread, since volatile
access does not establish inter-thread synchronization."
http://en.cppreference.com/w/c/atomic/memory_order

> > > b) It's only supported from vista onwards. Afaik we still support XP.
> > #ifndef pg_memory_barrier_impl
> > #define pg_memory_barrier_impl() MemoryBarrier()
> > #endif
> >
> > The MemoryBarrier() macro used also supports only on vista onwards,
> > so I think for reasons similar to using MemoryBarrier() can apply for
> > this as well.
>
> I think that's odd because barrier.h has been using MemoryBarrier()
> without a version test for a long time now. I guess everyone uses a new
> enough visual studio.

Yeap or might be the people who even are not using new enough version
doesn't have a use case that can hit a problem due to MemoryBarrier()

> Those intrinsics aren't actually OS but just
> compiler dependent.
>
> Otherwise we could just define it as:
>
> FORCEINLINE
> VOID
> MemoryBarrier (
> VOID
> )
> {
> LONG Barrier;
> __asm {
> xchg Barrier, eax
> }
> }

Yes, I think it will be better, how about defining this way just for
the cases where MemoryBarrier macro is not supported.

> > > c) It doesn't make any difference on x86 ;)
> >
> > What about processors like Itanium which care about acquire/release
> > memory barrier semantics?
>
> Well, it still will be correct? I don't think it makes much sense to
> focus overly much on itanium here with the price of making anything more
> complicated for others.

Completely agreed that we should not add or change anything which
makes patch complicated or can effect the performance, however
the reason for focusing is just to ensure that we should not break
any existing use case for Itanium w.r.t performance or otherwise.

Another way is that we can give ignore or just give lower priority for
verfiying if the patch has anything wrong for Itanium processors.

> > If I am reading C11's spec for this API correctly, then it says as
below:
> > "Atomically compares the value pointed to by obj with the value pointed
> > to by expected, and if those are equal, replaces the former with desired
> > (performs read-modify-write operation). Otherwise, loads the actual
value
> > pointed to by obj into *expected (performs load operation)."
> >
> > So it essentialy means that it loads actual value in expected only if
> > operation is not successful.
>
> Yes. But in the case it's successfull it will already contain the right
> value.

The point was just that we are not completely following C11 atomic
specification, so it might be okay to tweak a bit and make things
simpler and save LOAD instruction.  However as there is no correctness
issue here and you think that current implementation is good and
can serve more cases, we can keep as it is and move forward.

> > > > 4.
> > ..
> > > > There is a Caution notice in microsoft site indicating
> > > > _ReadWriteBarrier/MemoryBarrier are deprected.
> > >
> > > It seemed to be the most widely available API, and it's what barrier.h
> > > already used.
> > > Do you have a different suggestion?
> >
> > I am trying to think based on suggestion given by Microsoft, but
> > not completely clear how to accomplish the same at this moment.
>
> Well, they refer to C11 stuff. But I think it'll be a while before it
> makes sense to use a fallback based on that.

In this case, I have a question for you.

Un-patched usage  in barrier.h is as follows:
..
#elif defined(__ia64__) || defined(__ia64)
#define pg_compiler_barrier() _Asm_sched_fence()
#define pg_memory_barrier() _Asm_mf()

#elif defined(WIN32_ONLY_COMPILER)
/* Should work on both MSVC and Borland. */
#include 
#pragma intrinsic(_ReadWriteBarrier)
#define pg_compiler_barr

Re: [HACKERS] psql: show only failed queries

2014-06-29 Thread Rajeev rastogi
On 26 June 2014 11:53, Samrat Revagade Wrote:

>> I am sending updated patch - buggy statement is printed via more logical 
>> psql_error function instead printf

>Thank you for updating patch, I really appreciate your efforts.

>Now, everything is good from my side.
>* it apply cleanly to the current git master
>* includes necessary docs
>* I think It is very good  and necessary feature.

>If Kumar Rajeev Rastogi do not have any extra comments, then I think patch is  
>ready for committer.

I have reviewed this patch. Please find my review comments below:

1. Command start-up option (e.g. -a/--echo-all for --ECHO=all), for new 
functionality is not provided.

2. New Command start-up option should be added in "psql --help" as well as 
in documentation.

Also as I understand, this new option is kind of sub-set of existing option 
(ECHO=query), so should not we display
query string in the same format as it was getting printed earlier.
Though I also feel that prefixing query with STATEMENT word will be helpful to 
grep but at the same time I am worried
about inconsistency with existing option.

Thanks and Regards,
Kumar Rajeev Rastogi


Re: [HACKERS] pg_xlogdump --stats

2014-06-29 Thread Abhijit Menon-Sen
At 2014-06-30 05:19:10 +, dilip.ku...@huawei.com wrote:
>
> I have started reviewing the patch..

Thanks.

> 1. Patch applied to git head cleanly.
> 2. Compiled in Linux  -- Some warnings same as mentioned by furuyao 

I've attached an updated version of the patch which should fix the
warnings by using %zu.

> 3. Some compilation error in windows
> .\contrib\pg_xlogdump\pg_xlogdump.c(1002) : error C2065: 'optional_argument' 
> : undeclared identifier
> .\contrib\pg_xlogdump\pg_xlogdump.c(1002) : error C2099: initializer is not a 
> constant
> 
> optional_argument should be added to getopt_long.h file for windows.

Hmm. I have no idea what to do about this. I did notice when I wrote the
code that nothing else used optional_argument, but I didn't realise that
it wouldn't work on Windows.

It may be that the best thing to do would be to avoid using
optional_argument altogether, and have separate --stats and
--stats-per-record options. Thoughts?

> Please fix these issues and send the updated patch..

I've also attached a separate patch to factor out the identify_record
function into rm_identify functions in the individual rmgr files, so
that the code can live next to the respective _desc functions.

This allows us to remove a number of #includes from pg_xlogdump, and
makes the code a bit more straightforward to read.

-- Abhijit
diff --git a/contrib/pg_xlogdump/pg_xlogdump.c b/contrib/pg_xlogdump/pg_xlogdump.c
index 824b8c3..1d3e664 100644
--- a/contrib/pg_xlogdump/pg_xlogdump.c
+++ b/contrib/pg_xlogdump/pg_xlogdump.c
@@ -15,12 +15,28 @@
 #include 
 #include 
 
+#include "access/clog.h"
+#include "access/gin_private.h"
+#include "access/gist_private.h"
+#include "access/heapam_xlog.h"
+#include "access/heapam_xlog.h"
+#include "access/multixact.h"
+#include "access/nbtree.h"
+#include "access/spgist_private.h"
+#include "access/xact.h"
 #include "access/xlog.h"
 #include "access/xlogreader.h"
 #include "access/transam.h"
+#include "catalog/pg_control.h"
+#include "catalog/storage_xlog.h"
+#include "commands/dbcommands.h"
+#include "commands/sequence.h"
+#include "commands/tablespace.h"
 #include "common/fe_memutils.h"
 #include "getopt_long.h"
 #include "rmgrdesc.h"
+#include "storage/standby.h"
+#include "utils/relmapper.h"
 
 
 static const char *progname;
@@ -41,6 +57,8 @@ typedef struct XLogDumpConfig
 	int			stop_after_records;
 	int			already_displayed_records;
 	bool		follow;
+	bool		stats;
+	bool		stats_per_record;
 
 	/* filter options */
 	int			filter_by_rmgr;
@@ -48,6 +66,20 @@ typedef struct XLogDumpConfig
 	bool		filter_by_xid_enabled;
 } XLogDumpConfig;
 
+typedef struct Stats
+{
+	uint64		count;
+	uint64		rec_len;
+	uint64		fpi_len;
+} Stats;
+
+typedef struct XLogDumpStats
+{
+	uint64		count;
+	Stats		rmgr_stats[RM_NEXT_ID];
+	Stats		record_stats[RM_NEXT_ID][16];
+} XLogDumpStats;
+
 static void
 fatal_error(const char *fmt,...)
 __attribute__((format(PG_PRINTF_ATTRIBUTE, 1, 2)));
@@ -322,6 +354,39 @@ XLogDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
 }
 
 /*
+ * Store per-rmgr and per-record statistics for a given record.
+ */
+static void
+XLogDumpCountRecord(XLogDumpConfig *config, XLogDumpStats *stats, XLogRecPtr ReadRecPtr, XLogRecord *record)
+{
+	RmgrId		rmid;
+	uint8  		recid;
+
+	if (config->filter_by_rmgr != -1 &&
+		config->filter_by_rmgr != record->xl_rmid)
+		return;
+
+	if (config->filter_by_xid_enabled &&
+		config->filter_by_xid != record->xl_xid)
+		return;
+
+	rmid = record->xl_rmid;
+	recid = (record->xl_info & ~XLR_INFO_MASK) >> 4;
+
+	stats->count++;
+
+	stats->rmgr_stats[rmid].count++;
+	stats->rmgr_stats[rmid].rec_len += record->xl_len;
+	stats->rmgr_stats[rmid].fpi_len +=
+		(record->xl_tot_len - record->xl_len - SizeOfXLogRecord);
+
+	stats->record_stats[rmid][recid].count++;
+	stats->record_stats[rmid][recid].rec_len += record->xl_len;
+	stats->record_stats[rmid][recid].fpi_len +=
+		(record->xl_tot_len - record->xl_len - SizeOfXLogRecord);
+}
+
+/*
  * Print a record to stdout
  */
 static void
@@ -380,6 +445,476 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogRecPtr ReadRecPtr, XLogRecord
 	}
 }
 
+/*
+ * Given an xl_rmid and the high bits of xl_info, returns a string
+ * describing the record type.
+ */
+static const char *
+identify_record(RmgrId rmid, uint8 info)
+{
+	const RmgrDescData *desc = &RmgrDescTable[rmid];
+	const char *rec;
+
+	rec = psprintf("0x%x", info);
+
+	switch (rmid)
+	{
+		case RM_XLOG_ID:
+			switch (info)
+			{
+case XLOG_CHECKPOINT_SHUTDOWN:
+	rec = "CHECKPOINT_SHUTDOWN";
+	break;
+case XLOG_CHECKPOINT_ONLINE:
+	rec = "CHECKPOINT_ONLINE";
+	break;
+case XLOG_NOOP:
+	rec = "NOOP";
+	break;
+case XLOG_NEXTOID:
+	rec = "NEXTOID";
+	break;
+case XLOG_SWITCH:
+	rec = "SWITCH";
+	break;
+case XLOG_BACKUP_END:
+	rec = "BACKUP_END";
+	break;
+case XLOG_PARAMETER_CHANGE:
+	rec = "PARAMETER_CHANGE";
+	break;
+case XLOG_RESTOR

Re: [HACKERS] pg_xlogdump --stats

2014-06-29 Thread Abhijit Menon-Sen
At 2014-06-30 12:05:06 +0530, a...@2ndquadrant.com wrote:
>
> It may be that the best thing to do would be to avoid using
> optional_argument altogether, and have separate --stats and
> --stats-per-record options. Thoughts?

That's what I've done in the attached patch, except I've called the new
option --record-stats. Both options now use no_argument. This should
apply on top of the diff I posted a little while ago.

-- Abhijit
commit cc9422aa71ef0b507c634282272be3fd15c39c0b
Author: Abhijit Menon-Sen 
Date:   Mon Jun 30 12:15:54 2014 +0530

Introduce --record-stats to avoid use of optional_argument

diff --git a/contrib/pg_xlogdump/pg_xlogdump.c b/contrib/pg_xlogdump/pg_xlogdump.c
index 47838d4..1853b47 100644
--- a/contrib/pg_xlogdump/pg_xlogdump.c
+++ b/contrib/pg_xlogdump/pg_xlogdump.c
@@ -609,7 +609,8 @@ main(int argc, char **argv)
 		{"timeline", required_argument, NULL, 't'},
 		{"xid", required_argument, NULL, 'x'},
 		{"version", no_argument, NULL, 'V'},
-		{"stats", optional_argument, NULL, 'z'},
+		{"stats", no_argument, NULL, 'z'},
+		{"record-stats", no_argument, NULL, 'Z'},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -643,7 +644,7 @@ main(int argc, char **argv)
 		goto bad_argument;
 	}
 
-	while ((option = getopt_long(argc, argv, "be:?fn:p:r:s:t:Vx:z::",
+	while ((option = getopt_long(argc, argv, "be:?fn:p:r:s:t:Vx:zZ",
  long_options, &optindex)) != -1)
 	{
 		switch (option)
@@ -738,18 +739,10 @@ main(int argc, char **argv)
 break;
 			case 'z':
 config.stats = true;
-config.stats_per_record = false;
-if (optarg)
-{
-	if (strcmp(optarg, "record") == 0)
-		config.stats_per_record = true;
-	else if (strcmp(optarg, "rmgr") != 0)
-	{
-		fprintf(stderr, "%s: unrecognised argument to --stats: %s\n",
-progname, optarg);
-		goto bad_argument;
-	}
-}
+break;
+			case 'Z':
+config.stats = true;
+config.stats_per_record = true;
 break;
 			default:
 goto bad_argument;
diff --git a/doc/src/sgml/pg_xlogdump.sgml b/doc/src/sgml/pg_xlogdump.sgml
index d9f4a6a..bfd9eb9 100644
--- a/doc/src/sgml/pg_xlogdump.sgml
+++ b/doc/src/sgml/pg_xlogdump.sgml
@@ -181,12 +181,22 @@ PostgreSQL documentation
 
  
   -z
-  --stats[=record]
+  --stats
   

 Display summary statistics (number and size of records and
-full-page images) instead of individual records. Optionally
-generate statistics per-record instead of per-rmgr.
+full-page images per rmgr) instead of individual records.
+   
+  
+ 
+
+ 
+  -Z
+  --record-stats
+  
+   
+Display summary statistics (number and size of records and
+full-page images) instead of individual records.

   
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql: show only failed queries

2014-06-29 Thread Pavel Stehule
2014-06-30 8:17 GMT+02:00 Rajeev rastogi :

>  On 26 June 2014 11:53, Samrat Revagade Wrote:
>
>
>
> >> I am sending updated patch - buggy statement is printed via more
> logical psql_error function instead printf
>
>
>
> >Thank you for updating patch, I really appreciate your efforts.
>
>
>
> >Now, everything is good from my side.
>
> >* it apply cleanly to the current git master
>
> >* includes necessary docs
>
> >* I think It is very good  and necessary feature.
>
>
>
> >If Kumar Rajeev Rastogi do not have any extra comments, then I
> think patch is  ready for committer.
>
>
>
> I have reviewed this patch. Please find my review comments below:
>
> 1. Command start-up option (e.g. -a/--echo-all for --ECHO=all), for
> new functionality is not provided.
>
all not options entered via psql variables has psql option and psql
comment. I'll plan add new decription to --help-variables list.

If it is necessary I can add long option --echo-errors, I didn't a good
char for short option. Any idea?




>  2. New Command start-up option should be added in "psql --help" as
> well as in documentation.
>
depends on previous,



>
>
> Also as I understand, this new option is kind of sub-set of existing
> option (ECHO=query), so should not we display
>
> query string in the same format as it was getting printed earlier.
>
> Though I also feel that prefixing query with STATEMENT word will be
> helpful to grep but at the same time I am worried
>
> about inconsistency with existing option.
>

This is question. And I am not strong in feeling what should be preferred.
But still I am inclined to prefer a variant with STATEMENT prefix. Mode
with -a is used with different purpose than mode "show errors only" - and
output with prefix is much more consistent with log entry - and displaying
error. So I agree, so there is potential inconsistency (but nowhere is
output defined), but this output is more practical, when you are
concentrated to error's processing.

Regards

Pavel


>
>
> *Thanks and Regards,*
>
> *Kumar Rajeev Rastogi *
>