Re: [HACKERS] ECPG FETCH readahead, was: Re: ECPG fixes

2013-12-23 Thread Boszormenyi Zoltan

On 2013-12-21 14:56, Peter Eisentraut wrote:

This patch didn't make it out of the 2013-11 commit fest.  You should
move it to the next commit fest (probably with an updated patch)
before January 15th.


Done.

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/





Re: [HACKERS] nested hstore patch

2013-12-23 Thread Robert Haas
On Fri, Dec 20, 2013 at 6:16 PM, David E. Wheeler da...@justatheory.com wrote:
 * New operators:
   + `hstore -> int`:     Get string value at array index (starting at 0)
   + `hstore ^> text`:    Get numeric value for key
   + `hstore ^> int`:     Get numeric value at array index
   + `hstore ?> text`:    Get boolean value for key
   + `hstore ?> int`:     Get boolean value at array index
   + `hstore #> text[]`:  Get string value for key path
   + `hstore #^> text[]`: Get numeric value for key path
   + `hstore #?> text[]`: Get boolean value for key path
   + `hstore %> text`:    Get hstore value for key
   + `hstore %> int`:     Get hstore value at array index
   + `hstore #%> text[]`: Get hstore value for key path
   + `hstore ? int`:      Does hstore contain array index
   + `hstore #? text[]`:  Does hstore contain key path
   + `hstore - int`:      Delete index from left operand
   + `hstore #- text[]`:  Delete key path from left operand

Although in some ways there's a certain elegance to this, it also
sorta looks like punctuation soup.  I can't help wondering whether
we'd be better off sticking to function names.
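
For concreteness, a sketch of how the proposed operators would read in a
query (operator spellings taken from the list above; the operators and the
typed-value literal syntax are from the uncommitted patch, so treat this
purely as an illustration):

    SELECT h ^> 'price'    AS price,     -- numeric value for key
           h ?> 'in_stock' AS in_stock,  -- boolean value for key
           h #- ARRAY['internal']        -- delete key path
    FROM (VALUES ('price=>9.99, in_stock=>true, internal=>x'::hstore)) AS t(h);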

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] nested hstore patch

2013-12-23 Thread Hannu Krosing
On 12/23/2013 12:28 PM, Robert Haas wrote:
 On Fri, Dec 20, 2013 at 6:16 PM, David E. Wheeler da...@justatheory.com 
 wrote:
 * New operators:
   + `hstore -> int`:     Get string value at array index (starting at 0)
   + `hstore ^> text`:    Get numeric value for key
   + `hstore ^> int`:     Get numeric value at array index
   + `hstore ?> text`:    Get boolean value for key
   + `hstore ?> int`:     Get boolean value at array index
   + `hstore #> text[]`:  Get string value for key path
   + `hstore #^> text[]`: Get numeric value for key path
   + `hstore #?> text[]`: Get boolean value for key path
   + `hstore %> text`:    Get hstore value for key
   + `hstore %> int`:     Get hstore value at array index
   + `hstore #%> text[]`: Get hstore value for key path
   + `hstore ? int`:      Does hstore contain array index
   + `hstore #? text[]`:  Does hstore contain key path
   + `hstore - int`:      Delete index from left operand
   + `hstore #- text[]`:  Delete key path from left operand
 Although in some ways there's a certain elegance to this, it also
 sorta looks like punctuation soup.  I can't help wondering whether
 we'd be better off sticking to function names.

Has anybody looked into how hard it would be to add method notation
to PostgreSQL, so that instead of calling

getString(hstorevalue, n)

we could use

hstorevalue.getString(n)
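
For what it's worth, PostgreSQL already treats functional and attribute
notation as interchangeable when the argument is of a composite (row) type,
which is presumably the behavior to generalize (sketch; names invented):

    CREATE TABLE inventory (name text, price numeric);
    CREATE FUNCTION taxed_price(inventory) RETURNS numeric
        LANGUAGE sql AS $$ SELECT $1.price * 1.2 $$;
    -- these two are equivalent today:
    SELECT taxed_price(i) FROM inventory i;
    SELECT i.taxed_price  FROM inventory i;

The open question is extending that equivalence to non-composite types
such as hstore.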

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] CLUSTER FREEZE

2013-12-23 Thread Andres Freund
On 2013-12-22 20:45:02 -0500, Robert Haas wrote:
 I suspect we ought to extend this to rewriting variants of ALTER TABLE
 as well, but a little thought is needed there.  ATRewriteTables()
 appears to just call heap_insert() for each updated row, which if I'm
 not mistaken is an MVCC violation - offhand, it seems like a
 transaction with an older MVCC snapshot could access the table for
 the first time after the rewriter commits, and fail to see rows which
 would have appeared to be there before the rewrite. (I haven't
 actually tested this, so watch me be wrong.)  If we're OK with an MVCC
 violation here, we could just pass HEAP_INSERT_FROZEN and have a
 slightly different flavor of violation; if not, this needs some kind
 of more extensive surgery quite apart from what we do about freezing.

Yes, rewriting ALTER TABLEs certainly aren't MVCC safe. I thought that
was documented, but apparently not.
I am not sure it can be fixed easily using the tricks CLUSTER plays,
there might be nasty edge-cases because of the changes in the column
definition. Certainly not a trivial project.

I think we should leave ALTER TABLE as a completely separate project and
just improve CLUSTER for now. The practical impact of rewriting ALTER
TABLEs not freezing is far smaller, because they are very seldom
performed in bigger databases.
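
To make the hazard concrete, the scenario Robert describes would look
roughly like this (hypothetical table t; untested sketch):

    -- Session 1                       -- Session 2
    BEGIN ISOLATION LEVEL
        REPEATABLE READ;
    SELECT 1;  -- snapshot taken
                                       ALTER TABLE t
                                           ALTER COLUMN i TYPE bigint;
                                       -- rewrites t via heap_insert()
    SELECT * FROM t;
    -- first access to t after the rewrite commits; the re-inserted
    -- rows may be invisible to the old snapshot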

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] preserving forensic information when we freeze

2013-12-23 Thread Robert Haas
On Fri, Dec 20, 2013 at 9:56 PM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Dec 20, 2013 at 7:51 AM, Andres Freund and...@2ndquadrant.com wrote:
 I wondered that, too, but it's not well-defined for all tuples.  What
 happens if you pass in a constructed tuple rather than an on-disk tuple?

 Those should be discernible I think, t_self/t_tableOid won't be set for
 generated tuples.

 I went looking for existing precedent for code that does things like
 this and found record_out, which does this:

 HeapTupleHeader rec = PG_GETARG_HEAPTUPLEHEADER(0);
 ...
 /* Extract type info from the tuple itself */
 tupType = HeapTupleHeaderGetTypeId(rec);
 tupTypmod = HeapTupleHeaderGetTypMod(rec);
 tupdesc = lookup_rowtype_tupdesc(tupType, tupTypmod);
 ncolumns = tupdesc->natts;

 /* Build a temporary HeapTuple control structure */
 tuple.t_len = HeapTupleHeaderGetDatumLength(rec);
 ItemPointerSetInvalid(&(tuple.t_self));
 tuple.t_tableOid = InvalidOid;
 tuple.t_data = rec;

 This appears to be a typical pattern, although interestingly I noticed
 that row_to_json() doesn't bother setting t_tableOid or t_self, which
 I think it's supposed to do.  The problem I see here is that this code
 seems to imply that a function passed a record doesn't actually have
 enough information to know what sort of a thing it's getting.  The use
 of HeapTupleHeaderGetTypeId and HeapTupleHeaderGetTypMod implies that
 it's safe to assume that t_choice will contain DatumTupleFields rather
 than HeapTupleFields, which doesn't seem to bode well for your
 approach.

 Am I missing a trick?

If not, here's a patch done the way I originally proposed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pg-tuple-header.patch
Description: Binary data



Re: [HACKERS] preserving forensic information when we freeze

2013-12-23 Thread Andres Freund
On 2013-12-20 13:22:07 +0100, Andres Freund wrote:
 On 2013-12-20 07:12:01 -0500, Robert Haas wrote:
  I think the root of the problem is that nobody's very eager to add
  more hidden system catalog columns because each one bloats
  pg_attribute significantly.
 
 I think that part is actually solvable - there's no real need for them
 to be real columns, present in pg_attribute, things could easily enough be
 set up during parse analysis.

I've spent some time yesterday hacking up a prototype for this. The
amount of effort seems reasonable, although the attached patch certainly
doesn't do all the necessary things. The regression tests pass.

In the prototype only oid keeps its pg_attribute entry, everything else
uses the static entries via
SystemAttributeDefinition()/SystemAttributeByName(). We might want to
remove oids as well, but it seems reasonable to continue allowing
defining column permissions on it. And it's the attribute most likely to
be ingrained deeply in user applications.

 The bigger problem I see is that adding
 more useful columns will cause name clashes, which will probably
 prohibit improvements in that direction.

I wonder if we could solve that by naming new columns like
system:attribute or similar. Not pretty, but maybe better than the
alternatives.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
From 3a5692575c0077601cfba938239f4dd6f74de3c5 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Mon, 23 Dec 2013 13:44:36 +0100
Subject: [PATCH] prototype-patch: remove system columns from pg_attribute

---
 src/backend/access/common/heaptuple.c |  14 
 src/backend/catalog/aclchk.c  |  27 +++---
 src/backend/catalog/heap.c|  18 ++--
 src/backend/commands/tablecmds.c  | 154 +++---
 src/backend/parser/parse_relation.c   |  65 ++
 src/backend/utils/cache/lsyscache.c   |  11 ++-
 src/include/access/sysattr.h  |   3 +-
 7 files changed, 188 insertions(+), 104 deletions(-)

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 347d616..99471ab 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -59,7 +59,9 @@
 
 #include "access/sysattr.h"
 #include "access/tuptoaster.h"
+#include "access/transam.h"
 #include "executor/tuptable.h"
+#include "storage/procarray.h"
 
 
 /* Does att's datatype allow packing into the 1-byte-header varlena format? */
@@ -286,6 +288,7 @@ heap_attisnull(HeapTuple tup, int attnum)
 		case MinCommandIdAttributeNumber:
 		case MaxTransactionIdAttributeNumber:
 		case MaxCommandIdAttributeNumber:
+		case CookedMaxTransactionIdAttributeNumber:
 			/* these are never null */
 			break;
 
@@ -558,6 +561,17 @@ heap_getsysattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 		case TableOidAttributeNumber:
 			result = ObjectIdGetDatum(tup->t_tableOid);
 			break;
+		case CookedMaxTransactionIdAttributeNumber:
+			{
+				TransactionId xid;
+				xid = HeapTupleHeaderGetUpdateXid(tup->t_data);
+				if (TransactionIdIsValid(xid) &&
+					!TransactionIdIsInProgress(xid) &&
+					!TransactionIdDidCommit(xid))
+					xid = InvalidTransactionId;
+				result = TransactionIdGetDatum(xid);
+			}
+			break;
 		default:
 			elog(ERROR, "invalid attnum: %d", attnum);
 			result = 0;			/* keep compiler quiet */
diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 5a46fd9..807e274 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -1532,16 +1532,19 @@ expand_all_col_privileges(Oid table_oid, Form_pg_class classForm,
 		if (classForm->relkind == RELKIND_VIEW && curr_att < 0)
 			continue;
 
-		attTuple = SearchSysCache2(ATTNUM,
-   ObjectIdGetDatum(table_oid),
-   Int16GetDatum(curr_att));
-		if (!HeapTupleIsValid(attTuple))
-			elog(ERROR, "cache lookup failed for attribute %d of relation %u",
-				 curr_att, table_oid);
-
-		isdropped = ((Form_pg_attribute) GETSTRUCT(attTuple))-attisdropped;
-
-		ReleaseSysCache(attTuple);
+		if (curr_att < 0)
+			isdropped = false;
+		else
+		{
+			attTuple = SearchSysCache2(ATTNUM,
+									   ObjectIdGetDatum(table_oid),
+									   Int16GetDatum(curr_att));
+			if (!HeapTupleIsValid(attTuple))
+				elog(ERROR, "cache lookup failed for attribute %d of relation %u",
+					 curr_att, table_oid);
+			isdropped = ((Form_pg_attribute) GETSTRUCT(attTuple))->attisdropped;
+			ReleaseSysCache(attTuple);
+		}
 
 		/* ignore dropped columns */
 		if (isdropped)
@@ -1579,6 +1582,10 @@ ExecGrant_Attribute(InternalGrant *istmt, Oid relOid, const char *relname,
 	Oid		   *oldmembers;
 	Oid		   *newmembers;
 
+	/* FIXME: don't set column permissions on system columns */
+	if (attnum < 0)
+		return;
+
 	attr_tuple = SearchSysCache2(ATTNUM,
  ObjectIdGetDatum(relOid),
  Int16GetDatum(attnum));
diff --git 

Re: [HACKERS] preserving forensic information when we freeze

2013-12-23 Thread Andres Freund
On 2013-12-20 21:56:42 -0500, Robert Haas wrote:
 On Fri, Dec 20, 2013 at 7:51 AM, Andres Freund and...@2ndquadrant.com wrote:
  I wondered that, too, but it's not well-defined for all tuples.  What
  happens if you pass in a constructed tuple rather than an on-disk tuple?
 
  Those should be discernible I think, t_self/t_tableOid won't be set for
  generated tuples.
 
 I went looking for existing precedent for code that does things like
 this and found record_out, which does this:

I think there's plenty precedent for C code, but here the problem seems
to be that, when we pass a record that's originally a plain row, it's
coerced (via IO functions) into a record with typeid/typemod set.

So, for this to work via SQL, we'd need to add a notation that allows
passing table rows to functions without being modified. That might be a
good idea in general, but likely more work than it's worth tackling in
this context.

 This appears to be a typical pattern, although interestingly I noticed
 that row_to_json() doesn't bother setting t_tableOid or t_self, which
 I think it's supposed to do.

Yes, that seems to be a minor bug. Andrew?

 Am I missing a trick?

No, don't think so, I wasn't thinking on the SQL level.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] preserving forensic information when we freeze

2013-12-23 Thread Robert Haas
On Mon, Dec 23, 2013 at 7:45 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-12-20 13:22:07 +0100, Andres Freund wrote:
 On 2013-12-20 07:12:01 -0500, Robert Haas wrote:
  I think the root of the problem is that nobody's very eager to add
  more hidden system catalog columns because each one bloats
  pg_attribute significantly.

 I think that part is actually solvable - there's no real need for them
 to be real columns, present in pg_attribute, things could easily enough be
 set up during parse analysis.

 I've spent some time yesterday hacking up a prototype for this. The
 amount of effort seems reasonable, although the attached patch certainly
 doesn't do all the necessary things. The regression tests pass.

Well, I'm all in favor of de-bloating pg_attribute, but I don't
particularly like cooked_xmax.  Even if it were better-named
(updatexid?), I'm still not sure I'd like it very much.  And I
definitely think that getting rid of the pg_attribute entries ought to
be a separate patch from adding any new system columns we want to
have.

 In the prototype only oid keeps its pg_attribute entry, everything else
 uses the static entries via
 SystemAttributeDefinition()/SystemAttributeByName(). We might want to
 remove oids as well, but it seems reasonable to continue allowing
 defining column permissions on it. And it's the attribute most likely to
 be ingrained deeply in user applications.

Seems fine to me.

 The bigger problem I see is that adding
 more useful columns will cause name clashes, which will probably
 prohibit improvements in that direction.

 I wonder if we could solve that by naming new columns like
 system:attribute or similar. Not pretty, but maybe better than the
 alternatives.

If we're going to add a prefix, I think it should be one that doesn't
require quoting the identifier.  In retrospect pg_xmin or _pg_xmin or
__pg_xmin would have been a better name than xmin.

But TBH, I'm kind of enamored of the function-based approach at the
moment.  It just seems a lot more flexible.  Adding a function is
nearly free and we can add as many as we need doing whatever we want,
and they can even go into contrib modules if we like it better that
way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] preserving forensic information when we freeze

2013-12-23 Thread Andres Freund
On 2013-12-23 10:26:49 -0500, Robert Haas wrote:
 On Mon, Dec 23, 2013 at 7:45 AM, Andres Freund and...@2ndquadrant.com wrote:
  I've spent some time yesterday hacking up a prototype for this. The
  amount of effort seems reasonable, although the attached patch certainly
  doesn't do all the necessary things. The regression tests pass.
 
 Well, I'm all in favor of de-bloating pg_attribute, but I don't
 particularly like cooked_xmax.  Even if it were better-named
 (updatexid?), I'm still not sure I'd like it very much.  And I
 definitely think that getting rid of the pg_attribute entries ought to
 be a separate patch from adding any new system columns we want to
 have.

Yea, that was just a demo attribute, I am far from sure we should add
any. It was just important to me to test that we're easily able to do
so. I don't even think its semantics are necessarily all that useful.

I think we should do this, independent from adding additional
attributes. The reduction of rows in pg_attribute is quite noticeable,
and I can't see us just dropping the old system columns.
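
The saving is easy to see, since every relation currently carries one
pg_attribute row per system column:

    -- rows existing purely for system columns (ctid, xmin, cmin, xmax,
    -- cmax, tableoid, plus oid for WITH OIDS relations):
    SELECT count(*) AS system_column_rows
    FROM pg_attribute
    WHERE attnum < 0;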

 But TBH, I'm kind of enamored of the function-based approach at the
 moment.  It just seems a lot more flexible.  Adding a function is
 nearly free and we can add as many as we need doing whatever we want,
 and they can even go into contrib modules if we like it better that
 way.

Well, it really depends on the use case. Reading
SELECT header.xmin, header.xmax
FROM sometable tbl,
pg_tuple_header(tbl.tableoid, tbl.ctid) header;
is surely harder to understand than
SELECT tbl.xmin, tbl.xmax
FROM sometable tbl;
and the latter is noticeably cheaper since we're not re-fetching the
tuple; especially when the tuple is the result of a more complicated
query including a seqscan, we might often need to re-fetch the tuple
from disk, thanks to the ringbuffer.

That really makes me less than convinced that the function based
approach is the right tool for everything.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] [COMMITTERS] pgsql: Upgrade to Autoconf 2.69

2013-12-23 Thread Peter Eisentraut
On 12/21/13, 9:39 AM, Peter Eisentraut wrote:
 This is enabling large-file support on OS X, so that seems kind of
 important.  It's not failing with newer versions of OS X, so that leaves
 the following possibilities, I think:
 
 - Large files never worked on 10.5.  That would be strange because
 Autoconf explicitly refers to that version.
 
 - New OS X versions have this on by default (could someone check?), so
 it already worked before, and it's something specific to 10.5 that's
 broken.

Current OS X has 64-bit inodes on by default, so this code works in
general.  So it's probably one of the other causes.

 - It's something specific to PowerPC or endianness.
 
 - That build farm member has a strange setup.





Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-12-23 Thread Robert Haas
On Sun, Dec 22, 2013 at 6:42 PM, Peter Geoghegan p...@heroku.com wrote:
 On Fri, Dec 20, 2013 at 11:59 PM, Peter Geoghegan p...@heroku.com wrote:
 I think that the way forward is to refine my design in order to
 upgrade locks from exclusive buffer locks to something else, managed
 by the lock manager but perhaps through an additional layer of
 indirection. As previously outlined, I'm thinking of a new SLRU-based
 granular value locking infrastructure built for this purpose, with
 btree inserters marking pages as having an entry in this table.

 I'm working on a revision that holds lmgr page-level exclusive locks
 (and buffer pins) across multiple operations.  This isn't too
 different to what you've already seen, since they are still only held
 for an instant. Notably, hash indexes currently quickly grab and
 release lmgr page-level locks, though they're the only existing
 clients of that infrastructure. I think on reflection that
 fully-fledged value locking may be overkill, given the fact that these
 locks are only held for an instant, and only need to function as a
 choke point for unique index insertion, and only when upserting
 occurs.

 This approach seems promising. It didn't take me very long to get it
 to a place where it passed a few prior test-cases of mine, with fairly
 varied input, though the patch isn't likely to be posted for another
 few days. I think I can get it to a place where it doesn't regress
 regular insertion at all. I think that that will tick all of the many
 boxes, without unwieldy complexity and without compromising conceptual
 integrity.

 I mention this now because obviously time is a factor. If you think
 there's something I need to do, or that there's some way that I can
 more usefully coordinate with you, please let me know. Likewise for
 anyone else following.

I don't think this is a project to rush through.  We've lived without
MERGE/UPSERT for several years now, and we can live without it for
another release cycle while we try to reach agreement on the way
forward.  I can tell that you're convinced you know the right way
forward here, and you may be right, but I don't think you've convinced
everyone else - maybe not even anyone else.

I wouldn't suggest modeling anything you do on the way hash indexes
use heavyweight locks.  That is a performance disaster, not to
mention being basically a workaround for the fact that whoever wrote
the code originally didn't bother figuring out any way that splitting
a bucket could be accomplished in a crash-safe manner, even in theory.
 If it weren't for that, we'd be using buffer locks there.  That
doesn't necessarily mean that page-level heavyweight locks aren't the
right thing here, but the performance aspects of any such approach
will need to be examined carefully.

To be honest, I am still not altogether sold on any part of this
feature.  I don't like the fact that it violates MVCC - although I
admit that some form of violation is inevitable in any feature in this
area unless we're content to live with many serialization failures, I
don't like the particular way it violates MVCC, I don't like the
syntax ("returns rejects"? blech), and I don't like the fact that
getting the locking right, or even getting the semantics right, seems
to be so darned hard.  I think we're in real danger of building
something that will be too complex, or just too weird, for users to
use, and too complex to maintain as well.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] nested hstore patch

2013-12-23 Thread Pavel Stehule
Hello


2013/12/23 Hannu Krosing ha...@2ndquadrant.com

 On 12/23/2013 12:28 PM, Robert Haas wrote:
  On Fri, Dec 20, 2013 at 6:16 PM, David E. Wheeler da...@justatheory.com
 wrote:
  * New operators:
    + `hstore -> int`:     Get string value at array index (starting at 0)
    + `hstore ^> text`:    Get numeric value for key
    + `hstore ^> int`:     Get numeric value at array index
    + `hstore ?> text`:    Get boolean value for key
    + `hstore ?> int`:     Get boolean value at array index
    + `hstore #> text[]`:  Get string value for key path
    + `hstore #^> text[]`: Get numeric value for key path
    + `hstore #?> text[]`: Get boolean value for key path
    + `hstore %> text`:    Get hstore value for key
    + `hstore %> int`:     Get hstore value at array index
    + `hstore #%> text[]`: Get hstore value for key path
    + `hstore ? int`:      Does hstore contain array index
    + `hstore #? text[]`:  Does hstore contain key path
    + `hstore - int`:      Delete index from left operand
    + `hstore #- text[]`:  Delete key path from left operand
  Although in some ways there's a certain elegance to this, it also
  sorta looks like punctuation soup.  I can't help wondering whether
  we'd be better off sticking to function names.
 
 Has anybody looked into how hard it would be to add method notation
 to PostgreSQL, so that instead of calling

 getString(hstorevalue, n)

 we could use

 hstorevalue.getString(n)


Yes, I played with it some years ago. I stopped early - there was a problem
with the parser when I tried to append a new rule, and because there was no
simple solution, I didn't continue.

But it could be a nice feature - at least for plpgsql coders.

Regards

Pavel



 --
 Hannu Krosing
 PostgreSQL Consultant
 Performance, Scalability and High Availability
 2ndQuadrant Nordic OÜ






Re: [HACKERS] preserving forensic information when we freeze

2013-12-23 Thread Robert Haas
On Mon, Dec 23, 2013 at 10:42 AM, Andres Freund and...@2ndquadrant.com wrote:
 But TBH, I'm kind of enamored of the function-based approach at the
 moment.  It just seems a lot more flexible.  Adding a function is
 nearly free and we can add as many as we need doing whatever we want,
 and they can even go into contrib modules if we like it better that
 way.

 Well, it really depends on the use case. Reading
 SELECT header.xmin, header.xmax
 FROM sometable tbl,
 pg_tuple_header(tbl.tableoid, tbl.ctid) header;
 is surely harder to understand than
 SELECT tbl.xmin, tbl.xmax
 FROM sometable tbl;
 and the latter is noticeably cheaper since we're not re-fetching the
 tuple; especially when the tuple is the result of a more complicated
 query including a seqscan, we might often need to re-fetch the tuple
 from disk, thanks to the ringbuffer.

 That really makes me less than convinced that the function based
 approach is the right tool for everything.

Meh.  Who are all of these people who are fetching xmin, xmax, cmin,
and cmax in complex queries, and why are they doing that?  If those
columns are just for forensic purposes, then whatever performance
disadvantages the function-based approach has don't really matter.  If
people are using them as integral parts of their application, then
that's more of a concern, but how are those people not getting broken
every release or three anyway?  We keep inventing things like
combo-cids and multi-xacts and multi-xacts that also contain an update
and now freezing without changing xmin, and if you're actually relying
on those columns for anything but forensics your application is
presumably going to break when we change whatever part it depends on.
Is anyone really doing that, and if so, why?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Assertion failure in base backup code path

2013-12-23 Thread Magnus Hagander
On Dec 19, 2013 12:06 AM, Andres Freund and...@2ndquadrant.com wrote:

 Hi Magnus,

 It looks to me like the path to do_pg_start_backup() outside of a
 transaction context comes from your initial commit of the base backup
 facility.
 The problem is that you're not allowed to do anything leading to a
 syscache/catcache lookup in those contexts.

I think that may have come with the addition of the replication privilege
actually, but that doesn't change the fact that yes, it appears broken...

/Magnus


Re: [HACKERS] WITHIN GROUP patch

2013-12-23 Thread Tom Lane
I wrote:
 Or, really, why don't we just do the same thing I'm advocating for
 the plain-ordered-set case?  That is, if there's a single collation
 applying to all the collatable inputs, that's the collation to use
 for the aggregate; otherwise it has no determinate collation, and
 it'll throw an error at runtime if it needs one.

I went and tried to implement this, and realized that it would take some
pretty significant rewriting of parse_collate.c, because of examples
like this:

 rank(a,b) within group (order by c collate foo, d collate bar)

In the current parse_collate logic, it would throw error immediately
upon being told to merge the two explicit-COLLATE results.  We'd
need a way to postpone that error and instead just decide that the
rank aggregate's collation is indeterminate.  While that's perhaps
just a SMOP, it would mean that ordered-set aggregates don't resolve
collation the same way as other functions, which pretty much destroys
the argument for this approach.

What's more, the same problem applies to non-hypothetical ordered-set
aggregates, if they've got more than one sortable input column.

What I'm now thinking we want to do is:

1. Non-hypothetical direct args always contribute to determining the
agg's collation.

2. Hypothetical and aggregated args contribute to the agg's collation
only if the agg is designed so that there is always exactly one
aggregated arg (ie, it's non-variadic with one aggregated arg).
Otherwise we assign their collations per-sort-column and don't merge
them into the aggregate's collation.

This specification ensures that a variadic aggregate doesn't change
behavior depending on how many sort columns there happen to be.
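
For illustration (with real collation names substituted for foo/bar), the
earlier problem case would then parse without error:

    -- rank() is variadic, so per rule 2 the sort columns' collations are
    -- assigned per sort column and never merged into the aggregate:
    SELECT rank(a, b) WITHIN GROUP (ORDER BY c COLLATE "C", d COLLATE "POSIX")
    FROM t;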

regards, tom lane




Re: [HACKERS] XML Issue with DTDs

2013-12-23 Thread Peter Eisentraut
On 12/19/13, 6:40 PM, Florian Pflug wrote:
 The following example fails for XMLOPTION set to DOCUMENT as well as for 
 XMLOPTION set to CONTENT.
 
   select xmlconcat(
    xmlparse(document '<!DOCTYPE test [<!ELEMENT test EMPTY>]><test/>'),
    xmlparse(content '<test/>')
   )::text::xml;

The SQL standard specifies that DTDs are dropped by xmlconcat.  It's
just not implemented.




Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-23 Thread Amit Kapila
On Sun, Dec 22, 2013 at 8:32 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Sun, Dec 22, 2013 at 3:01 PM, Fujii Masao masao.fu...@gmail.com wrote:
 I found a bug in the ALTER SYSTEM SET patch. The procedure to reproduce
 it is here.

 $ psql
 =# ALTER SYSTEM SET shared_buffers = '512MB';
 ALTER SYSTEM
 =# \q
 $ pg_ctl -D data reload
 server signaled
 2013-12-22 18:24:13 JST LOG:  received SIGHUP, reloading configuration files
 2013-12-22 18:24:13 JST LOG:  parameter shared_buffers cannot be
 changed without restarting the server
 2013-12-22 18:24:13 JST LOG:  configuration file X?? contains
 errors; unaffected changes were applied

 The configuration file name in the last log message is broken. This
 problem was introduced by the ALTER SYSTEM SET patch.

FreeConfigVariables(head);
<snip>
else if (apply)
	ereport(elevel,
			(errcode(ERRCODE_CONFIG_FILE_ERROR),
			 errmsg("configuration file \"%s\" contains errors; "
					"unaffected changes were applied",
					ErrorConfFile)));

 The cause of the problem is that, in ProcessConfigFile(), the log
 message including the 'ErrorConfFile' is emitted after 'head' is
 freed, even though 'ErrorConfFile' points to one of the entries of
 'head'.

 Your analysis is absolutely right.
 The reason for this issue is that we want to display the filename to
 which the erroneous parameter belongs, and in this case we are freeing
 the parameter info before the actual error.
 To fix, we need to save the filename value before freeing the parameter
 list.

 Please find the attached patch to fix this issue.

 Note - on my machine, I could not reproduce the issue. I think this is
 not surprising because we are not setting freed memory to NULL, so it
 can display anything (sometimes the right value as well).

If you find that the provided patch doesn't fix the problem, or you don't
find it appropriate for some other reason, please let me know.

I shall be able to provide an updated patch early next week, as I will be
on vacation for the remaining part of this week.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] nested hstore patch

2013-12-23 Thread Jonathan S. Katz
On Dec 23, 2013, at 6:28 AM, Robert Haas wrote:

 On Fri, Dec 20, 2013 at 6:16 PM, David E. Wheeler da...@justatheory.com 
 wrote:
 * New operators:
  + `hstore -> int`:     Get string value at array index (starting at 0)
  + `hstore ^> text`:    Get numeric value for key
  + `hstore ^> int`:     Get numeric value at array index
  + `hstore ?> text`:    Get boolean value for key
  + `hstore ?> int`:     Get boolean value at array index
  + `hstore #> text[]`:  Get string value for key path
  + `hstore #^> text[]`: Get numeric value for key path
  + `hstore #?> text[]`: Get boolean value for key path
  + `hstore %> text`:    Get hstore value for key
  + `hstore %> int`:     Get hstore value at array index
  + `hstore #%> text[]`: Get hstore value for key path
  + `hstore ? int`:      Does hstore contain array index
  + `hstore #? text[]`:  Does hstore contain key path
  + `hstore - int`:      Delete index from left operand
  + `hstore #- text[]`:  Delete key path from left operand
 
 Although in some ways there's a certain elegance to this, it also
 sorta looks like punctuation soup.  I can't help wondering whether
 we'd be better off sticking to function names.

The key thing is making it easy for people to chain calls to their
nested hstore objects, and I think these operators accomplish that.

Some of them are fairly intuitive, and I think as a community if we have a)
good docs, b) good blog posts on how to use nested hstore, and c) clear
instructions at PG events on how to use it, it would be okay, though some
things, e.g. extracting the key by a path, might be better off in a function
anyway.  However, having it as an operator might encourage more usage, if
only because people tend to think that "functions will slow my query down."

My only concern is the consistency with the generally accepted standard of JSON 
and with the upcoming jsonb type.   I'm not sure if the jsonb API has  been 
defined yet, but it would be great to keep consistency between nested hstore 
and jsonb so people don't have to learn two different access systems.  Data 
extraction from JSON is often done by the dot operator in implementations, and 
depending on the language you are in, there are ways to add / test existence / 
remove objects from the JSON blob.

Being able to extract objects from nested hstore / JSON using the dot operator
would be simple, intuitive, and generally well understood, but of course there
are challenges with doing that in PG and, well, proper SQL.

Jonathan



Re: [HACKERS] preserving forensic information when we freeze

2013-12-23 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:

 Meh.  Who are all of these people who are fetching xmin, xmax, cmin,
 and cmax in complex queries, and why are they doing that?  If those
 columns are just for forensic purposes, then whatever performance
 disadvantages the function-based approach has don't really matter.  If
 people are using them as integral parts of their application, then
 that's more of a concern, but how are those people not getting broken
 every release or three anyway?  We keep inventing things like
 combo-cids and multi-xacts and multi-xacts that also contain an update
 and now freezing without changing xmin, and if you're actually relying
 on those columns for anything but forensics your application is
 presumably going to break when we change whatever part it depends on.
 Is anyone really doing that, and if so, why?

I've seen an ORM use xmin for a sort of optimistic concurrency
control scheme.  Let's say that you bring up an address by primary
key, and change an incorrect apartment number.  At the same time,
someone else is entering a change of address, to a whole new
location where a totally different apartment number is supplied.
The developers want to prevent portions of the two addresses from
being intermingled, they don't want to hold a database transaction
open for the time the user has the window up, they want to avoid
blocking as much as possible, and they don't want to store an extra
column with a version number or timestamp just to do that -- so
they use xmin.  When they go to rewrite the row they use the PK
values plus the xmin in the UPDATE statement's WHERE clause.  If
the row is not found, the user gets an error message.  Currently
that does mean that the users can get a spurious error message when
there is a tuple freeze while the window is open, so the current
changes are probably good news for those using this approach.
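
In SQL terms the pattern looks roughly like this (table and values
invented for illustration):

    -- fetch the row and remember xmin as a version token
    SELECT addr_id, apartment, xmin FROM address WHERE addr_id = 42;

    -- write back only if nobody else updated the row in the meantime
    UPDATE address
       SET apartment = '4B'
     WHERE addr_id = 42
       AND xmin = 12345;  -- the xmin value read above
    -- zero rows updated => concurrent modification; report an error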

Other than that and ctid (for similar reasons) I have not seen
system columns used in applications or application frameworks.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-23 Thread Thom Brown
Discussion around this topic has reached hundreds of messages, and
whilst I have failed to find mention of my following question, I
appreciate it may have already been discussed.

I have noticed that configuration parameters for extensions are only
supported if the server was started with one of the parameters already
being set in postgresql.conf:

# without any mention in postgresql.conf
postgres=# ALTER SYSTEM SET auto_explain.log_verbose = true;
ERROR:  unrecognized configuration parameter auto_explain.log_verbose

# with auto_explain.log_verbose = false added to postgresql.conf
postgres=# ALTER SYSTEM SET auto_explain.log_verbose = true;
ALTER SYSTEM

Removing the entry from postgresql.conf, restarting Postgres, setting
it to the default, then restarting again renders it unsettable again:

# removed entry from postgresql.conf and restarted

postgres=# ALTER SYSTEM SET auto_explain.log_verbose = default;
ALTER SYSTEM

# restart postgres

postgres=# ALTER SYSTEM SET auto_explain.log_verbose = true;
ERROR:  unrecognized configuration parameter auto_explain.log_verbose

Is this by design?

-- 
Thom




Re: [HACKERS] WITHIN GROUP patch

2013-12-23 Thread Andrew Gierth
>>>>> "Tom" == Tom Lane t...@sss.pgh.pa.us writes:

 Tom> What I'm now thinking we want to do is:

 Tom> 1. Non-hypothetical direct args always contribute to determining
 Tom> the agg's collation.

 Tom> 2. Hypothetical and aggregated args contribute to the agg's
 Tom> collation only if the agg is designed so that there is always
 Tom> exactly one aggregated arg (ie, it's non-variadic with one
 Tom> aggregated arg).  Otherwise we assign their collations
 Tom> per-sort-column and don't merge them into the aggregate's
 Tom> collation.

 Tom> This specification ensures that a variadic aggregate doesn't
 Tom> change behavior depending on how many sort columns there happen
 Tom> to be.

Works for me.

-- 
Andrew (irc:RhodiumToad)




Re: [HACKERS] preserving forensic information when we freeze

2013-12-23 Thread Andres Freund
On 2013-12-23 10:56:07 -0500, Robert Haas wrote:
  Well, it really depends on the use case. Reading
  SELECT header.xmin, header.xmax
  FROM sometable tbl,
  pg_tuple_header(tbl.tableoid, tbl.ctid) header;
  is surely harder to understand than
  SELECT tbl.xmin, tbl.xmax
  FROM sometable tbl;
  and the latter is noticeably cheaper since we're not re-fetching the
  tuple; especially when the tuple is the result of a more complicated
  query including a seqscan, we might often need to re-fetch the tuple
  from disk, thanks to the ringbuffer.
 
  That really makes me less than convinced that the function based
  approach is the right tool for everything.

 Meh.  Who are all of these people who are fetching xmin, xmax, cmin,
 and cmax in complex queries, and why are they doing that?

I have seen it done to avoid ABA problems in cache invalidation
mechanisms by simply checking that ctid, xmin, and xmax stay the same.

 If those
 columns are just for forensic purposes, then whatever performance
 disadvantages the function-based approach has don't really matter.  If
 people are using them as integral parts of their application, then
 that's more of a concern, but how are those people not getting broken
 every release or three anyway?  We keep inventing things like
 combo-cids and multi-xacts and multi-xacts that also contain an update
 and now freezing without changing xmin, and if you're actually relying
 on those columns for anything but forensics your application is
 presumably going to break when we change whatever part it depends on.

Well, all of the fundamental changes (combocids, the initial multixact
introduction) happened quite some time ago. I think backward compat has
a much higher value these days (I also don't see much point in looking
at cmin/cmax for anything but diagnostic purposes). I don't think any of
the usecases I've seen would be broken by either fk-locks (multixacts
have been there before, doesn't matter much that they now contain
updates) nor by forensic freezing. The latter because they really only
checked that xmin/xmax were the same, and we hopefully haven't broken
that...

But part of my point really was the usability, not only the
performance. Requiring LATERAL queries to be usable sensibly causes a
Meh from my side ;)

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-23 Thread Robert Haas
On Mon, Dec 23, 2013 at 1:42 PM, Thom Brown t...@linux.com wrote:
 Discussion around this topic has reached hundreds of messages, and
 whilst I have failed to find mention of my following question, I
 appreciate it may have already been discussed.

 I have noticed that configuration parameters for extensions are only
 supported if the server was started with one of the parameters already
 being set in postgresql.conf:

 # without any mention in postgresql.conf
 postgres=# ALTER SYSTEM SET auto_explain.log_verbose = true;
 ERROR:  unrecognized configuration parameter auto_explain.log_verbose

 # with auto_explain.log_verbose = false added to postgresql.conf
 postgres=# ALTER SYSTEM SET auto_explain.log_verbose = true;
 ALTER SYSTEM

 Removing the entry from postgresql.conf, restarting Postgres, setting
 it to the default, then restarting again renders it unsettable again:

 # removed entry from postgresql.conf and restarted

 postgres=# ALTER SYSTEM SET auto_explain.log_verbose = default;
 ALTER SYSTEM

 # restart postgres

 postgres=# ALTER SYSTEM SET auto_explain.log_verbose = true;
 ERROR:  unrecognized configuration parameter auto_explain.log_verbose

 Is this by design?

I would think that you'd need to have auto_explain loaded in the
backend where you're trying to make a change, but you shouldn't need
the setting to be present in postgresql.conf.
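
For example, something like this should work in a single session, with
nothing in postgresql.conf:

postgres=# LOAD 'auto_explain';
LOAD
postgres=# ALTER SYSTEM SET auto_explain.log_verbose = true;
ALTER SYSTEM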

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] preserving forensic information when we freeze

2013-12-23 Thread Robert Haas
On Mon, Dec 23, 2013 at 1:57 PM, Andres Freund and...@2ndquadrant.com wrote:
 Well, all of the fundamental changes (combocids, the initial multixact
 introduction) happened quite some time ago. I think backward compat has
 a much higher value these days (I also don't see much point in looking
 at cmin/cmax for anything but diagnostic purposes). I don't think any of
 the usecases I've seen would be broken by either fk-locks (multixacts
 have been there before, doesn't matter much that they now contain
 updates) nor by forensic freezing. The latter because they really only
 checked that xmin/xmax were the same, and we hopefully haven't broken
 that...

 But part of my point really was the usability, not only the
 performance. Requiring LATERAL queries to be usable sensibly causes a
 Meh from my side ;)

I simply can't imagine that we're going to want to add a system column
every time we change something, or even enough of them to cover all of
the things we've already got.  We'd need at least infomask, infomask2,
and hoff, and maybe some functions for peeking through mxacts to find
the updater and locker XIDs and lock modes.  With a couple of
functions we can do all that and not look back.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-23 Thread Thom Brown
On 23 December 2013 19:14, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Dec 23, 2013 at 1:42 PM, Thom Brown t...@linux.com wrote:
 Discussion around this topic has reached hundreds of messages, and
 whilst I have failed to find mention of my following question, I
 appreciate it may have already been discussed.

 I have noticed that configuration parameters for extensions are only
 supported if the server was started with one of the parameters already
 being set in postgresql.conf:

 # without any mention in postgresql.conf
 postgres=# ALTER SYSTEM SET auto_explain.log_verbose = true;
 ERROR:  unrecognized configuration parameter auto_explain.log_verbose

 # with auto_explain.log_verbose = false added to postgresql.conf
 postgres=# ALTER SYSTEM SET auto_explain.log_verbose = true;
 ALTER SYSTEM

 Removing the entry from postgresql.conf, restarting Postgres, setting
 it to the default, then restarting again renders it unsettable again:

 # removed entry from postgresql.conf and restarted

 postgres=# ALTER SYSTEM SET auto_explain.log_verbose = default;
 ALTER SYSTEM

 # restart postgres

 postgres=# ALTER SYSTEM SET auto_explain.log_verbose = true;
 ERROR:  unrecognized configuration parameter auto_explain.log_verbose

 Is this by design?

 I would think that you'd need to have auto_explain loaded in the
 backend where you're trying to make a change, but you shouldn't need
 the setting to be present in postgresql.conf.

This appears to be the case.  I hadn't set the library to be loaded in
the config.

I guess therefore it follows that arbitrary configuration parameters
aren't supported (e.g. moo.bark = 5), only pre-defined ones.

Thanks

Thom




Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-23 Thread Robert Haas
On Mon, Dec 23, 2013 at 2:19 PM, Thom Brown t...@linux.com wrote:
 I would think that you'd need to have auto_explain loaded in the
 backend where you're trying to make a change, but you shouldn't need
 the setting to be present in postgresql.conf.

 This appears to be the case.  I hadn't set the library to be loaded in
 the config.

 I guess therefore it follows that arbitrary configuration parameters
 aren't supported (e.g. moo.bark = 5), only pre-defined ones.

Yeah, and that's by design.  Otherwise, it would be too easy to set a
config parameter to a value that wasn't legal, and therefore make the
server fail to start.  moo.bark = 5 likely won't cause any problems,
but auto_explain.log_verbose = fasle could.  By insisting that the
module providing the GUC be loaded, we can sanity-check the value at
the time it gets set.
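
For example, with the module loaded the bad value is caught immediately,
roughly like:

postgres=# LOAD 'auto_explain';
LOAD
postgres=# ALTER SYSTEM SET auto_explain.log_verbose = fasle;
ERROR:  parameter "auto_explain.log_verbose" requires a Boolean value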

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-23 Thread Thom Brown
On 23 December 2013 19:35, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Dec 23, 2013 at 2:19 PM, Thom Brown t...@linux.com wrote:
 I would think that you'd need to have auto_explain loaded in the
 backend where you're trying to make a change, but you shouldn't need
 the setting to be present in postgresql.conf.

 This appears to be the case.  I hadn't set the library to be loaded in
 the config.

 I guess therefore it follows that arbitrary configuration parameters
 aren't supported (e.g. moo.bark = 5), only pre-defined ones.

 Yeah, and that's by design.  Otherwise, it would be too easy to set a
 config parameter to a value that wasn't legal, and therefore make the
 server fail to start.  moo.bark = 5 likely won't cause any problems,
 but auto_explain.log_verbose = fasle could.  By insisting that the
 module providing the GUC be loaded, we can sanity-check the value at
 the time it gets set.

All clear. :)

-- 
Thom




Re: [HACKERS] ERROR during end-of-xact/FATAL

2013-12-23 Thread Noah Misch
On Fri, Nov 15, 2013 at 09:51:32AM -0500, Robert Haas wrote:
 On Wed, Nov 13, 2013 at 11:04 AM, Noah Misch n...@leadboat.com wrote:
  So, in short, ERROR + ERROR*10 = PANIC, but FATAL + ERROR*10 = FATAL.
  That's bizarre.
 
  Quite so.
 
  Given that that's where we are, promoting an ERROR during FATAL
  processing to PANIC doesn't seem like it's losing much; we're
  essentially already doing that in the (probably more likely) case of a
  persistent ERROR during ERROR processing.  But since PANIC sucks, I'd
  rather go the other direction: let's make an ERROR during ERROR
  processing promote to FATAL.  And then let's do what you write above:
  make sure that there's a separate on-shmem-exit callback for each
  critical shared memory resource and that we call of those during FATAL
  processing.
 
  Many of the factors that can cause AbortTransaction() to fail can also cause
  CommitTransaction() to fail, and those would still PANIC if the transaction
  had an xid.  How practical might it be to also escape from an error during
  CommitTransaction() with a FATAL instead of PANIC?  There's more to fix up in
  that case (sinval, NOTIFY), but it may be within reach.  If such a technique
  can only reasonably fix abort, though, I have doubts it buys us enough.
 
 The critical stuff that's got to happen after
 RecordTransactionCommit() appears to be ProcArrayEndTransaction() and
 AtEOXact_Inval(). Unfortunately, the latter is well after the point
 when we're supposed to only be doing non-critical resource cleanup,
 notwithstanding which it appears to be critical.
 
 So here's a sketch.  Hoist the preparatory logic in
 RecordTransactionCommit() - smgrGetPendingDeletes,
 xactGetCommittedChildren, and xactGetCommittedInvalidationMessages up
 into the caller and do it before setting TRANS_COMMIT.  If any of that
 stuff fails, we'll land in AbortTransaction() which must cope.  As
 soon as we exit the commit critical section, set a flag somewhere
 (where?) indicating that we have written our commit record; when that
 flag is set, (a) promote any ERROR after that point through the end of
 commit cleanup to FATAL and (b) if we enter AbortTransaction(), don't
 try to RecordTransactionAbort().
 
 I can't see that the notification stuff requires fixup in this case;
 AFAICS, it is just adjusting backend-local state, and it's OK to
 disregard any problems there during a FATAL exit.  Do you see
 something to the contrary?

The part not to miss is SignalBackends() in ProcessCompletedNotifies().  It
happens outside CommitTransaction(), just before the backend goes idle.
Without that, listeners could miss our notification until some other NOTIFY
transaction signals them.  That said, since ProcessCompletedNotifies() already
accepts the possibility of failure, it would not be crazy to overlook this.

 But invalidation messages are a problem:
 if we commit and exit without sending our queued-up invalidation
 messages, Bad Things Will Happen.  Perhaps we could arrange things so
 that in that case only, we just PANIC.   That would allow most write
 transactions to get by with FATAL, promoting to PANIC only in the case
 of transactions that have modified system catalogs and only until the
 invalidations have actually been sent.  Avoiding the PANIC in that
 case seems to require some additional wizardry which is not entirely
 clear to me at this time.

Agreed.

 I think we'll have to approach the various problems in this area
 stepwise, or we'll never make any progress.

That's one option.  The larger point is that allowing ERROR-late-in-COMMIT and
FATAL + ERROR scenarios to get away with FATAL requires this sort of analysis
of every exit callback and all the later stages of CommitTransaction().  The
current bug level shows such analysis hasn't been happening, and the dearth of
bug reports suggests the scenarios are rare.  I don't think the project stands
to gain enough from the changes contemplated here and, perhaps more notably,
from performing such analysis during review of future patches that touch
CommitTransaction() and the exit sequence.  It remains my recommendation to
run proc_exit_prepare() and the post-CLOG actions of CommitTransaction() and
AbortTransaction() in critical sections, giving the scenarios blunt but
reliable treatment.  If reports of excess PANIC arise, the thing to fix is the
code causing the late error.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] better atomics - v0.2

2013-12-23 Thread Robert Haas
On Thu, Dec 5, 2013 at 6:39 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-11-19 10:37:35 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  The only animal we have that doesn't support quiet inlines today is
  HP-UX/ac++, and I think - as in patch 1 in the series - we might be able
  to simply suppress the warning there.

 Or just not worry about it, if it's only a warning?

 So, my suggestion on that end is that we remove the requirement for
 quiet inline now and see if that has any negative consequences on the
 buildfarm for a week or so. Imo that's a good idea regardless of us
 relying on inlining support.
 Does anyone have anything against that plan? If not, I'll prepare a
 patch.

 Or does the warning
 mean code bloat (lots of useless copies of the inline function)?

 After thinking on that for a bit, yes that's a possible consequence, but
 it's quite possible that it happens in cases where we don't get the
 warning too, so I don't think that argument has too much bearing as I
 don't recall a complaint about it.

But I bet the warning we're getting there is complaining about exactly
that thing.  It's possible that it means "warning: i've optimized away
your unused function into nothing" but I think it's more likely that
it means "warning: i just emitted dead code".  Indeed, I suspect that
this is why that test is written that way in the first place.
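
(For illustration only -- my sketch, not project code.  The pattern at
issue is a static inline defined in a header:

    static inline int
    add_one(int x)
    {
        return x + 1;
    }

A compiler with quiet inline support silently drops add_one() from any
translation unit that includes the header without calling it; one
without it warns, roughly "'add_one' defined but not used", and may
also emit a dead static copy of the function in each such object file
-- the code bloat in question.)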

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITHIN GROUP patch

2013-12-23 Thread Tom Lane
Atri Sharma atri.j...@gmail.com writes:
 Please find attached the latest patch for WITHIN GROUP. This patch is
 after fixing the merge conflicts.

I've committed this after significant editorialization --- most notably,
I pushed control of the sort step into the aggregate support functions.
I didn't like the way nodeAgg.c had been hacked up to do it the other way.
There's a couple hundred lines of code handling that in orderedsetaggs.c,
which is more or less comparable to the amount of code that didn't get
added to nodeAgg.c, so I think the argument that the original approach
avoided code bloat is bogus.

The main reason I pushed all the new aggregates into a single file
(orderedsetaggs.c) was so they could share a private typedef for the
transition state value.  It's possible that we should expose that
struct so that third-party aggregate functions could leverage the
existing transition-function infrastructure instead of having to
copy-and-paste it.  I wasn't sure where to put it though --- maybe
a new include file would be needed.  Anyway I left the point for
future discussion.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-12-23 Thread Peter Geoghegan
On Mon, Dec 23, 2013 at 7:49 AM, Robert Haas robertmh...@gmail.com wrote:
 I don't think this is a project to rush through.  We've lived without
 MERGE/UPSERT for several years now, and we can live without it for
 another release cycle while we try to reach agreement on the way
 forward.  I can tell that you're convinced you know the right way
 forward here, and you may be right, but I don't think you've convinced
 everyone else - maybe not even anyone else.

That may be. Attention from reviewers has been in relatively short
supply. Not that that isn't always true.

 I wouldn't suggest modeling anything you do on the way hash indexes
 using heavyweight locks.  That is a performance disaster, not to
 mention being basically a workaround for the fact that whoever wrote
 the code originally didn't bother figuring out any way that splitting
 a bucket could be accomplished in a crash-safe manner, even in theory.
  If it weren't for that, we'd be using buffer locks there.

Having looked at the code for the first time recently, I'd agree that
hash indexes are a disaster. A major advantage of The Lehman and Yao
Algorithm, as prominently noted in the paper, is that exclusive locks
are only acquired on leaf pages to increase concurrency. Since I only
propose to extend this to a heavyweight page lock, and still only for
an instant, it seems reasonable to assume that the performance will be
acceptable for an initial version of this. It's not as if most places
will have to pay any heed to this heavyweight lock - index scans and
non-upserting inserts are generally unaffected. We can later optimize
performance as we measure a need to do so. Early indications are that
the performance is reasonable.

Holding value locks for more than an instant doesn't make sense. The
reason is simple: when upserting, we're tacitly only really looking
for violations on one particular unique index. We just lock them all
at once because the implementation doesn't know *which* unique index.
So in actuality, it's really no different from existing
potential-violation handling for unique indexes, except we have to do
some extra work in addition to the usual restart from scratch stuff
(iff we have multiple unique indexes).

 To be honest, I am still not altogether sold on any part of this
 feature.  I don't like the fact that it violates MVCC - although I
 admit that some form of violation is inevitable in any feature in this
 area unless we're content to live with many serialization failures, I
 don't like the particular way it violates MVCC

Discussions around visibility issues have not been very useful. As
I've said, I don't like the term MVCC violation, because it's
suggestive of some classical, codified definition of MVCC, a
definition that doesn't actually exist anywhere, even in research
papers, AFAICT. So while I understand your concerns around the
modifications to HeapTupleSatisfiesMVCC(), and while I respect that we
need to be mindful of the performance impact, my position is that if
that really is what we need to do, we might as well be honest about
it, and express intent succinctly and directly. This is a position
that is orthogonal to the proposed syntax, even if that is convenient
to my patch. It's already been demonstrated that yes, the MVCC
violation can be problematic when we call HeapTupleSatisfiesUpdate(),
which is a bug that was fixed by making another modest modification to
HeapTupleSatisfiesUpdate(). It is notable that that bug would have
still occurred had a would-be-HTSMVCC-invisible tuple been passed
through any other means. What problem, specifically, do you envisage
avoiding by doing it some other way? What other way do you have in
mind?

We invested huge effort into more granular FK locking when we had a
few complaints about it. I wouldn't be surprised if that effort
modestly regressed HeapTupleSatisfiesMVCC(). On the other hand, this
feature has been in very strong demand for over a decade, and has a
far smaller code footprint. I don't want to denigrate the FK locking
stuff in any way - it is a fantastic piece of work - but it's
important to have a sense of proportion about these things. In order
to make visibility work in the way we require, we're almost always
just doing additional checking of infomask bits, and the t_infomask
variable is probably already in a CPU register (this is a huge
simplification, but is essentially true). Like you, I have noted that
HeapTupleSatisfiesMVCC() is a fairly hot routine during profiling
before, but it's not *that* hot.

It's understandable that you raise these points, but from my
perspective it's hard to address your concerns without more concrete
objections.

 I don't like the
 syntax (returns rejects? blech)

I suppose it isn't ideal in some ways. On the other hand, it is
extremely flexible, with many of the same advantages of SQL MERGE.
Importantly, it will facilitate merging as part of conflict resolution
on multi-master replication systems, which I think is of considerable

[HACKERS] trailing comment ghost-timing

2013-12-23 Thread Erik Rijkers
With \timing on, a trailing comment yields a timing.

# test.sql
select 1;

/*
select 2
*/

$ psql -f test.sql
 ?column?
--
1
(1 row)

Time: 0.651 ms
Time: 0.089 ms

I assume it is timing something about that comment (right?).

Confusing and annoying, IMHO.  Is there any chance such trailing ghost-timings 
can be removed?

BTW:
$ cat ~/.psqlrc
\set QUIET on
\timing on


Thanks,

Erik Rijkers






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Planning time in explain/explain analyze

2013-12-23 Thread Andreas Karlsson

Hi,

A user asked in -performance[1] if there is a way to measure the
planning time. Other than measuring it in the client I do not think
there is, so I quickly hacked up a patch which adds "Planning time" to
the output of EXPLAIN and EXPLAIN ANALYZE.


Is this something useful? I think it is, since plan time can become an 
issue for complex queries.


A couple of questions about the patch: Should the planning time be added
to both EXPLAIN and EXPLAIN ANALYZE? And is the output obvious enough?


The patch does not include any changes to documentation or tests. I will 
fix that if people think this patch is useful.


$ EXPLAIN SELECT * FROM pg_stat_activity;
                                            QUERY PLAN
----------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1.15..2.69 rows=1 width=337)
   ->  Hash Join  (cost=1.02..2.41 rows=1 width=273)
         Hash Cond: (s.usesysid = u.oid)
         ->  Function Scan on pg_stat_get_activity s  (cost=0.00..1.00 rows=100 width=209)
         ->  Hash  (cost=1.01..1.01 rows=1 width=68)
               ->  Seq Scan on pg_authid u  (cost=0.00..1.01 rows=1 width=68)
   ->  Index Scan using pg_database_oid_index on pg_database d  (cost=0.13..0.27 rows=1 width=68)
         Index Cond: (oid = s.datid)
 Planning time: 0.258 ms
(9 rows)

$ EXPLAIN ANALYZE SELECT * FROM pg_stat_activity;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1.15..2.69 rows=1 width=337) (actual time=0.094..0.096 rows=1 loops=1)
   ->  Hash Join  (cost=1.02..2.41 rows=1 width=273) (actual time=0.078..0.079 rows=1 loops=1)
         Hash Cond: (s.usesysid = u.oid)
         ->  Function Scan on pg_stat_get_activity s  (cost=0.00..1.00 rows=100 width=209) (actual time=0.053..0.053 rows=1 loops=1)
         ->  Hash  (cost=1.01..1.01 rows=1 width=68) (actual time=0.014..0.014 rows=1 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 1kB
               ->  Seq Scan on pg_authid u  (cost=0.00..1.01 rows=1 width=68) (actual time=0.007..0.009 rows=1 loops=1)
   ->  Index Scan using pg_database_oid_index on pg_database d  (cost=0.13..0.27 rows=1 width=68) (actual time=0.009..0.010 rows=1 loops=1)
         Index Cond: (oid = s.datid)
 Planning time: 0.264 ms
 Total runtime: 0.158 ms
(11 rows)

Links

1. 
http://www.postgresql.org/message-id/cacfv+pknembqyjpcqrgsvmc_hvrgai3d_ge893n8qbx+ymh...@mail.gmail.com


--
Andreas Karlsson
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
new file mode 100644
index 9969a25..42425fa
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
*** ExplainOneQuery(Query *query, IntoClause
*** 318,330 
  		(*ExplainOneQuery_hook) (query, into, es, queryString, params);
  	else
  	{
! 		PlannedStmt *plan;
  
  		/* plan the query */
  		plan = pg_plan_query(query, 0, params);
  
  		/* run it (if needed) and produce output */
! 		ExplainOnePlan(plan, into, es, queryString, params);
  	}
  }
  
--- 318,336 
  		(*ExplainOneQuery_hook) (query, into, es, queryString, params);
  	else
  	{
! 		PlannedStmt	*plan;
! 		instr_time	planstart, planduration;
! 
! 		INSTR_TIME_SET_CURRENT(planstart);
  
  		/* plan the query */
  		plan = pg_plan_query(query, 0, params);
  
+ 		INSTR_TIME_SET_CURRENT(planduration);
+ 		INSTR_TIME_SUBTRACT(planduration, planstart);
+ 
  		/* run it (if needed) and produce output */
! 		ExplainOnePlan(plan, into, es, queryString, params, planduration);
  	}
  }
  
*** ExplainOneUtility(Node *utilityStmt, Int
*** 401,412 
   */
  void
  ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
! 			   const char *queryString, ParamListInfo params)
  {
  	DestReceiver *dest;
  	QueryDesc  *queryDesc;
  	instr_time	starttime;
  	double		totaltime = 0;
  	int			eflags;
  	int			instrument_option = 0;
  
--- 407,420 
   */
  void
  ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
! 			   const char *queryString, ParamListInfo params,
! 			   instr_time planduration)
  {
  	DestReceiver *dest;
  	QueryDesc  *queryDesc;
  	instr_time	starttime;
  	double		totaltime = 0;
+ 	double		plantime = INSTR_TIME_GET_DOUBLE(planduration);
  	int			eflags;
  	int			instrument_option = 0;
  
*** ExplainOnePlan(PlannedStmt *plannedstmt,
*** 482,487 
--- 490,500 
  	/* Create textual dump of plan tree */
  	ExplainPrintPlan(es, queryDesc);
  
+ 	if (es->format == EXPLAIN_FORMAT_TEXT)
+ 		appendStringInfo(es->str, "Planning time: %.3f ms\n", 1000.0 * plantime);
+ 	else
+ 		ExplainPropertyFloat("Planning Time", 1000.0 * plantime, 3, es);
+ 
  	/* Print info about runtime of triggers */
 	if (es->analyze)
  	{
diff --git a/src/backend/commands/prepare.c 

Re: [HACKERS] trailing comment ghost-timing

2013-12-23 Thread Andreas Karlsson

On 12/24/2013 02:05 AM, Erik Rijkers wrote:

With \timing on, a trailing comment yields a timing.

# test.sql
select 1;

/*
select 2
*/

$ psql -f test.sql
  ?column?
--
 1
(1 row)

Time: 0.651 ms
Time: 0.089 ms

I assume it is timing something about that comment (right?).

Confusing and annoying, IMHO.  Is there any chance such trailing ghost-timings 
can be removed?


This seems to be caused by psql sending the comment to the server to 
evaluate as a query. I personally think timing should always output 
something for every command sent to the server. To fix this problem I 
guess one would have to modify psql_scan() to check if a scanned query 
only contains comments and then not send it to the server at all.
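
For what it's worth, the check itself need not be complicated.  A rough
sketch (a hypothetical helper; it ignores nested /* */ comments and
quoted strings, which a real fix in psql_scan() would have to handle):

#include <ctype.h>
#include <stdbool.h>
#include <string.h>

static bool
query_is_only_comments(const char *q)
{
	while (*q)
	{
		if (strncmp(q, "--", 2) == 0)
		{
			/* line comment: skip to end of line */
			while (*q && *q != '\n')
				q++;
		}
		else if (strncmp(q, "/*", 2) == 0)
		{
			/* block comment: skip past the terminator */
			const char *end = strstr(q + 2, "*/");

			if (end == NULL)
				return false;	/* unterminated: let the server complain */
			q = end + 2;
		}
		else if (isspace((unsigned char) *q) || *q == ';')
			q++;
		else
			return false;		/* found a real token */
	}
	return true;
}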


The minimal file to reproduce it is:

/**/

--
Andreas Karlsson


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Partial sort

2013-12-23 Thread Andreas Karlsson

On 12/22/2013 04:38 PM, Alexander Korotkov wrote:

postgres=# explain analyze select * from test order by v1, id limit 10;
                                                              QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=11441.77..11442.18 rows=10 width=12) (actual time=79.980..79.982 rows=10 loops=1)
   ->  Partial sort  (cost=11441.77..53140.44 rows=100 width=12) (actual time=79.978..79.978 rows=10 loops=1)
         Sort Key: v1, id
         Presorted Key: v1
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Index Scan using test_v1_idx on test  (cost=0.42..47038.83 rows=100 width=12) (actual time=0.031..38.275 rows=100213 loops=1)
 Total runtime: 81.786 ms
(7 rows)


Have you thought about how you plan to print which sort method and
how much memory was used? Several different sort methods may have been
used in the query. Should the largest amount of memory/disk be printed?



However, work with joins needs more improvements.


That would be really nice to have, but the patch seems useful even 
without the improvements to joins.


--
Andreas Karlsson


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] trailing comment ghost-timing

2013-12-23 Thread David Johnston
Andreas Karlsson wrote
 On 12/24/2013 02:05 AM, Erik Rijkers wrote:
 With \timing on, a trailing comment yields a timing.

 # test.sql
 select 1;

 /*
 select 2
 */

 $ psql -f test.sql
   ?column?
 --
  1
 (1 row)

 Time: 0.651 ms
 Time: 0.089 ms

 I assume it is timing something about that comment (right?).

 Confusing and annoying, IMHO.  Is there any chance such trailing
 ghost-timings can be removed?
 
 This seems to be caused by psql sending the comment to the server to 
 evaluate as a query. I personally think timing should always output 
 something for every command sent to the server. To fix this problem I 
 guess one would have to modify psql_scan() to check if a scanned query 
 only contains comments and then not send it to the server at all.
 
 The minimal file to reproduce it is:
 
 /**/
 
 -- 
 Andreas Karlsson

I need to be convinced that the server should not just silently ignore
trailing comments.  I'd consider an exception if the only text sent is a
comment (in such a case we should throw an error), but if valid commands
are sent and there is just some comment text at the end, it should be ignored
the same as if the comments were embedded in the middle of the query text.

I've encountered other clients that output phantom results in this situation
and solving it at the server seems worthwhile so client applications do not
have to care.

In the example case, I think, putting the comment before the command results
in only one timing.  This inconsistency is a symptom of this situation being
handled incorrectly.

David J.





--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/trailing-comment-ghost-timing-tp5784473p5784484.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planning time in explain/explain analyze

2013-12-23 Thread Tom Lane
Andreas Karlsson andr...@proxel.se writes:
 The patch does not include any changes to documentation or tests. I will 
 fix that if people think this patch is useful.

I take it you've not tried the regression tests with this.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] trailing comment ghost-timing

2013-12-23 Thread Andreas Karlsson

On 12/24/2013 03:17 AM, David Johnston wrote:

I need to be convinced that the server should not just silently ignore
trailing comments.  I'd consider an exception if the only text sent is a
comment ( in such a case we should throw an error ) but if valid commands
are sent and there is just some comment text at the end it should be ignored
the same as if the comments were embedded in the middle of the query text.

I've encountered other clients that output phantom results in this situation
and solving it at the server seems worthwhile so client applications do not
have to care.

In the example case, I think, putting the comment before the command results
in only one timing.  This inconsistency is a symptom of this situation being
handled incorrectly.


It is not sent to the server as a trailing comment. The following file 
is sent to the server like this.


File:
/**/;
/**/

Commands:
PQexec(..., "/**/;");
PQexec(..., "/**/");

If this has to be fixed it should be in the client. I think people would 
complain if we broke the API by starting to throw an exception on PQexec 
with a string containing no actual query.


--
Andreas Karlsson


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planning time in explain/explain analyze

2013-12-23 Thread Euler Taveira
On 23-12-2013 22:12, Andreas Karlsson wrote:
 A user asked in -performance[1] if there is a way to measure the
 planning time. Other than measuring it in the client I do not think
 there is, so I quickly hacked up a patch which adds "Planning time" to
 the output of EXPLAIN and EXPLAIN ANALYZE.
 
 Is this something useful? I think it is, since plan time can become an
 issue for complex queries.
 
Yes, but a couple of years ago, this same feature was proposed as a
module [1][2]. I'm not sure if it is worth including as an additional
module or not. Opinions?


[1]
http://www.postgresql.org/message-id/BANLkTi=sxKTCTcv9hsACu38eaj=fw_4...@mail.gmail.com
[2] https://github.com/umitanuki/planinstr


-- 
   Euler Taveira   Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planning time in explain/explain analyze

2013-12-23 Thread Andreas Karlsson

On 12/24/2013 03:33 AM, Tom Lane wrote:

Andreas Karlsson andr...@proxel.se writes:

The patch does not include any changes to documentation or tests. I will
fix that if people think this patch is useful.


I take it you've not tried the regression tests with this.


Yeah, forgot to mention that we need some way to disable it in the 
tests. Either by not having it included in EXPLAIN or by adding an 
option to turn it off. Any suggestion on which would be preferable?


--
Andreas Karlsson


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planning time in explain/explain analyze

2013-12-23 Thread Robert Haas
On Mon, Dec 23, 2013 at 9:54 PM, Andreas Karlsson andr...@proxel.se wrote:
 On 12/24/2013 03:33 AM, Tom Lane wrote:

 Andreas Karlsson andr...@proxel.se writes:

 The patch does not include any changes to documentation or tests. I will
 fix that if people think this patch is useful.


 I take it you've not tried the regression tests with this.


 Yeah, forgot to mention that we need some way to disable it in the tests.
 Either by not having it included in EXPLAIN or by adding an option to turn
 it off. Any suggestion on which would be preferable?

I would be tempted to display it only if (COSTS OFF) is not given.  As
far as I can tell, the major use case for (COSTS OFF) is when you want
the output to be stable so you can include it in a regression test.
Technically speaking, planning time is not a cost, but I'm not sure I
could live with myself if we forced everyone to write (COSTS OFF,
PLANNING_TIME OFF).  And I don't think much of the idea of only
including planning time when ANALYZE is used, because you don't have
to want to run the query to want to know how long it took to plan.
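
For example, the stability that (COSTS OFF) buys today -- and that a
planning-time line must not break -- looks like this (output from
memory, but I believe it is accurate):

regression=# EXPLAIN (COSTS OFF) SELECT 1;
 QUERY PLAN
------------
 Result
(1 row)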

Also, +1 for this general concept.  Great idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planning time in explain/explain analyze

2013-12-23 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Dec 23, 2013 at 9:54 PM, Andreas Karlsson andr...@proxel.se wrote:
 Yeah, forgot to mention that we need some way to disable it in the tests.
 Either by not having it included in EXPLAIN or by adding an option to turn
 it off. Any suggestion on which would be preferable?

 I would be tempted to display it only if (COSTS OFF) is not given.

Yeah.  The alternatives seem to be:

1. Change a lot of regression tests.  This would be a serious PITA,
not so much for the one-time cost as for every time we needed to
back-patch a regression test that includes explain output.  -1.

2. Don't display the planning time by default, necessitating a new
PLANNING_TIME ON option.  This has backwards compatibility to
recommend it, but not much else.

3. Let COSTS OFF suppress it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planning time in explain/explain analyze

2013-12-23 Thread Robert Haas
On Mon, Dec 23, 2013 at 10:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Dec 23, 2013 at 9:54 PM, Andreas Karlsson andr...@proxel.se wrote:
 Yeah, forgot to mention that we need some way to disable it in the tests.
 Either by not having it included in EXPLAIN or by adding an option to turn
 it off. Any suggestion on which would be preferable?

 I would be tempted to display it only if (COSTS OFF) is not given.

 Yeah.  The alternatives seem to be:

 1. Change a lot of regression tests.  This would be a serious PITA,
 not so much for the one-time cost as for every time we needed to
 back-patch a regression test that includes explain output.  -1.

 2. Don't display the planning time by default, necessitating a new
 PLANNING_TIME ON option.  This has backwards compatibility to
 recommend it, but not much else.

 3. Let COSTS OFF suppress it.

Another option would be to not display it by default, but make VERBOSE
show it.  However, I think making it controlled by COSTS is a better
fit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logging WAL when updating hintbit

2013-12-23 Thread Michael Paquier
On Sat, Dec 21, 2013 at 4:13 AM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Dec 20, 2013 at 9:06 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
 Michael Paquier wrote:
 On Fri, Dec 20, 2013 at 1:05 PM, Sawada Masahiko sawada.m...@gmail.com wrote:

  Sorry, the patch which I attached had wrong indentation in pg_controldata.
  I have modified it and attached the new version of the patch.
 Now that you've sent this patch, I recall some recent email
 from Tom arguing against mixing lower- and upper-case characters
 in a GUC parameter name:
 http://www.postgresql.org/message-id/30569.1384917...@sss.pgh.pa.us

 To fulfill this requirement, could you replace walLogHints with
 wal_log_hints in your patch? Thoughts from others?

 The issue is with the user-visible variables, not with internal
 variables implementing them.  I think the patch is sane.  (Other than
 the fact that it was posted as a patch-on-patch instead of
 patch-on-master).

 But spelling it the same way everywhere really improves greppability.
Yep, finding all the code paths with a single keyword is usually a
good thing. Attached is a purely aesthetic patch that renames the
internal variable to wal_log_hints.
-- 
Michael
commit a727578f38ae6574d36fc840ec903cc2a6f4a860
Author: Michael Paquier mich...@otacoo.com
Date:   Tue Dec 24 22:56:13 2013 +0900

Rename internal variable of wal_log_hints for better consistency

Purely aesthetic patch...

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1277e71..43a0416 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -79,7 +79,7 @@ bool		XLogArchiveMode = false;
 char	   *XLogArchiveCommand = NULL;
 bool		EnableHotStandby = false;
 bool		fullPageWrites = true;
-bool		walLogHints = false;
+bool		wal_log_hints = false;
 bool		log_checkpoints = false;
 int			sync_method = DEFAULT_SYNC_METHOD;
 int			wal_level = WAL_LEVEL_MINIMAL;
@@ -5271,7 +5271,7 @@ BootStrapXLOG(void)
 	ControlFile->max_prepared_xacts = max_prepared_xacts;
 	ControlFile->max_locks_per_xact = max_locks_per_xact;
 	ControlFile->wal_level = wal_level;
-	ControlFile->wal_log_hints = walLogHints;
+	ControlFile->wal_log_hints = wal_log_hints;
 	ControlFile->data_checksum_version = bootstrap_data_checksum_version;
 
 	/* some additional ControlFile fields are set in WriteControlFile() */
@@ -9060,7 +9060,7 @@ static void
 XLogReportParameters(void)
 {
 	if (wal_level != ControlFile->wal_level ||
-		walLogHints != ControlFile->wal_log_hints ||
+		wal_log_hints != ControlFile->wal_log_hints ||
 		MaxConnections != ControlFile->MaxConnections ||
 		max_worker_processes != ControlFile->max_worker_processes ||
 		max_prepared_xacts != ControlFile->max_prepared_xacts ||
@@ -9083,7 +9083,7 @@ XLogReportParameters(void)
 			xlrec.max_prepared_xacts = max_prepared_xacts;
 			xlrec.max_locks_per_xact = max_locks_per_xact;
 			xlrec.wal_level = wal_level;
-			xlrec.wal_log_hints = walLogHints;
+			xlrec.wal_log_hints = wal_log_hints;
 
 			rdata.buffer = InvalidBuffer;
 			rdata.data = (char *) &xlrec;
@@ -9098,7 +9098,7 @@ XLogReportParameters(void)
 		ControlFile->max_prepared_xacts = max_prepared_xacts;
 		ControlFile->max_locks_per_xact = max_locks_per_xact;
 		ControlFile->wal_level = wal_level;
-		ControlFile->wal_log_hints = walLogHints;
+		ControlFile->wal_log_hints = wal_log_hints;
 		UpdateControlFile();
 	}
 }
@@ -9485,7 +9485,7 @@ xlog_redo(XLogRecPtr lsn, XLogRecord *record)
 		ControlFile->max_prepared_xacts = xlrec.max_prepared_xacts;
 		ControlFile->max_locks_per_xact = xlrec.max_locks_per_xact;
 		ControlFile->wal_level = xlrec.wal_level;
-		ControlFile->wal_log_hints = walLogHints;
+		ControlFile->wal_log_hints = wal_log_hints;
 
 		/*
 		 * Update minRecoveryPoint to ensure that if recovery is aborted, we
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index a605363..f8261b6 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -876,7 +876,7 @@ static struct config_bool ConfigureNamesBool[] =
 			gettext_noop("Writes full pages to WAL when first modified after a checkpoint, even for a non-critical modifications"),
 			NULL
 		},
-		&walLogHints,
+		&wal_log_hints,
 		false,
 		NULL, NULL, NULL
 	},
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 954486a..91ba0d1 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -189,7 +189,7 @@ extern bool XLogArchiveMode;
 extern char *XLogArchiveCommand;
 extern bool EnableHotStandby;
 extern bool fullPageWrites;
-extern bool walLogHints;
+extern bool wal_log_hints;
 extern bool log_checkpoints;
 extern int	num_xloginsert_slots;
 
@@ -221,7 +221,7 @@ extern int	wal_level;
  * of the bits make it to disk, but the checksum wouldn't match.  Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || walLogHints)
+#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)

Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-12-23 Thread Robert Haas
On Mon, Dec 23, 2013 at 5:59 PM, Peter Geoghegan p...@heroku.com wrote:
 On Mon, Dec 23, 2013 at 7:49 AM, Robert Haas robertmh...@gmail.com wrote:
 I don't think this is a project to rush through.  We've lived without
 MERGE/UPSERT for several years now, and we can live without it for
 another release cycle while we try to reach agreement on the way
 forward.  I can tell that you're convinced you know the right way
 forward here, and you may be right, but I don't think you've convinced
 everyone else - maybe not even anyone else.

 That may be. Attention from reviewers has been in relatively short
 supply. Not that that isn't always true.

I think concrete concerns about usability have largely been
subordinated to abstruse discussions about locking protocols.  A
discussion strictly about what syntax people would consider
reasonable, perhaps on another thread, might elicit broader
participation (although this week might not be the right time to try
to attract an audience).

 Having looked at the code for the first time recently, I'd agree that
 hash indexes are a disaster. A major advantage of The Lehman and Yao
 Algorithm, as prominently noted in the paper, is that exclusive locks
 are only acquired on leaf pages to increase concurrency. Since I only
 propose to extend this to a heavyweight page lock, and still only for
 an instant, it seems reasonable to assume that the performance will be
 acceptable for an initial version of this. It's not as if most places
 will have to pay any heed to this heavyweight lock - index scans and
 non-upserting inserts are generally unaffected. We can later optimize
 performance as we measure a need to do so. Early indications are that
 the performance is reasonable.

OK.

 To be honest, I am still not altogether sold on any part of this
 feature.  I don't like the fact that it violates MVCC - although I
 admit that some form of violation is inevitable in any feature in this
 area unless we're content to live with many serialization failures, I
 don't like the particular way it violates MVCC

 Discussions around visibility issues have not been very useful. As
 I've said, I don't like the term MVCC violation, because it's
 suggestive of some classical, codified definition of MVCC, a
 definition that doesn't actually exist anywhere, even in research
 papers, AFAICT.

I don't know whether or not that's literally true, but like Potter
Stewart, I don't think there's any real ambiguity about the underlying
concept.  The concepts of read-write, write-read, and write-write
dependencies between transactions are well-described in textbooks such
as Jim Gray's Transaction Processing: Concepts and Techniques and this
paper on MVCC:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.142.552&rep=rep1&type=pdf

I think the definition of an MVCC violation is that a snapshot sees
the effects of a transaction which committed after that snapshot was
taken.  And maybe it's good and right that this patch is introducing a
new way for that to happen, or maybe it's not, but the point is that
we get to decide.
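
To make that definition concrete with stock PostgreSQL (no patch
involved; table t is hypothetical):

-- Session 1:
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM t;   -- snapshot taken here; say it returns 10

-- Session 2, committing after session 1's snapshot:
INSERT INTO t DEFAULT VALUES;

-- Session 1:
SELECT count(*) FROM t;   -- still 10; seeing 11 here would be an MVCC
                          -- violation in the sense above
COMMIT;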

 I've been very consistent even in the face of strong criticism. What I
 have now is essentially the same design as back in early September.
 After the initial ON DUPLICATE KEY IGNORE patch in August, I soon
 realized that value locking and row locking could not be sensibly
 considered in isolation, and over the objections of others pushed
 ahead with integrating the two. I believe now as I believed then that
 value locks need to be cheap to release (or it at least needs to be
 possible), and that it was okay to drop all value locks when we need
 to deal with a possible conflict/getting an xid shared lock - if those
 unlocked pages have separate conflicts on our next attempt, the
 feature is being badly misused (for upserting) or it doesn't matter
 because we only need one conclusive No answer (for IGNOREing, but
 also for upserting).

I'm not saying that you haven't been consistent, or that you've done
anything wrong at all.  I'm just saying that the default outcome is
that we change nothing, and the fact that nobody's been able to
demonstrate an approach is clearly superior to what you've proposed
does not mean we have to accept what you've proposed.  I am not
necessarily advocating for rejecting your proposed approach, although
I do have concerns about it, but I think it is clear that it is not
backed by any meaningful amount of consensus.  Maybe that will change
in the next two months, and maybe it won't.  If it doesn't, whether
through any fault of yours or not, I don't think this is going in.  If
this is all perfectly clear to you already, then I apologize for
belaboring the point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Fix typo in src/backend/utils/mmgr/README

2013-12-23 Thread Etsuro Fujita
This is a small patch to fix a typo in src/backend/utils/mmgr/README.

Thanks,

Best regards,
Etsuro Fujita


typofix.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] varattno remapping

2013-12-23 Thread Craig Ringer
Hi all

I'm finding it necessary to remap the varno and varattno of Var elements
in RETURNING lists as part of updatable security-barrier view processing. A
reference to a view col in RETURNING must be rewritten to point to the
new resultRelation at the top level, so that the effects of BEFORE
triggers are visible in the returned result.

Usually the view will have a different structure from the base table, and
in this case it's necessary to change the varattno of the Var to point
to the corresponding col on the base rel.

So the short version: Given the RTE for a simple view with only one base
rel and an attribute number for a col in that view, and assuming that
the view col is a simple reference to a col in the underlying base rel,
is there any sane way to get the attribute number for the corresponding
col on the base rel?



For example, given:

CREATE TABLE t (a integer, b text, c numeric);
CREATE VIEW v1 AS SELECT c, a FROM t;
UPDATE v1 SET c = NUMERIC '42' RETURNING a, c;

... the Vars for a and c in the RETURNING clause need to be remapped so
their varno is the rti for t once the RTE has been injected, and the
varattnos need changing to the corresponding attribute numbers on the
base table. So a Var for v1(c), say (1,1), must be remapped to (2,3)
i.e. varno 2 (t), varattno 3 (c).

I'm aware that this is somewhat like the var/attr twiddling being
complained about in the RLS patch. I don't see a way around it, though.
I can't push the RETURNING tlist entries down into the expanded view's
subquery and add an outer Var reference - they must point directly to
the resultRelation so that we see the effects of any BEFORE triggers.

I'm looking for advice on how to determine, given an RTI and an
attribute number for a simple view, what the attribute number of the col
in the view's base relation is. It can be assumed for this purpose that
the view attribute is a simple Var (otherwise we'll ERROR out, since we
can't update an expression).

I'm a bit stumped at this point.

I could adapt the ChangeVarNodes walker to handle the actual recursive
rewriting and Var changing. It assumes that the varattno is the same on
both the old and new varno, as it's intended for when you have an RT
index change, but could be taught to optionally remap varattnos. To do
that, though, I need a way to actually do that varattno remapping cleanly.

Anyone got any ideas?
--
Craig Ringer


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] varattno remapping

2013-12-23 Thread Craig Ringer
On 12/24/2013 02:35 PM, Craig Ringer wrote:

 So the short version: Given the RTE for a simple view with only one base
 rel and an attribute number for a col in that view, and assuming that
 the view col is a simple reference to a col in the underlying base rel,
 is there any sane way to get the attribute number for the corresponding
 col on the base rel?

So, of course, I find it as soon as I post.

map_variable_attnos(...), also in src/backend/rewrite/rewriteManip.c .

Just generate the mapping table and go.
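
Roughly, for the example upthread (a sketch; remap_returning_list and
its parameters are names I've made up, but map_variable_attnos() and
ChangeVarNodes() are the real rewriteManip.c entry points):

#include "postgres.h"
#include "rewrite/rewriteManip.h"

/*
 * Rewrite Vars in a RETURNING list that reference the view (at RT index
 * view_rti) so they reference the base rel (result_rti) with remapped
 * attribute numbers.  attno_map[view_attno - 1] gives the base rel's
 * attno; for the v1/t example it would be {3, 1}.
 */
static List *
remap_returning_list(List *returning_list, int view_rti, int result_rti,
					 const AttrNumber *attno_map, int map_length)
{
	bool		found_whole_row;

	/* First remap the varattnos of Vars pointing at the view... */
	returning_list = (List *)
		map_variable_attnos((Node *) returning_list, view_rti, 0,
							attno_map, map_length, &found_whole_row);

	/* ...then repoint those Vars at the base relation's RT index. */
	ChangeVarNodes((Node *) returning_list, view_rti, result_rti, 0);

	return returning_list;
}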

Sorry for the noise folks.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] varattno remapping

2013-12-23 Thread Abbas Butt
On Tue, Dec 24, 2013 at 11:47 AM, Craig Ringer cr...@2ndquadrant.com wrote:

 On 12/24/2013 02:35 PM, Craig Ringer wrote:

  So the short version: Given the RTE for a simple view with only one base
  rel and an attribute number for a col in that view, and assuming that
  the view col is a simple reference to a col in the underlying base rel,
  is there any sane way to get the attribute number for the corresponding
  col on the base rel?

 So, of course, I find it as soon as I post.

 map_variable_attnos(...), also in src/backend/rewrite/rewriteManip.c .

 Just generate the mapping table and go.


Could you please explain a little bit more how you would solve the posed
problem using map_variable_attnos?

I was recently working on a similar problem and used the following
algorithm to solve it.

I had to find which column of the base table a column in the SELECT
statement of the view query belongs to.
To relate a target list entry in the view's SELECT query to an actual
column in the base table, here is what I did:

First, determine whether the var's varno refers to an RTE which is a view.
If yes, then get the view query (RangeTblEntry::subquery) and see which
element in the view query's target list this var's varattno points to.
Get the varno of that target list entry. Look up that RTE in the view's
query; if that RTE is a real table, then use that var, making sure its
varno now points to the actual table.

Thanks in advance.




 Sorry for the noise folks.

 --
  Craig Ringer   http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




-- 
Abbas
Architect

Ph: 92.334.5100153
Skype ID: gabbasb
www.enterprisedb.com

Follow us on Twitter
@EnterpriseDB

Visit EnterpriseDB for tutorials, webinars, whitepapers and more:
http://www.enterprisedb.com/resources-community


Re: [HACKERS] trailing comment ghost-timing

2013-12-23 Thread Martijn van Oosterhout
On Tue, Dec 24, 2013 at 03:40:58AM +0100, Andreas Karlsson wrote:
 On 12/24/2013 03:17 AM, David Johnston wrote:
 It is not sent to the server as a trailing comment. The following
 file is sent to the server like this.
 
 File:
 /**/;
 /**/
 
 Commands:
 PQexec(..., /**/;);
 PQexec(..., /**/);
 
 If this has to be fixed it should be in the client. I think people
 would complain if we broke the API by starting to throw an exception
 on PQexec with a string containing no actual query.

(Untested). Isn't this just a case of psql not printing out a timing if
the server responds with PGRES_EMPTY_QUERY?
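
Something along these lines in psql's result handling, I'd guess
(sketch only; timing_enabled and elapsed_msec stand in for psql's
actual bookkeeping, but PQresultStatus() and PGRES_EMPTY_QUERY are
real libpq):

	PGresult   *res = PQexec(conn, query);

	if (timing_enabled && PQresultStatus(res) != PGRES_EMPTY_QUERY)
		printf("Time: %.3f ms\n", elapsed_msec);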

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature