[HACKERS] New pg_lsn type doesn't have hash/btree opclasses

2014-05-06 Thread Andres Freund
Hi,

Craig just mentioned in an internal chat that there's no btree or even
hash opclass for the new pg_lsn type. That restricts what you can do
with it quite severely.
Imo this should be fixed for 9.4 - after all it was possible up to now to
index a table with lsns returned by system functions or have queries
with grouping on them without casting.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible fix for occasional failures on castoroides etc

2014-05-06 Thread Dave Page
On Sat, May 3, 2014 at 8:29 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-05-03 13:25:32 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2012-09-17 08:23:01 -0400, Dave Page wrote:
  I've added MAX_CONNECTIONS=5 to both Castoroides and Protosciurus.

  I've just noticed (while checking whether backporting 4c8aa8b5aea caused
  problems) that this doesn't seem to have fixed the issue. One further
  thing to try would be to check whether TCP connections have the same
  problem.

 I did some googling on this, and found out that people have seen identical
 behavior on Solaris with mysql and other products, so at least we're not
 alone.

 Yea, I found a couple of reports of that as well.

  Googling also reminded me that we could have a look at the source
 (duh), which is still available from hg.openindiana.org.

 I didn't get that far ;)

 I think we should check whether the problem disappears if TCP connections
 are used. That path ought to be much more heavily used in the real
 world, and thus less likely to be buggy.

 While it's not documented as such, passing --host=localhost to
 pg_regress seems to have the desired effect. Dave, could you make your
 animal specify that?

I've added:

EXTRA_REGRESS_OPTS = '--host=localhost',

to the build_env setting for both animals.


-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] doPickSplit stack buffer overflow in XLogInsert?

2014-05-06 Thread Heikki Linnakangas

On 03/31/2014 09:08 PM, Robert Haas wrote:

On Wed, Mar 26, 2014 at 9:45 PM, Peter Geoghegan p...@heroku.com wrote:

On Wed, Nov 27, 2013 at 9:10 AM, Noah Misch n...@leadboat.com wrote:

The threat is that rounding the read size up to the next MAXALIGN would cross
into an unreadable memory page, resulting in a SIGSEGV.  Every palloc chunk
has MAXALIGN'd size under the hood, so the excess read of toDelete cannot
cause a SIGSEGV.  For a stack variable, it depends on the ABI.  I'm not aware
of an ABI where the four bytes past the end of this stack variable could be
unreadable, which is not to claim I'm well-read on the topic.  We should fix
this in due course on code hygiene grounds, but I would not back-patch it.


The attached patch silences the "Invalid read of size n" complaints from
Valgrind. I agree with your general thoughts around backpatching. Note
that the patch addresses a distinct complaint from Kevin's, as
Valgrind doesn't take issue with the invalid reads past the end of
spgxlogPickSplit variables on the stack.


Is the needless zeroing this patch introduces apt to cause a
performance problem?

This function is actually pretty wacky.  If we're stuffing bytes with
undefined contents into the WAL record, maybe the answer isn't to
force the contents of those bytes to be defined, but rather to elide
them from the WAL record.


Agreed. I propose the attached, which removes the MAXALIGNs. It's not 
suitable for backpatching, though, as it changes the format of the WAL 
record.


- Heikki
diff --git a/src/backend/access/spgist/spgdoinsert.c b/src/backend/access/spgist/spgdoinsert.c
index 48f32cd..ff403ef 100644
--- a/src/backend/access/spgist/spgdoinsert.c
+++ b/src/backend/access/spgist/spgdoinsert.c
@@ -533,9 +533,9 @@ moveLeafs(Relation index, SpGistState *state,
 	{
 		XLogRecPtr	recptr;
 
-		ACCEPT_RDATA_DATA(&xlrec, MAXALIGN(sizeof(xlrec)), 0);
-		ACCEPT_RDATA_DATA(toDelete, MAXALIGN(sizeof(OffsetNumber) * nDelete), 1);
-		ACCEPT_RDATA_DATA(toInsert, MAXALIGN(sizeof(OffsetNumber) * nInsert), 2);
+		ACCEPT_RDATA_DATA(&xlrec, SizeOfSpgxlogMoveLeafs, 0);
+		ACCEPT_RDATA_DATA(toDelete, sizeof(OffsetNumber) * nDelete, 1);
+		ACCEPT_RDATA_DATA(toInsert, sizeof(OffsetNumber) * nInsert, 2);
 		ACCEPT_RDATA_DATA(leafdata, leafptr - leafdata, 3);
 		ACCEPT_RDATA_BUFFER(current->buffer, 4);
 		ACCEPT_RDATA_BUFFER(nbuf, 5);
@@ -1116,7 +1116,7 @@ doPickSplit(Relation index, SpGistState *state,
 
 	leafdata = leafptr = (char *) palloc(totalLeafSizes);
 
-	ACCEPT_RDATA_DATA(&xlrec, MAXALIGN(sizeof(xlrec)), 0);
+	ACCEPT_RDATA_DATA(&xlrec, SizeOfSpgxlogPickSplit, 0);
 	ACCEPT_RDATA_DATA(innerTuple, innerTuple->size, 1);
 	nRdata = 2;
 
@@ -1152,7 +1152,7 @@ doPickSplit(Relation index, SpGistState *state,
 		{
 			xlrec.nDelete = nToDelete;
 			ACCEPT_RDATA_DATA(toDelete,
-			  MAXALIGN(sizeof(OffsetNumber) * nToDelete),
+			  sizeof(OffsetNumber) * nToDelete,
 			  nRdata);
 			nRdata++;
 			ACCEPT_RDATA_BUFFER(current->buffer, nRdata);
@@ -1251,13 +1251,9 @@ doPickSplit(Relation index, SpGistState *state,
 	}
 
 	xlrec.nInsert = nToInsert;
-	ACCEPT_RDATA_DATA(toInsert,
-	  MAXALIGN(sizeof(OffsetNumber) * nToInsert),
-	  nRdata);
+	ACCEPT_RDATA_DATA(toInsert, sizeof(OffsetNumber) * nToInsert, nRdata);
 	nRdata++;
-	ACCEPT_RDATA_DATA(leafPageSelect,
-	  MAXALIGN(sizeof(uint8) * nToInsert),
-	  nRdata);
+	ACCEPT_RDATA_DATA(leafPageSelect, sizeof(uint8) * nToInsert, nRdata);
 	nRdata++;
 	ACCEPT_RDATA_DATA(leafdata, leafptr - leafdata, nRdata);
 	nRdata++;
diff --git a/src/backend/access/spgist/spgxlog.c b/src/backend/access/spgist/spgxlog.c
index 1689324..cd6a4b2 100644
--- a/src/backend/access/spgist/spgxlog.c
+++ b/src/backend/access/spgist/spgxlog.c
@@ -205,7 +205,7 @@ static void
 spgRedoMoveLeafs(XLogRecPtr lsn, XLogRecord *record)
 {
 	char	   *ptr = XLogRecGetData(record);
-	spgxlogMoveLeafs *xldata = (spgxlogMoveLeafs *) ptr;
+	spgxlogMoveLeafs *xldata;
 	SpGistState state;
 	OffsetNumber *toDelete;
 	OffsetNumber *toInsert;
@@ -213,18 +213,19 @@ spgRedoMoveLeafs(XLogRecPtr lsn, XLogRecord *record)
 	Buffer		buffer;
 	Page		page;
 
-	fillFakeState(&state, xldata->stateSrc);
-
+	xldata = (spgxlogMoveLeafs *) ptr;
 	nInsert = xldata->replaceDead ? 1 : xldata->nMoves + 1;
 
-	ptr += MAXALIGN(sizeof(spgxlogMoveLeafs));
+	ptr += SizeOfSpgxlogMoveLeafs;
 	toDelete = (OffsetNumber *) ptr;
-	ptr += MAXALIGN(sizeof(OffsetNumber) * xldata->nMoves);
+	ptr += sizeof(OffsetNumber) * xldata->nMoves;
 	toInsert = (OffsetNumber *) ptr;
-	ptr += MAXALIGN(sizeof(OffsetNumber) * nInsert);
+	ptr += sizeof(OffsetNumber) * nInsert;
 
 	/* now ptr points to the list of leaf tuples */
 
+	fillFakeState(&state, xldata->stateSrc);
+
 	/*
 	 * In normal operation we would have all three pages (source, dest, and
 	 * parent) locked simultaneously; but in WAL replay it should be safe to
@@ -252,10 +253,16 @@ spgRedoMoveLeafs(XLogRecPtr lsn, XLogRecord *record)
 
 				for (i = 0; i < nInsert; i++)
 {
-	SpGistLeafTuple lt = 

Re: [HACKERS] Sequential disk access during VACUUM for GiST/GIN

2014-05-06 Thread Alexander Korotkov
Hi!

On Tue, Apr 29, 2014 at 2:34 PM, Костя Кузнецов chapae...@yandex.ru wrote:

 There is a task "Sequential disk access during VACUUM for GiST/GIN" in
 the GSOC14 list.
 Is nobody working on this task?


I haven't heard that anybody is working on it.


 Do I understand this task correctly?
 I must recode gistbulkdelete.
 The GistBDItem *stack must have items with blkno as sequential as possible.


Yes, make gistbulkdelete and ginbulkdelete access the disk sequentially;
right now the tree is traversed in logical order. So these functions need to
be completely reworked: I'm not sure GistBDItem will survive :)
The challenge is concurrency. Vacuum shouldn't block concurrent readers and
writers. You can look at btbulkdelete, which supports sequential disk access
now.


 I have a question:
 where is the disk access in this function? ReadBufferExtended?


Yes, this function reads the buffer into shared memory (if it isn't there
already) and pins it.
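
To make that a bit more concrete, a physical-order scan would look roughly
like the sketch below. This is only a sketch of the general idea, not
GiST-aware code: it ignores concurrency entirely (which, as said, is the hard
part) and just assumes the IndexVacuumInfo that gistbulkdelete already
receives.

	/* very rough sketch: walk the index in physical block order */
	Relation	rel = info->index;
	BlockNumber npages = RelationGetNumberOfBlocks(rel);
	BlockNumber blkno;

	for (blkno = GIST_ROOT_BLKNO; blkno < npages; blkno++)
	{
		Buffer		buffer;

		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, blkno,
									RBM_NORMAL, info->strategy);
		LockBuffer(buffer, GIST_EXCLUSIVE);
		/* ... if it's a leaf page, delete tuples the callback says are dead ... */
		UnlockReleaseBuffer(buffer);
	}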

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] using array of char pointers gives wrong results

2014-05-06 Thread Michael Meskes
 PFA patch ecpg_char_ptr_arr.patch to fix this issue. It has changes as
 follows
 ...

Thanks for finding and fixing the problem. Patch applied to the complete 9.x
series with only a minor change.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
Jabber: michael.meskes at gmail dot com
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: [HACKERS] Problem with txid_snapshot_in/out() functionality)

2014-05-06 Thread Andres Freund
Hi,

On 2014-05-05 13:41:00 +0300, Heikki Linnakangas wrote:
 I came up with the attached fix for this. Currently, the entry is implicitly
 considered dead or unlocked if the locking_xid transaction is no longer
 active, but this patch essentially turns locking_xid into a simple boolean,
 and makes it the backend's responsibility to clear it on abort. (it's not
 actually a boolean, it's a BackendId, but that's just for debugging purposes
 to track who's keeping the entry locked). This requires a process exit hook,
 and an abort hook, to make sure the entry is always released, but that's not
 too difficult. It allows the backend to release the entry at exactly the
 right time, instead of having it implicitly released by

 I considered Andres' idea of using a new heavy-weight lock, but didn't like
 it much. It would be a larger patch, which is not nice for back-patching.
 One issue would be that if you run out of lock memory, you could not roll
 back any prepared transactions, which is not nice because it could be a
 prepared transaction that's hoarding the lock memory.

I am not convinced by the latter reasoning but you're right that any
such change would hardly be backpatchable.

 +/*
 + * Exit hook to unlock the global transaction entry we're working on.
 + */
 +static void
 +AtProcExit_Twophase(int code, Datum arg)
 +{
 + /* same logic as abort */
 + AtAbort_Twophase();
 +}
 +
 +/*
 + * Abort hook to unlock the global transaction entry we're working on.
 + */
 +void
 +AtAbort_Twophase(void)
 +{
 + if (MyLockedGxact == NULL)
 + return;
 +
 + /*
 +  * If we were in process of preparing the transaction, but haven't
 +  * written the WAL record yet, remove the global transaction entry.
 +  * Same if we are in the process of finishing an already-prepared
 +  * transaction, and fail after having already written the WAL 2nd
 +  * phase commit or rollback record.
 +  *
 +  * After that it's too late to abort, so just unlock the 
 GlobalTransaction
 +  * entry.  We might not have transfered all locks and other state to the
 +  * prepared transaction yet, so this is a bit bogus, but it's the best 
 we
 +  * can do.
 +  */
 + if (!MyLockedGxact->valid)
 + {
 + RemoveGXact(MyLockedGxact);
 + }
 + else
 + {
 + LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
 +
 + MyLockedGxact->locking_backend = InvalidBackendId;
 +
 + LWLockRelease(TwoPhaseStateLock);
 + }
 + MyLockedGxact = NULL;
 +}

Is it guaranteed that all paths have called LWLockReleaseAll()
before calling the proc exit hooks? Otherwise we might end up waiting
for ourselves...

  /*
   * MarkAsPreparing
 @@ -261,29 +329,15 @@ MarkAsPreparing(TransactionId xid, const char *gid,
errmsg("prepared transactions are disabled"),
 errhint("Set max_prepared_transactions to a nonzero value.")));
  
 - LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
 -
 - /*
 -  * First, find and recycle any gxacts that failed during prepare. We do
 -  * this partly to ensure we don't mistakenly say their GIDs are still
 -  * reserved, and partly so we don't fail on out-of-slots unnecessarily.
 -  */
 - for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
 + /* on first call, register the exit hook */
 + if (!twophaseExitRegistered)
   {
 - gxact = TwoPhaseState->prepXacts[i];
 - if (!gxact->valid && !TransactionIdIsActive(gxact->locking_xid))
 - {
 - /* It's dead Jim ... remove from the active array */
 - TwoPhaseState->numPrepXacts--;
 - TwoPhaseState->prepXacts[i] = 
 TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts];
 - /* and put it back in the freelist */
 - gxact->next = TwoPhaseState->freeGXacts;
 - TwoPhaseState->freeGXacts = gxact;
 - /* Back up index count too, so we don't miss scanning 
 one */
 - i--;
 - }
 + before_shmem_exit(AtProcExit_Twophase, 0);
 + twophaseExitRegistered = true;
   }

It's not particularly nice to register shmem exit hooks in the middle of
normal processing because it makes it impossible to use
cancel_before_shmem_exit() on previously registered hooks. I think this
should be registered at startup, if max_prepared_xacts > 0.
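
I.e. register it once, early, roughly like this (just a sketch; the function
name and placement are invented, not part of the posted patch):

	static void
	RegisterTwophaseExitHook(void)		/* hypothetical name/placement */
	{
		if (max_prepared_xacts > 0)
			before_shmem_exit(AtProcExit_Twophase, 0);
	}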

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_shmem_allocations view

2014-05-06 Thread Simon Riggs
On 5 May 2014 21:54, Robert Haas robertmh...@gmail.com wrote:
 On Mon, May 5, 2014 at 3:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sun, May 4, 2014 at 7:50 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
 Thinking about this, I think it was a mistake to not add a 'name' field
 to dynamic shared memory's dsm_control_item.

 Well, right now a dsm_control_item is 8 bytes.  If we add a name field
 of our usual 64 bytes, they'll each be 9 times bigger.

 And the controlled shared segment is likely to be how big exactly?  It's
 probably not even possible for it to be smaller than a page size, 4K or
 so depending on the OS.  I agree with Andres that a name would be a good
 idea; complaining about the space needed to hold it is penny-wise and
 pound-foolish.

 The control segment is sized to support a number of dynamic shared
 memory segments not exceeding 64 + 2 *MaxBackends.  With default
 settings, that currently works out to 288 segments, or 2306 bytes.
 So, adding a 64-byte name to each of those structures would increase
 the size from 2k to about 20k.

 So, sure, that's not a lot of memory.  But I'm still not convinced
 that it's very useful.  What I think is going to happen is that (1)
 most people won't be using dynamic shared memory at all, so they won't
 have any use for this; (2) those people who do run an extension that
 uses dynamic shared memory will most likely only be running one such
 extension, so they won't need a name to know what the segments are
 being used for; and (3) if and when we eventually get parallel query,
 dynamic shared memory segments will be widely used, but a bunch of
 segments that are all named parallel_query or parallel_query.$PID
 isn't going to be too informative.

Not sure your arguments hold any water.

Most people don't use most features... and so we're not allowed
features that can be debugged?
How do you know people will only use one extension that uses dshmem?
Why would we call multiple segments the same thing??

If names are a problem, let's give them numbers. Seems a minor point.
Perhaps we can allocate space for names dynamically??

Not being able to tell segments apart from each other is just daft, if
we are trying to supply bug free software for the world to use.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_shmem_allocations view

2014-05-06 Thread Andres Freund
On 2014-05-05 23:20:43 -0400, Robert Haas wrote:
 On Mon, May 5, 2014 at 6:54 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  I'm not confident that it'll be useful either.  But I am confident that
  if we don't put it in now, and decide we want it later, there will be
  complaints when we change the API.  Better to have an ignored parameter
  than no parameter.
 
 I'm generally skeptical of that philosophy.  If we put in an ignored
 parameter, people may pass pointers to NULL or to garbage or to an
 overly-long string, and they won't know it's broken until we make it
 do something; at which point their code will begin to fail without
 warning.

If it were a complex change, maybe. But I don't think that's likely
here.
Assert(name != NULL && strlen(name) > 0 && strlen(name) < NAMEDATALEN);
should perfectly do the trick.

 If we're going to do anything at all here for 9.4, I recommend
 ignoring the fact we're in feature freeze and going whole hog: add the
 name, add the monitoring view, and add the monitoring view for the
 main shared memory segment just for good measure.

We can do that as well. If there's agreement on that path I'll update
the patch to also show dynamic segments.

 Anyone who expects PostgreSQL's C API to be
 completely stable is going to be regularly disappointed, as most
 recently demonstrated by the Enormous Header Churn of the 9.3 era.  I
 don't particularly mind being the cause of further disappointment; as
 long as the breakage is obvious rather than subtle, the fix usually
 takes about 10 minutes.

Didn't you complain rather loudly about that change?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_shmem_allocations view

2014-05-06 Thread Robert Haas
On Tue, May 6, 2014 at 7:45 AM, Simon Riggs si...@2ndquadrant.com wrote:
 The control segment is sized to support a number of dynamic shared
 memory segments not exceeding 64 + 2 *MaxBackends.  With default
 settings, that currently works out to 288 segments, or 2306 bytes.
 So, adding a 64-byte name to each of those structures would increase
 the size from 2k to about 20k.

 So, sure, that's not a lot of memory.  But I'm still not convinced
 that it's very useful.  What I think is going to happen is that (1)
 most people won't be using dynamic shared memory at all, so they won't
 have any use for this; (2) those people who do run an extension that
 uses dynamic shared memory will most likely only be running one such
 extension, so they won't need a name to know what the segments are
 being used for; and (3) if and when we eventually get parallel query,
 dynamic shared memory segments will be widely used, but a bunch of
 segments that are all named parallel_query or parallel_query.$PID
 isn't going to be too informative.

 Not sure your arguments hold any water.

I'm not, either.

 Most people don't use most features... and so we're not allowed
 features that can be debugged?

I certainly didn't say that.

 How do you know people will only use one extension that uses dshmem?

I don't.  If they do, that's a good argument for adding this.

 Why would we call multiple segments the same thing??

It's not clear to me how someone is going to intelligently name
multiple segments used by the same extension.  Maybe they'll give them
all the same name.  Maybe they'll name them all extension_name.pid.
More than likely, different extensions will use different conventions.
 :-(

It might be sensible to add a creator PID field to the DSM control
items.  Of course, that PID might have exited, but it could still
possibly be useful for debugging purposes.

 If names are a problem, let's give them numbers. Seems a minor point.
 Perhaps we can allocate space for names dynamically??

A static buffer, as proposed by Andres, seems a lot simpler.
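
For concreteness, what's being proposed amounts to roughly this (a sketch
only; the actual struct layout in dsm.c may well end up looking different):

	typedef struct dsm_control_item
	{
		dsm_handle	handle;
		uint32		refcnt;		/* 2+ = active, 1 = moribund, 0 = gone */
		char		name[64];	/* proposed fixed-size identifying string */
	} dsm_control_item;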

 Not being able to tell segments apart from each other is just daft, if
 we are trying to supply bug free software for the world to use.

I can see I'm losing this argument.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Should we remove not fast promotion at all?

2014-05-06 Thread Simon Riggs
On 19 August 2013 09:20, Heikki Linnakangas hlinnakan...@vmware.com wrote:
 On 08.08.2013 20:15, Josh Berkus wrote:

 Bruce, all:

 We seem to be all over the map with the fast promotion code --- some
 people don't trust it, some people want an option to enable the old
 method, and some people want the old method removed.


 Having read over this thread, the only reason given for retaining any
 ability to use old promotion code is because people are worried about
 fast promotion being buggy.  This seems wrong.

 Either we have confidence in fast promotion, or we don't.  If we don't
 have confidence, then either (a) more testing is needed, or (b) it
 shouldn't be the default.  Again, here, we are coming up against our
 lack of any kind of broad replication failure testing.


 Well, I don't see much harm in keeping the old behavior as an undocumented
 escape hatch, as it is now. The way I'd phrase the current situation is
 this: 9.3 now always does fast promotion. However, for debugging and
 testing purposes, you can still trigger the old behavior by manually
 creating a file in $PGDATA. That should never be necessary in the field,
 however.

+1, again.

I have removed this item from the 9.4 open items list, since this
issue has already been resolved.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing xloginsert_slots?

2014-05-06 Thread Simon Riggs
On 29 January 2014 20:53, Andres Freund and...@2ndquadrant.com wrote:
 On 29 January 2014 20:51:38 CET, Robert Haas robertmh...@gmail.com wrote:
On Wed, Jan 29, 2014 at 8:22 AM, Andres Freund and...@2ndquadrant.com
wrote:
 Hi,

 On 2014-01-29 21:59:05 +0900, Michael Paquier wrote:
 The undocumented GUC called xloginsert_slots has been introduced by
 commit 9a20a9b. It is mentioned by the commit that this parameter
 should be removed before the release. Wouldn't it be a good time to
 remove this parameter soon? I imagine that removing it before the
beta
 would make sense so now is perhaps too early... Either way, attached
 is a patch doing so...

 I'd rather wait till somebody actually has done some benchmarks. I
don't
 think we're more clueful about it now than back when the patch went
 in. And such benchmarking is more likely during beta, so...

Well, it's either got to go away, or get documented, IMHO.

 Yes, all I am saying is that I'd like to wait till things have calmed down a 
 bit, so it actually makes sense to run bigger benchmarks. I don't think 
 removing the option is that urgent.


I do not want this removed until we have reasonable evidence that the
correct number is 8, and that it is useful for small, large, and
every other kind of config.

We may find evidence it is useful to be able to alter this in the
field and decide to keep it.

I suggest we maintain a Request for Beta Tests list, so people are
aware that they can (and should) test this, but it is not necessarily
functionality we would like to keep in the future.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_shmem_allocations view

2014-05-06 Thread Heikki Linnakangas

On 05/06/2014 02:59 PM, Robert Haas wrote:

Why would we call multiple segments the same thing??

It's not clear to me how someone is going to intelligently name
multiple segments used by the same extension.  Maybe they'll give them
all the same name.  Maybe they'll name them all extension_name.pid.
More than likely, different extensions will use different conventions.
  :-(


That seems sensible to me. The best scheme will depend on how the 
segments are used. Best to leave it to the extension author.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Fix use of free in walsender error handling after a sysid mismatch.

2014-05-06 Thread Heikki Linnakangas

On 05/05/2014 07:14 PM, Andres Freund wrote:

Hi,

Walsender does a PQclear(con) but then accesses data acquired with
PQgetvalue(). That's clearly not ok.


Fixed, thanks!

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_shmem_allocations view

2014-05-06 Thread Simon Riggs
On 6 May 2014 13:06, Heikki Linnakangas hlinnakan...@vmware.com wrote:

 The best scheme will depend on how the segments
 are used. Best to leave it to the extension author.

+1

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] possible dsm bug in dsm_attach()

2014-05-06 Thread Andres Freund
Hi,

dsm_attach() does the following:

nitems = dsm_control->nitems;
for (i = 0; i < nitems; ++i)
{
/* If the reference count is 0, the slot is actually unused. */
if (dsm_control->item[i].refcnt == 0)
continue;

/*
 * If the reference count is 1, the slot is still in use, but 
the
 * segment is in the process of going away.  Treat that as if we
 * didn't find a match.
 */
if (dsm_control->item[i].refcnt == 1)
break;

/* Otherwise, if the descriptor matches, we've found a match. */
if (dsm_control->item[i].handle == seg->handle)
{
dsm_control->item[i].refcnt++;
seg->control_slot = i;
break;
}
}

The break because of refcnt == 1 doesn't generally seem to be a good
idea. Why are we bailing if there's *any* segment that's in the process
of being removed? I think the check should be there *after* the
dsm_control->item[i].handle == seg->handle check?
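
I.e. something like this (untested):

nitems = dsm_control->nitems;
for (i = 0; i < nitems; ++i)
{
	/* If the reference count is 0, the slot is actually unused. */
	if (dsm_control->item[i].refcnt == 0)
		continue;

	/* Not the segment we're looking for. */
	if (dsm_control->item[i].handle != seg->handle)
		continue;

	/* It matches, but the segment is in the process of going away. */
	if (dsm_control->item[i].refcnt == 1)
		break;

	/* Found it. */
	dsm_control->item[i].refcnt++;
	seg->control_slot = i;
	break;
}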

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-05-06 Thread Neil Tiffin

On May 4, 2014, at 5:27 PM, Stephen Frost sfr...@snowman.net wrote:

 * Neil Tiffin (ne...@neiltiffin.com) wrote:
 On May 4, 2014, at 3:17 PM, Stephen Frost sfr...@snowman.net wrote:
 Any system where there exists a role similar to 'superuser' in the PG
 sense (a user who is equivilant to the Unix UID under which the rest of
 the system is run) would be hard-pressed to provide a solution to this
 issue.
 
 Not sure I understand which issue you are referring to.  If you are 
 referring to 'cannot be turned off', I would think a reasonable first pass 
 would be to handle it similar to '--data-checksums' in 'initdb'.  For 
 example, This option can only be set during initialization, and cannot be 
 changed later. If set, basic auditing is on for all objects, in all 
 databases.
 
 Well, except that a superuser *could* effectively turn off checksums by
 changing the control file and doing a restart (perhaps modulo some
 other hacking; I've not tried).  That kind of trivial 'hole' isn't
 acceptable from a security standpoint though and given that we couldn't
 prevent a superuser from doing an LD_PRELOAD and overriding any system
 call we make from the backend, it's kind of hard to see how we could
 plug such a hole.
 

Ah, I thought it would be more difficult than that for checksums. But in my
experience PostgreSQL does not have to prevent hacking; that is the
responsibility of other systems and procedures.  If the core code were such that,
once on, formal logging could not be turned off by any changes to config
files, settings, or SQL, then in my experience that would suffice.

 With SELinux it may be possible and I'd love to see an example
 from someone who feels they've accomplished it.  That said, if we can
 reduce the need for a 'superuser' role sufficiently by having the
 auditing able to be managed independently, then we may have reached the
 level of considerable headache.
 
 As many have pointed out previously, there is a certain amount of risk
 associated with running without *any* superuser role in the system
 
 If all of the superuser's actions are logged and it's not possible to turn 
 off the logging (without considerable headache) then it may not matter what 
 the superuser can do.  If the superuser makes changes and they are logged 
 then the auditors have sufficient information to see if the correct 
 procedures were followed.  Validated systems are based on tracking, not 
 necessarily prohibiting. Select individuals that should be able to be 
 trusted (which should apply to superusers) should be able to perform the 
 actions necessary to support the organization.
 
 Fair enough- the question is just a matter of what exactly that level of
 headache is.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] possible dsm bug in dsm_attach()

2014-05-06 Thread Robert Haas
On Tue, May 6, 2014 at 8:43 AM, Andres Freund and...@2ndquadrant.com wrote:
 dsm_attach() does the following:

 nitems = dsm_control->nitems;
 for (i = 0; i < nitems; ++i)
 {
 /* If the reference count is 0, the slot is actually unused. 
 */
 if (dsm_control->item[i].refcnt == 0)
 continue;

 /*
  * If the reference count is 1, the slot is still in use, but 
 the
  * segment is in the process of going away.  Treat that as if 
 we
  * didn't find a match.
  */
 if (dsm_control->item[i].refcnt == 1)
 break;

 /* Otherwise, if the descriptor matches, we've found a match. 
 */
 if (dsm_control->item[i].handle == seg->handle)
 {
 dsm_control->item[i].refcnt++;
 seg->control_slot = i;
 break;
 }
 }

 The break because of refcnt == 1 doesn't generally seem to be a good
 idea. Why are we bailing if there's *any* segment that's in the process
 of being removed? I think the check should be there *after* the
 dsm_control->item[i].handle == seg->handle check?

You are correct.  Good catch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgindent run

2014-05-06 Thread Bruce Momjian
On Sat, May  3, 2014 at 02:20:27PM -0400, Bruce Momjian wrote:
 I am planning to run pgindent in a few days to prepare for beta.  Does
 anyone have major patches that you are planning to apply soon?  If so, I
 can delay pgindent until you are done.  
 
 This run will also have a tabs-in-comments removal phase which will also
 be run on supported back branches.

Hearing nothing, I plan to pgindent head in an hour, around 14:00 GMT. 
I will also be removing some tabs in comments in all supported back
branches.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-05-06 Thread Stephen Frost
* Neil Tiffin (ne...@neiltiffin.com) wrote:
 On May 4, 2014, at 5:27 PM, Stephen Frost sfr...@snowman.net wrote:
  * Neil Tiffin (ne...@neiltiffin.com) wrote:
  Well, except that a superuser *could* effectively turn off checksums by
  changing the control file and doing a restart (perhaps modulo some
  other hacking; I've not tried).  That kind of trivial 'hole' isn't
  acceptable from a security standpoint though and given that we couldn't
  prevent a superuser from doing an LD_PRELOAD and overriding any system
  call we make from the backend, it's kind of hard to see how we could
  plug such a hole.
  
 
 Ah, I thought it would be more difficult than that for checksums. But in my
 experience PostgreSQL does not have to prevent hacking; that is the
 responsibility of other systems and procedures.  If the core code were such
 that, once on, formal logging could not be turned off by any changes to
 config files, settings, or SQL, then in my experience that would suffice.

We could set it up similar to how security labels work, where the
config file (which could be owned by 'root' and therefore unable to be
changed by a superuser) has an auditing setting and changing it requires
a restart (meaning that the config file would have to be modified to
change it, and the database restarted).  However, it might be possible
for a superuser to configure and start an independent postmaster with a
different configuration that points to the same database (or a copy of
it).

That's for a system-wide auditing setting, but if we actually want the
auditing to only be on certain database objects, it gets worse.  We
need to track what objects need the auditing and we'd do that using the
catalog, which a superuser can modify.  Security labels have
more-or-less the same issue, of course.

This is why we don't try to protect against superusers (and why I'm
hopeful that we can reduce the need for a superuser role to exist).
Again, we have to consider that a superuser essentially has a full shell
on the DB server as the user that the database runs under.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Comment patch for index-only scans

2014-05-06 Thread Robert Haas
On Sun, May 4, 2014 at 4:50 PM, Jeff Davis pg...@j-davis.com wrote:
 If I recall correctly, Tom pointed out a while back that the comment
 justifying the lockless read of the VM bit was not correct (or at least
 not complete).

 I rewrote it, but it was part of a patch that was not accepted. Attached
 is the comment patch only.

Looks reasonable to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] `pg_dump -Fd` doesn't check write return status...

2014-05-06 Thread Noah Misch
On Mon, May 05, 2014 at 08:29:56PM -0400, Bruce Momjian wrote:
 On Tue, Apr 22, 2014 at 03:19:15PM -0400, Bruce Momjian wrote:
  As is often the case with pg_dump, the problems you saw are a small part
  of a larger set of problems in that code --- there is general ignoring of
  read and write errors.
  
  I have developed a comprehensive patch that addresses all the issues I
  could find.  The use of function pointers and the calling of functions
  directly and through function pointers makes the fix quite complex.  I
  ended up placing checks at the lowest level and removing checks at
  higher layers where they were sporadically placed.  
  
  Patch attached.  I have tested dump/restore of all four dump output
  formats with the 9.3 regression database and all tests passed.  I
  believe this patch is too complex to backpatch, and I don't know how it
  could be fixed more simply.
 
 Patch applied.  Thanks for the report.

This broke automatic detection of tar-format dumps:

$ pg_dump -Ft -f tardump
$ pg_restore tardump
pg_restore: [archiver] could not read from input file: Success
$ pg_restore -Ft tardump | wc -c
736

Stack trace:

#0  vwrite_msg (modulename=0x4166c9 "archiver", fmt=0x41aa08 "could not read from input file: %s\n", ap=0x7fffde38) at pg_backup_utils.c:85
#1  0x0040f820 in exit_horribly (modulename=0x4166c9 "archiver", fmt=0x41aa08 "could not read from input file: %s\n") at parallel.c:190
#2  0x0040557d in _discoverArchiveFormat (AH=0x425340) at pg_backup_archiver.c:2040
#3  0x00405756 in _allocAH (FileSpec=0x7fffecbb "tardump", fmt=archUnknown, compression=0, mode=archModeRead, setupWorkerPtr=0x4044f0 <setupRestoreWorker>) at pg_backup_archiver.c:2160
#4  0x00405819 in OpenArchive (FileSpec=<optimized out>, fmt=<optimized out>) at pg_backup_archiver.c:148
#5  0x00404331 in main (argc=2, argv=0x7fffea28) at pg_restore.c:375

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sb_alloc: a new memory allocator for PostgreSQL

2014-05-06 Thread Greg Stark
On Tue, May 6, 2014 at 2:04 PM, Robert Haas robertmh...@gmail.com wrote:
  I also didn't find anything that looked like our
 memory context paradigm, and in particular the ability to cheaply
 reset a context, in any other allocator.


You probably knew this but just in case the term for this strategy is
called a ripcord allocator. I believe GCC had one, not sure if it
still uses it. I doubt any existing ones are especially helpful though
compared to looking at regular allocators and seeing how to integrate
them.

I assume you're also aware of lock-free data structures and the like.
A memory allocator seems like something that needs to be super aware
of causing concurrency bottlenecks since it has no idea where it'll be
used.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sb_alloc: a new memory allocator for PostgreSQL

2014-05-06 Thread Heikki Linnakangas

On 05/06/2014 04:04 PM, Robert Haas wrote:

Over the last several months, I've been working on a new memory
allocator for PostgreSQL.  While it's not done, and there are
problems, I think I've reached the point where it makes sense to get
this out in front of a wider audience and get some feedback,
preferably of the sort that doesn't do any permanent damage.  I
started out down this path because parallel sort will really be much
happier if it can allocate from either dynamic shared memory or
backend-local memory using the same interface.  Arguably, we could
implement parallel internal sort using some kind of bespoke memory
management strategy, like packing all of the tuples into the memory
segment tightly starting from the end, and growing the tuple array
from the beginning, but then what happens if we decide to give up on
parallelism and do an external sort after all?  Even if we could
engineer our way around that problem, I think there are bound to be
other things that people want to do with dynamic shared memory
segments where being able to easily allocate and free memory is
desirable.


As a generic remark, I wish that whatever parallel algorithms we will 
use won't need a lot of ad hoc memory allocations from shared memory. 
Even though we have dynamic shared memory now, complex data structures 
with a lot of pointers and different allocations are more painful to 
debug, tune, and make concurrency-safe. But I have no idea what exactly 
you have in mind, so I'll just have to take your word on it that this is 
sensible.



As I got into the problem a little bit further, I realized that
AllocSetAlloc is actually pretty poor allocator for sorting.  As far
as I can see, the basic goal of aset.c was to make context resets
cheap - which makes sense, because we have a lot of very short-lived
memory contexts.  However, when you're sorting a lot of data, being
able to reset the context quickly is not as important as using memory
efficiently, and AllocSetAlloc is pretty terrible at that, especially
for small allocations.  Each allocation is rounded up to a power of
two, and then we add 16 bytes of overhead on top of that (on 64-bit
systems), which means that palloc has 100% overhead for a large number
of 16-byte allocations, and 200% overhead for a large number of 8-byte
allocations.  The 16-byte overhead doesn't hurt quite as much on big
allocations, but the rounding to a power of two can sometimes be very
bad.  For example, for repeated 96-byte allocations, palloc has 50%
overhead.  I decided I wanted to come up with something better.


Yeah, I saw in some tests that about 50% of the memory used for catalog 
caches was waste caused by rounding up all the allocations to power-of-two.



I read through the literature and found that most modern allocators
seemed to use superblocks.  A superblock is a chunk of memory, perhaps
64kB, which is carved up into a bunch of chunks of equal size.  Those
chunks don't have individual headers; instead, the pointer address is
used to locate the metadata for the chunk.  Requests are binned into
size classes, which are generally much more fine-grained than the
powers-of-two strategy used by AllocSetAlloc.  Of course, too many
size classes can waste memory if you end up with many partially-full
superblocks, each for a different size class, but the consensus seems
to be that you make it up and more by wasting less memory within each
allocation (e.g. a 96 byte allocation can be exactly 96 bytes, rather
than 128 bytes).


Interesting work.

Another kind of memory allocator that I've played with in the past is a 
simple stack allocator that only supports wholesale release of the whole 
context and pfree() is a no-op. That's dead simple and very efficient 
when you don't need retail pfree(), but obviously cannot completely 
replace the current allocator.
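
Roughly like this, just as a sketch with invented names:

typedef struct StackContext
{
	char	   *base;		/* one big backing block */
	Size		used;		/* bump-pointer offset */
	Size		limit;		/* size of the backing block */
} StackContext;

static void *
stack_alloc(StackContext *cxt, Size size)
{
	char	   *ptr;

	size = MAXALIGN(size);
	if (cxt->used + size > cxt->limit)
		return NULL;		/* real code would grab another block */
	ptr = cxt->base + cxt->used;
	cxt->used += size;
	return ptr;
}

static void
stack_reset(StackContext *cxt)
{
	cxt->used = 0;			/* wholesale release; pfree() is a no-op */
}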


I wouldn't conflate shared memory with this. If a piece of code needs to 
work with either one, I think the way to go is to have some sort of 
wrapper functions that route the calls to either the shared or private 
memory allocator, similar to how the same interface is used to deal with 
local, temporary buffers and shared buffers.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] postgresql.auto.conf read from wrong directory

2014-05-06 Thread Christoph Berg
Hi,

if you split configuration and data by setting data_directory,
postgresql.auto.conf is written to the data directory
(/var/lib/postgresql/9.4/main in Debian), but it is read back from
the etc directory (/etc/postgresql/9.4/main).

One place to fix it would be ProcessConfigFile in
src/backend/utils/misc/guc-file.l:162, by always setting
CallingFileName = NULL there, but
that breaks later in AbsoluteConfigLocation() when data_directory is
NULL. (As the comment in ProcessConfigFile says.)

I'm not sure where to break that dependency loop - the fix itself
seems easy: don't look for postgresql.auto.conf next to
ConfigFileName, but in the data_directory.

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New pg_lsn type doesn't have hash/btree opclasses

2014-05-06 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 Craig just mentioned in an internal chat that there's no btree or even
 hash opclass for the new pg_lsn type. That restricts what you can do
 with it quite severely.
 Imo this should be fixed for 9.4 - after all it was possible up to now to
 index a table with lsns returned by system functions or have queries
 with grouping on them without casting.

Sorry, it is *way* too late for 9.4.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New pg_lsn type doesn't have hash/btree opclasses

2014-05-06 Thread Andres Freund
On 2014-05-06 09:37:54 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  Craig just mentioned in an internal chat that there's no btree or even
  hash opclass for the new pg_lsn type. That restricts what you can do
  with it quite severely.
  Imo this should be fixed for 9.4 - after all it was possible up to now to
  index a table with lsns returned by system functions or have queries
  with grouping on them without casting.
 
 Sorry, it is *way* too late for 9.4.

It's imo a regression/oversight introduced in the pg_lsn patch. Not a
new feature.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tab completion for setting search_path

2014-05-06 Thread Christoph Berg
Re: Jeff Janes 2014-05-05 
CAMkU=1yo97bcGR-z6wg-OJpHKfEcaaaS=x1n7xygxcuakv5...@mail.gmail.com
 I've personally never had a need to set the search_path to a system schema,
 and I guess I was implicitly modelling this on what is returned by \dn, not
 by \dnS.   I wouldn't object much to including them; that would be better
 than not having any completion.  I just don't see much point.
 
 And now playing a bit with the system ones, I think it would be more
 confusing to offer them.  pg_catalog and pg_temp_appropriate always get
 searched, whether you put them in the search_path or not.

Imho the system schemas should be included, because they don't hurt.
If you do tab completion, you'll usually enter a few chars and hit
tab, so you won't get confused by pg_catalog and information_schema
just because you won't see them. Also, it makes sense to explicitly
put pg_catalog at the beginning or the end of search_path, so tab
completion should support that.

I would opt to exclude pg_temp_*, though - these don't serve any
purpose SQL-wise and the name changes all the time anyway.

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sb_alloc: a new memory allocator for PostgreSQL

2014-05-06 Thread Robert Haas
On Tue, May 6, 2014 at 9:31 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 As a generic remark, I wish that whatever parallel algorithms we will use
 won't need a lot of ad hoc memory allocations from shared memory. Even
 though we have dynamic shared memory now, complex data structures with a lot
 of pointers and different allocations are more painful to debug, tune, and
 make concurrency-safe. But I have no idea what exactly you have in mind, so
 I'll just have to take your word on it that this is sensible.

Yeah, I agree.  Actually, I'm hoping that a lot of what we want to do
can be done using the shm_mq stuff, which uses a messaging paradigm.
If the queue is full, you wait for the consumer to read some data
before writing more.  That is much simpler and avoids a lot of
problems.
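
The pattern I have in mind is more or less the following (a sketch only:
error handling is omitted and the queue handles, the tuple arguments, and
process_tuple() are all invented; the shm_mq calls themselves are the real
API from shm_mq.h):

/* sender side: shm_mq_send() blocks while the ring buffer is full */
static void
send_one_tuple(shm_mq_handle *outqh, void *tuple_data, Size tuple_len)
{
	if (shm_mq_send(outqh, tuple_len, tuple_data, false) == SHM_MQ_DETACHED)
		elog(ERROR, "consumer has detached");
}

/* receiver side: shm_mq_receive() blocks until a whole message arrives */
static void
receive_one_tuple(shm_mq_handle *inqh)
{
	Size		nbytes;
	void	   *data;

	if (shm_mq_receive(inqh, &nbytes, &data, false) == SHM_MQ_SUCCESS)
		process_tuple(data, nbytes);	/* hypothetical consumer function */
}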

There are several problems with using pointers in dynamic shared
memory.  The ones I'm most concerned about are:

1. Segments are relocatable, so you can't actually use absolute
pointers.  Maybe someday we'll have a facility for dynamic shared
memory segments that are mapped at the same address in every process,
or maybe not, but right now we sure don't.

2. You've got to decide up-front how much memory to set aside for
dynamic allocation, and you can't easily change your mind later.  Some
of the DSM implementations support growing the segment, but you've got
to somehow get everyone who is using it to remap it, possibly at a
different address, so it's a long way from being transparent.
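
The usual dodge for (1) is to store offsets from the segment's mapped address
rather than raw pointers, and convert on access - a sketch, with the names
invented:

typedef Size dsm_offset;		/* 0 reserved to mean NULL */

static inline void *
dsm_offset_to_ptr(dsm_segment *seg, dsm_offset off)
{
	if (off == 0)
		return NULL;
	return (char *) dsm_segment_address(seg) + off;
}

That works, but it makes every pointer dereference a little more awkward and
error-prone.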

But that having been said, pointers are a pretty fundamental data
structure, and I think trying to get rid of them completely isn't
likely to be feasible.  For sorting, you need a big SortTuple array
and then that needs to point to the individual tuples.  I think that's
simple enough to be reasonable, and at any rate I don't see a simpler
approach.

 Yeah, I saw in some tests that about 50% of the memory used for catalog
 caches was waste caused by rounding up all the allocations to power-of-two.

Sadly, I can't see using this allocator for the catalog caches as-is.
The problem is that AllocSetAlloc can start those caches off with a
tiny 1kB allocation.  This allocator is intended to be efficient for
large contexts, so you start off with 4kB of superblock descriptors
and a 64kB chunk for each size class that is in use.  Very reasonable
for multi-megabyte allocations; not so hot for tiny ones.  There may
be a way to serve both needs, but I haven't found it yet.
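
To make the shape of the thing concrete, the chunk-to-metadata lookup in a
superblock scheme works roughly like this (a sketch of the general technique,
not the actual patch code; sizes, field names, and descriptor placement are
all invented):

#define SB_SUPERBLOCK_SIZE	(64 * 1024)		/* 64kB, carved into equal-size chunks */
#define SB_SUPERBLOCK_MASK	(~((uintptr_t) SB_SUPERBLOCK_SIZE - 1))

typedef struct sb_superblock
{
	uint16		size_class;		/* index into a table of chunk sizes */
	uint16		nfree;			/* chunks still free in this superblock */
} sb_superblock;

/*
 * Chunks carry no per-chunk header; masking the pointer gives the superblock
 * it lives in (here the descriptor sits at the start of the superblock, which
 * assumes superblocks are allocated on 64kB boundaries).
 */
static inline sb_superblock *
sb_superblock_of(void *chunk)
{
	return (sb_superblock *) ((uintptr_t) chunk & SB_SUPERBLOCK_MASK);
}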

 I wouldn't conflate shared memory with this. If a piece of code needs to
 work with either one, I think the way to go is to have some sort of wrapper
 functions that route the calls to either the shared or private memory
 allocator, similar to how the same interface is used to deal with local,
 temporary buffers and shared buffers.

Well, that has several disadvantages.  One of them is code
duplication.  This allocator could certainly be a lot simpler if it
only handled shared memory, or for that matter if it only handled
backend-private memory.  But if the right way to do allocation for
sorting is to carve chunks out of superblocks, then it's the right way
regardless of whether you're allocating from shared memory or
backend-private memory.  And if that's the wrong way, then it's wrong
for both.  Using completely different allocators for parallel sort and
non-parallel sort doesn't seem like a great idea to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New pg_lsn type doesn't have hash/btree opclasses

2014-05-06 Thread Michael Paquier
On Tue, May 6, 2014 at 4:14 PM, Andres Freund and...@2ndquadrant.com wrote:
 Hi,

 Craig just mentioned in an internal chat that there's no btree or even
 hash opclass for the new pg_lsn type. That restricts what you can do
 with it quite severely.
 Imo this should be fixed for 9.4 - after all it was possible up to now to
 index a table with lsns returned by system functions or have queries
 with grouping on them without casting.
Makes sense, especially since the operators needed for btree processing
are already defined. The attached patch solves that. Now, including it in
9.4 when development is over is another story... I wouldn't mind
adding it to the next commitfest either.
Regards,
-- 
Michael
commit 604177ac2af4bfd26a392b8669dd2fc3550f2a3d
Author: Michael Paquier mich...@otacoo.com
Date:   Tue May 6 22:42:09 2014 +0900

Support for hash and btree operators for pg_lsn datatype

The following functions are added in pg_lsn bundle for the different
index operators:
- pg_lsn_hash for hash index
- pg_lsn_cmp for btree index
Related catalogs are updated as well.

diff --git a/src/backend/utils/adt/pg_lsn.c b/src/backend/utils/adt/pg_lsn.c
index e2b528a..f7980f8 100644
--- a/src/backend/utils/adt/pg_lsn.c
+++ b/src/backend/utils/adt/pg_lsn.c
@@ -150,6 +150,24 @@ pg_lsn_ge(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(lsn1 >= lsn2);
 }
 
+/* handler for btree index operator */
+Datum
+pg_lsn_cmp(PG_FUNCTION_ARGS)
+{
+   XLogRecPtr lsn1 = PG_GETARG_LSN(0);
+   XLogRecPtr lsn2 = PG_GETARG_LSN(1);
+
+   PG_RETURN_INT32((lsn1 < lsn2) ? -1 : (lsn1 == lsn2) ? 0 : 1);
+}
+
+/* hash index support */
+Datum
+pg_lsn_hash(PG_FUNCTION_ARGS)
+{
+   XLogRecPtr lsn = PG_GETARG_LSN(0);
+
+   return DirectFunctionCall1(hashint8, Int64GetDatum((int64) lsn));
+}
 
 /*--
  * Arithmetic operators on PostgreSQL LSNs.
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
index 8efd3be..5cd170d 100644
--- a/src/include/catalog/pg_amop.h
+++ b/src/include/catalog/pg_amop.h
@@ -513,6 +513,16 @@ DATA(insert (  2968  2950 2950 4 s 2977 403 0 ));
 DATA(insert (  2968  2950 2950 5 s 2975 403 0 ));
 
 /*
+ * btree pglsn_ops
+ */
+
+DATA(insert (  3260  3220 3220 1 s 3224 403 0 ));
+DATA(insert (  3260  3220 3220 2 s 3226 403 0 ));
+DATA(insert (  3260  3220 3220 3 s 3222 403 0 ));
+DATA(insert (  3260  3220 3220 4 s 3227 403 0 ));
+DATA(insert (  3260  3220 3220 5 s 3225 403 0 ));
+
+/*
  * hash index _ops
  */
 
@@ -581,6 +591,8 @@ DATA(insert (   2231   1042 1042 1 s 1054 405 0 ));
 DATA(insert (  2235   1033 1033 1 s  974 405 0 ));
 /* uuid_ops */
 DATA(insert (  2969   2950 2950 1 s 2972 405 0 ));
+/* pglsn_ops */
+DATA(insert (  3261   3220 3220 1 s 3222 405 0 ));
 /* numeric_ops */
 DATA(insert (  1998   1700 1700 1 s 1752 405 0 ));
 /* array_ops */
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
index 198b126..369adb9 100644
--- a/src/include/catalog/pg_amproc.h
+++ b/src/include/catalog/pg_amproc.h
@@ -134,6 +134,7 @@ DATA(insert (   2789   27 27 1 2794 ));
 DATA(insert (  2968   2950 2950 1 2960 ));
 DATA(insert (  2994   2249 2249 1 2987 ));
 DATA(insert (  3194   2249 2249 1 3187 ));
+DATA(insert (  3260   3220 3220 1 3263 ));
 DATA(insert (  3522   3500 3500 1 3514 ));
 DATA(insert (  3626   3614 3614 1 3622 ));
 DATA(insert (  3683   3615 3615 1 3668 ));
@@ -174,6 +175,7 @@ DATA(insert (   2229   25 25 1 400 ));
 DATA(insert (  2231   1042 1042 1 1080 ));
 DATA(insert (  2235   1033 1033 1 329 ));
 DATA(insert (  2969   2950 2950 1 2963 ));
+DATA(insert (  3261   3220 3220 1 3262 ));
 DATA(insert (  3523   3500 3500 1 3515 ));
 DATA(insert (  3903   3831 3831 1 3902 ));
 DATA(insert (  4034   3802 3802 1 4045 ));
diff --git a/src/include/catalog/pg_opclass.h b/src/include/catalog/pg_opclass.h
index 49b2410..cbcdacd 100644
--- a/src/include/catalog/pg_opclass.h
+++ b/src/include/catalog/pg_opclass.h
@@ -215,6 +215,8 @@ DATA(insert (  2742  _reltime_ops    PGNSP PGUID 2745  1024 t 703 ));
 DATA(insert (  2742  _tinterval_ops  PGNSP PGUID 2745  1025 t 704 ));
 DATA(insert (  403   uuid_ops        PGNSP PGUID 2968  2950 t 0 ));
 DATA(insert (  405   uuid_ops        PGNSP PGUID 2969  2950 t 0 ));
+DATA(insert (  403   pglsn_ops       PGNSP PGUID 3260  3220 t 0 ));
+DATA(insert (  405   pglsn_ops       PGNSP PGUID 3261  3220 t 0 ));
 DATA(insert (  403   enum_ops        PGNSP PGUID 3522  3500 t 0 ));
 DATA(insert (  405   enum_ops        PGNSP PGUID 3523  3500 t 0 ));
 DATA(insert (  403   tsvector_ops    PGNSP PGUID 3626  3614 t 0 ));
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index f280af4..87ee4eb 100644
--- a/src/include/catalog/pg_operator.h

Re: [HACKERS] 9.4 release notes

2014-05-06 Thread Nicolas Barbier
2014-05-05 Bruce Momjian br...@momjian.us:

 On Mon, May  5, 2014 at 10:40:29AM -0700, Josh Berkus wrote:

 * ALTER SYSTEM SET

 Lemme know if you need description text for any of the above.

 OK, great!  Once I have the markup done, I will beef up the descriptions
 if needed and copy the text up to the major items section so we have
 that all ready for beta.

“Add SQL-level command ALTER SYSTEM command [..]”

Using “command” twice sounds weird to my ears. Wouldn’t leaving out
the second instance be better?

Nicolas

-- 
A. Because it breaks the logical sequence of discussion.
Q. Why is top posting bad?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] `pg_dump -Fd` doesn't check write return status...

2014-05-06 Thread Bruce Momjian
On Tue, May  6, 2014 at 09:22:20AM -0400, Noah Misch wrote:
 On Mon, May 05, 2014 at 08:29:56PM -0400, Bruce Momjian wrote:
  On Tue, Apr 22, 2014 at 03:19:15PM -0400, Bruce Momjian wrote:
   As is often the case with pg_dump, the problems you saw are a small part
   of a larger set of problems in that code --- there is general ignoring of
   read and write errors.
   
   I have developed a comprehensive patch that addresses all the issues I
   could find.  The use of function pointers and the calling of functions
   directly and through function pointers makes the fix quite complex.  I
   ended up placing checks at the lowest level and removing checks at
   higher layers where they were sporadically placed.  
   
   Patch attached.  I have tested dump/restore of all four dump output
   formats with the 9.3 regression database and all tests passed.  I
   believe this patch is too complex to backpatch, and I don't know how it
   could be fixed more simply.
  
  Patch applied.  Thanks for the report.
 
 This broke automatic detection of tar-format dumps:
 
 $ pg_dump -Ft -f tardump
 $ pg_restore tardump
 pg_restore: [archiver] could not read from input file: Success
 $ pg_restore -Ft tardump | wc -c
 736

OK, thanks for the report.  Fixed.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sb_alloc: a new memory allocator for PostgreSQL

2014-05-06 Thread Robert Haas
On Tue, May 6, 2014 at 9:25 AM, Greg Stark st...@mit.edu wrote:
 On Tue, May 6, 2014 at 2:04 PM, Robert Haas robertmh...@gmail.com wrote:
  I also didn't find anything that looked like our
 memory context paradigm, and in particular the ability to cheaply
 reset a context, in any other allocator.

 You probably knew this but just in case the term for this strategy is
 called a ripcord allocator. I believe GCC had one, not sure if it
 still uses it. I doubt any existing ones are especially helpful though
 compared to looking at regular allocators and seeing how to integrate
 them.

Actually, I hadn't heard that term.
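
For context, the "cheaply reset a context" ability referred to above is what
the existing palloc machinery already provides; a minimal sketch of the
pattern, using the stock memory-context API:

MemoryContext scratch;

scratch = AllocSetContextCreate(CurrentMemoryContext,
                                "scratch context",
                                ALLOCSET_DEFAULT_MINSIZE,
                                ALLOCSET_DEFAULT_INITSIZE,
                                ALLOCSET_DEFAULT_MAXSIZE);

/* switch to it with MemoryContextSwitchTo(scratch), do many pallocs ... */

MemoryContextReset(scratch);    /* frees everything allocated in it at once */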

 I assume you're also aware of lock-free data structures and the like.
 A memory allocator seems like something that needs to be super aware
 of causing concurrency bottlenecks since it has no idea where it'll be
 used.

Lock-free data structures are cool, but you tend to pay for contention
avoidance with larger constant overheads.  Earlier versions of this
included more provisions for reducing locking bottlenecks than the
current version; that stuff ended up being surprisingly expensive and
I had to rip it out.  There are a few remaining vestiges that need to
be cleaned up yet.  What I'm thinking is probably cheap enough to live
with - and what the patch roughly contemplates right now - is one
lwlock per size class; so shared allocations can proceed in parallel
if they're for objects of different sizes, but not otherwise.  If we
need an allocator that is better-optimized for concurrency than that,
then I think that's going to need to be a separate effort, and the
result will be something that nobody will want to use for
backend-private allocations, because the overhead in the non-contended
case will be too high.
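
For illustration, a minimal sketch of the one-lwlock-per-size-class idea
(the names and layout here are made up, not taken from the actual patch):

#define NUM_SIZE_CLASSES 32                     /* hypothetical */

typedef struct shared_allocator
{
    LWLock     *class_lock[NUM_SIZE_CLASSES];   /* one lock per size class */
    /* per-class free lists / superblock metadata would live here */
} shared_allocator;

static void *
shared_alloc(shared_allocator *a, Size size)
{
    int         sclass = size_to_class(size);   /* made-up mapping function */
    void       *result;

    /* allocations in different size classes contend on different locks */
    LWLockAcquire(a->class_lock[sclass], LW_EXCLUSIVE);
    result = alloc_from_class(a, sclass);       /* made-up per-class allocator */
    LWLockRelease(a->class_lock[sclass]);

    return result;
}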

I don't expect concurrency to be a big problem for the problems I
intend to solve in the medium term.  For parallel internal sort, I
actually only need one backend at a time to be able to allocate from
the shared memory segment; the other backends just move data around,
without actually allocating anything.  So the locking is a non-issue.
Parallel external sort might be different; e.g. you can imagine the
user backend pushing tuples into the DSM and worker backends pulling
them out and writing them to tuplestores, or something like that.
Similarly, you can imagine something like a parallel hash table build
with multiple inserts working at once.  But I don't want to spend too
much time worrying about concurrency for future projects at the
expense of solving problems nearer at hand; one could argue that I've
gone too far in that direction already.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Simon Riggs
On 8 October 2013 17:13, Bruce Momjian br...@momjian.us wrote:

 Patch applied with a default of 4x shared buffers.  I have added a 9.4
 TODO that we might want to revisit this.

I certainly want to revisit this patch and this setting.

How can we possibly justify a default setting that could be more than
physical RAM?

The maximum known safe value is the setting of shared_buffers itself,
without external knowledge. But how can we possibly set it even that
high?

Does anyone have any evidence at all on how to set this? How can we
possibly autotune it?

I prefer the idea of removing effective_cache_size completely, since
it has so little effect on workloads and is very frequently
misunderstood by users. It's just dangerous, without being useful.

Why do we autotune the much more important synch scan threshold, yet
allow tuning of e_c_s?

Lets fix e_c_s at 25% of shared_buffers and remove the parameter
completely, just as we do with so many other performance parameters.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Andres Freund
On 2014-05-06 15:09:15 +0100, Simon Riggs wrote:
 On 8 October 2013 17:13, Bruce Momjian br...@momjian.us wrote:
 
  Patch applied with a default of 4x shared buffers.  I have added a 9.4
  TODO that we might want to revisit this.
 
 I certainly want to revisit this patch and this setting.
 
 How can we possibly justify a default setting that could be more than
 physical RAM?

Because it doesn't hurt overly much if it's set too large?

 The maximum known safe value is the setting of shared_buffers itself,
 without external knowledge. But how can we possibly set it even that
 high?

 Does anyone have any evidence at all on how to set this? How can we
 possibly autotune it?

It's just a different default setting? I think the new value will cause
fewer problems than the old one, which frequently leads to index scans not
being used even when they would be beneficial.

 I prefer the idea of removing effective_cache_size completely, since
 it has so little effect on workloads and is very frequently
 misunderstood by users. It's just dangerous, without being useful.

-many.

 Lets fix e_c_s at 25% of shared_buffers and remove the parameter
 completely, just as we do with so many other performance parameters.

That'd cause *massive* regression for many installations. Without
significantly overhauling costsize.c that's really not feasible. There's
lots of installations that use relatively small s_b settings for good
reasons. If we fix e_c_s to 25% of s_b many queries on those won't use
indexes anymore.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New pg_lsn type doesn't have hash/btree opclasses

2014-05-06 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-05-06 09:37:54 -0400, Tom Lane wrote:
 Sorry, it is *way* too late for 9.4.

 It's imo a regression/oversight introduced in the pg_lsn patch. Not a
 new feature.

You can argue that if you like, but it doesn't matter.  It's too late for
a change as big as that for such an inessential feature.  We are in the
stabilization game at this point, and adding features is not the thing to
be doing.

Especially not features for which no patch even exists.  I don't care if
it could be done in a few hours by copy-and-paste, it would still be the
very definition of rushed coding.  We're past the window for this kind of
change in 9.4.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 Lets fix e_c_s at 25% of shared_buffers and remove the parameter
 completely, just as we do with so many other performance parameters.

Apparently, you don't even understand what this parameter is for.
Setting it smaller than shared_buffers is insane.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sb_alloc: a new memory allocator for PostgreSQL

2014-05-06 Thread Simon Riggs
On 6 May 2014 14:49, Robert Haas robertmh...@gmail.com wrote:
 On Tue, May 6, 2014 at 9:31 AM, Heikki Linnakangas
 hlinnakan...@vmware.com wrote:
 As a generic remark, I wish that whatever parallel algorithms we will use
 won't need a lot of ad hoc memory allocations from shared memory. Even
 though we have dynamic shared memory now, complex data structures with a lot
 of pointers and different allocations are more painful to debug, tune, and
 make concurrency-safe. But I have no idea what exactly you have in mind, so
 I'll just have to take your word on it that this is sensible.

 Yeah, I agree.  Actually, I'm hoping that a lot of what we want to do
 can be done using the shm_mq stuff, which uses a messaging paradigm.
 If the queue is full, you wait for the consumer to read some data
 before writing more.  That is much simpler and avoids a lot of
 problems.

 There are several problems with using pointers in dynamic shared
 memory.  The ones I'm most concerned about are:

 1. Segments are relocatable, so you can't actually use absolute
 pointers.  Maybe someday we'll have a facility for dynamic shared
 memory segments that are mapped at the same address in every process,
 or maybe not, but right now we sure don't.

Sounds like a problem for static allocations, not dynamic ones.

It makes a lot of sense to use dynamic shared memory for sorts
especially, since you can just share the base pointer and other info
and a blind worker can then do the sort for you without needing
transactions, snapshots etc..

I'd also like to consider putting common reference tables as hash
tables into shmem.

 2. You've got to decide up-front how much memory to set aside for
 dynamic allocation, and you can't easily change your mind later.  Some
 of the DSM implementations support growing the segment, but you've got
 to somehow get everyone who is using it to remap it, possibly at a
 different address, so it's a long way from being transparent.

Again, depends on the algorithm. If we program the sort to work in
fixed size chunks, we can then use a merge sort at end to link the
chunks together. So we just use an array of fixed size chunks. We
might need to dynamically add more chunks, but nobody needs to remap.

Doing it that way means we do *not* need to change situation if it
becomes an external sort. We just mix shmem and external files, all
merged together at the end.

We need to take account of the amount of memory locally available per
CPU, so there is a maximum size for these things. Not sure what tho'

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New pg_lsn type doesn't have hash/btree opclasses

2014-05-06 Thread Heikki Linnakangas

On 05/06/2014 05:17 PM, Tom Lane wrote:

Andres Freund and...@2ndquadrant.com writes:

On 2014-05-06 09:37:54 -0400, Tom Lane wrote:

Sorry, it is *way* too late for 9.4.



It's imo a regression/oversight introduced in the pg_lsn patch. Not a
new feature.


You can argue that if you like, but it doesn't matter.  It's too late for
a change as big as that for such an inessential feature.  We are in the
stabilization game at this point, and adding features is not the thing to
be doing.


FWIW, I agree with Andres that this would be a reasonable thing to add. 
Exactly the kind of oversight that we should be fixing at this stage in 
the release cycle.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] using array of char pointers gives wrong results

2014-05-06 Thread Ashutosh Bapat
Thanks Michael.



On Tue, May 6, 2014 at 5:01 PM, Michael Meskes mes...@postgresql.orgwrote:

  PFA patch ecpg_char_ptr_arr.patch to fix this issue. It has changes as
  follows
  ...

 Thanks for finding and fixing the problem. Patch applied to complete 9
 series with only a minor change.

 Michael
 --
 Michael Meskes
 Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
 Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
 Jabber: michael.meskes at gmail dot com
 VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] postgresql.auto.conf read from wrong directory

2014-05-06 Thread Amit Kapila
On Tue, May 6, 2014 at 7:04 PM, Christoph Berg c...@df7cb.de wrote:
 Hi,

 if you split configuration and data by setting data_directory,
 postgresql.auto.conf is writen to the data directory
 (/var/lib/postgresql/9.4/main in Debian), but tried to be read from
 the etc directory (/etc/postgresql/9.4/main).

 One place to fix it would be in ProcessConfigFile in
 src/backend/utils/misc/guc-file.l:162 by always setting
 CallingFileName = NULL in src/backend/utils/misc/guc-file.l:162, but
 that breaks later in AbsoluteConfigLocation() when data_directory is
 NULL. (As the comment in ProcessConfigFile says.)

 I'm not sure where to break that dependency loop - the fix itself
 seems easy, just don't tell to look for postgresql.auto.conf next to
 ConfigFileName, but in the data_directory.

Thanks for the report, will look into it.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sb_alloc: a new memory allocator for PostgreSQL

2014-05-06 Thread Robert Haas
On Tue, May 6, 2014 at 10:40 AM, Simon Riggs si...@2ndquadrant.com wrote:
 1. Segments are relocatable, so you can't actually use absolute
 pointers.  Maybe someday we'll have a facility for dynamic shared
 memory segments that are mapped at the same address in every process,
 or maybe not, but right now we sure don't.

 Sounds like a problem for static allocations, not dynamic ones.

Maybe I didn't say that well: the dynamic shared memory segment isn't
guaranteed to be mapped at the same address in every backend.  So if
you use absolute pointers in your data structures, they might be
incorrect from the point of view of some other backend accessing the
same shared memory segment.  This is true regardless of how the
allocation is done; it's an artifact of the decision not to try to map
dynamic shared memory segments at a constant address in every process
using them.

 It makes a lot of sense to use dynamic shared memory for sorts
 especially, since you can just share the base pointer and other info
 and a blind worker can then do the sort for you without needing
 transactions, snapshots etc..

That would be nice, but I think we're actually going to need a lot of
that stuff to make parallel sort work, because the workers need to
have enough of an environment set up to run the comparison functions, and
those may try to de-TOAST.

 I'd also like to consider putting common reference tables as hash
 tables into shmem.

I'm not sure exactly what you have in mind here, but some kind of hash
tables facility for dynamic memory is on my bucket list.

 2. You've got to decide up-front how much memory to set aside for
 dynamic allocation, and you can't easily change your mind later.  Some
 of the DSM implementations support growing the segment, but you've got
 to somehow get everyone who is using it to remap it, possibly at a
 different address, so it's a long way from being transparent.

 Again, depends on the algorithm. If we program the sort to work in
 fixed size chunks, we can then use a merge sort at end to link the
 chunks together. So we just use an array of fixed size chunks. We
 might need to dynamically add more chunks, but nobody needs to remap.

This is a possible approach, but since each segment might end up in a
different spot in each process that maps it, a pointer now needs to
consist of a segment identifier and an offset within that segment.
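
For illustration, such a "fat pointer" could look roughly like this (the
names are made up, not from any patch; it assumes the existing
dsm_find_mapping()/dsm_attach()/dsm_segment_address() calls):

typedef struct relptr
{
    dsm_handle  segment;        /* which dynamic shared memory segment */
    Size        offset;         /* byte offset within that segment */
} relptr;

static void *
relptr_deref(relptr rp)
{
    dsm_segment *seg = dsm_find_mapping(rp.segment);

    if (seg == NULL)
        seg = dsm_attach(rp.segment);   /* map it in this backend if needed */
    return (char *) dsm_segment_address(seg) + rp.offset;
}

Every dereference pays at least for the handle-to-mapping lookup, which is
part of why flat layouts and message queues are attractive here.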

 Doing it that way means we do *not* need to change situation if it
 becomes an external sort. We just mix shmem and external files, all
 merged together at the end.

Huh.  Interesting idea.  I haven't researched external sort much yet.

 We need to take account of the amount of memory locally available per
 CPU, so there is a maximum size for these things. Not sure what tho'

I agree that NUMA effects could be significant, but so far I don't
know enough to have any idea what, if anything, makes sense to do in
that area.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TABLESPACE and directory for Foreign tables?

2014-05-06 Thread Robert Haas
On Mon, May 5, 2014 at 1:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 A larger and more philosophical point is that such a direction of
 development could hardly be called a foreign data wrapper.  People
 would expect Postgres to take full responsibility for such files,
 including data integrity considerations such as fsync-at-checkpoints
 and WAL support.  Even if we wanted the FDW abstractions to allow
 for that, we're very far away from it.  And frankly I'd maintain
 that FDW is the wrong abstraction.

The right abstraction, as Josh points out, would probably be pluggable
storage.  Are you (or is anyone) planning to pursue that further?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New pg_lsn type doesn't have hash/btree opclasses

2014-05-06 Thread Vik Fearing
On 05/06/2014 04:59 PM, Heikki Linnakangas wrote:
 On 05/06/2014 05:17 PM, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
 On 2014-05-06 09:37:54 -0400, Tom Lane wrote:
 Sorry, it is *way* too late for 9.4.

 It's imo a regression/oversight introduced in the pg_lsn patch. Not a
 new feature.

 You can argue that if you like, but it doesn't matter.  It's too late
 for
 a change as big as that for such an inessential feature.  We are in the
 stabilization game at this point, and adding features is not the
 thing to
 be doing.

 FWIW, I agree with Andres that this would be a reasonable thing to
 add. Exactly the kind of oversight that we should be fixing at this
 stage in the release cycle.

I agree as well.

-- 
Vik



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Simon Riggs
On 6 May 2014 15:18, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 Lets fix e_c_s at 25% of shared_buffers and remove the parameter
 completely, just as we do with so many other performance parameters.

 Apparently, you don't even understand what this parameter is for.
 Setting it smaller than shared_buffers is insane.

You know you can't justify that comment and so do I. What workload is
so badly affected as to justify use of the word insane in this
context?

I can read code. But it appears nobody apart from me actually does, or
at least understand the behaviour that results.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgindent run

2014-05-06 Thread Bruce Momjian
On Tue, May  6, 2014 at 08:55:07AM -0400, Bruce Momjian wrote:
 On Sat, May  3, 2014 at 02:20:27PM -0400, Bruce Momjian wrote:
  I am planning to run pgindent in a few days to prepare for beta.  Does
  anyone have major patches that you are planning to apply soon?  If so, I
  can delay pgindent until you are done.  
  
  This run will also have a tabs-in-comments removal phase which will also
  be run on supported back branches.
 
 Hearing nothing, I plan to pgindent head in an hour, around 14:00 GMT. 
 I will also be removing some tabs in comments in all supported back
 branches.

All done.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_stat_statements: Query normalisation may fail during stats reset

2014-05-06 Thread Michael Renner
Hi,

when regularly collecting & resetting query information from pg_stat_statements 
it’s possible to trigger a situation where unnormalised queries are stored.


I think what happens is the following:

pgss_post_parse_analyse calls pgss_store with a non-null jstate which will 
cause the query string to be normalised and stored if the query id doesn’t 
exist in the hash table.


pgss_ExecutorEnd calls pgss_store with a null jstate which will cause the 
statistics to be stored if the query id exists.

If the query id does not exist (because the hash table has been reset between 
post_parse_analyse and ExecutorEnd) it hits the entry creation path which in 
turn will create an entry with the unnormalised query string because jstate is 
null.


This is a bit counterintuitive if you rely on the query to be normalised, e.g. 
for privacy reasons where you don’t want to leak query constants like password 
hashes or usernames.


Is this something that should be fixed or is this intentional behavior? The 
documentation doesn’t make any strong claims on the state of the query string, 
so this might be debatable. [1]


best,
Michael

[1] 
http://www.postgresql.org/message-id/e1uwztj-00075u...@wrigleys.postgresql.org 
is a similar situation where constants remain unnormalised.


signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: [HACKERS] Sending out a request for more buildfarm animals?

2014-05-06 Thread Alvaro Herrera
Andres Freund wrote:

   * sparc 32bit
  
  Do we really care about sparc 32bit at this point?  You're talking a
  10-year-old machine, there.
 
 I personally don't really, but the last time it came up significant
 parts of the community opined the other way. And I'd rather have it
 tested and actually supported than supposedly supported.

The thing is that a machine as weird as Sparc uncovers strange failures
that don't show up in other architectures.  For instance, spoonbill
(sparc64 IIRC) has turned up rather interesting numbers of problems.
I think it's useful to have such a thing in the buildfarm.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] elog a stack trace

2014-05-06 Thread Alvaro Herrera
MauMau wrote:

 On Windows, you can use Stackwalk() or Stackwalk64() Win32 API.
 Several years ago, for some software, I implemented a feature that
 outputs the stack trace of a crashing process to its server log
 file.
 
 It would be very nice if PostgreSQL outputs the stack trace of a
 crashing postgres process in the server log.  That eliminates the
 need for users in many cases to send the huge core files to the
 support staff or to use the debugger to get the stack trace.  I've
 recently heard that MySQL has this feature.

+1, assuming it can be made to work reliably and does not cause a larger
reliability problem.  I see the GNU extension to get backtraces, for
instance, tries to malloc stuff in order to get a human-readable trace,
which might not be all that great.
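
For reference, the malloc-free variant of that GNU extension is
backtrace_symbols_fd(), which writes straight to a file descriptor; a
minimal sketch (not a concrete proposal for the server):

#include <execinfo.h>

static void
emit_backtrace(int fd)
{
    void       *frames[100];
    int         nframes = backtrace(frames, 100);

    /* backtrace_symbols() mallocs; backtrace_symbols_fd() does not */
    backtrace_symbols_fd(frames, nframes, fd);
}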

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] doPickSplit stack buffer overflow in XLogInsert?

2014-05-06 Thread Andres Freund
On 2014-05-06 13:33:01 +0300, Heikki Linnakangas wrote:
 On 03/31/2014 09:08 PM, Robert Haas wrote:
 On Wed, Mar 26, 2014 at 9:45 PM, Peter Geoghegan p...@heroku.com wrote:
 On Wed, Nov 27, 2013 at 9:10 AM, Noah Misch n...@leadboat.com wrote:
 The threat is that rounding the read size up to the next MAXALIGN would 
 cross
 into an unreadable memory page, resulting in a SIGSEGV.  Every palloc chunk
 has MAXALIGN'd size under the hood, so the excess read of toDelete cannot
 cause a SIGSEGV.  For a stack variable, it depends on the ABI.  I'm not 
 aware
 of an ABI where the four bytes past the end of this stack variable could be
 unreadable, which is not to claim I'm well-read on the topic.  We should 
 fix
 this in due course on code hygiene grounds, but I would not back-patch it.
 
 Attached patch silences the Invalid read of size n complaints of
 Valgrind. I agree with your general thoughts around backpatching. Note
 that the patch addresses a distinct complaint from Kevin's, as
 Valgrind doesn't take issue with the invalid reads past the end of
 spgxlogPickSplit variables on the stack.
 
 Is the needless zeroing this patch introduces apt to cause a
 performance problem?
 
 This function is actually pretty wacky.  If we're stuffing bytes with
 undefined contents into the WAL record, maybe the answer isn't to
 force the contents of those bytes to be defined, but rather to elide
 them from the WAL record.
 
 Agreed. I propose the attached, which removes the MAXALIGNs. It's not
 suitable for backpatching, though, as it changes the format of the WAL
 record.

Not knowing anything about spgist this looks sane to me. Do we need a
backpatchable solution given that we seem to agree that these bugs
aren't likely to cause harm?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Simon Riggs
On 6 May 2014 15:17, Andres Freund and...@2ndquadrant.com wrote:

 Lets fix e_c_s at 25% of shared_buffers and remove the parameter
 completely, just as we do with so many other performance parameters.

 That'd cause *massive* regression for many installations. Without
 significantly overhauling costsize.c that's really not feasible. There's
 lots of installations that use relatively small s_b settings for good
 reasons. If we fix e_c_s to 25% of s_b many queries on those won't use
 indexes anymore.

"many queries" can't be correct. All this changes is the cost of
IndexScans that would use more than 25% of shared_buffers worth of
data. Hopefully not many of those in your workload. Changing the cost
doesn't necessarily prevent index scans either. And if there are many
of those in your workload AND you run more than one at same time, then
the larger setting will work against you. So the benefit window for
such a high setting is slim, at best.

I specifically picked 25% of shared_buffers because that is the point
at which sequential scans become more efficient and use the cache more
efficiently. If our cost models are correct, then switching away from
index scans shouldn't hurt at all.

Assuming we can use large tranches of memory for single queries has a
very bad effect on cache hit ratios. Encouraging such usage seems to
fall into the category of insane, from my perspective. Having it as a
default setting is bad.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TABLESPACE and directory for Foreign tables?

2014-05-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, May 5, 2014 at 1:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 A larger and more philosophical point is that such a direction of
 development could hardly be called a foreign data wrapper.  People
 would expect Postgres to take full responsibility for such files,
 including data integrity considerations such as fsync-at-checkpoints
 and WAL support.  Even if we wanted the FDW abstractions to allow
 for that, we're very far away from it.  And frankly I'd maintain
 that FDW is the wrong abstraction.

 The right abstraction, as Josh points out, would probably be pluggable
 storage.  Are you (or is anyone) planning to pursue that further?

Well, as you've noticed, I made no progress on that since last PGCon.
It's still something I'm thinking about, but it's a hard problem.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On 6 May 2014 15:18, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 Lets fix e_c_s at 25% of shared_buffers and remove the parameter
 completely, just as we do with so many other performance parameters.

 Apparently, you don't even understand what this parameter is for.
 Setting it smaller than shared_buffers is insane.

 You know you can't justify that comment and so do I.

What I meant is that your comments indicate complete lack of understanding
of the parameter.  It's *supposed* to be larger than shared_buffers, and
there is no safety risk involved in setting it too high.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Andres Freund
On 2014-05-06 17:43:45 +0100, Simon Riggs wrote:
 On 6 May 2014 15:17, Andres Freund and...@2ndquadrant.com wrote:
 
  Lets fix e_c_s at 25% of shared_buffers and remove the parameter
  completely, just as we do with so many other performance parameters.
 
  That'd cause *massive* regression for many installations. Without
  significantly overhauling costsize.c that's really not feasible. There's
  lots of installations that use relatively small s_b settings for good
  reasons. If we fix e_c_s to 25% of s_b many queries on those won't use
  indexes anymore.
 
 "many queries" can't be correct.

It is.

 All this changes is the cost of
 IndexScans that would use more than 25% of shared_buffers worth of
 data. Hopefully not many of those in your workload. Changing the cost
 doesn't necessarily prevent index scans either. And if there are many
 of those in your workload AND you run more than one at same time, then
 the larger setting will work against you. So the benefit window for
 such a high setting is slim, at best.

Why? There's many workloads where indexes are larger than shared buffers
but fit into the operating system's cache. And that's precisely what
effective_cache_size is about.
Especially on bigger machines shared_buffers can't be set big enough to
actually use all the machine's memory. It's not uncommon to have 4GB
shared buffers on a machine with 512GB RAM... It'd be absolutely
disastrous to set effective_cache_size to 1GB for an analytics workload.
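
For anyone following along: effective_cache_size feeds the Mackert-Lohman
approximation the planner uses (via index_pages_fetched() in costsize.c) to
estimate how many of the heap pages touched by an index scan must actually
be read. A rough sketch of that approximation, purely as an illustration
(T = table pages, N = tuples fetched, b = cache size in pages derived from
effective_cache_size; Min() is the usual PostgreSQL macro):

static double
pages_fetched_estimate(double T, double N, double b)
{
    if (T <= b)
        return Min((2.0 * T * N) / (2.0 * T + N), T);
    else
    {
        double      lim = (2.0 * T * b) / (2.0 * T - b);

        if (N <= lim)
            return (2.0 * T * N) / (2.0 * T + N);
        else
            return b + (N - lim) * (T - b) / T;
    }
}

Roughly speaking, once b is small compared to T, each additional tuple
fetched beyond the limit is charged as another physical page read, which is
what pushes the planner away from index scans when effective_cache_size is
set very low.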

 I specifically picked 25% of shared_buffers because that is the point
 at which sequential scans become more efficient and use the cache more
 efficiently. If our cost models are correct, then switching away from
 index scans shouldn't hurt at all.

More often than not indexes are smaller than the table size, so this
argument doesn't seem to make much sense.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bgworker crashed or not?

2014-05-06 Thread Robert Haas
On Sun, May 4, 2014 at 9:57 AM, Petr Jelinek p...@2ndquadrant.com wrote:
 On 30/04/14 23:35, Robert Haas wrote:
 On Mon, Apr 28, 2014 at 4:19 PM, Petr Jelinek p...@2ndquadrant.com
 wrote:
 On 28/04/14 16:27, Robert Haas wrote:
 It seems we have consensus on what to do about this, but what we
 haven't got is a patch.


 If you mean the consensus that exit status 0 should mean permanent stop
 then
 I think the patch can be as simple as attached.

 Hmm.  Well, at the very least, you need to update the comment.

 Obviously, also docs, the above was just demonstration that the patch is
 simple (and also all I could manage to write and test on the way from
 airport to hotel).

 The attached patch is more proper.

Hmm, I see some problems here.  The current comment for
bgworker_quickdie() says that we do exit(0) there rather than exit(2)
to avoid having the postmaster delay restart based on
bgw_restart_time, but with the proposed change in the meaning of
exit(2), that would have the effect of unregistering the worker, which
I guess is why you've changed it to call exit(2) instead, but then the
bgw_restart_delay applies, which I think we do not want.  To make
matters more confusing, the existing comment is actually only correct
for non-shmem-connected workers - shmem-connected workers will treat
an exit status of anything other than 0 or 1 in exactly the same
fashion as a failure to disengage the deadman switch.

Which brings up another point: the behavior of non-shmem-connected
workers is totally bizarre.  An exit status other than 0 or 1 is not
treated as a crash requiring a restart, but failure to disengage the
deadman switch is still treated as a crash requiring a restart.  Why?
If the workers are not shmem-connected, then no crash requires a
system-wide restart.  Of course, there's the tiny problem that we
aren't actually unmapping shared memory from supposedly non-shmem
connected workers, which is a different bug, but ignoring that for the
moment there's no reason for this logic to be like this.

What I'm inclined to do is change the logic so that:

(1) After a crash-and-restart sequence, zero rw->rw_crashed_at, so
that anything which is still registered gets restarted immediately.

(2) If a shmem-connected backend fails to release the deadman switch
or exits with an exit code other than 0 or 1, we crash-and-restart.  A
non-shmem-connected backend never causes a crash-and-restart.

(3) When a background worker exits without triggering a
crash-and-restart, an exit code of precisely 0 causes the worker to be
unregistered; any other exit code has no special effect, so
bgw_restart_time controls.
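
As a sketch only (made-up names, not actual postmaster code), the per-worker
decision in (2) and (3) would boil down to something like:

#include <sys/wait.h>

typedef enum
{
    WORKER_RESTART,             /* honour bgw_restart_time */
    WORKER_UNREGISTER,          /* forget the registration entirely */
    CRASH_AND_RESTART           /* system-wide crash-and-restart */
} worker_action;

static worker_action
decide_worker_action(bool shmem_connected, bool deadman_released, int exitstatus)
{
    /* (2): only shmem-connected workers can force a system-wide restart */
    if (shmem_connected &&
        (!deadman_released ||
         WIFSIGNALED(exitstatus) ||
         (WIFEXITED(exitstatus) && WEXITSTATUS(exitstatus) > 1)))
        return CRASH_AND_RESTART;

    /* (3): exit code 0 means "done for good"; anything else means restart */
    if (WIFEXITED(exitstatus) && WEXITSTATUS(exitstatus) == 0)
        return WORKER_UNREGISTER;

    return WORKER_RESTART;
}

(Point (1), zeroing rw->rw_crashed_at after a crash-and-restart cycle, lives
outside this function.)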

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Josh Berkus
On 05/06/2014 08:41 AM, Simon Riggs wrote:
 On 6 May 2014 15:18, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 Lets fix e_c_s at 25% of shared_buffers and remove the parameter
 completely, just as we do with so many other performance parameters.

 Apparently, you don't even understand what this parameter is for.
 Setting it smaller than shared_buffers is insane.
 
 You know you can't justify that comment and so do I. What workload is
 so badly affected as to justify use of the word insane in this
 context?

Most of them?  Really?

I have to tell you, your post sounds like you've missed out on the last
12 years of PostgreSQL query tuning.  Which is a little shocking
considering where you've spent that 12 years.

 I can read code. But it appears nobody apart from me actually does, or
 at least understand the behaviour that results.

So, break it down for us: explain how we'll get desirable query plans
out of the current code if:

(1) Table & Index is larger than shared_buffers;
(2) Table & Index is smaller than RAM;
(3) Selectivity is 0.02
(4) ECS is set lower than shared_buffers

I think the current cost math does a pretty good job of choosing the
correct behavior if ECS is set correctly.  But if it's not, no.

If I'm wrong, then you've successfully found a bug in our costing math,
so I'd love to see it.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] possible dsm bug in dsm_attach()

2014-05-06 Thread Andres Freund
On 2014-05-06 08:48:57 -0400, Robert Haas wrote:
 On Tue, May 6, 2014 at 8:43 AM, Andres Freund and...@2ndquadrant.com wrote:
  The break because of refcnt == 1 doesn't generally seem to be a good
  idea. Why are we bailing if there's *any* segment that's in the process
  of being removed? I think the check should be there *after* the
  dsm_control->item[i].handle == seg->handle check?
 
 You are correct.  Good catch.

Fix attached.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 9bc05ac103c77fcd2276754a3ccbf49e66dd00b7 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Tue, 6 May 2014 19:08:07 +0200
Subject: [PATCH] Don't bail in dsm_attach() if any any other segment is about
 to be destroyed.

The previous coding accidentally checked for refcnt == 1, which
signals moribund segments, before checking whether the segment we're
looking at is the one we're trying to attach to.
---
 src/backend/storage/ipc/dsm.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index d86b0c7..bde7026 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -570,6 +570,10 @@ dsm_attach(dsm_handle h)
 		if (dsm_control->item[i].refcnt == 0)
 			continue;
 
+		/* Not the segment you're looking for. */
+		/* Not the segment you're looking for. */
+		if (dsm_control->item[i].handle != seg->handle)
+			continue;
+
 		/*
 		 * If the reference count is 1, the slot is still in use, but the
 		 * segment is in the process of going away.  Treat that as if we
@@ -578,13 +582,10 @@ dsm_attach(dsm_handle h)
 		if (dsm_control->item[i].refcnt == 1)
 			break;
 
-		/* Otherwise, if the descriptor matches, we've found a match. */
-		if (dsm_control->item[i].handle == seg->handle)
-		{
-			dsm_control->item[i].refcnt++;
-			seg->control_slot = i;
-			break;
-		}
+		/* Otherwise we've found a match. */
+		dsm_control->item[i].refcnt++;
+		seg->control_slot = i;
+		break;
 	}
 	LWLockRelease(DynamicSharedMemoryControlLock);
 
-- 
1.8.5.rc2.dirty


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Release schedule for PG 9.4beta1

2014-05-06 Thread Tom Lane
The plan for beta1 is to wrap the tarball this Sunday (May 11)
for public announcement Thursday May 15.  This deviates from our
usual Monday wrap + Thursday announcement schedule so as to give
the packagers a little more time, knowing that packaging scripts
usually require extra adjustments when seeing a new major release
for the first time.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] possible dsm bug in dsm_attach()

2014-05-06 Thread Robert Haas
On Tue, May 6, 2014 at 1:14 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-05-06 08:48:57 -0400, Robert Haas wrote:
 On Tue, May 6, 2014 at 8:43 AM, Andres Freund and...@2ndquadrant.com wrote:
  The break because of refcnt == 1 doesn't generally seem to be a good
  idea. Why are we bailing if there's *any* segment that's in the process
  of being removed? I think the check should be there *after* the
   dsm_control->item[i].handle == seg->handle check?

 You are correct.  Good catch.

 Fix attached.

Committed, thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] possible dsm bug in dsm_attach()

2014-05-06 Thread Andres Freund
On 2014-05-06 13:45:13 -0400, Robert Haas wrote:
 On Tue, May 6, 2014 at 1:14 PM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-05-06 08:48:57 -0400, Robert Haas wrote:
  On Tue, May 6, 2014 at 8:43 AM, Andres Freund and...@2ndquadrant.com 
  wrote:
   The break because of refcnt == 1 doesn't generally seem to be a good
   idea. Why are we bailing if there's *any* segment that's in the process
   of being removed? I think the check should be there *after* the
    dsm_control->item[i].handle == seg->handle check?
 
  You are correct.  Good catch.
 
  Fix attached.
 
 Committed, thanks.

Heh. Not a fan of film references? :)

Thanks,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_statements: Query normalisation may fail during stats reset

2014-05-06 Thread Robert Haas
On Tue, May 6, 2014 at 12:26 PM, Michael Renner
michael.ren...@amd.co.at wrote:
 when regularly collecting & resetting query information from
 pg_stat_statements it’s possible to trigger a situation where unnormalised
 queries are stored.

 I think what happens is the following:

 pgss_post_parse_analyse calls pgss_store with a non-null jstate which will
 cause the query string to be normalised and stored if the query id doesn’t
 exist in the hash table.

 pgss_ExecutorEnd calls pgss_store with a null jstate which will cause the
 statistics to be stored if the query id exists.

 If the query id does not exist (because the hash table has been reset
 between post_parse_analyse and ExecutorEnd) it hits the entry creation path
 which in turn will create an entry with the unnormalised query string
 because jstate is null.

 This is a bit counterintuitive if you rely on the query to be normalised,
 e.g. for privacy reasons where you don’t want to leak query constants like
 password hashes or usernames.


 Is this something that should be fixed or is this intentional behavior? The
 documentation doesn’t make any strong claims on the state of the query
 string, so this might be debatable. [1]

It sounds pretty wonky to me, but then, so does the behavior in the
email to which you linked.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] possible dsm bug in dsm_attach()

2014-05-06 Thread Robert Haas
On Tue, May 6, 2014 at 1:46 PM, Andres Freund and...@2ndquadrant.com wrote:
  Fix attached.

 Committed, thanks.

 Heh. Not a fan of film references? :)

I didn't quite put the pieces together there.  I just thought the use
of you was awkward.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] psql tab completion for DROP TRIGGER/RULE and ALTER TABLE ... DISABLE/ENABLE

2014-05-06 Thread Andreas Karlsson

Hi,

When benchmarking an application I got annoyed at how basic the tab 
completion for ALTER TABLE ... DISABLE/ENABLE TRIGGER and DROP TRIGGER 
is. So here is a patch improving the tab completion around triggers. For 
consistency I have also added the same completions to rules since their 
DDL is almost identical.


--
Andreas Karlsson


diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
new file mode 100644
index 6d26ffc..65483de
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
*** static const SchemaQuery Query_for_list_
*** 626,631 
--- 626,639 
   WHERE pg_catalog.quote_ident(conname)='%s')
  
  /* the silly-looking length condition is just to eat up the current word */
+ #define Query_for_rule_of_table \
+ "SELECT pg_catalog.quote_ident(rulename) "\
+ "  FROM pg_catalog.pg_class c1, pg_catalog.pg_rewrite "\
+ " WHERE c1.oid=ev_class and (%d = pg_catalog.length('%s'))"\
+ "   and pg_catalog.quote_ident(c1.relname)='%s'"\
+ "   and pg_catalog.pg_table_is_visible(c1.oid)"
+ 
+ /* the silly-looking length condition is just to eat up the current word */
  #define Query_for_list_of_tables_for_rule \
  SELECT pg_catalog.quote_ident(relname) \
FROM pg_catalog.pg_class\
*** static const SchemaQuery Query_for_list_
*** 635,640 
--- 643,656 
   WHERE pg_catalog.quote_ident(rulename)='%s')
  
  /* the silly-looking length condition is just to eat up the current word */
+ #define Query_for_trigger_of_table \
+ "SELECT pg_catalog.quote_ident(tgname) "\
+ "  FROM pg_catalog.pg_class c1, pg_catalog.pg_trigger "\
+ " WHERE c1.oid=tgrelid and (%d = pg_catalog.length('%s'))"\
+ "   and pg_catalog.quote_ident(c1.relname)='%s'"\
+ "   and pg_catalog.pg_table_is_visible(c1.oid)"
+ 
+ /* the silly-looking length condition is just to eat up the current word */
  #define Query_for_list_of_tables_for_trigger \
  SELECT pg_catalog.quote_ident(relname) \
FROM pg_catalog.pg_class\
*** psql_completion(const char *text, int st
*** 1409,1414 
--- 1425,1462 
  
  		COMPLETE_WITH_LIST(list_ALTERENABLE2);
  	}
+ 	else if (pg_strcasecmp(prev5_wd, "ALTER") == 0 &&
+ 			 pg_strcasecmp(prev4_wd, "TABLE") == 0 &&
+ 			 pg_strcasecmp(prev2_wd, "ENABLE") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "RULE") == 0)
+ 	{
+ 		completion_info_charp = prev3_wd;
+ 		COMPLETE_WITH_QUERY(Query_for_rule_of_table);
+ 	}
+ 	else if (pg_strcasecmp(prev6_wd, "ALTER") == 0 &&
+ 			 pg_strcasecmp(prev5_wd, "TABLE") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "ENABLE") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "RULE") == 0)
+ 	{
+ 		completion_info_charp = prev4_wd;
+ 		COMPLETE_WITH_QUERY(Query_for_rule_of_table);
+ 	}
+ 	else if (pg_strcasecmp(prev5_wd, "ALTER") == 0 &&
+ 			 pg_strcasecmp(prev4_wd, "TABLE") == 0 &&
+ 			 pg_strcasecmp(prev2_wd, "ENABLE") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "TRIGGER") == 0)
+ 	{
+ 		completion_info_charp = prev3_wd;
+ 		COMPLETE_WITH_QUERY(Query_for_trigger_of_table);
+ 	}
+ 	else if (pg_strcasecmp(prev6_wd, "ALTER") == 0 &&
+ 			 pg_strcasecmp(prev5_wd, "TABLE") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "ENABLE") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "TRIGGER") == 0)
+ 	{
+ 		completion_info_charp = prev4_wd;
+ 		COMPLETE_WITH_QUERY(Query_for_trigger_of_table);
+ 	}
  	/* ALTER TABLE xxx INHERIT */
  	else if (pg_strcasecmp(prev4_wd, "ALTER") == 0 &&
  			 pg_strcasecmp(prev3_wd, "TABLE") == 0 &&
*** psql_completion(const char *text, int st
*** 1424,1429 
--- 1472,1478 
  	{
  		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, "");
  	}
+ 	/* ALTER TABLE xxx DISABLE */
  	else if (pg_strcasecmp(prev4_wd, "ALTER") == 0 &&
  			 pg_strcasecmp(prev3_wd, "TABLE") == 0 &&
  			 pg_strcasecmp(prev_wd, "DISABLE") == 0)
*** psql_completion(const char *text, int st
*** 1433,1438 
--- 1482,1503 
  
  		COMPLETE_WITH_LIST(list_ALTERDISABLE);
  	}
+ 	else if (pg_strcasecmp(prev5_wd, "ALTER") == 0 &&
+ 			 pg_strcasecmp(prev4_wd, "TABLE") == 0 &&
+ 			 pg_strcasecmp(prev2_wd, "DISABLE") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "RULE") == 0)
+ 	{
+ 		completion_info_charp = prev3_wd;
+ 		COMPLETE_WITH_QUERY(Query_for_rule_of_table);
+ 	}
+ 	else if (pg_strcasecmp(prev5_wd, "ALTER") == 0 &&
+ 			 pg_strcasecmp(prev4_wd, "TABLE") == 0 &&
+ 			 pg_strcasecmp(prev2_wd, "DISABLE") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "TRIGGER") == 0)
+ 	{
+ 		completion_info_charp = prev3_wd;
+ 		COMPLETE_WITH_QUERY(Query_for_trigger_of_table);
+ 	}
  
  	/* ALTER TABLE xxx ALTER */
  	else if (pg_strcasecmp(prev4_wd, "ALTER") == 0 &&
*** psql_completion(const char *text, int st
*** 2587,2592 
--- 2652,2680 
  		COMPLETE_WITH_LIST(list_ALTERTEXTSEARCH);
  	}
  
+ 	/* DROP TRIGGER */
+ 	else if (pg_strcasecmp(prev3_wd, "DROP") == 0 &&
+ 			 pg_strcasecmp(prev2_wd, "TRIGGER") == 0)
+ 	{
+ 		COMPLETE_WITH_CONST("ON");
+ 	}
+ 	else if (pg_strcasecmp(prev4_wd, "DROP") == 0 &&
+ 			 pg_strcasecmp(prev3_wd, "TRIGGER") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "ON") == 0)
+ 	{
+ 		completion_info_charp = prev2_wd;
+ 		

Re: [HACKERS] psql tab completion for DROP TRIGGER/RULE and ALTER TABLE ... DISABLE/ENABLE

2014-05-06 Thread Stephen Frost
Andreas,

* Andreas Karlsson (andr...@proxel.se) wrote:
 When benchmarking an application I got annoyed at how basic the tab
 completion for ALTER TABLE ... DISABLE/ENABLE TRIGGER and DROP
 TRIGGER is. So here is a patch improving the tab completion around
 triggers. For consistency I have also added the same completions to
 rules since their DDL is almost identical.

Please add this to the commitfest application, so we don't forget about
this patch.

http://commitfest.postgresql.org

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_shmem_allocations view

2014-05-06 Thread Andres Freund
Hi,

On 2014-05-06 13:56:41 +0200, Andres Freund wrote:
 On 2014-05-05 23:20:43 -0400, Robert Haas wrote:
  On Mon, May 5, 2014 at 6:54 PM, Tom Lane t...@sss.pgh.pa.us wrote:
   I'm not confident that it'll be useful either.  But I am confident that
   if we don't put it in now, and decide we want it later, there will be
   complaints when we change the API.  Better to have an ignored parameter
   than no parameter.
 
  I'm generally skeptical of that philosophy.  If we put in an ignored
  parameter, people may pass pointers to NULL or to garbage or to an
  overly-long string, and they won't know it's broken until we make it
  do something; at which point their code will begin to fail without
  warning.

 If it were a complex change, maybe. But I don't think that's likely
 here.
 Assert(name != NULL && strlen(name) > 0 && strlen(name) < NAMEDATALEN);
 should perfectly do the trick.

Attached are two patches:
a) Patch addin a 'name' parameter to dsm_create(). I think we should
   apply this to 9.4.
b) pg_dynamic_shmem_allocations and pg_static_shmem_allocations
   views. The previous version didn't include dsm support and didn't
   take the required lock.

I am not so sure whether b) should be applied together with a) in 9.4,
but I'd be happy enough to add docs if people agree with the naming.

FWIW, I like dsm_create()'s internals more after this patch...

postgres=# \d pg_dynamic_shmem_allocations
View pg_catalog.pg_dynamic_shmem_allocations
 Column |  Type  | Modifiers
++---
 handle | bigint |
 name   | text   |
 size   | bigint |
 refcnt | bigint |

postgres=# \d pg_static_shmem_allocations
View pg_catalog.pg_static_shmem_allocations
  Column   |  Type   | Modifiers
---+-+---
 key   | text|
 off   | bigint  |
 size  | bigint  |
 allocated | boolean |

postgres=# SELECT * FROM pg_dynamic_shmem_allocations;
   handle   |name | size  | refcnt
+-+---+
 1120921036 | test_shm_mq | 65656 |  1
(1 row)

postgres=# SELECT * FROM pg_static_shmem_allocations ORDER BY key NULLS FIRST;
 key |off |size| allocated
-+++---
 | 605024 |1727776 | f
 ||   34844752 | t
 Async Ctl   | 539168 |  65856 | t
 Async Queue Control | 537784 |   1384 | t
 AutoVacuum Data | 533576 |224 | t
 Backend Activity Buffer | 2217099552 | 114688 | t
 Backend Application Name Buffer | 2217085216 |   7168 | t
 Backend Client Host Name Buffer | 2217092384 |   7168 | t
 Backend Status Array| 2217061024 |  24192 | t
 Background Worker Data  | 2217214256 |   1992 | t
 BTree Vacuum State  | 535768 |   1356 | t
 Buffer Blocks   |   51365312 | 2147483648 | t
 Buffer Descriptors  |   34588096 |   16777216 | t
 Buffer Strategy Status  | 2213546176 | 32 | t
 Checkpointer Data   | 2217290656 |5242920 | t
 CLOG Ctl|   33601152 | 525312 | t
 Control File|   16796384 |240 | t
 Fast Path Strong Relation Lock Data | 2214767072 |   4100 | t
 FinishedSerializableTransactions| 2216841952 | 16 | t
 LOCK hash   | 2213546208 |   2160 | t
 MultiXactMember Ctl |   34455488 | 131648 | t
 MultiXactOffset Ctl |   34389632 |  65856 | t
 OldSerXidControlData| 2216973632 | 16 | t
 OldSerXid SLRU Ctl  | 2216841984 | 131648 | t
 PMSignalState   | 2217285400 |940 | t
 PREDICATELOCK hash  | 2215182944 |   2160 | t
 PREDICATELOCKTARGET hash| 2214771176 |   2160 | t
 PredXactList| 2216348384 | 88 | t
 Prepared Transaction Table  | 2217214240 | 16 | t
 Proc Array  | 2217060536 |488 | t
 Proc Header | 2216973648 | 88 | t
 PROCLOCK hash   | 2214183264 |   2160 | t
 ProcSignalSlots | 2217286344 |   4284 | t
 RWConflictPool  | 2216573120 | 24 | t
 SERIALIZABLEXID hash| 2216518720 |   2160 | t
 Shared Buffer Lookup Table  | 2198848960 |  16496 | t
 Shared MultiXact State  |   34587136 |936 | t
 shmInvalBuffer  | 2217216256 |  69144 | t
 SUBTRANS Ctl|   34126464 | 263168 | t
 Sync Scan Locations List| 537128 |656 | t
 Wal Receiver Ctl| 534576 | 

Re: [HACKERS] pg_stat_statements: Query normalisation may fail during stats reset

2014-05-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, May 6, 2014 at 12:26 PM, Michael Renner
  when regularly collecting & resetting query information from
 pg_stat_statements it’s possible to trigger a situation where unnormalised
 queries are stored.
 
 Is this something that should be fixed or is this intentional behavior? The
 documentation doesn’t make any strong claims on the state of the query
 string, so this might be debatable. [1]

 It sounds pretty wonky to me, but then, so does the behavior in the
 email to which you linked.

The source code says that query strings are normalized on a best effort
basis, so perhaps we ought to say the same in the documentation.

It would be rather expensive to provide a guarantee of normalization:
basically, we'd have to compute the normalized query string during parsing
*even when the hashtable entry already exists*, and then store it
somewhere where it'd survive till ExecutorEnd (but, preferably, not be
leaked if we never get to ExecutorEnd; which makes this hard).  I think
most people would find that a bad tradeoff.

One cheap-and-dirty solution is to throw away the execution stats if we
get to the end and find the hash table entry no longer exists, rather than
make a new entry with a not-normalized string.  Not sure if that cure is
better than the disease or not.
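
For illustration, that cheap-and-dirty variant would amount to something
like the following inside pgss_store() (heavily simplified; the helper name
is made up, and the real function also deals with locking and, in 9.4, the
external query-text file):

    entry = (pgssEntry *) hash_search(pgss_hash, &key, HASH_FIND, NULL);
    if (entry == NULL)
    {
        if (jstate == NULL)
            return;     /* entry vanished (e.g. stats reset) between parse
                         * analysis and executor end: discard the stats
                         * rather than storing the raw query text */
        entry = entry_alloc_normalized(&key, query, jstate);    /* made-up */
    }
    /* ... otherwise accumulate execution counters into entry->counters ... */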

Another thought, though it's not too relevant to this particular scenario
of intentional resets, is that we could raise the priority of entries
for statements-in-progress even further.  I notice for example that if
entry_alloc finds an existing hashtable entry, it does nothing to raise
the usage count of that entry.

 This is a bit counterintuitive if you rely on the query to be normalised,
 e.g. for privacy reasons where you don’t want to leak query constants like
 password hashes or usernames.

The bigger picture here is that relying on query normalization for privacy
doesn't seem like a bright idea.  Consider making sure that
security-relevant values are passed as parameters rather than being
embedded in the query text in the first place.
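
For illustration, a minimal sketch (table and column names are invented for
the example): an embedded literal can end up stored verbatim when
normalization is skipped, whereas a parameterized statement never carries the
value in its own text:

    -- embedded literal: may be stored verbatim if the entry is re-created
    -- from the raw query string (e.g. right after a reset):
    SELECT count(*) FROM accounts WHERE pw_hash = 'e3b0c44298fc1c149afb...';

    -- parameterized: the statement text only ever contains the placeholder:
    PREPARE count_by_hash(text) AS
        SELECT count(*) FROM accounts WHERE pw_hash = $1;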

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_shmem_allocations view

2014-05-06 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 Attached are two patches:
 a) Patch adding a 'name' parameter to dsm_create(). I think we should
apply this to 9.4.
 b) pg_dynamic_shmem_allocations and pg_static_shmem_allocations
views. The previous version didn't include dsm support and didn't
take the required lock.

 I am not so sure whether b) should be applied together with a) in 9.4,
 but I'd be happy enough to add docs if people agree with the naming.

FWIW, I vote for fixing (a) now but holding (b) for 9.5.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-05-06 Thread Peter Eisentraut
On 5/2/14, 2:22 PM, Stephen Frost wrote:
 I'm aware and I really am not convinced that pushing all of this to
 contrib modules using the hooks is the right approach- for one thing, it
 certainly doesn't seem to me that we've actually gotten a lot of
 traction from people to actually make use of them and keep them updated.
 We've had many of those hooks for quite a while.

What is there to update?  The stuff works and doesn't necessarily need
any changes.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-05-06 Thread Stephen Frost
* Peter Eisentraut (pete...@gmx.net) wrote:
 On 5/2/14, 2:22 PM, Stephen Frost wrote:
  I'm aware and I really am not convinced that pushing all of this to
  contrib modules using the hooks is the right approach- for one thing, it
  certainly doesn't seem to me that we've actually gotten a lot of
  traction from people to actually make use of them and keep them updated.
  We've had many of those hooks for quite a while.
 
 What is there to update?  The stuff works and doesn't necessarily need
 any changes.

I'm referring to auditing/logging systems built on top of those, not the
hooks themselves..

Thanks,

Stephen




Re: [HACKERS] Issue with GRANT/COMMENT ON FUNCTION with default

2014-05-06 Thread Peter Eisentraut
On 5/5/14, 4:09 PM, Jim Nasby wrote:
 They do not accept DEFAULT though:
 
 GRANT EXECUTE ON FUNCTION test(t text DEFAULT '') to public;
 ERROR:  syntax error at or near DEFAULT
 LINE 1: GRANT EXECUTE ON FUNCTION test(t text DEFAULT '') to public;
 
 Presumably this is just an oversight?

It appears to be specified that way in SQL.  The DEFAULT clause is not
part of the function signature.
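
For example, only the argument types are given (a minimal sketch using the
function from the message above):

    -- the DEFAULT clause belongs to CREATE FUNCTION, not to the signature:
    GRANT EXECUTE ON FUNCTION test(text) TO PUBLIC;
    COMMENT ON FUNCTION test(text) IS 'signature is argument types only';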



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] btree_gist valgrind warnings about uninitialized memory

2014-05-06 Thread Andres Freund
Hi,

I am not sure whether these are new for 9.4 but they're worth looking at
anyway:
Valgrind was run with:
 --trace-children=yes --quiet \
 --track-origins=yes --leak-check=no \
 --read-var-info=yes \
 --suppressions=/home/andres/src/postgresql/src/tools/valgrind.supp \
 postgres with options
All seem to be btree_bit's fault and they seem to have a common
origin. Didn't look at the code.

==27401== Conditional jump or move depends on uninitialised value(s)
==27401==at 0x4C2C812: bcmp (mc_replace_strmem.c:935)
==27401==by 0x8A8533: byteacmp (varlena.c:2735)
==27401==by 0x8D70A2: DirectFunctionCall2Coll (fmgr.c:1050)
==27401==by 0x180822C6: gbt_bitcmp (btree_bit.c:68)
==27401==by 0x18079C05: gbt_vsrt_cmp (btree_utils_var.c:430)
==27401==by 0x919F66: qsort_arg (qsort_arg.c:121)
==27401==by 0x91A531: qsort_arg (qsort_arg.c:186)
==27401==by 0x91A531: qsort_arg (qsort_arg.c:186)
==27401==by 0x18079E68: gbt_var_picksplit (btree_utils_var.c:484)
==27401==by 0x180825AC: gbt_bit_picksplit (btree_bit.c:180)
==27401==by 0x8D7AD3: FunctionCall2Coll (fmgr.c:1324)
==27401==by 0x497701: gistUserPicksplit (gistsplit.c:433)
==27401==  Uninitialised value was created by a heap allocation
==27401==at 0x4C274A0: malloc (vg_replace_malloc.c:291)
==27401==by 0x8F8EE1: AllocSetAlloc (aset.c:853)
==27401==by 0x8FAA71: palloc (mcxt.c:677)
==27401==by 0x18078E37: gbt_var_decompress (btree_utils_var.c:42)
==27401==by 0x8D79E5: FunctionCall1Coll (fmgr.c:1298)
==27401==by 0x48E4A6: gistdentryinit (gistutil.c:549)
==27401==by 0x498217: gistSplitByKey (gistsplit.c:644)
==27401==by 0x48BEEC: gistSplit (gist.c:1270)
==27401==by 0x48899D: gistplacetopage (gist.c:242)
==27401==by 0x48BAEE: gistinserttuples (gist.c:1134)
==27401==by 0x48BA73: gistinserttuple (gist.c:1093)
==27401==by 0x48A71F: gistdoinsert (gist.c:750)
==27401== 
==27401== Conditional jump or move depends on uninitialised value(s)
==27401==at 0x8A853B: byteacmp (varlena.c:2736)
==27401==by 0x8D70A2: DirectFunctionCall2Coll (fmgr.c:1050)
==27401==by 0x180822C6: gbt_bitcmp (btree_bit.c:68)
==27401==by 0x18079C05: gbt_vsrt_cmp (btree_utils_var.c:430)
==27401==by 0x919F66: qsort_arg (qsort_arg.c:121)
==27401==by 0x91A531: qsort_arg (qsort_arg.c:186)
==27401==by 0x91A531: qsort_arg (qsort_arg.c:186)
==27401==by 0x18079E68: gbt_var_picksplit (btree_utils_var.c:484)
==27401==by 0x180825AC: gbt_bit_picksplit (btree_bit.c:180)
==27401==by 0x8D7AD3: FunctionCall2Coll (fmgr.c:1324)
==27401==by 0x497701: gistUserPicksplit (gistsplit.c:433)
==27401==by 0x498491: gistSplitByKey (gistsplit.c:697)
==27401==  Uninitialised value was created by a heap allocation
==27401==at 0x4C274A0: malloc (vg_replace_malloc.c:291)
==27401==by 0x8F8EE1: AllocSetAlloc (aset.c:853)
==27401==by 0x8FAA71: palloc (mcxt.c:677)
==27401==by 0x18078E37: gbt_var_decompress (btree_utils_var.c:42)
==27401==by 0x8D79E5: FunctionCall1Coll (fmgr.c:1298)
==27401==by 0x48E4A6: gistdentryinit (gistutil.c:549)
==27401==by 0x498217: gistSplitByKey (gistsplit.c:644)
==27401==by 0x48BEEC: gistSplit (gist.c:1270)
==27401==by 0x48899D: gistplacetopage (gist.c:242)
==27401==by 0x48BAEE: gistinserttuples (gist.c:1134)
==27401==by 0x48BA73: gistinserttuple (gist.c:1093)
==27401==by 0x48A71F: gistdoinsert (gist.c:750)
==27401== 
==27401== Conditional jump or move depends on uninitialised value(s)
==27401==at 0x18079C0D: gbt_vsrt_cmp (btree_utils_var.c:431)
==27401==by 0x919F66: qsort_arg (qsort_arg.c:121)
==27401==by 0x91A531: qsort_arg (qsort_arg.c:186)
==27401==by 0x91A531: qsort_arg (qsort_arg.c:186)
==27401==by 0x18079E68: gbt_var_picksplit (btree_utils_var.c:484)
==27401==by 0x180825AC: gbt_bit_picksplit (btree_bit.c:180)
==27401==by 0x8D7AD3: FunctionCall2Coll (fmgr.c:1324)
==27401==by 0x497701: gistUserPicksplit (gistsplit.c:433)
==27401==by 0x498491: gistSplitByKey (gistsplit.c:697)
==27401==by 0x48BEEC: gistSplit (gist.c:1270)
==27401==by 0x48899D: gistplacetopage (gist.c:242)
==27401==by 0x48BAEE: gistinserttuples (gist.c:1134)
==27401==  Uninitialised value was created by a heap allocation
==27401==at 0x4C274A0: malloc (vg_replace_malloc.c:291)
==27401==by 0x8F8EE1: AllocSetAlloc (aset.c:853)
==27401==by 0x8FAA71: palloc (mcxt.c:677)
==27401==by 0x18078E37: gbt_var_decompress (btree_utils_var.c:42)
==27401==by 0x8D79E5: FunctionCall1Coll (fmgr.c:1298)
==27401==by 0x48E4A6: gistdentryinit (gistutil.c:549)
==27401==by 0x498217: gistSplitByKey (gistsplit.c:644)
==27401==by 0x48BEEC: gistSplit (gist.c:1270)
==27401==by 0x48899D: gistplacetopage (gist.c:242)
==27401==by 0x48BAEE: gistinserttuples (gist.c:1134)
==27401==by 0x48BA73: gistinserttuple (gist.c:1093)
==27401==by 0x48A71F: gistdoinsert (gist.c:750)
==27401== 
==27401== Conditional 

Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.4

2014-05-06 Thread Kevin Grittner
Bruce Momjian br...@momjian.us wrote:

 pgindent run for 9.4

 This includes removing tabs after periods in C comments, which was
 applied to back branches, so this change should not affect backpatching.

 http://git.postgresql.org/pg/commitdiff/0a7832005792fa6dad171f9cadb8d587fe0dd800

 .../test/expected/compat_informix-test_informix2.c |    2 +-
 src/interfaces/ecpg/test/expected/preproc-init.c   |    2 +-
 src/interfaces/ecpg/test/expected/sql-array.c      |    2 +-
 src/interfaces/ecpg/test/expected/sql-code100.c    |    2 +-
 src/interfaces/ecpg/test/expected/sql-copystdout.c |    2 +-
 src/interfaces/ecpg/test/expected/sql-define.c     |    2 +-
 src/interfaces/ecpg/test/expected/sql-dynalloc.c   |    2 +-
 src/interfaces/ecpg/test/expected/sql-dynalloc2.c  |    2 +-
 src/interfaces/ecpg/test/expected/sql-dyntest.c    |    2 +-
 src/interfaces/ecpg/test/expected/sql-indicators.c |    2 +-
 src/interfaces/ecpg/test/expected/thread-alloc.c   |    2 +-
 .../ecpg/test/expected/thread-descriptor.c         |    2 +-
 src/interfaces/ecpg/test/expected/thread-prep.c    |    2 +-

The 13 tests above are broken by this commit.  Probably the
directory should be excluded from pgindent processing.

While I haven't checked, I assume all other branches are affected.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.4

2014-05-06 Thread Tom Lane
Kevin Grittner kgri...@ymail.com writes:
 Bruce Momjian br...@momjian.us wrote:
 pgindent run for 9.4

 The 13 tests above are broken by this commit.  Probably the
 directory should be excluded from pgindent processing.

What's broken?  The buildfarm isn't complaining, and make installcheck
in src/interfaces/ecpg/test passes for me.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_statements: Query normalisation may fail during stats reset

2014-05-06 Thread Peter Geoghegan
On Tue, May 6, 2014 at 11:31 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The source code says that query strings are normalized on a best effort
 basis, so perhaps we ought to say the same in the documentation.

Perhaps.

 It would be rather expensive to provide a guarantee of normalization:
 basically, we'd have to compute the normalized query string during parsing
 *even when the hashtable entry already exists*, and then store it
 somewhere where it'd survive till ExecutorEnd (but, preferably, not be
 leaked if we never get to ExecutorEnd; which makes this hard).  I think
 most people would find that a bad tradeoff.

I certainly would.

 One cheap-and-dirty solution is to throw away the execution stats if we
 get to the end and find the hash table entry no longer exists, rather than
 make a new entry with a not-normalized string.  Not sure if that cure is
 better than the disease or not.

I am certain that it is. Consider long-running queries that don't
manage to get the benefit of the aggressive decay for sticky entries
technique, because there is consistent contention.

 Another thought, though it's not too relevant to this particular scenario
 of intentional resets, is that we could raise the priority of entries
 for statements-in-progress even further.  I notice for example that if
 entry_alloc finds an existing hashtable entry, it does nothing to raise
 the usage count of that entry.

To do otherwise would create an artificial prejudice against prepared
queries, though.

 This is a bit counterintuitive if you rely on the query to be normalised,
 e.g. for privacy reasons where you don’t want to leak query constants like
 password hashes or usernames.

 The bigger picture here is that relying on query normalization for privacy
 doesn't seem like a bright idea.  Consider making sure that
 security-relevant values are passed as parameters rather than being
 embedded in the query text in the first place.

I agree.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Wanted: jsonb on-disk representation documentation

2014-05-06 Thread Heikki Linnakangas
I'm reading the new jsonb code, trying to understand the on-disk 
representation. And I cannot make heads or tails of it.


My first entry point was jsonb.h. Jsonb struct is the on-disk 
representation, so I looked at the comments above that. No help, the 
comments are useless for getting an overview picture.


Help, anyone?

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_shmem_allocations view

2014-05-06 Thread Robert Haas
On Tue, May 6, 2014 at 2:34 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andres Freund and...@2ndquadrant.com writes:
 Attached are two patches:
 a) Patch adding a 'name' parameter to dsm_create(). I think we should
apply this to 9.4.
 b) pg_dynamic_shmem_allocations and pg_static_shmem_allocations
views. The previous version didn't include dsm support and didn't
take the required lock.

 I am not so sure whether b) should be applied together with a) in 9.4,
 but I'd be happy enough to add docs if people agree with the naming.

 FWIW, I vote for fixing (a) now but holding (b) for 9.5.

I guess I'll vote for applying both.  I don't see a lot of risk, and I
think doing one with out the other is somewhat pointless.

Regarding patch 0002, I don't think we're using the term static
shmem anywhere else, so I vote for dropping the word static, so that
we have pg_get_shmem_allocations() and
pg_get_dynamic_shmem_allocations().  Also, I think using the
allocated column is pretty weird.  There are always exactly two
entries with allocated = false, one of which is for actual free memory
and the other of which is for memory that actually IS allocated but
without using ShmemIndex (maybe the latter was supposed to have
allocated = true but still key = null?).  I guess I'd vote for
ditching the allocated column completely and outputting the memory
allocated without ShmemIndex using some fixed tag (like ShmemIndex
or Bootstrap or Overhead or something).
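
For reference, a rough sketch of how the proposed function might be queried.
The function name is taken from this message; the column names (key, size,
allocated) are assumptions based on this thread and may not match the patch
exactly:

    SELECT key, size, allocated
      FROM pg_get_shmem_allocations()
     ORDER BY size DESC
     LIMIT 10;   -- largest fixed shared memory allocations first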

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.4

2014-05-06 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kgri...@ymail.com writes:

 Bruce Momjian br...@momjian.us wrote:
 pgindent run for 9.4

 The 13 tests above are broken by this commit.  Probably the
 directory should be excluded from pgindent processing.

 What's broken?  The buildfarm isn't complaining, and make installcheck
 in src/interfaces/ecpg/test passes for me.

On make check-world I get the attached.  After the period, the
trailing tabs in the comment have been changed to spaces in
expected, but are still tabs in results.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

regression.diffs
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.4

2014-05-06 Thread Bruce Momjian
On Tue, May  6, 2014 at 12:38:41PM -0700, Kevin Grittner wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
  Kevin Grittner kgri...@ymail.com writes:
 
  Bruce Momjian br...@momjian.us wrote:
  pgindent run for 9.4
 
  The 13 tests above are broken by this commit.  Probably the
  directory should be excluded from pgindent processing.
 
  What's broken?  The buildfarm isn't complaining, and make installcheck
  in src/interfaces/ecpg/test passes for me.
 
 On make check-world I get the attached.  After the period, the
 trailing tabs in the comment have been changed to spaces in
 expected, but are still tabs in results.

Yes, I had to modify those lines before I pushed the pgindent changes so
'make installcheck-world' would pass.  It passes here for me now.  Did
you do 'make maintainer-clean' before running the tests?  That might
help.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Jeff Janes
On Tue, May 6, 2014 at 7:18 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Simon Riggs si...@2ndquadrant.com writes:
  Let's fix e_c_s at 25% of shared_buffers and remove the parameter
  completely, just as we do with so many other performance parameters.

 Apparently, you don't even understand what this parameter is for.
 Setting it smaller than shared_buffers is insane.


The e_c_s is assumed to be usable for each backend trying to run queries
sensitive to it.  If you have dozens of such queries running simultaneously
(not something I personally witness, but also not insane) and each of these
queries has its own peculiar working set, then having e_c_s smaller than
s_b makes sense.

I have a hard time believing that this is at all common, however.  Certainly
not common enough to justify cranking the setting all the way in the other
direction and then removing the crank handle.
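
As a concrete sketch, on a hypothetical 64GB machine (the 25% figure is the
rule of thumb referenced elsewhere in this thread; the 75% figure is only an
illustrative assumption, not advice from this message):

    -- postgresql.conf sketch:
    --   shared_buffers       = 16GB   -- ~25% of RAM
    --   effective_cache_size = 48GB   -- estimate of total cache (OS + shared_buffers)
    -- effective_cache_size allocates no memory; it only feeds index-scan costing:
    SHOW effective_cache_size;
    SET effective_cache_size = '48GB';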

Cheers,

Jeff


Re: [HACKERS] pg_shmem_allocations view

2014-05-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, May 6, 2014 at 2:34 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 FWIW, I vote for fixing (a) now but holding (b) for 9.5.

 I guess I'll vote for applying both.  I don't see a lot of risk, and I
 think doing one with out the other is somewhat pointless.

The difference is that there's not consensus about the details of the
views ... as borne out by your next paragraph.

Now admittedly, we could always redefine the views in 9.5, but
I'd rather not be doing this sort of thing in haste.  Something
as user-visible as a system view really ought to have baked awhile
before we ship it.  Patch (a) is merely institutionalizing the
expectation that DSM segments should have names, which is a much
lower-risk bet.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.4

2014-05-06 Thread Kevin Grittner
Bruce Momjian br...@momjian.us wrote:
 On Tue, May  6, 2014 at 12:38:41PM -0700, Kevin Grittner wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kgri...@ymail.com writes:

 Bruce Momjian br...@momjian.us wrote:
 pgindent run for 9.4

 The 13 tests above are broken by this commit.  Probably the
 directory should be excluded from pgindent processing.

 What's broken?  The buildfarm isn't complaining, and make installcheck
 in src/interfaces/ecpg/test passes for me.

 On make check-world I get the attached.  After the period, the
 trailing tabs in the comment have been changed to spaces in
 expected, but are still tabs in results.

 Yes, I had to modify those lines before I pushed the pgindent changes so
 'make installcheck-world' would pass.  It passes here for me now.  Did
 you do 'make maintainer-clean' before running the tests?  That might
 help.

It occurred to me after my last post that I had just done a make
world without any cleanup when I pulled that, and had started a
full build from make maintainer-clean when you sent that.  :-)

I'll let you know either way when I get results from that.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wanted: jsonb on-disk representation documentation

2014-05-06 Thread Andres Freund
On May 6, 2014 9:30:15 PM CEST, Heikki Linnakangas hlinnakan...@vmware.com 
wrote:
I'm reading the new jsonb code, trying to understand the on-disk 
representation. And I cannot make heads or tails of it.

My first entry point was jsonb.h. Jsonb struct is the on-disk 
representation, so I looked at the comments above that. No help, the 
comments are useless for getting an overview picture.

Help, anyone?

Enthusiastically seconded. I've asked for that about three times without much 
success. If it had been my decision the patch wouldn't have been merged without 
that and other adjustments.

Andres

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.4

2014-05-06 Thread Kevin Grittner
Kevin Grittner kgri...@ymail.com wrote:

 It occurred to me after my last post that I had just done a make
 world without any cleanup when I pulled that, and had started a
 full build from make maintainer-clean when you sent that.  :-)

 I'll let you know either way when I get results from that.

Yeah, after make maintainer-clean it's fine.  Another case where
--enable-depend doesn't handle things.  I don't know whether we can
or should try to fix that.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.4

2014-05-06 Thread Tom Lane
Kevin Grittner kgri...@ymail.com writes:
 Bruce Momjian br...@momjian.us wrote:
 Yes, I had to modify those lines before I pushed the pgindent changes so
 'make installcheck-world' would pass.  It passes here for me now.  Did
 you do 'make maintainer-clean' before running the tests?  That might
 help.

 It occurred to me after my last post that I had just done a make
 world without any cleanup when I pulled that, and had started a
 full build from make maintainer-clean when you sent that.  :-)

FWIW, I did a make distclean before pulling the update, which is
my usual habit, and it worked fine.  A look at the make rules for
ecpg suggests that make clean is enough to get rid of all the
derived files for the tests.

But having said that, if this didn't work then there's something broken
about the make rules for the ecpg tests.  I'm a bit suspicious of commit
69e9768e7b183d4b276d0e067a5a689580eb.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.4

2014-05-06 Thread Bruce Momjian
On Tue, May  6, 2014 at 03:54:24PM -0400, Tom Lane wrote:
 Kevin Grittner kgri...@ymail.com writes:
  Bruce Momjian br...@momjian.us wrote:
  Yes, I had to modify those lines before I pushed the pgindent changes so
  'make installcheck-world' would pass.  It passes here for me now.  Did
  you do 'make maintainer-clean' before running the tests?  That might
  help.
 
  It occurred to me after my last post that I had just done a make
  world without any cleanup when I pulled that, and had started a
  full build from make maintainer-clean when you sent that.  :-)
 
 FWIW, I did a make distclean before pulling the update, which is
 my usual habit, and it worked fine.  A look at the make rules for
 ecpg suggests that make clean is enough to get rid of all the
 derived files for the tests.
 
 But having said that, if this didn't work then there's something broken
 about the make rules for the ecpg tests.  I'm a bit suspicious of commit
 69e9768e7b183d4b276d0e067a5a689580eb.

What _is_ odd is that I had to change these files after the pgindent run
in head, but _not_ in the back branches when I removed the tabs from
comments.  I assume there is something new in 9.4 about the way they
are built.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)

2014-05-06 Thread Bruce Momjian
On Tue, Apr 22, 2014 at 08:50:20PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Where are we on the default JSONB opclass change?
 
 Not sure.  I'm for changing it, I think, but it wasn't at all clear
 that we had consensus on that.  We did not have a proposed new name
 for the opclass either ...

Where are we on this question?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sending out a request for more buildfarm animals?

2014-05-06 Thread Tomas Vondra
On 4.5.2014 21:29, Tomas Vondra wrote:
 On 3.5.2014 19:01, Andrew Dunstan wrote:

 On 05/03/2014 12:42 PM, Tomas Vondra wrote:
 On 3.5.2014 03:07, Noah Misch wrote:
 More coverage of non-gcc compilers would be an asset to the buildfarm.
 Does that include non-gcc compilers on Linux/x86 platforms?

 Magpie is pretty much dedicated to the buildfarm, and it's pretty much
 doing nothing most of the time, so running the tests with other
 compilers  (llvm/ic/...) would be just fine. Not sure how to do that,
 though. Should I run the tests with multiple configurations, or should
 we have one animal for each config?


 No, don't run with multiple configs. That makes it much harder to see
 where problems come from. One animal per config, please.
 
 Yeah, that's what I thought.
 
 I've requested another animal for clang, I'll do the same with the intel
 compiler once I get the clang one running.
 
 Are there any other compilers / something else we could run on this box?

OK, both new animals are up and apparently running - treepie for clang,
fulmar for icc.

I recall there was a call for more animals with CLOBBER_CACHE_ALWAYS
some time ago, so I went and enabled that on all three animals. Let's
see how long that will take.

I see there are more 'clobber' options in the code: CLOBBER_FREED_MEMORY
and CLOBBER_CACHE_RECURSIVELY. Would that be a good idea to enable these
as well?

The time requirements will be much higher (especially for the
RECURSIVELY option), but running that once a week shouldn't be a big
deal - the machine is pretty much dedicated to the buildfarm.

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wanted: jsonb on-disk representation documentation

2014-05-06 Thread Bruce Momjian
On Tue, May  6, 2014 at 09:48:04PM +0200, Andres Freund wrote:
 On May 6, 2014 9:30:15 PM CEST, Heikki Linnakangas hlinnakan...@vmware.com 
 wrote:
 I'm reading the new jsonb code, trying to understand the on-disk 
 representation. And I cannot make heads or tails of it.
 
 My first entry point was jsonb.h. Jsonb struct is the on-disk 
 representation, so I looked at the comments above that. No help, the 
 comments are useless for getting an overview picture.
 
 Help, anyone?
 
  Enthusiastically seconded. I've asked for that about three times without much 
 success. If it had been my decision the patch wouldn't have been merged 
 without that and other adjustments.

I also would like to know what the index-everything hash ops does?  Does
it index the keys, values, or both?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wanted: jsonb on-disk representation documentation

2014-05-06 Thread Peter Geoghegan
On Tue, May 6, 2014 at 1:06 PM, Bruce Momjian br...@momjian.us wrote:
 I also would like to know what the index-everything hash ops does?  Does
 it index the keys, values, or both?

It indexes both, but it isn't possible to test existence (of a key)
with the hash GIN opclass.
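
A sketch of that distinction (the docs table is hypothetical; jsonb_hash_ops
is the name the hash-based opclass goes by at this point):

    CREATE TABLE docs (j jsonb);
    CREATE INDEX docs_j_hash_idx ON docs USING gin (j jsonb_hash_ops);

    SELECT * FROM docs WHERE j @> '{"tag": "pg"}';  -- containment: indexable
    SELECT * FROM docs WHERE j ? 'tag';             -- key existence: not supported by this opclass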

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)

2014-05-06 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Tue, Apr 22, 2014 at 08:50:20PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
 Where are we on the default JSONB opclass change?

 Not sure.  I'm for changing it, I think, but it wasn't at all clear
 that we had consensus on that.  We did not have a proposed new name
 for the opclass either ...

 Where are we on this question?

Stuck on the naming question.  I'd be willing to do the patch legwork
if we had a consensus (or even a proposal) for what to rename the
current jsonb_ops to.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sending out a request for more buildfarm animals?

2014-05-06 Thread Tom Lane
Tomas Vondra t...@fuzzy.cz writes:
 I recall there was a call for more animals with CLOBBER_CACHE_ALWAYS
 some time ago, so I went and enabled that on all three animals. Let's
 see how long that will take.

 I see there are more 'clobber' options in the code: CLOBBER_FREED_MEMORY
 and CLOBBER_CACHE_RECURSIVELY. Would that be a good idea to enable these
 as well?

 The time requirements will be much higher (especially for the
 RECURSIVELY option), but running that once a week shouldn't be a big
 deal - the machine is pretty much dedicated to the buildfarm.

I've never had the patience to run the regression tests to completion with 
CLOBBER_CACHE_RECURSIVELY at all, let alone do it on a regular basis.
(I wonder if there's some easy way to run it for just a few regression
tests...)

I think testing CLOBBER_FREED_MEMORY would be sensible though.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wanted: jsonb on-disk representation documentation

2014-05-06 Thread Peter Geoghegan
On Tue, May 6, 2014 at 12:48 PM, Andres Freund and...@anarazel.de wrote:
 Enthusiastically seconded. I've asked for that about three times without much 
 success. If it had been my decision the patch wouldn't have been merged 
 without that and other adjustments.

I'm almost certain that the only feedback of yours that I didn't
incorporate was that I didn't change the name of JsonbValue, a
decision I stand by, and also that I didn't add ascii art to
illustrate the on-disk format. I can write a patch that adds the
latter soon.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wanted: jsonb on-disk representation documentation

2014-05-06 Thread Oleg Bartunov
FYI,
http://obartunov.livejournal.com/178495.html

This is a hash-based GIN opclass for hstore with support for all operators.
It's a pity we had no time to do the same for jsonb, but we may include
it and a couple of other opclasses in contrib/jsonx.

Oleg

On Wed, May 7, 2014 at 12:18 AM, Peter Geoghegan p...@heroku.com wrote:
 On Tue, May 6, 2014 at 1:06 PM, Bruce Momjian br...@momjian.us wrote:
 I also would like to know what the index-everything hash ops does?  Does
 it index the keys, values, or both?

 It indexes both, but it isn't possible to test existence (of a key)
 with the hash GIN opclass.

 --
 Peter Geoghegan


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Simon Riggs
On 6 May 2014 18:08, Josh Berkus j...@agliodbs.com wrote:
 On 05/06/2014 08:41 AM, Simon Riggs wrote:
 On 6 May 2014 15:18, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  Let's fix e_c_s at 25% of shared_buffers and remove the parameter
 completely, just as we do with so many other performance parameters.

 Apparently, you don't even understand what this parameter is for.
 Setting it smaller than shared_buffers is insane.

 You know you can't justify that comment and so do I. What workload is
 so badly affected as to justify use of the word insane in this
 context?

 Most of them?  Really?

I didn't use the word most anywhere. So not really clear what you are saying.


 I have to tell you, your post sounds like you've missed out on the last
 12 years of PostgreSQL query tuning.  Which is a little shocking
 considering where you've spent that 12 years.

 I read the code, think what to say and then say what I think, not
rely on dogma.

I tried to help years ago by changing the docs on e_c_s, but that's
been mostly ignored down the years, as it is again here.


 I can read code. But it appears nobody apart from me actually does, or
 at least understand the behaviour that results.

 So, break it down for us: explain how we'll get desirable query plans
 out of the current code if:

 (1) Table & Index is larger than shared_buffers;
 (2) Table & Index is smaller than RAM;
 (3) Selectivity is 0.02
 (4) ECS is set lower than shared_buffers

Is that it? The above use case is the basis for a default setting??

It's a circular argument, since you're assuming we've all followed
your advice of setting shared_buffers to 25% of RAM, which then
presumes a large gap between (1) and (2). It also ignores that if ECS
is set low then it increases the cost, but does not actually preclude
index scans larger than that setting. It also ignores that if your
database fits in RAM, your random_page_cost setting is wrong and
lowering that appropriately will increase the incidence of index scans
again.
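
(As a concrete sketch of the adjustment being referred to; the value is
illustrative only:)

    -- when the working set is known to be cached, bring random_page_cost
    -- down toward seq_page_cost so index scans are costed realistically:
    SET random_page_cost = 1.1;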

You should also include

(5) You're only running one query at a time (which you know, how?)
(6) You don't care if you flush your cache for later queries
(7) You've got big tables yet are not partitioning them effectively

 I think the current cost math does a pretty good job of choosing the
 correct behavior if ECS is set correctly.  But if it's not, no.

 If I'm wrong, then you've successfully found a bug in our costing math,
 so I'd love to see it.

Setting it high generates lovely EXPLAINs for a single query, but do
we have any evidence that whole workloads are better off with higher
settings? And that represents the general case?

And it makes sense even if it makes it bigger than actual RAM??

If you assume that you can use all of that memory, you're badly wrong.
Presumably you also set work_mem larger than shared_buffers, since
that will induce exactly the same behaviour and have the same
downsides. (Large memory usage for single query, but causes cache
churn, plus problems if we try to overuse RAM because of concurrent
usage).

In the absence of  performance measurements that show the genuine
effect on workloads, I am attempting to make a principle-based
argument. I suggested 25% of shared_buffers because we already use
that as the point where other features cut in to minimise cache churn.
I'm making the argument that if *that* setting is the right one to
control cache churn, then why is it acceptable for index scans to
churn up even bigger chunks of cache?

In case it wasn't clear, I am only suggesting 25% of shared_buffers
for large settings, not for micro-configurations. My proposal to
remove the setting completely was a rhetorical question, asking why we
have a setting for this parameter and yet no tunables for other
things.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Simon Riggs
On 6 May 2014 20:41, Jeff Janes jeff.ja...@gmail.com wrote:

 The e_c_s is assumed to be usable for each backend trying to run queries
 sensitive to it.  If you have dozens of such queries running simultaneously
 (not something I personally witness, but also not insane) and each of these
 queries has its own peculiar working set, then having e_c_s smaller than s_b
 makes sense.

 I have a hard time believing that this is at all common, however.

If larger queries are frequent enough to care about, they will happen together.

We should be acting conservatively with default settings. You can be
as aggressive as you like with your own config.

 Certainly
 not common enough to justify cranking the setting all the way in the other
 direction and then removing the crank handle.

Yes, that part was mostly rhetorical, I wasn't arguing for complete
removal, especially when autotuning is unclear.

I am worried about people that set effective_cache_size but not
shared_buffers, which is too common. If we link the two parameters it
should work in both directions by default.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sending out a request for more buildfarm animals?

2014-05-06 Thread Tomas Vondra
On 6.5.2014 22:24, Tom Lane wrote:
 Tomas Vondra t...@fuzzy.cz writes:
 I recall there was a call for more animals with CLOBBER_CACHE_ALWAYS
 some time ago, so I went and enabled that on all three animals. Let's
 see how long that will take.
 
 I see there are more 'clobber' options in the code: CLOBBER_FREED_MEMORY
 and CLOBBER_CACHE_RECURSIVELY. Would that be a good idea to enable these
 as well?
 
 The time requirements will be much higher (especially for the
 RECURSIVELY option), but running that once a week shouldn't be a big
 deal - the machine is pretty much dedicated to the buildfarm.
 
 I've never had the patience to run the regression tests to completion
 with CLOBBER_CACHE_RECURSIVELY at all, let alone do it on a regular 
 basis. (I wonder if there's some easy way to run it for just a few 
 regression tests...)

Now, that's a challenge ;-)

 
 I think testing CLOBBER_FREED_MEMORY would be sensible though.

OK, I've enabled this for now.

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_shmem_allocations view

2014-05-06 Thread Simon Riggs
On 6 May 2014 20:44, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, May 6, 2014 at 2:34 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 FWIW, I vote for fixing (a) now but holding (b) for 9.5.

 I guess I'll vote for applying both.  I don't see a lot of risk, and I
 think doing one with out the other is somewhat pointless.

 The difference is that there's not consensus about the details of the
 views ... as borne out by your next paragraph.

 Now admittedly, we could always redefine the views in 9.5, but
 I'd rather not be doing this sort of thing in haste.  Something
 as user-visible as a system view really ought to have baked awhile
 before we ship it.  Patch (a) is merely institutionalizing the
 expectation that DSM segments should have names, which is a much
 lower-risk bet.

As long as all the functions are exposed to allow b) to run as an
extension, I don't see we lose anything by waiting a while.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.4

2014-05-06 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Tue, May  6, 2014 at 03:54:24PM -0400, Tom Lane wrote:
 But having said that, if this didn't work then there's something broken
 about the make rules for the ecpg tests.  I'm a bit suspicious of commit
 69e9768e7b183d4b276d0e067a5a689580eb.

I looked into this, and find that the cause of the problem is that
pgindent touched src/interfaces/ecpg/include/sqlca.h, which is copied
verbatim into preprocessed files by the ecpg preprocessor, so the expected
files had to change in tandem.  This amounts to a dependency, but the make
rules don't know about it.  Should they?  That particular file changes so
seldom that it'd hardly be worth worrying about, but I'm not sure which
other files can get copied similarly.

 What _is_ odd is that I had to change these files after the pgindent run
 in head, but _not_ in the back branches when I removed the tabs from
 comments.  I assume there is something new in 9.4 about they way they
 are built.

I'm confused by this statement.  Your tab-adjustment commits in the back
branches also touched both sqlca.h and the ecpg expected files.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.4

2014-05-06 Thread Bruce Momjian
On Tue, May  6, 2014 at 05:05:00PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  On Tue, May  6, 2014 at 03:54:24PM -0400, Tom Lane wrote:
  But having said that, if this didn't work then there's something broken
  about the make rules for the ecpg tests.  I'm a bit suspicious of commit
  69e9768e7b183d4b276d0e067a5a689580eb.
 
 I looked into this, and find that the cause of the problem is that
 pgindent touched src/interfaces/ecpg/include/sqlca.h, which is copied
 verbatim into preprocessed files by the ecpg preprocessor, so the expected
 files had to change in tandem.  This amounts to a dependency, but the make
 rules don't know about it.  Should they?  That particular file changes so
 seldom that it'd hardly be worth worrying about, but I'm not sure which
 other files can get copied similarly.
 
  What _is_ odd is that I had to change these files after the pgindent run
  in head, but _not_ in the back branches when I removed the tabs from
  comments.  I assume there is something new in 9.4 about they way they
  are built.
 
 I'm confused by this statement.  Your tab-adjustment commits in the back
 branches also touched both sqlca.h and the ecpg expected files.

They probably did in the back branches as I hit _all_ C files.  I wonder
if pgindent somehow skipped some of them.

Ah, found it.  There is an excludes pattern file list I had forgotten
about;  it has:

/s_lock\.h$
/ecpg/test/expected/
/snowball/libstemmer/
/ecpg/include/(sqlda|sqltypes)\.h$
/ecpg/include/preproc/struct\.h$
/pl/plperl/ppport\.h$

I am thinking I should back out the tab/comment changes in those files
in the back branches, though I would then need to adjust the ecpg
regression tests.  In practice, these files are rarely patched, so it
might be fine to just leave them alone.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)

2014-05-06 Thread Bruce Momjian
On Tue, May  6, 2014 at 04:18:50PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  On Tue, Apr 22, 2014 at 08:50:20PM -0400, Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
  Where are we on the default JSONB opclass change?
 
  Not sure.  I'm for changing it, I think, but it wasn't at all clear
  that we had consensus on that.  We did not have a proposed new name
  for the opclass either ...
 
  Where are we on this question?
 
 Stuck on the naming question.  I'd be willing to do the patch legwork
 if we had a consensus (or even a proposal) for what to rename the
 current jsonb_ops to.

Well, then, we only have a few days to come up with a name.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.4

2014-05-06 Thread Tom Lane
I wrote:
 I looked into this, and find that the cause of the problem is that
 pgindent touched src/interfaces/ecpg/include/sqlca.h, which is copied
 verbatim into preprocessed files by the ecpg preprocessor, so the expected
 files had to change in tandem.  This amounts to a dependency, but the make
 rules don't know about it.  Should they?  That particular file changes so
 seldom that it'd hardly be worth worrying about, but I'm not sure which
 other files can get copied similarly.

While I'm looking at it: there's no dependency forcing the test .c files
to get rebuilt after the ecpg preprocessor changes, either, and that
seems much more likely to be a routine problem.

Arguably, we need some more dependencies in this rule in
ecpg/test/Makefile.regress:

%.c: %.pgc ../regression.h
	$(ECPG) -o $@ -I$(srcdir) $<

I also notice that some of the subdirectory makefiles that include
Makefile.regress have custom build rules that seem mostly duplicative
of this one, except for passing different switches to ecpg.  Those
would likewise need additions to their dependency lists, which suggests
that the ../regression.h part ought to be wrapped up in some macro.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   >