Re: [HACKERS] Regression tests vs existing users in an installation

2016-07-17 Thread Michael Paquier
On Mon, Jul 18, 2016 at 10:37 AM, Robert Haas  wrote:
> On Sat, Jul 16, 2016 at 11:38 AM, Tom Lane  wrote:
> We could also do this by loading a C module during the regression
> tests, which seems maybe less ugly than adding a GUC.
> I don't particularly like your suggestion of spooky action at a
> distance between force_parallel_mode and regression_test_mode.  That
> just seems kooky.

One downside of the plugin approach is that any user wanting to run
make installcheck would need to install it as well.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reviewing freeze map code

2016-07-17 Thread Amit Kapila
On Mon, Jul 18, 2016 at 9:13 AM, Andres Freund  wrote:
> On 2016-07-18 09:07:19 +0530, Amit Kapila wrote:
>> + /*
>> + * Before locking the buffer, pin the visibility map page if it may be
>> + * necessary.
>> + */
>>
>> + if (PageIsAllVisible(BufferGetPage(*buffer)))
>> + visibilitymap_pin(relation, block, &vmbuffer);
>> +
>>   LockBuffer(*buffer, BUFFER_LOCK_EXCLUSIVE);
>>
>> I think we need to check for PageIsAllVisible and try to pin the
>> visibility map after taking the lock on buffer. I think it is quite
>> possible that in the time this routine tries to acquire lock on
>> buffer, the page becomes all visible.
>
> I don't see how. Without a cleanup lock it's not possible to mark a page
> all-visible/frozen.
>

Consider the below scenario.

Vacuum
a. acquires a cleanup lock for page - 10
b. busy in checking visibility of tuples
-- assume it takes some time here, and in the meantime Session-1
performs steps (a) and (b) and starts waiting in step (c)
c. marks the page as all-visible (PageSetAllVisible)
d. unlock and release the buffer

Session-1
a. In heap_lock_tuple(), readbuffer for page-10
b. check PageIsAllVisible(), found page is not all-visible, so didn't
acquire the visibilitymap_pin
c. LockBuffer in ExclusiveMode - here it will wait for vacuum to
release the lock
d. Got the lock, but now the page is marked as all-visible, so ideally
we need to recheck the page and acquire the visibilitymap_pin




-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Reviewing freeze map code

2016-07-17 Thread Andres Freund
On 2016-07-18 09:07:19 +0530, Amit Kapila wrote:
> + /*
> + * Before locking the buffer, pin the visibility map page if it may be
> + * necessary.
> + */
> 
> + if (PageIsAllVisible(BufferGetPage(*buffer)))
> + visibilitymap_pin(relation, block, &vmbuffer);
> +
>   LockBuffer(*buffer, BUFFER_LOCK_EXCLUSIVE);
> 
> I think we need to check for PageIsAllVisible and try to pin the
> visibility map after taking the lock on buffer. I think it is quite
> possible that in the time this routine tries to acquire lock on
> buffer, the page becomes all visible.

I don't see how. Without a cleanup lock it's not possible to mark a page
all-visible/frozen. We might miss the bit becoming unset concurrently,
but that's ok.

Andres




Re: [HACKERS] Reviewing freeze map code

2016-07-17 Thread Andres Freund
On 2016-07-17 23:34:01 -0400, Robert Haas wrote:
> Thanks very much for working on this.  Random suggestions after a quick look:
> 
> + * Before locking the buffer, pin the visibility map page if it may be
> + * necessary.
> 
> s/necessary/needed/
> 
> More substantively, what happens if the situation changes before we
> obtain the buffer lock?  I think you need to release the page lock,
> pin the page after all, and then relock the page.

It shouldn't be able to. Cleanup locks, which are required for
vacuumlazy to do anything relevant, aren't possible with the buffer
pinned.  This pattern is used in heap_delete/heap_update, so I think
we're on a reasonably well trodden path.

> There seem to be several ways to escape from this function without
> releasing the pin on vmbuffer.  From the visibilitymap_pin call here,
> search downward for "return".

Hm, that's clearly not good.

The best way to address that seems to be to create a
separate jump label, which checks vmbuffer and releases the page
lock. Unless you have a better idea.

> + *  visibilitymap_clear - clear bit(s) for one page in visibility map
> 
> I don't really like the parenthesized-s convention as a shorthand for
> "one or more".  It tends to confuse non-native English speakers.
> 
> + * any I/O.  Returns whether any bits have been cleared.
> 
> I suggest "Returns true if any bits have been cleared and false otherwise".

Will change.

- Andres




Re: [HACKERS] Reviewing freeze map code

2016-07-17 Thread Amit Kapila
On Mon, Jul 18, 2016 at 8:18 AM, Andres Freund  wrote:
> On 2016-07-16 10:45:26 -0700, Andres Freund wrote:
>>
>>
>> On July 16, 2016 8:49:06 AM PDT, Tom Lane  wrote:
>> >Amit Kapila  writes:
>> >> On Sat, Jul 16, 2016 at 7:02 AM, Andres Freund  wrote:
>> >>> I think we have two choices how to deal with that: First, we can
>> >>> add a new flags variable to xl_heap_lock similar to
>> >>> xl_heap_insert/update/... and bump page magic,
>> >
>> >> +1 for going in this way.  This will keep us consistent with how we
>> >> clear the visibility info in other places like heap_xlog_update().
>> >
>> >Yeah.  We've already forced a catversion bump for beta3, and I'm about
>> >to go fix PG_CONTROL_VERSION as well, so there's basically no downside
>> >to doing an xlog version bump as well.  At least, not if you can get it
>> >in before Monday.
>>
>> OK, Cool. Will do it later today.
>
> Took till today. Attached is a rather heavily revised version of
> Sawada-san's patch. Most notably the recovery routines take care to
> reset the vm in all cases, we don't perform visibilitymap_get_status
> from inside a critical section anymore, and
> heap_lock_updated_tuple_rec() also resets the vm (although I'm not
> entirely sure that can practically be hit).
>


@@ -4563,8 +4579,18 @@ heap_lock_tuple(Relation relation, HeapTuple tuple,

+ /*
+ * Before locking the buffer, pin the visibility map page if it may be
+ * necessary.
+ */

+ if (PageIsAllVisible(BufferGetPage(*buffer)))
+ visibilitymap_pin(relation, block, &vmbuffer);
+
  LockBuffer(*buffer, BUFFER_LOCK_EXCLUSIVE);

I think we need to check for PageIsAllVisible and try to pin the
visibility map after taking the lock on buffer. I think it is quite
possible that in the time this routine tries to acquire lock on
buffer, the page becomes all visible.  To avoid a similar hazard, we
do check the visibility of the page after acquiring the buffer lock in
heap_update(), at the place below.

if (vmbuffer == InvalidBuffer && PageIsAllVisible(page))


Similarly, I think heap_lock_updated_tuple_rec() needs to take care of
the same.  While I was typing this e-mail, it seems Robert has already
pointed out the same issue.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Reviewing freeze map code

2016-07-17 Thread Robert Haas
On Sun, Jul 17, 2016 at 10:48 PM, Andres Freund  wrote:
> Took till today. Attached is a rather heavily revised version of
> Sawada-san's patch. Most notably the recovery routines take care to
> reset the vm in all cases, we don't perform visibilitymap_get_status
> from inside a critical section anymore, and
> heap_lock_updated_tuple_rec() also resets the vm (although I'm not
> entirely sure that can practically be hit).
>
> I'm doing some more testing, and Robert said he could take a quick look
> at the patch. If somebody else... Will push sometime after dinner.

Thanks very much for working on this.  Random suggestions after a quick look:

+ * Before locking the buffer, pin the visibility map page if it may be
+ * necessary.

s/necessary/needed/

More substantively, what happens if the situation changes before we
obtain the buffer lock?  I think you need to release the page lock,
pin the page after all, and then relock the page.

There seem to be several ways to escape from this function without
releasing the pin on vmbuffer.  From the visibilitymap_pin call here,
search downward for "return".

+ *  visibilitymap_clear - clear bit(s) for one page in visibility map

I don't really like the parenthesized-s convention as a shorthand for
"one or more".  It tends to confuse non-native English speakers.

+ * any I/O.  Returns whether any bits have been cleared.

I suggest "Returns true if any bits have been cleared and false otherwise".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Reviewing freeze map code

2016-07-17 Thread Andres Freund
On 2016-07-16 10:45:26 -0700, Andres Freund wrote:
> 
> 
> On July 16, 2016 8:49:06 AM PDT, Tom Lane  wrote:
> >Amit Kapila  writes:
> >> On Sat, Jul 16, 2016 at 7:02 AM, Andres Freund  wrote:
> >>> I think we have two choices how to deal with that: First, we can
> >>> add a new flags variable to xl_heap_lock similar to
> >>> xl_heap_insert/update/... and bump page magic,
> >
> >> +1 for going in this way.  This will keep us consistent with how we
> >> clear the visibility info in other places like heap_xlog_update().
> >
> >Yeah.  We've already forced a catversion bump for beta3, and I'm about
> >to go fix PG_CONTROL_VERSION as well, so there's basically no downside
> >to doing an xlog version bump as well.  At least, not if you can get it
> >in before Monday.
> 
> OK, Cool. Will do it later today.

Took till today. Attached is a rather heavily revised version of
Sawada-san's patch. Most notably the recovery routines take care to
reset the vm in all cases, we don't perform visibilitymap_get_status
from inside a critical section anymore, and
heap_lock_updated_tuple_rec() also resets the vm (although I'm not
entirely sure that can practically be hit).

I'm doing some more testing, and Robert said he could take a quick look
at the patch. If somebody else... Will push sometime after dinner.

Regards,

Andres
From 26f6eff8cef9b436e328a7364d6e4954b702208b Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Sun, 17 Jul 2016 19:30:38 -0700
Subject: [PATCH] Clear all-frozen visibilitymap status when locking tuples.

Since a892234 & fd31cd265 the visibilitymap's freeze bit is used to
avoid vacuuming the whole relation in anti-wraparound vacuums. Doing so
correctly relies on not adding xids to the heap without also unsetting
the visibilitymap flag.  Tuple locking related code has not done so.

To avoid pessimizing heap_lock_tuple, allow the all-frozen bit to be
reset selectively with visibilitymap_clear(). To avoid having to use
visibilitymap_get_status (e.g. via VM_ALL_FROZEN) inside a critical
section, have visibilitymap_clear() return whether any bits have been
reset.

The added flags fields in xl_heap_lock and xl_heap_lock_updated
require bumping the WAL magic. Since there's already been a catversion
bump since the last beta, that's not an issue.

Author: Masahiko Sawada, heavily revised by Andres Freund
Discussion: CAEepm=3fwabwryvw9swhylty4sxvf0xblvxqowuodincx9m...@mail.gmail.com
Backpatch: -
---
 src/backend/access/heap/heapam.c| 126 +---
 src/backend/access/heap/visibilitymap.c |  18 +++--
 src/backend/access/rmgrdesc/heapdesc.c  |   6 +-
 src/backend/commands/vacuumlazy.c   |   6 +-
 src/include/access/heapam_xlog.h|   9 ++-
 src/include/access/visibilitymap.h  |   4 +-
 src/include/access/xlog_internal.h  |   2 +-
 7 files changed, 145 insertions(+), 26 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 2815d91..1216f3f 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2423,7 +2423,7 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 		PageClearAllVisible(BufferGetPage(buffer));
 		visibilitymap_clear(relation,
 			ItemPointerGetBlockNumber(&(heaptup->t_self)),
-			vmbuffer);
+			vmbuffer, VISIBILITYMAP_VALID_BITS);
 	}
 
 	/*
@@ -2737,7 +2737,7 @@ heap_multi_insert(Relation relation, HeapTuple *tuples, int ntuples,
 			PageClearAllVisible(page);
 			visibilitymap_clear(relation,
 BufferGetBlockNumber(buffer),
-vmbuffer);
+vmbuffer, VISIBILITYMAP_VALID_BITS);
 		}
 
 		/*
@@ -3239,7 +3239,7 @@ l1:
 		all_visible_cleared = true;
 		PageClearAllVisible(page);
 		visibilitymap_clear(relation, BufferGetBlockNumber(buffer),
-			vmbuffer);
+			vmbuffer, VISIBILITYMAP_VALID_BITS);
 	}
 
 	/* store transaction information of xact deleting the tuple */
@@ -3925,6 +3925,7 @@ l2:
 		TransactionId xmax_lock_old_tuple;
 		uint16		infomask_lock_old_tuple,
 	infomask2_lock_old_tuple;
+		bool		cleared_all_frozen = false;
 
 		/*
 		 * To prevent concurrent sessions from updating the tuple, we have to
@@ -3968,6 +3969,17 @@ l2:
 		/* temporarily make it look not-updated, but locked */
 		oldtup.t_data->t_ctid = oldtup.t_self;
 
+		/*
+		 * Clear all-frozen bit on visibility map if needed. We could
+		 * immediately reset ALL_VISIBLE, but given that the WAL logging
+		 * overhead would be unchanged, that doesn't seem necessarily
+		 * worthwhile.
+		 */
+		if (PageIsAllVisible(BufferGetPage(buffer)) &&
+			visibilitymap_clear(relation, block, vmbuffer,
+VISIBILITYMAP_ALL_FROZEN))
+			cleared_all_frozen = true;
+
 		MarkBufferDirty(buffer);
 
 		if (RelationNeedsWAL(relation))
@@ -3982,6 +3994,8 @@ l2:
 			xlrec.locking_xid = xmax_lock_old_tuple;
 			xlrec.infobits_set = 

Re: [HACKERS] One process per session lack of sharing

2016-07-17 Thread Jan Wieck
On Sun, Jul 17, 2016 at 9:28 PM, Robert Haas  wrote:

> On Sun, Jul 17, 2016 at 3:12 PM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> On Fri, Jul 15, 2016 at 4:28 AM, Craig Ringer 
> wrote:
> >>> I don't think anyone's considering moving from multi-processing to
> >>> multi-threading in PostgreSQL. I really, really like the protection
> that the
> >>> shared-nothing-by-default process model gives us, among other things.
> >
> >> We get some very important protection by having the postmaster in a
> >> separate address space from the user processes, but separating the
> >> other backends from each other has no value.
> >
> > I do not accept that proposition in the least.  For one thing, debugging
> > becomes an order of magnitude harder when you've got multiple threads
> > in the same address space: you have essentially zero guarantees about
> > what one thread might have done to the supposedly-private state of
> > another one.
>
> Well, that's true, in theory.  In practice, random memory clobbers are
> a pretty rare type of bug.  The chances that thread A crashed because
> thread B overwrote its supposedly-private state are just not very
> high.  Also, such bugs are extremely hard to troubleshoot even when
> there is only one process and one thread involved, so it's not like things
> are a rose garden today.  I don't buy the argument that it's worth
> giving up arbitrary amounts of performance and functionality for this.
>

Random memory clobbers are partly rare because they often aren't
triggered. Many of them are of the dangling-pointer type, and in a
single-threaded process there is less of a chance that the freed and
then reused memory gets allocated and overwritten in between.

But you are right, all of them are tricky to hunt and I remember using
hardware watch points and what not in the past. They make your day
though when you finally find them. So you will be sorry when the last
one is hunted down.



>
> >> ... enough other people have
> >> written complex, long-running multithreaded programs that I think it
> >> is probably possible to do so without unduly compromising reliability.
> >
> > I would bet that every single successful project of that sort has been
> > written with threading in mind from the get-go.  Trying to retro-fit
> > threading onto thirty years' worth of single-threaded coding is a recipe
> > for breaking your project; even if you had control of all the code
> running
> > in the address space, which we assuredly do not.
>
> I admit that it is risky, but I think there are things that could be
> done to limit the risk.  I don't believe we can indefinitely continue
> to ignore the potential performance benefits of making a switch like
> this.  Breaking a thirty-year old code base irretrievably would be
> sad, but letting it fade into irrelevance because we're not willing to
> make the architecture changes that are needed to remain relevant would
> be sad, too.
>

I have to agree with Robert on that one. We were "thinking" about
multi-threading some 16 years ago already. We were aware of the dangers
and yet at least considered doing it some day for things like a parallel
executor. And that would probably still be our best bang for the buck.

The risks of jamming all sessions into a single, multi-threaded process
are huge. Think of snapshot visibility together with catalog cache
invalidations. I'd say no to that one as a first step.

But multi-threading the executor, or even certain utility commands at
first, should not be rejected purely on the notion that "we don't have
multithreading today."


Regards, Jan

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
>



-- 
Jan Wieck
Senior Postgres Architect
http://pgblog.wi3ck.info


Re: [HACKERS] Regression tests vs existing users in an installation

2016-07-17 Thread Robert Haas
On Sat, Jul 16, 2016 at 11:38 AM, Tom Lane  wrote:
> I'm coming to the conclusion that the only thing that will make this
> materially better in the long run is automatic enforcement of a convention
> about what role names may be created in the regression tests.  See my
> response to Stephen just now for a concrete proposal.

We could also do this by loading a C module during the regression
tests, which seems maybe less ugly than adding a GUC.

I don't particularly like your suggestion of spooky action at a
distance between force_parallel_mode and regression_test_mode.  That
just seems kooky.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] One process per session lack of sharing

2016-07-17 Thread Robert Haas
On Sun, Jul 17, 2016 at 3:04 PM, Jim Nasby  wrote:
> On 7/14/16 12:34 AM, Craig Ringer wrote:
>> Starting with a narrow scope would help. Save/restore GUCs and the other
>> easy stuff, and disallow sessions that are actively LISTENing, hold
>> advisory locks, have open cursors, etc from being saved and restored.
>
> Along the lines of narrow scope... I wonder about allowing functions to
> execute in a separate process that communicates back to the main backend.
> That would allow unsafe languages to operate under a different OS user that
> was tightly restricted (ie: nobody/nogroup), but it could also allow for a
> pool of "function executors". Depending on how it was structured, it might
also insulate the database from having to panic if a function crashed its
> process.

You can do this sort of thing with background workers today.  Just
create a parallel context and set the entrypoint to code that will
execute the guts of the function.  Retrieve the results using a
shm_mq.  With somewhat more work, you could have persistent background
workers that get reused for one function call after another instead of
being continually spun up and torn down.  However, I suspect the IPC
costs would make this rather slow.  Thomas Munro mentioned to me an
IPC facility called "doors" a while back which, if I understood him
correctly, is supposed to let you do a remote procedure call which
also transfers the current process's time slice to the process on the
other end of the door.  Maybe that would be faster, and fast enough;
or maybe not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] One process per session lack of sharing

2016-07-17 Thread Robert Haas
On Sun, Jul 17, 2016 at 3:12 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Fri, Jul 15, 2016 at 4:28 AM, Craig Ringer  wrote:
>>> I don't think anyone's considering moving from multi-processing to
>>> multi-threading in PostgreSQL. I really, really like the protection that the
>>> shared-nothing-by-default process model gives us, among other things.
>
>> We get some very important protection by having the postmaster in a
>> separate address space from the user processes, but separating the
>> other backends from each other has no value.
>
> I do not accept that proposition in the least.  For one thing, debugging
> becomes an order of magnitude harder when you've got multiple threads
> in the same address space: you have essentially zero guarantees about
> what one thread might have done to the supposedly-private state of
> another one.

Well, that's true, in theory.  In practice, random memory clobbers are
a pretty rare type of bug.  The chances that thread A crashed because
thread B overwrote its supposedly-private state are just not very
high.  Also, such bugs are extremely hard to troubleshoot even when
there is only one process and one thread involved, so it's not like things
are a rose garden today.  I don't buy the argument that it's worth
giving up arbitrary amounts of performance and functionality for this.

>> ... enough other people have
>> written complex, long-running multithreaded programs that I think it
>> is probably possible to do so without unduly compromising reliability.
>
> I would bet that every single successful project of that sort has been
> written with threading in mind from the get-go.  Trying to retro-fit
> threading onto thirty years' worth of single-threaded coding is a recipe
> for breaking your project; even if you had control of all the code running
> in the address space, which we assuredly do not.

I admit that it is risky, but I think there are things that could be
done to limit the risk.  I don't believe we can indefinitely continue
to ignore the potential performance benefits of making a switch like
this.  Breaking a thirty-year old code base irretrievably would be
sad, but letting it fade into irrelevance because we're not willing to
make the architecture changes that are needed to remain relevant would
be sad, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Regression tests vs existing users in an installation

2016-07-17 Thread Tom Lane
I've gone ahead and pushed a patch that does all of the cosmetic renamings
needed to clean up the global-object-names situation.  I've not done
anything yet about those special cases in the rolenames test, since it's
open for discussion exactly what to do there.  I figured that this patch
was bulky enough, and mechanical enough, that there wasn't much point in
putting it up for review; the buildfarm will do a lot better at finding
any mistakes I may have made.

regards, tom lane




Re: [HACKERS] [PROPOSAL] timestamp informations to pg_stat_statements

2016-07-17 Thread Julien Rouhaud
On 18/07/2016 01:06, Peter Geoghegan wrote:
> On Sun, Jul 17, 2016 at 12:22 AM, Jun Cheol Gim  wrote:
>> If we have timestamps of first and last executed, we can easily gather
>> this information, and there are tons more use cases.
> 
> -1 from me.
> 
> I think that this is the job of a tool that aggregates things from
> pg_stat_statements. It's unfortunate that there isn't a good
> third-party tool that does that, but there is nothing that prevents
> it.
> 

FWIW there's https://github.com/dalibo/powa-archivist which does that,
assuming you're using a 9.4+ server.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org




Re: [HACKERS] DO with a large amount of statements get stuck with high memory consumption

2016-07-17 Thread Jan Wieck
BTW, here is the email thread about double-linking MemoryContext children
patch, that Kevin at the end committed to master.

https://www.postgresql.org/message-id/55F2D834.8040106%40wi3ck.info


Regards, Jan


On Sat, Jul 16, 2016 at 3:47 PM, Jan Wieck  wrote:

>
>
> On Tue, Jul 12, 2016 at 3:29 PM, Merlin Moncure 
> wrote:
>
>> I've noticed that pl/pgsql functions/do commands do not behave well
>> when the statement resolves and frees memory.   To be clear:
>>
>> FOR i in 1..100
>> LOOP
>>   INSERT INTO foo VALUES (i);
>> END LOOP;
>>
>> ...runs just fine while
>>
>> BEGIN
>>   INSERT INTO foo VALUES (1);
>>   INSERT INTO foo VALUES (2);
>>   ...
>>   INSERT INTO foo VALUES (100);
>> END;
>>
>
> This sounds very much like what led to
> commit 25c539233044c235e97fd7c9dc600fb5f08fe065.
>
> It seems that patch was only applied to master and never backpatched to
> 9.5 or earlier.
>
>
> Regards, Jan
>
>
>
>
>>
>> (for the curious, create a script yourself via
>> copy (
>>   select
>> 'do $$begin create temp table foo(i int);'
>>   union all select
>> format('insert into foo values (%s);', i) from
>> generate_series(1,100) i
>>   union all select 'raise notice ''abandon all hope!''; end; $$;'
>> ) to '/tmp/breakit.sql';
>>
>> ...will consume amounts of resident memory proportional to the number
>> of statements and eventually crash the server.  The problem is obvious;
>> each statement causes a plan to get created and the server gets stuck
>> in a loop where SPI_freeplan() is called repeatedly.  Everything is
>> working as designed I guess, but when this happens it's really
>> unpleasant: the query is uncancellable and unterminatable, nicht gut.
>> A pg_ctl kill ABRT  will do the trick but I was quite astonished
>> to see linux take a few minutes to clean up the mess (!) on a somewhat
>> pokey virtualized server with lots of memory.  With even as little as
>> ten thousand statements the cleanup time far exceed the runtime of the
>> statement block.
>>
>> I guess the key takeaway here is, "don't do that"; pl/pgsql
>> aggressively generates plans and turns out to be a poor choice for
>> bulk loading because of all the plan caching.   Having said that, I
>> can't help but wonder if there should be a (perhaps user configurable)
>> limit to the amount of SPI plans a single function call should be able
>> to acquire on the basis you are going to smack into very poor
>> behaviors in the memory subsystem.
>>
>> Stepping back, I can't help but wonder what the value of all the plan
>> caching going on is at all for statement blocks.  Loops might comprise
>> a notable exception, noted.  I'd humbly submit though that (relative
>> to functions) it's much more likely to want to do something like
>> insert a lot of statements and a impossible to utilize any cached
>> plans.
>>
>> This is not an academic gripe -- I just exploded production :-D.
>>
>> merlin
>>
>>
>>
>
>
>
> --
> Jan Wieck
> Senior Postgres Architect
>



-- 
Jan Wieck
Senior Postgres Architect
http://pgblog.wi3ck.info


Re: [HACKERS] [PROPOSAL] timestamp informations to pg_stat_statements

2016-07-17 Thread Peter Geoghegan
On Sun, Jul 17, 2016 at 4:15 PM, Tom Lane  wrote:
> The concern I've got about this proposal is that the results get very
> questionable as soon as we start dropping statement entries for lack
> of space.  last_executed would be okay, perhaps, but first_executed
> not so much.

Agreed.

Also, for what it's worth, I should point out to Jun that
GetCurrentTimestamp() should definitely not be called when a spinlock
is held like that.

-- 
Peter Geoghegan




Re: [HACKERS] Obsolete comment within fmgr.c

2016-07-17 Thread Tom Lane
Peter Geoghegan  writes:
> On Thu, Apr 14, 2016 at 4:03 PM, Peter Geoghegan  wrote:
>> Attached patch removes obsolete comment from fmgr.c.

> This patch seems to have been overlooked. It's a pretty straightforward case.

Yeah --- pushed.

regards, tom lane




Re: [HACKERS] [PROPOSAL] timestamp informations to pg_stat_statements

2016-07-17 Thread Tom Lane
Peter Geoghegan  writes:
> On Sun, Jul 17, 2016 at 12:22 AM, Jun Cheol Gim  wrote:
>> If we have timestamps of first and last executed, we can easily gather
>> this information, and there are tons more use cases.

> -1 from me.

> I think that this is the job of a tool that aggregates things from
> pg_stat_statements. It's unfortunate that there isn't a good
> third-party tool that does that, but there is nothing that prevents
> it.

The concern I've got about this proposal is that the results get very
questionable as soon as we start dropping statement entries for lack
of space.  last_executed would be okay, perhaps, but first_executed
not so much.

regards, tom lane




Re: [HACKERS] [PROPOSAL] timestamp informations to pg_stat_statements

2016-07-17 Thread Peter Geoghegan
On Sun, Jul 17, 2016 at 12:22 AM, Jun Cheol Gim  wrote:
> If we have timestamps of first and last executed, we can easily gather
> this information, and there are tons more use cases.

-1 from me.

I think that this is the job of a tool that aggregates things from
pg_stat_statements. It's unfortunate that there isn't a good
third-party tool that does that, but there is nothing that prevents
it.

-- 
Peter Geoghegan




Re: [HACKERS] Obsolete comment within fmgr.c

2016-07-17 Thread Peter Geoghegan
On Thu, Apr 14, 2016 at 4:03 PM, Peter Geoghegan  wrote:
> Attached patch removes obsolete comment from fmgr.c.

This patch seems to have been overlooked. It's a pretty straightforward case.


-- 
Peter Geoghegan




Re: [HACKERS] Improving executor performance - tidbitmap

2016-07-17 Thread Andres Freund
On 2016-07-17 08:32:17 -0400, Robert Haas wrote:
> On Wed, Jul 13, 2016 at 11:06 PM, Andres Freund  wrote:
> > The major issue with the simplehash implementation in the patch is
> > probably the deletion, which should move cells around rather than
> > use tombstones. But that was too complex for a POC ;). Also, it'd
> > likely need a proper iterator interface.
>
> Do we ever need to delete from a TIDBitmap?  Probably not, but I'm
> guessing you have other uses for this in mind.

We do, via BitmapAnd.


> > FWIW, the dynahash usage in nodeAgg.c is a major bottleneck in a number
> > of other queries.
>
> Can we use this implementation for that as well, or are we going to
> need yet another one?

I've not tested it, but I'd assume that something like the simplehash
should work there. It's a somewhat more complicated scenario, though.

Andres




Re: [HACKERS] A Modest Upgrade Proposal

2016-07-17 Thread Joshua D. Drake

On 07/17/2016 11:55 AM, Jan Wieck wrote:


Yeah, I haven't met anyone yet who would like to have:

select replicate_these_relations('{public}');

vs:

ALTER SCHEMA public ENABLE REPLICATION;

(or something like that).


I generally agree, but I think the more important question is
"Why?". Is it because DDL looks more like a sentence? Is it because
arrays are a PITA? Is it too hard to call functions?


IMO, because it isn't code. I think that people forget that many, many 
DBAs are not developers; they are business analysts who happen to also 
be DBAs. Similarly, there is a reason why MongoDB/NoSQL will never be as 
popular as good old-fashioned SQL.





Once you get fine grained enough to support replicating different sets
of possibly overlapping objects/namespaces to different groups of
recipients, the DDL approach becomes just as convoluted as calling
functions and nobody will memorize the entire syntax.



Ehh, partially true. For example, I don't know every single nuance of 
ALTER TABLE but that is what the \h is for. Replication would be no 
different.


JD

--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.




Re: [HACKERS] sslmode=require fallback

2016-07-17 Thread Christoph Berg
Re: Peter Eisentraut 2016-07-17 

> On 7/15/16 3:07 PM, Andrew Dunstan wrote:
> > Do those packagers who install dummy certificates and turn SSL on also 
> > change their pg_hba.conf.sample files to use hostssl? That could go a 
> > long way towards encouraging people.
> 
> Debian, which I guess sort of started this, does not, but there are
> allusions to it in the TODO list.

I guess we should actually do that if we had any non-local(host)
entries in there by default, but we don't touch the default
pg_hba.conf from pg_createcluster.

Possibly we could add some hostssl examples in comments at the end of
the .sample file so people could grow the habit of using that instead
of host (I certainly am not doing that myself yet), but I'd rather see
that changed upstream.

So, how about something like this for the end of pg_hba.conf.sample?

# Examples for allowing access from given networks:
#hostssl all  all  192.0.2.0/24   @authmethod@
#hostssl all  all  2001:DB8::/32  @authmethod@

(These are "documentation" networks from RFC 5737/RFC 3849.)

Christoph




Re: [HACKERS] A Modest Upgrade Proposal

2016-07-17 Thread Petr Jelinek

On 17/07/16 20:50, Robert Haas wrote:



It's the same with cluster-wide management, dump and restore of replication
state to re-create a replication setup elsewhere, etc. We have to build the
groundwork first. Trying to pour the top storey concrete when the bottom
storey isn't even there yet isn't going to work out. You've argued
effectively the same thing elsewhere, saying that the pglogical submission
tried to do too much and should be further cut down.


Absolutely.  I didn't mean to imply that the scope of that submission
should be expanded.  What I'm a bit concerned about is that maybe we
haven't given enough thought to how some of this stuff is going to
work.  Petr replied earlier with an assertion that all of the things
that I mentioned could be done using pglogical, but I'm not convinced.
I don't see how you can use pglogical to build a self-healing
replicated cluster, which is ultimately what people want.  Maybe
that's just because I'm not personally deeply enmeshed in that
project, but I do try to read all of the relevant threads on
pgsql-hackers and keep tabs on what is happening.



That really depends on what you call a self-healing replicated cluster.


Suppose server A is publishing to server B.  Well, clearly, A needs to
have a slot for server B, but does that slot correspond precisely to
the publication, or is that represented in some other way?  How is the
subscription represented on server B?  What happens if either A or B
undergoes a dump-and-restore cycle?  It's just fuzzy to me how this
stuff is supposed to work in detail.


Yeah, that's because it is. The dump/restore cycle would work provided 
you stopped the replication before doing it. That might not be perfect 
but it's still more than physical can do. Solving more complex scenarios 
is something for the future. Logical PITR might be the answer for that, 
not sure yet.


About slots. Slots are just a primitive which helps us get a snapshot 
mapped to an LSN and keep the historical catalog and WAL files. There is 
no reason for a single replication path to be forever tied to a single 
slot though. In fact in pglogical (and my plan for core is the same) we 
already create limited-lifespan slots to get new snapshots when either 
adding a new table or re-syncing an existing one. This is one of the 
reasons why I don't see much usefulness in being able to take a snapshot 
inside the slot once it has started replicating, btw. Using multiple 
slots also has the advantage of parallelism (replication of other tables 
does not lag because we are syncing another one).





It does make me wonder if we should look at extension points for the
walsender protocol though, now we're likely to have a future desire for newer
versions to connect to older versions - it'd be great if we could do
something like pg_upgrade_support to allow an enhanced logical migration
from 10.0 to 11.0 by installing some extension in 10.0 first.


Maybe, but let's get something that can work from >=10.0 to >=10.0 first.



Agreed.

--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] A Modest Upgrade Proposal

2016-07-17 Thread Petr Jelinek

On 17/07/16 20:08, Jim Nasby wrote:

On 7/13/16 2:06 PM, Joshua D. Drake wrote:

On 07/07/2016 01:01 PM, Robert Haas wrote:


There was an unconference session on this topic at PGCon and quite a
number of people there stated that they found DDL to be an ease-of-use
feature and wanted to have it.


Yeah, I haven't met anyone yet who would like to have:

select replicate_these_relations('{public}');

vs:

ALTER SCHEMA public ENABLE REPLICATION;

(or something like that).


I generally agree, but I think the more important question is "Why?". Is
it because DDL looks more like a sentence? Is it because arrays are a
PITA? Is it too hard to call functions?


For me it's many small reasons. I want to store it in catalogs, and some 
things there are nicer when you manipulate them using standard DDL 
processing (like dependencies, for example). The syntax is also a bit 
nicer. Our documentation works better for DDL than for functions (that's 
something we should fix, but I am not doing it as part of this patch). 
The same goes for psql tab completion. We automatically gain things like 
event triggers. The support in pg_dump is also more straightforward with DDL.


It might make sense to have functions for manipulating slots and origins, 
as those are just primitives which the user should not have to fiddle 
with, but for things that are directly meant for user interaction, DDL 
just feels better.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] One process per session lack of sharing

2016-07-17 Thread Tom Lane
Robert Haas  writes:
> On Fri, Jul 15, 2016 at 4:28 AM, Craig Ringer  wrote:
>> I don't think anyone's considering moving from multi-processing to
>> multi-threading in PostgreSQL. I really, really like the protection that the
>> shared-nothing-by-default process model gives us, among other things.

> We get some very important protection by having the postmaster in a
> separate address space from the user processes, but separating the
> other backends from each other has no value.

I do not accept that proposition in the least.  For one thing, debugging
becomes an order of magnitude harder when you've got multiple threads
in the same address space: you have essentially zero guarantees about
what one thread might have done to the supposedly-private state of
another one.

> ... enough other people have
> written complex, long-running multithreaded programs that I think it
> is probably possible to do so without unduly compromising reliability.

I would bet that every single successful project of that sort has been
written with threading in mind from the get-go.  Trying to retro-fit
threading onto thirty years' worth of single-threaded coding is a recipe
for breaking your project; even if you had control of all the code running
in the address space, which we assuredly do not.

regards, tom lane




Re: [HACKERS] One process per session lack of sharing

2016-07-17 Thread Jim Nasby

On 7/14/16 12:34 AM, Craig Ringer wrote:

Starting with a narrow scope would help. Save/restore GUCs and the other
easy stuff, and disallow sessions that are actively LISTENing, hold
advisory locks, have open cursors, etc from being saved and restored.


Along the lines of narrow scope... I wonder about allowing functions to 
execute in a separate process that communicates back to the main 
backend. That would allow unsafe languages to operate under a different 
OS user that was tightly restricted (i.e. nobody/nogroup), but it could 
also allow for a pool of "function executors". Depending on how it was 
structured, it might also insulate the database from having to panic if 
a function crashed its process.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] A Modest Upgrade Proposal

2016-07-17 Thread Jan Wieck
On Sun, Jul 17, 2016 at 2:08 PM, Jim Nasby  wrote:

> On 7/13/16 2:06 PM, Joshua D. Drake wrote:
>
>> On 07/07/2016 01:01 PM, Robert Haas wrote:
>>
>> There was an unconference session on this topic at PGCon and quite a
>>> number of people there stated that they found DDL to be an ease-of-use
>>> feature and wanted to have it.
>>>
>>
>> Yeah, I haven't met anyone yet who would like to have:
>>
>> select replicate_these_relations('{public}');
>>
>> vs:
>>
>> ALTER SCHEMA public ENABLE REPLICATION;
>>
>> (or something like that).
>>
>
> I generally agree, but I think the more important question is "Why?". Is
> it because DDL looks more like a sentence? Is it because arrays are a PITA?
> Is it too hard to call functions?


Once you get fine grained enough to support replicating different sets
of possibly overlapping objects/namespaces to different groups of
recipients, the DDL approach becomes just as convoluted as calling
functions and nobody will memorize the entire syntax.


Jan








-- 
Jan Wieck
Senior Postgres Architect
http://pgblog.wi3ck.info


Re: [HACKERS] A Modest Upgrade Proposal

2016-07-17 Thread Robert Haas
On Thu, Jul 14, 2016 at 2:29 AM, Craig Ringer  wrote:
> Yes, I'd like that too. I'd also like to have fully parallized writeable
> queries right now. But we can't build everything all at once.

I agree.

> Before doing parallelized writes, things like dsm, dsm queues, group
> locking, worker management, and read parallelism were all necessary.

Yep.

> It's the same with cluster-wide management, dump and restore of replication
> state to re-create a replication setup elsewhere, etc. We have to build the
> groundwork first. Trying to pour the top storey concrete when the bottom
> storey isn't even there yet isn't going to work out. You've argued
> effectively the same thing elsewhere, saying that the pglogical submission
> tried to do too much and should be further cut down.

Absolutely.  I didn't mean to imply that the scope of that submission
should be expanded.  What I'm a bit concerned about is that maybe we
haven't given enough thought to how some of this stuff is going to
work.  Petr replied earlier with an assertion that all of the things
that I mentioned could be done using pglogical, but I'm not convinced.
I don't see how you can use pglogical to build a self-healing
replicated cluster, which is ultimately what people want.  Maybe
that's just because I'm not personally deeply enmeshed in that
project, but I do try to read all of the relevant threads on
pgsql-hackers and keep tabs on what is happening.

Suppose server A is publishing to server B.  Well, clearly, A needs to
have a slot for server B, but does that slot correspond precisely to
the publication, or is that represented in some other way?  How is the
subscription represented on server B?  What happens if either A or B
undergoes a dump-and-restore cycle?  It's just fuzzy to me how this
stuff is supposed to work in detail.

>> I don't understand this.  We add new DDL in new releases, and we avoid
>> changing the meaning of existing DDL.  Using function interfaces won't
>> make it possible to change the meaning of existing syntax, and it
>> won't make it any more possible to add new syntax.  It will just make
>> replication commands be spelled differently from everything else.
>
> Say you want to upgrade from 9.4 to 10.0 using the new logical replication
> features. How would that be possible if you can't add the required
> interfaces for setting up the downstream side to 9.4 as an extension?
>
> I think what we're leaning toward here is "don't do that". Tools like
> pglogical will have to carry that load until the Pg versions with built-in
> replication become the "old" versions to be upgraded _from_.

That may be true, but it's hard to tell whether that's going to be
feasible anyway without a fleshed-out proposal for how this is all
going to work.  If this can be made to work for upgrades from 9.4 with
only an extension, that would IMHO be worth trying to do.  But, just
for example, adding a replication set capability to 10 isn't going to
affect that one way or the other.  For upgrades, you'll want to
replicate the whole database.

> Ideally the new infrastructure won't have to make normal (non-walsender)
> libpq connections and will work entirely over the walsender protocol. That's
> not extensible at all, so the point becomes kind of moot, it just can't be
> used for downversion upgrades. Pity, but cleaner in the long run.

Yeah.  I'm entirely willing to leave downgrades to earlier versions to
extensions.  "Cleaner in the long run" has got to be a high priority
for core features; if we had not followed that policy in the past,
we'd have an unmaintainable mess now.

> It does make me wonder if we should look at extension points for the
> walsender protocol though, now we're likely to have a future desire for newer
> versions to connect to older versions - it'd be great if we could do
> something like pg_upgrade_support to allow an enhanced logical migration
> from 10.0 to 11.0 by installing some extension in 10.0 first.

Maybe, but let's get something that can work from >=10.0 to >=10.0 first.

>> I'm concerned about dump-and-restore
>> preserving as much state as is usefully possible, because I think
>> that's critical for the user experience
>
> Right. See the pain points caused by our current dump issues like the
> brokenness around dumping security labels, grants, etc on the database its
> self. It certainly matters.
>
> The keyword there is "usefully" though. Replication sets: definitely useful.
> Knowledge about what peers we were connected to and what we were up to on
> those peers: possibly useful, if we have some way to meaningfully encode
> that knowledge, but far from crucial, especially since we can't actually
> resume replay from them without replication slots and replication
> identifiers we can't dump.
>
> It seems we were mostly crossing wires about different assumptions about
> what dump and restore would include.

Yes, that may be the case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] A Modest Upgrade Proposal

2016-07-17 Thread Jim Nasby

On 7/7/16 8:17 PM, Simon Riggs wrote:

Simplicity is key, I agree. But that's just a user interface feature,
not a comment on what's underneath the covers. pg_upgrade is not simple
and is never likely to be so, under the covers.


Right, and what I'd prefer effort put into is making managing 
replication in all forms easier. Replication has a lot of uses outside 
of upgrades.


FWIW, I've actually never used pg_upgrade because I view it as high-risk 
for the environments I've dealt with. There's no ability to fall back to 
the old version without losing data, and because of its binary nature 
the odds of some kind of corruption event happening are far higher 
than with something like londiste. Certainly many environments don't 
have those concerns though. Having options is good.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] One process per session lack of sharing

2016-07-17 Thread Robert Haas
On Fri, Jul 15, 2016 at 4:28 AM, Craig Ringer  wrote:
> I don't think anyone's considering moving from multi-processing to
> multi-threading in PostgreSQL. I really, really like the protection that the
> shared-nothing-by-default process model gives us, among other things.

We get some very important protection by having the postmaster in a
separate address space from the user processes, but separating the
other backends from each other has no value.  If one of the backends
dies, we take provisions to make sure they all die, which is little or
no different from what would happen if we had the postmaster as one
process and all of the other backends as threads within a second
process.  As far as I can see, running each and every backend in a
separate process has downsides but no upsides.  It slows down the
system and makes it difficult to share data between processes without
much in the way of benefits.

> I'm personally not absolutely opposed to threading, but you'll find it hard
> to convince anyone it's worth the huge work required to ensure that
> everything in PostgreSQL is done thread-safely, adapt all our logic to
> handle thread IDs where we use process IDs, etc. It'd be a massive amount of
> work for no practical gain for most users, and a huge reliability loss in
> the short to medium term as we ironed out all the bugs.

It would actually be pretty simple to allow PostgreSQL to be compiled
to use either processes or threads, provided that you don't mind using
something like GCC's __thread keyword.  When compiling with threads,
slap __thread on every global variable we have (using some kind of
macro trick, no doubt), spawn threads instead of processes wherever
you like, and I think you're more or less done.  There could be some
problems with third-party libraries we use, but I bet there probably
wouldn't be all that many problems.  Of course, there's not
necessarily a whole lot of benefit to such a minimal transformation,
but you could certainly do useful things on top of it.  For example,
the parallel query code could arrange to pass pointers to existing
data structures in some cases instead of copying those data structures
as we do currently.  Spinning up a new thread and giving it pointers
to some of the old thread's data structures is probably a lot faster
than spinning up a new process and serializing and deserializing those
data structures, so you wouldn't necessarily have to do all that much
work before the "thread model" compile started to have noticeable
advantages over the "process model" compile.  You could pass tuples
around directly rather than by copying them, too.  A lot of things
that we might want to do in this area would expose us to the risk of
server-lifespan memory leaks, and we'd need to spend time and energy
figuring out how to minimize those risks, but enough other people have
written complex, long-running multithreaded programs that I think it
is probably possible to do so without unduly compromising reliability.

> Where I agreed with you, and where I think Robert sounded like he was
> agreeing, was that our current design where we have one executor per user
> sessions and can't suspend/resume sessions is problematic.

The problems are very closely related.  The problem with suspending
and resuming sessions is that you need to keep all of the session's
global variable contents (except for any caches that are safe to
rebuild) until the session is resumed; and we have no way of
discovering all of the global variables a process is using and no
general mechanism that can be used to serialize and deserialize them.
The problem with using threads is that code which uses global
variables will not be thread-safe unless all of those variables are
thread-local.  Getting our hands around the uncontrolled use of global
variables - doubtless at the risk of breaking third-party code - seems
crucial.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] A Modest Upgrade Proposal

2016-07-17 Thread Jim Nasby

On 7/13/16 2:06 PM, Joshua D. Drake wrote:

On 07/07/2016 01:01 PM, Robert Haas wrote:


There was an unconference session on this topic at PGCon and quite a
number of people there stated that they found DDL to be an ease-of-use
feature and wanted to have it.


Yeah, I haven't met anyone yet who would like to have:

select replicate_these_relations('{public}');

vs:

ALTER SCHEMA public ENABLE REPLICATION;

(or something like that).


I generally agree, but I think the more important question is "Why?". Is 
it because DDL looks more like a sentence? Is it because arrays are a 
PITA? Is it too hard to call functions?

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] application_name in process name?

2016-07-17 Thread Tom Lane
Jim Nasby  writes:
> On 7/13/16 12:07 PM, Tom Lane wrote:
>> In a lot of situations ("top" for instance) only a limited number of
>> characters can be displayed from a process title.  I'm hesitant to add
>> fields to that string that we don't really need.

> Could we make this configurable, similar to log_line_prefix?

Yeah, we could get rid of a lot of the need for value judgments in this
area if we bit the bullet and provided infrastructure like that.

It occurs to me that we could also remove the update_process_title GUC:
what you would do is configure a process_title pattern that doesn't
include the %-escape for current command tag, and the infrastructure
could notice that that escape isn't present and skip unnecessary updates.
The same kind of trick could be used for other potentially-expensive
items like the lock "waiting" flag.

regards, tom lane




Re: [HACKERS] sslmode=require fallback

2016-07-17 Thread Peter Eisentraut
On 7/15/16 3:07 PM, Andrew Dunstan wrote:
> Do those packagers who install dummy certificates and turn SSL on also 
> change their pg_hba.conf.sample files to use hostssl? That could go a 
> long way towards encouraging people.

Debian, which I guess sort of started this, does not, but there are
allusions to it in the TODO list.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] sslmode=require fallback

2016-07-17 Thread Peter Eisentraut
On 7/15/16 4:14 AM, Magnus Hagander wrote:
> The entire "prefer" mode is a design flaw, that we unfortunately picked
> as default mode.
> 
> If it fails *for any reason*, it falls back to plaintext. Thus, you have
> to assume it will make a plaintext connection. Thus, it gives you zero
> guarantees, so it serves no actual purpose from a security perspective.

I could imagine a variant of "prefer" that tries SSL if available, but
fails the connection if the SSL setup fails for some reason (e.g.,
certificates).  That would be more similar to how browsers with
HTTPS-Everywhere work.

Modulo that, I don't think that "prefer" is a bad default.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] One process per session lack of sharing

2016-07-17 Thread Simon Riggs
On 12 July 2016 at 09:57,  wrote:


> We have faced some lack of resource sharing.
> So in our test, memory usage per session:
> Oracle: about 5M
> MSSqlServer: about 4M
> PostgreSQL: about 160M
>

Using shared resources also has significant problems, so care must be taken.

I think it's clear that threading is out, but it is far from being the only
solution for reducing memory overhead through sharing.

Analysing the overhead and suggesting practical ways forward may help.

-- 
Simon Riggs    http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] One process per session lack of sharing

2016-07-17 Thread Jan Wieck
On Sun, Jul 17, 2016 at 3:23 AM, Craig Ringer  wrote:

>
>
> Lots more could be shared, too. Cached plans, for example.
>

But the fact that PostgreSQL has transactional DDL complicates things like
a shared plan cache and shared PL/pgSQL execution trees. Those things are
much easier in a trivial database implementation, where an ALTER TABLE is
just trampling over a running transaction.


Regards, Jan

-- 
Jan Wieck
Senior Postgres Architect
http://pgblog.wi3ck.info


Re: [HACKERS] rethinking dense_alloc (HashJoin) as a memory context

2016-07-17 Thread Robert Haas
On Wed, Jul 13, 2016 at 4:39 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Jul 13, 2016 at 1:10 PM, Tomas Vondra
>>  wrote:
>> What's not clear to me is to what extent slowing down pfree is an
>> acceptable price for improving the behavior in other ways.  I wonder
>> how many of the pfree calls in our current codebase are useless or
>> even counterproductive, or could be made so.
>
> I think there's a lot, but I'm afraid most of them are in contexts
> (pun intended) where aset.c already works pretty well, ie it's a
> short-lived context anyway.  The areas where we're having pain are
> where there are fairly long-lived contexts with lots of pfree traffic;
> certainly that seems to be the case in reorderbuffer.c.  Because they're
> long-lived, you can't just write off the pfrees as ignorable.
>
> I wonder whether we could compromise by reducing the minimum "standard
> chunk header" to be just a pointer to owning context, with the other
> fields becoming specific to particular mcxt implementations.  That would
> be enough to allow contexts to decide that pfree was a no-op, say, or that
> they wouldn't support GetMemoryChunkSpace(), without having to decree that
> misuse can lead to crashes.  But that's still more than zero overhead
> per-chunk.

I think that would be worth doing.  It's not perfect, and the extra 8
(or 4) bytes per chunk certainly do matter.  On the other hand, it's
certainly better than what we're doing today, which basically closes
off all meaningful innovation in this area.  If you are locked into
having a 16 byte chunk header that includes the size, the individual
memory context implementations don't have much latitude to vary their
behavior from what aset.c already does.  You can change the policy for
allocating chunks from the operating system, and the way you try to
recycle chunks that have been freed, but that's about it.  Removing
the size from the standard header would, I think, make it easier to
experiment more widely with alternative memory context implementations
and get a better idea what can be saved.

We can always think about further changes later, but this seems like a
good start.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Improving executor performance - tidbitmap

2016-07-17 Thread Robert Haas
On Wed, Jul 13, 2016 at 11:06 PM, Andres Freund  wrote:
> On 2016-06-24 16:29:53 -0700, Andres Freund wrote:
>> 4) Various missing micro optimizations have to be performed, for more
>>    architectural issues to become visible. E.g. [2] causes such bad
>>    slowdowns in hash-agg workloads, that other bottlenecks are hidden.
>
> One such issue is the usage of dynahash.c in tidbitmap.c. In many
> queries, e.g. tpch q7, the bitmapscan is often the bottleneck. Profiling
> shows that this is largely due to dynahash.c being slow. Primary issues
> are: a) two level structure doubling the amount of indirect lookups b)
> indirect function calls c) using separate chaining based conflict
> resolution d) being too general.
>
> I've quickly hacked up an alternative linear addressing hashtable
> implementation. And the improvements are quite remarkable.

Nice!

> I'm wondering whether we can do 'macro based templates' or
> something. I.e. have something like the simplehash in the patch in
> simplehash.h, but the key/value widths, the function names, are all
> determined by macros (oh, this would be easier with C++ templates...).
>
> Does anybody have a better idea?

No.
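Absent C++ templates, the macro approach can at least be sketched: the parameters are #defines the including file sets, and the body is generic C over them. All names here are invented for illustration, not the patch's, and a real implementation would grow the table rather than assume it never fills:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* "Template parameters" the includer would normally #define. */
#define SH_KEY_TYPE uint32_t
#define SH_SIZE 1024			/* must be a power of two */

typedef struct
{
	SH_KEY_TYPE key;
	char		used;
} sh_cell;

typedef struct
{
	sh_cell		cells[SH_SIZE];
} sh_table;

/* A real template would take the hash function as a macro parameter too. */
static uint32_t
sh_hash(SH_KEY_TYPE k)
{
	k ^= k >> 16;
	k *= 0x85ebca6b;
	k ^= k >> 13;
	return k;
}

/* Open addressing with linear probing: one flat array, no chaining. */
static void
sh_insert(sh_table *t, SH_KEY_TYPE key)
{
	uint32_t	i = sh_hash(key) & (SH_SIZE - 1);

	while (t->cells[i].used && t->cells[i].key != key)
		i = (i + 1) & (SH_SIZE - 1);
	t->cells[i].key = key;
	t->cells[i].used = 1;
}

static int
sh_lookup(const sh_table *t, SH_KEY_TYPE key)
{
	uint32_t	i = sh_hash(key) & (SH_SIZE - 1);

	while (t->cells[i].used)
	{
		if (t->cells[i].key == key)
			return 1;
		i = (i + 1) & (SH_SIZE - 1);
	}
	return 0;
}
```

The appeal of the macro scheme is that lookups compile down to direct array probes with no indirect function calls, which is exactly the dynahash overhead being complained about above.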

> The major issue with the simplehash implementation in the path is
> probably the deletion; which should rather move cells around, rather
> than use toombstones. But that was too complex for a POC ;). Also, it'd
> likely need a proper iterator interface.

Do we ever need to delete from a TIDBitmap?  Probably not, but I'm
guessing you have other uses for this in mind.

> FWIW, the dynahash usage in nodeAgg.c is a major bottleneck in a number
> of other queries.

Can we use this implementation for that as well, or are we going to
need yet another one?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] sslmode=require fallback

2016-07-17 Thread Robert Haas
On Fri, Jul 15, 2016 at 4:14 AM, Magnus Hagander  wrote:
>> The original complaint was not actually that "prefer" is a bad default,
>> but that in the presence of a root certificate on the client, a
>> certificate validation failure falls back to plain text.  That seems
>> like a design flaw of the "prefer" mode, no matter whether it is the
>> default or not.
>
> The entire "prefer" mode is a design flaw, that we unfortunately picked as
> default mode.

Well, you keep saying that, but what I'm saying is you should stop
complaining about it and start figuring out how to fix it. :-)

> If it fails *for any reason*, it falls back to plaintext. Thus, you have to
> assume it will make a plaintext connection. Thus, it gives you zero
> guarantees, so it serves no actual purpose from a security perspective.
>
> it will equally fall back on incompatible SSL configs. Or on a network
> hiccup. The presence of the certificate is just one of many different
> scenarios where it will fall back.
>
> If you care about encryption, you should pick something else
> (require/verify). If you don't care about encryption, you should pick
> something else (allow, probably) so as not to pay unnecessary overhead.

If we think trying to push everyone on to SSL isn't a good plan, then
how about changing the default to allow?
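For reference, the client-side ladder being argued about looks like this in libpq connection strings (semantics per the libpq documentation; host and database names are placeholders):

```
# never use SSL
psql "host=db.example.com dbname=app sslmode=disable"
# try plaintext first, fall back to SSL
psql "host=db.example.com dbname=app sslmode=allow"
# try SSL first, fall back to plaintext on *any* failure (the current default)
psql "host=db.example.com dbname=app sslmode=prefer"
# require SSL; fail rather than fall back
psql "host=db.example.com dbname=app sslmode=require"
# additionally validate the server certificate chain, or chain plus host name
psql "host=db.example.com dbname=app sslmode=verify-ca"
psql "host=db.example.com dbname=app sslmode=verify-full"
```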

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] One question about transformation ANY Sublinks into joins

2016-07-17 Thread Armor
Hi
I ran a simple SQL statement against the latest PG:
postgres=# explain select * from t1 where id1 in (select id2 from t2 where c1=c2);
                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on t1  (cost=0.00..43291.83 rows=1130 width=8)
   Filter: (SubPlan 1)
   SubPlan 1
     ->  Seq Scan on t2  (cost=0.00..38.25 rows=11 width=4)
           Filter: (t1.c1 = c2)
(5 rows)



and the table schema are as following:


postgres=# \d t1
  Table "public.t1"
 Column |  Type   | Modifiers 
+-+---
 id1| integer | 
 c1 | integer | 


postgres=# \d t2
  Table "public.t2"
 Column |  Type   | Modifiers 
+-+---
 id2| integer | 
 c2 | integer | 



 I find that PG decides not to pull up this sublink because the WHERE clause
of the sublink refers to Vars of the parent query; for details see the
function convert_ANY_sublink_to_join in src/backend/optimizer/plan/subselect.c.
 However, for such a simple sublink, which has no aggregate, no window
function, and no LIMIT, maybe we can carefully pull up the predicates in the
WHERE clause that refer to Vars of the parent query, then pull up the sublink
and produce a query plan like the following:

postgres=# explain select * from t1 where id1 in (select id2 from t2 where c1=c2);
                           QUERY PLAN
----------------------------------------------------------------
 Hash Join  (cost=49.55..99.23 rows=565 width=8)
   Hash Cond: ((t1.id1 = t2.id2) AND (t1.c1 = t2.c2))
   ->  Seq Scan on t1  (cost=0.00..32.60 rows=2260 width=8)
   ->  Hash  (cost=46.16..46.16 rows=226 width=8)
         ->  HashAggregate  (cost=43.90..46.16 rows=226 width=8)
               Group Key: t2.id2, t2.c2
               ->  Seq Scan on t2  (cost=0.00..32.60 rows=2260 width=8)
(7 rows)
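The proposed transformation corresponds roughly to this manual rewrite (hypothetical, same tables as above): lifting the correlated predicate into the IN list decorrelates the sublink, so the existing pull-up machinery applies. At the top level of WHERE the two forms select the same rows, though NULL handling would need care in the general case, e.g. under NOT IN.

```sql
-- move the correlated c1 = c2 predicate into the IN clause
-- so the sublink no longer references the parent query
explain select * from t1 where (id1, c1) in (select id2, c2 from t2);
```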
--
Jerry Yu
https://github.com/scarbrofair

[HACKERS] [PROPOSAL] timestamp informations to pg_stat_statements

2016-07-17 Thread Jun Cheol Gim
Hi hackers!

Following is a proposal to add timestamp informations to
`pg_stat_statements`.

# Use case
- If we want to gather the list and stats for queries executed at least once
in the last hour, we must have reset the stats an hour ago. There is no way
to get this information if we didn't.
- If we spot some strange query in `pg_stat_statements`, we might want to
identify when it first ran.

If we track the timestamps of the first and last execution, we can easily
gather this information, and there are many more use cases.

# Implementations
First, I added API version 1.5 to carry the additional fields, and I added
two fields to the Counters structure, which now has 25 fields in total.

```
@@ -156,6 +158,8 @@ typedef struct Counters
 	double		blk_read_time;	/* time spent reading, in msec */
 	double		blk_write_time; /* time spent writing, in msec */
 	double		usage;			/* usage factor */
+	TimestampTz	created;		/* timestamp of created time */
+	TimestampTz	last_updated;	/* timestamp of last updated */
 } Counters;

 /*
```

The `created` field is filled the first time the entry is added to the
hash table.

```
@@ -1663,6 +1690,8 @@ entry_alloc(pgssHashKey *key, Size query_offset, int
query_len, int encoding,

/* reset the statistics */
 	memset(&entry->counters, 0, sizeof(Counters));
+	/* set the created timestamp */
+	entry->counters.created = GetCurrentTimestamp();
/* set the appropriate initial usage count */
entry->counters.usage = sticky ? pgss->cur_median_usage : USAGE_INIT;
/* re-initialize the mutex each time ... we assume no one using it */
```

The `last_updated` field is updated every time `pgss_store()` updates the
stats.

```
@@ -1251,6 +1256,7 @@ pgss_store(const char *query, uint32 queryId,
 		e->counters.blk_read_time += INSTR_TIME_GET_MILLISEC(bufusage->blk_read_time);
 		e->counters.blk_write_time += INSTR_TIME_GET_MILLISEC(bufusage->blk_write_time);
 		e->counters.usage += USAGE_EXEC(total_time);
+		e->counters.last_updated = GetCurrentTimestamp();

 		SpinLockRelease(&e->mutex);
 	}
```

The attached is my first implementation.
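With the proposed columns, the "last hour" use case above could be served with a query like this (column names as in the patch, hedged sketch):

```sql
SELECT query, calls, created, last_updated
FROM pg_stat_statements
WHERE last_updated >= now() - interval '1 hour'
ORDER BY last_updated DESC;
```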

Regards,
Jason Kim.


pg_stat_statements_with_timestamp_v1.patch
Description: Binary data



Re: [HACKERS] Version number for pg_control

2016-07-17 Thread Craig Ringer
On 15 July 2016 at 23:54, Tom Lane  wrote:

> While researching a pgsql-general question, I noticed that commit
> 73c986adde5d73a5e2555da9b5c8facedb146dcd added several new fields
> to pg_control without bothering to touch PG_CONTROL_VERSION.  Thus,
> PG_CONTROL_VERSION is still "942" even though the file contents
> are not at all compatible with 9.4.
>
> It's way too late to do anything about this in 9.5.


Is it?

If PG_VERSION and the catalog version are correct for 9.5, the next point
release could update pg_control's version, accepting the old one during
recovery but only writing the new one.

Whether we should is another matter, since that would mean people can't
downgrade to older point releases, replicas would break if the master is
upgraded before the replicas, etc. The benefit doesn't seem worth the cost.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] One process per session lack of sharing

2016-07-17 Thread Craig Ringer
On 15 July 2016 at 20:54,  wrote:

> Hi
>
>
> > but parallel processing doesn't requires threading support - see
> PostgreSQL 9.6 features.
>
> Sharing dynamically generated execution code between threads is much
> easier (if sharing it between processes is possible at all).
> There are many other interaction techniques available between threads
> that are absent between processes.
>
>

We have shared memory.

How do you think the buffer cache works?

Lots more could be shared, too. Cached plans, for example.

It's possible. You just have to change how you think about it, and you
might not be able to do it within the framework of the JVM. So it's quite
possibly not worth it for you. Nonetheless, don't assume it can't be done
just because you can't do it the way you're used to thinking of doing it.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] One process per session lack of sharing

2016-07-17 Thread Craig Ringer
On 16 July 2016 at 00:43, james  wrote:

> On 15/07/2016 09:28, Craig Ringer wrote:
>
>> I don't think anyone's considering moving from multi-processing to
>> multi-threading in PostgreSQL. I really, really like the protection that
>> the shared-nothing-by-default process model gives us, among other things.
>>
> As I understand it, the main issue is that it is hard to integrate
> extensions that use heavyweight runtimes and are focussed on isolation
> within a virtual machine.  It's not just
>
> Perhaps it would be possible for the postmaster (or a delegate process) to
> host such a runtime, and find a way for a user process that wants to use
> such a runtime to communicate with it, whether by copying function
> parameters over RPC or by sharing some of its address space explicitly to
> the runtime to operate on directly.
>

It is, and the JVM supports that, but it's so costly that it would
eliminate most of the benefits this user is seeking from the kind of
sharing they want. It also needs at least a minimal JVM running in the
target backends.

The issue here is an architectural mismatch between PostgreSQL and the JVM,
made worse by the user's very stored-proc-heavy code. Some other runtime
that's designed to co-operate with a multiprocessing environment could well
be fine, but the JVM isn't. At least, the Sun/Oracle/OpenJDK JVM isn't.

They could explore doing their bytecode compilation for another runtime
that's more friendly toward multiprocessing (maybe Mono? Haven't tried)
and/or look at using PostgreSQL's shared memory facilities within their
target runtime. It's not like this is insoluble without completely
rebuilding PostgreSQL.

For them it'd be great if PostgreSQL used multi-threading instead of
multi-processing, but it doesn't, and it's not likely to. So they've got to
find other solutions to their difficulties within PostgreSQL or use another
product.

> Such a host delegate process could be explicitly built with multithread
> support and not 'infect' the rest of the code with its requirements.
>

Not if it lives in the postmaster. It'd have to be forked and communicated
with at one remove.

> Using granular RPC is nice for isolation but I am concerned that the
> latencies might be high.
>

Yep. When your goal is performance and you're trying to move stuff closer
to the DB and out of an appserver, it's likely counterproductive.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services