Re: [HACKERS] libpq bad async behaviour

2015-01-14 Thread Daurnimator
On 14 January 2015 at 08:40, Andres Freund and...@2ndquadrant.com wrote:

 I think that kind of solution isn't likely to be satisfying. The amount
 of porting work is just not going to be worth the cost. And it won't be
 easily hideable in the API at all as the callers will expect a normal
 fd.


All that consumers of the API need is something they can `select()` or
equivalent on.


  Yeah, this is a problem. Fixing it isn't easy, though, I think.

 I think
 extern PostgresPollingStatusType PQconnectPoll(PGconn *conn);
 has the right interface. It returns what upper layers need to wait
 for. I think we should extend pretty much that to more interfaces.


This would be a fine solution. That enum indeed has the correct
values/semantics.


 This
 likely means that we'll need extended versions of PQFlush() and
 PQconsumeInput() - afaics it shouldn't be much more?


PQping?
PQconnectPoll already has it.

Though, I think we could probably even reduce this down to a single common
function for all cases:
PQpoll() or similar.
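
As a minimal sketch of what a caller does with such an interface (editorial illustration, not from the thread; it uses only the existing PQconnectStart()/PQconnectPoll()/PQsocket() API, with select() as the wait primitive - a PQpoll()-style extension of PQflush()/PQconsumeInput() would slot into the same loop):

    #include <stdio.h>
    #include <sys/select.h>
    #include <libpq-fe.h>

    /* Drive a non-blocking connection attempt to completion. */
    static PGconn *
    connect_async(const char *conninfo)
    {
        PGconn *conn = PQconnectStart(conninfo);

        if (conn == NULL || PQstatus(conn) == CONNECTION_BAD)
            return NULL;

        for (;;)
        {
            PostgresPollingStatusType st = PQconnectPoll(conn);
            int         sock = PQsocket(conn);
            fd_set      fds;

            if (st == PGRES_POLLING_OK)
                return conn;            /* connected */
            if (st == PGRES_POLLING_FAILED || sock < 0)
                break;

            FD_ZERO(&fds);
            FD_SET(sock, &fds);
            /* The returned enum says exactly what to wait for. */
            if (st == PGRES_POLLING_READING)
                select(sock + 1, &fds, NULL, NULL, NULL);
            else
                select(sock + 1, NULL, &fds, NULL, NULL);
        }
        PQfinish(conn);
        return NULL;
    }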


Re: [HACKERS] OOM on EXPLAIN with lots of nodes

2015-01-14 Thread Robert Haas
On Tue, Jan 13, 2015 at 8:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wrote:
 Heikki Linnakangas hlinnakan...@vmware.com writes:
 But do we really need to backpatch any of this?

 Alexey's example consumes only a couple hundred MB in 9.2, vs about 7GB
 peak in 9.3 and up.  That seems like a pretty nasty regression.

 I did a bit more measurement of the time and backend memory consumption
 for Alexey's example EXPLAIN:

 9.2: 0.9 sec, circa 200 MB
 HEAD: 56 sec, circa 7300 MB
 with patch below: 3.7 sec, circa 300 MB

 So while this doesn't get us all the way back down to where we were before
 we started trying to guarantee unique table/column identifiers in EXPLAIN
 printouts, it's at least a lot closer.

 Not sure whether to just commit this to HEAD and call it a day, or to
 risk back-patching.

I think we need to back-patch something; that's a pretty nasty
regression, and I have some EDB-internal reports that might be from
the same cause.  I'm not too concerned about forcibly breaking the API
here, but I can understand why somebody might want to do that.  If we
do, I like the idea of renaming ExplainInitState(), or maybe replacing
it with a NewExplainState() function that is used instead.
But I'm not sure how necessary it is really.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS

2015-01-14 Thread Stephen Frost
* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
 Turns out it wasn't as simple as that. prepend_row_security_policies()
 really could get called multiple times for the same RTE, because the
 call to query_tree_walker() at the end of fireRIRrules() would descend
 into the just-added quals again. The simplest fix seems to be to
 process RLS in a separate loop at the end, so that it can have its
 own infinite recursion detection, which is different from that needed
 for pre-existing security quals and with check options from security
 barrier views. This approach simplifies things a bit, and ensures that
 we only try to expand RLS once for each RTE.

Right, I specifically recall having prepend_row_security_policies()
getting called multiple times for the same RTE.  I like this approach of
using a separate loop though and it strikes me that it lends more
weight to the argument that we're better off with these as independent
considerations.

  Also, I'm thinking that it would be better to refactor things a bit
  and have prepend_row_security_policies() just return the new
  securityQuals and withCheckOptions to add. Then fireRIRrules() would
  only have to recurse into the new quals being added, not the
  already-processed quals.

Hmm, good point.

 Turns out that refactoring actually became necessary in order to fix
 this bug, but I think it makes things cleaner and more efficient.

Sounds good, I'll take a look.

 Here's an updated patch with a new test for this bug. I've been
 developing the fixes for these RLS issues as one big patch, but I
 suppose it would be easy to split up, if that's preferred.

I'm alright with it as-is for now.

Thanks!

Stephen




Re: [HACKERS] Safe memory allocation functions

2015-01-14 Thread Robert Haas
On Tue, Jan 13, 2015 at 10:10 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 However, there is a larger practical problem with this whole concept,
 which is that experience should teach us to be very wary of the assumption
 that asking for memory the system can't give us will just lead to nice
 neat malloc-returns-NULL behavior.  Any small perusal of the mailing list
 archives will remind you that very often the end result will be SIGSEGV,
 OOM kills, unrecoverable trap-on-write when the kernel realizes it can't
 honor a copy-on-write promise, yadda yadda.  Agreed that it's arguable
 that these only occur in misconfigured systems ... but misconfiguration
 appears to be the default in a depressingly large fraction of systems.
 (This is another reason for _safe not being the mot juste :-()

I don't really buy this.  It's pretty incredible to think that after a
malloc() failure there is absolutely no hope of carrying on sanely.
If that were true, we wouldn't be able to ereport() out-of-memory
errors at any severity less than FATAL, but of course it doesn't work
that way.  Moreover, AllocSetAlloc() contains malloc() and, if that
fails, calls malloc() again with a smaller value, without even
throwing an error.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
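
A standalone sketch of the degrade-gracefully pattern Robert describes (hypothetical helper name, not the AllocSetAlloc() code):

    #include <stdio.h>
    #include <stdint.h>
    #include <stdlib.h>

    /* Try a large allocation; on failure retry with a smaller request
     * instead of treating a NULL return as unrecoverable. */
    static void *
    alloc_with_fallback(size_t want, size_t fallback, size_t *got)
    {
        void *p = malloc(want);

        if (p == NULL)
        {
            p = malloc(fallback);
            want = fallback;
        }
        *got = (p != NULL) ? want : 0;
        return p;
    }

    int
    main(void)
    {
        size_t  got;
        void   *buf = alloc_with_fallback(SIZE_MAX / 2, 1024 * 1024, &got);

        if (buf == NULL)
        {
            fprintf(stderr, "out of memory\n");     /* cf. ereport() */
            return 1;
        }
        printf("allocated %zu bytes\n", got);
        free(buf);
        return 0;
    }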




Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Merlin Moncure
On Wed, Jan 14, 2015 at 9:49 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2015-01-14 09:47:19 -0600, Merlin Moncure wrote:
 On Wed, Jan 14, 2015 at 9:30 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
  If you gdb in, and type 'fin' a couple times, to wait till the function
  finishes, is there actually any progress? I'm wondering whether it's
  just many catalog accesses + contention, or some other
  problem. Alternatively set a breakpoint on ScanPgRelation() or so and
  see how often it's hit.

 well, i restarted the database, forgetting my looper was running which
 immediately spun up and it got stuck again with a similar profile
 (lots of cpu in spinlock):

 Samples: 3K of event 'cycles', Event count (approx.): 2695723228
  31.16%  postgres         [.] s_lock
  22.32%  postgres         [.] tas
  12.13%  postgres         [.] tas
   5.93%  postgres         [.] spin_delay
   5.69%  postgres         [.] LWLockRelease
   3.75%  postgres         [.] LWLockAcquireCommon
   3.61%  perf             [.] 0x000526c4
   2.51%  postgres         [.] FunctionCall2Coll
   1.48%  libc-2.17.so     [.] 0x0016a132

  If you gdb in, and type 'fin' a couple times,
 (gdb) fin
 Run till exit from #0  0x7ff4c63f7a97 in semop () from
 /lib/x86_64-linux-gnu/libc.so.6
 0x006de073 in PGSemaphoreLock ()
 (gdb) fin
 Run till exit from #0  0x006de073 in PGSemaphoreLock ()

 It returned once.  The second time, it didn't, at least so far (a minute or so).

 Hm, that's autovac though, not the normal user backends that actually do
 stuff, right? If you could additionally check those, it'd be great.


got it to reproduce.  had to take a break to get some work done.


(gdb) fin
Run till exit from #0  0x7f7b07099dc3 in select ()
   from /lib/x86_64-linux-gnu/libc.so.6
0x008d3ac4 in pg_usleep ()
(gdb) fin
Run till exit from #0  0x008d3ac4 in pg_usleep ()
0x00750b69 in s_lock ()
(gdb) fin
Run till exit from #0  0x00750b69 in s_lock ()
0x00750844 in LWLockRelease ()
(gdb) fin
Run till exit from #0  0x00750844 in LWLockRelease ()
0x0073 in LockBuffer ()
(gdb) fin
Run till exit from #0  0x0073 in LockBuffer ()
0x004b2db4 in _bt_relandgetbuf ()
(gdb) fin
Run till exit from #0  0x004b2db4 in _bt_relandgetbuf ()
0x004b7116 in _bt_moveright ()
(gdb) fin
Run till exit from #0  0x004b7116 in _bt_moveright ()



so it looks like nobody ever exits from _bt_moveright.   any last
requests before I start bisecting down?

merlin




Re: [HACKERS] [PATCH] explain sortorder

2015-01-14 Thread Arne Scheffer

Hi,

we will also remove the following "is lc_collate" hint in the next version,
showing only mandatory info as suggested.

    /* for those who use COLLATE although their default is already
       the wanted */
    if (strcmp(collname, localeptr) == 0)
    {
        appendStringInfo(sortorderInformation, " (%s is LC_COLLATE)",
                         collname);
    }

Anybody insisting on that?

Arne

Note: I see, at the moment we use the wrong default for DESC. We'll fix that.

On Wed, 14 Jan 2015, Heikki Linnakangas wrote:


On 01/14/2015 05:26 PM, Timmer, Marius wrote:

Hello Heikki,

abbreviated version:
Sorry, the problem is only the unhandy patch text format, not different
opinions on how to proceed.


Long version:
The v7 patch file already addressed your suggestions,
but the file contained several (old) local commits,
the new ones at the end of the patch text/file.


Ah, missed that. I stopped reading when I saw the old stuff there :-).


v7.1 is attached and addresses this issue by providing a clean patch file.


Ok, thanks, will take a look.


V8 will - as mentioned - add missing docs and regression tests,


Great!

- Heikki







Re: [HACKERS] [PATCH] explain sortorder

2015-01-14 Thread Timmer, Marius
Hello Heikki,

abbreviated version:
Sorry, the problem is only the unhandy patch text format, not different
opinions on how to proceed.

Long version:
The v7 patch file already addressed your suggestions,
but the file contained several (old) local commits,
the new ones at the end of the patch text/file.

v7.1 is attached and addresses this issue by providing a clean patch file.

V8 will - as mentioned - add missing docs and regression tests, as
Mike suggested.

VlG-Arne & Marius




---
Marius Timmer
Zentrum für Informationsverarbeitung
Westfälische Wilhelms-Universität Münster
Einsteinstraße 60

mtimm...@uni-muenster.de

Am 13.01.2015 um 18:52 schrieb Heikki Linnakangas hlinnakan...@vmware.com:

 On 01/13/2015 06:04 PM, Timmer, Marius wrote:
  -malloc() (StringInfo is used as suggested now).

 There really shouldn't be any snprintf() calls in the patch, when StringInfo 
 is used correctly...

 @@ -1187,6 +1187,7 @@ explain (verbose, costs off) select * from matest0 
 order by 1-id;
  Sort
Output: matest0.id, matest0.name, ((1 - matest0.id))
Sort Key: ((1 - matest0.id))
 +   Sort Order: ASC NULLS LAST
   ->  Result
  Output: matest0.id, matest0.name, (1 - matest0.id)
    ->  Append

 This patch isn't going to be committed with this output format. Please change 
 per my suggestion earlier:

 I don't like this output. If there are a lot of sort keys, it gets
 difficult to match the right ASC/DESC element to the sort key it applies
 to. (Also, there seems to be double-spaces in the list)

 I would suggest just adding the information to the Sort Key line. As
 long as you don't print the modifiers when they are defaults (ASC and
 NULLS LAST), we could print the information even in non-VERBOSE mode. So
 it would look something like:

 Sort Key: sortordertest.n1 NULLS FIRST, sortordertest.n2 DESC

 Or if you don't agree with that, explain why.

 - Heikki




explain_sortorder-v7_1.patch
Description: explain_sortorder-v7_1.patch
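
Heikki's suggested behavior - printing modifiers only when they differ from the defaults - can be sketched as backend-style code (hypothetical helper name; assumes the usual StringInfo API):

    #include "postgres.h"
    #include "lib/stringinfo.h"

    /* Append sort-order modifiers for one sort key, printing nothing for
     * the defaults (ASC, NULLS LAST) so non-VERBOSE output stays terse. */
    static void
    append_sortkey_modifiers(StringInfo buf, bool reverse, bool nulls_first)
    {
        if (reverse)
            appendStringInfoString(buf, " DESC");
        if (nulls_first)
            appendStringInfoString(buf, " NULLS FIRST");
    }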



Re: [HACKERS] Removing INNER JOINs

2015-01-14 Thread Jim Nasby

On 1/13/15 5:02 AM, David Rowley wrote:


I can't quite get my head around what you mean here, as the idea sounds quite 
similar to something that's been discussed already and ruled out.
If we're joining relation a to relation b, say the plan chosen is a merge join. 
If we put some special node as the parent of the merge join then how will we 
know to skip or not skip any sorts that are there especially for the merge 
join; or perhaps the planner chose an index scan as a sorted path, which, now
that the join is removed, could become a faster seqscan. The whole plan
switching node discussion came from this exact problem. Nobody seemed to like 
the non-optimal plan that was not properly optimised for having the relation 
removed.


In my mind this is the same as a root level Alternative Plan, so you'd be free 
to do whatever you wanted in the alternate:

- blah blah
  - Alternate
    - Merge join
      ...
    - SeqScan

I'm guessing this would be easier to code, but that's just a guess. The other 
advantage is if you can't eliminate the join to table A at runtime you could 
still eliminate table B, whereas a top-level Alternate node doesn't have that 
flexibility.

This does have a disadvantage of creating more plan variations to consider. 
With a single top-level Alternate node there's only one other option. I believe 
multiple Alternates would create more options to consider.

Ultimately, unless this is easier to code than a top-level alternate, it's 
probably not worth pursuing.


It also seems that transitioning through needless nodes comes at a cost. This 
is why I quite liked the Alternative Plan node idea, as it allowed me to skip 
over the alternative plan node at plan initialisation.


For init I would expect this to result in a smaller number of nodes than a 
top-level Alternate, because you wouldn't be duplicating all the stuff above 
the joins. That said, I rather doubt it's worth worrying about the cost to 
init; won't it be completely swamped by extra planning cost, no matter how we 
go about this?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Parallel Seq Scan

2015-01-14 Thread Robert Haas
On Tue, Jan 13, 2015 at 6:25 AM, John Gorman johngorm...@gmail.com wrote:
 One approach that has worked well for me is to break big jobs into much
 smaller bite-size tasks. Each task is small enough to complete quickly.

 We add the tasks to a task queue and spawn a generic worker pool which eats
 through the task queue items.

 This solves a lot of problems.

 - Small to medium jobs can be parallelized efficiently.
 - No need to split big jobs perfectly.
 - We don't get into a situation where we are waiting around for a worker to
 finish chugging through a huge task while the other workers sit idle.
 - Worker memory footprint is tiny so we can afford many of them.
 - Worker pool management is a well known problem.
 - Worker spawn time disappears as a cost factor.
 - The worker pool becomes a shared resource that can be managed and reported
 on and becomes considerably more predictable.

I think this is a good idea, but for now I would like to keep our
goals somewhat more modest: let's see if we can get parallel
sequential scan, and only parallel sequential scan, working and
committed.  Ultimately, I think we may need something like what you're
talking about, because if you have a query with three or six or twelve
different parallelizable operations in it, you want the available CPU
resources to switch between those as their respective needs may
dictate.  You certainly don't want to spawn a separate pool of workers
for each scan.

But I think getting that all working in the first version is probably
harder than what we should attempt.  We have a bunch of problems to
solve here just around parallel sequential scan and the parallel mode
infrastructure: heavyweight locking, prefetching, the cost model, and
so on.  Trying to add to that all of the problems that might attend on
a generic task queueing infrastructure fills me with no small amount
of fear.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
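
For reference, the bite-size-task idea John describes can be sketched in a few lines of standalone C (pthreads purely for illustration; the backend would use background workers instead):

    #include <stdio.h>
    #include <pthread.h>

    #define NTASKS   64
    #define NWORKERS 4

    static int next_task = 0;       /* index of the next unclaimed task */
    static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;

    static void
    run_task(int t)
    {
        printf("task %d\n", t);     /* stand-in for scanning one chunk */
    }

    static void *
    worker(void *arg)
    {
        for (;;)
        {
            int t;

            pthread_mutex_lock(&qlock);
            t = (next_task < NTASKS) ? next_task++ : -1;
            pthread_mutex_unlock(&qlock);
            if (t < 0)
                return NULL;        /* queue drained: worker exits */
            run_task(t);
        }
    }

    int
    main(void)
    {
        pthread_t w[NWORKERS];

        for (int i = 0; i < NWORKERS; i++)
            pthread_create(&w[i], NULL, worker, NULL);
        for (int i = 0; i < NWORKERS; i++)
            pthread_join(w[i], NULL);
        return 0;
    }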




Re: [HACKERS] Typo fix in alter_table.sgml

2015-01-14 Thread Robert Haas
On Wed, Jan 14, 2015 at 2:55 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 I noticed that SET STATISTICS was not in a literal block in 
 alter_table.sgml:
   <para>
 -  SET STATISTICS acquires a <literal>SHARE UPDATE EXCLUSIVE</literal>
 lock.
 +  <literal>SET STATISTICS</literal> acquires a
 +  <literal>SHARE UPDATE EXCLUSIVE</literal> lock.
   </para>
 That's a small detail, still.. Patch is attached.

OK, committed.  But that's not a "typo" as stated in $SUBJECT, but
rather a markup fix.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] ereport bug

2015-01-14 Thread Robert Haas
On Wed, Jan 14, 2015 at 1:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Jan 12, 2015 at 6:27 AM, Dmitry Voronin
 carriingfat...@yandex.ru wrote:
 I am attaching to this letter a test case that shows the behavior
 errcontext() macro and the way to fix it.

 So the upshot of this is that given errfinish(A, B, C), where A, B,
 and C are expressions, my gcc is choosing to evaluate C, then B, then
 A, then the errfinish call itself.  But whoever wrote the errcontext()
 macro evidently thought, in this kind of situation, the compiler would
 be certain to evaluate A, then B, then C, then errfinish.  But it
 doesn't.

 http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1f9bf05e539646103c518bcbb49c04919b238f7a

Oops, sorry.  I missed that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
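
The underlying C rule is easy to demonstrate standalone (illustrative only, not the ereport() code): the evaluation order of function arguments is unspecified, so side effects in A, B, and C of f(A, B, C) may run in any order.

    #include <stdio.h>

    static int
    trace(int n)
    {
        printf("evaluated %d\n", n);    /* observe the evaluation order */
        return n;
    }

    static void
    f(int a, int b, int c)
    {
        (void) a; (void) b; (void) c;
    }

    int
    main(void)
    {
        f(trace(1), trace(2), trace(3));    /* gcc may print 3, 2, 1 */
        return 0;
    }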




Re: [HACKERS] pg_basebackup fails with long tablespace paths

2015-01-14 Thread Robert Haas
On Tue, Jan 13, 2015 at 4:41 PM, Peter Eisentraut pete...@gmx.net wrote:
 I think the key point I'm approaching is that the information should
 only ever be in one place, all the time.  This is not dissimilar from
 why we took the tablespace location out of the system catalogs.  Users
 might have all kinds of workflows for how they back up, restore, and
 move their tablespaces.  This works pretty well right now, because the
 authoritative configuration information is always in plain view.  The
 proposal is essentially that we add another location for this
 information, because the existing location is incompatible with some
 operating system tools.  And, when considered by a user, that second
 location might or might not collide with or overwrite the first location
 at some mysterious times.

 So I think the preferable fix is not to add a second location, but to
 make the first location compatible with said operating system tools,
 possibly in the way I propose above.

I see.  I'm a little concerned that following symlinks may be cheaper
than whatever system we would come up with for caching the
tablespace-name-to-file-name mappings.  But that concern might be
unfounded, and apart from it I have no reason to oppose your proposal,
if you want to do the work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
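
The cheap operation Robert refers to is a single readlink() per tablespace; a tiny sketch (the OID in the path is made up):

    #include <stdio.h>
    #include <limits.h>
    #include <unistd.h>

    int
    main(void)
    {
        char        target[PATH_MAX];
        ssize_t     len = readlink("pg_tblspc/16385", target,
                                   sizeof(target) - 1);

        if (len < 0)
        {
            perror("readlink");
            return 1;
        }
        target[len] = '\0';
        printf("tablespace 16385 -> %s\n", target);
        return 0;
    }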




Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS

2015-01-14 Thread Dean Rasheed
On 14 January 2015 at 08:43, Dean Rasheed dean.a.rash...@gmail.com wrote:
 On 12 January 2015 at 14:24, Stephen Frost sfr...@snowman.net wrote:
 Interesting, thanks for the work!  I had been suspicious that there was
 an issue with the recursion handling.


 So continuing to review the RLS code, I spotted the following in
 prepend_row_security_policies():

 /*
  * We may end up getting called multiple times for the same RTE, so check
  * to make sure we aren't doing double-work.
  */
 if (rte->securityQuals != NIL)
 return false;

 which looked suspicious. I assume that's just a hang-up from an
 earlier attempt to prevent infinite recursion in RLS expansion, but
 actually it's wrong because in the case of an UPDATE to a security
 barrier view on top of a table with RLS enabled, the view's
 securityQuals will get added to the RTE first, and that shouldn't
 prevent the underlying table's RLS securityQuals from being added.

 AFAICT, it should be safe to just remove the above code. I can't see
 how prepend_row_security_policies() could end up getting called more
 than once for the same RTE.


Turns out it wasn't as simple as that. prepend_row_security_policies()
really could get called multiple times for the same RTE, because the
call to query_tree_walker() at the end of fireRIRrules() would descend
into the just-added quals again. The simplest fix seems to be to
process RLS in a separate loop at the end, so that it can have its
own infinite recursion detection, which is different from that needed
for pre-existing security quals and with check options from security
barrier views. This approach simplifies things a bit, and ensures that
we only try to expand RLS once for each RTE.

 Also, I'm thinking that it would be better to refactor things a bit
 and have prepend_row_security_policies() just return the new
 securityQuals and withCheckOptions to add. Then fireRIRrules() would
 only have to recurse into the new quals being added, not the
 already-processed quals.


Turns out that refactoring actually became necessary in order to fix
this bug, but I think it makes things cleaner and more efficient.

Here's an updated patch with a new test for this bug. I've been
developing the fixes for these RLS issues as one big patch, but I
suppose it would be easy to split up, if that's preferred.

Regards,
Dean
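
The separate-loop idea can be sketched schematically (stand-in types and names; the real rewriter tracks relation OIDs in a similar list of active expansions) before Dean's actual patch below:

    #include <stdio.h>
    #include <stdlib.h>

    #define MAX_ACTIVE 64

    static int  active[MAX_ACTIVE]; /* relations currently being expanded */
    static int  nactive = 0;

    static void
    expand_rls_for(int relid)
    {
        int     i;

        /* RLS gets its own recursion check: meeting a relation that is
         * already being expanded means its policies reference it. */
        for (i = 0; i < nactive; i++)
            if (active[i] == relid)
            {
                fprintf(stderr, "infinite recursion in policy for %d\n",
                        relid);
                exit(1);
            }
        active[nactive++] = relid;

        /* ... add this relation's policy quals, then recurse only into
         * the quals just added, never into already-processed ones ... */

        nactive--;
    }

    int
    main(void)
    {
        expand_rls_for(42);     /* a well-behaved relation terminates */
        return 0;
    }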
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
new file mode 100644
index 9cbbcfb..34ddf41
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*** inheritance_planner(PlannerInfo *root)
*** 790,795 
--- 790,796 
  {
  	Query	   *parse = root->parse;
  	int			parentRTindex = parse->resultRelation;
+ 	Bitmapset  *resultRTindexes = NULL;
  	List	   *final_rtable = NIL;
  	int			save_rel_array_size = 0;
  	RelOptInfo **save_rel_array = NULL;
*** inheritance_planner(PlannerInfo *root)
*** 814,820 
--- 815,835 
  	 * (1) would result in a rangetable of length O(N^2) for N targets, with
  	 * at least O(N^3) work expended here; and (2) would greatly complicate
  	 * management of the rowMarks list.
+ 	 *
+ 	 * Note that any RTEs with security barrier quals will be turned into
+ 	 * subqueries during planning, and so we must create copies of them too,
+ 	 * except where they are target relations, which will each only be used
+ 	 * in a single plan.
  	 */
+ 	resultRTindexes = bms_add_member(resultRTindexes, parentRTindex);
+ 	foreach(lc, root->append_rel_list)
+ 	{
+ 		AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc);
+ 		if (appinfo->parent_relid == parentRTindex)
+ 			resultRTindexes = bms_add_member(resultRTindexes,
+ 			 appinfo->child_relid);
+ 	}
+ 
  	foreach(lc, root->append_rel_list)
  	{
  		AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc);
*** inheritance_planner(PlannerInfo *root)
*** 885,905 
  			{
  RangeTblEntry *rte = (RangeTblEntry *) lfirst(lr);
  
! if (rte->rtekind == RTE_SUBQUERY)
  {
  	Index		newrti;
  
  	/*
  	 * The RTE can't contain any references to its own RT
! 	 * index, so we can save a few cycles by applying
! 	 * ChangeVarNodes before we append the RTE to the
! 	 * rangetable.
  	 */
  	newrti = list_length(subroot.parse->rtable) + 1;
  	ChangeVarNodes((Node *) subroot.parse, rti, newrti, 0);
  	ChangeVarNodes((Node *) subroot.rowMarks, rti, newrti, 0);
  	ChangeVarNodes((Node *) subroot.append_rel_list, rti, newrti, 0);
  	rte = copyObject(rte);
  	subroot.parse->rtable = lappend(subroot.parse->rtable,
  	rte);
  }
--- 900,928 
  			{
  RangeTblEntry *rte = (RangeTblEntry *) lfirst(lr);
  
! /*
!  * Copy subquery RTEs and RTEs with security barrier quals
!  * that will be turned into subqueries, except those that are
!  * target relations.
!  */
! if (rte->rtekind == RTE_SUBQUERY ||
! 	(rte->securityQuals != 

Re: [HACKERS] orangutan seizes up during isolation-check

2015-01-14 Thread Peter Eisentraut
On 1/1/15 11:04 PM, Noah Misch wrote:
 Clusters hosted on OS X fall into these categories:

 1) Unaffected configuration.  This includes everyone setting a valid messages
locale via LANG, LC_ALL or LC_MESSAGES.
 2) Affected configuration.  Through luck and light use, the cluster would not
experience the crashes/hangs.
 3) Cluster would experience the crashes/hangs.

 DBAs in (3) want the FATAL at startup, but those in (2) want a LOG message
 instead.  DBAs in (1) don't care.  Since intermittent postmaster hangs are 
 far
 worse than startup failure, if (2) and (3) have similar population, FATAL is
 the better bet.  If (2) is sufficiently more populous than (3), then the many
 small pricks from startup failure do add up to hurt more than the occasional
 postmaster hang.  Who knows how that calculation plays out.
 
 The first attached patch, for all branches, adds LOG-level messages and an
 assertion.  So cassert builds will fail hard, while others won't.  The second
 patch, for master only, changes the startup-time message to FATAL.  If we
 decide to use FATAL in all branches, I would just squash them into one.

What I'm seeing now is that the unaccent regression tests when run under
make check-world abort with

FATAL:  postmaster became multithreaded during startup
HINT:  Set the LC_ALL environment variable to a valid locale.

My locale settings for this purpose are

LANG=en_US.UTF-8
LC_COLLATE=en_US.UTF-8
LC_CTYPE=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
LC_MONETARY=en_US.UTF-8
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_ALL=

The environment variable LANG is set, all the other ones are not set.

I could presumably set LC_ALL, but then I lose the ability to set
different locales for different categories.

Running

LC_ALL=$LANG make -C contrib/unaccent check

doesn't fix it; I get the same error.

The behavior is also a bit strange.  The step

== starting postmaster==

hangs for one minute before failing.  This probably has nothing to do
with your change, but maybe pg_regress could detect postmaster startup
failures better.





Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Peter Geoghegan
On Wed, Jan 14, 2015 at 4:53 PM, Merlin Moncure mmonc...@gmail.com wrote:
 yeah.  via:
 cds2=# \copy (select s as page, (bt_page_items('pg_class_oid_index',
 s)).* from generate_series(1,12) s) to '/tmp/page_items.csv' csv
 header;

My immediate observation here is that blocks 2 and 9 have identical
metadata (from their page opaque area), but partially non-matching
data items (however, the number of items on each block is consistent
and correct according to that metadata, as far as it goes). I think
block 9 is supposed to have a right-link to block 5 (block 5 seems to
think so, at least -- its left link is 9).

So now the question is: how did that inconsistency arise? It didn't
necessarily arise at the time of the (presumed) split of block 2 to
create 9. It could be that the opaque area was changed by something
else, some time later. I'll investigate more.

-- 
Peter Geoghegan




Re: [HACKERS] s_lock.h default definitions are rather confused

2015-01-14 Thread Andres Freund
On 2015-01-14 12:27:42 -0500, Robert Haas wrote:
 On Mon, Jan 12, 2015 at 12:57 PM, Andres Freund and...@2ndquadrant.com 
 wrote:
  On 2015-01-10 23:03:36 +0100, Andres Freund wrote:
  On 2015-01-10 16:09:42 -0500, Tom Lane wrote:
   I've not tried to build HEAD on my HPPA dinosaur for awhile, but I did
   just now, and I am presented with boatloads of this:
  
   ../../../src/include/storage/s_lock.h:759: warning: `S_UNLOCK' redefined
   ../../../src/include/storage/s_lock.h:679: warning: this is the location 
   of the previous definition
  
   which is not too surprising because the default definition at line 679
   precedes the HPPA-specific one at line 759.
 
  That's 0709b7ee72e4bc71ad07b7120acd117265ab51d0.
 
  Not too surprising that it broke and wasn't noticed without access to
  hppa - the hppa code uses gcc inline assembly outside of the big
  defined(__GNUC__) and inside the section headed "Platforms that use
  non-gcc inline assembly".
 
   I'm not particularly interested in untangling the effects of the recent
   hackery in s_lock.h enough to figure out how the overall structure got
   broken, but I trust one of you will clean up the mess.
 
  I think it's easiest solved by moving the gcc inline assembly up to the
  rest of the gcc inline assembly. That'll require duplicating a couple
  lines, but seems easier to understand nonetheless. Not pretty.
 
  Robert, do you have a better idea?
 
 How about hoisting the entire hppa section up to the top of the file,
 before the __GNUC__ || __INTEL_COMPILER section?

We could do that, but I'd rather not see that ugliness at the top
every time I open the file :P

Right now I think a #ifdef/undef S_UNLOCK in the relevant gcc section is
sufficient and acceptable. It's after all the HPPA section that doesn't
really play by the rules.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
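
A self-contained sketch of that #ifdef/#undef approach (generic stand-ins, not the actual s_lock.h contents):

    #include <stdio.h>

    typedef int slock_t;

    /* default (fallback) definition, standing in for s_lock.h's */
    #define S_UNLOCK(lock)  (*((volatile slock_t *) (lock)) = 0)

    /* ... later, a platform section that must override the default
     * discards it first, avoiding the "S_UNLOCK redefined" warning: */
    #ifdef S_UNLOCK
    #undef S_UNLOCK
    #endif
    #define S_UNLOCK(lock)  (*((volatile slock_t *) (lock)) = 0)

    int
    main(void)
    {
        slock_t lock = 1;

        S_UNLOCK(&lock);
        printf("%d\n", lock);       /* prints 0 */
        return 0;
    }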




Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-01-14 Thread Kouhei Kaigai
 On Fri, Jan 9, 2015 at 10:51 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
  When custom-scan node replaced a join-plan, it shall have at least two
  child plan-nodes. The callback handler of PlanCustomPath needs to be
  able to call create_plan_recurse() to transform the underlying
  path-nodes to plan-nodes, because this custom-scan node may take other
  built-in scan or sub-join nodes as its inner/outer input.
  In case of FDW, it shall kick any underlying scan relations to remote
  side, thus we may not expect ForeignScan has underlying plans...
 
 Do you have an example of this?

Yes, even though the full code set is too large for patch submission...

https://github.com/pg-strom/devel/blob/master/src/gpuhashjoin.c#L1880

This create_gpuhashjoin_plan() is the PlanCustomPath callback of GpuHashJoin.
It takes a GpuHashJoinPath inherited from CustomPath that has multiple
underlying scan/join paths.
Once it is called back from the backend, it also calls create_plan_recurse()
to make inner/outer plan nodes according to the paths.

As a result, we can see the following query execution plan, in which the
CustomScan takes underlying scan plans.

postgres=# EXPLAIN SELECT * FROM t0 NATURAL JOIN t1 NATURAL JOIN t2;
QUERY PLAN
------------------------------------------------------------------------------
 Custom Scan (GpuHashJoin)  (cost=2968.00..140120.31 rows=3970922 width=143)
   Hash clause 1: (aid = aid)
   Hash clause 2: (bid = bid)
   Bulkload: On
   ->  Custom Scan (GpuScan) on t0  (cost=500.00..57643.00 rows=409
width=77)
   ->  Custom Scan (MultiHash)  (cost=734.00..734.00 rows=4 width=37)
         hash keys: aid
         nBatches: 1  Buckets: 46000  Memory Usage: 99.99%
         ->  Seq Scan on t1  (cost=0.00..734.00 rows=4 width=37)
   ->  Custom Scan (MultiHash)  (cost=734.00..734.00 rows=4 width=37)
         hash keys: bid
         nBatches: 1  Buckets: 46000  Memory Usage: 49.99%
         ->  Seq Scan on t2  (cost=0.00..734.00 rows=4 width=37)
(13 rows)

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com





Re: [HACKERS] Improving RLS qual pushdown

2015-01-14 Thread Robert Haas
On Wed, Jan 14, 2015 at 9:22 AM, Dean Rasheed dean.a.rash...@gmail.com wrote:
 On 14 January 2015 at 13:29, Robert Haas robertmh...@gmail.com wrote:
 One thing they could still leak is the number of times they got
 called, and thus possibly the number of unseen rows.  Now if the
 expressions get constant-folded away that won't be an issue, but a
 clever user can probably avoid that.

 Right now, EXPLAIN ANALYSE can be used to tell you the number of
 unseen rows. Is that something that people are concerned about, and
 are there any plans to change it?

Interesting question.  I don't know.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Shouldn't CREATE TABLE LIKE copy the relhasoids property?

2015-01-14 Thread Andrew Dunstan


On 01/14/2015 07:29 PM, Tom Lane wrote:

dst1 doesn't get an OID column:

regression=# create table src1 (f1 int) with oids;
CREATE TABLE
regression=# create table dst1 (like src1);
CREATE TABLE
regression=# \d+ src1
                    Table "public.src1"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 f1     | integer |           | plain   |              |
Has OIDs: yes

regression=# \d+ dst1
                    Table "public.dst1"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 f1     | integer |           | plain   |              |


If you don't find that problematic, how about this case?

regression=# create table src2 (f1 int, primary key(oid)) with oids;
CREATE TABLE
regression=# create table dst2 (like src2 including indexes);
ERROR:  column oid named in key does not exist





I agree it's odd, and probably wrong, although it's been like that for a 
very long time, hasn't it?


cheers

andrew




Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Merlin Moncure
On Wed, Jan 14, 2015 at 6:50 PM, Peter Geoghegan p...@heroku.com wrote:
 This is great, but it's not exactly clear which bt_page_items() page
 is which - some are skipped, but I can't be sure which. Would you mind
 rewriting that query to indicate which block is under consideration by
 bt_page_items()?

yeah.  via:
cds2=# \copy (select s as page, (bt_page_items('pg_class_oid_index',
s)).* from generate_series(1,12) s) to '/tmp/page_items.csv' csv
header;

merlin
page,itemoffset,ctid,itemlen,nulls,vars,data
1,1,(0,9),16,f,f,16 0b 00 00 00 00 00 00
1,2,(2,50),16,f,f,70 00 00 00 00 00 00 00
1,3,(2,52),16,f,f,71 00 00 00 00 00 00 00
1,4,(2,24),16,f,f,ae 00 00 00 00 00 00 00
1,5,(2,25),16,f,f,af 00 00 00 00 00 00 00
1,6,(2,51),16,f,f,24 02 00 00 00 00 00 00
1,7,(2,53),16,f,f,25 02 00 00 00 00 00 00
1,8,(4,66),16,f,f,3a 03 00 00 00 00 00 00
1,9,(2,55),16,f,f,3b 03 00 00 00 00 00 00
1,10,(2,56),16,f,f,3c 03 00 00 00 00 00 00
1,11,(4,15),16,f,f,70 04 00 00 00 00 00 00
1,12,(1,31),16,f,f,71 04 00 00 00 00 00 00
1,13,(4,14),16,f,f,bd 04 00 00 00 00 00 00
1,14,(4,17),16,f,f,be 04 00 00 00 00 00 00
1,15,(1,34),16,f,f,d0 04 00 00 00 00 00 00
1,16,(1,35),16,f,f,d1 04 00 00 00 00 00 00
1,17,(0,2),16,f,f,df 04 00 00 00 00 00 00
1,18,(3,10),16,f,f,e1 04 00 00 00 00 00 00
1,19,(3,11),16,f,f,e7 04 00 00 00 00 00 00
1,20,(7,6),16,f,f,eb 04 00 00 00 00 00 00
1,21,(2,23),16,f,f,ec 04 00 00 00 00 00 00
1,22,(4,16),16,f,f,ed 04 00 00 00 00 00 00
1,23,(4,12),16,f,f,ee 04 00 00 00 00 00 00
1,24,(4,64),16,f,f,89 05 00 00 00 00 00 00
1,25,(3,31),16,f,f,8a 05 00 00 00 00 00 00
1,26,(1,48),16,f,f,8b 08 00 00 00 00 00 00
1,27,(4,63),16,f,f,18 09 00 00 00 00 00 00
1,28,(0,64),16,f,f,20 09 00 00 00 00 00 00
1,29,(0,65),16,f,f,21 09 00 00 00 00 00 00
1,30,(4,18),16,f,f,5c 09 00 00 00 00 00 00
1,31,(1,11),16,f,f,5d 09 00 00 00 00 00 00
1,32,(4,40),16,f,f,28 0a 00 00 00 00 00 00
1,33,(3,47),16,f,f,29 0a 00 00 00 00 00 00
1,34,(3,48),16,f,f,2a 0a 00 00 00 00 00 00
1,35,(3,49),16,f,f,2b 0a 00 00 00 00 00 00
1,36,(3,41),16,f,f,2c 0a 00 00 00 00 00 00
1,37,(4,45),16,f,f,2d 0a 00 00 00 00 00 00
1,38,(3,42),16,f,f,2e 0a 00 00 00 00 00 00
1,39,(4,48),16,f,f,2f 0a 00 00 00 00 00 00
1,40,(4,49),16,f,f,30 0a 00 00 00 00 00 00
1,41,(4,44),16,f,f,31 0a 00 00 00 00 00 00
1,42,(3,43),16,f,f,32 0a 00 00 00 00 00 00
1,43,(7,26),16,f,f,33 0a 00 00 00 00 00 00
1,44,(3,50),16,f,f,34 0a 00 00 00 00 00 00
1,45,(4,71),16,f,f,35 0a 00 00 00 00 00 00
1,46,(4,47),16,f,f,37 0a 00 00 00 00 00 00
1,47,(3,46),16,f,f,38 0a 00 00 00 00 00 00
1,48,(3,44),16,f,f,39 0a 00 00 00 00 00 00
1,49,(4,41),16,f,f,3a 0a 00 00 00 00 00 00
1,50,(0,1),16,f,f,3b 0a 00 00 00 00 00 00
1,51,(4,42),16,f,f,3c 0a 00 00 00 00 00 00
1,52,(0,61),16,f,f,5a 0a 00 00 00 00 00 00
1,53,(0,54),16,f,f,5b 0a 00 00 00 00 00 00
1,54,(0,55),16,f,f,5c 0a 00 00 00 00 00 00
1,55,(0,56),16,f,f,5d 0a 00 00 00 00 00 00
1,56,(0,57),16,f,f,5e 0a 00 00 00 00 00 00
1,57,(0,59),16,f,f,5f 0a 00 00 00 00 00 00
1,58,(0,50),16,f,f,60 0a 00 00 00 00 00 00
1,59,(7,14),16,f,f,61 0a 00 00 00 00 00 00
1,60,(0,42),16,f,f,62 0a 00 00 00 00 00 00
1,61,(0,43),16,f,f,63 0a 00 00 00 00 00 00
1,62,(0,68),16,f,f,64 0a 00 00 00 00 00 00
1,63,(0,69),16,f,f,65 0a 00 00 00 00 00 00
1,64,(7,3),16,f,f,66 0a 00 00 00 00 00 00
1,65,(7,4),16,f,f,67 0a 00 00 00 00 00 00
1,66,(0,53),16,f,f,68 0a 00 00 00 00 00 00
1,67,(7,22),16,f,f,69 0a 00 00 00 00 00 00
1,68,(7,23),16,f,f,6a 0a 00 00 00 00 00 00
1,69,(7,24),16,f,f,6b 0a 00 00 00 00 00 00
1,70,(1,72),16,f,f,6c 0a 00 00 00 00 00 00
1,71,(1,73),16,f,f,6d 0a 00 00 00 00 00 00
1,72,(1,74),16,f,f,6e 0a 00 00 00 00 00 00
1,73,(1,6),16,f,f,6f 0a 00 00 00 00 00 00
1,74,(1,7),16,f,f,70 0a 00 00 00 00 00 00
1,75,(1,75),16,f,f,71 0a 00 00 00 00 00 00
1,76,(1,76),16,f,f,72 0a 00 00 00 00 00 00
1,77,(1,67),16,f,f,73 0a 00 00 00 00 00 00
1,78,(0,40),16,f,f,74 0a 00 00 00 00 00 00
1,79,(0,41),16,f,f,75 0a 00 00 00 00 00 00
1,80,(1,50),16,f,f,76 0a 00 00 00 00 00 00
1,81,(1,51),16,f,f,77 0a 00 00 00 00 00 00
1,82,(1,49),16,f,f,78 0a 00 00 00 00 00 00
1,83,(1,58),16,f,f,79 0a 00 00 00 00 00 00
1,84,(1,59),16,f,f,7a 0a 00 00 00 00 00 00
1,85,(1,89),16,f,f,7b 0a 00 00 00 00 00 00
1,86,(1,70),16,f,f,7c 0a 00 00 00 00 00 00
1,87,(1,71),16,f,f,7d 0a 00 00 00 00 00 00
1,88,(1,56),16,f,f,7e 0a 00 00 00 00 00 00
1,89,(1,57),16,f,f,7f 0a 00 00 00 00 00 00
1,90,(1,52),16,f,f,80 0a 00 00 00 00 00 00
1,91,(1,53),16,f,f,81 0a 00 00 00 00 00 00
1,92,(1,43),16,f,f,82 0a 00 00 00 00 00 00
1,93,(7,1),16,f,f,83 0a 00 00 00 00 00 00
1,94,(1,61),16,f,f,84 0a 00 00 00 00 00 00
1,95,(1,62),16,f,f,85 0a 00 00 00 00 00 00
1,96,(0,29),16,f,f,86 0a 00 00 00 00 00 00
1,97,(0,30),16,f,f,87 0a 00 00 00 00 00 00
1,98,(6,35),16,f,f,88 0a 00 00 00 00 00 00
1,99,(1,36),16,f,f,89 0a 00 00 00 00 00 00
1,100,(1,37),16,f,f,8a 0a 00 00 00 00 00 00
1,101,(1,63),16,f,f,8b 0a 00 00 00 00 00 00
1,102,(1,64),16,f,f,8d 0a 00 00 00 00 00 00
1,103,(7,61),16,f,f,8e 0a 00 00 00 00 00 00
1,104,(2,19),16,f,f,8f 0a 00 00 00 00 00 00
1,105,(2,22),16,f,f,90 0a 

Re: [HACKERS] Async execution of postgres_fdw.

2015-01-14 Thread Robert Haas
On Tue, Jan 13, 2015 at 6:46 AM, Kyotaro HORIGUCHI
horiguchi.kyot...@lab.ntt.co.jp wrote:
 Is it possible to use the parallel query infrastructure being built by
 Robert or to do something like parallel seq scan? That will work, not just
 for Postgres FDW but all the FDWs.

 But, I think, from the performance view, every scan of multiple
 foreign scans doesn't need a corresponding local process.

Quite so.  I think this is largely a separate project.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] __attribute__ for non-gcc compilers

2015-01-14 Thread Andres Freund
On 2015-01-14 17:46:39 -0500, Robert Haas wrote:
 On Wed, Jan 14, 2015 at 5:29 PM, Andres Freund and...@2ndquadrant.com wrote:
  I think it's better than the alternatives:
 
  a) Don't support 64bit atomics on any 32bit platform. I think that'd be
 sad because there's some places that could greatly benefit from being
 able to read/store/manipulate e.g. LSNs atomically.
  b) Double the size of 64bit atomics on 32bit platforms, and add
 TYPEALIGN() to every access inside the atomics implementation.
  c) Require 64 atomics to be aligned appropriately manually in every
 place they're embedded. I think that's completely impractical.
 
  The only viable one imo is a)
 
 I can't really fault that reasoning, but if __attribute__((align))
 only works on some platforms, then you've got silent, subtle breakage
 on the ones where it doesn't.

The __attribute__((align()))'s are in compiler specific code sections
anyway - and there's asserts ensuring that the alignment is correct in
all routines where it matters (IIRC). Those are what caught the
problem. C.f.
http://archives.postgresql.org/message-id/20150108204635.GK6299%40alap3.anarazel.de

I think I'd for now simply not define pg_attribute_aligned() on
platforms where it's not supported, instead of defining it empty. If we
need a softer variant we can name it pg_attribute_aligned_if_possible or
something.

Sounds sane?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
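
A sketch of what Andres proposes (macro names modeled on the patch's pg_attribute_* convention; the exact compiler tests are assumptions): each attribute is defined only where the compiler supports it, and pg_attribute_aligned() is left entirely undefined elsewhere, so dependent code fails to compile instead of silently losing alignment.

    #if defined(__GNUC__) || defined(__SUNPRO_C) || defined(__IBMC__)
    #define pg_attribute_aligned(a)     __attribute__((aligned(a)))
    #define pg_attribute_unused         __attribute__((unused))
    #endif
    /* deliberately no #else defining pg_attribute_aligned() as empty */

    typedef struct
    {
        /* on 32-bit platforms this 8-byte alignment is what makes 64-bit
         * atomics safe; without macro support this line is a hard compile
         * error rather than a silently misaligned field */
        pg_attribute_aligned(8) unsigned long long value;
    } atomic_uint64_sketch;

    int
    main(void)
    {
        atomic_uint64_sketch v = {0};

        return (int) v.value;
    }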




[HACKERS] Shouldn't CREATE TABLE LIKE copy the relhasoids property?

2015-01-14 Thread Tom Lane
dst1 doesn't get an OID column:

regression=# create table src1 (f1 int) with oids;
CREATE TABLE
regression=# create table dst1 (like src1);
CREATE TABLE
regression=# \d+ src1
                    Table "public.src1"
 Column |  Type   | Modifiers | Storage | Stats target | Description 
--------+---------+-----------+---------+--------------+-------------
 f1     | integer |           | plain   |              | 
Has OIDs: yes

regression=# \d+ dst1
                    Table "public.dst1"
 Column |  Type   | Modifiers | Storage | Stats target | Description 
--------+---------+-----------+---------+--------------+-------------
 f1     | integer |           | plain   |              | 


If you don't find that problematic, how about this case?

regression=# create table src2 (f1 int, primary key(oid)) with oids;
CREATE TABLE
regression=# create table dst2 (like src2 including indexes);
ERROR:  column oid named in key does not exist

regards, tom lane




Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Peter Geoghegan
This is great, but it's not exactly clear which bt_page_items() page
is which - some are skipped, but I can't be sure which. Would you mind
rewriting that query to indicate which block is under consideration by
bt_page_items()?

Thanks
-- 
Peter Geoghegan




Re: [HACKERS] Turning recovery.conf into GUCs

2015-01-14 Thread Josh Berkus
On 01/15/2015 02:24 AM, Robert Haas wrote:
 I think your idea of adding read-only GUCs with the same names as all
 of the recovey.conf parameters is a clear win.  Even if we do nothing
 more than that, it makes the values visible from the SQL level, and
 that's good.  But I think we should go further and make them really be
 GUCs.  Otherwise, if you want to be able to change one of those
 parameters other than at server startup time, you've got to invent a
 separate system for reloading them on SIGHUP.  If you make them part
 of the GUC mechanism, you can use that.  I think it's quite safe to
 say that the whole reason we *have* a GUC mechanism is so that all of
 our configuration can be done through one grand, unified mechanism
 rather than being fragmented across many files.

+1

I do find it ironic that the creator of the Grand Unified Configuration
Settings is arguing against unifying the files.

 Some renaming might be in order.  Heikki previously suggested merging
 all of the recovery_target_whatever settings down into a single
 parameter recovery_target='kindofrecoverytarget furtherdetailsgohere',
 like recovery_target='xid 1234' or recovery_target='name bob'.  Maybe
 that would be more clear (or not). 

Not keen on non-atomic settings, personally.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] pg_rewind in contrib

2015-01-14 Thread Robert Haas
On Wed, Jan 7, 2015 at 8:44 AM, Andrew Dunstan and...@dunslane.net wrote:
 I understand, but I think pg_rewind is likely to be misleading to many
 users who will say "but I don't want just to rewind".

 I'm not wedded to the name I suggested, but I think we should look at
 possible alternative names. We do have experience of misleading names
 causing confusion (e.g. initdb).

FWIW, pg_rewind sounds pretty good to me.  But maybe I'm just not
understanding what the use case for this is, apart from rewinding?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] __attribute__ for non-gcc compilers

2015-01-14 Thread Andres Freund
On 2015-01-14 15:48:47 -0500, Robert Haas wrote:
 On Tue, Jan 13, 2015 at 4:18 PM, Oskari Saarenmaa o...@ohmu.fi wrote:
  Commit db4ec2ffce35 added alignment attributes for 64-bit atomic
  variables as required on 32-bit platforms using
  __attribute__((aligned(8)).  That works fine with GCC, and would work
  with Solaris Studio Compiler [1] and IBM XL C [2], but src/include/c.h
  defines __attribute__ as an empty macro when not using GCC.
  Unfortunately we can't just disable that #define and enable all
  __attributes__ for Solaris CC and XLC as we use a bunch of attributes
  that are not supported by those compilers and using them unconditionally
  would generate a lot of warnings.
 
  Attached a patch that defines custom macros for each attribute and
  enables them individually for compilers that support them and never
  defines __attribute__.
 
  I have tested this with GCC 4.9.2 on Linux x86-64 and Solaris CC 5.12 on
  Sparc (32-bit build); I don't have access to an IBM box with XLC.
 
 I guess my first question is whether we want to be relying on
 __attribute__((aligned)) in the first place.

I think it's better than the alternatives:

a) Don't support 64bit atomics on any 32bit platform. I think that'd be
   sad because there's some places that could greatly benefit from being
   able to read/store/manipulate e.g. LSNs atomically.
b) Double the size of 64bit atomics on 32bit platforms, and add
   TYPEALIGN() to every access inside the atomics implementation.
c) Require 64bit atomics to be aligned appropriately manually in every
   place they're embedded. I think that's completely impractical.

The only viable one imo is a)

Since the atomics code is compiler dependent anyway, I don't see much of a
problem with relying on compiler specific alignment
instructions. Really, the only problem right now is that __attribute__
is defined to be empty on compilers where it has meaning.

 If we do, then this seems like a pretty sensible and necessary change.

I think it's also an improvement independent of the alignment
code. Having a more widely portable __attribute__((noreturn)),
__attribute__((unused)), __attribute__((format(..))) seems like a good
general improvement. And all of these are supported in other compilers
than gcc.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Merlin Moncure
On Wed, Jan 14, 2015 at 2:32 PM, Peter Geoghegan p...@heroku.com wrote:
 On Wed, Jan 14, 2015 at 12:24 PM, Peter Geoghegan p...@heroku.com wrote:
 Could you write some code to print out the block number (i.e.
 BlockNumber blkno) if there are more than, say, 5 retries within
 _bt_moveright()?

 Obviously I mean that the block number should be printed, no matter
 whether or not the P_INCOMPLETE_SPLIT() path is taken or not. So you
 should just move this to the top of the for(;;) loop, so it's always
 available to print:

 BlockNumber blkno = BufferGetBlockNumber(buf);

(gdb) print  BufferGetBlockNumber(buf)
$15 = 9

..and it stays 9, continuing several times having set breakpoint.

merlin




Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Peter Geoghegan
On Wed, Jan 14, 2015 at 4:26 PM, Merlin Moncure mmonc...@gmail.com wrote:
 The index is the oid index on pg_class.  Some more info:

 *) temp table churn is fairly high.  Several dozen get spawned and
 destroyed at the start of a replication run, all at once, due to some
 dodgy coding via dblink.  During the replication run, the temp table
 churn rate drops.

 *)  running btreecheck, I see:

I knew I wrote that tool for a reason.   :-)

It would be great if I could get a raw dump of the index using
contrib/pageinspect at this juncture too, as already described.
-- 
Peter Geoghegan




Re: [HACKERS] __attribute__ for non-gcc compilers

2015-01-14 Thread Robert Haas
On Wed, Jan 14, 2015 at 5:29 PM, Andres Freund and...@2ndquadrant.com wrote:
 I think it's better than the alternatives:

 a) Don't support 64bit atomics on any 32bit platform. I think that'd be
sad because there's some places that could greatly benefit from being
able to read/store/manipulate e.g. LSNs atomically.
 b) Double the size of 64bit atomics on 32bit platforms, and add
TYPEALIGN() to every access inside the atomics implementation.
 c) Require 64bit atomics to be aligned appropriately manually in every
place they're embedded. I think that's completely impractical.

 The only viable one imo is a)

I can't really fault that reasoning, but if __attribute__((align))
only works on some platforms, then you've got silent, subtle breakage
on the ones where it doesn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Merlin Moncure
On Wed, Jan 14, 2015 at 5:39 PM, Peter Geoghegan p...@heroku.com wrote:
 On Wed, Jan 14, 2015 at 3:38 PM, Merlin Moncure mmonc...@gmail.com wrote:
 (gdb) print  BufferGetBlockNumber(buf)
 $15 = 9

 ..and it stays 9, continuing several times having set breakpoint.


 And the index involved? I'm pretty sure that this in an internal page, no?

The index is the oid index on pg_class.  Some more info:

*) temp table churn is fairly high.  Several dozen get spawned and
destroyed at the start of a replication run, all at once, due to some
dodgy coding via dblink.  During the replication run, the temp table
churn rate drops.

*)  running btreecheck, I see:
cds2=# select bt_index_verify('pg_class_oid_index');
NOTICE:  page 7 of index pg_class_oid_index is deleted
NOTICE:  page 10 of index pg_class_oid_index is deleted
NOTICE:  page 12 of index pg_class_oid_index is deleted
 bt_index_verify
─────────────────


cds2=# select bt_leftright_verify('pg_class_oid_index');
WARNING:  left link/right link pair don't comport at level 0, block 9,
last: 2, current left: 4
WARNING:  left link/right link pair don't comport at level 0, block 9,
last: 9, current left: 4
WARNING:  left link/right link pair don't comport at level 0, block 9,
last: 9, current left: 4
WARNING:  left link/right link pair don't comport at level 0, block 9,
last: 9, current left: 4
WARNING:  left link/right link pair don't comport at level 0, block 9,
last: 9, current left: 4
[repeat infinity until cancel]

which looks like the index is corrupted?  ISTM _bt_moveright is
hanging because it's trying to move from block 9 to block 9 and so
loops forever.

merlin
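
A toy illustration of that failure mode (simplified page model, not the nbtree code): following right-links with a guard turns the block-9 self-loop into a detected error instead of an infinite loop.

    #include <stdio.h>

    #define NBLOCKS 16

    static int rightlink[NBLOCKS];      /* rightlink[b] = next block, or -1 */

    static int
    walk_right(int start)
    {
        int cur = start;

        for (int steps = 0; steps < NBLOCKS; steps++)
        {
            int next = rightlink[cur];

            if (next < 0)
                return cur;             /* reached the rightmost page */
            if (next == cur)
            {
                fprintf(stderr, "self-referential right-link at block %d\n",
                        cur);
                return -1;              /* corrupted chain */
            }
            cur = next;
        }
        fprintf(stderr, "right-link cycle starting at block %d\n", start);
        return -1;
    }

    int
    main(void)
    {
        for (int i = 0; i < NBLOCKS; i++)
            rightlink[i] = -1;
        rightlink[2] = 9;
        rightlink[9] = 9;               /* the corruption in question */
        walk_right(2);
        return 0;
    }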




Re: [HACKERS] Shouldn't CREATE TABLE LIKE copy the relhasoids property?

2015-01-14 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 01/14/2015 07:29 PM, Tom Lane wrote:
 If you don't find that problematic, how about this case?
 
 regression=# create table src2 (f1 int, primary key(oid)) with oids;
 CREATE TABLE
 regression=# create table dst2 (like src2 including indexes);
 ERROR:  column oid named in key does not exist

 I agree it's odd, and probably wrong, although it's been like that for a 
 very long time, hasn't it?

Sure, LIKE has always behaved this way.  It still seems wrong though.
As a reference point, creating a table that inherits from src1 or src2
will result in it having oids (even if you say WITHOUT OIDS).

regards, tom lane




Re: [HACKERS] PATCH: decreasing memory needlessly consumed by array_agg

2015-01-14 Thread Tomas Vondra
On 12.1.2015 01:28, Ali Akbar wrote:
 
  Or else we implement what you suggest below (more comments below):
 
  Thinking about the 'release' flag a bit more - maybe we could do this
  instead:
 
  if (release && astate->private_cxt)
      MemoryContextDelete(astate->mcontext);
  else if (release)
  {
      pfree(astate->dvalues);
      pfree(astate->dnulls);
      pfree(astate);
  }
 
  i.e. either destroy the whole context if possible, and just free the
  memory when using a shared memory context. But I'm afraid this would
  penalize the shared memory context, because that's intended for cases
  where all the build states coexist in parallel and then at some point
  are all converted into a result and thrown away. Adding pfree() calls
  is no improvement here, and just wastes cycles.
 
 
  As per Tom's comment, I'm using the parent memory context instead of
  the shared memory context below.
 
  In the future, if some code writer decides to use subcontext=false
  to save memory in cases where there are many array accumulations,
  and the parent memory context is long-living, the current code can
  cause a memory leak. So I think we should implement your suggestion
  (pfreeing astate), and warn about the implication in the API
  comment. The API user must choose between release=true, wasting
  cycles but preventing memory leaks, or managing memory from the
  parent memory context.
 
 I'm wondering whether this is necessary after fixing makeArrayResult to
 use the private_cxt flag. It's still possible to call makeMdArrayResult
 directly (with the wrong 'release' value).
 
 Another option might be to get rid of the 'release' flag altogether, and
 just use the 'private_cxt' - I'm not aware of a code using release=false
 with private_cxt=true (e.g. to build the same array twice from the same
 astate). But maybe there's such code, and another downside is that it'd
 break the existing API.
 
  In one possible use case, for efficiency maybe the caller will
  create a special parent memory context for all array accumulation,
  then use makeArrayResult* with release=false, and in the end
  release the parent memory context once for all.
 
 Yeah, although I'd much rather not break the existing code at all. That
 is - my goal is not to make it slower unless absolutely necessary (and
 in that case the code may be fixed per your suggestion). But I'm not
 convinced it's worth it.
 
 
 OK. Do you think we need to note this in the comments? Something like
 this: "If using subcontext=false, the caller must be careful about memory
 usage, because makeArrayResult* will not free the memory used."

Yes, I think it's worth mentioning.

 But I think it makes sense to move the error handling into
 initArrayResultArr(), including the get_element_type() call, and remove
 the element_type from the signature. This means initArrayResultAny()
 will call the get_element_type() twice, but I guess that's negligible.
 And everyone who calls initArrayResultArr() will get the error handling
 for free.
 
 Patch v7 attached, implementing those two changes, i.e.
 
   * makeMdArrayResult(..., astate->private_cxt)
   * move error handling into initArrayResultArr()
   * remove element_type from initArrayResultArr() signature
 
  
 Reviewing the v7 patch:
 - applies cleanly to current master. patch format, whitespace, etc is good
 - make check runs without error
 - performance & memory usage still consistent
 
 If you think we don't have to add the comment (see above), i'll mark
 this as ready for committer

Attached is v8 patch, with a few comments added:

1) before initArrayResult() - explaining when it's better to use a
   single memory context, and when it's more efficient to use a
   separate memory context for each array build state

2) before makeArrayResult() - explaining that it won't free memory
   when allocated in a single memory context (and that a pfree()
   has to be used if necessary)

3) before makeMdArrayResult() - explaining that it's illegal to use
   release=true unless using a subcontext

Otherwise the v8 patch is exactly the same as v7. Assuming the comments
make it sufficiently clear, I agree with marking this as 'ready for
committer'.
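
For illustration, a minimal usage sketch under this API (signatures
assumed from the v8 patch; int4 arrays, error handling omitted):

    ArrayBuildState *astate;
    Datum       result;
    int         i;

    /* subcontext=false: accumulate in the caller's context, so no
     * private memory context is created for this build state */
    astate = initArrayResult(INT4OID, CurrentMemoryContext, false);

    for (i = 0; i < 100; i++)
        astate = accumArrayResult(astate, Int32GetDatum(i), false,
                                  INT4OID, CurrentMemoryContext);

    /* per comment (2): with no private context, this cannot free the
     * build state - its memory lives until the parent context is reset */
    result = makeArrayResult(astate, CurrentMemoryContext);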

regards

-- 
Tomas Vondra                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/executor/nodeSubplan.c b/src/backend/executor/nodeSubplan.c
index f3ce1d7..9eb4d63 100644
--- a/src/backend/executor/nodeSubplan.c
+++ 

Re: [HACKERS] Out-of-bounds write and incorrect detection of trigger file in pg_standby

2015-01-14 Thread Robert Haas
On Wed, Jan 14, 2015 at 8:24 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 In pg_standby.c, we use a 32-byte buffer in CheckForExternalTrigger into
 which the content of the trigger file defined by -f is read:
 CheckForExternalTrigger(void)
 {
 char        buf[32];
 [...]
 if ((len = read(fd, buf, sizeof(buf))) < 0)
 {
     stuff();
 }

 if (len < sizeof(buf))
     buf[len] = '\0';
 However it happens that if the file read contains 32 bytes or more, we
 enforce \0 in a position out of bounds. While looking at that, I also
 noticed that pg_standby can trigger a failover as long as the
 beginning of buf matches either "smart" or "fast", but I think we
 should check for a perfect match instead of a comparison of the first
 bytes, no?

 Attached is a patch to fix the out-of-bound issue as well as
 improvements for the detection of the trigger file content. I think
 that the out-of-bound fix should be backpatched, while the
 improvements for the trigger file are master-only. Coverity has
 pointed out the out-of-bound issue.

I would imagine that the goal might have been to accept not only "smart"
but also "smart\r" and "smart\n" and "smart\r\n".  It's a little weird that
we also accept "smartypants", though.

Instead of doing this:

if (len < sizeof(buf))
buf[len] = '\0';

...I would suggest making the size of the buffer one greater than the
size of the read(), and then always nul-terminating the buffer.  It
seems to me that would make the code easier to reason about.
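
For what it's worth, a minimal sketch of that approach (simplified, not
the actual pg_standby code):

    char        buf[32];
    int         len;

    /* read at most sizeof(buf) - 1 bytes ... */
    if ((len = read(fd, buf, sizeof(buf) - 1)) < 0)
    {
        /* report the error and give up, as the existing code does */
        return;
    }
    /* ... so the terminator is always in bounds */
    buf[len] = '\0';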

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Peter Geoghegan
On Wed, Jan 14, 2015 at 3:38 PM, Merlin Moncure mmonc...@gmail.com wrote:
 (gdb) print  BufferGetBlockNumber(buf)
 $15 = 9

 ..and it stays 9, continuing several times having set breakpoint.


And the index involved? I'm pretty sure that this is an internal page, no?

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] s_lock.h default definitions are rather confused

2015-01-14 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 Right now I think a #ifdef/undef S_UNLOCK in the relevant gcc section is
 sufficient and acceptable. It's after all the HPPA section that doesn't
 really play by the rules.

Works for me.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Merlin Moncure
On Wed, Jan 14, 2015 at 6:26 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Wed, Jan 14, 2015 at 5:39 PM, Peter Geoghegan p...@heroku.com wrote:
 On Wed, Jan 14, 2015 at 3:38 PM, Merlin Moncure mmonc...@gmail.com wrote:
 (gdb) print  BufferGetBlockNumber(buf)
 $15 = 9

 ..and it stays 9, continuing several times having set breakpoint.


  And the index involved? I'm pretty sure that this is an internal page, no?

 The index is the oid index on pg_class.  Some more info:

 *) temp table churn is fairly high.  Several dozen get spawned and
 destroyed at the start of a replication run, all at once, due to some
 dodgy coding via dblink.  During the replication run, the temp table
 churn rate drops.

 *)  running btreecheck, I see:
 cds2=# select bt_index_verify('pg_class_oid_index');
 NOTICE:  page 7 of index pg_class_oid_index is deleted
 NOTICE:  page 10 of index pg_class_oid_index is deleted
 NOTICE:  page 12 of index pg_class_oid_index is deleted
  bt_index_verify
 ─────────────────


 cds2=# select bt_leftright_verify('pg_class_oid_index');
 WARNING:  left link/right link pair don't comport at level 0, block 9,
 last: 2, current left: 4
 WARNING:  left link/right link pair don't comport at level 0, block 9,
 last: 9, current left: 4
 WARNING:  left link/right link pair don't comport at level 0, block 9,
 last: 9, current left: 4
 WARNING:  left link/right link pair don't comport at level 0, block 9,
 last: 9, current left: 4
 WARNING:  left link/right link pair don't comport at level 0, block 9,
 last: 9, current left: 4
 [repeats infinitely until cancelled]

 which looks like the index is corrupted?  ISTM _bt_moveright is
 hanging because it's trying to move from block 9 to block 9 and so
 loops forever.

per Peter the following might be useful:

cds2=# select * from bt_metap('pg_class_oid_index');
 magic  │ version │ root │ level │ fastroot │ fastlevel
────────┼─────────┼──────┼───────┼──────────┼───────────
 340322 │       2 │    3 │     1 │        3 │         1

cds2=# select (bt_page_stats('pg_class_oid_index', s)).* from
generate_series(1,12) s;
 blkno │ type │ live_items │ dead_items │ avg_item_size │ page_size │ free_size │ btpo_prev │ btpo_next │ btpo  │ btpo_flags
───────┼──────┼────────────┼────────────┼───────────────┼───────────┼───────────┼───────────┼───────────┼───────┼────────────
     1 │ l    │        119 │          0 │            16 │      8192 │      5768 │         0 │         4 │     0 │          1
     2 │ l    │         25 │          0 │            16 │      8192 │      7648 │         4 │         9 │     0 │          1
     3 │ r    │          8 │          0 │            15 │      8192 │      7996 │         0 │         0 │     1 │          2
     4 │ l    │        178 │          0 │            16 │      8192 │      4588 │         1 │         2 │     0 │          1
     5 │ l    │          7 │          0 │            16 │      8192 │      8008 │         9 │        11 │     0 │          1
     6 │ l    │          5 │          0 │            16 │      8192 │      8048 │        11 │         8 │     0 │          1
     7 │ d    │          0 │          0 │             0 │      8192 │         0 │        -1 │        -1 │ 12366 │          0
     8 │ l    │        187 │          0 │            16 │      8192 │      4408 │         6 │         0 │     0 │          1
     9 │ l    │         25 │          0 │            16 │      8192 │      7648 │         4 │         9 │     0 │          1
    10 │ d    │          0 │          0 │             0 │      8192 │         0 │        -1 │        -1 │ 12366 │          0
    11 │ l    │          6 │          0 │            16 │      8192 │      8028 │         5 │         6 │     0 │          1
    12 │ d    │          0 │          0 │             0 │      8192 │         0 │        -1 │        -1 │ 10731 │          0


merlin
cds2=# select (bt_page_items('pg_class_oid_index', s)).* from generate_series(1,12) s;
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
NOTICE:  page is deleted
 itemoffset │  ctid  │ itemlen │ nulls │ vars │  data   
────────────┼────────┼─────────┼───────┼──────┼──────────────────────────
  1 │ (0,9)  │  16 │ f │ f│ 16 0b 00 00 00 00 00 00
  2 │ (2,50) │  16 │ f │ f│ 70 00 00 00 00 00 00 00
  3 │ (2,52) │  16 │ f │ f│ 71 00 00 00 00 00 00 00
  4 │ (2,24) │  16 │ f │ f│ ae 00 00 00 00 00 00 00
  5 │ (2,25) │  16 │ f │ f│ af 00 00 00 00 00 00 00
  6 │ (2,51) │  16 │ f │ f│ 24 02 00 00 

Re: [HACKERS] Out-of-bounds write and incorrect detection of trigger file in pg_standby

2015-01-14 Thread Michael Paquier
On Thu, Jan 15, 2015 at 7:13 AM, Robert Haas robertmh...@gmail.com wrote:
 Instead of doing this:

 if (len  sizeof(buf))
 buf[len] = '\0';

 ...I would suggest making the size of the buffer one greater than the
 size of the read(), and then always nul-terminating the buffer.  It
 seems to me that would make the code easier to reason about.
How about the attached then? This way we still detect invalid values
the same way:
-   if ((len = read(fd, buf, sizeof(buf))) < 0)
+   if ((len = read(fd, buf, sizeof(buf) - 1)) < 0)
Regards,
-- 
Michael
diff --git a/contrib/pg_standby/pg_standby.c b/contrib/pg_standby/pg_standby.c
index d6b1692..2f9f2b4 100644
--- a/contrib/pg_standby/pg_standby.c
+++ b/contrib/pg_standby/pg_standby.c
@@ -418,7 +418,7 @@ CheckForExternalTrigger(void)
 		return;
 	}
 
-	if ((len = read(fd, buf, sizeof(buf))) < 0)
+	if ((len = read(fd, buf, sizeof(buf) - 1)) < 0)
 	{
 		fprintf(stderr, "WARNING: could not read \"%s\": %s\n",
 triggerPath, strerror(errno));

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS qual pushdown

2015-01-14 Thread Alvaro Herrera
Robert Haas wrote:
 On Wed, Jan 14, 2015 at 9:22 AM, Dean Rasheed dean.a.rash...@gmail.com 
 wrote:
  On 14 January 2015 at 13:29, Robert Haas robertmh...@gmail.com wrote:
  One thing they could still leak is the number of times they got
  called, and thus possibly the number of unseen rows.  Now if the
  expressions get constant-folded away that won't be an issue, but a
  clever user can probably avoid that.
 
  Right now, EXPLAIN ANALYSE can be used to tell you the number of
  unseen rows. Is that something that people are concerned about, and
  are there any plans to change it?
 
 Interesting question.  I don't know.

Wasn't this part of the covert channel discussion that took place way
before RLS was committed?  As I recall, it was argued that such covert
channels are acceptable as long as their bandwidth is low.

-- 
Álvaro Herrera                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Peter Geoghegan
On Wed, Jan 14, 2015 at 5:23 PM, Peter Geoghegan p...@heroku.com wrote:
 My immediate observation here is that blocks 2 and 9 have identical
 metadata (from their page opaque area), but partially non-matching
 data items (however, the number of items on each block is consistent
 and correct according to that metadata, as far as it goes). I think
 block 9 is supposed to have a right-link to block 5 (block 5 seems to
 think so, at least -- its left link is 9).

I am mistaken on one detail here - blocks 2 and 9 are actually fully
identical. I still have no idea why, though.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] initdb -S and tablespaces

2015-01-14 Thread Abhijit Menon-Sen
At 2015-01-14 11:59:08 +0100, and...@2ndquadrant.com wrote:

  +   if (ControlFile->state != DB_SHUTDOWNED &&
  +       ControlFile->state != DB_SHUTDOWNED_IN_RECOVERY)
  +       perform_fsync(data_directory);
  +
 
 a) Please think of a slightly more descriptive name than perform_fsync

OK. (I just copied the name initdb uses, because at the time I was still
thinking in terms of a later patch moving this to src/common.) What do
you think of fsync_recursively? fsync_pgdata?

I think fsync_recursively(data_directory) reads well.

 b) I think we should check here whether fsync is enabled.

OK, will do.

 c) I'm wondering if we should add fsync to the control file and also
perform an fsync if the last shutdown was clean, but fsync was
disabled.

Explain? "Add fsync to the control file" means store the value of the
fsync GUC setting in the control file? And would the fsync you mention
be dependent on the setting, or unconditional?

  +static void
  +pre_sync_fname(char *fname, bool isdir)
  +{
 
 this essentially already exists in the backend in parts. C.f.
 pg_flush_data.

OK, I missed that. I'll rework the patch to use it.

 Theoretically you should also invoke fsync on directories.

OK.
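
For reference, a minimal POSIX sketch of fsyncing a directory (not the
patch's code; assumes <fcntl.h> and <unistd.h>):

    static void
    fsync_dir(const char *dirpath)
    {
        int         fd = open(dirpath, O_RDONLY);

        if (fd < 0)
            return;             /* report or ignore, as appropriate */
        (void) fsync(fd);       /* some platforms reject fsync on a
                                 * directory fd; that can be ignored */
        close(fd);
    }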

  +* We need to name the parent of PGDATA.  get_parent_directory() isn't
  +* enough here, because it can result in an empty string.
  +*/
  +   snprintf(pdir, MAXPGPATH, "%s/..", pg_data);
  +   canonicalize_path(pdir);
 
 Hm. Why is this needed? Just syncing "." should work?

Not sure, will investigate.

Thanks.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS qual pushdown

2015-01-14 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
 Robert Haas wrote:
  On Wed, Jan 14, 2015 at 9:22 AM, Dean Rasheed dean.a.rash...@gmail.com 
  wrote:
   On 14 January 2015 at 13:29, Robert Haas robertmh...@gmail.com wrote:
   One thing they could still leak is the number of times they got
   called, and thus possibly the number of unseen rows.  Now if the
   expressions get constant-folded away that won't be an issue, but a
   clever user can probably avoid that.
  
   Right now, EXPLAIN ANALYSE can be used to tell you the number of
   unseen rows. Is that something that people are concerned about, and
   are there any plans to change it?
  
  Interesting question.  I don't know.
 
 Wasn't this part of the covert channel discussion that took place way
 before RLS was committed?  As I recall, it was argued that such covert
 channels are acceptable as long as their bandwidth is low.

Yes, it was part of the discussion and no, there's no plans to try and
hide row counts in explain analyze, nor to deal with things like unique
constraint or foreign key reference violations.

There are other areas which need improvement which will help address
covert channel activity (better auditing, better control over what
actions are allowed to different users when it comes to creating
objects, modifying permissions and policies, throttling, etc).

Thanks,

Stephen




Re: [HACKERS] WITH CHECK and Column-Level Privileges

2015-01-14 Thread Stephen Frost
Noah,

* Noah Misch (n...@leadboat.com) wrote:
 On Mon, Jan 12, 2015 at 05:16:40PM -0500, Stephen Frost wrote:
  Alright, here's an updated patch which doesn't return any detail if no
  values are visible or if only a partial key is visible.
 
 I browsed this patch.  There's been no mention of foreign key constraints, but
 ri_ReportViolation() deserves similar hardening.  If a user has only DELETE
 privilege on a PK table, FK violation messages should not leak the PK values.
 Modifications to the foreign side are less concerning, since the user will
 often know the attempted value; still, I would lock down both sides.

Ah, yes, good point.

 Please add a comment explaining the safety of each row-emitting message you
 haven't changed.  For example, the one in refresh_by_match_merge() is safe
 because getting there requires MV ownership.

Sure.

 Instead of duplicating an entire ereport() to change whether you include an
 errdetail, use condition ? errdetail(...) : 0.

Yeah, that's a bit nicer, will do.
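
For reference, the pattern would look something like this (the message,
errcode, and variables here are hypothetical):

    ereport(ERROR,
            (errcode(ERRCODE_WITH_CHECK_OPTION_VIOLATION),
             errmsg("new row violates WITH CHECK OPTION for \"%s\"",
                    view_name),
             values_visible ?
             errdetail("Failing row contains %s.", row_description) : 0));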

Thanks!

Stephen




Re: [HACKERS] pg_rewind in contrib

2015-01-14 Thread Michael Paquier
On Wed, Jan 14, 2015 at 6:43 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 Here is a new version. There are now a fair amount of code changes compared
 to the version on github, like the logging and progress information, and a
 lot of comment changes. I also improved the documentation.

Perhaps this could be a direct link to the page of pg_rewind, with <xref
linkend="app-pgrewind">?
-possibly new, system. Once complete, the primary and standby can be
+possibly new, system. The <application>pg_rewind</> utility can be
+used to speed up this process on large clusters.
+Once complete, the primary and standby can be

Missing a dot at the end of the last phrase of this paragraph:
+<para>
+  This option specifies the target data directory that is synchronized
+  with the source. The target server must shut down cleanly before
+  running <application>pg_rewind</application>
+</para>

Compilation of pg_rewind on MSVC is not done. It is still listed as a
contrib/ module in Mkvcbuild.pm.

The compilation of pg_xlogdump fails because inclusion of
dbcommands_xlog.h is missing in rmgrdesc.c (patch attached).

src/bin/Makefile has not been updated to compile pg_rewind.
-- 
Michael
diff --git a/contrib/pg_xlogdump/rmgrdesc.c b/contrib/pg_xlogdump/rmgrdesc.c
index 180818d..deb1b8f 100644
--- a/contrib/pg_xlogdump/rmgrdesc.c
+++ b/contrib/pg_xlogdump/rmgrdesc.c
@@ -23,6 +23,7 @@
 #include "access/xlog_internal.h"
 #include "catalog/storage_xlog.h"
 #include "commands/dbcommands.h"
+#include "commands/dbcommands_xlog.h"
 #include "commands/sequence.h"
 #include "commands/tablespace.h"
 #include "rmgrdesc.h"

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Partitioning: issues/ideas (Was: Re: [HACKERS] On partitioning)

2015-01-14 Thread Amit Langote
On 06-01-2015 PM 03:40, Amit Langote wrote:
 
 I agree that while we are discussing these points, we could also be
 discussing how we solve problems of existing partitioning implementation
 using whatever the above things end up being. Proposed approaches to
 solve those problems might be useful to drive the first step as well or
 perhaps that's how it should be done anyway.
 

I realize the discussion has not quite brought us to *conclusions* so
far, though surely there has been valuable input from people. Anyway,
starting a new thread with a summary of what has been discussed (please
note that the order of listing the points does not necessarily connote
the priority):

* It has been repeatedly pointed out that we may want to decouple
partitioning from inheritance because implementing partitioning as an
extension of inheritance mechanism means that we have to keep all the
existing semantics which might limit what we want to do with the special
case of it which is partitioning; in other words, we would find
ourselves in difficult position where we have to inject a special case
code into a very generalized mechanism that is inheritance.
Specifically, do we regard partitions as pg_inherits children of their
partitioning parent?

* Syntax: do we want to make it similar to one of the many other
databases out there? Or we could invent our own? I like the syntax that
Robert suggested that covers the cases of RANGE and LIST partitioning
without actually having to use those keywords explicitly; something like
the following:

CREATE TABLE parent PARTITION ON (column [ USING opclass ] [, ... ]);

CREATE TABLE child PARTITION OF parent_name
   FOR VALUES { (value, ...) [ TO (value, ...) ] }

So instead of making a hard distinction between range and list
partitioning, you can say:

CREATE TABLE child_name PARTITION OF parent_name FOR VALUES (3, 5, 7);

wherein, child is effectively a LIST partition

CREATE TABLE child PARTITION OF parent_name FOR VALUES (8) TO (12);

wherein, child is effectively a RANGE partition on one column

CREATE TABLE child PARTITION OF parent_name FOR VALUES(20, 120) TO (30,
130);

wherein, child is effectively a RANGE partition on two columns

I wonder if we could add a clause like DISTRIBUTED BY to complement
PARTITION ON that represents a hash distributed/partitioned table (that
could be a syntax to support sharded tables maybe; we would definitely
want to move ahead in that direction I guess)

* Catalog: We would like to have a catalog structure suitable to
implement capabilities like multi-column partitioning, sub-partitioning
(with arbitrary number of levels in the hierarchy). I had suggested
that we create two new catalogs viz. pg_partitioned_rel,
pg_partition_def to store metadata about a partition key of a
partitioned relation and partition bound info of a partition,
respectively. Also, see the point about on-disk representation of
partition bounds

* It is desirable to treat partitions as pg_class relations with perhaps
a new relkind(s). We may want to choose an implementation where only the
lowest level relations in a partitioning hierarchy have storage; those
at the upper layers are mere placeholder relations though of course with
associated constraints determined by partitioning criteria (with
appropriate metadata entered into the additional catalogs). I am not
quite sure if the various kinds of relations involved in the partitioning
scheme have separate namespaces and, if they do, how we implement that

* In the initial implementation, we could just live with partitioning on
a set of columns (and not arbitrary expressions of them)

* We perhaps do not need multi-column LIST partitions as they are not
very widely used and may complicate the implementation

* There are a number of suggestions about how we represent partition
bounds (on-disk) - pg_node_tree, RECORD (a composite type or the rowtype
associated with the relation itself), etc. Important point to consider
here may be that partition key may contain more than one column

* How we represent partition definition in memory (for a given
partitioned relation) - important point to remember is that such a
representation should be efficient to iterate through or
binary-searchable. Also see the points about tuple-routing and
partition-pruning

* Overflow/catchall partition: it seems we do not want/need them. It
might seem desirable for example in cases where a big transaction enters
a large number of tuples, all but one of which find a defined partition;
we may not want to error out in such a case but instead enter that
erring tuple into the overflow partition. If we choose to implement
that, we would like to also implement the capability to move the tuples
into the appropriate partition once it's defined. Related is the notion
of automatically creating partitions if one is not already defined for a
just entered tuple; but there may be locking troubles if many concurrent
sessions try to do that

* Tuple-routing: based on the internal 

Re: [HACKERS] Merging postgresql.conf and postgresql.auto.conf

2015-01-14 Thread Amit Kapila
On Wed, Jan 14, 2015 at 9:01 PM, Sawada Masahiko sawada.m...@gmail.com
wrote:

 Hi all,

 postgresql.auto.conf is loaded after postgresql.conf whenever the
 configuration files are loaded or reloaded.
 This means that a parameter in postgresql.auto.conf has quite high
 priority, so the same parameter in postgresql.conf does not work at all
 even if the user sets it manually.

 If the user wants to change settings of a stopped PostgreSQL server,
 then the user needs to merge the two configuration files
 (postgresql.conf and postgresql.auto.conf) while maintaining
 consistency manually.


I think one way to currently do it is with the help of the Alter System ..
Reset command.  Basically the user can check via pg_settings whether the
variable belongs to postgresql.auto.conf; if he already knows a duplicate
copy of the same is available in postgresql.conf and that is the value he
wants to use, then the RESET command could be used to remove the setting
from postgresql.auto.conf.

 From an operational perspective, having a written config with duplicate
 entries is not a good thing.
 I think we need to merge the two configuration files into one (or more
 than one, if something like 'include' is used).

 One solution is to add a merging tool/command which merges the two
 configuration files while maintaining consistency.

If you want to think of some solution which can make the usage of Alter
System more convenient, then we should think only in terms of changing/
removing the value in postgresql.auto.conf, as that is the place where we
have added values programmatically.

One thought I have along this line is that currently there doesn't seem to
be a way to know if a setting has an entry both in postgresql.conf and
postgresql.auto.conf. If we can have some way of knowing the same
(pg_settings?), then it could be convenient for the user to decide if the
value in postgresql.auto.conf is useful or not, and if it's not useful, to
use the Alter System .. Reset command to remove the same from
postgresql.auto.conf.
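
For illustration, the workflow might look like this (assuming the Alter
System .. Reset support discussed here; work_mem is just an example):

    -- pg_settings shows which file currently supplies the value
    SELECT name, setting, sourcefile
    FROM pg_settings
    WHERE name = 'work_mem';

    -- if it comes from postgresql.auto.conf and the postgresql.conf
    -- copy is the desired one, drop the auto.conf entry and reload
    ALTER SYSTEM RESET work_mem;
    SELECT pg_reload_conf();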


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Safe memory allocation functions

2015-01-14 Thread Alvaro Herrera
Robert Haas wrote:
 On Tue, Jan 13, 2015 at 10:10 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  However, there is a larger practical problem with this whole concept,
  which is that experience should teach us to be very wary of the assumption
  that asking for memory the system can't give us will just lead to nice
  neat malloc-returns-NULL behavior.  Any small perusal of the mailing list
  archives will remind you that very often the end result will be SIGSEGV,
  OOM kills, unrecoverable trap-on-write when the kernel realizes it can't
  honor a copy-on-write promise, yadda yadda.  Agreed that it's arguable
  that these only occur in misconfigured systems ... but misconfiguration
  appears to be the default in a depressingly large fraction of systems.
  (This is another reason for _safe not being the mot juste :-()
 
 I don't really buy this.  It's pretty incredible to think that after a
 malloc() failure there is absolutely no hope of carrying on sanely.
 If that were true, we wouldn't be able to ereport() out-of-memory
 errors at any severity less than FATAL, but of course it doesn't work
 that way.  Moreover, AllocSetAlloc() contains malloc() and, if that
 fails, calls malloc() again with a smaller value, without even
 throwing an error.

I understood Tom's point differently: instead of malloc() failing,
malloc() will return a supposedly usable pointer, but later usage of it
will lead to a crash of some sort.  We know this does happen in reality,
because people do report it; but we also know how to fix it.  And for
systems that have been correctly set up, the new behavior (using some
plan B for when malloc actually fails instead of spuriously succeeding
only to cause a later crash) will be much more convenient.
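
For reference, the AllocSetAlloc() fallback Robert mentions is roughly
this shape (a simplified sketch, not the actual aset.c code):

    static void *
    alloc_block_with_fallback(size_t blksize, size_t required_size)
    {
        void       *block = malloc(blksize);

        /* on failure, retry with successively smaller requests */
        while (block == NULL && blksize > 1024 * 1024)
        {
            blksize >>= 1;
            if (blksize < required_size)
                break;          /* can't shrink below what's needed */
            block = malloc(blksize);
        }
        return block;           /* may still be NULL */
    }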

-- 
Álvaro Herrera                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] orangutan seizes up during isolation-check

2015-01-14 Thread Noah Misch
On Wed, Jan 14, 2015 at 04:48:53PM -0500, Peter Eisentraut wrote:
 What I'm seeing now is that the unaccent regression tests when run under
 make check-world abort with
 
 FATAL:  postmaster became multithreaded during startup
 HINT:  Set the LC_ALL environment variable to a valid locale.

contrib/unaccent/Makefile sets NO_LOCALE=1, so that makes sense.  I expect the
patch over here will fix it:
http://www.postgresql.org/message-id/20150109063015.ga2491...@tornado.leadboat.com

 The behavior is also a bit strange.  The step
 
============== starting postmaster                ==============
 
 hangs for one minute before failing.  This probably has nothing to do
 with your change, but maybe pg_regress could detect postmaster startup
 failures better.

Yeah, I think any postmaster startup failure has that effect.  Not ideal.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Peter Geoghegan
On Wed, Jan 14, 2015 at 5:23 PM, Peter Geoghegan p...@heroku.com wrote:
 My immediate observation here is that blocks 2 and 9 have identical
 metadata (from their page opaque area), but partially non-matching
 data items (however, the number of items on each block is consistent
 and correct according to that metadata, as far as it goes). I think
 block 9 is supposed to have a right-link to block 5 (block 5 seems to
 think so, at least -- its left link is 9).

For the convenience of others, here is a version with a normalized
data column added (interpreting the datums as little-endian Oids).

-- 
Peter Geoghegan
page,itemoffset,ctid,itemlen,nulls,vars,data,normalized data
1,1,(0,9),16,f,f,16 0b 00 00 00 00 00 00,2838
1,2,(2,50),16,f,f,70 00 00 00 00 00 00 00,112
1,3,(2,52),16,f,f,71 00 00 00 00 00 00 00,113
1,4,(2,24),16,f,f,ae 00 00 00 00 00 00 00,174
1,5,(2,25),16,f,f,af 00 00 00 00 00 00 00,175
1,6,(2,51),16,f,f,24 02 00 00 00 00 00 00,548
1,7,(2,53),16,f,f,25 02 00 00 00 00 00 00,549
1,8,(4,66),16,f,f,3a 03 00 00 00 00 00 00,826
1,9,(2,55),16,f,f,3b 03 00 00 00 00 00 00,827
1,10,(2,56),16,f,f,3c 03 00 00 00 00 00 00,828
1,11,(4,15),16,f,f,70 04 00 00 00 00 00 00,1136
1,12,(1,31),16,f,f,71 04 00 00 00 00 00 00,1137
1,13,(4,14),16,f,f,bd 04 00 00 00 00 00 00,1213
1,14,(4,17),16,f,f,be 04 00 00 00 00 00 00,1214
1,15,(1,34),16,f,f,d0 04 00 00 00 00 00 00,1232
1,16,(1,35),16,f,f,d1 04 00 00 00 00 00 00,1233
1,17,(0,2),16,f,f,df 04 00 00 00 00 00 00,1247
1,18,(3,10),16,f,f,e1 04 00 00 00 00 00 00,1249
1,19,(3,11),16,f,f,e7 04 00 00 00 00 00 00,1255
1,20,(7,6),16,f,f,eb 04 00 00 00 00 00 00,1259
1,21,(2,23),16,f,f,ec 04 00 00 00 00 00 00,1260
1,22,(4,16),16,f,f,ed 04 00 00 00 00 00 00,1261
1,23,(4,12),16,f,f,ee 04 00 00 00 00 00 00,1262
1,24,(4,64),16,f,f,89 05 00 00 00 00 00 00,1417
1,25,(3,31),16,f,f,8a 05 00 00 00 00 00 00,1418
1,26,(1,48),16,f,f,8b 08 00 00 00 00 00 00,2187
1,27,(4,63),16,f,f,18 09 00 00 00 00 00 00,2328
1,28,(0,64),16,f,f,20 09 00 00 00 00 00 00,2336
1,29,(0,65),16,f,f,21 09 00 00 00 00 00 00,2337
1,30,(4,18),16,f,f,5c 09 00 00 00 00 00 00,2396
1,31,(1,11),16,f,f,5d 09 00 00 00 00 00 00,2397
1,32,(4,40),16,f,f,28 0a 00 00 00 00 00 00,2600
1,33,(3,47),16,f,f,29 0a 00 00 00 00 00 00,2601
1,34,(3,48),16,f,f,2a 0a 00 00 00 00 00 00,2602
1,35,(3,49),16,f,f,2b 0a 00 00 00 00 00 00,2603
1,36,(3,41),16,f,f,2c 0a 00 00 00 00 00 00,2604
1,37,(4,45),16,f,f,2d 0a 00 00 00 00 00 00,2605
1,38,(3,42),16,f,f,2e 0a 00 00 00 00 00 00,2606
1,39,(4,48),16,f,f,2f 0a 00 00 00 00 00 00,2607
1,40,(4,49),16,f,f,30 0a 00 00 00 00 00 00,2608
1,41,(4,44),16,f,f,31 0a 00 00 00 00 00 00,2609
1,42,(3,43),16,f,f,32 0a 00 00 00 00 00 00,2610
1,43,(7,26),16,f,f,33 0a 00 00 00 00 00 00,2611
1,44,(3,50),16,f,f,34 0a 00 00 00 00 00 00,2612
1,45,(4,71),16,f,f,35 0a 00 00 00 00 00 00,2613
1,46,(4,47),16,f,f,37 0a 00 00 00 00 00 00,2615
1,47,(3,46),16,f,f,38 0a 00 00 00 00 00 00,2616
1,48,(3,44),16,f,f,39 0a 00 00 00 00 00 00,2617
1,49,(4,41),16,f,f,3a 0a 00 00 00 00 00 00,2618
1,50,(0,1),16,f,f,3b 0a 00 00 00 00 00 00,2619
1,51,(4,42),16,f,f,3c 0a 00 00 00 00 00 00,2620
1,52,(0,61),16,f,f,5a 0a 00 00 00 00 00 00,2650
1,53,(0,54),16,f,f,5b 0a 00 00 00 00 00 00,2651
1,54,(0,55),16,f,f,5c 0a 00 00 00 00 00 00,2652
1,55,(0,56),16,f,f,5d 0a 00 00 00 00 00 00,2653
1,56,(0,57),16,f,f,5e 0a 00 00 00 00 00 00,2654
1,57,(0,59),16,f,f,5f 0a 00 00 00 00 00 00,2655
1,58,(0,50),16,f,f,60 0a 00 00 00 00 00 00,2656
1,59,(7,14),16,f,f,61 0a 00 00 00 00 00 00,2657
1,60,(0,42),16,f,f,62 0a 00 00 00 00 00 00,2658
1,61,(0,43),16,f,f,63 0a 00 00 00 00 00 00,2659
1,62,(0,68),16,f,f,64 0a 00 00 00 00 00 00,2660
1,63,(0,69),16,f,f,65 0a 00 00 00 00 00 00,2661
1,64,(7,3),16,f,f,66 0a 00 00 00 00 00 00,2662
1,65,(7,4),16,f,f,67 0a 00 00 00 00 00 00,2663
1,66,(0,53),16,f,f,68 0a 00 00 00 00 00 00,2664
1,67,(7,22),16,f,f,69 0a 00 00 00 00 00 00,2665
1,68,(7,23),16,f,f,6a 0a 00 00 00 00 00 00,2666
1,69,(7,24),16,f,f,6b 0a 00 00 00 00 00 00,2667
1,70,(1,72),16,f,f,6c 0a 00 00 00 00 00 00,2668
1,71,(1,73),16,f,f,6d 0a 00 00 00 00 00 00,2669
1,72,(1,74),16,f,f,6e 0a 00 00 00 00 00 00,2670
1,73,(1,6),16,f,f,6f 0a 00 00 00 00 00 00,2671
1,74,(1,7),16,f,f,70 0a 00 00 00 00 00 00,2672
1,75,(1,75),16,f,f,71 0a 00 00 00 00 00 00,2673
1,76,(1,76),16,f,f,72 0a 00 00 00 00 00 00,2674
1,77,(1,67),16,f,f,73 0a 00 00 00 00 00 00,2675
1,78,(0,40),16,f,f,74 0a 00 00 00 00 00 00,2676
1,79,(0,41),16,f,f,75 0a 00 00 00 00 00 00,2677
1,80,(1,50),16,f,f,76 0a 00 00 00 00 00 00,2678
1,81,(1,51),16,f,f,77 0a 00 00 00 00 00 00,2679
1,82,(1,49),16,f,f,78 0a 00 00 00 00 00 00,2680
1,83,(1,58),16,f,f,79 0a 00 00 00 00 00 00,2681
1,84,(1,59),16,f,f,7a 0a 00 00 00 00 00 00,2682
1,85,(1,89),16,f,f,7b 0a 00 00 00 00 00 00,2683
1,86,(1,70),16,f,f,7c 0a 00 00 00 00 00 00,2684
1,87,(1,71),16,f,f,7d 0a 00 00 00 00 00 00,2685
1,88,(1,56),16,f,f,7e 0a 00 00 00 00 00 00,2686
1,89,(1,57),16,f,f,7f 0a 00 00 00 00 00 00,2687
1,90,(1,52),16,f,f,80 0a 00 00 00 00 00 00,2688
1,91,(1,53),16,f,f,81 0a 

Re: [HACKERS] Overhauling our interrupt handling

2015-01-14 Thread Kyotaro HORIGUCHI
Hello, I've synced up with this at last.

I think I should finalize my commitfest item for this issue, with
.. Rejected?

 This mail is an answer to
 http://archives.postgresql.org/message-id/20150110022542.GF12509%40alap3.anarazel.de
 but I decided that it's a good idea to move it into a new thread since the
 patchseries has outgrown its original target. It's invasive and touches
 many arcane areas of the code, so I think more eyes than a long
 deep thread is likely to get are a good thing.
 
 As a short description of the goal of the patchseries:
 This started out as steps on the way to be able to safely handle
 terminate_backend() et al when we're reading/writing from the
 client. E.g. because the client is dead and we haven't noticed yet and
 are stuck writing, or because we want to implement a idle_in_transaction
 timeout.
 
 Making these changes allowed me to see that not doing significant work
 in signal handlers can make code much simpler and more robust. The
 number of bugs postgres had in the past that involved doing too much in
 signal handlers is significant.  Thus I've since extended the
 patchseries to get rid of nearly all nontrivial work in signal
 handlers.

It sounds pretty nice.

 All the patches in the series up to 0008 have commit messages providing
 more detail. A short description of each patch follows:
 
 0001: Replace walsender's latch with the general shared latch.
 
   New patch that removes ImmediateInterruptOK behaviour from walsender. I
   think that's a rather good idea, because walsender currently seems to
   assume WaitLatchOrSocket is reentrant - which I don't think is really
   guaranteed.
   Hasn't been reviewed yet, but I think it's not far from being
   committable.

Doesn't this patchset contain per-socket non-blocking
control for win32? It should make the code (above the win32
socket layer itself) simpler.

 0002: Use a nonblocking socket for FE/BE communication and block using
   latches.
 
   Has previously been reviewed by Heikki. I think Noah also had a
   look, although I'm not sure how close that was.
 
   I think this can be committed soon.

 0003: Introduce and use infrastructure for interrupt processing during client 
 reads.
 
   From here on ImmediateInterruptOK isn't set during client
   communication. Normal interrupts and sinval/async interrupts are
   processed outside of signal handlers. Especially the sinval/async
   greatly simplify the respective code.
 
   Has been reviewed by Heikki in an earlier version - I think I
   addressed all the review comments.
 
   I'd like somebody else to look at the sinval/async.c changes
   before committing. I really like the simplification this change
   allowed.
 
   I think this patch might not be safe without 0004 because I can't
   convince myself that it's safe to interrupt latch waits with work
   that actually might also use the same latch (sinval
   interrupts). But it's easier to review this way as 0004 is quite
   big.
 
 0004: Process 'die' interrupts while reading/writing from the client socket.
 
   This is the reason Horiguchi-san started this thread.
 
   I think the important debate here is whether we think it's
   acceptable that there are situations (a full socket buffer, but an
   alive connection) where the client previously got an error, but
   not anymore afterwards. I think that's much better than having
   unkillable connections, but I can see others being of a different
   opinion.


This patch yields code that is a bit confusing, like the following.

| secure_raw_write(Port *port, const void *ptr, size_t len)
| {
..
| w = WaitLatchOrSocket(MyLatch,
| if (w & WL_LATCH_SET)
...
| errno = EINTR;
| else if (w & WL_SOCKET_WRITEABLE)
| goto wloop;
|
| errno = save_errno;

The errno set when WL_LATCH_SET is returned always vanishes.
Specifically, the EINTR set on SIGTERM (WL_LATCH_SET) is overwritten
by EAGAIN. As a result, pg_terminate_backend() cannot kill the
backend until the blocked write is released. The errno = save_errno
should be the else-branch alternative of the errno = EINTR line, and
I confirmed that this change leads to the (to me) desirable
behavior.
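
A sketch of that fix (simplified; surrounding code omitted):

    if (w & WL_LATCH_SET)
        errno = EINTR;          /* preserved: caller handles the interrupt */
    else if (w & WL_SOCKET_WRITEABLE)
        goto wloop;             /* writable again: retry the write */
    else
        errno = save_errno;     /* only otherwise restore the saved errno */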


 0005: Move deadlock and other interrupt handling in proc.c out of signal 
 handlers.
 
   I'm surprised how comparatively simple this turned out to be. For
   robustness and understanding I think this is a great improvement.
 
   New and not reviewed at all. Needs significant review. But works
   in my simple testcases.
 
 0006: Don't allow immediate interrupts during authentication anymore.
 
   So far we've set ImmediateInterruptOK to true during large parts
   of the client authentication - that's not all that pretty,
   interrupts might arrive while we're in some system routines.
 
   Due to patches 0003/0004 we now are able to safely serve

Re: [HACKERS] Parallel Seq Scan

2015-01-14 Thread Jim Nasby

On 1/13/15 9:42 PM, Amit Kapila wrote:

As an example one of the the strategy
could be if the table size is X MB and there are 8 workers, then
divide the work as X/8 MB for each worker (which I have currently
used in patch) and another could be each worker does scan
1 block at a time and then check some global structure to see which
next block it needs to scan, according to me this could lead to random
scan.  I have read that some other databases also divide the work
based on partitions or segments (size of segment is not very clear).


Long-term I think we'll want a mix between the two approaches. Simply doing 
something like blkno % num_workers is going to cause imbalances, but trying to 
do this on a per-block basis seems like too much overhead.

Also long-term, I think we also need to look at a more specialized version of 
parallelism at the IO layer. For example, during an index scan you'd really 
like to get IO requests for heap blocks started in the background while the 
backend is focused on the mechanics of the index scan itself. While this could 
be done with the stuff Robert has written I have to wonder if it'd be a lot 
more efficient to use fadvise or AIO. Or perhaps it would just be better to 
deal with an entire index page (remembering TIDs) and then hit the heap.

But I agree with Robert; there's a lot yet to be done just to get *any* kind of 
parallel execution working before we start thinking about how to optimize it.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Enforce creation of destination folders for source files in pg_regress (Was: pg_regress writes into source tree)

2015-01-14 Thread Michael Paquier
Hi all,

pg_regress will fail with test suites using only source files if the
destination folders do not exist in the code tree. This is annoying
because it forces maintaining empty folders sql/ and expected/ with
a .gitignore ignoring everything. The issue has been discussed here
(http://www.postgresql.org/message-id/54935d9d.7010...@dunslane.net)
but nobody actually sent a patch, so here is one, and a thread for
this discussion.
Regards,
-- 
Michael
*** a/src/test/regress/pg_regress.c
--- b/src/test/regress/pg_regress.c
***
*** 496,501  convert_sourcefiles_in(char *source_subdir, char *dest_dir, char *dest_subdir, c
--- 496,502 
  {
  	char		testtablespace[MAXPGPATH];
  	char		indir[MAXPGPATH];
+ 	char		result_dir[MAXPGPATH];
  	struct stat st;
  	int			ret;
  	char	  **name;
***
*** 520,525  convert_sourcefiles_in(char *source_subdir, char *dest_dir, char *dest_subdir, c
--- 521,534 
  		/* Error logged in pgfnames */
  		exit(2);
  
+ 	/*
+ 	 * Enforce creation of destination directory if it does not exist yet.
+ 	 * This is useful for tests using only source files.
+ 	 */
+ 	snprintf(result_dir, MAXPGPATH, "%s/%s", dest_dir, dest_subdir);
+ 	if (!directory_exists(result_dir))
+ 		make_directory(result_dir);
+ 
  	snprintf(testtablespace, MAXPGPATH, "%s/testtablespace", outputdir);
  
  #ifdef WIN32
***
*** 565,571  convert_sourcefiles_in(char *source_subdir, char *dest_dir, char *dest_subdir, c
  		/* build the full actual paths to open */
  		snprintf(prefix, strlen(*name) - 6, "%s", *name);
  		snprintf(srcfile, MAXPGPATH, "%s/%s", indir, *name);
! 		snprintf(destfile, MAXPGPATH, "%s/%s/%s.%s", dest_dir, dest_subdir,
   prefix, suffix);
  
  		infile = fopen(srcfile, "r");
--- 574,580 
  		/* build the full actual paths to open */
  		snprintf(prefix, strlen(*name) - 6, "%s", *name);
  		snprintf(srcfile, MAXPGPATH, "%s/%s", indir, *name);
! 		snprintf(destfile, MAXPGPATH, "%s/%s.%s", result_dir,
   prefix, suffix);
  
  		infile = fopen(srcfile, "r");

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Define and document a guaranteed ordering for RETURNING?

2015-01-14 Thread Craig Ringer
Hi all

It's recently occurred to me that many people rely on the RETURNING clause
for multi-valued INSERTs, INSERT INTO ... SELECT, etc, returning rows in
the same order they were supplied as input into the INSERT. I often see
lock-step iteration over the inputs to an INSERT and the RETURNING
result-tuples used to map generated sequence values to the corresponding
input.

I can't find anything relevant in the docs about the order of results in
RETURNING, and it strikes me that this should probably be codified one way
or the other.

The docs simply read:

 The optional RETURNING clause causes INSERT to compute and return
value(s) based on each row actually inserted. This is primarily useful for
obtaining values that were supplied by defaults, such as a serial sequence
number. However, any expression using the table's columns is allowed. The
syntax of the RETURNING list is identical to that of the output list of
SELECT.

Looking at the code I don't see any reason why it wouldn't currently be
safe to assume that

INSERT ... VALUES (...), (...), ... RETURNING ...

and

INSERT ... SELECT ... ORDER BY ... RETURNING ...

always return rows in the exact order of their input. But that may not
always be true and if it's not documented as safe, it's not necessarily
wise to rely on it.

If you can't rely on RETURNING order you have to instead identify a
candidate key on the input and use a wider RETURNING result that includes
that candidate key as well as whatever else you want returned. Then map the
results onto the inputs. No fun at all.
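
A sketch of that workaround (the table and columns here are made up):

    WITH input (k, val) AS (
        VALUES (1, 'a'), (2, 'b')
    ),
    ins AS (
        INSERT INTO t (k, val)
        SELECT k, val FROM input
        RETURNING k, id            -- id is a serial column
    )
    SELECT input.k, input.val, ins.id
    FROM input
    JOIN ins USING (k);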

So - I propose to document that INSERT ... RETURNING guarantees that the
result-tuples will be in the same order as the input to the INSERT ...
RETURNING statement. So if that order is well-defined, as in the case of
multi-VALUES clauses or SELECT ... ORDER BY ..., the result order is also
well defined.

(An alternative would be adding RETURNING ... WITH ORDINALITY. Which seems
pretty horrible).

Thoughts/comments? Barring objections I'll write up a docs patch for the
docs list.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Parallel Seq Scan

2015-01-14 Thread Amit Kapila
On Wed, Jan 14, 2015 at 9:30 AM, Ashutosh Bapat 
ashutosh.ba...@enterprisedb.com wrote:

 On Wed, Jan 14, 2015 at 9:12 AM, Amit Kapila amit.kapil...@gmail.com
wrote:

 Here we have to decide what should be the strategy and how much
 each worker should scan.  As an example one of the the strategy
 could be if the table size is X MB and there are 8 workers, then
 divide the work as X/8 MB for each worker (which I have currently
 used in patch) and another could be each worker does scan
 1 block at a time and then check some global structure to see which
 next block it needs to scan, according to me this could lead to random
 scan.  I have read that some other databases also divide the work
 based on partitions or segments (size of segment is not very clear).


 A block can contain useful tuples, i.e. tuples which are visible and
fulfil the quals, plus useless tuples, i.e. tuples which are dead, invisible,
or that do not fulfil the quals. Depending upon the contents of these blocks,
esp. the ratio of (useful tuples)/(useless tuples), even though we divide
the relation into equal sized runs, each worker may take a different time.
So, instead of dividing the relation into number of runs = number of
workers, it might be better to divide them into fixed sized runs with size
< (total number of blocks / number of workers), and let a worker pick up a
run after it finishes with the previous one. The smaller the size of the
runs the better the load balancing, but the higher the cost of starting on
a run. So, we have to strike a balance.


I think your suggestion is good and it somewhat falls in line
with what Robert has suggested, but instead of block-by-block,
you seem to be suggesting doing it in chunks (where the chunk size
is not clear). However, the one point against this is that such a
strategy for a parallel sequential scan could lead to random scans
which can hurt the operation badly.  Nonetheless, I will think more
on these lines of making work distribution dynamic so that we can
ensure that all workers can be kept busy.
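
For concreteness, a hypothetical sketch of such a dynamic, chunk-wise
distribution (not the patch's code; the shared-memory setup and the use
of port/atomics.h are assumptions):

    typedef struct ParallelScanShared
    {
        pg_atomic_uint64 next_run;  /* next unclaimed run number */
        uint64      nblocks;        /* blocks in the relation */
        uint64      run_size;       /* blocks per run */
    } ParallelScanShared;

    /* claim the next run of blocks; false when the relation is done */
    static bool
    claim_next_run(ParallelScanShared *shared, uint64 *start, uint64 *end)
    {
        uint64      run = pg_atomic_fetch_add_u64(&shared->next_run, 1);

        *start = run * shared->run_size;
        if (*start >= shared->nblocks)
            return false;
        *end = Min(*start + shared->run_size, shared->nblocks);
        return true;
    }

Each run is still read sequentially by one worker, so per-run reads stay
mostly sequential while slower workers simply claim fewer runs.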


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Minor configure tweak to simplify adjusting gcc warnings

2015-01-14 Thread Andres Freund
On 2015-01-14 10:01:39 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2015-01-14 09:34:23 -0500, Tom Lane wrote:
  Well, that would only fix my problem if we added a configure-time test
  for whether gcc recognizes z, which frankly seems like a waste of
  cycles.  I've probably got the last one left in captivity that doesn't.
 
  Hm. I had kinda assumed that %z support for sprintf and gcc's
  recognition of the format string would coincide and we could just use
  the %z result. But gull's output doesn't actually work that way.
 
 It's only reasonable to assume that gcc matches sprintf if gcc is the
 native (vendor-supplied) compiler for the platform.  That's not the
 case on gaur.  It used to be very very commonly not the case, though
 I think a lot of vendors have now abandoned their proprietary compilers.
 If we were to test for this at all, I think we'd need to make the test
 separate from the one for sprintf's behavior.

I've already given up... Given how infrequent it is, suppressing it for
gull seems sufficient.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 On Wed, Jan 14, 2015 at 8:41 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 What are the autovac processes doing (according to pg_stat_activity)?

 pid,running,waiting,query
 7105,00:28:40.789221,f,autovacuum: VACUUM ANALYZE pg_catalog.pg_class

Hah, I suspected as much.  Is that the one that's stuck in
LockBufferForCleanup, or the other one that's got a similar
backtrace to all the user processes?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 There were seven processes with that exact backtrace (except
 that randomly they are sleeping in the spinloop).   Something else
 interesting: autovacuum has been running all night as well.  Unlike
 the other processes, however, cpu utilization does not register on top.
 I backtraced them as well.  They are not identical.  One of them looks
 like this:

What are the autovac processes doing (according to pg_stat_activity)?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Merlin Moncure
On Wed, Jan 14, 2015 at 9:05 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 On Wed, Jan 14, 2015 at 8:41 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 What are the autovac processes doing (according to pg_stat_activity)?

 pid,running,waiting,query
 7105,00:28:40.789221,f,autovacuum: VACUUM ANALYZE pg_catalog.pg_class

 Hah, I suspected as much.  Is that the one that's stuck in
 LockBufferForCleanup, or the other one that's got a similar
 backtrace to all the user processes?

Yes, it is the pg_class one that is coming from LockBufferForCleanup().  As
you can see above, it has a shorter runtime.   So it was killed off once
about a half hour ago, which did not free up the logjam.  However, AV
spawned it again and now it does not respond to cancel.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Andres Freund
On 2015-01-14 09:22:45 -0600, Merlin Moncure wrote:
 On Wed, Jan 14, 2015 at 9:11 AM, Andres Freund and...@2ndquadrant.com wrote:
  On 2015-01-14 10:05:01 -0500, Tom Lane wrote:
  Merlin Moncure mmonc...@gmail.com writes:
   On Wed, Jan 14, 2015 at 8:41 AM, Tom Lane t...@sss.pgh.pa.us wrote:
   What are the autovac processes doing (according to pg_stat_activity)?
 
   pid,running,waiting,query
   7105,00:28:40.789221,f,autovacuum: VACUUM ANALYZE pg_catalog.pg_class
 
  It'd be interesting to know whether that vacuum gets very frequent
  semaphore wakeups. Could you strace it for a second or three?
 
 for 30 seconds+ it just looks like this:
 mmoncure@mernix2 ~ $ sudo strace -p 7105
 Process 7105 attached
 semop(5701638, {{4, -1, 0}}, 1

Ok. So that explains why it's not interruptible.

 all of the other processes are yielding out of the spinlock, for example:
 select(0, NULL, NULL, NULL, {0, 1408})  = 0 (Timeout)

Note the above isn't the spinlock, it's the process's semaphore. It'll
only get set if the refcount ever indicates that nobody but autovac is
holding the lock.

  How did this perform < 9.4?
 this is a new project. However, I can run it vs an earlier version.
 
 Can you guess how many times these dynamic
  statements are planned? How many different relations are accessed in the
  dynamically planned queries?
 
 only once or twice, and only a couple of tables.

Hm. Odd. The first -g profile seemed to indicate a hell of a lot of time
was spent in LWLockRelease() - indicating that there's actually
progress. Later profiles/backtraces were less clear.

If you gdb in, and type 'fin' a couple times, to wait till the function
finishes, is there actually any progress? I'm wondering whether it's
just many catalog accesses + contention, or some other
problem. Alternatively set a breakpoint on ScanPgRelation() or so and
see how often it's hit.

 I can send the code off-list if you guys think it'd help.

Might be interesting.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Merlin Moncure
On Wed, Jan 14, 2015 at 8:41 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
  There were seven processes with that exact backtrace (except
  that randomly they are sleeping in the spinloop).   Something else
  interesting: autovacuum has been running all night as well.  Unlike
  the other processes, however, cpu utilization does not register on top.
 I backtraced them as well.  They are not identical.  One of them looks
 like this:

 What are the autovac processes doing (according to pg_stat_activity)?

pid,running,waiting,query
27885,15:40:46.904039,f,autovacuum: VACUUM ANALYZE onesitepmc.propertyguestcard
7209,00:24:44.162684,f,SELECT n.nspname as "Schema",
  c.relname as "Name",
  CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'm'
THEN 'materialized view' WHEN 'i' THEN 'index' WHEN 'S' THEN
'sequence' WHEN 's' THEN 'special' WHEN 'f' THEN 'foreign table' END
as "Type",
  pg_catalog.pg_get_userbyid(c.relowner) as "Owner"
FROM pg_catalog.pg_class c
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('r','v','m','S','f','')
  AND n.nspname <> 'pg_catalog'
  AND n.nspname <> 'information_schema'
  AND n.nspname !~ '^pg_toast'
  AND pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 1,2;
7105,00:28:40.789221,f,autovacuum: VACUUM ANALYZE pg_catalog.pg_class
31282,15:28:29.070942,f,SELECT CDSReconcileRunTable(867)
31792,15:27:09.483365,f,SELECT CDSReconcileRunTable(874)
7135,00:26:08.977953,t,vacuum VERBOSE pg_class;
32725,15:24:46.008341,f,SELECT CDSReconcileRunTable(884)
32367,15:25:44.4206,f,SELECT CDSReconcileRunTable(881)
32492,15:25:31.27934,f,SELECT CDSReconcileRunTable(883)
31899,15:26:57.246415,f,SELECT CDSReconcileRunTable(875)
32368,15:25:44.418445,f,SELECT CDSReconcileRunTable(880)
7270,00:21:42.274104,f,SELECT n.nspname as "Schema",
  c.relname as "Name",
  CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'm'
THEN 'materialized view' WHEN 'i' THEN 'index' WHEN 'S' THEN
'sequence' WHEN 's' THEN 'special' WHEN 'f' THEN 'foreign table' END
as "Type",
  pg_catalog.pg_get_userbyid(c.relowner) as "Owner"
FROM pg_catalog.pg_class c
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('r','v','m','S','f','')
  AND n.nspname <> 'pg_catalog'
  AND n.nspname <> 'information_schema'
  AND n.nspname !~ '^pg_toast'
  AND pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 1,2;
7403,00:00:00,f,COPY  ( select pid, now ( ) - query_start as running,
waiting, query from pg_stat_activity ) TO STDOUT csv header ;

notes:
*) SELECT CDSReconcileRunTable(xxx) are the replication processes.
they are burning cpu.
*) SELECT n.nspname ... is via psql \d.  They stick when invoked and
do not respond to cancel. They are also burning cpu.
*) the autovacuum processes do not respond to cancel. However, one of
them (tracing in from lazy_vacuum_index()) did respond earlier.
Now it doesn't (upon being respawned via AV), with the same
backtrace.  The blocked manual vacuum verbose did cancel.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Andres Freund
On 2015-01-14 10:05:01 -0500, Tom Lane wrote:
 Merlin Moncure mmonc...@gmail.com writes:
  On Wed, Jan 14, 2015 at 8:41 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  What are the autovac processes doing (according to pg_stat_activity)?
 
  pid,running,waiting,query
  7105,00:28:40.789221,f,autovacuum: VACUUM ANALYZE pg_catalog.pg_class

It'd be interesting to know whether that vacuum gets very frequent
semaphore wakeups. Could you strace it for a second or three?

How did this perform < 9.4?  Can you guess how many times these dynamic
statements are planned? How many different relations are accessed in the
dynamically planned queries?

 Hah, I suspected as much.  Is that the one that's stuck in
 LockBufferForCleanup, or the other one that's got a similar backtrace
 to all the user processes?

Do you have a theory? Right now it primarily looks like contention on a
single buffer due to the high number of dynamic statements, possibly
made worse by the signalling between normal pinners and vacuum waiting
for cleanup.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Andres Freund
On 2015-01-14 10:13:32 -0500, Tom Lane wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 Yes, it is pg_class that is coming from LockBufferForCleanup().  As you
 can see above, it has a shorter runtime.   So it was killed off once
 about a half hour ago, which did not free up the logjam.  However, AV
 spawned it again and now it does not respond to cancel.
 
 Interesting.  That seems like there might be two separate issues at
 play.  It's plausible that LockBufferForCleanup might be interfering
 with other attempts to scan the index, but then why wouldn't killing
 the AV have unstuck things?

LockBufferForCleanup() unfortunately isn't interruptible. I've every now
and then seen vacuums being stuck for a long while, trying to acquire
cleanup locks - IIRC I complained about that on list even.  So autovac
will only be cancelled when going to the next page.  And even if the
page ever gets a zero (well, one) refcount, by the time autovac is woken
up via the semaphore, it'll often end up being used again.
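
[For reference, the wait in question looks roughly like this condensed
sketch of LockBufferForCleanup() from
src/backend/storage/buffer/bufmgr.c; a simplification of the 9.4-era
source, not the verbatim code:

    for (;;)
    {
        LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
        bufHdr = &BufferDescriptors[buffer - 1];
        LockBufHdr(bufHdr);
        if (bufHdr->refcount == 1)
        {
            /* we are the sole pinner: cleanup lock acquired */
            UnlockBufHdr(bufHdr);
            return;
        }
        /* ask the last unpinner to signal us, then sleep */
        bufHdr->flags |= BM_PIN_COUNT_WAITER;
        bufHdr->wait_backend_pid = MyProcPid;
        UnlockBufHdr(bufHdr);
        LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
        ProcWaitForSignal();    /* blocks in semop(); there is no
                                 * CHECK_FOR_INTERRUPTS() here, hence
                                 * the wait is not cancellable */
    }

This matches the strace above: the worker sits in semop() until the
buffer's pin count drops to one and the last unpinner wakes it.]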

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2015-01-14 10:05:01 -0500, Tom Lane wrote:
 Hah, I suspected as much.  Is that the one that's stuck in
 LockBufferForCleanup, or the other one that's got a similar backtrace
 to all the user processes?

 Do you have a theory? Right now it primarily looks like contention on a
 single buffer due to the high number of dynamic statements, possibly
 made worse by the signalling between normal pinners and vacuum waiting
 for cleanup.

The readers shouldn't be contending with each other.  It's certainly
plausible enough that there's a steady stream of readers that would
prevent LockBufferForCleanup from getting in, but that should not
result in any readers getting blocked.  So something's gotten broken
in that area.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS qual pushdown

2015-01-14 Thread Dean Rasheed
On 14 January 2015 at 13:29, Robert Haas robertmh...@gmail.com wrote:
 One thing they could still leak is the number of times they got
 called, and thus possibly the number of unseen rows.  Now if the
 expressions get constant-folded away that won't be an issue, but a
 clever user can probably avoid that.


Right now, EXPLAIN ANALYSE can be used to tell you the number of
unseen rows. Is that something that people are concerned about, and
are there any plans to change it?

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Merlin Moncure
On Wed, Jan 14, 2015 at 9:11 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2015-01-14 10:05:01 -0500, Tom Lane wrote:
 Merlin Moncure mmonc...@gmail.com writes:
  On Wed, Jan 14, 2015 at 8:41 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  What are the autovac processes doing (according to pg_stat_activity)?

  pid,running,waiting,query
  7105,00:28:40.789221,f,autovacuum: VACUUM ANALYZE pg_catalog.pg_class

 It'd be interesting to know whether that vacuum gets very frequent
 semaphore wakeups. Could you strace it for a second or three?

for 30 seconds+ it just looks like this:
mmoncure@mernix2 ~ $ sudo strace -p 7105
Process 7105 attached
semop(5701638, {{4, -1, 0}}, 1

all of other processes are yielding out of the spinlock, for example:
select(0, NULL, NULL, NULL, {0, 1408})  = 0 (Timeout)

  How did this perform < 9.4?
this is a new project. However, I can run it vs earlier version.

 Can you guess how many times these dynamic
 statements are planned? How many different relations are accessed in the
 dynamically planned queries?

only once or twice, and only a couple of tables. This is an
operation that should only take a few seconds (inserting a few tens of
thousands of rows), but it has been blocked for many hours now.  Usually
it runs through in a few seconds.  This is either a deadlock or a
deadlock-emulating sequence of operations.

I'll try to pull commits that Peter suggested and see if that helps
(I'm getting ready to bring the database down).  I can send the code
off-list if you guys think it'd help.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Merging postgresql.conf and postgresql.auto.conf

2015-01-14 Thread Sawada Masahiko
Hi all,

The postgresql.auto.conf is loaded after postgresql.conf whenever the
configuration is loaded or reloaded.
This means that a parameter in postgresql.auto.conf takes quite high
priority, so the same parameter in postgresql.conf does not take
effect at all even if the user sets it manually.

If a user wants to change the configuration of a stopped postgresql
server, the user needs to merge the two configuration files
(postgresql.conf and postgresql.auto.conf) manually while maintaining
their consistency.

From an operational perspective, having a written config with
duplicate entries is not a good thing.
I think we need to merge the two configuration files into one (or more
than one, if 'include' directives are used).

One solution is to add a merging tool/command which merges the two
configuration files while maintaining their consistency.
Has this topic already been discussed?
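
[To make the idea concrete, a naive sketch of such a merge tool in
Python; it deliberately ignores comments, ordering, and include
handling, which is exactly where the difficulty lies, as pointed out
below:

    #!/usr/bin/env python
    # Naive merge: later files win, so duplicate settings collapse to
    # the last occurrence.  Comments and ordering from postgresql.conf
    # are NOT preserved, which is the hard part this sketch dodges.
    import re
    import sys

    def read_settings(path):
        settings = {}
        with open(path) as f:
            for line in f:
                line = line.split('#', 1)[0].strip()  # crude comment strip
                m = re.match(r"([\w.]+)\s*=?\s*(.+)", line)
                if m:
                    settings[m.group(1)] = m.group(2).strip()
        return settings

    merged = {}
    for path in sys.argv[1:]:  # e.g. postgresql.conf postgresql.auto.conf
        merged.update(read_settings(path))  # later files take priority

    for name, value in sorted(merged.items()):
        print("%s = %s" % (name, value))
]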

Regards,

---
Sawada Masahiko


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Merging postgresql.conf and postgresql.auto.conf

2015-01-14 Thread Tom Lane
Sawada Masahiko sawada.m...@gmail.com writes:
 The postgresql.auto.conf is loaded after postgresql.conf whenever the
 configuration is loaded or reloaded.
 This means that a parameter in postgresql.auto.conf takes quite high
 priority, so the same parameter in postgresql.conf does not take
 effect at all even if the user sets it manually.

 If a user wants to change the configuration of a stopped postgresql
 server, the user needs to merge the two configuration files
 (postgresql.conf and postgresql.auto.conf) manually while maintaining
 their consistency.

 From an operational perspective, having a written config with
 duplicate entries is not a good thing.
 I think we need to merge the two configuration files into one (or more
 than one, if 'include' directives are used).

 One solution is to add a merging tool/command which merges the two
 configuration files while maintaining their consistency.
 Has this topic already been discussed?

Yes.  The entire reason that postgresql.auto.conf is separate is that
we despaired of reading and rewriting postgresql.conf automatically
without making a hash of material in the comments.  Calling the logic
a merge tool does not make that problem go away.
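
[A contrived postgresql.conf fragment illustrating the hazard; an
automatic rewriter has no safe way to decide which of these lines to
touch:

    #shared_buffers = 32MB      # commented-out default, kept for reference
    shared_buffers = 128MB      # tuned 2014-07-01 for the reporting load
    #shared_buffers = 1GB       # experiment, do not enable
]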

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor configure tweak to simplify adjusting gcc warnings

2015-01-14 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2015-01-13 22:19:30 -0500, Tom Lane wrote:
 The reason I got interested in this is that I attempted to pass in
 CFLAGS=-Wno-format to configure, to suppress format warnings on
 buildfarm member gaur (whose gcc is too old to recognize z modifiers).
 That doesn't work because -Wall turns the warnings right back on again.
 If the user-supplied CFLAGS were inserted after -Wall then it would work.

 In the line of
 http://archives.postgresql.org/message-id/54B58BA3.8040302%40ohmu.fi I
 wonder if the better fix isn't to define pg_format_attribute(...) and
 define it empty if the compiler doesn't support what we want.

Well, that would only fix my problem if we added a configure-time test
for whether gcc recognizes z, which frankly seems like a waste of
cycles.  I've probably got the last one left in captivity that doesn't.

Not that I have any great objection to improving the attribute-slinging
logic.  It just doesn't seem like a good fix for this particular issue.

 A slightly more complicated change could be applied to make sure that
 *all* of the CFLAGS forcibly inserted by configure appear before any
 externally-sourced CFLAGS, allowing any of them to be overridden from the
 environment variable.  I'm not sure if it's worth the trouble to do that,
 but if there's interest I could make it happen.

 I think it'd be good idea, but unless you're enthusiastic I guess there
 are more important things.

Nah, I'm fine with doing it, it's just a couple more lines of code.
I feared people might think it wasn't worth adding extra complexity for,
but as long as someone else likes the idea I'll go do it.
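
[To illustrate the ordering problem with a hypothetical compile line:

    # today: user CFLAGS come first, configure's additions after them
    gcc -Wno-format ... -Wall ...    # -Wall re-enables the format warnings
    # proposed: configure's flags first, user CFLAGS appended last
    gcc -Wall ... -Wno-format ...    # the user's override now wins
]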

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor configure tweak to simplify adjusting gcc warnings

2015-01-14 Thread Andres Freund
On 2015-01-14 09:34:23 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2015-01-13 22:19:30 -0500, Tom Lane wrote:
  The reason I got interested in this is that I attempted to pass in
  CFLAGS=-Wno-format to configure, to suppress format warnings on
  buildfarm member gaur (whose gcc is too old to recognize z modifiers).
  That doesn't work because -Wall turns the warnings right back on again.
  If the user-supplied CFLAGS were inserted after -Wall then it would work.
 
  In the line of
  http://archives.postgresql.org/message-id/54B58BA3.8040302%40ohmu.fi I
  wonder if the better fix isn't to define pg_format_attribute(...) and
  define it empty if the compiler doesn't support what we want.
 
 Well, that would only fix my problem if we added a configure-time test
 for whether gcc recognizes z, which frankly seems like a waste of
 cycles.  I've probably got the last one left in captivity that doesn't.

Hm. I had kinda assumed that %z support for sprintf and gcc's
recognition of the format string would coincide and we could just use
the %z result. But gull's output doesn't actually work that way.

  A slightly more complicated change could be applied to make sure that
  *all* of the CFLAGS forcibly inserted by configure appear before any
  externally-sourced CFLAGS, allowing any of them to be overridden from the
  environment variable.  I'm not sure if it's worth the trouble to do that,
  but if there's interest I could make it happen.
 
  I think it'd be good idea, but unless you're enthusiastic I guess there
  are more important things.
 
 Nah, I'm fine with doing it, it's just a couple more lines of code.
 I feared people might think it wasn't worth adding extra complexity for,
 but as long as someone else likes the idea I'll go do it.

Ok, cool. I had thought the templates and subdirectory overrides would make
it slightly less than trivial.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor configure tweak to simplify adjusting gcc warnings

2015-01-14 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2015-01-14 09:34:23 -0500, Tom Lane wrote:
 Well, that would only fix my problem if we added a configure-time test
 for whether gcc recognizes z, which frankly seems like a waste of
 cycles.  I've probably got the last one left in captivity that doesn't.

 Hm. I had kinda assumed that %z support for sprintf and gcc's
 recognition of the format string would coincide and we could just use
 the %z result. But gull's output doesn't actually work that way.

It's only reasonable to assume that gcc matches sprintf if gcc is the
native (vendor-supplied) compiler for the platform.  That's not the
case on gaur.  It used to be very very commonly not the case, though
I think a lot of vendors have now abandoned their proprietary compilers.
If we were to test for this at all, I think we'd need to make the test
separate from the one for sprintf's behavior.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 Yes, it is pg_class that is coming from LockBufferForCleanup().  As you
 can see above, it has a shorter runtime.   So it was killed off once
 about a half hour ago, which did not free up the logjam.  However, AV
 spawned it again and now it does not respond to cancel.

Interesting.  That seems like there might be two separate issues at
play.  It's plausible that LockBufferForCleanup might be interfering
with other attempts to scan the index, but then why wouldn't killing
the AV have unstuck things?

Anyway it's now seeming that Peter may have the right idea about
where to look.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] explain sortorder

2015-01-14 Thread Heikki Linnakangas

On 01/14/2015 05:26 PM, Timmer, Marius wrote:

Hello Heikki,

abbreviated version:
Sorry, the problem is only the unhandy patch text format, not differing
opinions on how to proceed.

Long version:
The v7 patch file already addressed your suggestions,
but the file contained several (old) local commits,
the new ones at the end of the patch text/file.


Ah, missed that. I stopped reading when I saw the old stuff there :-).


v7.1 is attached and addresses this issue providing a clean patch file.


Ok, thanks, will take a look.


V8 will - as mentioned - add missing docs and regression tests,


Great!

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fillfactor for GIN indexes

2015-01-14 Thread Michael Paquier
On Thu, Jan 8, 2015 at 2:03 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Thu, Jan 8, 2015 at 6:31 AM, Alexander Korotkov aekorot...@gmail.com 
 wrote:
 On Wed, Jan 7, 2015 at 4:11 PM, Michael Paquier michael.paqu...@gmail.com
 I am attaching an updated patch, with the default fillfactor value at
 75%, and with the page split code using the fillfactor rate.
 Thoughts?
 Rewritten version of patch is attached. I made following changes:

 Thanks! With this patch (and my previous version as well) GIN indexes
 with default fillfactor have a size higher than 9.4 indexes, 9.4
 behavior being consistent only with fillfactor=100 and not the default
 of 90. Are we fine with that?
IMO, this patch has value to control random updates on GIN indexes,
but we should have a default fillfactor of 100 to have index size
consistent with 9.4. Thoughts?
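
[For context, with such a patch the knob would presumably be used like
any other index storage parameter; the table and index names here are
made up:

    CREATE INDEX docs_body_idx ON docs USING gin (body_tsv)
        WITH (fillfactor = 75);       -- leave 25% free for later insertions
    ALTER INDEX docs_body_idx SET (fillfactor = 100);  -- 9.4-like packing
    REINDEX INDEX docs_body_idx;      -- rebuild so the new value takes effect
]
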
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] inherit support for foreign tables

2015-01-14 Thread Etsuro Fujita
On 2014/12/23 0:36, Tom Lane wrote:
 Yeah, we need to do something about the PlanRowMark data structure.
 Aside from the pre-existing issue in postgres_fdw, we need to fix it
 to support inheritance trees in which more than one rowmark method
 is being used.  That rte.hasForeignChildren thing is a crock,

 The idea I'd had about that was to convert the markType field into a
 bitmask, so that a parent node's markType could represent the logical
 OR of the rowmark methods being used by all its children.  I've not
 attempted to code this up though, and probably won't get to it until
 after Christmas.  One thing that's not clear is what should happen
 with ExecRowMark.

As I said before, that seems to me like a good idea.  So I'll update the
patch based on that if you're okay with it.  Or have you found any problem
concerning the above idea?

Best regards,
Etsuro Fujita


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] install libpq.dll in bin directory on Windows / Cygwin

2015-01-14 Thread Michael Paquier
On Wed, Dec 24, 2014 at 3:08 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 Attached are two patches, one for MinGW/cygwin, a slightly modified
 version from Peter and the second implementing the same thing but for
 the MSVC scripts. The method for MSVC is similar to what is done in
 Peter's patch: roughly it checks if SO_MAJOR_VERSION is present in the
 Makefile of a given library, the path of Makefile is found by looking
 at the location of the .rc in the vcproj file (could be better but I
 could not come up with a better method). TBH, it would be good to be
 completely consistent in the way we build things on Windows, and we
 may as well consider a backpatch to fix this long-standing bug. The
 MSVC patch removes of course the hack copying libpq.dll from lib/ to
 bin/.

 I mentioned the fix for MSVC scripts as well here:
 http://www.postgresql.org/message-id/cab7npqqiuepzphd3mmk+q7_cqqrkk_v1fvxknymri660z4d...@mail.gmail.com
Peter, this patch has been waiting for input for a couple of weeks. IMO, it
would be good to finally get a fix for this bug, and we have patches
for both MSVC (the patch I sent) and mingw (your stuff).
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Define and document a guaranteed ordering for RETURNING?

2015-01-14 Thread Craig Ringer
Apparently I'm semi-blind - the docs also note that:

 If the INSERT command contains a RETURNING clause, the result will be
similar to that of a SELECT statement containing the columns and values
defined in the RETURNING list, computed over the row(s) inserted by the
command.

... so perhaps it's enough to just explicitly mention that ordering is
preserved.
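
[The case in question, with a made-up table:

    CREATE TABLE t (id serial PRIMARY KEY, v text);
    INSERT INTO t (v) VALUES ('a'), ('b'), ('c') RETURNING id, v;
    -- the open question: is RETURNING guaranteed to emit the rows in
    -- VALUES order, i.e. ('a'), ('b'), ('c') here?
]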


Re: [HACKERS] PATCH: Reducing lock strength of trigger and foreign key DDL

2015-01-14 Thread Michael Paquier
I wrote:
 I think that we could pass it to a committer.
Marked as such.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] __attribute__ for non-gcc compilers

2015-01-14 Thread Oskari Saarenmaa
15.01.2015, 00:54, Andres Freund wrote:
 I think I'd for now simply not define pg_attribute_aligned() on
 platforms where it's not supported, instead of defining it empty. If we
 need a softer variant we can name it pg_attribute_aligned_if_possible or
 something.

Good point, all attributes that actually change program behavior
(aligned and packed for now) need to error out during compilation if
they're used but not supported by the compiler.

Attributes which may improve optimization or just provide more
information for the developers (noreturn, unused, format) can be defined
empty when they're not supported (or are not supported well enough: GCC
< 3 doesn't know about %z in printf format.)
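
[A minimal sketch of that split; the macro names are illustrative, not
necessarily what would be committed:

    #ifdef __GNUC__
    #define pg_attribute_noreturn   __attribute__((noreturn))
    #define pg_attribute_unused     __attribute__((unused))
    #define pg_attribute_packed     __attribute__((packed))
    #define pg_attribute_aligned(a) __attribute__((aligned(a)))
    #else
    /* optimization/diagnostic hints may safely degrade to nothing */
    #define pg_attribute_noreturn
    #define pg_attribute_unused
    /* no empty fallback for packed/aligned: any use on an unsupporting
       compiler must fail at compile time rather than silently produce
       a different memory layout */
    #endif
]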

/ Oskari



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compression of full-page-writes

2015-01-14 Thread Michael Paquier
Marking this patch as returned with feedback for this CF, moving it to
the next one. I doubt that there will be much progress here for the
next couple of days, so let's try at least to get something for this
release cycle.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Join push-down support for foreign tables

2015-01-14 Thread Michael Paquier
On Fri, Dec 26, 2014 at 1:48 PM, Shigeru Hanada
shigeru.han...@gmail.com wrote:
 Hmm, I agree that supporting N-way joins is very useful.  Postgres-XC's SQL
 generator seems to give us a hint for such cases; I'll check it out
 again.
Switching to returned with feedback, as this patch is waiting for
feedback for a couple of weeks now.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Async execution of postgres_fdw.

2015-01-14 Thread Michael Paquier
On Tue, Jan 13, 2015 at 8:46 PM, Kyotaro HORIGUCHI
horiguchi.kyot...@lab.ntt.co.jp wrote:
 I'll look into the case after this, but I'd like to send a
 revised patch at this point.
Hm. Seems like this patch is not completely baked yet. Horiguchi-san,
as you are obviously still working on it, would you agree to move it
to the next CF?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Commit fest 2015-12 enters money time

2015-01-14 Thread Michael Paquier
Hi all,

We are soon entering the money time for this CF. The last month has
been mainly a vacation period, the progress being phantom-like on many
fronts; still, there are a couple of patches that are marked as ready
for committer:
- Foreign table inheritance, whose first patch has been committed
- speedup tidbitmap patch: cache page
- pg_basebackup vs. Windows and tablespaces (Extend base backup to
include symlink file used to restore symlinks)
- If pg_hba.conf has changed since last reload, emit a HINT to the
client and server log
- Add restore_command_retry_interval option to control timeout of
restore_command nonzero status code
- documentation update for doc/src/sgml/func.sgml
- Reducing lock strength of trigger and foreign key DDL

I am going to make a first pass on all the patches to check their
status, and please note that the patches waiting for some input from
their authors will be marked as returned with feedback.
Thanks,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] add ssl_protocols configuration option

2015-01-14 Thread Michael Paquier
On Mon, Dec 15, 2014 at 11:15 PM, Alex Shulgin a...@commandprompt.com wrote:
 Michael Paquier michael.paqu...@gmail.com writes:

 Perhaps ssloptions.[ch], unless you plan to add non-option-related code
 there later?

 I don't think anything else than common options-related code fits in
 there, so ssloptions.c makes sense to me.

 BTW, there is no Regent code in your openssl.c, so the copyright
 statement is incorrect.

 Good catch, I just blindly copied that from some other file.
 There have been arguments in favor of and against this patch... But
 marking it as returned with feedback because of a lack of activity in
 the last couple of weeks. Feel free to update if you think that's not
 correct, and please move this patch to commit fest 2014-12.

 Attached is a new version that addresses the earlier feedback: renamed
 the added *.[ch] files and removed incorrect copyright line.

 I'm moving this to the current CF.
This patch has been at a status quo since the beginning of this CF, as
the arguments remain the same. If nothing is happening, maybe it would
be better to mark it as returned with feedback? Thoughts?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch : Allow toast tables to be moved to a different tablespace

2015-01-14 Thread Michael Paquier
On Tue, Dec 30, 2014 at 11:48 AM, Andreas Karlsson andr...@proxel.se wrote:
 Here is my review of the feature.
Marked as returned with feedback.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support UPDATE table SET(*)=...

2015-01-14 Thread Michael Paquier
On Mon, Dec 15, 2014 at 7:50 PM, Atri Sharma atri.j...@gmail.com wrote:
 I have moved patch to current CF and marked it as Waiting on Author since I
 plan to resubmit after addressing the concerns.
Nothing happened in the last month, so marking as returned with feedback.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns

2015-01-14 Thread Etsuro Fujita
Here is an example using postgres_fdw.

[Terminal 1]
postgres=# create table t (a int, b int);
CREATE TABLE
postgres=# insert into t values (1, 1);
INSERT 0 1
postgres=# begin;
BEGIN
postgres=# update t set b = b * 2;
UPDATE 1

[Terminal 2]
postgres=# create foreign table ft (a int) server loopback options
(table_name 'lbt');
CREATE FOREIGN TABLE
postgres=# insert into ft values (1);
INSERT 0 1
postgres=# select tableoid, ctid, * from ft;
 tableoid | ctid  | a
----------+-------+---
    25092 | (0,1) | 1
(1 row)

postgres=# select ft.tableoid, ft.ctid, ft.* from t, ft where t.a = ft.a
for update;

[Terminal 1]
postgres=# commit;
COMMIT

[Terminal 2]
postgres=# select ft.tableoid, ft.ctid, ft.* from t, ft where t.a = ft.a
for update;
 tableoid |      ctid      | a
----------+----------------+---
        0 | (4294967295,0) | 1
(1 row)

Note that tableoid and ctid have been changed!

I think the reason for that is that EvalPlanQualFetchRowMarks doesn't
properly set tableoid and ctid for foreign tables, IIUC.  I think this
should be fixed.  Please find attached a patch.  The patch slightly
relates to [1], so if it is reasonable, I'll update [1] on top of this.

[1] https://commitfest.postgresql.org/action/patch_view?id=1386

Best regards,
Etsuro Fujita
*** a/contrib/postgres_fdw/postgres_fdw.c
--- b/contrib/postgres_fdw/postgres_fdw.c
*************** make_tuple_from_result_row(PGresult *res,
*** 2947,2953 ****
  	tuple = heap_form_tuple(tupdesc, values, nulls);
  
  	if (ctid)
! 		tuple->t_self = *ctid;
  
  	/* Clean up */
  	MemoryContextReset(temp_context);
--- 2947,2953 ----
  	tuple = heap_form_tuple(tupdesc, values, nulls);
  
  	if (ctid)
! 		tuple->t_self = tuple->t_data->t_ctid = *ctid;
  
  	/* Clean up */
  	MemoryContextReset(temp_context);
*** a/src/backend/executor/execMain.c
--- b/src/backend/executor/execMain.c
*************** InitPlan(QueryDesc *queryDesc, int eflags)
*** 795,800 ****
--- 795,801 ----
  	{
  		PlanRowMark *rc = (PlanRowMark *) lfirst(l);
  		Oid			relid;
+ 		char		relkind;
  		Relation	relation;
  		ExecRowMark *erm;
  
*************** InitPlan(QueryDesc *queryDesc, int eflags)
*** 817,826 ****
--- 818,833 ----
  				break;
  			case ROW_MARK_COPY:
  				/* there's no real table here ... */
+ 				relkind = rt_fetch(rc->rti, rangeTable)->relkind;
+ 				if (relkind == RELKIND_FOREIGN_TABLE)
+ 					relid = getrelid(rc->rti, rangeTable);
+ 				else
+ 					relid = InvalidOid;
  				relation = NULL;
  				break;
  			default:
  				elog(ERROR, "unrecognized markType: %d", rc->markType);
+ 				relid = InvalidOid;
  				relation = NULL;	/* keep compiler quiet */
  				break;
  		}
*************** InitPlan(QueryDesc *queryDesc, int eflags)
*** 831,836 ****
--- 838,844 ----
  
  		erm = (ExecRowMark *) palloc(sizeof(ExecRowMark));
  		erm->relation = relation;
+ 		erm->relid = relid;
  		erm->rti = rc->rti;
  		erm->prti = rc->prti;
  		erm->rowmarkId = rc->rowmarkId;
*************** EvalPlanQualFetchRowMarks(EPQState *epqstate)
*** 2318,2325 ****
  
  			/* build a temporary HeapTuple control structure */
  			tuple.t_len = HeapTupleHeaderGetDatumLength(td);
! 			ItemPointerSetInvalid(&(tuple.t_self));
! 			tuple.t_tableOid = InvalidOid;
  			tuple.t_data = td;
  
  			/* copy and store tuple */
--- 2326,2342 ----
  
  			/* build a temporary HeapTuple control structure */
  			tuple.t_len = HeapTupleHeaderGetDatumLength(td);
! 			/* if relid is valid, rel is a foreign table; set system columns */
! 			if (OidIsValid(erm->relid))
! 			{
! 				tuple.t_self = td->t_ctid;
! 				tuple.t_tableOid = erm->relid;
! 			}
! 			else
! 			{
! 				ItemPointerSetInvalid(&(tuple.t_self));
! 				tuple.t_tableOid = InvalidOid;
! 			}
  			tuple.t_data = td;
  
  			/* copy and store tuple */
*** a/src/backend/executor/nodeForeignscan.c
--- b/src/backend/executor/nodeForeignscan.c
***************
*** 22,27 ****
--- 22,28 ----
   */
  #include "postgres.h"
  
+ #include "access/htup_details.h"
  #include "executor/executor.h"
  #include "executor/nodeForeignscan.h"
  #include "foreign/fdwapi.h"
*************** ForeignNext(ForeignScanState *node)
*** 53,65 ****
  	/*
  	 * If any system columns are requested, we have to force the tuple into
  	 * physical-tuple form to avoid "cannot extract system attribute from
! 	 * virtual tuple" errors later.  We also insert a valid value for
! 	 * tableoid, which is the only actually-useful system column.
  	 */
  	if (plan->fsSystemCol && !TupIsNull(slot))
  	{
  		HeapTuple	tup = ExecMaterializeSlot(slot);
  
  		tup->t_tableOid = RelationGetRelid(node->ss.ss_currentRelation);
  	}
  
--- 54,69 ----
  	/*
  	 * If any system columns are requested, we have to force the tuple into
  	 * physical-tuple form to avoid "cannot extract system attribute from
! 	 * virtual tuple" errors later.  We also insert a valid value for TID
! 	 * and tableoid, which are the only actually-useful system columns.
  	 */
  	if (plan->fsSystemCol && !TupIsNull(slot))
  	{
  		HeapTuple	tup = ExecMaterializeSlot(slot);
  
+ 		/* 

Re: [HACKERS] compress method for spgist - 2

2015-01-14 Thread Michael Paquier
Marking this patch as returned with feedback because it has been waiting
for input from the author for a couple of weeks now. Heikki, the
refactoring patch has some value; are you planning to push it?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical column ordering

2015-01-14 Thread Michael Paquier
On Sun, Jan 4, 2015 at 10:37 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 So I'm reworking my patch with that in mind.
Switching to returned with feedback. Alvaro, feel free to add an entry
to the next CF if you are planning to work on it again.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DROP PRIVILEGES OWNED BY

2015-01-14 Thread Michael Paquier
On Thu, Dec 18, 2014 at 1:43 AM, Marko Tiikkaja ma...@joh.to wrote:
 I don't have a problem with that.  It would probably work, too, since FROM
 is already fully reserved.
Marking patch as returned with feedback as there has been no input
from Marko in the last couple of weeks.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] moving from contrib to bin

2015-01-14 Thread Michael Paquier
On Tue, Dec 23, 2014 at 3:24 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Mon, Dec 22, 2014 at 11:30 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
 Michael Paquier wrote:

 And here is an updated patch following those lines. Similarly to the
 things in contrib/, a set of variables are used to define for each
 module what are the extra libraries, include dirs, etc. This refactors
 quite a bit of code, even if there are a couple of exceptions like
 pg_xlogdump/ or pg_basebackup/.

 In a broad look, this looks a lot better.  I think we should press
 forward with this whole set of patches and see what the buildfarm
 thinks.
 Here is a new series of patches for all those things, with the
 following additions:
 - Some cleanup for MSVC scripts compared to last patch
 - Moved documentation to ref/
 - Removed mentions of the authors of the moved utilities (?)
 This set of patches is basically made of the former set, with 2
 additional patches for MSVC stuff and documentation pages moved to
 ref/.
Where are we on this? This patch has been waiting for input from the
author for the last couple of weeks.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Typo fix in alter_table.sgml

2015-01-14 Thread Michael Paquier
Robert Haas wrote:
 But that's not a typo as stated in $SUBJECT but rather a markup fix.
Definitely. Thanks.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Merging postgresql.conf and postgresql.auto.conf

2015-01-14 Thread Sawada Masahiko
On Thu, Jan 15, 2015 at 1:15 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Sawada Masahiko sawada.m...@gmail.com writes:
 On Thu, Jan 15, 2015 at 12:37 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yes.  The entire reason that postgresql.auto.conf is separate is that
 we despaired of reading and rewriting postgresql.conf automatically
 without making a hash of material in the comments.  Calling the logic
 a merge tool does not make that problem go away.

  The merge tool would not only merge all the parameters in the two
  configuration files into one file but also remove duplicate parameters.
  That is, the configuration files would logically become one file.

 I'll just say one more time that if we thought this could work, we'd not
 have arrived at the separate-files design to begin with.  You can work
 on it if you like, but I will bet a good deal that you will not end up
 with something that gets accepted.


Yep, I don't intend to propose that again.
Because I thought that maintaining the configuration files would be
complicated, I just thought we could add a supporting tool.

Regards,

---
Sawada Masahiko


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Merging postgresql.conf and postgresql.auto.conf

2015-01-14 Thread Sawada Masahiko
On Thu, Jan 15, 2015 at 12:37 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Sawada Masahiko sawada.m...@gmail.com writes:
 The postgresql.auto.conf is loaded after postgresql.conf whenever the
 configuration is loaded or reloaded.
 This means that a parameter in postgresql.auto.conf takes quite high
 priority, so the same parameter in postgresql.conf does not take
 effect at all even if the user sets it manually.

 If a user wants to change the configuration of a stopped postgresql
 server, the user needs to merge the two configuration files
 (postgresql.conf and postgresql.auto.conf) manually while maintaining
 their consistency.

 From an operational perspective, having a written config with
 duplicate entries is not a good thing.
 I think we need to merge the two configuration files into one (or more
 than one, if 'include' directives are used).

 One solution is to add a merging tool/command which merges the two
 configuration files while maintaining their consistency.
 Has this topic already been discussed?

 Yes.  The entire reason that postgresql.auto.conf is separate is that
 we despaired of reading and rewriting postgresql.conf automatically
 without making a hash of material in the comments.  Calling the logic
 a merge tool does not make that problem go away.


The merge tool would not only merge all the parameters in the two
configuration files into one file but also remove duplicate parameters.
That is, the configuration files would logically become one file.

It will clearly be complicated work for the user to rewrite
postgresql.conf manually while maintaining its consistency.
(On top of that, executing the ALTER SYSTEM command is not allowed for
users other than superusers.)

Would it be bad for ALTER SYSTEM to parse postgresql.conf again in
AlterSystemSetConfigFile() to get the line number and file name of that
parameter?
(Or postgresql could continue to keep the line number and file name of
each parameter.)


Regards,

---
Sawada Masahiko


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Merging postgresql.conf and postgresql.auto.conf

2015-01-14 Thread Tom Lane
Sawada Masahiko sawada.m...@gmail.com writes:
 On Thu, Jan 15, 2015 at 12:37 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yes.  The entire reason that postgresql.auto.conf is separate is that
 we despaired of reading and rewriting postgresql.conf automatically
 without making a hash of material in the comments.  Calling the logic
 a merge tool does not make that problem go away.

 The merge tool would not only merge all the parameters in the two
 configuration files into one file but also remove duplicate parameters.
 That is, the configuration files would logically become one file.

I'll just say one more time that if we thought this could work, we'd not
have arrived at the separate-files design to begin with.  You can work
on it if you like, but I will bet a good deal that you will not end up
with something that gets accepted.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Incremental backup v3: incremental PoC

2015-01-14 Thread Gabriele Bartolini
Hi Marco,

  thank you for sending an updated patch. I am writing down a report of
this initial (and partial) review.

IMPORTANT: This patch is not complete, as stated by Marco. See the
Conclusions section for my proposed TODO list.

== Patch application

I have been able to successfully apply your patch and compile it.
Regression tests passed.

== Initial run

I have created a fresh new instance of PostgreSQL and activated streaming
replication to be used by pg_basebackup. I have done a pgbench run with
scale 100.

I have taken a full consistent backup with pg_basebackup (in plain format):

pg_basebackup -v -F p -D $BACKUPDIR/backup-$(date '+%s') -x


I have been able to verify that the backup_profile is correctly placed in
the destination PGDATA directory. Here is an excerpt:

POSTGRESQL BACKUP PROFILE 1
START WAL LOCATION: 0/358 (file 00010003)
CHECKPOINT LOCATION: 0/38C
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2015-01-14 10:07:07 CET
LABEL: pg_basebackup base backup
FILE LIST
\N  \N  t   1421226427  206 backup_label
\N  \N  t   1421225508  88  postgresql.auto.conf
...


As suggested by Marco, I have manually taken the LSN from this file (next
version must do this automatically).
I have then executed pg_basebackup and activated the incremental feature by
using the LSN from the previous backup, as follows:

LSN=$(awk '/^START WAL/{print $4}' backup_profile)

pg_basebackup -v -F p -D $BACKUPDIR/backup-$(date '+%s') -I $LSN -x


The time taken by this operation was much lower than for the previous one,
and the size is much smaller (I have not done any operations in the meantime):

du -hs backup-1421226*
1,5G    backup-1421226427
17M     backup-1421226427


I have done some checks on the file system and then used the prototype of
recovery script in Python written by Marco.

./recover.py backup-1421226427 backup-1421226427 new-data

The cluster started successfully. I have then run a pg_dump of the pgbench
database and was able to reload it on the initial cluster.

== Conclusions

The first run of this patch seems promising.

While the discussion on the LSN map continues (which is mainly an
optimisation of this patch), I would really like to see this patch progress
as it would be a killer feature in several contexts (not in every context).

Just in this period we are releasing file based incremental backup for
Barman and customers using the alpha version are experiencing on average a
deduplication ratio between 50% and 70%. This is, for example, an excerpt of
barman show-backup from one of our customers (a daily saving of 550GB is
not bad):

 Base backup information:
Disk usage   : 1.1 TiB (1.1 TiB with WALs)
Incremental size : 564.6 GiB (-50.60%)
...

My opinion, Marco, is that for version 5 of this patch, you:

1) update the information on the wiki (it is outdated - I know you have
been busy with LSN map optimisation)
2) modify pg_basebackup in order to accept a directory (or tar file) and
automatically detect the LSN from the backup profile
3) add the documentation regarding the backup profile and pg_basebackup

Once we have all of this, we can continue trying the patch. Some unexplored
paths are:

* tablespace usage
* tar format
* performance impact (in both read-only and heavily updated contexts)
* consistency checks

I would then leave for version 6 the pg_restorebackup utility (unless you
want to do everything at once).

One limitation of the current recovery script is that it cannot accept
multiple incremental backups (it just accepts three parameters: base
backup, incremental backup and merge destination). Maybe you can change the
syntax as follows:

./recover.py DESTINATION BACKUP_1 BACKUP_2 [BACKUP_3, ...]

Thanks a lot for working on this.

I am looking forward to continuing the review.

Ciao,
Gabriele
--
 Gabriele Bartolini - 2ndQuadrant Italia - Managing Director
 PostgreSQL Training, Services and Support
 gabriele.bartol...@2ndquadrant.it | www.2ndQuadrant.it

2015-01-13 17:21 GMT+01:00 Marco Nenciarini marco.nenciar...@2ndquadrant.it:

 On 13/01/15 12:53, Gabriele Bartolini wrote:
  Hi Marco,
 
   could you please send an updated version of the patch against the current
  HEAD in order to facilitate reviewers?
 

 Here is the updated patch for incremental file based backup.

 It is based on the current HEAD.

 I'm now working on the client tool to rebuild a full backup starting
 from a file based incremental backup.

 Regards,
 Marco

 --
 Marco Nenciarini - 2ndQuadrant Italy
 PostgreSQL Training, Services and Support
 marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it



Re: [HACKERS] Re: Patch to add functionality to specify ORDER BY in CREATE FUNCTION for SRFs

2015-01-14 Thread Robert Haas
On Mon, Jan 5, 2015 at 12:54 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 TBH, my first reaction to this entire patch is unfavorable: it's a
 solution in search of a problem.  It adds substantial complication not
 only for users but for PG developers in order to solve a rather narrow
 performance issue.

 What would make sense to me is to teach the planner about inlining
 SQL functions that include ORDER BY clauses, so that the performance
 issue of a double sort could be avoided entirely transparently to
 the user.

That's not a bad idea, but it only helps for SQL functions.  Actually,
the problem I have run into in the past was not that the planner
didn't know the ordering of the SRF's return value, but that it had no
statistics for it, and therefore made bad optimization decisions.
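
[A concrete example of the double-sort case, with a hypothetical
function:

    CREATE FUNCTION top_scores() RETURNS TABLE (player text, score int)
    AS $$ SELECT player, score FROM scores ORDER BY score DESC $$
    LANGUAGE sql STABLE;

    -- Unless the function is inlined, the planner does not know the
    -- result is already sorted, so the outer ORDER BY sorts it again:
    SELECT * FROM top_scores() ORDER BY score DESC LIMIT 10;
]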

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Merlin Moncure
On Wed, Jan 14, 2015 at 9:30 AM, Andres Freund and...@2ndquadrant.com wrote:
 If you gdb in, and type 'fin' a couple times, to wait till the function
 finishes, is there actually any progress? I'm wondering whether it's
 just many catalog accesses + contention, or some other
 problem. Alternatively set a breakpoint on ScanPgRelation() or so and
 see how often it's hit.

well, I restarted the database, forgetting my looper was running, which
immediately spun up, and it got stuck again with a similar profile
(lots of cpu in spinlock):

Samples: 3K of event 'cycles', Event count (approx.): 2695723228
 31.16%  postgres      [.] s_lock
 22.32%  postgres      [.] tas
 12.13%  postgres      [.] tas
  5.93%  postgres      [.] spin_delay
  5.69%  postgres      [.] LWLockRelease
  3.75%  postgres      [.] LWLockAcquireCommon
  3.61%  perf          [.] 0x000526c4
  2.51%  postgres      [.] FunctionCall2Coll
  1.48%  libc-2.17.so  [.] 0x0016a132

 If you gdb in, and type 'fin' a couple times,
(gdb) fin
Run till exit from #0  0x7ff4c63f7a97 in semop () from
/lib/x86_64-linux-gnu/libc.so.6
0x006de073 in PGSemaphoreLock ()
(gdb) fin
Run till exit from #0  0x006de073 in PGSemaphoreLock ()

It returned once.  The second time, it didn't, at least so far (a minute or so).

 I can send the code off-list if you guys think it'd help.

 Might be interesting.

sent (off-list).

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hung backends stuck in spinlock heavy endless loop

2015-01-14 Thread Andres Freund
On 2015-01-14 09:47:19 -0600, Merlin Moncure wrote:
 On Wed, Jan 14, 2015 at 9:30 AM, Andres Freund and...@2ndquadrant.com wrote:
  If you gdb in, and type 'fin' a couple times, to wait till the function
  finishes, is there actually any progress? I'm wondering whether it's
  just many catalog accesses + contention, or some other
  problem. Alternatively set a breakpoint on ScanPgRelation() or so and
  see how often it's hit.
 
 well, I restarted the database, forgetting my looper was running, which
 immediately spun up, and it got stuck again with a similar profile
 (lots of cpu in spinlock):
 
 Samples: 3K of event 'cycles', Event count (approx.): 2695723228
  31.16%  postgres      [.] s_lock
  22.32%  postgres      [.] tas
  12.13%  postgres      [.] tas
   5.93%  postgres      [.] spin_delay
   5.69%  postgres      [.] LWLockRelease
   3.75%  postgres      [.] LWLockAcquireCommon
   3.61%  perf          [.] 0x000526c4
   2.51%  postgres      [.] FunctionCall2Coll
   1.48%  libc-2.17.so  [.] 0x0016a132
 
  If you gdb in, and type 'fin' a couple times,
 (gdb) fin
 Run till exit from #0  0x7ff4c63f7a97 in semop () from
 /lib/x86_64-linux-gnu/libc.so.6
 0x006de073 in PGSemaphoreLock ()
 (gdb) fin
 Run till exit from #0  0x006de073 in PGSemaphoreLock ()
 
 It returned once.  The second time, it didn't, at least so far (a minute or so).

Hm, that's autovac though, not the normal user backends that actually do
stuff, right? If you could additionally check those, it'd be great.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor configure tweak to simplify adjusting gcc warnings

2015-01-14 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 I've already given up... Given how infrequent it is, suppressing it for
 gull seems sufficient.

I'm confused --- I see no format warnings in gull's current reports.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor configure tweak to simplify adjusting gcc warnings

2015-01-14 Thread Andres Freund
On 2015-01-14 11:09:10 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I've already given up... Given how infrequent it is, suppressing it for
  gull seems sufficient.
 
 I'm confused --- I see no format warnings in gull's current reports.

Sorry, it was me being confused. I somehow switched gaur and gull because
I had tabs open for both. Turns out that gaur doesn't do %z at
all...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

